Otolaryngology - Theses

    Deep Learning for Three-Dimensional Multi-Modal Medical Image Processing: From Classification to Segmentation
ISLAM, KH TOHIDUL (2022)
Deep learning is a state-of-the-art machine learning approach that has proven successful in many application domains. This research aims to use a deep learning framework for three-dimensional (3D) medical image segmentation. Multi-modal images are used in this process because they provide additional information that can make segmentation easier. For example, in computed tomography (CT), rigid structures such as bone are better defined, while in magnetic resonance imaging (MRI), soft tissues are better defined. However, multi-modal images of the same person may not share the same orientation and may differ in resolution, so aligning (or registering) multi-modal images prior to segmentation is an important task. Furthermore, the anatomy under consideration is essential to the segmentation process: a method used to segment the brain may not be applicable to segmenting the heart. Therefore, to fully automate segmentation, it is crucial to classify the multi-modal images and register them before performing the segmentation. Thus, in this thesis, we introduce deep learning methods for the classification, registration, and segmentation of multi-modal medical images. For each of these tasks, mainly due to practical limitations such as dataset availability, we develop and validate our methods on one application/dataset.

Firstly, we explore the problem of classifying 3D multi-modal images of different organs. To this end, we introduce a rotation- and translation-invariant classification model. We exploit the fact that most human organs are (approximately) symmetrical to simplify the problem: we extract a two-dimensional (2D) representative slice of the 3D organ and use that slice as the input to a deep learning model that performs the classification (a sketch of this idea follows the abstract). We show experimentally that our method is comparable to existing classification techniques when the assumptions of viewing direction and patient orientation are met, and that it maintains high accuracy even when these assumptions are violated, where other methods fail.

Secondly, we introduce a novel deep learning method for registering 3D multi-modal medical images of the head. We use image augmentation to create synthetic images that supplement an existing dataset, and we generate ground-truth data with a validated registration method, (1+1) evolutionary optimization, using the symmetry of the human head as an initial alignment to aid the optimization (see the registration sketch below). Before performing the registration, we also use a classification model to identify the imaging modality (MRI or CT) and thereby determine the input order for registration, making the approach fully automatic. We then combine deep and conventional machine learning methods to predict the transformation/registration parameters, and show that the proposed methods outperform similar existing methods on publicly available MRI and CT images of the head.

Lastly, we introduce a deep learning framework for brain tumor segmentation. To achieve this, we present a method of enhancing an existing MRI dataset by generating corresponding synthetic CT images, applying prior domain knowledge from the registration method to obtain a paired multi-modal ground-truth dataset (see the segmentation sketch below). We fine-tune our network architecture and training strategies to segment brain tumors and show that the proposed model outperforms similar existing methods.
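
A minimal sketch of the slice-based classification idea, in PyTorch. The network, its layer sizes, and the choice of the central depth slice are illustrative assumptions, not the thesis's actual architecture:

    import torch
    import torch.nn as nn

    class OrganSliceClassifier(nn.Module):
        """Hypothetical stand-in: classify a 3D volume from one 2D slice."""
        def __init__(self, num_classes):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            )
            self.head = nn.Linear(32, num_classes)

        def forward(self, volume):
            # volume: (B, 1, D, H, W). Approximate organ symmetry suggests
            # the central slice along the depth axis is representative.
            mid = volume.shape[2] // 2
            slice2d = volume[:, :, mid]      # (B, 1, H, W)
            x = self.features(slice2d)
            x = x.mean(dim=(2, 3))           # global average pooling
            return self.head(x)

    model = OrganSliceClassifier(num_classes=4)
    logits = model(torch.randn(2, 1, 64, 128, 128))  # two dummy volumes
    print(logits.shape)                              # torch.Size([2, 4])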
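
The ground-truth generation step for registration can be approximated with SimpleITK, whose registration framework includes the (1+1) evolutionary optimizer the thesis uses as its validated reference method. The file names and optimizer settings below are assumptions, and the thesis initializes from head symmetry rather than image geometry:

    import SimpleITK as sitk

    # Hypothetical inputs: a fixed CT and a moving MR of the same head.
    fixed = sitk.ReadImage("fixed_ct.nii.gz", sitk.sitkFloat32)
    moving = sitk.ReadImage("moving_mr.nii.gz", sitk.sitkFloat32)

    # Rough rigid initialization from image geometry (the thesis uses the
    # symmetry of the head for this step instead).
    initial = sitk.CenteredTransformInitializer(
        fixed, moving, sitk.Euler3DTransform(),
        sitk.CenteredTransformInitializerFilter.GEOMETRY)

    reg = sitk.ImageRegistrationMethod()
    reg.SetMetricAsMattesMutualInformation(numberOfHistogramBins=50)
    reg.SetOptimizerAsOnePlusOneEvolutionary(
        numberOfIterations=200, epsilon=1.5e-4, initialRadius=1.0)
    reg.SetInterpolator(sitk.sitkLinear)
    reg.SetInitialTransform(initial, inPlace=False)

    transform = reg.Execute(fixed, moving)
    print(transform.GetParameters())  # rigid rotation/translation parameters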
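
For the segmentation stage, paired data of this kind can be consumed by stacking the MRI and its registered synthetic CT as input channels and training with a soft Dice loss. The tiny network below is a placeholder for the thesis's fine-tuned architecture; all names and shapes are illustrative:

    import torch
    import torch.nn as nn

    def soft_dice_loss(pred, target, eps=1e-6):
        # pred: sigmoid probabilities, target: binary mask, (B, 1, D, H, W)
        inter = (pred * target).sum(dim=(2, 3, 4))
        union = pred.sum(dim=(2, 3, 4)) + target.sum(dim=(2, 3, 4))
        return 1.0 - ((2 * inter + eps) / (union + eps)).mean()

    # Stand-in network; the thesis fine-tunes a deeper architecture.
    net = nn.Sequential(
        nn.Conv3d(2, 8, 3, padding=1), nn.ReLU(),
        nn.Conv3d(8, 1, 3, padding=1), nn.Sigmoid(),
    )

    mri = torch.randn(1, 1, 32, 64, 64)
    synth_ct = torch.randn(1, 1, 32, 64, 64)  # registered synthetic CT
    x = torch.cat([mri, synth_ct], dim=1)     # (1, 2, D, H, W) input
    mask = (torch.rand(1, 1, 32, 64, 64) > 0.9).float()

    loss = soft_dice_loss(net(x), mask)
    loss.backward()
    print(float(loss))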