Computing and Information Systems - Theses

  • Item
    Semi-supervised community detection and clustering
    Ganji, Mohadeseh (2017)
    Data clustering and community detection in networks are two important tasks in machine learning; they aim to group data into sets of similar objects or into densely connected sub-graphs. However, choosing a similarity measure that yields the highest accuracy is always a challenge. Furthermore, in some real-world applications, background knowledge exists about the true or desired assignments or properties of clusters and communities. This side-information can be obtained from experiments, domain knowledge or user preferences, depending on the application. Constraints may also be imposed on the system by the natural complexity of the problem or by resource limitations. Community detection (clustering) in the presence of side-information represented as supervision constraints is called semi-supervised community detection (clustering). However, finding efficient approaches that take full advantage of this pre-existing information to improve solution quality is still a challenge. In this thesis, we study community detection and clustering problems with and without incorporating domain knowledge, and we propose a similarity measure together with exact and approximate optimization techniques to improve the accuracy of the results. We address limitations of a popular community detection measure called modularity and propose an improved measure, generalized modularity, which quantifies the similarity of network vertices more realistically and comprehensively by incorporating vertex similarity concepts. The proposed generalized modularity outperforms the state-of-the-art modularity optimization approach in community detection.
In addition, to incorporate background knowledge and user preferences into the community detection process, two semi-supervised approaches are proposed in this thesis. First, we address the modelling flexibility issue in the semi-supervised community detection literature, namely the need to model instance-level and community-level constraints simultaneously. We propose a generic framework using constraint programming technology which enables incorporating a variety of instance-level, community-level and complex supervision constraints at the same time. The framework also enables modelling local community definitions as constraints in a global community detection scheme and can incorporate a variety of similarity measures and community detection objective functions. Using a high-level modelling language gives the proposed framework the flexibility to apply both complete and incomplete techniques, and the advantage of proving optimality of the solutions found when complete techniques are used. Second, a new algorithm for semi-supervised community detection is proposed, based on discrete Lagrange multipliers, which incorporates pairwise constraints. Unlike most existing semi-supervised community detection schemes, which modify the graph adjacency based on the supervision constraints, the proposed algorithm works with quality measures such as modularity or generalized modularity and guides the community detection process by systematically modifying the similarity matrix only for hard-to-satisfy constraints. The proposed algorithm commits to satisfying (almost) all of the constraints to take full advantage of the existing supervision. It outperforms existing semi-supervised community detection algorithms in terms of satisfying the supervision constraints and noise resistance. Another contribution of this thesis is to incorporate instance-level supervision constraints into the clustering problem.
In this regard, a k-means-type semi-supervised clustering algorithm is proposed which takes full advantage of the pre-existing information to achieve high-quality solutions that satisfy the constraints. The proposed algorithm is based on continuous Lagrange multipliers and penalizes constraint violations in a systematic manner, which guides the cluster centroids and cluster assignments towards satisfying all of the constraints. The achievements of this thesis are supported by several experiments and publications.
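For readers unfamiliar with the modularity measure that this abstract builds on, the following is a minimal sketch of the standard (Newman) modularity of a partition, Q = sum over communities c of [L_c / m - (d_c / 2m)^2], where m is the number of edges, L_c the number of edges inside c and d_c the total degree of c. It is not the generalized modularity proposed in the thesis, whose definition is not given here; the function name `modularity` and the input layout are assumptions made for illustration.

```python
from collections import defaultdict


def modularity(edges, community):
    """Standard Newman modularity of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once, no self-loops.
    community: dict mapping every vertex to its community label.
    (Both the name and the input layout are illustrative assumptions.)
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree_sum = defaultdict(int)  # d_c: total degree of the vertices in c
    internal = defaultdict(int)    # L_c: edges with both endpoints in c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {v: "a" for v in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

Splitting the graph at the bridge scores higher than placing all six vertices in one community, which is the behaviour a modularity optimizer exploits.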
  • Item
    Automatic optical coherence tomography imaging analysis for retinal disease screening
    Hussain, Md Akter (2017)
    The retina and the choroid are two important structures of the eye on which the quality of eyesight depends. They have many tissue layers, which are very important for monitoring eye health and the progression of eye disease from an early stage. These layers can be visualised using Optical Coherence Tomography (OCT) imaging. Abnormalities in these layers are indications of several eye diseases that can lead to blindness, such as Diabetic Macular Edema (DME), Age-related Macular Degeneration (AMD) and Glaucoma. If the retina and the choroid are damaged, there is little chance of recovering normal sight. Moreover, damage to them will lead to blindness if treatment is late or not administered at all. With eye diseases, early detection and treatment are more effective and cheaper. Biomarkers extracted from these tissue layers, such as changes in layer thickness, indicate the presence of abnormalities (pathologies) such as drusen and hyper-reflective intra-retinal spots, and are very effective for early detection and for monitoring the progression of eye disease. Large-scale, reliable biomarker extraction by manual grading is infeasible for early detection: it is prone to error due to subjective bias and is not cost-effective. Automatic biomarker extraction is the best solution. However, OCT image analysis for extracting biomarkers is very challenging because of noisy images, low contrast, extremely thin retinal layers, the presence of pathologies and complex anatomical structures such as the optic disc and macula. In this thesis, a robust, efficient and accurate automated 3D segmentation algorithm for OCT images is proposed for the retinal tissue layers and the choroid, overcoming those challenges. By mapping the OCT image segmentation problem to a graph problem, we convert the detection of layer boundaries into the problem of finding shortest paths in the mapped graph.
The proposed method uses small layer-oriented regions of interest, takes edge pixels from Canny edge detection as the nodes of the graph, and incorporates prior knowledge of the structures into the edge weight computation; the shortest path found with Dijkstra’s shortest path algorithm is taken as the boundary of a layer. Using this segmentation scheme, we were able to segment all the retinal and choroid tissue layers very accurately and extract eight novel biomarkers, such as attenuation of the retinal nerve fibre layer, relative intensity of the ellipsoid zone, thickness of the retinal layers, and volume of pathologies (e.g. drusen). In addition, we demonstrated that these biomarkers support a very accurate (98%) model, built with a Random Forest classifier, for classifying patients as normal, DME or AMD. The proposed segmentation and classification methods have been evaluated on several datasets collected locally at the Centre for Eye Research Australia and from the public domain. In total, the datasets contain 56 patients for the evaluation of the segmentation algorithms and 72 patients for the classification model. The method developed in this study has shown higher accuracy for all layers of the retina and the choroid than eight state-of-the-art methods. The root mean square error between manually delineated and automatically segmented boundaries is as low as 0.01 pixels. The quantification of biomarkers has also shown a low margin of error relative to the manually quantified values. Furthermore, the classification model achieved more than 98% accuracy and an area under the receiver operating characteristic curve (AUC) of 0.99, outperforming four state-of-the-art methods. The classification model can also be used for the early detection of disease, which can help prevent blindness, and to provide a score/index for the condition or prediction of eye diseases.
In this thesis, we have also developed a fully automated prototype system, OCTInspector, for OCT image analysis using these proposed algorithms and methods.
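The idea of treating a layer boundary as a shortest path can be illustrated with a small sketch. This is a simplified stand-in and not the thesis's method: it assumes every pixel of a small cost image is a graph node (the thesis uses Canny edge pixels within regions of interest), it uses the destination pixel's cost as the edge weight (the thesis folds prior knowledge into the weights), and the name `trace_boundary` is hypothetical. Dijkstra's algorithm then finds the cheapest path from the first column to the last.

```python
import heapq


def trace_boundary(cost):
    """Cheapest left-to-right path through a 2D grid of non-negative costs.

    cost: list of rows; a low value marks a pixel that is likely on a boundary.
    Returns the path as a list of (row, col) pixels.
    Simplifying assumption: all pixels are nodes, and stepping onto a pixel
    costs that pixel's value.
    """
    rows, cols = len(cost), len(cost[0])
    dist, prev, heap = {}, {}, []
    # Every pixel in the first column is a possible starting point.
    for r in range(rows):
        dist[(r, 0)] = cost[r][0]
        heapq.heappush(heap, (cost[r][0], r, 0))
    # Allowed steps: right, diagonal right, and straight up or down.
    moves = [(-1, 1), (0, 1), (1, 1), (-1, 0), (1, 0)]
    end = None
    while heap:
        d, r, c = heapq.heappop(heap)
        if d > dist.get((r, c), float("inf")):
            continue  # stale queue entry
        if c == cols - 1:
            end = (r, c)  # first last-column pixel popped is the cheapest
            break
        for dr, dc in moves:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and nc < cols:
                nd = d + cost[nr][nc]
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, nr, nc))
    # Walk the predecessor links back to the starting column.
    path, node = [], end
    while node is not None:
        path.append(node)
        node = prev.get(node)
    path.reverse()
    return path


# Toy cost image: the low-cost pixels trace a boundary that dips by one row.
toy_cost = [
    [9, 9, 9, 9],
    [1, 9, 9, 1],
    [9, 1, 1, 9],
    [9, 9, 9, 9],
]
boundary = trace_boundary(toy_cost)
```

On the toy image the path follows the four low-cost pixels. In practice the cost image would be derived from image gradients or edge responses, which is outside the scope of this sketch.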