Computing and Information Systems - Theses

Search Results

Now showing 1 - 10 of 13
  • Item
    Strategic information security policy quality assessment: a multiple constituency perspective
    Maynard, Sean (2010)
    An integral part of any information security management program is the information security policy. The purpose of an information security policy is to define the means by which organisations protect the confidentiality, integrity and availability of information and its supporting infrastructure from a range of security threats. The tenet of this thesis is that the quality of information security policy is inadequately addressed by organisations. Further, although information security policies may undergo multiple revisions as part of a process development lifecycle and, as a result, may generally improve in quality, a more explicit, systematic and comprehensive process of quality improvement is required. A key assertion of this research is that a comprehensive assessment of information security policy requires the involvement of the multiple stakeholders in organisations that derive benefit from the directives of the information security policy. Therefore, this dissertation used a multiple-constituency approach to investigate how security policy quality can be addressed in organisations, given the existence of multiple stakeholders. The formal research question under investigation was: how can multiple-constituency quality assessment be used to improve strategic information security policy? The primary contribution of this thesis to the Information Systems field of knowledge is the development of a model: the Strategic Information Security Policy Quality Model. This model comprises three components: a comprehensive model of quality components, a model of stakeholder involvement and a model for security policy development. The Strategic Information Security Policy Quality Model gives organisations a holistic perspective from which to manage the security policy quality assessment process. This research makes six main contributions:
    • It demonstrates that a multiple-constituency approach is effective for information security policy assessment.
    • It develops a set of quality components for information security policy quality assessment.
    • It identifies that the efficiency of the security policy quality assessment process is critical for organisations.
    • It formalises security policy quality assessment within policy development.
    • It develops a strategic information security policy quality model.
    • It identifies improvements that can be made to the security policy development lifecycle.
    The outcomes of this research contend that the security policy lifecycle can be improved by: enabling the identification of when different stakeholders should be involved; identifying those quality components that each of the different stakeholders should assess as part of the quality assessment; and showing organisations which quality components to include or ignore based on their individual circumstances. This leads to a higher quality information security policy, and should impact positively on an organisation's information security.
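    To make the multiple-constituency assessment idea concrete, the following is a minimal illustrative sketch (the stakeholder groups, quality components, scores and threshold are all hypothetical, not drawn from the thesis): each constituency rates each quality component of a policy, and components whose aggregated rating falls below a threshold are flagged for attention in the next revision of the lifecycle.

```python
# Hypothetical assessment: stakeholder group -> {quality component: score 1-5}
assessments = {
    "security managers": {"clarity": 4, "coverage": 3, "enforceability": 2},
    "end users":         {"clarity": 2, "coverage": 4, "enforceability": 3},
    "auditors":          {"clarity": 3, "coverage": 2, "enforceability": 2},
}

THRESHOLD = 3.0  # assumed minimum acceptable average rating

# Aggregate each component's ratings across all constituencies and flag
# the weak ones for the next iteration of the policy lifecycle.
components = {c for scores in assessments.values() for c in scores}
for component in sorted(components):
    ratings = [scores[component] for scores in assessments.values()]
    average = sum(ratings) / len(ratings)
    status = "OK" if average >= THRESHOLD else "flag for revision"
    print(f"{component:15s} avg={average:.1f}  {status}")
```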
  • Item
    Internet host geolocation based on probabilistic latency models
    Arif, Mohammed Jubaer (2010)
    The robust and scalable growth of the Internet has allowed value-added services that provide an enhanced user experience. Offering information based on geographic location to Internet users is one of the newest and most notable advancements. Finding the geographical location of a user on the Internet, commonly referred to as geolocation, is one of the challenging problems currently addressed by the research community. Geolocation of Internet hosts is not straightforward, mainly due to the underlying behavior of the IP network. In the conventional telephone network, each telephone number has a physical address associated with it. By contrast, an IP address has no notion of location or any physical location associated with it. Dynamic assignment of IP addresses makes the task of geolocation even harder. This thesis focuses on scalable and accurate approaches to finding the physical location of Internet users. We consider the scenario of a user connected to the Internet through a machine, which we refer to as a host. Of the two commonly used approaches, repository-based and measurement-based, we primarily focus on geolocating Internet hosts using the measurement-based approach. Existing measurement-based geolocation approaches are based on simple bounds on latency measurements and do not effectively take into account the uncertainty associated with latency measurements. We focus on improving host geolocation based on probabilistic models for Internet latency. In this thesis, we address the following issues:
    • We provide a systematic study to understand the relationship between Internet latency and distance. The observations from this study motivate us to propose a probabilistic model for the latency-to-distance relationship that captures the high variability of latency against distance.
    • We propose two distinct probabilistic latency models for the latency-to-distance relationship: strict and adaptable.
    • We propose and analyze two novel measurement-based geolocation techniques based on the latency models derived earlier.
    • The scalability and reliability of the proposed approaches are investigated, and further improvements are proposed taking into account relative landmark location and latency data anomalies.
    • We illustrate and demonstrate the applicability of geolocation techniques with different accuracy levels; in particular, with a novel Voice over IP (VoIP) architecture, Service Oriented VoIP (SOVoIP).
    The main contributions of the thesis are more accurate and scalable measurement-based geolocation techniques, suitable for Internet-scale geolocation.
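    As a rough illustration of how a probabilistic latency model can drive measurement-based geolocation (a sketch under assumed parameters, not the thesis's strict or adaptable models): each landmark's observed round-trip time induces a likelihood over distance, and the host is placed at the grid point maximising the joint likelihood. The Gaussian latency-to-distance model, the landmark coordinates and all constants below are assumptions for illustration only.

```python
import numpy as np

# Hypothetical landmarks: (x_km, y_km, measured_rtt_ms)
landmarks = [(0, 0, 24.0), (800, 0, 11.0), (400, 600, 18.0)]

def distance_likelihood(dist_km, rtt_ms, ms_per_100km=1.5, sigma_ms=4.0):
    """Likelihood of observing rtt_ms for a host dist_km away, under an
    assumed Gaussian latency-to-distance model (illustrative constants)."""
    expected = dist_km * ms_per_100km / 100.0
    return np.exp(-0.5 * ((rtt_ms - expected) / sigma_ms) ** 2)

# Evaluate the joint likelihood over a coarse grid and take its maximum,
# rather than intersecting hard distance bounds as bound-based methods do.
xs, ys = np.meshgrid(np.linspace(-200, 1000, 400), np.linspace(-200, 1000, 400))
joint = np.ones_like(xs)
for lx, ly, rtt in landmarks:
    dist = np.hypot(xs - lx, ys - ly)
    joint *= distance_likelihood(dist, rtt)
i, j = np.unravel_index(np.argmax(joint), joint.shape)
print(f"Estimated host position: ({xs[i, j]:.0f} km, {ys[i, j]:.0f} km)")
```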
  • Item
    Designing sports: exertion games
    Mueller, Florian (Floyd) (2010)
    Exertion games are computer games that require intense physical effort from their users. Unlike traditional computer games, exertion games offer physical health benefits in addition to the social benefits derived from networked games. This thesis contributes an understanding of exertion games from an interaction design perspective to support researchers in analysing, and designers in creating, more engaging exertion games. Playing with other participants can increase engagement and hence facilitate the associated benefits. Computer technology can support such social play by expanding the range of possible participants through networking advances. However, there is a lack of understanding of how technological design can facilitate the relationship between exertion and social play, especially in mediated environments. In response, this thesis establishes an understanding of how mediating technology can support social exertion play, in particular when players are in geographically distant locations. This understanding is forged through the design of three “sports over a distance” games. The experience of engaging with them was studied qualitatively to gain a rich understanding of how design facilitates social play in exertion games. The three games, “Jogging over a Distance”, “Table Tennis for Three”, and “Remote Impact - Shadowboxing over a Distance”, allow the investigation of different perspectives on mediated exertion play, since they represent three categories of richness on a social play continuum across both the virtual and the physical world. Studies of the experience of engaging with the three games resulted in an exertion framework that consists of six conceptual themes framed by four perspectives on the body and three on games. A fourth study demonstrated that the understanding derived from the investigation of the use and design of the games can support designers and researchers in the analysis of existing games and aid the creative process of designing new exertion games. This thesis provides the first understanding of how technology design facilitates social play in exertion games. In doing so, it expands our knowledge of how to design for the active body, broadening the view of the role of the body when interacting with computers. Offering an increased understanding of exertion games enables game designers to create more engaging games, hence giving players more reasons to exert their bodies and supporting them in profiting from the many benefits of exertion.
  • Item
    Scheduling and management of data intensive application workflows in grid and cloud computing environments
    Pandey, Suraj (2010)
    Large-scale scientific experiments are being conducted in collaboration with teams that are dispersed globally. Each team shares its data and utilizes distributed resources for conducting experiments. As a result, scientific data are replicated and cached at distributed locations around the world. These data are part of application workflows, which are designed to reduce the complexity of executing and managing applications on distributed computing environments. In order to execute these workflows in a time- and cost-efficient manner, a workflow management system must take into account the presence of multiple data sources in addition to the distributed compute resources provided by platforms such as Grids and Clouds. Therefore, this thesis builds upon an existing workflow architecture and proposes enhanced scheduling algorithms, specifically designed for managing data intensive applications. It begins with a comprehensive survey of scheduling techniques that formed the core of Grid systems in the past. It proposes an architecture that incorporates data management components and examines its practical feasibility by executing several real-world applications, such as Functional Magnetic Resonance Imaging (fMRI) and Evolutionary Multi-objective Optimization algorithms, using distributed Grid and Cloud resources. It then proposes several heuristics-based algorithms that take into account the time and cost incurred for transferring data from multiple sources while scheduling tasks. All the heuristics proposed are based on a multi-source parallel data retrieval technique, in contrast to retrieving data from a single best resource, as done in the past. In addition to a non-linear modeling approach, the thesis explores iterative techniques, such as particle-swarm optimization, to obtain schedules more quickly. In summary, this thesis makes several contributions towards the scheduling and management of data intensive application workflows. The major contributions are: (i) enhancing the abstract workflow architecture with components that handle multi-source parallel data transfers; (ii) deploying several real-world application workflows using the proposed architecture and testing the feasibility of the design on real test beds; (iii) proposing a non-linear model for scheduling workflows with the objective of minimizing both execution time and execution cost; (iv) proposing static and dynamic workflow scheduling heuristics that leverage the presence of multiple data sources to minimize total execution time; (v) designing and implementing a particle-swarm-optimization based heuristic that provides feasible solutions to the workflow scheduling problem with good convergence; and (vi) implementing a prototype workflow management system that consists of a portal as user interface, a workflow engine that implements all the proposed scheduling heuristics and the real-world application workflows, and plug-ins to communicate with Grid and Cloud resources.
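    As a sketch of how a particle-swarm-optimization scheduler of this kind can search for a task-to-resource mapping (the encoding, cost matrix and PSO constants below are illustrative assumptions, not the thesis's implementation):

```python
import random

NUM_TASKS, NUM_RESOURCES = 6, 3          # hypothetical workflow and testbed
SWARM, ITERS, W, C1, C2 = 20, 100, 0.7, 1.5, 1.5

# exec_cost[t][r]: assumed combined time+money cost of task t on resource r
exec_cost = [[random.uniform(1, 10) for _ in range(NUM_RESOURCES)]
             for _ in range(NUM_TASKS)]

def decode(pos):
    # Clamp each continuous coordinate to a valid resource index.
    return [min(NUM_RESOURCES - 1, max(0, round(p))) for p in pos]

def fitness(pos):
    return sum(exec_cost[t][r] for t, r in enumerate(decode(pos)))

# Standard PSO: particles move through the continuous encoding space,
# pulled toward their personal best and the global best mapping found.
pos = [[random.uniform(0, NUM_RESOURCES - 1) for _ in range(NUM_TASKS)]
       for _ in range(SWARM)]
vel = [[0.0] * NUM_TASKS for _ in range(SWARM)]
pbest = [p[:] for p in pos]
gbest = min(pbest, key=fitness)

for _ in range(ITERS):
    for i in range(SWARM):
        for d in range(NUM_TASKS):
            r1, r2 = random.random(), random.random()
            vel[i][d] = (W * vel[i][d]
                         + C1 * r1 * (pbest[i][d] - pos[i][d])
                         + C2 * r2 * (gbest[d] - pos[i][d]))
            pos[i][d] += vel[i][d]
        if fitness(pos[i]) < fitness(pbest[i]):
            pbest[i] = pos[i][:]
    gbest = min(pbest + [gbest], key=fitness)

print("Best mapping:", decode(gbest), "cost:", round(fitness(gbest), 2))
```

    In the thesis's setting, the fitness would combine execution time and monetary cost, including the time to retrieve data in parallel from multiple sources, rather than the single random cost matrix used here.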
  • Item
    Privacy-aware spatial surveys
    Xie, Hai Ruo (2010)
    Surveying spatial knowledge may invade the privacy of individuals, and the resulting privacy concerns may significantly affect the quality of the collected data. We address the privacy issues for two classes of monitoring: implicit monitoring and explicit monitoring. In implicit monitoring, spatial data is collected without notifying the individuals being monitored; it is a common means of surveying moving objects in large areas. Unlike implicit monitoring, explicit monitoring needs the consent of individuals to collect private data, and is normally conducted through public surveys. Both classes of monitoring can cause significant privacy issues. To address them, we propose three approaches in this research. For implicit monitoring, we develop two data structures. The first, Distributed Euler Histograms (DEHs), is designed for monitoring objects that can move freely in a space. It guarantees a high level of privacy protection by avoiding the collection of any identification information. The second, Euler Histograms on Short ID (EHSID), is suitable for monitoring objects with constrained movements, such as vehicles in road networks. A monitoring system built upon EHSID not only answers a broad range of aggregate queries on the spatial data, but also guarantees a high level of privacy protection, as real identification data is not collected. For explicit monitoring, we focus on an innovative type of public survey, negative surveys, which collect data that is complementary to the truth. We develop Gaussian Negative Surveys, which significantly improve accuracy over existing negative surveys.
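    To illustrate the basic negative-survey mechanism that this work builds on (a sketch of the classical uniform scheme; the thesis's Gaussian Negative Surveys bias the choice of reported category rather than choosing uniformly, which is not reproduced here): each respondent reports a category that is not their true one, and the surveyor estimates the true distribution from the negative counts alone.

```python
import random
from collections import Counter

K = 4                                                       # answer categories
true_answers = [random.randrange(K) for _ in range(10000)]  # hidden truth

# Each respondent reports a uniformly chosen category that is NOT theirs,
# so no individual's true answer is ever collected.
negative = [random.choice([c for c in range(K) if c != a]) for a in true_answers]

# Estimator for the uniform scheme: since E[r_i] = (N - t_i) / (K - 1),
# the true count of category i is t_i = N - (K - 1) * r_i, where r_i is
# the number of negative reports naming category i.
N = len(negative)
r = Counter(negative)
estimate = {i: N - (K - 1) * r[i] for i in range(K)}

print("true counts:     ", dict(Counter(true_answers)))
print("estimated counts:", estimate)
```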
  • Item
    Measurement in information retrieval evaluation
    Webber, William Edward (2010)
    Full-text retrieval systems employ heuristics to match documents to user queries. Retrieval correctness cannot, therefore, be formally proven, but must be evaluated through human assessment. To make evaluation automatable and repeatable, assessments of which documents are relevant to which queries are collected in advance, to form a test collection. Collection-based evaluation has been the standard in retrieval experiments for half a century, but only recently have its statistical foundations been considered. This thesis makes several contributions to the reliable and efficient measurement of the behaviour and effectiveness of information retrieval systems. First, the high variability in query difficulty makes effectiveness scores difficult to interpret, analyze, and compare. We therefore propose the standardization of scores, based on the observed results of a set of reference systems for each query. We demonstrate that standardization controls variability and enhances comparability. Second, while testing evaluation results for statistical significance has been established as standard practice, the importance of ensuring that significance can be reliably achieved for a meaningful improvement (the power of the test) is poorly understood. We introduce the use of statistical power analysis to the field of retrieval evaluation, finding that most test collections cannot reliably detect incremental improvements in performance. We also demonstrate the pitfalls in predicting score standard deviation during design-phase power analysis, and offer some pragmatic methodological suggestions. Third, in constructing a test collection, it is not feasible to assess every document for relevance to every query. The practice instead is to run a set of systems against the collection, and pool their top results for assessment. Pooling is potentially biased against systems which are neither included in nor similar to the pooled set. We propose a robust, empirical method for estimating the degree of pooling bias, through performing a leave-one-out experiment on fully pooled systems and adjusting unpooled scores accordingly. Fourth, there are many circumstances in which one wishes directly to compare the document rankings produced by different retrieval systems, independent of their effectiveness. These rankings are top-weighted, non-conjoint, and of arbitrary length, and no suitable similarity measures have been described for such rankings. We propose and analyze such a rank similarity measure, called rank-biased overlap, and demonstrate its utility, on real and simulated data. Finally, we conclude the thesis with an examination of the state and function of retrieval evaluation. A survey of published results shows that there has been no measurable improvement in retrieval effectiveness over the past decade. This lack of progress has been obscured by the general use of uncompetitive baselines in published experiments, producing the appearance of substantial and statistically significant improvements for new systems without actually advancing the state of the art.
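    The rank-biased overlap (RBO) measure proposed in this thesis has a simple form: with a persistence parameter p, RBO = (1 - p) Σ_d p^(d-1) A_d, where A_d is the fraction of items the two depth-d prefixes share. Below is a minimal sketch of the truncated computation over finite prefixes, ignoring the extrapolation to the unseen tails and assuming rankings of distinct items.

```python
def rbo_truncated(s, t, p=0.9):
    """Truncated rank-biased overlap of two rankings (lists of distinct ids).
    Computes (1 - p) * sum_{d=1..k} p^(d-1) * A_d, where A_d is the fraction
    of items the two depth-d prefixes share; the contribution of the
    infinite tail beyond the observed prefixes is ignored."""
    k = min(len(s), len(t))
    seen_s, seen_t, overlap, score = set(), set(), 0, 0.0
    for d in range(1, k + 1):
        x, y = s[d - 1], t[d - 1]
        # Grow the running size of the prefix intersection at depth d.
        if x == y:
            overlap += 1
        else:
            overlap += (x in seen_t) + (y in seen_s)
        seen_s.add(x)
        seen_t.add(y)
        score += p ** (d - 1) * (overlap / d)
    return (1 - p) * score

print(rbo_truncated(list("abcde"), list("bacde")))  # similar -> high score
print(rbo_truncated(list("abcde"), list("vwxyz")))  # non-conjoint -> 0.0
```

    The geometric weights p^(d-1) give the measure its top-weightedness: disagreement deep in the rankings costs far less than disagreement at the top, which is what makes the measure suitable for indefinitely long, non-conjoint retrieval rankings.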
  • Item
    Algorithms for the study of RNA and protein structure
    Stivala, Alexander David (2010)
    The growth in the number of known sequences and structures of RNA and protein molecules has led to the need to solve many computationally demanding problems in the analysis of RNA and protein structure. This thesis describes algorithms for the structural comparison of RNA and protein molecules. In the case of proteins, it also describes a technique for automatically generating two-dimensional diagrammatic representations for visual comparison. A general technique for parallelizing dynamic programs in a straightforward way, by means of a shared lock-free hash table implementation and randomization of the subproblem ordering, is described. This generic approach is applied to several well-known dynamic programs, as well as to a dynamic program for the structural alignment of RNA molecules by aligning their base pairing probability matrices. Two algorithms for protein structure and substructure searching are described. These algorithms are also capable of finding non-sequential matches, that is, matches between structures where the sequential order of secondary structure elements is not preserved. The first algorithm is based on the relaxation of an earlier quadratic integer program (QIP) formulation to a quadratic program (QP). The second algorithm uses the same formulation but approximates it using simulated annealing; it is shown that this results in significant increases in speed. This algorithm is also capable of greater accuracy when assessed as a fold recognition method. A parallel implementation of this algorithm on modern graphics processing unit (GPU) hardware is also described. This parallel implementation results in a further significant speedup and, to the best of our knowledge, is the first use of a GPU for the protein structural search problem. Finally, a system to automatically generate two-dimensional representations of protein structure is described. Such diagrams are particularly useful in analysing complex protein folds. A method for using these diagrams as an interface to the protein substructure search methods is also described.
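    A minimal sketch of the parallelisation technique follows, using a toy dynamic program (edit distance) with Python threads and a shared dictionary standing in for the lock-free hash table (the thesis targets lower-level concurrency primitives): worker threads evaluate subproblems in randomised order, so each thread reuses results that others have already stored, and a duplicated computation is merely wasted work, never an error.

```python
import random
import threading

A, B = "intention", "execution"    # toy problem instance
memo = {}                          # shared memo table (stands in for the
                                   # lock-free hash table of the thesis)

def edit_distance(i, j):
    """Levenshtein distance between A[:i] and B[:j], memoised top-down."""
    if (i, j) in memo:
        return memo[(i, j)]
    if i == 0 or j == 0:
        result = i + j
    else:
        # Evaluate the three dependent subproblems in random order, so
        # concurrent threads tend to explore different parts of the DP
        # and share intermediate results through the memo table.
        deps = [(i - 1, j), (i, j - 1), (i - 1, j - 1)]
        random.shuffle(deps)
        vals = {d: edit_distance(*d) for d in deps}
        result = min(vals[(i - 1, j)] + 1,
                     vals[(i, j - 1)] + 1,
                     vals[(i - 1, j - 1)] + (A[i - 1] != B[j - 1]))
    memo[(i, j)] = result          # a racing duplicate store is harmless
    return result

threads = [threading.Thread(target=edit_distance, args=(len(A), len(B)))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(memo[(len(A), len(B))])      # -> 5
```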
  • Item
    Visualising the impact of changes to precision grammars
    Letcher, Ned (2010)
    The development of precision grammars is an inherently resource-intensive process. In this thesis we investigate approaches for providing grammar engineers with greater feedback on the impact of changes made to grammars. We describe two different visualisations, which are created by comparing parser output from two different states of the grammar. The first involves ranking the features found in parser output according to their magnitude of change, so as to provide a low-level picture of the affected parts of the grammar. The second involves clustering sentences whose parsability has changed, in an attempt to find related groups of changes and accompanying sentences which exemplify each locus of change. These approaches provide complementary avenues of feedback which we hope will improve the efficiency of the grammar engineering development process.
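    A minimal sketch of the core computation behind the first visualisation, under assumed inputs (the feature names, counts and smoothing constant are hypothetical): features extracted from parser output under two grammar states are ranked by a signed log-ratio of their frequencies.

```python
import math
from collections import Counter

# Hypothetical feature counts extracted from parser output for the same
# test profile under two states of the grammar (e.g. rule frequencies).
before = Counter({"subj-head": 120, "head-comp": 300, "adj-head": 45})
after = Counter({"subj-head": 118, "head-comp": 150, "adj-head": 90,
                 "new-rule": 30})

def change_magnitude(feature, smoothing=1.0):
    """Signed log-ratio of a feature's frequency across the two grammar
    states; add-one smoothing keeps features absent on one side finite."""
    return math.log((after[feature] + smoothing) / (before[feature] + smoothing))

# Rank features by absolute magnitude of change, largest first.
for f in sorted(before.keys() | after.keys(),
                key=lambda f: abs(change_magnitude(f)), reverse=True):
    print(f"{f:12s} {change_magnitude(f):+.2f}")
```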
  • Item
    Meta scheduling for market-oriented grid and utility computing
    Garg, Saurabh Kumar (2010)
    Grid computing enables the sharing and aggregation of autonomous IT resources to deliver them as computing utilities to end users. The management of the Grid environment is a complex task, as resources are geographically distributed, heterogeneous and autonomous in nature, and their users are self-interested. In utility-oriented Grids, users define their application requirements and compete to access the most efficient and cheapest resources. Traditional resource management systems and algorithms are based on system-centric approaches which do not take into account individual requirements and interests. To this end, market-oriented scheduling is an adequate way to solve the problem. However, current market-oriented systems generally try to maximise either a single user's utility or a single provider's utility. Such approaches fail to solve the problem of contention for cheap and efficient resources, which may lead to unnecessary delays in job execution and under-utilisation of resources. To address these problems, this thesis proposes a market-oriented meta-scheduler called “Meta-Broker”, which not only coordinates the resource demand but also allocates the best resources to users in terms of monetary and performance costs. The thesis results demonstrate that considerable cost reduction and throughput gains can be achieved by adopting our proposed approach. The meta-broker has a semi-decentralised architecture, where only scheduling decisions are made by the meta-broker, while job submission, execution and monitoring are delegated to user and provider middleware. This thesis also investigates market-oriented meta-scheduling algorithms which aim to maximise the utility of participants. The market-oriented algorithms consider the Quality of Service (QoS) requirements of multiple users to map jobs onto autonomous and heterogeneous resources. This thesis also presents a novel Grid Market Exchange architecture which gives users the flexibility to choose their own negotiation protocol for resource trading. The key research findings and contributions of this thesis are:
    • The consideration of the QoS requirements of all users is necessary for maximising users' utility and the utilisation of resources. The uncoordinated scheduling of applications by personalised user-brokers leads to the overloading of cheap and efficient resources.
    • It is important to exploit the heterogeneity between different resource sites/data centers while scheduling jobs to maximise the provider's utility. This consideration not only reduces the energy cost of computing infrastructure by 33% on average, but also enhances the efficiency of resources in terms of carbon emissions.
    • By considering both system metrics and market parameters, we can enable more effective scheduling which maximises the utility of both users and resource providers.
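    A minimal sketch of the coordination idea behind such a meta-broker, with hypothetical jobs, resources and prices: each job is assigned to the resource minimising a weighted combination of monetary cost and completion time, and tracking queued work keeps the cheapest resource from being blindly overloaded.

```python
# Hypothetical resources: (name, price per unit of work, processing speed)
resources = [("cheap-slow", 1.0, 1.0), ("mid", 2.0, 2.0), ("fast-dear", 4.0, 4.0)]
queued = {name: 0.0 for name, _, _ in resources}   # pending work per resource

def completion_time(work, name, speed):
    # A new job must wait behind work already queued on the resource.
    return (queued[name] + work) / speed

def allocate(work, alpha=0.5):
    """Assign a job to the resource minimising a weighted sum of monetary
    cost and completion time; alpha trades off money against time (QoS)."""
    def score(res):
        name, price, speed = res
        return alpha * price * work + (1 - alpha) * completion_time(work, name, speed)
    name, _, _ = min(resources, key=score)
    queued[name] += work
    return name

jobs = [10, 10, 10, 10, 10, 10]
print([allocate(w) for w in jobs])
# Without the waiting-time term every job would pile onto the cheapest
# resource; the queue component spreads load as contention builds up.
```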
  • Item
    The information needs of informal carers: a framework to support a comprehensive understanding
    Alzougool, Basil (2010)
    There has been little research that fully explores the nature and aspects of the information needs of informal carers, or that provides a complete account of these needs. A comprehensive understanding of these needs is important because it is the first step towards meeting them effectively. This research project set out to identify all the potential aspects of informal carers' information needs that could be used for investigating, understanding and classifying these needs comprehensively. Three main studies were undertaken. The project initially adopted a conceptual approach (study 1) to develop a framework for these aspects. This framework was then tested, refined and validated using a qualitative approach (study 2), followed by a quantitative approach (study 3). The conceptual study involved critically analysing the existing literature on the information needs of informal carers, as well as related literature on health informatics and information needs more generally. Following this analysis, a conceptual framework was developed that encompassed the potential aspects of informal carers' information needs. This framework divides these aspects into two broad types: (i) the information needs focus, which includes four foci (the persons needing care, the informal carers themselves, the interaction between informal carers and persons needing care, and the interaction between informal carers and other parties); and (ii) the information needs state, which includes four states (recognised-demanded, recognised-undemanded, unrecognised-demanded, and unrecognised-undemanded). The qualitative study involved conducting two separate but related sub-studies (study 2A and study 2B) with nine informal carers of diabetic children, as an example of a group of informal carers. Study 2A aimed to confirm or disconfirm the existence of the four foci of information needs of informal carers and to identify any additional foci. Study 2B aimed to confirm or disconfirm the existence of the four states of information needs of informal carers and to identify any additional states. It also aimed to develop items for the questionnaire that was used to measure the information needs state with a large sample of informal carers in the quantitative study. The quantitative study employed a questionnaire with 198 informal carers (aged over 18) of all kinds. It aimed to test the defined information needs states, to examine whether they were distinct from each other in the wider community of informal carers, and to examine whether they varied with the demographic and socioeconomic characteristics of informal carers. The existence of the two broad types of aspects of information needs of informal carers was confirmed empirically. This research also showed that informal carers may give priority to some foci over others at different stages of their caring journey, and that the occurrence and frequency of the four states may vary among informal carers. Thus the validated framework worked well in portraying a comprehensive picture of the information needs of informal carers in this study. These two different aspects of information needs of informal carers are in turn useful to researchers and practitioners. For researchers, these two aspects provide a new perspective from which to better investigate, understand and classify the information needs of informal carers and information needs in general.
For practitioners, these two aspects assist in designing and providing information that may meet the needs of informal carers more effectively.