Computing and Information Systems - Theses

Search Results

Now showing 1 - 10 of 30
  • Item
    Safe acceptance of zero-confirmation transactions in Bitcoin
    Yang, Renlord ( 2016)
    Acceptance of zero-confirmation transactions in Bitcoin is inherently unsafe due to the lack of state consistency between nodes in the network. As a consequence, Bitcoin users must endure a mean wait time of 10 minutes to accept confirmed transactions. Even then, due to the possibility of forks in the blockchain, users who want to avoid invalidation risk entirely may have to wait for 6 confirmations, which in turn results in a 60-minute mean wait time. This is untenable and remains a deterrent to the utility of Bitcoin as a payment method for merchants. Our work addresses this problem by introducing a novel insurance scheme that guarantees a deterministic outcome for transaction recipients. The proposed insurance scheme uses standard Bitcoin scripts and transactions to produce inter-dependent transactions which are triggered or invalidated by the occurrence of potential double-spend attacks. A library to set up the insurance scheme and a test suite were implemented for anyone interested in using this fully anonymous and trustless scheme. In our tests on Testnet, the insurance scheme successfully defended against 10 out of 10 double-spend attacks.
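The condition that drives such a scheme, two transactions spending the same output, is simple to state. A minimal sketch in Python (the function and data shapes are illustrative, not the thesis's library):

```python
# Hypothetical sketch: the event the insurance scheme reacts to is a double-spend,
# i.e. two distinct transactions consuming the same transaction output (outpoint).
# Transaction structures here are simplified for illustration.

def find_double_spends(transactions):
    """Return pairs of transaction ids that spend the same outpoint."""
    seen = {}          # (prev_txid, vout) -> txid of first observed spender
    conflicts = []
    for tx in transactions:
        for outpoint in tx["inputs"]:
            if outpoint in seen and seen[outpoint] != tx["txid"]:
                conflicts.append((seen[outpoint], tx["txid"]))
            else:
                seen.setdefault(outpoint, tx["txid"])
    return conflicts

txs = [
    {"txid": "a1", "inputs": [("prev0", 0)]},
    {"txid": "b2", "inputs": [("prev0", 0)]},   # same outpoint as a1: conflict
    {"txid": "c3", "inputs": [("prev1", 1)]},
]
print(find_double_spends(txs))   # [('a1', 'b2')]
```

In the actual scheme this detection is what triggers or invalidates the pre-arranged insurance transactions on-chain.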
  • Item
    Automatic caloric expenditure estimation with smartphone's built-in sensors
    Cabello Wilson, Nestor Stiven ( 2016)
    Fitness-tracking systems are commonly used to enhance people's lifestyles. Feedback, usability, and ease of acquisition are fundamental to achieving good physical condition. Users need constant motivation to keep their interest in a fitness system and, consequently, to stay on a healthy lifestyle track. However, although feedback is increasingly incorporated in many fitness-tracking systems, usability and ease of acquisition remain shortcomings that need to be addressed. Features such as automatic activity identification, low energy consumption, simplicity and goals-achieved notifications provide a good user experience. Nevertheless, most of these functions require the acquisition of a relatively expensive fitness-tracking device. Smartphones provide a partial solution by giving users easy access to multiple fitness applications, which reduces the need to purchase another gadget; nonetheless, improvements in the user experience are still necessary. On the other hand, wearable devices offer good usability, but their cost is an impediment for some users. The system proposed in this research aims to handle these issues by combining the benefits of mobile applications, such as feedback and ease of acquisition, with the usability that wearable devices provide, in a smartphone Android application. Data collected from a single user performing a series of common daily activities, namely walking, jogging, cycling, climbing stairs, and walking downstairs, was used to classify and automatically identify these activities with an overall accuracy of 91%, and an accuracy of 81% for the stairs activities.
Finally, the caloric expenditure, which we considered the most important metric for motivating a user to perform a physical activity, was estimated by following the oxygen consumption equations from the American College of Sports Medicine (ACSM).
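For reference, the ACSM walking equation referred to above can be computed directly; the constants below are the published ACSM values, and the conversion from oxygen uptake to energy uses the common approximation of roughly 5 kcal per litre of O2 (the thesis's exact implementation may differ):

```python
# Illustrative sketch of caloric-expenditure estimation via the ACSM walking
# equation. Constants are the standard published ACSM values; the inputs below
# are invented for demonstration.

def walking_vo2(speed_m_per_min, grade=0.0):
    """ACSM walking equation: oxygen uptake in ml/kg/min.
    speed in metres per minute, grade as a fraction (e.g. 0.05 for 5%)."""
    return 0.1 * speed_m_per_min + 1.8 * speed_m_per_min * grade + 3.5

def kcal_per_min(vo2_ml_kg_min, weight_kg):
    """Convert relative VO2 to energy expenditure (~5 kcal per litre of O2)."""
    litres_o2_per_min = vo2_ml_kg_min * weight_kg / 1000.0
    return litres_o2_per_min * 5.0

vo2 = walking_vo2(80.0)                 # ~4.8 km/h on flat ground
print(round(kcal_per_min(vo2, 70.0), 2))  # energy cost for a 70 kg walker
```

Different equations apply to other activities (e.g. running), which is why automatic activity identification precedes the energy estimate.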
  • Item
    A secure innovation process for start-ups: Minimising knowledge leakage and protecting IP
    Pitruzzello, Sam ( 2016)
    Failing to profit from innovations as a result of knowledge leakage is a key business risk for high-tech start-ups. Innovation is central to a start-up's success and competitive advantage in the marketplace; therefore, methods to protect intellectual property (IP) and minimise knowledge leakage are crucial. However, high-tech start-ups have limited resources, rendering them more vulnerable to knowledge leakage risks than mature enterprises. Unfortunately, research on knowledge leakage and innovation processes falls short of addressing the needs of high-tech start-ups. Since knowledge leakage can occur in a number of ways involving many scenarios, organisations typically employ a variety of IP protection and knowledge leakage mitigation methods to minimise the risks. This minor thesis fills the research gaps on innovation processes and knowledge leakage for start-ups. A literature review was conducted into the bodies of research on knowledge leakage and innovation, and a secure innovation process (SIP) model was then developed from this research. SIP includes the concept of the risk window, which allows a start-up to identify, assess and manage knowledge leakage risks at various stages of the innovation process.
  • Item
    An exploratory study of information security auditing
    Kudallur Ramanathan, Ritu Lakshmi ( 2016)
    Management of information security in organizations is a form of risk management in which threats to information assets are managed by implementing various controls. An important task in this cycle of information security risk management is audit, whose function is to provide assurance to organizations that their security controls are indeed working as intended. Numerous frameworks and guidelines are available for auditing information security; however, there is scant empirical evidence on the process followed in practice. This research explores how security audits are conducted in practice. To do so, a qualitative study was conducted in which 11 auditors were interviewed. The findings indicate a gap between what is expected of audit and what actually happens in practice. On exploring the accounting roots of audit, we postulate that this gap is due to differences in the conceptualization of risk between the accounting and information security disciplines.
  • Item
    On the predictability and efficiency of cultural markets with social influence and position biases
    Abeliuk Kimelman, Andrés ( 2016)
    Every day people make a staggering number of decisions about what to buy, what to read and where to eat. The interplay between individual choices and collective opinion is responsible for much of the observed complexity of social behaviors. The impact of social influence on the behavior of individuals may distort the quality perceived by customers, putting quality and popularity out of sync. Understanding how people respond to this information will enable us to predict social behavior and even steer it towards desired goals. In this thesis, we take a step forward by studying how and to what extent one can optimize cultural markets to reduce the unpredictability and improve the efficiency of the market. Our results contrast with earlier work, which focused on showing the unpredictability and inequalities created by social influence. We show, experimentally and theoretically, that social influence can help correctly identify high-quality products and that much of its induced unpredictability can be controlled. We study a dynamic process in which choices are affected by social influence and by the position in which products are displayed. This model is used to explore the evolution of cultural markets under different policies on how items are displayed. We show that in the presence of social signals, by leveraging position effects, one can increase the expected profit and reduce the unpredictability of cultural markets. In particular, we propose two policies for displaying products and prove that the limiting distribution of market shares converges to a monopoly for the product of highest quality, making the market both optimal and predictable asymptotically. Finally, we put our theoretical results to the experimental test and show a policy that mitigates the disparities between popularity and quality that emerge from social and position biases. We report results from a randomized social experiment that we conducted online.
    The experiment consisted of a web interface displaying science news articles that participants could read and later recommend. We evaluated different policies for presenting items to people and measured their impact on the unpredictability of the market. Our results provide a unique insight into the impact of policy decisions about how products are displayed on the dynamics of cultural markets.
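A toy version of such a dynamic process, with choice probabilities shaped by both display position and past popularity, might look like the following (this is an illustrative model of social influence and position bias, not the thesis's actual market model or display policies):

```python
# Toy trial-offer market: at each step a product is chosen with probability
# proportional to (position visibility) x (quality-weighted social appeal).
# Qualities, decay rates and policies are invented for illustration.
import random

def simulate(qualities, steps, policy, seed=0):
    rng = random.Random(seed)
    n = len(qualities)
    purchases = [0] * n
    for _ in range(steps):
        # The display policy decides the order in which products are shown.
        if policy == "popularity":
            order = sorted(range(n), key=lambda i: -purchases[i])
        else:  # "quality": rank products by (estimated) quality
            order = sorted(range(n), key=lambda i: -qualities[i])
        weights = []
        for pos, i in enumerate(order):
            visibility = 1.0 / (pos + 1)              # position bias
            appeal = qualities[i] * (1 + purchases[i])  # social signal
            weights.append(visibility * appeal)
        chosen = rng.choices(order, weights=weights)[0]
        purchases[chosen] += 1
    return purchases

shares = simulate([0.9, 0.5, 0.2], steps=2000, policy="quality")
print(shares)   # under this toy model, high quality tends to dominate
```

Comparing `policy="quality"` against `policy="popularity"` over many seeds gives a feel for how display policy changes market concentration and predictability.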
  • Item
    Unsupervised all-words sense distribution learning
    Bennett, Andrew ( 2016)
    There has recently been significant interest in unsupervised methods for learning word sense distributions, or most frequent sense information, in particular for applications where sense distinctions are needed. In addition to their direct application to word sense disambiguation (WSD), particularly where domain adaptation is required, these methods have been successfully applied to diverse problems such as novel sense detection and lexical simplification. Furthermore, they could be used to supplement or replace existing sources of sense frequencies, such as SemCor, which have many significant flaws. However, a major gap in past work on sense distribution learning is that it has never been optimised for large-scale application to the entire vocabulary of a language, as would be required to replace sense frequency resources such as SemCor. In this thesis, we develop an unsupervised method for all-words sense distribution learning which is suitable for language-wide application. We first optimise and extend HDP-WSI, an existing state-of-the-art sense distribution learning method based on HDP topic modelling. This is mostly achieved by replacing HDP with the more efficient HCA topic modelling algorithm to create HCA-WSI, which is over an order of magnitude faster than HDP-WSI and more robust. We then apply HCA-WSI across the vocabularies of several languages to create LexSemTm, a multilingual sense frequency resource of unprecedented size. Of note, LexSemTm contains sense frequencies for approximately 88% of polysemous lemmas in Princeton WordNet, compared to only 39% for SemCor, and the quality of data in each is shown to be roughly equivalent. Finally, we extend our sense distribution learning methodology to multiword expressions (MWEs), which to the best of our knowledge is a novel task (as is applying any kind of general-purpose WSD method to MWEs).
We demonstrate that sense distribution learning for MWEs is comparable to that for simplex lemmas in all important respects, and we expand LexSemTm with MWE sense frequency data.
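The general shape of topic-modelling-based sense distribution learning can be sketched as follows: topics induced from a lemma's usages are aligned to the lemma's dictionary senses by a similarity score and weighted by topic prevalence. The alignment values below are invented for illustration; this is not the HDP-WSI/HCA-WSI code itself:

```python
# Sketch: combine topic prevalence with topic-sense similarity to obtain a
# probability distribution over a lemma's senses. All numbers are illustrative.

def sense_distribution(topic_prevalence, topic_sense_sim):
    """topic_prevalence: list of topic proportions (sums to 1).
    topic_sense_sim[t][s]: similarity of topic t to sense s.
    Returns a normalised distribution over senses."""
    n_senses = len(topic_sense_sim[0])
    scores = [0.0] * n_senses
    for prev, sims in zip(topic_prevalence, topic_sense_sim):
        for s, sim in enumerate(sims):
            scores[s] += prev * sim
    total = sum(scores)
    return [x / total for x in scores]

# Two induced topics for a lemma with three dictionary senses:
dist = sense_distribution([0.7, 0.3],
                          [[0.6, 0.3, 0.1],    # topic 0 aligns with sense 0
                           [0.1, 0.2, 0.7]])   # topic 1 aligns with sense 2
print([round(p, 3) for p in dist])   # [0.45, 0.27, 0.28]
```

The first sense in the resulting distribution would be predicted as the most frequent sense for this lemma.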
  • Item
    Simulation of whole mammalian kidneys using complex networks
    Gale, Thomas ( 2016)
    Modelling of kidney physiology can contribute to understanding of kidney function by formalising existing knowledge into mathematical equations and computational procedures. Modelling in this way can suggest further research or stimulate theoretical development. The quantitative description provided by the model can then be used to make predictions and identify further areas for experimental or theoretical research, which can then be carried out, focusing on areas where the model and reality differ, creating an iterative process of improved understanding. Better understanding of organ function can contribute to the prevention and treatment of disease, as well as to efforts to engineer artificial organs. Existing research in the area of kidney modelling generally falls into one of three categories:
    • Morphological and anatomical models that describe the form and structure of the kidney
    • Tubule and nephron physiological models that describe the function of small internal parts of the kidney
    • Whole kidney physiological models that describe aggregate function but without any internal detail
    There is little overlap or connection between these categories of kidney models as they currently exist. This thesis brings together these three types of kidney models by computer-generating an anatomical model using data from rat kidneys, simulating dynamics and interactions using the resulting whole rat kidney model with explicit representation of each nephron, and comparing the simulation results against physiological data from rats. This thesis also describes methods for simulation and analysis of the physiological model using high performance computer hardware. In unifying the three types of models above, this thesis makes the following contributions:
    • Development of methods for automated construction of anatomical models of arteries, nephrons and capillaries based on rat kidneys. These methods produce a combined network and three-dimensional Euclidean space model of kidney anatomy.
    • Extension of complex network kidney models to include modelling of blood flow in an arterial network and modelling of vascular coupling communication between nephrons using the same arterial network.
    • Development of methods for simulation of kidney models on high performance computer hardware, and storage and analysis of the resulting data. The methods used include multithreaded parallel computation and GPU hardware acceleration.
    • Analysis of results from whole kidney simulations explicitly modelling all nephrons in a rat kidney, including comparison with animal data at both the whole organ level and the nephron level. Analysis methods that bring together the three-dimensional Euclidean space representation of anatomy with the complex network used for simulation are developed and applied.
    • Demonstration that the computational methods presented are able to scale up to the quantities of nephrons found in human kidneys.
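As a small illustration of the kind of calculation a network blood-flow model performs, the sketch below computes flow through a toy two-level arterial tree using Hagen-Poiseuille resistances (vessel dimensions, units and the tree structure are invented; the thesis's model is far larger and more detailed):

```python
# Toy arterial-tree flow: each vessel contributes a Hagen-Poiseuille resistance,
# daughter branches combine in parallel, and flow follows pressure / resistance.
# All geometry below is illustrative, in arbitrary units.
import math

def poiseuille_resistance(length, radius, viscosity=1.0):
    """Hagen-Poiseuille resistance of a cylindrical vessel."""
    return 8 * viscosity * length / (math.pi * radius ** 4)

def tree_resistance(node):
    """Resistance of a vessel plus its daughter subtrees combined in parallel."""
    own = poiseuille_resistance(node["length"], node["radius"])
    children = node.get("children", [])
    if not children:
        return own
    inv = sum(1.0 / tree_resistance(c) for c in children)
    return own + 1.0 / inv

# One parent vessel feeding two equal daughter vessels:
tree = {"length": 10.0, "radius": 1.0, "children": [
    {"length": 5.0, "radius": 0.7},
    {"length": 5.0, "radius": 0.7},
]}
pressure_drop = 100.0
flow = pressure_drop / tree_resistance(tree)
print(round(flow, 3))
```

A whole-kidney model repeats this kind of computation over thousands of vessels and couples it to nephron dynamics, which is what motivates the multithreaded and GPU-accelerated methods described above.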
  • Item
    Machine learning for feedback in massive open online courses
    He, Jizheng ( 2016)
    Massive Open Online Courses (MOOCs) have received widespread attention for their potential to scale higher education, with multiple platforms such as Coursera, edX and Udacity recently appearing. Online courses from elite universities around the world are offered for free, so that anyone with internet access can learn anywhere. Enormous enrolments and diversity of students have been widely observed in MOOCs. Despite their popularity, MOOCs are limited in reaching their full potential by a number of issues. One of the major problems is the notoriously low completion rates. A number of studies have focused on identifying the factors leading to this problem. One of the factors is the lack of interactivity and support. There is broad agreement in the literature that interaction and communication play an important role in improving student learning. It has been indicated that interaction in MOOCs helps students ease their feelings of isolation and frustration, develop their own knowledge, and improve learning experience. A natural way of improving interactivity is providing feedback to students on their progress and problems. MOOCs give rise to vast amounts of student engagement data, bringing opportunities to gain insights into student learning and provide feedback. This thesis focuses on applying and designing new machine learning algorithms to assist instructors in providing student feedback. In particular, we investigate three main themes: i) identifying at-risk students not completing courses as a step towards timely intervention; ii) exploring the suitability of using automatically discovered forum topics as instruments for modelling students' ability; iii) similarity search in heterogeneous information networks. The first theme can be helpful for assisting instructors to design interventions for at-risk students to improve retention. The second theme is inspired by recent research on measurement of student learning in education research communities. 
    Educators have explored the suitability of using latent, complex patterns of engagement, instead of traditional visible assessment tools (e.g. quizzes and assignments), to measure a hypothesised distinctive and complex learning skill that promotes learning in MOOCs. This process is often human-intensive and time-consuming. Inspired by this research, together with the importance of MOOC discussion forums for understanding student learning and providing feedback, we investigate whether students' participation across forum discussion topics can indicate their academic ability. The third theme is a generic study of utilising the rich semantic information in heterogeneous information networks to help find similar objects. MOOCs contain diverse and complex student engagement data, which is a typical example of a heterogeneous information network, and so could benefit from this study. We make the following contributions to solving the above problems. Firstly, we propose transfer learning algorithms based on regularised logistic regression to identify, week by week, students who are at risk of not completing courses. The resulting well-calibrated and smoothed predicted probabilities can be used not only for the identification of at-risk students but also for subsequent interventions. We envision an intervention that presents the probability of success/failure to borderline students, with the hypothesis that they can be motivated by being classified as "nearly there". Secondly, we combine topic models with measurement models to discover topics from students' online forum postings. The topics are constrained to fit measurement models as statistical evidence of instruments for measuring student ability. In particular, we focus on two measurement models, the Guttman scale and the Rasch model.
    To the best of our knowledge, this is the first study to explore the suitability of using topics discovered from MOOC forum content as instruments for measuring student ability, by combining topic models with psychometric measurement models in this way. Furthermore, these scaled topics imply a range of difficulty levels, which can be useful for monitoring the health of a course, refining curricula and student assessment, and providing personalised feedback based on student ability levels and topic difficulty levels. Thirdly, we extend an existing meta-path-based similarity measure by incorporating transitive similarity and temporal dynamics in heterogeneous information networks, evaluated using the DBLP bibliographic network. The proposed similarity measure could apply in MOOC settings to find similar students or threads, or for thread recommendation in MOOC forums, by modelling student interactions in MOOC forums as a heterogeneous information network.
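For readers unfamiliar with the Rasch model mentioned above: it gives the probability of success as a logistic function of the gap between a student's ability and an item's difficulty. A minimal sketch (ability and difficulty values here are illustrative, not estimates from the thesis's MOOC data):

```python
# Rasch model: P(success) = exp(theta - b) / (1 + exp(theta - b)),
# where theta is student ability and b is item (here: topic) difficulty.
import math

def rasch_probability(theta, b):
    """Probability that a student of ability theta succeeds on an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# A student of average ability (theta = 0) against topics of rising difficulty:
for difficulty in (-1.0, 0.0, 1.0):
    print(round(rasch_probability(0.0, difficulty), 3))   # 0.731, 0.5, 0.269
```

Fitting discovered forum topics to this model is what lets topic participation act as statistical evidence of student ability, with each topic assigned a difficulty on the same scale.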
  • Item
    Large scale real-time traffic flow prediction using SCATS volume data
    Panda, Rabindra ( 2016)
    Road traffic congestion is a global issue that results in significant wastage of time and resources. Rising population, urbanisation, growing economies and affordable personal vehicles aggravate the issue. Many cities have been trying to mitigate this by expanding and modernising their transportation infrastructure. Even though increasing road capacity accommodates travel demand, studies have shown that this does not eliminate the congestion problem. Hence, since the 1970s, advanced traffic management systems have been used to address congestion. But for these systems to increase their operational efficiency and fully realise their effectiveness, they need short-term predictive capabilities, usually ranging from a few seconds to a few hours ahead. Research in short-term traffic prediction has been active since the 1970s, and numerous models have been proposed to use the traffic data collected by inductive loop detectors for short-term prediction. Most of this work has shown promising results in experiments at particular locations; however, a robust and globally adaptable solution is yet to be found. In the last decade, attention has shifted from theoretically well-established parametric methods to non-parametric, data-driven algorithms, and this work is an extension of that trend. Neural networks have always been among the most capable mathematical models for capturing complex non-linear relations. Until 2006, their use was hindered by practical issues related to training, but recent breakthroughs in training deep neural architectures have allowed them to re-emerge and realise the capabilities they had promised. In this thesis we study and extend their application to short-term traffic prediction. We applied three deep recurrent neural networks (simple RNN, LSTM and GRU) to predicting short-term traffic volumes.
    The goal was to exploit both the temporal and spatial relationships present in traffic flow data. We used these networks in univariate and multivariate settings to make predictions at a single location and at multiple locations, respectively. For this work we used the volume data collected by VicRoads in Melbourne. We compared our results with several existing methods and found them promising.
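The data preparation common to all three recurrent models, turning a volume time series into fixed-length input windows with a next-step target, can be sketched as follows (the window length and volume values are invented, standing in for SCATS detector data):

```python
# Sliding-window setup for short-term traffic prediction: each training pair is
# (window of recent volumes, next volume). Values below are illustrative.

def make_windows(series, window):
    """Turn a volume time series into (input_window, next_value) training pairs."""
    pairs = []
    for t in range(len(series) - window):
        pairs.append((series[t:t + window], series[t + window]))
    return pairs

# Hypothetical aggregated volumes at one detector site:
volumes = [120, 135, 150, 160, 158, 149, 140]
pairs = make_windows(volumes, window=3)
print(pairs[0])   # ([120, 135, 150], 160)
```

In the multivariate setting, each window would stack readings from several neighbouring detectors so the model can exploit spatial as well as temporal structure.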
  • Item
    Application of automated feedback for the improvement of data quality in web-based clinical collaborations
    Glöckner, Stephan ( 2016)
    Background: Biomedical research typically relies on data collected from patients in clinical settings. This is currently a fraught process due to the diversity and heterogeneity of data management systems, the numerous data standards, and the sensitivities around access to and sharing of such data. To tackle this, international biomedical registries are often established, targeted to specific diseases and communities. The quality of the data in such registries is essential to ensure that clinical research findings can be translated into clinical care. However, at present clinical data management systems developed for biomedical research rarely perform quality assurance procedures during ongoing data collection. Similarly, clinical trials typically perform data quality assessment at the end of the trial. This is too late. We argue that data quality assurance procedures, for cost reduction and data process improvement, have to be implemented as an integral and ongoing part of disease registries and the data they are used to collect. Such an approach requires that all aspects of data collection efforts be considered, including the intrinsic and extrinsic motivational factors of data entry personnel and the organisations in which they work. Technical solutions that encourage better behaviour and hence improve data quality are thus desirable. Hypothesis: The web-based interactions between data entry users and data management systems can be used to improve data quality. Leveraging the technological advancement of web-based registries, new feedback mechanisms can be used to improve the overall quality of the data captured by registries. This should lead to streamlined and improved data capture methods that support the users and ultimately benefit clinical research more generally.
This thesis proposes that web-based data quality feedback can motivate registry data entry personnel, increase their contributions and ultimately improve the quality of registry data and its (re-)use to support clinical trials. Methods: To explore causes of low data quality and user motivation, a survey and an assessment of quality indicators in a multicentre clinical setting were performed. Based on this, we developed and evaluated a stage-wise framework for web-based feedback and measured data quality trends, including the factors that can impact user motivation in data entry. This was explored in the International Niemann-Pick Disease Registry (INPDR) and two major international clinical trials associated with the European Network for the Study of Adrenal Tumours (ENSAT). We also considered the role of patients in data collection through mobile applications supporting data collection within the context of the Environmental Determinants of Islet Auto-immunity (ENDIA) clinical study. Results: Researchers are motivated when they see the contribution resulting from their data entry and the improvement in the treatment of patients. The results of the survey and the framework evaluation highlight the effectiveness of web-based automated data quality feedback. It was found that data quality feedback to researchers and the research community improves data quality. Case studies showed an increase in data quality over the period of observation of this research, noting that these studies are still ongoing. The stage-wise framework to evaluate data entry user behaviour after feedback was applied to one trial, which showed that feedback encouraged users to enter both more and higher-quality data. Conclusions: Recent literature confirms the need for data quality feedback as an ongoing and near-real-time activity associated with data capture. Centralised data monitoring requires a general framework that can be adjusted for a variety of trials and studies.
The proposed stage-wise research method must be improved to measure the outcome of data quality feedback against a control group and/or where known benchmarks exist. Data quality dimensions need to be adapted to the relevant research interests. In the age of big data and mobile health, further research needs to be performed on the upcoming challenges of data trustworthiness and record eligibility to tackle current and future research objectives. The findings highlight how biomedical research registries have to be designed with a focus on data quality and feedback mechanisms.
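As an illustration of the kind of indicator such automated feedback could compute continuously, the sketch below scores per-field completeness across registry records (the field names are hypothetical, not the INPDR or ENSAT schemas):

```python
# Sketch of one simple data quality indicator: per-field completeness, i.e. the
# fraction of non-missing values for each field across all records. Field names
# and records are invented for illustration.

def completeness(records, fields):
    """Return the fraction of non-missing values per field."""
    scores = {}
    for f in fields:
        filled = sum(1 for r in records if r.get(f) not in (None, ""))
        scores[f] = filled / len(records)
    return scores

records = [
    {"diagnosis_date": "2015-03-01", "genotype": "NPC1", "weight_kg": 54},
    {"diagnosis_date": "", "genotype": "NPC1", "weight_kg": None},
    {"diagnosis_date": "2016-01-12", "genotype": "", "weight_kg": 61},
]
scores = completeness(records, ["diagnosis_date", "genotype", "weight_kg"])
print(scores)
```

A registry could recompute such scores on every submission and feed them back to data entry users in near-real time, which is the behaviour-change mechanism the thesis evaluates.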