Computing and Information Systems - Theses


Now showing 1 - 10 of 98
  • Item
    An empirical analysis of the systemic performance of decentralised cryptocurrency systems
    Yang, Renlord Nicholas ( 2022-11)
    Decentralised cryptocurrency networks have gained significant adoption over the years and attracted considerable user interest. These networks operate peer-to-peer at the protocol level with minimal trust assumptions. Networks such as Bitcoin and Ethereum require consensus participants to maintain a full copy of the blockchain in order to derive the state needed to verify user transactions. Due to surging demand and popularity, these networks face scalability issues when processing transactions, as their throughput is bound by the consensus participant with the least throughput. Central to the health of a decentralised cryptocurrency network are its security, its scalability, a decentralised state in which no central authority is relied upon, and minimal trust assumptions. Maintaining these tenets, however, depends on user participation. This thesis is centred on the hypothesis that user participation is hindered by barriers to entry imposed by the high compute resource requirements users face when bootstrapping their clients in order to participate in the network with minimal trust assumptions. To that end, our research investigated in detail where compute resources are spent and how client implementations can be improved to reduce resource consumption, or at least yield higher transaction verification throughput during bootstrap, thereby reducing the time users need to complete the bootstrap process. First, we formally analysed how state synchronisation works under the hood and discussed the trade-offs of various implementation-specific state synchronisation optimisation techniques. In our measurements and analysis, we identified that performance decays as state data grows, which threatens the scalability of a network comprised of independent node operators. 
Our measurements show that although state and blockchain size grow linearly, run-time performance degrades exponentially as users verify blocks approaching the chain tip. We concluded that under the current trust models, it is infeasible to apply system optimisations that scale with the ever-increasing workload caused by the unbounded growth of the blockchain. Secondly, we thoroughly studied the Ethereum gas mechanism. We collected detailed EVM traces for each transaction, including opcode execution times, measured on two disparate machines with defined specifications. We then identified that some EVM opcodes are mispriced, particularly those involving disk I/O, and confirmed the potential denial-of-service vectors that result. Subsequently, we analysed the effects of I/O on transaction costs and the impact of the in-memory caching strategies employed by the reference client to mitigate I/O costs, finding that I/O remains poorly captured by the current gas cost model and that its effects are worsening over time. We also found that under the current gas cost model, nodes with modest computational resources are disadvantaged compared to their better-resourced peers, which we identified as an ongoing threat to node diversity and network decentralisation. In both of the aforementioned works, we also proposed client implementation improvements that should lead to better performance outcomes. Finally, we present a model of stateful transaction verification and outline the requisite building blocks that make up the verification model. With this model, we propose a scheme to perform transaction verification without relying on a local state database. 
We then discuss the computational feasibility of replacing I/O-driven state queries against a local authenticated database with verifiable state proofs carried in transactions.
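The scheme above verifies transactions against verifiable state rather than a local database. As a rough illustration of the underlying primitive, the sketch below verifies a single state entry against a Merkle root using only a compact proof; the account encoding and tree layout are illustrative assumptions, not the thesis's actual construction.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Compute the root of a binary Merkle tree over hashed leaves."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:           # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves, index):
    """Return the sibling hashes needed to verify leaves[index]."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        proof.append((level[sibling], index % 2 == 0))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return proof

def verify(root, leaf, proof):
    """Verify a leaf against the root using only the proof -- no full state."""
    node = h(leaf)
    for sibling, leaf_is_left in proof:
        node = h(node + sibling) if leaf_is_left else h(sibling + node)
    return node == root

state = [b"acct1:100", b"acct2:250", b"acct3:7", b"acct4:42"]
root = merkle_root(state)
proof = merkle_proof(state, 2)
print(verify(root, b"acct3:7", proof))  # True: one entry verified without the full state
```

A client holding only the root can thus verify state carried in transactions, which is the trust model the proposed scheme relies on.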
  • Item
    Incorporating structured and unstructured information for geospatial question answering
    Li, Haonan ( 2022-12)
    In daily life, people ask questions involving geographic entities or concepts; we call them geospatial questions. Automatically answering these questions is challenging for machines because of the difficulties in: (1) identifying geographic entities or concepts; (2) interpreting spatial and non-spatial constraints from questions; and (3) incorporating various knowledge sources for language processing and spatial reasoning. In this thesis, we aim to tackle these problems using deep learning and natural language processing techniques. We first investigate a fundamental task for geospatial language processing --- toponym (place name) detection and disambiguation. We propose an attention-based neural network model using character-level word representation for toponym detection. We demonstrate that character-level information is essential for toponym detection given language irregularities (e.g., misspellings and abbreviations). After detecting the toponyms, we devise a feature-based model to assign each toponym a unique ID (with geo-coordinates) from a gazetteer. During our investigation of the toponym detection approach, we found that about 20% of toponyms are used metonymically, where a toponym refers not to a place but to a closely related thing or concept. However, general toponym detection models do not distinguish these metonymic uses. We hypothesize that a good metonymy resolution model should benefit toponym detection and further benefit geospatial question answering. For metonymy resolution, we argue that whether a toponym (i.e., target word) refers to a place or not should be inferred from context rather than the toponym itself, and propose a pretrained language model (PLM)-based target word masking approach which achieves state-of-the-art results on five datasets. 
We further verify our hypothesis by integrating the proposed metonymy resolution model into an end-to-end toponym resolution model, and using the toponym resolution model to tag the questions and related passages in later question answering (QA) tasks. For geospatial question answering, we argue that different question answering systems can be built based on different knowledge sources, including unstructured web documents and structured knowledge bases. We investigate the current state of geospatial QA systems using different knowledge sources. The lack of resources is the main bottleneck for unstructured knowledge source-based QA (i.e., IR-based QA). We find that there is no IR-based geospatial QA dataset, and existing open-domain datasets do not adequately cover the difficulties of geospatial questions, especially questions that have to be answered with multiple spans extracted from a document. To bridge this gap, we propose a new reading comprehension task with a dataset that enables models to predict multi-span answers. We demonstrate that a neural network model trained on our dataset can effectively answer multi-span questions, including but not limited to questions in the geospatial domain. For structured KB-based QA, we propose a pipelined method consisting of three steps: (1) detecting geospatial semantic concepts from questions; (2) modeling geospatial relationships between the concepts into a semantic graph; and (3) translating the semantic graph into a knowledge base query. We use a neural sequence tagger for the first step and a neural semantic parser for the second step. Experimental results show that neural network-based approaches are better at capturing semantic features and result in better question answering systems. We finally argue that structured and unstructured knowledge sources can complement each other, and more complex geospatial questions can be answered by incorporating different knowledge sources. 
We propose a model to integrate structured geo-coordinates and unstructured place descriptions to represent a place, and demonstrate that the model combining multiple knowledge sources outperforms models utilizing a single knowledge source in a real-world Point-of-Interest (POI) recommendation QA task. Overall, this thesis contributes to geospatial question answering and a series of fundamental tasks relating to toponyms. We demonstrate that these tasks can be effectively and efficiently approached by applying advanced deep learning and natural language processing techniques. The proposed methods, datasets, and findings can be used or built on in future geospatial question answering research.
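Multi-span answer prediction of the kind the proposed dataset targets is commonly realised by tagging tokens and then decoding contiguous spans. The sketch below shows a generic BIO-style decoder; the tag scheme and example are illustrative assumptions, not the thesis's model.

```python
def decode_spans(tokens, tags):
    """Extract answer spans from per-token BIO tags (B=begin, I=inside, O=outside)."""
    spans, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":                 # start a new span
            if current:
                spans.append(" ".join(current))
            current = [tok]
        elif tag == "I" and current:   # continue the open span
            current.append(tok)
        else:                          # "O" (or a stray "I") closes any open span
            if current:
                spans.append(" ".join(current))
            current = []
    if current:
        spans.append(" ".join(current))
    return spans

tokens = "The Yarra flows past Southbank and Federation Square".split()
tags   = ["O", "B", "O", "O", "B", "O", "B", "I"]
print(decode_spans(tokens, tags))  # ['Yarra', 'Southbank', 'Federation Square']
```

A model predicting one tag per token can thus return several disjoint answer spans from a single passage, which single-span extraction formats cannot express.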
  • Item
    Scalable and Explainable Time Series Classification
    Cabello Wilson, Nestor Stiven ( 2022)
    Time series data are ubiquitous, and the explosion of sensor technologies in industry applications has further accelerated their growth. Modern applications collecting large datasets with long time series require fast automated decisions. Moreover, legislation such as the European General Data Protection Regulation now requires explanations for automated decisions. Although state-of-the-art time series classification methods are highly accurate, their computationally expensive and complex models are typically impractical for large datasets or cannot provide explanations. To address these issues, this thesis proposes two time series classification methods that are comparable to state-of-the-art methods in terms of accuracy, while providing scalable and explainable classifications. Our first method proposes a novel supervised selection of sub-series to pre-compute a set of features that maximizes the classification accuracy when fed into a tree-based ensemble. Our second method further proposes a perturbation scheme for the supervised feature selection and the node-splitting process when training the tree-based ensemble. We also propose a highly time-efficient strategy to build the tree-based ensemble. Both methods enable explainability for our classification results, while they are orders of magnitude faster than the state-of-the-art methods. Our second method, in particular, is significantly faster and more accurate than our first approach, while it is not significantly different from the state-of-the-art methods in terms of classification accuracy. Moreover, motivated to explore a more general model for time series classification, we propose a novel graph-based method to learn to classify time series without the order constraint inherent to time series data. This method classifies time series by learning the relationships among the data points independently from their positions (i.e., time-stamps) within the series. 
We show that this method outperforms state-of-the-art methods over several time series datasets, thus opening up a new direction for the design of time series classifiers.
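To give a flavour of the sub-series feature idea described above, the sketch below summarises sampled intervals of a series with simple statistics (mean, standard deviation, slope), as could be fed into a tree-based ensemble. It uses random rather than the thesis's supervised interval selection, and all names are illustrative.

```python
import random
import statistics

def interval_features(series, n_intervals=4, min_len=3, seed=0):
    """Summarise randomly chosen sub-series with simple statistics
    (mean, std, slope) -- candidate features for a tree-based ensemble."""
    rng = random.Random(seed)
    feats = []
    for _ in range(n_intervals):
        start = rng.randrange(0, len(series) - min_len)
        end = rng.randrange(start + min_len, len(series) + 1)
        window = series[start:end]
        n = len(window)
        mean = statistics.fmean(window)
        std = statistics.pstdev(window)
        # least-squares slope of the values over the interval
        xbar = (n - 1) / 2
        slope = sum((i - xbar) * (v - mean) for i, v in enumerate(window)) / \
                sum((i - xbar) ** 2 for i in range(n))
        feats.extend([mean, std, slope])
    return feats

series = [0.0, 0.5, 1.1, 1.4, 2.2, 2.4, 3.1, 3.3, 4.0, 4.6]
print(len(interval_features(series)))  # 12 features: 3 per interval
```

Pre-computing such interval features once, instead of repeatedly scanning raw series during training, is one route to the scalability the thesis pursues.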
  • Item
    Traffic Optimization with Deep Reinforcement Learning in a Connected Transport Ecosystem
    Gunarathna, Pathirannahelage Udesh Madubasha ( 2022)
    The emergence of connected autonomous vehicles and smart traffic infrastructure provides an opportunity to manage traffic using novel techniques, as the data generated from these entities can be used for intelligent real-time traffic control. Reinforcement learning has been widely adopted in various real-time control and sequential decision-making tasks including robotics, health care, NLP, and games, thanks to its ability to control complex environments using observed data. Despite the success of reinforcement learning, applying off-the-shelf reinforcement learning algorithms to novel traffic management solutions is difficult due to the scalability, dynamicity, graph-structured data, and multi-objective nature of these transportation problems. In this thesis, we propose several solutions for intelligent traffic management using novel reinforcement learning algorithms and architectures to overcome the above limitations. The thesis starts by exploring novel reinforcement learning-based solutions for intersection-level traffic management problems and proposes a multi-agent solution with a novel reinforcement learning algorithm to handle multi-objectivity. Then, the thesis expands to road-network-level traffic management problems and proposes a coordination-based reinforcement learning architecture to tackle the existing limitations in road-network-level traffic optimization. Finally, we propose a generalized reinforcement learning architecture for common combinatorial graph problems. This enables many intelligent problems in the transportation sector to be solved using a single reinforcement learning framework. This generalized reinforcement learning framework is able to outperform the state-of-the-art on a variety of dynamic graph problems and can be applied to several traffic management solutions in a connected transport system. 
Based on the research outcomes, it is evident that applying novel deep reinforcement learning algorithms to traffic management has several advantages, making deep reinforcement learning the go-to solution for the connected transport ecosystem.
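As a minimal illustration of reinforcement learning applied to traffic control, the sketch below trains a tabular Q-learning agent on a toy two-phase traffic signal. The environment, state encoding, and reward are invented for illustration and bear no relation to the thesis's algorithms or architectures.

```python
import random

def train(episodes=200, alpha=0.1, gamma=0.9, eps=0.1, seed=1):
    """Tabular Q-learning for a toy two-phase traffic signal.

    State:  (queue_NS, queue_EW, green_phase), queues bucketed to 0-3.
    Action: 0 = keep the current green phase, 1 = switch phase.
    Reward: negative total queue length (shorter queues are better).
    """
    rng = random.Random(seed)
    Q = {}
    for _ in range(episodes):
        ns, ew, phase = rng.randint(0, 3), rng.randint(0, 3), 0
        for _ in range(20):
            s = (ns, ew, phase)
            Q.setdefault(s, [0.0, 0.0])
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] >= Q[s][1] else 1
            if a == 1:
                phase = 1 - phase
            # the green direction discharges cars; both directions get arrivals
            if phase == 0:
                ns = max(0, ns - 2)
            else:
                ew = max(0, ew - 2)
            ns = min(3, ns + rng.randint(0, 1))
            ew = min(3, ew + rng.randint(0, 1))
            s2 = (ns, ew, phase)
            Q.setdefault(s2, [0.0, 0.0])
            reward = -(ns + ew)
            # standard Q-learning update toward the bootstrapped target
            Q[s][a] += alpha * (reward + gamma * max(Q[s2]) - Q[s][a])
    return Q

Q = train()
print(len(Q))  # number of distinct (queue_NS, queue_EW, phase) states visited
```

The thesis's contributions address exactly what this toy omits: many coordinated agents, multiple objectives, and graph-structured road networks.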
  • Item
    Enhancing Deep Multimodal Representation: Online, Noise-robust and Unsupervised Learning
    Silva, Dadallage Amila Ruwansiri ( 2022)
    Information generated and shared today involves data in multiple modalities. These modalities are not limited to the well-known sensory media (e.g., text, image, video, and audio), but can be any abstract or inferred encoding of information (e.g., the propagation network of a news article or the sentiment of a text) that represents a different viewpoint of the same object. For machine learning models to be competitive with humans, they should be able to extract and combine information from these modalities. Thus, multimodal representation learning has emerged as a broad research domain that aims to understand complex multimodal environments while narrowing the heterogeneity gap among different modalities. Due to their ability to represent latent information in complex data structures, deep learning-based techniques have recently attracted much attention for multimodal representation learning. Nevertheless, most existing deep multimodal representation learning techniques lack the following: (1) ability to continuously learn and update representations in a memory-efficient manner while being recency-aware and avoiding catastrophic forgetting of historical knowledge; (2) ability to learn unsupervised representations for under-exploited multimodalities with complex data structures (i.e., temporally evolving networks) and high diversity (cross-domain multimodal data); and (3) ability to directly serve as features to address various real-world applications without fine-tuning using an application-specific labelled dataset. This thesis aims to bridge these research gaps in deep multimodal representation learning approaches. In addition, this thesis addresses real-world applications involving multimodal data such as misinformation detection, spatiotemporal activity modeling and online market basket analysis. 
The main contributions of this thesis include: (1) proposing two novel online learning strategies for learning deep multimodal representations, and proposing two frameworks using the proposed online learning strategies to address two real-world applications -- i.e., user-guided spatiotemporal activity modeling (USTAR) and online market basket analysis (OMBA); (2) proposing METEOR, a memory and time efficient online representation learning algorithm for making deep multimodal representations compact and scalable to cope with the different data rates of real-world multimodal data streams; (3) developing an unsupervised framework to capture and preserve domain-specific and domain-shared knowledge in cross-domain data streams, and applying the proposed framework to address cross-domain fake news detection; (4) proposing an unsupervised model to learn representations for temporally evolving graphs by mimicking the future knowledge of an evolving graph at an early timestep, and developing a new framework called Propagation2Vec with the help of the proposed objective functions for fake news early detection; and (5) developing a theoretically-motivated noise-robust unsupervised learning framework, which can filter out the noise (i.e., fine-tune) in multimodal representations learned from general pretraining objective functions without requiring a labelled dataset, and applying the findings to address the unsupervised fake news detection task.
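To illustrate the flavour of recency-aware online representation learning, the sketch below maintains item embeddings under a stream of co-occurrences, decaying historical evidence on each update. This is a generic illustration only, with invented names and dynamics; it is not the thesis's METEOR, USTAR, or OMBA algorithms.

```python
import math
import random

DIM, LR, DECAY = 8, 0.1, 0.99

def unit(v):
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

class OnlineEmbedder:
    """Streaming embeddings: each co-occurrence nudges two vectors together,
    while a decay factor discounts stale (historical) coordinates."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.vecs = {}

    def _vec(self, key):
        if key not in self.vecs:
            self.vecs[key] = [self.rng.uniform(-0.5, 0.5) for _ in range(DIM)]
        return self.vecs[key]

    def observe(self, a, b):
        """Items a and b co-occurred in the stream: decay, then pull together."""
        va, vb = self._vec(a), self._vec(b)
        for i in range(DIM):
            va[i] *= DECAY            # recency: discount historical evidence
            vb[i] *= DECAY
            delta = LR * (vb[i] - va[i])
            va[i] += delta
            vb[i] -= delta

    def similarity(self, a, b):
        ua, ub = unit(self._vec(a)), unit(self._vec(b))
        return sum(x * y for x, y in zip(ua, ub))

emb = OnlineEmbedder()
for _ in range(50):
    emb.observe("bread", "butter")   # a frequently co-occurring pair
emb.observe("bread", "anvil")        # a single noisy co-occurrence
print(emb.similarity("bread", "butter") > emb.similarity("bread", "anvil"))
```

A single pass over the stream with O(1) work per event is the regime the thesis's online methods target, in contrast to batch retraining.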
  • Item
    Detection of Repeating Activities Recorded by Sensors in the Absence of Labeled Data
    Mirmomeni, Mahtab ( 2022)
    Time series data with repetition is generated in many applications such as remote patient monitoring using wearable sensors. The consecutive repeating patterns in a time series indicate that an event is re-occurring consecutively over time. As an example, a patient’s rehabilitation exercise movements collected by an accelerometer can determine whether the patient is repeatedly exercising. In such applications, collecting and labelling real data is difficult, lengthy and expensive. The exercises might also be performed differently by different patients, and even a single patient might start performing an exercise in a certain way only for the movement to change due to, for example, fatigue, generating varying repeating patterns in the underlying time series. In this thesis, we have explored pattern mining, machine learning and deep learning techniques that can be used for detecting consecutive repeating patterns from a time series without prior knowledge of the repeats and without the use of labelled data. We have shown the limitations of the de facto benchmark, the Matrix Profile, for detecting consecutive repeating patterns, and have addressed those limitations by modifying the Matrix Profile and showing how to set its key input parameter. We have created a transferable deep learning technique, RP-Mask, for detecting and localising segments of repeating patterns in a time series, which learns entirely from synthetic data and is able to transfer this learning to an unseen real dataset. We further explore whether RP-Mask is using the correct information in the time series to make decisions. We show that the state-of-the-art explainability techniques generate different explanations for the same model, and create a new explainability approach using a low-pass frequency filter on the time series.
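For context, the Matrix Profile referenced above records, for every sub-series, the z-normalised distance to its nearest non-trivial match; low values flag repeated patterns. The sketch below is a naive quadratic-time version for illustration only, not the thesis's modified algorithm.

```python
import math

def znorm_dist(a, b):
    """Z-normalised Euclidean distance between two equal-length windows."""
    def z(w):
        m = sum(w) / len(w)
        sd = math.sqrt(sum((x - m) ** 2 for x in w) / len(w)) or 1.0
        return [(x - m) / sd for x in w]
    za, zb = z(a), z(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(za, zb)))

def matrix_profile(ts, m):
    """Naive O(n^2 m) matrix profile: for each length-m window, the distance
    to its nearest non-trivial match. Low values flag repeated patterns."""
    n = len(ts) - m + 1
    profile = []
    for i in range(n):
        best = math.inf
        for j in range(n):
            if abs(i - j) >= m:      # exclusion zone: skip trivially close matches
                best = min(best, znorm_dist(ts[i:i + m], ts[j:j + m]))
        profile.append(best)
    return profile

# A repeated [0, 1, 0, -1] motif followed by a non-repeating ramp.
ts = [0, 1, 0, -1, 0, 1, 0, -1, 0, 5, 6, 7, 8]
mp = matrix_profile(ts, 4)
print(mp.index(min(mp)))  # 0: the lowest profile value sits on the repeated motif
```

Fast implementations reduce this to near O(n^2) overall; the choice of the window length m is the key input parameter the thesis shows how to set.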
  • Item
    Dynamic Multi-objective Optimisation Using Evolutionary Algorithms
    Herring, Daniel George ( 2022)
    Dynamic Multi-objective Optimization Problems (DMOPs) offer an opportunity to examine and solve challenging real-world scenarios where trade-off solutions between conflicting objectives change over time. Definition of benchmark problems allows modelling of industry scenarios across transport, power and communications networks, manufacturing and logistics. Recently, significant progress has been made in the variety and complexity of DMOP benchmarks and the incorporation of realistic dynamic characteristics. However, significant gaps still exist in standardised methodology for DMOPs, in specific problem domain examples, and in the understanding and explanation of the impacts of dynamic characteristics. This thesis provides major contributions on these three topics within evolutionary dynamic multi-objective optimization. Firstly, experimental protocols for DMOPs are varied, which limits the applicability and relevance of results produced and conclusions drawn in the field. A major source of the inconsistency lies in the parameters used to define the specific problem instances being examined. The uninformed selection of these parameters has historically held back understanding of their impacts and standardisation of experimental approaches in the multi-objective problem domain. Using the frequency and severity (or magnitude) of change events, a more informed approach to DMOP experimentation is conceptualized, implemented and evaluated. A baseline performance expectation across a comprehensive range of dynamic instances for well-studied DMOP benchmarks is established and analyzed. To maximize relevance, these profiles are composed from the performance of evolutionary algorithms commonly used for baseline comparisons and those with simple dynamic responses. Comparison and contrast with the coverage of parameter combinations in the sampled literature highlights the importance of these contributions. 
Secondly, the provision of useful and realistic DMOPs in the combinatorial domain is limited in previous literature. A novel dynamic benchmark problem is presented by extending the Travelling Thief Problem (TTP) to include a variety of realistic and contextually justified dynamic changes. Investigation of problem information exploitation and its potential application as a dynamic response is a key output of these results, and context is provided through comparison to results obtained by adapting existing TTP heuristics. Observation-driven iterative development prompted the investigation of multi-population island model strategies, together with improvements in the approaches to accurately describe and compare the performance of algorithm models for DMOPs, a contribution which is applicable beyond the dynamic TTP. Thirdly, the purpose of DMOPs is to reconstruct realistic scenarios, or features from them, to allow for experimentation and development of better optimization algorithms. However, numerous important characteristics from real systems still require implementation and will drive research and development of algorithms and mechanisms to handle these industrially relevant problem classes. The novel challenges associated with these implementations are significant and diverse, even for a simple development such as consideration of DMOPs with multiple time dependencies. Real-world systems with dynamics are likely to contain multiple temporally changing aspects, particularly in the energy and transport domains. Problems with more than one dynamic component allow for asynchronous changes and differing severity between components, which leads to an explosion in the size of the possible dynamic instance space. Both continuous and combinatorial problem domains require structured investigation into the best practices for experimental design, algorithm application, and performance measurement, comparison and visualization. 
Highlighting the challenges, the key requirements for effective progress and recommendations on experimentation are explored here.
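The frequency-and-severity parameterisation discussed above can be illustrated with a toy dynamic objective whose optimum shifts every tau_t generations by an amount scaled by n_t. The parameter names follow common DMOP conventions; the function itself is invented for illustration and is not a benchmark from the thesis.

```python
import math

def moving_optimum(generation, tau_t=10, n_t=0.2):
    """Location of a 1-D moving optimum: tau_t sets how often the environment
    changes, n_t (severity) scales how far the optimum moves per change."""
    changes = generation // tau_t          # number of change events so far
    return math.sin(0.5 * changes) * n_t * 10

def objective(x, generation, tau_t=10, n_t=0.2):
    """Minimisation objective whose optimum drifts over time."""
    return (x - moving_optimum(generation, tau_t, n_t)) ** 2

# Severity scales the jump between consecutive environments.
jump_mild   = abs(moving_optimum(10, 10, 0.1) - moving_optimum(0, 10, 0.1))
jump_severe = abs(moving_optimum(10, 10, 0.9) - moving_optimum(0, 10, 0.9))
print(jump_severe > jump_mild)  # True: higher severity means larger jumps
```

Sweeping (tau_t, n_t) pairs over a grid is the kind of informed instance-space coverage the proposed experimental protocol argues for, in place of ad hoc parameter choices.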
  • Item
    Improving Agile Sprint Planning Through Empirical Studies of Documented Information and Story Points Estimation
    Pasuksmit, Jirat ( 2022)
    In Agile iterative development (e.g., Scrum), effort estimation is an integral part of development iteration planning (i.e., sprint planning). Unlike traditional software development teams, an Agile team relies on a lightweight estimation method based on team consensus (e.g., Planning Poker), and the estimated effort is continuously refined (or changed) to improve estimation accuracy. However, such lightweight estimation methods are prone to inaccuracy, and late changes to the estimated effort may cause the sprint plan to become unreliable. Despite a large body of research, only a few studies have reviewed the reasons for inaccurate estimations and the approaches to improve effort estimation. We conducted a systematic literature review and found that the quality of the available information is one of the most common reasons for inaccurate estimations. We found several manual approaches that aim to help the team improve information quality and manage uncertainty in effort estimation. However, prior work reported that practitioners were reluctant to use them as they added overhead to the development process. The goal of this thesis is to better understand and propose approaches that help the team achieve accurate estimation without introducing additional overhead. To achieve this goal, we conducted studies in two broad areas. We first conducted two empirical studies to investigate the importance of documented information for effort estimation and the impact of estimation changes in a project. In the first empirical study, we aim to investigate the importance and quality of documented information for effort estimation. We conducted a survey study with 121 Agile practitioners from 25 countries. We found that documented information is considered important for effort estimation. 
We also found that the documented information useful for effort estimation often changed, and that practitioners would re-estimate effort when documented information changed, even after work had started. In the second empirical study, we aim to better understand changes of effort (in Story Points; SP). We examined the prevalence of SP changes, the accuracy of changed SP, and the impact of information changes on SP changes. We found that SP were not often changed after sprint planning. However, when SP were changed, the size of the change was relatively large and the changed SP could be inaccurate. We also found that SP changes often occurred alongside information changes related to scope modification. These findings suggest that a change of documented information can lead to a change of effort, and the changed effort can have a large impact on the sprint plan. To mitigate the risk of an unreliable sprint plan, the documented information and the estimated effort should be verified and stabilized before finalizing the sprint plan. Otherwise, the team may have to re-estimate the effort and adjust the sprint plan. However, revisiting all documented information and estimated SP could be a labor-intensive task and may not comply with Agile principles. To help the team manage these uncertainties without introducing additional overhead, we proposed two automated approaches, DocWarn and SPWarn, to predict the documentation changes and SP changes that may occur after sprint planning. We built DocWarn and SPWarn using machine learning and deep learning techniques based on metrics that measure the characteristics of work items. We evaluated DocWarn and SPWarn using work items extracted from open-source projects. Our empirical evaluations show that DocWarn achieved an average AUC of 0.75 and SPWarn achieved an average AUC of 0.73, both significantly higher than baseline models. 
These results suggest that our approaches can predict future changes of documented information and SP based on the currently available information. With our approaches, the team can be made aware of, and pay attention to, potential documentation and SP changes during sprint planning. Thus, the team can manage uncertainty and reduce the risk of unreliable effort estimation and sprint planning without additional overhead.
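For context on the reported AUC values (0.5 corresponds to random guessing, 1.0 to perfect ranking), the sketch below computes AUC via the rank-sum formulation on toy SP-change predictions. The data are invented for illustration; this is not the thesis's evaluation code.

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formulation:
    the probability that a random positive is scored above a random negative,
    with ties counted as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example: predicted probabilities that a work item's SP will change.
labels = [1, 0, 1, 0, 0, 1]   # 1 = SP changed after sprint planning
scores = [0.9, 0.3, 0.6, 0.4, 0.2, 0.7]
print(auc(labels, scores))    # 1.0: every changed item outranks every unchanged one
```

An average AUC of 0.73-0.75, as reported for DocWarn and SPWarn, therefore means the models rank a changing work item above a stable one roughly three times out of four.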
  • Item
    A Novel Perspective on Robustness in Deep Learning
    Mohaghegh Dolatabadi, Hadi ( 2022)
    Nowadays, machine learning plays a crucial role in our path toward automated decision-making. Traditional machine learning algorithms would require careful, often manual, feature engineering to deliver satisfactory results. Deep Neural Networks (DNNs) have shown great promise in an attempt to automate this process. Today, DNNs are the primary candidate for various applications, from object detection to high-dimensional density estimation and beyond. Despite their impressive performance, DNNs are vulnerable to different security threats. For instance, in adversarial attacks, an adversary can alter the output of a DNN for its benefit by adding carefully crafted yet imperceptible distortions to clean samples. As another example, in backdoor (Trojan) attacks, an adversary intentionally plants a loophole in the DNN during the learning process. This is often done via attaching specific triggers to the benign samples during training such that the model creates an association between the trigger and a particularly intended output. Once such a loophole is planted, the attacker can activate the backdoor with the learned triggers and bypass the model. All these examples demonstrate the fragility of DNNs in their decision-making, which questions their widespread use in safety-critical applications such as autonomous driving. This thesis studies these vulnerabilities in DNNs from novel perspectives. To this end, we identify two key challenges in the previous studies around the robustness of neural networks. First, while a plethora of existing algorithms can robustify DNNs against attackers to some extent, these methods often lack the efficiency required for their use in real-world applications. Second, the true nature of these adversaries has been less studied, leading to unrealistic assumptions about their behavior. 
This is particularly crucial as building defense mechanisms using such assumptions would fail to address the underlying threats and create a false belief in the security of DNNs. This thesis studies the first challenge in the context of robust DNN training. In particular, we leverage the theory of coreset selection to form informative weighted subsets of data. We use this framework in two different settings. First, we develop an online algorithm for filtering poisonous data to prevent backdoor attacks. Specifically, we identify two critical properties of poisonous samples based on their gradient space and geometrical representation and define an appropriate selection objective based on these criteria to select clean samples. Second, we extend the idea of coreset selection to adversarial training of DNNs. Although adversarial training is one of the most effective methods in defending DNNs against adversarial attacks, it requires generating costly adversarial examples for each training sample iteratively. To ease the computational burden of various adversarial training methods in a unified manner, we build a weighted subset of the training data that can faithfully approximate the DNN gradient. We show how our proposed solution can lead to robust neural network training more efficiently in both of these scenarios. Then, we touch upon the second challenge and question the validity of one of the widely used assumptions around adversarial attacks. More precisely, it is often assumed that adversarial examples stem from an entirely different distribution than clean data. To challenge this assumption, we resort to generative modeling, particularly Normalizing Flows (NF). Using an NF model pre-trained on clean data, we demonstrate how one can create adversarial examples closely following the clean data distribution. 
We then use our approach against state-of-the-art adversarial example detection methods to show that methods that explicitly assume a difference between the distributions of adversarial and clean data can suffer greatly. Our study reveals the importance of correct assumptions in treating adversarial threats. Finally, we extend the distribution modeling component of our adversarial attacker to increase its density estimation capabilities. In summary, this thesis advances the current state of robustness in deep learning by i) proposing more effective training algorithms against backdoor and adversarial attacks and ii) challenging a fundamental and prevalent misconception about the distributional properties of adversarial threats. Through these contributions, we aim to help create more robust neural networks, which is crucial before their deployment in real-world applications. Our work is supported by theoretical analysis and experimental investigation, and is based on peer-reviewed publications.
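The coreset idea of approximating the full-data gradient with a small weighted subset can be illustrated with a greedy gradient-matching toy. The selection objective and data below are simplified illustrations (an unweighted subset matching the mean gradient), not the thesis's algorithms.

```python
def greedy_coreset(grads, k):
    """Greedily pick k sample indices whose average gradient best matches
    the average gradient of the full dataset (a gradient-matching coreset)."""
    dims = len(grads[0])
    target = [sum(g[d] for g in grads) / len(grads) for d in range(dims)]

    def err(vec, count):
        # squared distance between the subset mean and the full-data mean
        return sum((v / count - t) ** 2 for v, t in zip(vec, target))

    chosen, current = [], [0.0] * dims
    for _ in range(k):
        best_i, best_e = None, float("inf")
        for i in range(len(grads)):
            if i in chosen:
                continue
            cand = [c + g for c, g in zip(current, grads[i])]
            e = err(cand, len(chosen) + 1)
            if e < best_e:
                best_i, best_e = i, e
        chosen.append(best_i)
        current = [c + g for c, g in zip(current, grads[best_i])]
    return chosen

# Four samples whose gradients cluster around (1, 0) and two around (0, 1):
# a size-3 coreset should keep the same 2:1 mix to match the overall mean.
grads = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0], [1.1, 0.0], [0.0, 1.0], [0.1, 1.0]]
subset = greedy_coreset(grads, 3)
print(len(set(subset) & {4, 5}))  # 1: exactly one (0, 1)-cluster sample is kept
```

Training on such a subset reduces the number of costly per-sample computations (e.g., adversarial example generation) while keeping the gradient signal representative, which is the efficiency argument the thesis develops.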
  • Item
    Attentional Reality: Understanding and Managing Limited Attentional Resources in Augmented Reality
    Syiem, Brandon Victor ( 2022)
    Limits of human attention restrict the amount of information we can perceive when using augmented reality (AR) applications. This has consequences when the unperceived information, whether from the digital content or the real surroundings, is vital for the user's experience or safety. Despite such consequences, it is unclear how attentional resources are allocated in AR applications and what measures can be taken to improve attention management in AR. This thesis aims to better our understanding of attention allocation in AR applications, isolate variables related to AR that demand excessive attentional resources, and develop and evaluate adaptive techniques to improve the efficiency of attention allocation in AR. Our findings show that users excessively focus on the digital content in AR at the cost of neglecting information from other sources. We demonstrate how the excessive allocation of attention towards the AR content is related to the task of scanning for and processing task-relevant digital content. Finally, we show how an intelligent adaptive agent, based on theories of selective attention, can improve attention management in AR but faces challenges when users are less receptive to the agent's support. These findings and the resulting discussions presented in this thesis yield novel insights regarding user attention in AR and provide valuable lessons for designing AR applications.