Computing and Information Systems - Theses

Now showing 1 - 10 of 21
  • Item
    Designing a tangible user interface for the learning of motor skills in spinal mobilisation
    Chacon Salas, Dimas Antony ( 2018)
    Current techniques for learning psychomotor skills in physiotherapy, especially in spinal mobilisation, follow the traditional classroom approach: an expert performs a demonstration and students try to emulate the task by practising on each other while receiving mostly verbal feedback from the instructor. A tailored tangible user interface would remove the need for a tutor and an extra fellow student to be present, improving the scalability of teaching delivery and providing more objective feedback. Inspired by this opportunity, this work presents SpinalLog, a visuo-haptic interface that replicates the shape and deformable sensation of a human lower spine for learning spinal mobilisation techniques. It employs conductive foam, a smart material used simultaneously to sense vertebral displacements and to provide passive haptic feedback to the user, emulating the flexibility of a spine. However, there is a need to understand the impact of the feedback provided on the learning of spinal mobilisation. Therefore, this work aims to design and implement SpinalLog to improve the teaching of this activity, and to investigate the effect of visual feedback, deformable haptic perception, and shape fidelity on the learning of this delicate psychomotor task. In the first part of an experiment, we evaluated each of these three features (Visual Feedback, Passive Haptic Feedback, and Physical Fidelity) to understand their effects on physiotherapy students' ability to replicate a mobilisation pattern recorded by an expert. In the second part, we presented the full system to the students to gather their views for future improvement. The first part showed that simultaneous feedback has the largest effect, followed by passive haptic feedback. The high fidelity of the interface has little quantitative effect, but it plays an important role in students' perceptions of the benefit of the system. The second part showed that students had a favourable view of SpinalLog and suggested improvements to its shape fidelity and visual components.
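    As a rough illustration of the sensing idea only, the sketch below (Python, with made-up calibration constants, sensor values, and a simple correlation score; none of it is taken from the thesis) shows how conductive-foam readings could be converted to displacements and compared against an expert's recorded pattern.

      # Hypothetical sketch: conductive-foam sensing and comparison with an expert pattern.
      # Sensor layout, calibration constants and the correlation score are assumptions.
      import numpy as np

      def foam_to_displacement(adc_readings, rest_value=512.0, mm_per_count=0.05):
          """Convert raw ADC readings from conductive-foam sensors into displacements (mm)."""
          return (rest_value - np.asarray(adc_readings, dtype=float)) * mm_per_count

      def similarity_to_expert(student_pattern, expert_pattern):
          """Pearson correlation between a student's and an expert's displacement traces."""
          s = np.asarray(student_pattern, dtype=float)
          e = np.asarray(expert_pattern, dtype=float)
          n = min(len(s), len(e))            # align trace lengths naively
          return float(np.corrcoef(s[:n], e[:n])[0, 1])

      # Example: one vertebra's trace sampled over a mobilisation cycle.
      student = foam_to_displacement([512, 500, 480, 470, 480, 500, 512])
      expert = foam_to_displacement([512, 498, 475, 465, 478, 499, 512])
      print(round(similarity_to_expert(student, expert), 3))
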
  • Item
    High-quality lossless web page template and data separation
    Zhao, Chenxu ( 2018)
    Web page separation is an important task that aims to separate a web page into template code and the data records populated into the template. Web page separation needs to work in a lossless manner, where the web page can be reconstructed by running the template code on the data records. In this thesis, we investigate two sub-problems of web page separation: obtaining (1) high-quality template code and (2) high-quality data records. For the first sub-problem, we focus on improving the maintainability of the template code. Easily maintainable template code is reliable and simplifies further development on top of it, e.g., to update the web templates. We formulate this problem and analyze its complexity, showing that it is NP-hard. We then propose a heuristic algorithm to solve it. The main idea of our algorithm is to parse a web page into a tree and then process it recursively in a bottom-up manner with three steps: splitting, folding, and alignment. In particular, we split siblings in the tree and fold them into chunks, where the alignment step is used to align siblings within the same chunk. In the sibling splitting step, to determine which siblings should be grouped into the same chunk, we further propose a population-based optimization algorithm named dual teaching and learning based optimization. We perform experiments on real data sets to evaluate the performance of our proposed algorithms in maximizing the maintainability of the template code produced. Experimental results show that our proposed algorithms outperform the baseline algorithms on the maintainability measure. For the second sub-problem, we focus on extracting data records from a set of web pages generated by different, unknown templates, and on deducing the schemas underlying the data records. The extracted data records can be used in many applications, such as stock market prediction and personalized recommendation systems. We formulate this problem and propose a framework to tackle it. Our framework processes web pages in four steps: web page template and data separation, template clustering, template alignment, and data record filtering. The web page template and data separation step separates web pages into template code and data records. The template clustering step then clusters the web pages by the similarity of their template code. The template alignment step captures the differences among templates to construct generalized template code that can generate all web pages in the same group. The data record filtering step uses the template code to verify the data records extracted by the separation step and corrects those that were incorrectly extracted. We perform experiments on real data sets to evaluate the performance of our framework. Experimental results show that our proposed framework outperforms, in terms of F-score, baseline algorithms that assume a known clustering of the set of web pages.
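    The bottom-up pipeline described above can be pictured with the schematic Python sketch below; the Node class, the naive tag-based sibling splitting (a stand-in for the dual teaching and learning based optimization), and the folding rule are illustrative assumptions rather than the thesis's algorithm.

      # Schematic sketch of the bottom-up split/fold/align pass described above.
      # The tag-based grouping below is only a stand-in for the thesis's dual
      # teaching and learning based optimization, and the folding rule is simplified.
      from itertools import groupby

      class Node:
          def __init__(self, tag, children=None, text=""):
              self.tag, self.children, self.text = tag, children or [], text

      def split_siblings(children):
          """Group consecutive siblings sharing a tag into chunks (stand-in heuristic)."""
          return [list(g) for _, g in groupby(children, key=lambda n: n.tag)]

      def fold_chunk(chunk):
          """Fold a chunk of repeated siblings into one template node plus data records."""
          return Node(chunk[0].tag, chunk[0].children), [n.text for n in chunk]

      def separate(node):
          """Recursively separate a page tree into a template tree and data records."""
          records, children = [], []
          for child in node.children:
              t, r = separate(child)
              children.append(t)
              records.extend(r)
          folded = []
          for chunk in split_siblings(children):
              if len(chunk) > 1:                       # repeated structure: fold it
                  template, data = fold_chunk(chunk)
                  folded.append(template)
                  records.extend(data)
              else:
                  folded.append(chunk[0])
          return Node(node.tag, folded, node.text), records

      # Example: a list whose <li> children repeat the same structure.
      page = Node("ul", [Node("li", text="record 1"), Node("li", text="record 2")])
      template, data = separate(page)
      print(template.tag, [c.tag for c in template.children], data)
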
  • Item
    Highly efficient distributed hypergraph analysis: real-time partitioning and quantized learning
    Jiang, Wenkai ( 2018)
    Hypergraphs have been shown to be highly effective for modeling a wide range of applications where high-order relationships are of interest, such as social network analysis and object classification via hypergraph embedding. Applying deep learning techniques to large-scale hypergraphs is challenging due to their size and complex structure. This thesis addresses two problems in hypergraph analysis, real-time partitioning and quantized neural network training, in a distributed computing environment. When processing a large-scale hypergraph in real time and in a distributed fashion, the quality of the hypergraph partitioning has a significant influence on the communication overhead and the workload balance among the machines participating in the distributed processing. The main challenge of real-time hypergraph partitioning is that the hypergraph is represented as a dynamic stream formed by a sequence of hyperedge insertions and deletions, so its structure is constantly changing. Existing methods that require complete information about the hypergraph are inapplicable in this case, as only a sub-graph is available to the algorithm at any time. We solve this problem by proposing a streaming refinement partitioning (SRP) algorithm that partitions a real-time hypergraph stream in two phases. With extensive experiments on a scalable hypergraph framework named HyperX, we show that SRP yields partitions of the same quality as those produced by offline partitioning algorithms in terms of communication overhead and workload balance. For machine learning tasks over hypergraphs, studies have shown that using deep neural networks (DNNs) can improve learning outcomes. This is because the learning objectives in hypergraph analysis are becoming more complex, with features that are difficult to define and highly correlated; DNNs can serve as powerful classifiers that construct features automatically. However, DNNs demand high computational power and network bandwidth as models grow larger. Moreover, the widely adopted training algorithm, stochastic gradient descent (SGD), suffers from two main problems: the vast communication overhead of broadcasting parameters during partial gradient aggregation, and the inherent variance between partial gradients, which impedes the convergence rate of SGD and lengthens training. We investigate these two problems in depth. Without sacrificing performance, we develop a quantization technique to reduce the communication overhead and a new training paradigm, named cooperated low-precision training (C-LPT), in which importance sampling is used to reduce variance and the master and the workers collaborate to compensate for the precision loss caused by quantization. Incorporating deep learning techniques into distributed hypergraph analysis shows great potential for query processing and knowledge mining on high-dimensional data records whose relationships are highly correlated: on the one hand, such a process takes advantage of the strong representational power of DNNs as an appearance-based classifier; on the other hand, it exploits hypergraph representations to benefit from their strong capability of capturing high-order relationships.
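    As a generic illustration of low-precision gradient exchange with error compensation (not the thesis's C-LPT scheme; the uniform quantiser, the residual-feedback trick, and the sizes are assumptions), a worker might quantise its partial gradient before communication and carry the quantisation error forward:

      # Illustrative sketch of low-precision gradient exchange with error compensation.
      # This is a generic stand-in, not the C-LPT procedure from the thesis.
      import numpy as np

      def quantize(grad, num_levels=16):
          """Uniformly quantise a gradient vector to a small number of levels."""
          scale = np.max(np.abs(grad)) + 1e-12
          return np.round(grad / scale * (num_levels - 1)) / (num_levels - 1) * scale

      class Worker:
          def __init__(self, dim):
              self.residual = np.zeros(dim)      # locally accumulated quantisation error

          def compressed_gradient(self, grad):
              """Quantise (gradient + carried-over error) and keep the new error locally."""
              corrected = grad + self.residual
              q = quantize(corrected)
              self.residual = corrected - q
              return q                           # only this low-precision vector is sent

      # Master-side aggregation of the quantised partial gradients from three workers.
      workers = [Worker(dim=4) for _ in range(3)]
      partials = [np.random.randn(4) for _ in workers]
      aggregated = np.mean([w.compressed_gradient(g) for w, g in zip(workers, partials)], axis=0)
      print(aggregated)
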
  • Item
    Towards highly accurate publication information extraction from academic homepages
    Zhang, Yiqing ( 2018)
    More and more researchers present their research profiles on academic homepages. Publications listed on a researcher's academic homepage contain rich information, such as the researcher's fields of expertise, research interests, and collaboration network. Extracting publication information from academic homepages is an essential step in automatic profile analysis, which enables many applications such as academic search, bibliometrics, and citation analysis. The publications extracted from academic homepages can also be a supplementary source for bibliographic databases. We investigate two publication extraction problems in this thesis: (i) Given an academic homepage, how can we precisely extract all the individual publication strings from the homepage? Here, a publication string is a text string that describes a publication record. We call this problem publication string extraction. (ii) Given a publication string, how can we extract its different fields, such as the authors, title, and venue? We call this problem publication field extraction. There are two types of traditional approaches to these problems: rule-based approaches and machine learning based approaches. Rule-based approaches cannot accommodate the large variety of styles found in homepages, and they require significant effort in rule design. Machine learning based approaches rely on a large amount of high-quality training data as well as suitable model structures. To tackle these challenges, we first collect two datasets and annotate them manually. We then propose a training data enhancement method to generate large sets of semi-real data for training our models. For the publication string extraction problem, we propose a PubSE model that captures the structure of a publication list at both the line level and the webpage level. For the publication field extraction problem, we propose an Adaptive Bi-LSTM-CRF model that utilizes both the generated and the manually labeled training data to the full extent. Extensive experimental results show that the proposed methods outperform the state-of-the-art methods on the publication extraction problems studied.
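    The training data enhancement step could, in spirit, look like the toy sketch below: real field values are slotted into publication-string templates to produce semi-real, automatically labelled examples. The templates, values, and output format are made up for illustration; the thesis's actual generation procedure is not reproduced here.

      # Toy sketch of generating semi-real training data for publication field extraction.
      # Templates and field values are illustrative; the real method is more elaborate.
      import random

      TEMPLATES = [
          "{authors}. {title}. In {venue}, {year}.",
          "{authors} ({year}). {title}. {venue}.",
      ]

      FIELD_VALUES = {
          "authors": ["A. Smith and B. Jones", "C. Lee, D. Kim"],
          "title":   ["A Study of Web Data Extraction", "Learning to Link Records"],
          "venue":   ["Proceedings of WSDM", "Journal of Web Engineering"],
          "year":    ["2016", "2017"],
      }

      def generate_example(rng=random):
          """Fill a random template with real field values; keep the fields as labels."""
          template = rng.choice(TEMPLATES)
          fields = {name: rng.choice(values) for name, values in FIELD_VALUES.items()}
          return template.format(**fields), fields

      publication_string, gold_fields = generate_example()
      print(publication_string)
      print(gold_fields)
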
  • Item
    Policy learning for task-oriented dialogue systems via reinforcement learning techniques
    Yin, Chuandong ( 2018)
    Task-oriented dialogue systems such as Apple Siri and Microsoft Cortana are becoming increasingly popular and are attracting much attention. They aim to serve people as virtual personal assistants: for example, they can help users create calendar events or look up the weather through conversations, improving work efficiency and convenience in daily life. Nevertheless, task-oriented dialogue systems are still in their infancy and face many problems in dialogue policy learning (i.e., learning the next response given a user input and a dialogue context). Most existing work uses supervised learning techniques to learn dialogue policies. However, supervised learning models, especially deep learning models, are usually data-hungry and need a large number of training dialogues. It is difficult to obtain such a large number of training dialogues, since intensive labeling effort is needed to collect them for a given task domain. Moreover, it is also difficult to measure the quality (correctness) of a training dialogue or of each response in a dialogue, while a supervised learning method needs such information as the training signal to guide policy learning. To overcome these shortcomings, we take a reinforcement learning based approach to policy learning in task-oriented dialogue systems. In the reinforcement learning paradigm, a user simulator (i.e., the environment) is introduced to mimic various user behaviors and to interact with an agent (i.e., a task-oriented dialogue system) that needs to be trained. Via simulated interactions with the user simulator, the agent learns how to make correct responses using a small number of training dialogues and eventually becomes well trained enough to serve real users. We identify two limitations of the reinforcement learning based approach and offer solutions to overcome them in this study. First, existing reinforcement learning based training procedures do not take into account the context uncertainties of ambiguous user inputs when learning dialogue policies. In task-oriented dialogue systems, user inputs are first transformed into a series of ⟨entity name, entity value, confidence⟩ tuples by the automatic speech recognition (ASR) and natural language understanding (NLU) components. Here, the entity name represents an attribute that makes up a target task (e.g., cuisine types in a restaurant booking task) and the entity value represents its value (e.g., “Korean”). The confidence field indicates how confident the ASR and NLU components are in recognizing the entity name and entity value from the user input. For ambiguous user input (e.g., due to environmental noise or user accents), the confidence may be lower. A low confidence value triggers a “confirmation” response from the task-oriented dialogue system, which asks the user to confirm whether the recognized entity name and value are the intended ones. Existing work uses a fixed confidence threshold to trigger confirmation responses. However, the confidence threshold should vary from person to person. For users with accents, even if the entity values are correctly recognized, the confidence is generally lower than for users without accents. If a universal confidence threshold is used, it may lead to many rounds of unnecessary confirmations and lengthy dialogues, which degrades the user experience. To address this issue, we propose to learn a dynamic threshold based on the dialogue context.
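    To make the threshold idea concrete, here is an illustrative (not thesis-provided) comparison of a fixed confirmation threshold with a context-dependent one; dynamic_threshold() is a hand-written placeholder for what PA4C learns, and all numbers are made up.

      # Illustrative confirmation logic with a context-dependent confidence threshold.
      # dynamic_threshold() is a placeholder for the learned policy; 0.7 stands in for
      # the conventional universal threshold.
      def should_confirm_fixed(confidence, threshold=0.7):
          """Conventional rule: confirm whenever confidence falls below a fixed threshold."""
          return confidence < threshold

      def dynamic_threshold(dialogue_context):
          """Placeholder for a learned, per-context threshold (e.g. lower for accented users)."""
          base = 0.7
          if dialogue_context.get("user_has_accent"):
              base -= 0.15                      # accept lower ASR confidence for these users
          if dialogue_context.get("slot_already_confirmed"):
              base -= 0.10                      # avoid re-confirming the same slot
          return base

      def should_confirm_dynamic(confidence, dialogue_context):
          return confidence < dynamic_threshold(dialogue_context)

      slot = {"entity_name": "cuisine", "entity_value": "Korean", "confidence": 0.62}
      context = {"user_has_accent": True, "slot_already_confirmed": False}
      print(should_confirm_fixed(slot["confidence"]))             # True: would ask to confirm
      print(should_confirm_dynamic(slot["confidence"], context))  # False: accepts the value
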
    Learning a dynamic threshold is challenging, however, because the response (action) space is too large (i.e., each different response of the task-oriented dialogue system may require a different confidence threshold). Finding an optimal dynamic threshold that suits the entire response space requires a large number of simulation steps, so it is difficult for reinforcement learning models to fully explore the response space. We therefore propose a parameterized auxiliary asynchronous advantage actor-critic (PA4C) model to solve this problem. PA4C uses the action parameterization technique to reduce the size of the response space and introduces auxiliary tasks to explore responses efficiently, which helps learn the dynamic threshold and improves task completion rates. We evaluate PA4C on a calendar event creation task, and the experimental results show that PA4C outperforms the state-of-the-art baselines by over 10% in completion rate, dialogue length, and episode reward. Second, existing task-oriented dialogue systems such as Apple Siri can only handle short dialogues that are completed in a few turns, such as looking up the weather. As a dialogue gets longer, learning the optimal policy for each turn becomes increasingly difficult. This is because we can only obtain a global reward at the end of a training dialogue, indicating whether the task has been completed or not; there are no local rewards to evaluate the quality of the response in each intermediate dialogue turn. Propagating the global reward back to each dialogue turn to optimize the turn-by-turn policy is challenging. This is known as the sparse rewards problem in reinforcement learning, and it may trap reinforcement learning models in local optima. To address this issue, we propose a new reinforcement learning model named PGGAN that provides local rewards for each dialogue turn by combining policy gradient (PG) and generative adversarial networks (GANs). In particular, we train a discriminator to evaluate the similarity between a response produced by the dialogue agent and a human response. This similarity score is then used as the local reward to optimize the dialogue agent (i.e., the generator) in each intermediate dialogue turn. In this way, the sparse rewards problem can be solved and the dialogue agent can therefore handle long dialogues. We evaluate PGGAN on a public dialogue dataset, and the experimental results show that PGGAN significantly outperforms the state-of-the-art baselines by up to 25% in completion rate.
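    The local-reward idea can be illustrated with the minimal sketch below, where a trivial token-overlap "discriminator" scores each agent turn against a human turn and the scores feed a discounted-return computation; PGGAN's actual networks and adversarial training are not reproduced, and every example value is invented.

      # Minimal illustration of using a discriminator score as a per-turn (local) reward.
      # The discriminator here is a trivial stand-in, not PGGAN's trained network.
      import numpy as np

      def discriminator_score(agent_response, human_response):
          """Stand-in discriminator: token-overlap similarity in [0, 1] between responses."""
          a, h = set(agent_response.split()), set(human_response.split())
          return len(a & h) / max(len(a | h), 1)

      def turn_rewards(dialogue):
          """Assign a local reward to every intermediate turn instead of one final reward."""
          return [discriminator_score(agent, human) for agent, human in dialogue]

      def discounted_returns(rewards, gamma=0.99):
          """Returns used by the policy-gradient update of the dialogue agent (generator)."""
          g, out = 0.0, []
          for r in reversed(rewards):
              g = r + gamma * g
              out.append(g)
          return list(reversed(out))

      dialogue = [("what date is the event", "which date should I use"),
                  ("event created for friday", "I have created the event for Friday")]
      print(turn_rewards(dialogue))
      print(np.round(discounted_returns(turn_rewards(dialogue)), 3))
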
  • Item
    On the effectiveness of removing location information from trajectory data for preserving location privacy
    Hossain, Amina ( 2017)
    The ubiquity of GPS-enabled smartphones with Internet connectivity has resulted in the widespread development of location-based services (LBSs). People use these services to obtain useful advice for their daily activities; for example, a user can open a navigation app to find the route with the shortest driving time from the current location to a destination. Nevertheless, people have to reveal location information to LBS providers to use such services. Location information is sensitive, since it can reveal the habits of an individual. LBS providers are aware of this and take measures to protect user privacy. One well-established and simple approach is to remove GPS data from user data, under the assumption that this leads to a high degree of privacy. In this thesis, we challenge the notion that removing location information while retaining other features leads to a high degree of location privacy. We find that it is possible to reconstruct the original routes by analyzing just the turn instructions provided to a user by a navigation service. We evaluate our approach using both synthetic and real road network data and demonstrate the effectiveness of this new attack in a range of realistic scenarios.
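    A simplified picture of how such an attack might proceed is sketched below: a tiny synthetic road network is searched for edge sequences whose turn sequence matches the given instructions. The graph encoding and the matching rule are illustrative assumptions, not the thesis's implementation.

      # Schematic sketch of reconstructing a route from turn instructions alone.
      # The road network and the matching rule are illustrative assumptions.
      def turn_at(road_graph, prev_edge, next_edge):
          """Look up the manoeuvre between two consecutive road segments."""
          return road_graph["turns"][(prev_edge, next_edge)]   # "left", "right", ...

      def candidate_routes(road_graph, instructions):
          """Enumerate edge sequences whose turn sequence matches the instructions."""
          routes = [[e] for e in road_graph["edges"]]          # any segment may start a route
          for wanted in instructions:
              extended = []
              for route in routes:
                  for nxt in road_graph["adjacent"][route[-1]]:
                      if turn_at(road_graph, route[-1], nxt) == wanted:
                          extended.append(route + [nxt])
              routes = extended
          return routes

      # Tiny synthetic network: edges are named road segments.
      road_graph = {
          "edges": ["a", "b", "c", "d"],
          "adjacent": {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []},
          "turns": {("a", "b"): "left", ("a", "c"): "right",
                    ("b", "d"): "right", ("c", "d"): "left"},
      }
      print(candidate_routes(road_graph, ["left", "right"]))   # -> [['a', 'b', 'd']]
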
  • Item
    Supervised algorithms for complex relation extraction
    Khirbat, Gitansh ( 2017)
    Binary relation extraction is an essential component of information extraction systems, wherein the aim is to extract meaningful relations that might exist between a pair of entities within a sentence. Binary relation extraction systems have witnessed significant improvement over the past three decades, ranging from rule-based systems to statistical natural language techniques including supervised, semi-supervised, and unsupervised machine learning approaches. Modern question answering and summarization systems have motivated the need for extracting complex relations, in which the number of related entities is more than two. Complex relation extraction (CRE) systems are highly domain specific and often rely on traditional binary relation extraction techniques employed in a pipeline fashion, and are thus susceptible to processing-induced error propagation. In this thesis, we investigate and develop approaches to extract complex relations directly from natural language text. In particular, we deviate from the traditional decomposition of complex relations into constituent binary relations and propose using the shortest dependency parse spanning the n related entities to facilitate direct CRE. We investigate this approach through a comprehensive study of supervised learning algorithms, with a special focus on training support vector machines, convolutional neural networks, and deep learning ensemble algorithms. Research in the domain of CRE is stymied by a paucity of annotated data. To facilitate future exploration, we create two new datasets to evaluate our proposed CRE approaches on a pilot biographical fact extraction task. An evaluation on the new and standard datasets shows that using the shortest-path dependency parse in a supervised setting enables direct CRE with improved accuracy, beating current state-of-the-art CRE systems. We further show the application of CRE to achieve state-of-the-art performance in extracting events directly, without the need to decompose the task into event trigger and event argument extraction processes.
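    The core idea of taking the dependency structure spanning all n entities can be illustrated as follows, using a toy head-index parse; the classifiers trained over this structure (SVMs, CNNs, ensembles) are not shown, and the helper names and example sentence are invented for the illustration.

      # Sketch of extracting the minimal dependency subtree spanning n entity tokens.
      # The parse is a toy head-index array (0-based, root's head = -1).
      def path_to_root(heads, i):
          path = [i]
          while heads[path[-1]] != -1:
              path.append(heads[path[-1]])
          return path

      def spanning_dependency_subtree(heads, entity_indices):
          """Union of the entities' root-paths, trimmed at their deepest common ancestor."""
          paths = [path_to_root(heads, i) for i in entity_indices]
          shared = set(paths[0])
          for p in paths[1:]:
              shared &= set(p)
          lca = min(shared, key=lambda n: paths[0].index(n))   # deepest common ancestor
          nodes = set()
          for p in paths:
              nodes.update(p[:p.index(lca) + 1])               # keep each path up to the LCA
          return sorted(nodes)

      # "Alice studied physics at Oxford in 1990": toy heads, "studied" (index 1) is root.
      heads = [1, -1, 1, 1, 3, 1, 5]
      tokens = ["Alice", "studied", "physics", "at", "Oxford", "in", "1990"]
      entities = [0, 4, 6]                                     # Alice, Oxford, 1990
      print([tokens[i] for i in spanning_dependency_subtree(heads, entities)])
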
  • Item
    Efficient orthogonal parametrisation of recurrent neural networks using householder reflections
    Mhammedi, Zakaria ( 2017)
    In machine learning, Recurrent Neural Networks (RNNs) have been used successfully in many applications. They are particularly well suited to problems involving time series, because RNNs process input sequences one element at a time and thus take the chronological order of the elements into account. In many practical prediction tasks involving time series, long-term dependencies in the data are crucial; these are the features that RNNs need to learn. However, learning long-term dependencies in sequences with RNNs remains a major challenge, mainly due to the exploding and vanishing gradient problems, which become more prominent the longer the input sequences are. Recent methods address this problem by constraining the transition matrix to be unitary during training, which ensures that its norm is exactly one. This sets an upper bound on the norm of the back-propagated gradients, preventing them from growing exponentially. However, existing methods either have limited expressiveness or scale poorly with the size of the network compared with the simple RNN case, especially when using stochastic gradient descent. Our contributions are as follows. We first show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint; therefore, it may not be necessary to work with complex-valued matrices. We then present a new parametrisation of the transition matrix which allows efficient training of an RNN while ensuring that the matrix is always orthogonal. Furthermore, a good trade-off between speed and expressiveness can be chosen by selecting the right number of reflection vectors, the vectors used in the new parametrisation. In particular, when the number of reflection vectors equals the size of the hidden layer, the transition matrix can span the full set of orthogonal matrices. Using our approach, one stochastic gradient step can, in the worst case, be performed in time complexity $\mathcal{O}(Tn^2)$, where $T$ is the length of the input sequence and $n$ is the size of the hidden layer. This is the same time complexity as that of the simple RNN. Finally, we test the new parametrisation on problems with long-term dependencies. Our results suggest that the orthogonal constraint on the transition matrix has benefits similar to those of the unitary constraint.
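    A minimal sketch of the parametrisation idea, assuming the transition is a product of k Householder reflections applied directly to the hidden state (training code and the full RNN are omitted, and variable names are illustrative): each reflection costs O(n), so a step costs O(nk), and the hidden-state norm is preserved because every reflection is orthogonal.

      # Minimal sketch: the transition is a product of k Householder reflections,
      # H(u) = I - 2 u u^T / (u^T u), applied to the hidden state without ever
      # forming an n x n matrix. With k equal to the hidden size the parametrisation
      # reaches the full expressiveness described in the abstract above.
      import numpy as np

      def apply_householder(u, h):
          """Compute H(u) h = h - 2 u (u^T h) / (u^T u)."""
          return h - 2.0 * u * (u @ h) / (u @ u)

      def orthogonal_transition(reflection_vectors, h):
          """Apply the product of reflections H(u_k) ... H(u_1) to the hidden state h."""
          for u in reflection_vectors:
              h = apply_householder(u, h)
          return h

      rng = np.random.default_rng(0)
      n, k = 8, 4
      reflection_vectors = [rng.standard_normal(n) for _ in range(k)]
      h = rng.standard_normal(n)
      h_next = orthogonal_transition(reflection_vectors, h)
      print(np.allclose(np.linalg.norm(h_next), np.linalg.norm(h)))   # True: norm preserved
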
  • Item
    Low-cost leaving home activity recognition using mobile sensing
    Li, Han ( 2017)
    Leaving home activity recognition (LHAR) is essential in context-aware applications. For example, on a rainy day, a smartphone can remind a user to bring an umbrella when the leaving home activity is recognized. However, research in this field is substantially lacking: most existing studies require extra hardware, such as sensors installed at home, to help recognize such activities, which limits their applicability. With the ubiquity of mobile sensing techniques, it has become feasible for a smartphone to sense the ambient environment and a user's current context. In this thesis, we develop a low-cost system that uses mobile sensing for timely recognition of leaving home activities. To the best of our knowledge, we are the first to recognize leaving home activities using only the sensors on smartphones. Overall, our system can recognize leaving home activities within 20 seconds after the home door is closed, with a precision of 93.1% and a recall of 100%. Recognizing leaving home activities while relying only on smartphone sensors is challenging in two respects: 1) the diversity of home environments results in inconsistent features, which significantly affects recognition performance; and 2) mobile sensing is restricted by the limited resources on smartphones, e.g., power and computation capability. To overcome these limitations, we first investigate the sensors available on commodity smartphones and find that features extracted from WiFi, barometer, cell tower, magnetic field sensor, and accelerometer readings are relatively robust for recognizing leaving home activities. Second, because of the variety of home environments, we propose a sensor selection algorithm that adaptively selects the most suitable sensors for each home to personalize the training. Both classification performance and sensing cost are taken into consideration, giving the user the option to trade power consumption for recognition accuracy and vice versa. Inspired by the observation that a leaving home activity usually involves walking, we activate the power-hungry sensors to start recognition only when a walking event is detected, reducing the sensing cost by 76.65%.
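    The walking-triggered sensing strategy might look like the sketch below, where a cheap accelerometer-based walking detector gates the activation of the power-hungry sensors; the threshold, window, and sensor names are placeholder assumptions rather than the system's actual parameters.

      # Sketch of event-triggered sensing: keep only the cheap accelerometer running and
      # activate the power-hungry sensors when walking is detected. All values are placeholders.
      import numpy as np

      POWER_HUNGRY_SENSORS = ["wifi_scan", "barometer", "cell_tower", "magnetometer"]

      def is_walking(accel_magnitudes, threshold=1.5):
          """Crude walking detector: standard deviation of acceleration magnitude in a window."""
          return float(np.std(accel_magnitudes)) > threshold

      def sensing_step(accel_window, activate_sensor):
          """Activate expensive sensors only while a walking event is in progress."""
          if is_walking(accel_window):
              return {name: activate_sensor(name) for name in POWER_HUNGRY_SENSORS}
          return {}                                   # stay in low-power mode

      # Example with a fake activation function and a synthetic accelerometer window.
      window = np.array([9.8, 11.2, 8.1, 12.0, 7.9])
      readings = sensing_step(window, activate_sensor=lambda s: f"{s}: sampled")
      print(readings or "low-power: no expensive sensing")
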
  • Item
    Linking news and tweets
    Lin, Xiaojie ( 2016)
    In recent years, the rise of social media such as Twitter has been changing the way people acquire information. Meanwhile, traditional information sources such as news articles are still irreplaceable. These have led to a new branch of study on understand- ing the relationship between news articles and social media posts and fusing information from these heterogeneous sources. In this study, we focus on techniques for linking news and relevant tweets. Specifically, given a set of news articles and a set of tweets, we aim to find a subset of relevant tweets for each article in the news set. We propose a super- vised linking algorithm that employs multiple features such as time, named entities and event phrases, which outperforms all the baseline methods in our experiments. Since humans can accurately determine whether a tweet is relevant to a news article, we study two types of crowdsourcing algorithms, namely validation algorithms and amendment algorithms. Validation algorithms put a news-tweet pair in the result set only if the pair is validated by the crowd in some way. We propose three validation algorithms, namely Highest, EP and EP+, all of which effectively collect relevant tweets while maintaining a high precision in the experiments. Amendment algorithms first use a machine-based approach to generate an initial set of results, and then improve the result set by asking the crowd informative questions. The amendment algorithms Half , HHalf and LHalf also demonstrate their effectiveness on improving the quality of the result sets in the experi- ments. We extend our study to social network subgraph matching algorithms, which can be used to i) improve the performance of our linking algorithms; ii) analyze and extract information from the news-tweet network or other social network graphs. We propose an efficient GPU algorithm, which significantly outperforms the state-of-the-art CPU al- gorithm in terms of running time in our experiments when the workload is heavy.