Computing and Information Systems - Theses
Towards Robust Representation of Natural Language Processing
There are many challenges in building robust natural language applications. Machine learning based methods require large volumes of annotated text data, and variation in text leads to several problems: (1) language is highly variable and can be expressed in different ways, both lexically and syntactically, and robust models should be able to handle such variation. (2) A text corpus is heterogeneous, often making language systems domain-brittle; solutions for domain adaptation and for training with corpora comprising multiple domains are required for real-world language applications. (3) Many language applications tend to be biased towards the demographics of the authors of the documents the system is trained on, and lack model fairness; demographic bias also causes privacy issues when a model is made available to others. In this thesis, I aim to build robust natural language models to tackle these problems, focusing on deep learning approaches, which have shown great success in language processing via representation learning. I pose three basic research questions: how to learn representations that are robust to language variation, to domain variation, and to demographic variables. Each research question is tackled using different approaches, including data augmentation, adversarial learning, and variational inference. For learning representations robust to language variation, I study lexical and syntactic variation. Specifically, a regularisation method is proposed to tackle lexical variation, and a data augmentation method is proposed to build robust models, using a range of language generation methods from both linguistic and machine learning perspectives. For domain robustness, I focus on multi-domain learning and investigate both domain supervised and domain unsupervised learning, where domain labels may or may not be available. 
Two types of models are proposed, via adversarial learning and latent domain gating, to build robust models for heterogeneous text. For robustness to demographics, I show that demographic bias in the training corpus leads to model fairness problems with respect to the demographics of the authors, as well as privacy issues under inference attacks. Adversarial learning is adopted to mitigate bias in representation learning, improving model fairness and privacy preservation. To evaluate generalisation and robustness, both in-domain and out-of-domain experiments are conducted with two classes of language tasks: text classification and part-of-speech tagging. For multi-domain learning, multi-domain language identification and multi-domain sentiment classification are conducted, and I simulate both domain supervised and domain unsupervised learning to evaluate domain robustness. I evaluate model fairness with different demographic attributes and apply inference attacks to test model privacy. The experiments demonstrate the advantages and robustness of the proposed methods. Finally, I discuss the relations between the different forms of robustness, including their commonalities and differences. The limitations of this thesis are discussed in detail, including potential methods to address these shortcomings in future work and opportunities to generalise the proposed methods to other language tasks. Above all, these methods of learning robust representations can contribute towards progress in natural language processing.
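The idea of regularising a model against lexical variation can be illustrated with a simple word-dropout scheme, in which input tokens are randomly replaced with an unknown symbol during training. This is a minimal sketch for illustration only; the thesis's actual regularisation and augmentation methods are not specified here, and the dropout rate and `<unk>` symbol are assumptions.

```python
import random

def word_dropout(tokens, rate=0.1, unk="<unk>", seed=0):
    """Randomly replace tokens with an unknown symbol, a simple
    regulariser that discourages over-reliance on specific words."""
    rng = random.Random(seed)
    return [unk if rng.random() < rate else t for t in tokens]

# Each training epoch would see a slightly different lexical surface form
print(word_dropout("the movie was absolutely great".split(), rate=0.5))
```

Applied on the fly during training, this exposes the model to many lexical variants of each sentence without requiring extra annotation.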
Temporal analytics for understanding students’ study behaviours in digital educational environments
The growing use of technology in education has motivated research to study and promote students’ academic success in the educational environments that apply it. Digital educational settings provide a high volume of data for this analytical purpose and enable researchers to easily collect and analyse data from students’ interactions within the system (audit trails) as they proceed towards their study goals. This has motivated the development of Learning Analytics (LA) approaches in these educational settings, which offer innovative applications of analytical methods to understand and promote students’ study behaviours. One of the main challenges of LA approaches is making connections between students’ data traces and educational assumptions, which is necessary to ensure the improvement of education and requires interdisciplinary knowledge. In addition, in digital environments, students’ data are available at different levels (i.e., fine-grained to coarse-grained) and can be structured in varied ways (e.g., aggregated, temporal). These data require appropriate formulation to reveal information about specific aspects of students’ study behaviours. To date, the main focus of LA research has been on representing students’ study behaviour through aggregated measures of their task-level interactions within digital environments, which has helped identify various patterns of learning processes associated with learning outcomes. However, this approach suffers from some limitations, mainly because it neglects the time dimension, which could better reveal the effect of the processes students use during studying. In addition, students’ behaviour at specific context levels, such as the session, remains understudied, yet may reveal novel insights into particular aspects of students’ study processes. 
This thesis provides an understanding of students’ study behaviours by considering specific levels at which students’ data can be conceptualised and their behaviours structured. In the first part of this thesis, a temporal analysis based on clustering and statistical tests is performed in the context of a Massive Open Online Course (MOOC), where students’ study behaviour is investigated at the session level; that is, dedicated blocks of time in which learners complete single or multiple contiguous learning tasks without interruption. The concept of “session” has rarely been explicitly examined in relation to learning outcomes in online learning. Creating and managing sessions when learning online is an important aspect of students’ time management strategies, which can subsequently impact their academic outcomes. The results of this study provide insight into the varied ways that students organize and prioritise their time in terms of sessions when learning in a MOOC, and how these behaviours impact students’ academic outcomes. In the second part, a study is conducted in the context of two offerings of a MOOC, where the impact of sequential representations of students’ task-level behaviour on their learning outcome is investigated. This study considers assessment task outcomes as a proxy for learning outcome, rather than students’ final achievement, which provides more insight into students’ progress over time. For this purpose, temporal and non-temporal prediction models are used to show how the sequential nature of learners’ task-level behaviour in a MOOC is more informative (predictive) of their assessment outcomes than the aggregated measures examined in most studies. Additionally, the study provides insight into variations in the behavioural sequences of high- and low-achieving students when preparing for assessments, using a sequential pattern mining approach. 
The results show that it is possible to successfully predict students’ readiness for assessment tasks, particularly if the sequential aspects of students’ behaviour are represented in the model. Moreover, the results reveal behavioural patterns reflecting specific learning strategies that may be more effective in promoting learning. In the third part of this thesis, a study is performed in the context of digital word processing software to examine the importance of the temporal nature of students’ writing behaviour for their writing outcome. It helps to understand how particular aspects of the writing process at specific moments of writing influence the writing outcome. This view is understudied in writing research using students’ audit trails (i.e., keystrokes). For this purpose, a temporal approach is proposed that combines classification and local feature interpretation. The results reveal the importance of temporal analysis when studying students’ writing behaviour. Findings also reveal that the influence of specific writing behaviours on writing quality is likely determined in combination with other writing characteristics, which emphasises the necessity of using models that capture and account for the interrelationships between features. In summary, this work contributes to learning analytics research by raising awareness of the need to account for various levels of conceptualizing data and different dimensions when studying learners’ behaviour. Various stakeholders could benefit from the knowledge discovered in this research to improve learners’ study behaviours. In particular, educators can identify which study behaviours require support, and (most importantly) when, so they can select relevant interventions to include in their courses.
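The session concept above rests on segmenting a raw stream of timestamped learning events into sessions. A common approach is an inactivity threshold: a new session starts whenever the gap between consecutive events exceeds a cut-off. This is a minimal sketch; the 30-minute threshold is an illustrative assumption, not a parameter from the thesis.

```python
from datetime import datetime, timedelta

def split_into_sessions(events, gap=timedelta(minutes=30)):
    """Group timestamped learning events into sessions: a new session
    starts whenever the inactivity gap exceeds the threshold."""
    sessions, current = [], []
    for ts in sorted(events):
        if current and ts - current[-1] > gap:
            sessions.append(current)
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions

# Three events close together, then one event after a two-hour break
t0 = datetime(2021, 1, 1, 9, 0)
events = [t0, t0 + timedelta(minutes=5), t0 + timedelta(minutes=12),
          t0 + timedelta(hours=2)]
print(len(split_into_sessions(events)))  # 2 sessions
```

Session counts, lengths, and spacing derived this way can then be related to academic outcomes, as in the first study.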
Budget-constrained Workflow Applications Scheduling in Workflow-as-a-Service Cloud Computing Environments
The adoption of the workflow, an application model of interconnected tasks and data processing, in the scientific community has accelerated scientific discovery. Workflows facilitate the execution of complex scientific applications that involve vast amounts of data. These workflows are large-scale applications and require massive computational infrastructure. Therefore, deploying them in distributed systems, such as cloud computing environments, is a necessity to achieve reasonable processing times. With the increasing demand for scientific workflow execution and the rising trend of cloud computing environments, there is a potential market for providing a computational service for executing scientific workflows in the clouds. Hence, the term Workflow-as-a-Service (WaaS) has emerged alongside the rise of the Everything-as-a-Service concept. The WaaS concept extends the functionality of a conventional workflow management system (WMS) to serve a greater number of users in a utility service model. In this case, the platform, called the WaaS platform, must be able to handle multiple-workflow scheduling and resource provisioning in cloud computing environments, in contrast to the single-workflow management of a traditional WMS. This thesis investigates novel approaches for budget-constrained resource provisioning and scheduling of multiple workflows in the context of the WaaS platform. These approaches address the challenges in managing the execution of multiple workflows that come not only from the users' perspective, which includes the heterogeneity of workloads, quality of services, and software requirements, but also from the cloud environments that serve as the underlying computational infrastructure. The latter aspect raises the issues of resource heterogeneity, performance variability, and uncertainties in the form of overhead delays in resource provisioning and network-related activities. 
It pushes the boundary of the area by making the following contributions:
- A taxonomy and survey of state-of-the-art multiple-workflow scheduling in multi-tenant distributed computing systems.
- A budget distribution strategy to assign tasks' budgets based on the heterogeneous types of VMs in cloud computing environments.
- A budget-constrained resource provisioning and scheduling algorithm for multiple workflows that aims to minimize workflows' makespan while meeting the budget.
- An online and incremental learning approach to predict task runtime that accounts for the performance variability of cloud computing environments.
- The implementation of the multiple-workflow scheduling algorithm and its integration into an existing WMS towards the development of a WaaS platform.
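The budget distribution contribution can be illustrated with a minimal sketch that splits a workflow's total budget across its tasks in proportion to each task's estimated execution cost. The proportional rule and the per-task cost estimates here are illustrative assumptions, not the thesis's actual strategy, which accounts for heterogeneous VM types.

```python
def distribute_budget(task_est_costs, total_budget):
    """Assign each task a share of the workflow budget proportional
    to its estimated execution cost (illustrative strategy only)."""
    total_est = sum(task_est_costs.values())
    return {t: total_budget * c / total_est
            for t, c in task_est_costs.items()}

# Hypothetical per-task cost estimates (e.g., cheapest-VM runtime x price)
est = {"t1": 2.0, "t2": 6.0, "t3": 2.0}
alloc = distribute_budget(est, total_budget=5.0)
print(alloc["t2"])  # 3.0
```

A scheduler can then provision, for each task, the fastest VM type whose cost fits within the task's allocated share.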
Health Information Systems Enabled Transformation of Service Ecosystems: The Case of Indonesian Healthcare
Information and Communication Technology (ICT) has contributed significantly to the socio-economic development of societies. In particular, developing countries are now beginning to undertake ICT-enabled transformations that previously took place in the western world. However, while the proliferation of ICT is considered a crucial enabler of this transformation, ICT for Development (ICT4D) projects continue to fail, as they do not achieve the anticipated societal impacts. Therefore, a holistic and systemic perspective on ICT4D research is needed to enhance the current understanding of these phenomena. This study addresses this knowledge gap through an in-depth investigation of how the structure of the public health ecosystem in Indonesia is changed and transformed following the introduction of Health Information Systems (HIS). A qualitative multiple case study was conducted across three district-level governments. The analysis reveals the distinctive impacts of HIS introduction on the structural properties of the ecosystem, which include institutional rules, resource configurations, actors’ institutional logics, and practices. This study also identifies three mechanisms (adoption-incorporation, breaking-making, and self-reinforcing) of HIS-enabled transformation, which constitute two pathways (enslaving and emergence) of the ecosystem's transformation. The findings of this study offer theoretical contributions to the ICT4D and service literature, and practical contributions to HIS implementation in Indonesia. The transformation process of the ecosystem’s structure offers a systemic perspective on ICT4D, which overcomes the tendency to overemphasise the role of technology and agency in developing countries. The pathways of transformation complement earlier studies investigating the reasons for the numerous failures of top-down technological transfer and the importance of inclusion, engagement, and empowerment of societal groups in ICT4D. 
For the service literature, this study offers insights into the origins and lifecycle of practices and how they emerge in the ecosystem, which sheds light on the dynamic and evolving nature of the ecosystem’s structure, something that has not yet been adequately understood. Finally, the results of this study advocate the autonomy of district health providers, the inclusion and engagement of local actors, and the use of an incremental approach to HIS implementation in public health ecosystems.
Privacy-Preserving Approaches to Analyzing Sensitive Trajectory Data
The evolution of smart devices and sensor-enabled vehicles has brought forward the capability to collect large and rich datasets. These datasets provide unprecedented opportunities for devising the next generation of location-based decision systems. Analysing detailed, continually updated information on a user's status, such as location, speed and direction, is vital to improving the safety, reliability, mobility and efficiency of any form of location-based service in smart cities. More generally, trajectory data is paramount for studying people's movement patterns, shopping behaviour and preferences (e.g., visited cafes, parks, and their sequence of points of interest). However, such fine-grained data raises significant concerns about the privacy of individuals, which in turn hinders the further development of next-generation applications that benefit from trajectory data. Such data can reveal various sensitive information about individuals, such as their home and workplace locations, whereabouts over time, and health. Recent approaches to addressing such concerns use a strong privacy guarantee known as differential privacy. Their aim is to tackle a core privacy challenge: publishing modified datasets of individuals without compromising their privacy while not sacrificing the utility of the published data. However, the current approaches guaranteeing differential privacy are limited in scalability and utility for real applications, both of which are crucial for later usage and data analytics. In this thesis, we are concerned with publishing trajectory data, which poses privacy risks due to its sequential nature. A key issue is that known algorithms fail to preserve the utility of published trajectory data when perturbing it to satisfy differential privacy. Critical information in trajectory datasets, such as total travel distances and frequent location patterns, cannot be fully preserved by existing differentially private algorithms. 
This thesis investigates three research issues. First, it is known that simple histograms, which are widely studied under differential privacy, are insufficient to capture aggregated information for spatial data. Our first work shows how to instead use spatial histograms to provide an accurate distribution of traffic counts with a differential privacy guarantee. Spatial histograms must satisfy spatial (sequential) constraints, and naively applying differential privacy can destroy these constraints. Our proposed algorithm computes new information about trajectory counts without destroying spatial constraints and hence improves the utility of the published data. We further refine the algorithm to improve utility by incorporating the traffic distribution. Intuitively, dense regions yield more information about the trajectory counts than sparse regions. Since the density of different regions might be uneven, we directly use trajectory densities to accurately compute information about the trajectory distribution in the regions, efficiently scaling the added noise to ensure differential privacy. Spatial histogram data has limitations in terms of spatial queries. For example, we cannot ask queries such as "how many trajectories start from location A and end at location B?". To address this limitation, in our third work, instead of using count information from trajectories as in spatial histograms, we use actual trajectory data. We introduce a graphical model to capture accurate statistics about the movement behaviours in trajectories. Using this model, our algorithm privately generates synthetic trajectories such that the noise is optimally added to capture the movement direction of a trajectory. 
Our algorithm preserves both the spatial and temporal information of trajectories in the generated dataset, requires less memory and computation than competing approaches, and preserves the properties of the original trajectory data in terms of travelled distance, movement patterns and locations of interest. Our extensive theoretical and experimental analysis shows a significant improvement in the utility of the data published by our algorithms.
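The baseline mechanism that the spatial-histogram work improves upon is the standard Laplace mechanism for count queries: add noise drawn from a Laplace distribution with scale sensitivity/epsilon to each count. The sketch below shows only this baseline (with a clamp to non-negative counts); it does not repair the spatial constraints that naive perturbation destroys, which is precisely the thesis's contribution. The region counts and fixed seed are illustrative.

```python
import numpy as np

def laplace_histogram(counts, epsilon, sensitivity=1.0):
    """Perturb histogram counts with Laplace noise of scale
    sensitivity/epsilon, the standard mechanism for count queries,
    then clamp negative results to zero."""
    rng = np.random.default_rng(0)  # fixed seed for reproducibility
    noise = rng.laplace(0.0, sensitivity / epsilon, size=len(counts))
    return np.maximum(np.asarray(counts, dtype=float) + noise, 0.0)

true_counts = [120, 45, 8, 0]  # hypothetical trajectory counts per region
noisy = laplace_histogram(true_counts, epsilon=1.0)
print(noisy.shape)  # (4,)
```

Smaller epsilon means stronger privacy but larger noise, which is why density-aware noise scaling matters for sparse regions.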
Voice interaction game design and gameplay
This thesis is concerned with the phenomenon of voice-operated interaction with characters and environments in videogames. Voice interaction with virtual characters has become common in recent years, due to the proliferation of conversational user interfaces that respond to speech or text input through the persona of an intelligent personal assistant. Previous studies have shown that users experience a strong sense of social presence when speaking aloud to a virtual character, and that voice interaction can facilitate playful, social and imaginative experiences of the type that are often experienced when playing a videogame. Despite this, the user experience of voice interaction is frequently marred by frustration, embarrassment and unmet expectations. The aim of this thesis is to understand how voice interaction can be used in videogames to support more enjoyable and meaningful player experiences. Voice-operated videogames have existed for more than three decades, yet little research exists on how they are designed and how they are received by players. The thesis addresses that knowledge gap through four empirical studies. The first study looks at player responses to a videogame character that can be given commands through a natural language interface. The second study is a historical analysis of voice-operated games that examines the technological and cultural factors that have shaped their form and popularity. The third study develops a pattern language for voice game design based on a survey of 471 published videogames with voice interaction features. The fourth study compares player responses to videogames that feature speech-based voice interaction and non-verbal voice interaction, and applies the theoretical perspective of frame analysis to interpret their reactions. Through these studies, the thesis makes two main contributions to the human-computer interaction and games studies literature. 
First, it identifies five genres of voice gameplay that are based upon fundamentally different types of vocal activities, and details the design patterns and design goals that are distinctive to each genre. Second, it presents an empirically grounded theoretical model of gameplay that accounts for players’ feelings of engagement, social presence, frustration and embarrassment during voice gameplay. Overall, the thesis demonstrates that the fictional framing a videogame presents is a crucial factor in determining how players will experience its voice interaction features.
Contrast Data Mining of Multi-source Heterogeneous Trajectory Data
The rapid growth of location-acquisition and mobile computing techniques has led to an increasing availability of human trajectory data. This raises the challenge of detecting and understanding human mobility in these trajectory datasets to extract useful knowledge in a variety of domains, such as business management and urban computing. In this thesis, we focus on knowledge discovery from multi-source heterogeneous trajectory data. Specifically, five research questions in three scenarios are studied, as follows. The first research question is how to perform trajectory pattern identification and anomaly detection for pedestrian flows. We propose to adopt contour maps as a visualization of the origin-destination flow matrix to describe the distribution of pedestrian movements in terms of entry/exit areas. By transforming the origin-destination flow matrix into a dissimilarity matrix, a visual clustering algorithm is applied to cluster the most popular and related areas. We also propose a clustering-based algorithm to detect normal/abnormal time periods with similar/anomalous pedestrian flow patterns. Our results on one synthetic and one real-life dataset validate the effectiveness of the proposed algorithms. The second research question is how to perform contrast pattern mining from multi-source datasets in retail environments. Given sales data and customers’ trajectory data, in order to find patterns where there has been a big change in one dataset but little change in the other, we define a new kind of contrast pattern, conditional contrast patterns, which are a subset of traditional contrast patterns in one kind of dataset conditioned on a property of these patterns in another kind of dataset. Accordingly, we propose a tree-search-based algorithm for mining these patterns. 
Experiments on a synthetic dataset as well as a real-life retail dataset show that our proposed patterns are more informative and actionable for decision makers than traditional contrast patterns, and our tree-based algorithm performs well in terms of computational efficiency. Three research questions are studied in the third scenario, i.e., human behavior analysis in heterogeneous mobile networks. First, we focus on identifying the underlying geographical corridors of trajectories generated in mobile networks. We propose a hierarchical multi-scale trajectory clustering algorithm for corridor identification by analyzing the non-homogeneity of the spatial distribution of cell towers and users’ movements. Results on a three-week real-life dataset from China Mobile show that our method achieves the best performance, with more than 10% improvement in clustering quality compared with other state-of-the-art methods. Identifying static corridors plays an important role in the long-term design and management of a network. However, there is also a great opportunity to dynamically reconfigure a network in response to changes in traffic flows. Therefore, in our fourth work, we propose a framework based on contrast data mining to identify significantly different corridors across different time periods. Contrast corridors are defined, and a distance measure based on the Hausdorff distance and the earth mover’s distance is proposed to calculate the dissimilarity between identified corridors. Experimental results on synthetic as well as real-life datasets show that our method effectively and robustly detects contrast corridors from trajectories generated in different time periods in mobile networks, improving the F1 score by 20% on average. Finally, we focus on how to design caching strategies at the edge of networks. Edge caching in mobile networks can improve users’ experience, reduce latency and balance the network traffic load. 
Considering that cells located in different places have different levels of predictability, due to the heterogeneity of mobile users’ content preferences and mobility, we propose an adaptive edge caching algorithm based on content popularity as well as individual prediction results to provide an optimal caching strategy, aiming to maximize the cache hit rate with acceptable file replacement cost. Our results on a real-life dataset as well as simulation data show that our method is more appropriate for resource-limited and heterogeneous networks than other methods. In summary, we have proposed several trajectory data mining approaches to extract useful knowledge from heterogeneous trajectory data and multi-source datasets in three different scenarios. We have shown that our proposed methods achieve better performance than existing state-of-the-art techniques on a variety of real-life datasets.
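The corridor dissimilarity measure above combines the Hausdorff distance with the earth mover's distance. The Hausdorff component, the largest distance from any point of one trajectory to the nearest point of the other, can be sketched as follows (a simplified illustration of the standard definition, not the thesis's exact combined measure; the example coordinates are hypothetical):

```python
import math

def hausdorff(traj_a, traj_b):
    """Symmetric Hausdorff distance between two point sequences:
    the largest nearest-neighbour distance from either side."""
    def directed(a, b):
        return max(min(math.dist(p, q) for q in b) for p in a)
    return max(directed(traj_a, traj_b), directed(traj_b, traj_a))

# Two parallel corridors one unit apart
a = [(0, 0), (1, 0), (2, 0)]
b = [(0, 1), (1, 1), (2, 1)]
print(hausdorff(a, b))  # 1.0
```

Hausdorff captures worst-case spatial deviation between corridor shapes, while the earth mover's component accounts for how the traffic mass is distributed along them.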
Context-Aware Recommendations for Point-of-Interests
The rapidly growing location-based social networks allow web users to check in at points-of-interest (POIs) and share their check-ins with the public. The large number of web breadcrumbs left by users has enabled researchers to investigate human mobility patterns, which opens new research opportunities for incorporating better personalization into location-based services. In general, two types of recommendation tasks have been extensively investigated. The first is POI recommendation, which aims to provide a ranked list of POIs according to their attractiveness to a user. The second is trip recommendation, which aims to suggest an itinerary, i.e., an ordered sequence of POIs, for users. Developing recommendation models for POIs is challenging for three main reasons. First, the observed POI visits of an individual user are limited in quantity. Second, users’ preferences over POIs are usually influenced by various contextual factors, including sequential contexts (i.e., the influence of a user’s recent POI visits on her next visit), temporal contexts (i.e., the influence of the visiting time on a user’s preference over POIs), and geographical contexts (i.e., the influence of a user’s geographical location on her preference over POIs). Third, the valuable information useful for enhancing recommendation accuracy (e.g., temporal contexts, reviews left by users, and geo-tagged photos posted by users) comes in heterogeneous forms (e.g., numerical, textual, and visual). In this thesis, we aim to address the following questions in the domain of POI recommendations: 1. How can we capture the complex interactions between users’ preferences and the contextual factors effectively and efficiently? 2. 
How can we model the temporal dynamics in users’ preferences over POIs given the limited observations of users’ historical POI visits? 3. How can we utilize the online reviews generated by users to improve the accuracy of POI recommendations? 4. How can we generate personalized trip itineraries for users effectively and efficiently? To address the first research question, we propose a Gaussian process factorization model for POI recommendation. To further improve the scalability of the proposed model, we propose a query-aware Bayesian committee machine (QBCM) for scalable Gaussian process regression. We show that the proposed QBCM model improves the prediction accuracy of Gaussian process regression by up to 23.3% compared with state-of-the-art GP approximation models. To address the second research question, we propose a time-modulated self-attentive network for time-aware next-POI recommendation. The proposed model learns the relevance between users’ historical POI visits and their next POI visits via the self-attention mechanism, where the relevance is modulated by the impact of the temporal contexts. We show that the proposed model improves the recommendation accuracy by up to 17.1% while maintaining high training efficiency. To address the third research question, we propose a distillation framework for POI recommendation that leverages user reviews. The framework first uses a teacher model to extract the fine-grained sentiment orientations of the textual reviews left by users. The extracted information is then fed into a lightweight student recommendation model via knowledge distillation. We show that the proposed model achieves competitive results compared with state-of-the-art review-based recommendation models. In particular, it can improve the accuracy of rating prediction by up to 10.1%. 
To address the final research question, we propose a unified trip recommendation framework that jointly considers the impact of three factors on the probability of a POI being visited in a trip, namely POI popularity, users’ personal preferences, and POI co-occurrence probabilities. We show that the proposed trip recommendation framework consistently outperforms state-of-the-art algorithms, with an advantage of up to 43% in F1-score.
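One way to see how the three factors can jointly score a candidate itinerary is a log-probability sum over the sequence. This is only an illustrative sketch: the multiplicative combination, the factor values, and the POI names are all assumptions, not the thesis's actual framework.

```python
import math

def trip_score(pois, popularity, preference, cooccur):
    """Score an ordered POI sequence by combining POI popularity,
    user preference, and pairwise co-occurrence probabilities
    (multiplicative combination, expressed as a log-sum)."""
    score = 0.0
    for i, p in enumerate(pois):
        score += math.log(popularity[p]) + math.log(preference[p])
        if i > 0:  # transition factor between consecutive POIs
            score += math.log(cooccur[(pois[i - 1], p)])
    return score

# Hypothetical factor values for a small POI universe
pop = {"museum": 0.9, "park": 0.6, "cafe": 0.7}
pref = {"museum": 0.8, "park": 0.5, "cafe": 0.9}
co = {("museum", "cafe"): 0.7, ("museum", "park"): 0.2}
print(trip_score(["museum", "cafe"], pop, pref, co) >
      trip_score(["museum", "park"], pop, pref, co))  # True
```

A trip recommender would then search over feasible POI sequences (subject to time or length constraints) for the highest-scoring itinerary.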
Pattern Recognition and Predictive Modelling in Smart Grids
Smart grids are a modification of the traditional electric power grid to achieve a bidirectional, automatic, intelligent and adaptive power system. In smart grids, electricity distribution and power system management are improved by leveraging advanced two-way communications and integrated computing capabilities, to achieve better reliability, stability, efficiency, and security of the power system. The smart grid introduces a two-way flow of data between electricity suppliers and customers to transfer real-time information and facilitate the near real-time balancing of supply and demand. In contrast to many other industries, which can store and reserve their products, the electric power industry cannot store massive amounts of electricity using today’s technologies. Therefore, due to the storage limitations of electricity, one of the crucial tasks of power system operation is to keep supply and demand in balance at every moment. As a result, forecasting is an essential function in the electric power grid. Recent advances in the energy industry, including the smart grid and smart meters, provide electrical utilities with new capabilities for forecasting electricity demand, modelling customers’ usage profiles, optimizing unit commitment and preventing outages. These advances also introduce new challenges to the power grid, such as managing and analysing large volumes of complex, high-dimensional data in an efficient manner. Consequently, utilities need to apply advanced data management and analytical models to extract actionable insights from this information. By leveraging better predictive and analytical models and the high volume of data, utility companies are able to produce a wide range of forecasts, including: 1. Forecasting the amount of excess energy generation, the appropriate time to sell it, and the feasibility of transmitting it into the grid 2. 
Forecasting when and where contingencies are most likely to happen 3. Identifying the customers most likely to transfer energy back to the grid 4. Identifying the customers most likely to respond to demand-reduction incentives and energy-conservation programs 5. Considering the generation of distributed energy resources in the decision-making process to manage the commitment of conventional plants 6. Considering the integration of renewable energy resources into the power grid, which are inherently intermittent, weather-dependent and unpredictable, to run a clean and reliable power system. To achieve these potential benefits, grid operators require accurate and efficient methods to mine patterns in customer and grid data, which can be integrated into their decision-making frameworks. In this thesis, we develop new predictive machine learning algorithms to help address the new challenges of the smart grid era. In the first part of this thesis, we focus on understanding customers' energy consumption behaviour (demand analytics). Previously, information about customers' energy consumption could be obtained only at coarse granularity (e.g., monthly or bimonthly); nowadays, using advanced metering infrastructure (smart meters), utility companies can retrieve it in near real-time. By leveraging smart meter data, we propose a hierarchical demand forecasting approach. We improve aggregated-level electricity load forecasts by first segmenting the households into several clusters, forecasting the energy consumption of each cluster, and then aggregating those forecasts. The improvement provided by this strategy depends not only on the number of clusters, but also on the size of the clusters and the choice of an appropriate clustering method. We also leverage deep learning techniques to improve forecast accuracy.
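The cluster-then-forecast-then-aggregate strategy can be sketched as follows. The toy k-means step and the naive same-hour persistence forecaster below are illustrative stand-ins only; the thesis evaluates the choice of clustering method, cluster count, and per-cluster forecaster, and uses deep learning models in practice.

```python
import numpy as np

def cluster_households(loads, n_clusters=3, n_iter=20, seed=0):
    """Toy k-means over household load histories (households x hours).
    A stand-in for the segmentation step of hierarchical forecasting."""
    rng = np.random.default_rng(seed)
    centres = loads[rng.choice(len(loads), size=n_clusters, replace=False)].astype(float)
    labels = np.zeros(len(loads), dtype=int)
    for _ in range(n_iter):
        # assign each household to its nearest centre, then update centres
        labels = np.argmin(((loads[:, None, :] - centres) ** 2).sum(axis=2), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centres[k] = loads[labels == k].mean(axis=0)
    return labels

def hierarchical_forecast(history, labels, n_clusters=3, horizon=24):
    """Forecast each cluster's aggregate load with a naive
    same-hour-yesterday model, then sum the cluster forecasts."""
    total = np.zeros(horizon)
    for k in range(n_clusters):
        members = history[labels == k]
        if len(members) == 0:
            continue
        cluster_load = members.sum(axis=0)   # cluster-level hourly aggregate
        total += cluster_load[-horizon:]     # naive persistence forecast
    return total
```

With a persistence forecaster the cluster-level and system-level forecasts coincide; the gains the thesis reports arise when each cluster gets a forecaster tuned to its consumption shape.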
Dealing with the high volume of time-series (smart meter) data has motivated us to develop a new clustering algorithm for time-series data that is both computationally efficient and accurate. In the second part of this thesis, we introduce two new clustering algorithms, “Fuzzy C-Shape plus (FCS+)” and “Fuzzy C-Shape plus plus (FCS++)”, and show that they outperform state-of-the-art shape-based clustering algorithms in terms of accuracy and efficiency. Improving accuracy is a primary goal in any forecasting task, and it is especially challenging in multi-step prediction scenarios. In the third part of this thesis, we propose a robust and accurate ensemble-based load forecasting framework to address some of the challenges associated with load forecasting, including unbalanced training load data, the non-stationary nature of the load data, and feature selection for predictive modelling. The performance of the proposed method is validated with real-life data from the power system of the Australian National Electricity Market, as well as through on-site implementation by the system operator. In practice, understanding the uncertainty in the forecasts an operational power grid uses is crucial for operating the system securely in real-time and into the future. In the fourth part of this thesis, we propose a dynamic stochastic decision support tool, based on Dynamic Bayesian Belief Networks, that quantifies the level of uncertainty in order to improve situational awareness and help power system operators understand risk. The performance of the proposed method is validated on real-life data from the Australian power system, and through on-site implementation by the Australian system operator, the Australian Energy Market Operator.
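The details of FCS+ and FCS++ are given in the thesis itself, but shape-based clustering methods in this family generally rely on a shift-invariant similarity measure rather than pointwise Euclidean distance, so that two load curves with the same shape but a time offset still compare as similar. A minimal sketch of one such measure, the cross-correlation-based shape-based distance used by k-Shape-style algorithms:

```python
import numpy as np

def sbd(x, y):
    """Shape-based distance: 1 minus the maximum normalized
    cross-correlation over all alignments of the two series.
    Illustrative of the distance family, not the FCS+ algorithm itself."""
    x = (x - x.mean()) / x.std()          # z-normalize so only shape matters
    y = (y - y.mean()) / y.std()
    cc = np.correlate(x, y, mode="full")  # cross-correlation at every shift
    return 1.0 - cc.max() / (np.linalg.norm(x) * np.linalg.norm(y))
```

Identical series have distance 0, and a time-shifted copy of a series stays close to it, which is the property that makes such measures attractive for clustering daily consumption profiles.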
Quantifying the Effects of Situationally-Induced Impairments and Disabilities on Mobile Interaction
Situationally-Induced Impairments and Disabilities (SIIDs), also known as situational impairments, have been shown to negatively affect mobile interaction. This follows from the fact that smartphones have become an indispensable part of everyday life and are used under various situations, contexts, and environments. While some situational impairments have received considerable attention from the research community (e.g., walking- and encumbrance-based SIIDs), others remain underexplored. In addition, research on SIIDs has typically followed an ad-hoc approach, with studies investigating the impact of a particular SIID on a particular task. In contrast, this thesis systematically quantifies the effects of a range of SIIDs on mobile interaction: ambient noise, stress, and dim ambient light. These findings enable us to draw baseline comparisons between the effects of these SIIDs on mobile interaction. Furthermore, in a case study this thesis focuses on cold-induced SIIDs and proposes a sensing mechanism to detect and respond to the onset of their effects. Our contribution to Human-Computer Interaction (HCI) and UbiComp research is an enhanced understanding of the impact of SIIDs on mobile interaction. This knowledge is crucial for developing smarter ubiquitous technology that can detect SIIDs and adapt mobile device interfaces accordingly, with the aim of improving the user experience for people of all abilities.
Pattern Aided Explainable Machine Learning
Interpretability has been recognized as an important property of machine learning models. Lack of interpretability poses challenges for the deployment of many black-box models such as random forests, support vector machines (SVMs) and neural networks. One aspect of interpretability is the ability to provide explanations for the predictions of a model; explanations help users understand the logical reasoning behind a model, giving them greater confidence to accept or reject its predictions. Explanations are useful, and sometimes even mandatory, in domains like medical analysis, marketing, and criminal investigations, where decisions based on predictions may have severe consequences. Traditional classifiers can be categorized into interpretable (white-box) models and non-interpretable (black-box) models. Interpretable models are those whose internal structure or parameters are simple and easily explained; examples include decision trees, linear models and logistic regression models. Non-interpretable models are those that are complex and difficult to explain; examples include random forests, support vector machines and neural networks. Though white-box models are intrinsically easy to interpret, they usually fail to achieve accuracy comparable to black-box models. To facilitate the successful deployment of machine learning models when both interpretability and accuracy are desired, two directions of research exist: (1) increasing the accuracy of white-box models, and (2) increasing the interpretability of black-box models. Patterns are conjunctions of feature-value conditions, which are intrinsically easy to comprehend, and they have been shown to have good predictive power as well.
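The notion of a pattern as a conjunction of feature-value conditions can be made concrete with a few lines of code. The feature names and values below are purely hypothetical, chosen for illustration:

```python
def matches(instance, pattern):
    """A pattern is a conjunction of feature-value conditions;
    an instance satisfies it only if every condition holds."""
    return all(instance.get(feature) == value for feature, value in pattern.items())

# hypothetical pattern over illustrative features
pattern = {"smoker": "yes", "age_band": "50-60"}
patient = {"smoker": "yes", "age_band": "50-60", "bmi": 31}
```

Here `matches(patient, pattern)` is true, and changing any single condition's value breaks the match, which is what makes patterns easy for a human to read and verify.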
The objective of the thesis is to propose methods that utilize patterns to increase the accuracy of white-box models through interpretable feature engineering and interpretable model building, and to help black-box models provide explanations. First, we discuss pattern-based interpretable feature engineering. Pattern-based feature engineering extracts patterns from data, selects representative patterns from the extracted candidates, and then projects the instances from the original feature space into a new pattern-based feature space with a mapping function. The new pattern-based features can be more discriminative while remaining interpretable. Second, we propose a method to explain any classifier using contrast patterns. Given a model and a query instance to be explained, the proposed method first generates a synthetic neighborhood around the query using random perturbations, then labels the synthetic instances using the model, and finally mines contrast patterns from the synthetic neighborhood, selecting the top K patterns as the final explanations. The experiments show that the method achieves high faithfulness, such that the explanations truly reveal how a model “thinks”; moreover, the method supports scenarios where multiple possible explanations exist. Third, we analyse why some instances are difficult to explain. We investigate the crucial process of generating synthetic neighbors for local explanation methods, as different synthetic neighbors can result in explanations of different quality, and in many cases random perturbation does not work well. We analyze the relationship between local intrinsic dimensionality (LID) and the quality of explanations, and propose a LID-based method to generate synthetic neighbors that are more effective than those generated by baseline methods.
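The perturb-label-mine loop described above can be sketched as follows. This is a deliberately simplified stand-in: it scores single feature-threshold conditions by how strongly their support contrasts between the query's predicted class and the rest, whereas the thesis mines full conjunctive contrast patterns. The function names and scoring rule are illustrative assumptions, not the thesis's algorithm.

```python
import numpy as np

def explain(model, query, n_samples=500, k=3, scale=0.3, seed=0):
    """Sketch of a model-agnostic local explanation: sample a synthetic
    neighbourhood around the query, label it with the model, then rank
    simple conditions by contrast between the two label groups."""
    rng = np.random.default_rng(seed)
    neigh = query + rng.normal(0.0, scale, size=(n_samples, len(query)))
    labels = np.array([model(x) for x in neigh])   # label neighbourhood with the model
    pos = labels == model(query)                   # same class as the query
    neg = ~pos
    scored = []
    for j, t in enumerate(query):                  # candidate single conditions
        for name, mask in ((f"x{j} <= {t:.2f}", neigh[:, j] <= t),
                           (f"x{j} > {t:.2f}",  neigh[:, j] > t)):
            contrast = mask[pos].mean() - (mask[neg].mean() if neg.any() else 0.0)
            scored.append((contrast, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]        # top-k contrasting conditions
```

For a classifier that splits on the first feature, the top-ranked condition involves that feature, which is the faithfulness property the thesis measures for its full contrast-pattern method.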
Then we propose an interpretable model that achieves accuracy comparable to state-of-the-art baselines using patterns. The proposed method is a pattern-based partition-wise linear model that can be trained together with expert explanations. It divides the data space into several partitions; each partition is represented by a pattern and is associated with a local linear model. The overall prediction for a query is a linear combination of the local linear models in the activated partitions. The model is interpretable and is able to work with expert explanations because a loss term over explanations is part of the overall loss function. The results show that the proposed method makes reliable predictions and achieves competitive accuracy compared with the baseline methods. Finally, we show how to construct a model that makes both accurate and reliable predictions by jointly learning explanations and class labels using multi-task learning in neural networks. We propose a neural network structure that jointly trains on the class labels and the explanations, where the explanations are treated as additional label information. We fit a neural network in the framework of multi-task learning: the network starts with a set of shared layers and then splits into two separate branches, one for the class label and the other for the explanations. The experiments suggest that the proposed method makes reliable predictions. In summary, this work recognizes the importance of the interpretability of machine learning models, and it utilizes patterns to improve interpretability through interpretable feature generation, pattern-based model-agnostic local explanation extraction, pattern-based partition-wise linear models, and a joint learning framework over explanations and class labels. We also investigate why a particular instance is difficult to explain, using local intrinsic dimensionality.
All work is supported by theoretical analysis and empirical evaluations.
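The partition-wise prediction rule described above can be sketched in a few lines. Each partition pairs a pattern (here reduced to a predicate over the input) with a local linear model; uniform averaging of the activated partitions is an illustrative assumption, since the thesis learns the combination weights, and the example partitions below are hypothetical.

```python
import numpy as np

def predict(x, partitions):
    """Combine the local linear models of every partition whose
    pattern fires on x. Sketch of partition-wise linear prediction."""
    active = [(w, b) for fires, w, b in partitions if fires(x)]
    if not active:
        return 0.0                      # fallback when no pattern fires
    return sum(w @ x + b for w, b in active) / len(active)

# hypothetical two-partition model over 2-d inputs
partitions = [
    (lambda x: x[0] > 0, np.array([1.0, 0.0]),  0.5),   # pattern: x0 > 0
    (lambda x: x[1] > 0, np.array([0.0, 2.0]), -0.5),   # pattern: x1 > 0
]
```

A prediction is traceable to the patterns that fired and their local weights, which is what makes this family of models interpretable.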
Quality of Service (QoS)-aware Application Management in Fog Computing Environments
In recent years, the Internet of Things (IoT) paradigm has been rapidly adopted in the creation of applications for various smart-city, healthcare, Industry 4.0, and Agtech-based Cyber-Physical Systems (CPS). Usually, IoT-enabled CPSs reside at a multi-hop distance from Cloud datacentres. As a consequence, Cloud-centric execution of IoT applications often fails to meet their Quality of Service (QoS) requirements in real time. Fog computing, an extension of the Cloud to the edge network, can execute IoT applications closer to the data sources. Thus, it can improve application service delivery time and reduce network congestion. However, Fog computing nodes are highly distributed and heterogeneous, and most of them are constrained in resources and spatial sharing. Therefore, without efficient management of applications, it is difficult to harness the capabilities of Fog computing for different IoT-driven use cases. Application management is an integral part of computing resource management. It can be ensured by finding suitable placement options for the applications within the computing infrastructure. In IoT-enabled CPSs, different entities, including applications, Fog nodes, IoT devices, users, and service providers, continuously interact with each other. This thesis focuses on application placement in Fog environments considering i. the characteristics of the applications, ii. the communication delay among the Fog nodes, iii. the context of the IoT devices, iv. the service expectations of the users, and v. the operational cost of the providers. It demonstrates how placing applications from the perspectives of different system entities can improve the application's QoS, the user's Quality of Experience (QoE), and the provider's profit. This thesis advances the state of the art by making the following contributions: 1. A comprehensive taxonomy and literature review of application management approaches in Fog computing environments 2.
An application characteristics-driven model that facilitates application classification and selection for Fog-based placement at the gateway level. 3. A latency-aware application management policy that deals with the service delivery deadline and the inter-nodal communication delay simultaneously while placing the applications over distributed Fog nodes. 4. A context-aware application management policy that optimizes the service time of applications by coordinating the sensing frequency and data size of IoT devices with the capacity of Fog nodes. 5. A QoE-aware application management policy that prioritizes the placement of applications in Fog environments based on user expectations. 6. A pricing model for integrated Fog-Cloud environments that enhances the profit of providers for executing the applications in the proximity of end-users.
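A latency-aware placement decision in the spirit of contribution 3 can be sketched as follows: estimate each Fog node's response time as compute time plus link delay, keep only the nodes meeting the application's deadline, prefer the cheapest, and fall back to the Cloud otherwise. The node parameters, field names, and greedy rule are illustrative assumptions, not the thesis's policy.

```python
def place(app, nodes, cloud_name="Cloud"):
    """app: {'work': million instructions, 'deadline': seconds}.
    nodes: Fog candidates as {'name', 'mips', 'link_delay', 'cost'}."""
    def response(n):
        # estimated response time = compute time + inter-nodal link delay
        return app["work"] / n["mips"] + n["link_delay"]
    feasible = [n for n in nodes if response(n) <= app["deadline"]]
    if not feasible:
        return cloud_name              # no Fog node meets the deadline: offload
    return min(feasible, key=lambda n: n["cost"])["name"]
```

The thesis's actual policies additionally coordinate device context, user expectations, and provider pricing; this sketch only shows how deadline and communication delay jointly constrain the placement choice.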