Computing and Information Systems - Theses
Older Adults Designing Avatars for Self-expression
Representations of older age are frequently associated with bodies in decline. Looking old can trigger discriminatory social behaviours and conceal the richness of the lived experience. Avatars, full-body digital self-representations of the user, influence the way users think and behave in virtual environments (VE). As older adults increasingly participate in online spaces, which use avatars for self-representation, it is essential to understand how to best support their online self-representations. This thesis addresses this gap by engaging older adults in designing their full-body avatars. Across four studies using research through design, this thesis provides older adults’ views about how they want to be graphically represented. In Study 1, I conducted gameplay observations and semi-structured interviews that provided an initial understanding of how older adults who play online games projected aspects of their identities into their player self-representations. The study revealed that participants designed their player self-representations by projecting aspects of their past (former) self and embracing their present older selves. For Studies 2, 3, and 4, I engaged a group of older adults aged 70-80 in designing avatars. In Study 2, older adults designed a full-body avatar during a group design workshop. The study demonstrated that older adults negotiate with ageing stereotypes when creating their avatar designs. Some participants reproduced realistic representations of the aged appearance that suggests acceptance of ageing bodies; others idealised their avatars by depicting healthy bodies or societal ideals of youth. This study highlighted that the character creation interfaces (CCIs) (where users designed the avatars) presented limited design choices to portray ageing features. Informed by the previous outcomes, Study 3 explored if the graphic styles of the avatar customisation prompted older adults’ expressions of identity. 
Through extended individual design sessions, participants designed a photorealistic avatar and a cartoon avatar. The analysis of the individual design journeys demonstrates that participants conformed to social norms through the design of the photorealistic avatar and rebelled against these social norms through the design of the cartoon avatar. While the photorealistic avatar prompted participants to reflect on the appearance of the ageing body, the cartoon avatar design supported the expression of hidden aspects of the self. Finally, in Study 4, older adults participated in virtual reality sessions over four months, choosing between the predesigned photorealistic or cartoon avatar for each session. Along with the VR sessions, some participants modified their avatar designs in further avatar customisation sessions. This study evaluated how the context (people, place and purpose) influences older adults’ expressions of identity through avatars. The analysis highlighted gender differentiation and revealed that participants chose the photorealistic avatar to conform to social norms of an older age when meeting peers of similar age. This research contributes with a schematic that illustrates how older adults’ self-expression through avatars is mediated by the design choices available from the avatar creation software, and, by the context in which the avatar is used. Furthermore, this research shows that designing avatars is a powerful mechanism that supports self-expression, reflection and experimentation. These results have implications for designing CCI online environments that cater to the preferences of older individuals.
Embedding Graphs for Shortest-Path Distance Predictions
Graphs are an important data structure used in an abundance of real-world applications, including navigation systems, social networks, and web search engines, to name but a few. We study a classic graph problem: computing shortest-path distances. This problem has many applications, such as finding nearest neighbors for place of interest (POI) recommendation or social network friendship recommendation. To compute a shortest-path distance, traditional approaches traverse the graph to find the shortest path and return the path length. These approaches lack time efficiency over large graphs. In the applications above, the distances may be needed first (e.g., to rank POIs), while the actual shortest paths may be computed later (e.g., after a POI has been chosen). Thus, an alternative approach precomputes and stores the distances, and answers distance queries with simple lookups. This approach, however, falls short in space cost: O(n^2) in the worst case for n vertices, even with various optimizations. To address these limitations, we take an embedding-based approach that predicts the shortest-path distance between two vertices from their embeddings, without computing their path online or storing their distance offline. Graph embedding is an emerging technique for graph analysis that has yielded strong performance in applications such as node classification, link prediction, graph reconstruction, and more. We propose a representation learning approach to learn a k-dimensional (k<<n) embedding for every vertex. This embedding preserves the distance information of the vertex to the other vertices. We then train a multi-layer perceptron (MLP) to predict the distance between two vertices given their embeddings. We thus achieve fast distance predictions without a high space cost (i.e., only O(kn)).
Experimental results on road network graphs, social network graphs, and web document graphs confirm these advantages, while our approach also produces distance predictions that are up to 97% more accurate than those of state-of-the-art approaches. Our embeddings are not limited to distance predictions. We further study their applicability to other graph problems such as link prediction and graph reconstruction. Experimental results show that our embeddings are highly effective in these tasks.
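The space trade-off described in this abstract can be illustrated with a minimal sketch. It is not the thesis's method: instead of a learned embedding and a trained MLP, it assumes a landmark-based embedding (each vertex stores its hop distance to k chosen landmarks) and a fixed triangle-inequality estimator. The function names, the toy path graph, and the choice of landmarks are all hypothetical. The point is only that each vertex stores k numbers (O(kn) in total) and a distance query touches two short vectors rather than an n-by-n table.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable vertex (unweighted graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def build_embeddings(adj, landmarks):
    """k-dimensional embedding per vertex: its distance to each of the k landmarks.

    Assumes a connected graph. This is a stand-in for a learned embedding.
    """
    per_landmark = [bfs_distances(adj, landmark) for landmark in landmarks]
    return {v: [dist[v] for dist in per_landmark] for v in adj}


def predict_distance(emb_u, emb_v):
    """Triangle-inequality lower bound on d(u, v); a stand-in for a trained MLP."""
    return max(abs(a - b) for a, b in zip(emb_u, emb_v))


# Toy path graph 0 - 1 - 2 - 3 - 4 with the two end vertices as landmarks.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
embeddings = build_embeddings(adj, landmarks=[0, 4])
print(predict_distance(embeddings[1], embeddings[3]))  # prints 2
```

On this path graph the estimate is exact. On general graphs it is only a lower bound, which is the kind of gap a learned predictor is meant to close.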
Individual use of Enterprise 2.0 and its impact on social capital within large organisations
Over the past years, there has been a significant momentum in the adoption of Enterprise 2.0 in larger organisations. Enterprise 2.0 adoption encapsulates the use of integrated social media tools in a unified social networking platform to support business operations. Many organisations are adopting Enterprise 2.0, with the hope of achieving the benefits of better knowledge retention, faster information discovery, innovation, employee engagement and higher productivity through social networking. Whilst organisations herald Enterprise 2.0 with great promises, it is still not clear how the individuals’ use of Enterprise 2.0 will result in organisational benefits. This thesis aims to contribute to a better understanding of the individual use of Enterprise Social Networks (Enterprise 2.0) in large organisations and how this leads to organisational benefits. To achieve this objective, this research applies social capital theory as the basis to analyse the organisation’s rich set of relationships and the organisational value they bring as a result of using Enterprise 2.0. In-depth research and a critical review of relevant previous studies and established theories have been conducted to delineate the components within structural, cognitive, and relational social capital dimensions. Sixty in-depth interviews were conducted with participants from six large organisations in Australia who were actively using an existing implementation of Enterprise 2.0 (i.e. Yammer and Oracle Social Network). For an all-encompassing and unbiased view, participants from varying roles and responsibilities, levels of expertise and usage types were selected from the identified organisation. To ensure unbiased and granular categorisation of use modes, the research analysed the data collected from the interviews, observations and notes using a grounded approach to identify emerging trends and patterns of individual Enterprise 2.0 use. 
The results generate new insights in the form of seven distinct individual use modes of Enterprise 2.0. The findings also present rich insights into the impact of these varied individual use modes on each of the structural, relational, and cognitive social capital dimensions. In addition, the research reveals novel insights into how different user types benefited from using Enterprise 2.0 from a social capital perspective. Finally, the study demonstrates the supportive and enabling role of Enterprise 2.0 as a platform to build social capital. The research contributes to a deeper understanding of the individual use of Enterprise 2.0 and social capital theory. The research also includes seven insights to help shape future research and to guide managers in large organisations in setting up, managing, and promoting individual use of Enterprise 2.0.
Mapping the structural connectome and predicting functional connectivity with deep learning methods
Mapping the human connectome is a major goal in neuroscience, where connectome refers to a comprehensive network description of the brain. This network is often represented as a graph, where nodes denote brain regions and edges represent white matter pathways. Tractography is a computational reconstruction method based on diffusion-weighted magnetic resonance imaging (dMRI) that estimates millions of streamlines that trace out the trajectories of white matter fiber bundles. The number of streamlines interconnecting each pair of regions comprising a predefined cortical parcellation is computed to yield a structural connectivity matrix. Network analyses of these connectivity matrices have yielded new insights into brain disorders (such as Schizophrenia, Alzheimer’s disease), cognition and neurodevelopmental processes. Moreover, the temporal dependence of neuronal activity patterns of different brain regions (functional connectivity) is also associated with underlying neuronal pathways (structural connectivity). In this thesis, we analyse the capabilities of state-of-the-art tractography algorithms (deterministic and probabilistic) for mapping connectomes and develop algorithms that overcome the limitations of conventional tractography algorithms for connectome mapping. Also, we utilize the structure-functional coupling for training Deep Neural Nets to predict the functional connectivity from structural connectivity. In the first part of the thesis, we develop numerical connectome phantoms that feature realistic network topologies and match to the fiber complexity of in vivo dMRI. The connectivity between pairs of regions was predefined for these phantoms. The phantoms are utilized to evaluate the performance of tensor-based and multi-fiber implementations of deterministic and probabilistic tractography. 
We found that multi-fiber deterministic tractography yields the most accurate connectome reconstructions, whereas probabilistic algorithms are hampered by an abundance of spurious connections. It is essential to omit connections with the fewest number of streamlines (thresholding) when using probabilistic algorithms for mapping connectomes. The study suggests that multi-fiber deterministic tractography is well suited for connectome mapping, regardless of the streamline threshold. In the second part, we propose a novel framework to map structural connectomes using deep learning. This framework not only enables connectome mapping with a convolutional neural network (CNN) but can also be straightforwardly incorporated into conventional connectome mapping pipelines (using tractography) to enhance accuracy. This framework involves decomposing the entire brain volume into overlapping blocks. Blocks are sufficiently small to ensure that a CNN can be efficiently trained to predict each block’s internal connectivity architecture. Later, a block stitching algorithm is proposed to rebuild the full brain volume from these blocks and thereby map end-to-end connectivity matrices. Performance is evaluated using simulated dMRI data generated from numerical connectome phantoms with known ground truth connectivity. Due to the redundancy achieved by allowing blocks to overlap, block decomposition and stitching steps can enhance the accuracy of probabilistic and deterministic tractography algorithms by up to 20-30%. Various studies have reported that functional brain connectivity is associated with underlying structural characteristics. In the third part of the thesis, we utilize this structure-functional coupling to develop a novel framework using deep learning that predicts functional connectivity from structural connectivity. The framework predicts functional connectivity without explicitly modelling the biophysical characteristics of the brain. 
We have demonstrated that a neural network can predict functional connectivity with high accuracy while preserving the inter-subject functional differences. Furthermore, we also demonstrated that functional connectivity could be used to predict human behavior, namely cognition. Altogether, the analyses and frameworks presented in this thesis aid in extracting structural connectivity and understanding the complex relationships between functional and structural connectivity in the human brain.
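The abstract describes building a structural connectivity matrix by counting the streamlines that interconnect each pair of regions, then thresholding out the weakest connections. A minimal sketch of those two steps follows. It assumes each streamline has already been reduced to the indices of the two regions it terminates in. The function names and the toy data are hypothetical and are not taken from the thesis.

```python
def connectivity_matrix(streamline_endpoints, n_regions):
    """Symmetric matrix whose entry (a, b) counts streamlines joining regions a and b."""
    matrix = [[0] * n_regions for _ in range(n_regions)]
    for a, b in streamline_endpoints:
        if a == b:
            continue  # ignore streamlines that start and end in the same region
        matrix[a][b] += 1
        matrix[b][a] += 1
    return matrix


def threshold(matrix, min_streamlines):
    """Zero out connections supported by fewer than `min_streamlines` streamlines."""
    return [[count if count >= min_streamlines else 0 for count in row]
            for row in matrix]


# Toy example: five streamlines over three regions.
endpoints = [(0, 1), (0, 1), (1, 0), (1, 2), (2, 2)]
sc = connectivity_matrix(endpoints, n_regions=3)
print(sc)                # [[0, 3, 0], [3, 0, 1], [0, 1, 0]]
print(threshold(sc, 2))  # [[0, 3, 0], [3, 0, 0], [0, 0, 0]]
```

The single streamline between regions 1 and 2 is removed by the threshold, which is the mechanism the abstract recommends for suppressing spurious connections from probabilistic tractography.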
Ontologies in neuroscience and their application in processing questions
Neuroscience is a vast, multi-dimensional and complex field of study, owing both to its medical importance and to unresolved issues regarding how the brain and the nervous system work. Its medical importance stems from the huge number of brain disorders and their burden on people and society. Furthermore, scientists have been excited about the function and structure of the brain ever since it was discovered to be responsible for all our emotions, thoughts and behaviour. Ontologies are concepts whose origins go back to philosophy and its concern with the nature and relations of being. They have recently emerged as promising tools for assisting neuroscience research and provide additional data on a field of study. They connect each entity or element to others through descriptive relationships. Ontologies seem to suit the complex, multi-dimensional and still incomplete nature of neuroscience very well because of these characteristics. The first study sheds light on applications of ontologies in neuroscience. It incorporated a systematic literature review and methodically reviewed over 1000 research papers from eight databases and three journals. After scanning all documents, 208 of them were selected. Then, a full-text analysis was performed on the selected documents. This study found eight major applications for ontologies in neuroscience, most of which consisted of several subcategories. The analysis demonstrated not only the current applications of ontologies in neuroscience, but also their potential future in this field. The second study set out to represent neuroscience questions and then classify them using ontologies. For this purpose, a question set was gathered from two research teams and analysed. This resulted in a set of dimensions that represent questions. Then, a question hierarchy was formed based on these dimensions, and questions were classified according to that hierarchy.
Two different approaches were used for the classification: an ontology-based approach and a statistical approach. The ontology-based approach outperformed the statistical approach, with 15.73% better classification results. The last study was designed to tackle and resolve questions with the assistance of ontologies. It first proposed a set of templates that acted as a translation mechanism for changing questions into machine-readable code. Templates were based on the question hierarchy presented in the previous study. Second, this study created an integrated collection of resources including two domain ontologies (NIFSTD and NeuroFMA) and a neuroimaging annotation application (Freesurfer). Subsequently, the code created using templates was executed upon the integrated resource (knowledge base) to find the appropriate answer. While processing the questions, ontologies were also used for disambiguation. Finally, all parts created in this study, along with the question classification method created in the previous study, were merged as different modules of a question processing model. In conclusion, this thesis reviewed all current ontology applications in neuroscience in detail and demonstrated the extent to which they can assist scientists in classifying and resolving questions. The results of this thesis show that applications of ontologies in neuroscience are diverse and cover a wide range; they are steadily becoming more widely used in this field; and they can be powerful semantic tools in performing different tasks in neuroscience.
Towards Robust Representation of Natural Language Processing
There are many challenges in building robust natural language applications. Machine learning based methods require large volumes of annotated text data, and variations over text can lead to problems, namely: (1) language can be highly variable and expressed with different variations, such as lexical and syntactic. Robust models should be able to handle these variations. (2) A text corpus is heterogeneous, often making language systems domain-brittle. Solutions for domain adaptation and training with corpora comprised of multiple domains are required for language applications in the real world. (3) Many language applications tend to be biased to the demographic of the authors of documents the system is trained on, and lack model fairness. Demographic bias also causes privacy issues when a model is made available to others. In this thesis, I aim to build robust natural language models to tackle these problems, focusing on deep learning approaches which have shown great success in language processing via representation learning. I pose three basic research questions: how to learn representations that are robust to language variation, robust to domain variation, and robust to demographic variables. Each of these research questions is tackled using different approaches, including data augmentation, adversarial learning, and variational inference. For learning robust representations to language variation, I study lexical variation and syntactic variation. To be specific, a regularisation method is proposed to tackle lexical variation, and a data augmentation method is proposed to build robust models, using a range of language generation methods from both linguistic and machine learning perspectives. For domain robustness, I focus on multi-domain learning and investigate domain supervised and unsupervised learning, where domain labels may or may not be available. 
Two types of models are proposed, via adversarial learning and latent domain gating, to build robust models for heterogeneous text. For robustness to demographics, I show that demographic bias in the training corpus leads to model fairness problems with respect to the demographic of the authors, as well as privacy issues under inference attacks. Adversarial learning is adopted to mitigate bias in representation learning, to improve model fairness and privacy-preservation. To demonstrate the proposed approaches, a range of tasks are considered, including text classification and POS tagging. To evaluate the generalisation and robustness, both in-domain and out-of-domain experiments are conducted with two classes of language tasks: text classification and part-of-speech tagging. For multi-domain learning, multi-domain language identification and multi-domain sentiment classification are conducted, and I simulate domain supervised learning and domain unsupervised learning to evaluate domain robustness. I evaluate model fairness with different demographic attributes and apply inference attacks to test model privacy. The experiments show the advantages and the robustness of the proposed methods. Finally, I discuss the relations between the different forms of robustness, including their commonalities and differences. The limitations of this thesis are discussed in detail, including potential methods to address these shortcomings in future work, and potential opportunities to generalise the proposed methods to other language tasks. Above all, these methods of learning robust representations can contribute towards progress in natural language processing.
Temporal analytics for understanding students’ study behaviours in digital educational environments
The growing use of technology in education has motivated research to study and promote students’ academic success in the educational environments that apply it. Digital educational settings provide a high volume of data for this analytical purpose and enable researchers to easily collect and analyse data from students’ interactions within the system (audit trails) as they proceed towards their study goals. This has motivated the development of Learning Analytics (LA) approaches in these educational settings that offer innovative applications of analytics methods to understand and promote students’ study behaviours. One of the main challenges of LA approaches is making connections between students’ data traces and educational assumptions, which is necessary to ensure the improvement of education and requires interdisciplinary knowledge. In addition, in digital environments, students’ data are available at different levels (i.e., fine-grain to coarse-grain) and can be structured in varied ways (e.g., aggregated, temporal). These data require appropriate formulation to reveal information regarding specific aspects of students’ study behaviours. In this respect, the main focus of LA research has been on representing students’ study behaviour based on aggregated measures of their task-level interactions within digital environments, which has helped to identify various patterns of learning processes associated with learning outcomes. However, this approach suffers from some limitations, mainly due to neglecting the time dimension, which could better reveal the effect of the processes students use while studying. In addition, students’ behaviour at specific context levels, such as the session, is understudied, although its investigation may reveal novel insight into particular aspects of students’ study processes.
This thesis provides an understanding of students’ study behaviours by considering specific levels of conceptualizing students’ data and the level at which their behaviours can be structured. In the first part of this thesis, a temporal analysis based on clustering and statistical tests is performed in the context of a Massive Open Online Course (MOOC), where students’ study behaviour is investigated at session level; that is, dedicated blocks of time in which learners complete single or multiple contiguous learning tasks without interruption. The concept of “session” has rarely been explicitly examined in relation to learning outcome in online learning. Creating and managing sessions when learning online has been identified as an important factor associated with students’ time management strategy, which can subsequently impact their academic outcome. The result of this study provides insight into the varied ways that students organize and prioritise their time in terms of sessions when learning in a MOOC and how these behaviours impact students’ academic outcome. In the second part, a study is conducted in the context of two offerings of a MOOC, where the impact of sequential representations of students’ task-level behaviour on their learning outcome is investigated. This study considers assessment task outcomes, rather than students’ final achievement, as a proxy for learning outcome, which could provide more insight regarding students’ progress over time. For this purpose, temporal and non-temporal prediction models are used to show how the sequential nature of learners’ task-level behaviour in a MOOC is more informative (predictive) of their assessment outcome than the aggregated measures examined in most studies. Additionally, it provides insight into variations in the behavioural sequences of high- and low-achieving students when preparing for assessments, using a sequential pattern mining approach.
The results show that it is possible to successfully predict students’ readiness for assessment tasks, particularly if the sequential aspects of students’ behaviour are represented in the model. Moreover, the results reveal some behavioural patterns reflecting specific learning strategies that may be more effective in promoting learning. In the third part of this thesis, a study is performed in the context of digital word processing software to examine the importance of the temporal nature of students’ writing behaviour for their writing outcome. It helps to understand how particular aspects of the writing process at specific moments of writing influence the writing outcome. This view is understudied in writing research using students’ audit trails (i.e., keystrokes). For this purpose, a temporal approach is proposed that combines classification and local feature interpretation. The results reveal the importance of temporal analysis when studying students’ writing behaviour. Findings also reveal that the influence of specific writing behaviours on writing quality is likely determined in combination with other writing characteristics, which emphasises the necessity of using models that capture the interrelationship between features. In summary, this work contributes to learning analytics research by raising awareness of the need to account for various levels of conceptualizing data and different dimensions when studying learners’ behaviour. Various stakeholders could benefit from the knowledge discovered in this research to improve learners’ study behaviours. In particular, educators can identify which study behaviours require support, and (most importantly) when, so they can select relevant interventions to include in their courses.
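The session-level view in this abstract (blocks of time in which a learner works without interruption) can be sketched as a simple segmentation of a learner's event timestamps. The rule used here, starting a new session whenever the gap between consecutive events exceeds a fixed threshold, is an assumption for illustration, as are the 30-minute default and the function name. The abstract does not state how sessions were operationalised.

```python
def split_into_sessions(timestamps, max_gap_seconds=1800):
    """Group event timestamps (in seconds) into sessions.

    A new session starts whenever the gap since the previous event exceeds
    `max_gap_seconds` (hypothetical 30-minute inactivity threshold).
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= max_gap_seconds:
            sessions[-1].append(t)  # continue the current session
        else:
            sessions.append([t])    # gap too long: open a new session
    return sessions


# Toy audit trail for one learner, in seconds since the first event.
events = [0, 60, 300, 5000, 5100, 20000]
sessions = split_into_sessions(events)
print(sessions)  # [[0, 60, 300], [5000, 5100], [20000]]
```

Per-session features such as duration or number of events could then be derived from each group and passed to a clustering step.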
Budget-constrained Workflow Applications Scheduling in Workflow-as-a-Service Cloud Computing Environments
The adoption of workflows, an application model of interconnected tasks and data processing, in the scientific community has accelerated scientific discovery. Workflows facilitate the execution of complex scientific applications that involve vast amounts of data. These workflows are large-scale applications and require massive computational infrastructures. Therefore, deploying them in distributed systems, such as cloud computing environments, is necessary to achieve reasonable processing times. With the increasing demand for scientific workflow execution and the rising trend of cloud computing environments, there is a potential market for a computational service that executes scientific workflows in the clouds. Hence, the term Workflow-as-a-Service (WaaS) has emerged along with the rise of the Everything-as-a-Service concept. The WaaS concept extends the functionality of a conventional workflow management system (WMS) to serve a larger number of users in a utility service model. In this case, the platform, which is called the WaaS platform, must be able to handle multiple workflows scheduling and resource provisioning in cloud computing environments, in contrast to the single workflow management of a traditional WMS. This thesis investigates novel approaches for budget-constrained multiple workflows resource provisioning and scheduling in the context of the WaaS platform. They address the challenges in managing multiple workflows execution that come not only from the users' perspective, which includes the heterogeneity of workloads, quality of services, and software requirements, but also from the cloud environments as the underlying computational infrastructure. The latter aspect raises the issues of resource heterogeneity, performance variability, and uncertainties in the form of overhead delays of resource provisioning and network-related activities.
It pushes a boundary in the area by making the following contributions:
- A taxonomy and survey of the state-of-the-art multiple workflows scheduling in multi-tenant distributed computing systems.
- A budget distribution strategy to assign tasks' budgets based on the heterogeneous type of VMs in cloud computing environments.
- A budget-constrained resource provisioning and scheduling algorithm for multiple workflows that aims to minimize workflows' makespan while meeting the budget.
- An online and incremental learning approach to predict task runtime that considers the performance variability of cloud computing environments.
- The implementation of multiple workflows scheduling algorithm and its integration to extend the existing WMS for the development of WaaS platform.
Health Information Systems Enabled Transformation of Service Ecosystems: The Case of Indonesian Healthcare
Information and Communication Technology (ICT) has contributed significantly to the socio-economic development of societies. In particular, developing countries are now beginning to undertake ICT-enabled transformations that previously took place in the western world. However, while the proliferation of ICT is considered a crucial enabler of this transformation, ICT for Development (ICT4D) projects continue to fail as they do not achieve the anticipated societal impacts. Therefore, a holistic and systemic perspective of ICT4D research is needed to enhance the current understanding of these phenomena. This study addresses this knowledge gap through an in-depth investigation of how the structure of the public health ecosystem in Indonesia is changed and transformed following the introduction of Health Information Systems (HIS). A qualitative multiple case study was conducted across three district-level governments. The analysis reveals the distinctive impacts of HIS introduction on the structural properties of the ecosystem, which include institutional rules, resources configuration, actors’ institutional logics, and practices. This study also identifies three mechanisms (adoption-incorporation, breaking-making, and self-reinforcing) of HIS-enabled transformation, which constitute two pathways (enslaving and emergence) of the ecosystem's transformation. The findings of this study offer theoretical contributions to ICT4D and service literature and practical contributions to HIS implementation in Indonesia. The transformation process of the ecosystem’s structure offers a systemic perspective of ICT4D, which overcomes the tendency to overemphasise the significant role of technology and agency in developing countries. The pathways of transformation complement earlier studies investigating the reasons for numerous failures of top-down technological transfer and the importance of inclusion, engagement, and empowerment of societal groups in ICT4D.
To service literature, this study offers insights into the origins and lifecycle of practices and how they emerge in the ecosystem, which shed light on the dynamic and evolving nature of ecosystem’s structure that currently has not been adequately understood. Finally, the results of this study advocate the autonomy of the district’s health providers, the inclusion and engagement of local actors, and the use of the incremental approach to HIS implementation in public health ecosystem.
Privacy-Preserving Approaches to Analyzing Sensitive Trajectory Data
The evolution of smart devices and sensor-enabled vehicles has brought forward the capability of collecting large and rich datasets. The datasets provide unprecedented opportunities for devising the next generation of location-based decision systems. Analysing detailed, continually updated information about a user's status, such as location, speed and direction, is vital in improving the safety, reliability, mobility and efficiency of any form of location-based service in smart cities. More generally, trajectory data is paramount for studying people's movement patterns, shopping behaviour and preferences (e.g., visited cafes, parks, and their sequence of points of interest). However, such fine-grained data raises significant concerns about the privacy of individuals, which in turn hinders the further development of next-generation applications that benefit from trajectory data. Such data can reveal various sensitive information about individuals, such as their home and workplace locations, whereabouts over time and health. Recent approaches to address such concerns use a strong privacy guarantee known as differential privacy. Their aim is to tackle a core privacy challenge: publishing modified datasets of individuals without compromising their privacy while not sacrificing the utility of the published data. However, the current approaches guaranteeing differential privacy are limited in scalability and utility for real applications, both of which are crucial for later usage or data analytics. In this thesis, we are concerned with publishing trajectory data, which poses privacy risks due to its sequential nature. A key issue is that the known algorithms fail to preserve the utility of published trajectory data when perturbing it to satisfy differential privacy. Critical information in trajectory datasets, such as total travel distances and frequent location patterns in trajectories, cannot be fully preserved by the existing differentially private algorithms.
This thesis investigates three research issues. First, it is known that simple histograms, which are widely studied under differential privacy, are insufficient to capture aggregated information for spatial data. Our first work shows how to use spatial histograms instead to provide an accurate distribution of traffic counts with a differential privacy guarantee. Spatial histograms must satisfy sequential (spatial) constraints, and naively applying differential privacy can destroy these constraints. Our proposed algorithm computes new information about trajectory counts without destroying spatial constraints and hence improves the utility of published data. In our second work, we further refine the algorithm to improve the utility of the published data by incorporating the traffic distribution. Intuitively, dense regions gain more information about the trajectory counts compared to sparse regions. Since the density of different regions may be uneven, we use trajectory densities directly to compute accurate information about the trajectory distribution in each region and to scale the added noise efficiently while ensuring differential privacy. Spatial histogram data has limitations in terms of spatial queries. For example, we cannot ask queries such as "how many trajectories start from location A and end at location B?". To address this limitation, in our third work, instead of using count information from trajectories as in spatial histograms, we use actual trajectory data. We introduce a graphical model to capture accurate statistics about the movement behaviours in trajectories. Using this model, our algorithm privately generates synthetic trajectories such that the noise is optimally added to capture the movement direction of a trajectory. 
Our algorithm preserves both the spatial and temporal information of trajectories in the generated dataset, requires less memory and computation than competing approaches, and preserves the properties of the original trajectory data in terms of travelled distance, movement patterns and locations of interest. Our extensive theoretical and experimental analysis shows a significant improvement in the utility of published data generated by our algorithms.
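As a rough illustration of the baseline this abstract starts from (a differentially private histogram of spatial counts), the sketch below adds Laplace noise to a grid of trajectory start-point counts. It is not the thesis's algorithm: the function names, the grid layout, and the choice to count only each trajectory's first point (so that adding or removing one trajectory changes exactly one cell by 1) are assumptions made for this example.

```python
import random
from collections import Counter


def cell_of(point, cell_size):
    """Map an (x, y) point to the integer index of its grid cell."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))


def noisy_start_histogram(trajectories, cell_size, grid_w, grid_h, epsilon, rng=None):
    """Return a noisy count of trajectory start points for every grid cell.

    Assumption: each trajectory contributes only its first point, so the
    L1 sensitivity of the histogram is 1 and Laplace noise with scale
    1/epsilon per cell gives epsilon-differential privacy.
    """
    rng = rng or random.Random()
    counts = Counter(cell_of(t[0], cell_size) for t in trajectories if t)
    noisy = {}
    for i in range(grid_w):
        for j in range(grid_h):
            # The difference of two Exp(epsilon) draws is Laplace(0, 1/epsilon).
            noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
            # Empty cells receive noise too; skipping them would leak which
            # cells are unoccupied.
            noisy[(i, j)] = counts.get((i, j), 0) + noise
    return noisy


if __name__ == "__main__":
    trips = [[(0.5, 0.5), (1.5, 0.5)], [(0.2, 0.9), (0.2, 1.8)], [(1.5, 1.5)]]
    print(noisy_start_histogram(trips, cell_size=1.0, grid_w=2, grid_h=2, epsilon=1.0))
```

Noise added independently per cell, as here, ignores relationships between neighbouring cells, which is the kind of constraint violation the abstract says naive approaches suffer from.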
Voice interaction game design and gameplay
This thesis is concerned with the phenomenon of voice-operated interaction with characters and environments in videogames. Voice interaction with virtual characters has become common in recent years, due to the proliferation of conversational user interfaces that respond to speech or text input through the persona of an intelligent personal assistant. Previous studies have shown that users experience a strong sense of social presence when speaking aloud to a virtual character, and that voice interaction can facilitate playful, social and imaginative experiences of the type that are often experienced when playing a videogame. Despite this, the user experience of voice interaction is frequently marred by frustration, embarrassment and unmet expectations. The aim of this thesis is to understand how voice interaction can be used in videogames to support more enjoyable and meaningful player experiences. Voice-operated videogames have existed for more than three decades, yet little research exists on how they are designed and how they are received by players. The thesis addresses that knowledge gap through four empirical studies. The first study looks at player responses to a videogame character that can be given commands through a natural language interface. The second study is a historical analysis of voice-operated games that examines the technological and cultural factors that have shaped their form and popularity. The third study develops a pattern language for voice game design based on a survey of 471 published videogames with voice interaction features. The fourth study compares player responses to videogames that feature speech-based voice interaction and non-verbal voice interaction, and applies the theoretical perspective of frame analysis to interpret their reactions. Through these studies, the thesis makes two main contributions to the human-computer interaction and games studies literature. 
First, it identifies five genres of voice gameplay that are based upon fundamentally different types of vocal activities, and details the design patterns and design goals that are distinctive to each genre. Second, it presents an empirically grounded theoretical model of gameplay that accounts for players’ feelings of engagement, social presence, frustration and embarrassment during voice gameplay. Overall, the thesis demonstrates that the fictional framing a videogame presents is a crucial factor in determining how players will experience its voice interaction features.
Contrast Data Mining of Multi-source Heterogeneous Trajectory Data
The rapid growth of location-acquisition and mobile computing techniques has led to an increasing availability of human trajectory data. This raises the challenge of detecting and understanding human mobility from these trajectory datasets to extract useful knowledge in a variety of domains, such as business management and urban computing. In this thesis, we focus on research into knowledge discovery from multi-source heterogeneous trajectory data. To be specific, five research questions in three scenarios are studied. The details are as follows. The first research question is how to perform trajectory pattern identification and anomaly detection for pedestrian flows. We propose to adopt contour maps as the visualization method of the origin-destination flow matrix to describe the distribution of pedestrian movements in terms of entry/exit areas. By transforming the origin-destination flow matrix into a dissimilarity matrix, a visual clustering algorithm is applied to visually cluster the most popular and related areas. We also propose a clustering-based algorithm to detect normal/abnormal time periods with similar/anomalous pedestrian flow patterns. Our results on one synthetic and one real-life dataset validate the effectiveness of our proposed algorithms. The second research question is how to perform contrast pattern mining from multi-source datasets in retail environments. Given the sales data and customers’ trajectory data, in order to find patterns where there has been a big change in one dataset but little change in the other dataset, we define a new kind of contrast pattern, conditional contrast patterns, which are a subset of traditional contrast patterns in one kind of dataset conditioned on a property of these patterns in another kind of dataset. Accordingly, we propose an algorithm based on tree search for mining these patterns. 
Experiments on a synthetic dataset as well as a real-life retail dataset show that our proposed patterns are more informative and actionable for decision makers than traditional contrast patterns, and our tree-based algorithm has good performance in terms of computational efficiency. Three research questions are studied in the third scenario, i.e., human behavior analysis in heterogeneous mobile networks. First, we focus on identifying the underlying geographical corridors of trajectories generated in mobile networks. We propose a hierarchical multi-scale trajectory clustering algorithm for corridor identification by analyzing the non-homogeneity of the spatial distribution of cell towers and users’ movements. Results on a three-week real-life dataset from China Mobile show that our method can achieve the best performance with more than 10% improvement in clustering quality compared with other state-of-the-art methods. Identifying static corridors plays an important role in the long-term design and management of a network. However, there is also a great opportunity for dynamically reconfiguring a network in response to changes in traffic flows. Therefore, in our fourth work, we propose a framework based on contrast data mining to identify significantly different corridors during different time periods. Contrast corridors are defined and a distance measure based on Hausdorff distance and earth mover’s distance is proposed to calculate the dissimilarity between the identified corridors. Experimental results on synthetic as well as real-life datasets show that our method can effectively and robustly detect contrast corridors from trajectories generated from different time periods in mobile networks, improving the F1 score by 20% on average. Finally, we focus on how to design caching strategies at the edge of networks. Edge caching in mobile networks can improve users’ experience, reduce latency and balance the network traffic load. 
Considering that cells located in different places have different levels of predictability due to the heterogeneity of mobile users’ content preferences and mobility, we propose an adaptive edge caching algorithm based on content popularity as well as the individual’s prediction results to provide an optimal caching strategy, aiming to maximize the cache hit rate with acceptable file replacement cost. Our results on a real-life dataset as well as simulation data show that our method is more appropriate for resource-limited and heterogeneous networks than other methods. In summary, we have proposed several trajectory data mining approaches to extract useful knowledge from heterogeneous trajectory data or multi-source datasets in three different scenarios. We have shown that our proposed methods can achieve better performance compared to existing state-of-the-art techniques on a variety of real-life datasets.
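The abstract mentions a corridor dissimilarity measure built on Hausdorff distance and earth mover's distance. As a minimal sketch of the first ingredient only, the code below computes the standard symmetric Hausdorff distance between two corridors, assuming each corridor is represented as a non-empty list of (x, y) points. That representation and the function names are assumptions for illustration; the thesis's combined measure is not reproduced here.

```python
import math


def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


if __name__ == "__main__":
    corridor_morning = [(0.0, 0.0), (1.0, 0.0)]
    corridor_evening = [(0.0, 1.0), (1.0, 1.0)]
    print(hausdorff(corridor_morning, corridor_evening))  # 1.0
```

This brute-force version compares every pair of points, which is fine for short corridors; it also treats all points equally, whereas an earth mover's distance component would additionally account for how traffic mass is distributed along each corridor.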