Computing and Information Systems - Theses

Permanent URI for this collection

http://hdl.handle.net/11343/351

Practical declarative debugging of Mercury programs

MacLarty, Ian Douglas. (University of Melbourne, 2006)
A multistage computer model of picture scanning, image understanding, and environment analysis, guided by research into human and primate visual systems

Rogers, T. J. (University of Melbourne, Faculty of Engineering, 1983)

This paper describes the design and some testing of a computational model of picture scanning and image understanding (TRIPS), which outputs a description of the scene in a subset of English. This model can be extended to control the analysis of a three-dimensional environment and changes of the viewing system's position within that environment. The model design is guided by a summary of neurophysiological, psychological, and psychophysical observations and theories concerning visual perception in humans and other primates, with an emphasis on eye movements. These results indicate that lower level visual information is processed in parallel in a spatial representation while higher level processing is mostly sequential, using a symbolic, post iconic, representation. The emphasis in this paper is on simulating the cognitive aspects of eye movement control and the higher level post iconic representation of images. The design incorporates several subsystems. The highest level control module is described in detail, since computer models of eye movement which use cognitively guided saccade selection are not common. For other modules, the interfaces with the whole system and the internal computations required are outlined, as existing image processing techniques can be applied to perform these computations. Control is based on a production system, which uses an "hypothesising" system - a simplified probabilistic associative production system - to determine which production to apply. A framework for an image analysis language (TRIAL), based on "THINGS" and "RELATIONS", is presented, with algorithms described in detail for the matching procedure and the transformations of size, orientation, position, and so on. TRIAL expressions in the productions are used to generate "cognitive expectations" concerning future eye movements and their effects which can influence the control of the system.
Models of low level feature extraction, with parallel processing of iconic representations, have been common in computer vision literature, as are techniques for image manipulation and syntactic and statistical analysis. Parallel and serial systems have also been extensively investigated. This model proposes an integration of these approaches using each technique in the domain to which it is suited. The model proposed for the inferotemporal cortex could also be suitable as a model of the posterior parietal cortex. A restricted version of the picture scanning model (TRIPS) has been implemented, which demonstrates the consistency of the model and also exhibits some behavioural characteristics qualitatively similar to primate visual systems. The TRIAL language is shown to be a useful representation for the analysis and description of scenes. Key words: simulation, eye movements, computer vision systems, inferotemporal, parietal, image representation, TRIPS, TRIAL.
What you get is what you see: Decomposing Epistemic Planning using Functional STRIPS

Hu, Guang (2019)

Epistemic planning (planning with knowledge and belief) is essential in many multi-agent and human-agent interaction domains. Most state-of-the-art epistemic planners solve this problem by compiling to propositional classical planning, for example, generating all possible knowledge atoms, or compiling epistemic formulae to normal forms. These compilations are typically exponentially larger than the original problem. As a result, these methods become computationally infeasible as problems grow. In addition, those methods only work on propositional variables in discrete domains. In this thesis, we decompose epistemic planning by delegating epistemic logic reasoning to an external solver. We do this by modelling the problem using functional STRIPS, which is more expressive than standard STRIPS and supports the use of external, black-box functions within action models. Exploiting recent work that demonstrates the relationship between what an agent 'sees' and what it knows, we allow modellers to provide new implementations of external functions. These define what agents see in their environment, allowing new epistemic logics to be defined without changing the planner. As a result, the capability and flexibility of the epistemic model itself are increased, as our model is able to avoid exponential pre-compilation steps and handle logics from continuous domains. We ran evaluations on well-known epistemic planning benchmarks to compare with an existing state-of-the-art planner, and on new scenarios based on different external functions. The results show that our planner scales significantly better than the state-of-the-art planner which we compared against, and can express problems more succinctly.
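The idea of delegating knowledge to a pluggable "seeing" function can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the thesis: the one-dimensional world, the sight radius, and the hypothetical names `observe` and `knows`. An agent is taken to know an entity's position if it sees that entity, and nested knowledge is evaluated by applying each agent's view in turn.

```python
from typing import Callable, Dict, Sequence

# Hypothetical toy state: entity name -> position on a line.
State = Dict[str, int]


def observe(agent: str, state: State, radius: int = 2) -> State:
    """Illustrative external function: the part of `state` the agent sees.

    An agent whose own position is not in `state` sees nothing.
    """
    if agent not in state:
        return {}
    here = state[agent]
    return {name: pos for name, pos in state.items() if abs(pos - here) <= radius}


def knows(agents: Sequence[str], name: str, state: State,
          see: Callable[[str, State], State] = observe) -> bool:
    """True if agents[0] knows that agents[1] knows ... the position of `name`.

    Each agent's view is computed inside the previous agent's view, so the
    planner-side logic never changes when `see` is swapped for another function.
    """
    view = state
    for agent in agents:
        view = see(agent, view)
    return name in view
```

Swapping `see` for a different function (for example, one based on line of sight in a continuous space) changes what agents know without touching `knows`, which mirrors the separation described in the abstract.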
Towards improving the network architecture of GANs and their evaluation methods

Barua, Sukarna (2019)

Generative Adversarial Networks (GANs) are a powerful class of generative models. GAN models have recently brought significant success in image synthesis tasks. One key issue concerning GANs is the design of a network architecture that results in high training stability and sample quality. GAN models consist of two distinct neural networks known as the generator and discriminator. Conventional practice is to use a deep convolution architecture for both networks that eliminates fully connected layers from the architecture or restricts their use to only input and output layers. Our investigation reveals that eliminating fully connected layers from the network architecture of GANs is not the best practice, and that a more effective GAN architecture can be designed by instead exploiting fully connected layers in the conventional convolution architecture. In this respect, we propose an improved network architecture for GANs that employs multiple fully connected layers in both the generator and discriminator networks. Models based on our proposed architecture both learn faster than the conventional architecture and generate higher-quality samples. In addition, our proposed architecture demonstrates higher training stability than the conventional architecture in several experimental settings. We demonstrate the effectiveness of our architecture in generating high-fidelity images on four benchmark image datasets. Another key challenge when using GANs is how to best measure their ability to generate realistic data. In this regard, we demonstrate that an intrinsic dimensional characterization of the data space learned by a GAN model leads to an effective evaluation metric for GAN quality. In particular, we propose a new evaluation measure, CrossLID, that assesses the local intrinsic dimensionality (LID) of real-world data with respect to neighborhoods found in GAN-generated samples.
Intuitively, CrossLID measures the degree to which manifolds of two data distributions coincide with each other. We compare our proposed measure to several state-of-the-art evaluation metrics. Our experiments show that CrossLID is strongly correlated with the progress of GAN training, is sensitive to mode collapse, is robust to small-scale noise and image transformations, and is robust to sample size. One key advantage of the proposed CrossLID metric is the ability to assess mode-wise performance of GAN models. The mode-wise evaluation can be used to assess how well a GAN model has learned the different modes present in the target data distribution. We demonstrate how the proposed mode-wise assessment can be utilized during the GAN training process to detect unlearned modes. This leads us to an effective training strategy for GANs that dynamically mitigates unlearned modes by oversampling them during the training. Experiments on benchmark image datasets show that our proposed training approach achieves better performance scores than the conventional GAN training. In addition, our training approach demonstrates higher stability against mode failures of GANs compared to the conventional training.
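The description of CrossLID above can be illustrated with a minimal Python sketch. It assumes the commonly used maximum-likelihood LID estimator over k-nearest-neighbour distances and works on raw coordinates; the function names `lid_mle` and `cross_lid`, the default `k`, and the use of Euclidean distance are illustrative assumptions, and the thesis may differ in these details (for example, by measuring distances in a learned feature space).

```python
import math


def lid_mle(query, reference, k):
    """Maximum-likelihood LID estimate of `query` with respect to its
    k nearest neighbours among the points in `reference`."""
    dists = sorted(math.dist(query, r) for r in reference)[:k]
    r_max = dists[-1]
    # Zero distances are skipped to avoid log(0).
    log_ratios = [math.log(d / r_max) for d in dists if d > 0]
    mean_log = sum(log_ratios) / len(log_ratios)
    if mean_log == 0:
        # All neighbours are equidistant: the estimate is unbounded.
        return math.inf
    return -1.0 / mean_log


def cross_lid(real, generated, k=20):
    """Average LID of real samples, each estimated from its neighbourhood
    in the generated samples. Lower values suggest that the generated
    samples lie close to the real data."""
    return sum(lid_mle(x, generated, k) for x in real) / len(real)
```

When the generated samples sit far from a real point, its k nearest generated neighbours are all at nearly the same distance, the log ratios approach zero, and the estimate grows, which is why a lower score indicates better agreement between the two sample sets.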
Performances and publics while watching and live-streaming video games on Twitch.tv

Robinson, Naomi Eleanor Isobel (2019)

Twitch.tv is a video live-streaming website that launched in 2011 with content centred mostly, but not exclusively, on the playing of video games. Streamers or broadcasters play games in real-time often accompanied by a face camera and audio, while viewers or audiences watch them and interact through a text chat. This study responds to the small, but growing literature surrounding Twitch, and addresses the relative lack of ethnographic research on the topic. Previous research on the platform has focussed thus far on technical aspects of the platform; however, user-focused qualitative research on the platform has started to emerge, making this research both timely and relevant. This thesis considers how, and to what extent, the social practices of users contribute to the concepts of ‘networked publics’ and ‘social performance’. It draws on the work of danah boyd and Erving Goffman and considers the usefulness of their theoretical contributions to help contextualise the forms and amendments associated with platforms like Twitch. The analysis emerges from an ethnographic study conducted completely online that features reflexive participant observation, semi-structured, open-ended interviews conducted via email, and in-depth observations of participants’ channels. The thesis is divided into three thematically-organised main data chapters that then feed into a discussion that draws them together to consider a larger conceptual framework. The first such data chapter, ‘Twitch as a Social Media Platform’, argues that the platform demonstrates its role as a social networking site through evidence of matchmaking and mental health. The second main chapter, ‘Twitch as a hobby-profession’, addresses casual and serious leisure and considers the platform in terms of personal investment, branding, and streamer motivation.
The third main chapter, ‘Interactions of Streamers and Viewers’, considers the different types of interactions displayed between various users including parasocial relationships and how audiences may hold power on Twitch. Overall, the thesis offers insight into platform use and it characterises Twitch as a user-led participatory space for like-minded individuals who interact in particular ways in a shared community of practice. The interactions exist along a flexible continuum of differing levels of intimacy where users can lurk, actively participate, and network on both personal and professional levels. Audiences are critical for the platform to function, for communities to flourish, and for streamer success. Streamers build rapport and construct ‘authentic’ brands to attract viewers and promote loyalty and sincerity, and users are seen to actively shape and shift extant social structures and practices over time. Ultimately, users find meaning, produce a sense of community belonging, forge social networks, and shape their own identities in relation to others. The thesis concludes that Twitch somewhat paradoxically is both fleeting and robustly sustained by its contemporary community of practice. This community is produced and maintained through interaction and performance that shapes the construction of Twitch’s publics, with Twitch itself acting as a large participatory public as well. Performative sociality and networking are understood as key driving forces for Twitch, offering a rewarding space to make relationships, participate in self-care, share in leisure, and build potential livelihoods, with entertainment becoming a pleasing secondary function.
Designing a tangible user interface for the learning of motor skills in spinal mobilisation

Chacon Salas, Dimas Antony (2018)

Current techniques in the learning of psychomotor skills in physiotherapy, especially in spinal mobilisation, follow the traditional classroom approach: an expert performs a demonstration and students try to emulate the task by practising on each other while receiving mostly verbal feedback from the instructor. The introduction of a tailored tangible user interface would overcome the limitation of requiring the presence of a tutor and an extra fellow student, improving the scalability of the teaching delivery, and provide more objective feedback. Inspired by this opportunity, this work presents SpinalLog, a visuo-haptic interface that replicates the shape and deformable sensation of a human lower spine for the learning of spinal mobilisation techniques by employing conductive foam. This smart material is used simultaneously to sense vertebral displacements and provide passive haptic feedback to the user, emulating the flexibility of a spine. However, there is a need to understand the impact of the feedback provided in the learning of spinal mobilisation. Therefore, this work aims to design and implement SpinalLog to improve the teaching of this activity, and to investigate the effect of visual feedback, deformable haptic perception and shape fidelity in the learning of this delicate psychomotor task. We evaluated each of these three features (Visual Feedback, Passive Haptic Feedback, and Physical Fidelity) in the first part of an experiment to understand their effects on physiotherapy students' ability to replicate a mobilisation pattern recorded by an expert. In the second and last part of the experiment, we presented the full features of our system to the students to gather their viewpoints for future improvement. From the first part of the experiment, we found that simultaneous feedback has the largest effect, followed by passive haptic feedback.
The high fidelity of the interface has little quantitative effect, but it plays an important role in students' perceptions of the benefit of the system. From the second part of the experiment, we found that students had a favourable view of SpinalLog and suggested improvements to the shape fidelity and the visual components.
High-quality lossless web page template and data separation

Zhao, Chenxu (2018)

Web page separation is an important task that aims to separate a web page into template code and data records populated into the template. Web page separation needs to work in a lossless manner where the web page can be reconstructed by running the template code on the data records. In this thesis, we investigate two sub-problems of web page separation for obtaining (1) high-quality template code and (2) high-quality data records. For the first sub-problem, we focus on improving the maintainability of the template code. Easily maintainable template code is reliable and will simplify further developments on top of the template code, e.g., to update the web templates. We formulate such a problem and analyze its complexity. We show that this problem is NP-hard. We then propose a heuristic algorithm to solve the problem. The main idea of our algorithm is to parse a web page into a tree and then to process it recursively in a bottom-up manner with three steps: splitting, folding, and alignment. In particular, we split siblings in the tree and fold them into chunks, where the alignment step is used to align siblings in the same chunk. During the sibling splitting step, to determine which siblings should be grouped into the same chunk, we further propose a population-based optimization algorithm named dual teaching and learning based optimization. We perform experiments on real data sets to evaluate the performance of our proposed algorithms in maximizing the maintainability of the template code produced. Experimental results show that our proposed algorithms outperform the baseline algorithms in the maintainability measure. For the second sub-problem, we focus on extracting data records from a set of web pages which are generated by different unknown templates and deducing the schemas that provide the data records. The extracted data records can be used in many applications, such as stock market prediction and personalized recommendation systems.
We formulate such a problem and propose a framework to tackle the problem. Our framework processes web pages with four steps: web page template and data separation, template clustering, template alignment, and data record filtering. The web page template and data separation step separates web pages into template code and data records. The template clustering step then clusters the web pages by the similarity of template code. The template alignment step captures the differences among templates to construct a generalized template code which can generate all web pages in the same group. The data filtering step utilizes the template code to verify the data records extracted by the web page template and data separation step and modifies those which are incorrectly extracted. We perform experiments on real data sets to evaluate the performance of our framework. Experimental results show that our proposed framework outperforms baseline algorithms which assume a pre-known clustering of the set of web pages in the F-Score.
Highly efficient distributed hypergraph analysis: real-time partitioning and quantized learning

Jiang, Wenkai (2018)

Hypergraphs have been shown to be highly effective when modeling a wide range of applications where high-order relationships are of interest, such as social network analysis and object classification via hypergraph embedding. Applying deep learning techniques on large scale hypergraphs is challenging due to the size and complex structure of hypergraphs. This thesis addresses two problems of hypergraph analysis, real-time partitioning and quantized neural network training, in a distributed computing environment. When processing a large scale hypergraph in real-time and in a distributed fashion, the quality of hypergraph partitioning has a significant influence on communication overhead and workload balance among the machines participating in the distributed processing. The main challenge of real-time hypergraph partitioning is that hypergraphs are represented as a dynamic hypergraph stream formed by a sequence of hyperedge insertions and deletions, where the structure of a hypergraph is constantly changing. The existing methods that require all information of a hypergraph are inapplicable in this case as only a sub-graph is available to the algorithm at a time. We solve this problem by proposing a streaming refinement partitioning (SRP) algorithm that partitions a real-time hypergraph flow in two phases. With extensive experiments on a scalable hypergraph framework named HyperX, we show that SRP can yield partitions that are of the same quality as that achieved by offline partitioning algorithms in terms of communication overhead and workload balance. For machine learning tasks over hypergraphs, studies have shown that using deep neural networks (DNNs) can improve the learning outcomes. This is because the learning objectives in hypergraph analysis are becoming more complex, with features that are difficult to define and highly correlated. DNNs can be used as a powerful classifier to construct features automatically.
However, DNNs require high computational power and network bandwidth as DNN models grow larger. Moreover, the widely adopted training algorithm, stochastic gradient descent (SGD), suffers from two main problems: vast communication overhead that comes from the broadcasts of parameters during the partial gradient aggregations, and the inherent variance between partial gradients, which makes the training process even longer as it impedes the convergence rate of SGD. We investigate these two problems in depth. Without sacrificing the performance, we develop a quantization technique to reduce the communication overhead and a new training paradigm, named cooperated low-precision training (C-LPT), in which importance sampling is used to reduce variance, and the master and workers collaborate to compensate for the precision loss due to the quantization. Incorporating deep learning techniques into distributed hypergraph analysis shows great potential in query processing and knowledge mining on high-dimensional data records where relationships among them are highly correlated. On one hand, such a process takes advantage of the strong representational power of DNNs as an appearance-based classifier; on the other hand, such a process exploits hypergraph representations to gain benefits from their strong capability in capturing high-order relationships.
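To make the idea of quantizing gradients for cheaper communication concrete, here is a minimal Python sketch of generic stochastic uniform quantization. It is not the thesis's quantization technique or its C-LPT paradigm; the function names `quantize` and `dequantize`, the `levels` parameter, and the max-magnitude scaling are illustrative assumptions.

```python
import math
import random


def quantize(grad, levels=16, rng=random):
    """Map each gradient entry to a small signed integer code.

    Entries are scaled by the largest magnitude, then rounded up or down at
    random so that the expected decoded value equals the original entry.
    """
    scale = max(abs(g) for g in grad)
    if scale == 0:
        return [0] * len(grad), 0.0
    codes = []
    for g in grad:
        x = g / scale * (levels - 1)  # lies in [-(levels - 1), levels - 1]
        low = math.floor(x)
        # Round up with probability equal to the fractional part.
        codes.append(low + (1 if rng.random() < x - low else 0))
    return codes, scale


def dequantize(codes, scale, levels=16):
    """Recover approximate gradient values from the integer codes."""
    return [c * scale / (levels - 1) for c in codes]
```

A worker would send only the integer codes and the single `scale` value instead of full-precision floats. Each decoded entry differs from the original by at most `scale / (levels - 1)`, which is the precision loss that a scheme such as the one described above would then need to compensate for.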
Towards highly accurate publication information extraction from academic homepages

Zhang, Yiqing (2018)

More and more researchers list their research profiles in academic homepages online. Publications from a researcher's academic homepage contain rich information, such as the researcher's fields of expertise, research interests, and collaboration network. Extracting publication information from academic homepages is an essential step in automatic profile analysis, which enables many applications such as academic search, bibliometrics and citation analysis. The publications extracted from academic homepages can also be a supplementary source for bibliographic databases. We investigate two publication extraction problems in this thesis: (i) Given an academic homepage, how can we precisely extract all the individual publication strings from the homepage? Here, a publication string is a text string that describes a publication record. We call this problem publication string extraction. (ii) Given a publication string, how can we extract different fields, such as publication authors, publication title, and publication venue, from the publication string? We call this problem publication field extraction. There are two types of traditional approaches to these two problems: rule-based approaches and machine learning based approaches. Rule-based approaches cannot accommodate the large variety of styles in the homepages, and they require significant effort in rule design. Machine learning based approaches rely on a large amount of high-quality training data as well as suitable model structures. To tackle these challenges, we first collect two datasets and annotate them manually. We propose a training data enhancement method to generate large sets of semi-real data for training our models. For the publication string extraction problem, we propose a PubSE model that can model the structure of a publication list at both the line level and the webpage level.
For the publication field extraction problem, we propose an Adaptive Bi-LSTM-CRF model that can utilize the generated and the manually labeled training data to the full extent. Extensive experiment results show that the proposed methods outperform the state-of-the-art methods in the publication extraction problems studied.
