dc.contributor.author Tan, Jia Tian Justin
dc.date.accessioned 2020-06-19T08:15:24Z
dc.date.available 2020-06-19T08:15:24Z
dc.date.issued 2020
dc.identifier.uri http://hdl.handle.net/11343/240592
dc.description © 2020 Jia Tian Justin Tan
dc.description.abstract In searches for new physics in high-energy physics, experimental analyses are primarily concerned with physical processes that are rare or hitherto unobserved. To claim a statistically significant discovery or exclusion of new physics when studying such decays, it is necessary to maintain an appropriate signal-to-noise ratio. Systems capable of efficiently discriminating signal from datasets overwhelmingly dominated by background events are therefore an important component of modern experimental analyses. However, naïve application of these methods is liable to introduce poorly understood systematic effects that may ultimately degrade the significance of the final measurement. To understand the origin of these systematic effects, we note that certain protected variables in experimental analyses should remain unbiased by the analysis procedure. This category includes variables on which the input parameters of new-physics models depend strongly, as well as variables used to model background contributions to the total measured event yield. Systems responsible for separating signal from background events do so by sampling events with signal-like characteristics from all candidate events. If this sampling biases the distribution of the protected variables, it introduces systematic effects into the analysis that are difficult to characterize. It is therefore desirable for these systems to distinguish between signal and background events without using information about the protected variables.
Beyond high-energy physics, building systems that make decisions independently of certain protected or sensitive information is an important theme in real-world applications of machine learning and statistics. We address this task as an optimization problem: finding a representation of the observed data that is invariant to the given protected quantities. This representation should satisfy two competing criteria. Firstly, it should contain all relevant information about the data, so that it may serve as a proxy for arbitrary downstream tasks, such as inference of unobserved quantities or prediction of target variables. Secondly, it should not be informative of the protected quantities, so that downstream tasks are not influenced by these variables. If the protected quantities to be censored from the intermediate representation carry information that can improve the performance of a downstream task, removing this information is likely to degrade that task. The challenge lies in balancing both objectives without significantly compromising either. The contribution of this thesis is a new set of methods for addressing this problem. The thesis is divided into two largely independent parts. The first part constrains the optimization procedure by which the representation is learnt, reducing how informative the representation is about the protected quantities so that it becomes invariant to changes in them. The second part approaches the problem from a latent variable model perspective, introducing additional unobserved (latent) variables that explain the interactions between different attributes of the observed data. These latent variables can be interpreted as a more fundamental, compact, lower-dimensional representation of the original high-dimensional unstructured data.
By constraining the structure of this latent space, we demonstrate that the influence of the protected variables can be isolated in a latent subspace. Downstream tasks can then access only the relevant subset of the learned representation, without being influenced by protected attributes of the original data. The feasibility of the proposed methods is demonstrated through application to a challenging experimental analysis in precision flavor physics at the Belle II experiment: the study of the $b \rightarrow s \gamma$ transition, a sensitive probe of potential new physics.
dc.rights Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.
dc.subject Particle Physics
dc.subject Experimental High-Energy Physics
dc.subject Flavor Physics
dc.subject B Physics
dc.subject Machine Learning
dc.subject Latent Variable Models
dc.subject Representation Learning
dc.subject Disentangled Representations
dc.title Learning invariant representations with applications to high-energy physics
dc.type Masters Research thesis
melbourne.affiliation.department School of Physics
melbourne.affiliation.faculty Science
melbourne.thesis.supervisorname Phillip Urquijo
melbourne.contributor.author Tan, Jia Tian Justin
melbourne.thesis.supervisorothername Luigia Barberio
melbourne.tes.fieldofresearch1 020203 Particle Physics
melbourne.tes.confirmed true
melbourne.accessrights Open Access
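To make the abstract's first approach concrete: below is a minimal, self-contained NumPy sketch of invariance-penalized representation learning. It is an illustration of the general idea only, not the method developed in the thesis; the linear "encoder", the use of linear correlation as a crude stand-in for informativeness, and all variable names (`s`, `lam`, `losses`) are hypothetical choices made for this toy example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: each event x carries both a target signal y and a
# "protected" variable s (e.g. a kinematic variable we must not bias).
n = 2000
s = rng.normal(size=n)                         # protected quantity
y = rng.normal(size=n)                         # downstream target
x = np.stack([y + 0.1 * rng.normal(size=n),    # column informative of y
              s + 0.1 * rng.normal(size=n)],   # column informative of s
             axis=1)

w = 0.1 * rng.normal(size=2)                   # linear "encoder": z = x @ w

def losses(w):
    """Return (task loss, linear correlation between z and s)."""
    z = x @ w
    task = np.mean((z - y) ** 2)               # z should predict y
    zc, sc = z - z.mean(), s - s.mean()
    corr = (zc @ sc) / (np.linalg.norm(zc) * np.linalg.norm(sc) + 1e-12)
    return task, corr

# Penalized objective: task loss + lam * corr^2. Minimized here with a
# numerical gradient, which is adequate for a 2-parameter sketch.
lam, lr = 5.0, 0.05
for _ in range(500):
    g = np.zeros_like(w)
    for i in range(2):
        e = np.zeros(2); e[i] = 1e-5
        tp, cp = losses(w + e)
        tm, cm = losses(w - e)
        g[i] = ((tp + lam * cp**2) - (tm + lam * cm**2)) / 2e-5
    w -= lr * g

task, corr = losses(w)
print(f"task MSE: {task:.3f}, |corr(z, s)|: {abs(corr):.3f}")
```

After training, the penalty drives the encoder to ignore the column of `x` that carries information about `s`, so the representation `z` predicts `y` well while remaining nearly uncorrelated with the protected variable. The thesis's actual methods operate on far richer notions of informativeness than linear correlation; this sketch only shows the shape of the trade-off between the two competing criteria.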