
dc.contributor.author: Tan, Jia Tian Justin
dc.date.accessioned: 2020-06-19T08:15:24Z
dc.date.available: 2020-06-19T08:15:24Z
dc.date.issued: 2020
dc.identifier.uri: http://hdl.handle.net/11343/240592
dc.description: © 2020 Jia Tian Justin Tan
dc.description.abstract: In searches for new physics in high-energy physics, experimental analyses are primarily concerned with physical processes that are rare or hitherto unobserved. To claim a statistically significant discovery or exclusion of new physics when studying such decays, it is necessary to maintain an adequate signal-to-noise ratio. Systems capable of efficiently discriminating signal in datasets overwhelmingly dominated by background events are therefore an important component of modern experimental analyses. However, naïve application of these methods is liable to introduce poorly understood systematic effects which may ultimately degrade the significance of the final measurement. To understand the origin of these systematic effects, we note that experimental analyses contain certain protected variables which should remain unbiased by the analysis procedure. Variables upon which the input parameters of new-physics models strongly depend, and variables used to model background contributions to the total measured event yield, fall into this category. Systems responsible for separating signal from background events do so by sampling events with signal-like characteristics from all candidate events. If this procedure introduces sampling bias into the distribution of the protected variables, it injects systematic effects into the analysis that are difficult to characterize. It is therefore desirable for these systems to distinguish between signal and background events without using information about the protected variables. Beyond high-energy physics, building systems that make decisions independently of certain protected or sensitive information is an important theme in real-world applications of machine learning and statistics. We address this task as an optimization problem: finding a representation of the observed data that is invariant to the given protected quantities.
This representation should satisfy two competing criteria. Firstly, it should contain all relevant information about the data, so that it may serve as a proxy for arbitrary downstream tasks such as inference of unobserved quantities or prediction of target variables. Secondly, it should not be informative of the given protected quantities, so that downstream tasks are not influenced by these variables. If the protected quantities to be censored from the intermediate representation carry information that could improve the performance of a downstream task, removing this information is likely to degrade that task. The challenge lies in balancing both objectives without significantly compromising either requirement. The contribution of this thesis is a new set of methods for addressing this problem. The thesis is divided into two largely independent parts. The first part constrains the optimization procedure by which the representation is learnt, so as to reduce how informative the representation is of the given protected quantities and render it invariant to changes in these quantities. The second part approaches the problem from a latent variable model perspective, introducing additional unobserved (latent) variables that explain the interactions between different attributes of the observed data. These latent variables can be interpreted as a more fundamental, compact, lower-dimensional representation of the original high-dimensional unstructured data. By constraining the structure of this latent space, we demonstrate that the influence of the protected variables can be isolated in a latent subspace. This allows downstream tasks to access only the relevant subset of the learned representation, without being influenced by protected attributes of the original data.
The feasibility of the proposed methods is demonstrated through application to a challenging experimental analysis in precision flavor physics at the Belle II experiment: the study of the $b \rightarrow s \gamma$ transition, a sensitive probe of potential new physics.
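The central idea in the abstract, learning a representation that stays predictive of a target while carrying no usable information about a protected variable, can be illustrated with a toy example. The sketch below is illustrative only and is not the method developed in the thesis: it uses a one-dimensional linear representation and a simple covariance penalty as a stand-in for an invariance constraint, and all data, variable names, and the penalty form are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: x[:, 0] carries the target signal directly; x[:, 1] carries the
# protected variable z, which is itself correlated with the target y.
n = 4000
y = rng.integers(0, 2, n).astype(float)
z = np.where(rng.random(n) < 0.7, y, 1.0 - y)   # protected, correlated with y
x = np.column_stack([
    y + 0.5 * rng.normal(size=n),               # informative of y
    z + 0.5 * rng.normal(size=n),               # informative of y only via z
])

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train(lam, steps=2000, lr=0.5):
    """Logistic classifier on a linear representation r = x @ w, with an
    optional penalty lam * cov(r, z)^2 / 2 pushing r towards (linear)
    independence of the protected variable z."""
    w, b = np.zeros(2), 0.0
    zc = z - z.mean()
    for _ in range(steps):
        r = x @ w
        p = sigmoid(r + b)
        grad_w = x.T @ (p - y) / n              # cross-entropy gradient
        grad_b = (p - y).mean()
        cov = (r - r.mean()) @ zc / n
        grad_w += lam * cov * (x.T @ zc) / n    # d/dw of lam * cov^2 / 2
        w -= lr * grad_w
        b -= lr * grad_b
    r = x @ w
    acc = float(((sigmoid(r + b) > 0.5) == y).mean())
    corr = float(abs(np.corrcoef(r, z)[0, 1]))
    return acc, corr

acc_plain, corr_plain = train(lam=0.0)   # unconstrained classifier
acc_inv, corr_inv = train(lam=10.0)      # invariance-penalised classifier
print(f"plain:     accuracy {acc_plain:.2f}, |corr(r, z)| {corr_plain:.2f}")
print(f"invariant: accuracy {acc_inv:.2f}, |corr(r, z)| {corr_inv:.2f}")
```

The penalised run trades a little accuracy for a representation whose correlation with z is sharply reduced, which is the tension between the two competing criteria described above; the thesis's actual methods operate on nonlinear representations and stronger notions of invariance than this linear decorrelation.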
dc.rights: Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.
dc.subject: Particle Physics
dc.subject: Experimental High-Energy Physics
dc.subject: Flavor Physics
dc.subject: B Physics
dc.subject: Machine Learning
dc.subject: Latent Variable Models
dc.subject: Representation Learning
dc.subject: Disentangled Representations
dc.title: Learning invariant representations with applications to high-energy physics
dc.type: Masters Research thesis
melbourne.affiliation.department: School of Physics
melbourne.affiliation.faculty: Science
melbourne.thesis.supervisorname: Phillip Urquijo
melbourne.contributor.author: Tan, Jia Tian Justin
melbourne.thesis.supervisorothername: Luigia Barberio
melbourne.tes.fieldofresearch1: 020203 Particle Physics
melbourne.tes.confirmed: true
melbourne.accessrights: Open Access

