Learning invariant representations with applications to high-energy physics
Author: Tan, Jia Tian Justin
Affiliation: School of Physics
Document Type: Masters Research thesis
Access Status: Open Access
© 2020 Jia Tian Justin Tan
In searches for new physics in high-energy physics, experimental analyses are primarily concerned with physical processes which are rare or hitherto unobserved. To claim a statistically significant discovery or exclusion of new physics when studying such processes, it is necessary to maintain an appropriate signal-to-noise ratio. This makes systems capable of efficiently discriminating signal from datasets overwhelmingly dominated by background events an important component of modern experimental analyses. However, naive application of these methods is liable to raise poorly understood systematic effects which may ultimately degrade the significance of the final measurement. To understand the origin of these systematic effects, we note that there are certain protected variables in experimental analyses which should remain unbiased by the analysis procedure. This category includes variables on which the input parameters of models of new physics depend strongly, and variables used to model background contributions to the total measured event yield. Systems responsible for separating signal from background events achieve this by sampling events with signal-like characteristics from all candidate events. If this procedure introduces sampling bias into the distribution of protected variables, the resulting systematic effects in the analysis are difficult to characterize. Thus it is desirable for these systems to distinguish between signal and background events without using information about certain protected variables. Beyond high-energy physics, building systems that make decisions independent of certain protected or sensitive information is an important theme in the real-world application of machine learning and statistics. We address this task as an optimization problem of finding a representation of the observed data that is invariant to the given protected quantities. This representation should satisfy two competing criteria.
Firstly, it should contain all relevant information about the data so that it may be used as a proxy for arbitrary downstream tasks, such as inference of unobserved quantities or prediction of target variables. Secondly, it should not be informative of the given protected quantities, so that downstream tasks are not influenced by these variables. If the protected quantities to be censored from the intermediate representation contain information that can improve the performance of the downstream task, it is likely that removing this information will adversely affect this task. The challenge lies in balancing both objectives without significantly compromising either requirement. The contribution of this thesis is a new set of methods for addressing this problem. This thesis is divided into two parts, which are largely independent of one another. The first part concerns constraining the optimization procedure by which the representation is learnt, so as to reduce how informative the representation is about the given protected quantities, such that the representation is invariant to changes in these quantities. The second part approaches the problem from a latent variable model perspective, in which additional unobserved (latent) variables are introduced which explain the interaction between different attributes of the observed data. These latent variables can be interpreted as a more fundamental, compact, lower-dimensional representation of the original high-dimensional unstructured data. By constraining the structure of this latent space, we demonstrate that we can isolate the influence of the protected variables into a latent subspace. This allows downstream tasks to access only a relevant subset of the learned representation without being influenced by protected attributes of the original data.
The feasibility of our proposed methods is demonstrated through application to a challenging experimental analysis in precision flavor physics at the Belle II experiment: the study of the $b \rightarrow s \gamma$ transition, a sensitive probe of potential new physics.
Keywords: Particle Physics; Experimental High-Energy Physics; Flavor Physics; B Physics; Machine Learning; Latent Variable Models; Representation Learning; Disentangled Representations