School of Physics - Theses

  • Item
    Addressing domain shift in deeply-learned jet tagging at the LHC
    Ore, Ayodele Oladimeji (2023-09)
    Over the last fifteen years, deep learning has emerged as an extremely powerful tool for exploiting large datasets. At the Large Hadron Collider, which has been in operation over the same time span, an important use case is to identify the initiating particles of hadronic jets. Due to the complexity of the radiation patterns within jets, neural network-based classifiers are able to outperform traditional techniques for jet tagging. While these approaches are powerful, neural networks must be applied carefully to avoid performance losses in the presence of domain shift, where the data on which a model is evaluated follow different statistics from the training dataset. This thesis presents studies of possible strategies to mitigate domain shift in the application of deep learning to jet tagging.

    Firstly, we develop a deep generative model that can separately learn the distributions of quark and gluon jets from mixed samples. Building on the jet topics framework, this model provides the ability to sample quark and gluon jets in high dimensions without taking input from Monte Carlo simulations. We demonstrate the advantage of the model over a conventional approach in terms of estimating the performance of a quark/gluon classifier on experimental data. One can also use likelihoods under the model to perform classification that is robust to outliers.

    We go on to evaluate fully- and weakly-supervised classifiers using real datasets collected at the CMS experiment. Two measurements of the quark/gluon mixture proportions of the datasets are made under different assumptions. Compared to the predictions based on simulation, we either over- or under-estimate the quark fractions of each sample depending on which assumption is made. When estimating the discrimination power of the classifiers in real data, we find that, while the absolute performance depends on the choice of fractions, the rankings among the models are stable. In particular, weakly-supervised models trained on real jets outperform both simulation-trained models. Our generative networks yield competitive classification and provide a better model for the quark and gluon jet topic distributions in data than the simulation.

    Finally, we investigate the performance of a number of methods for training mass-generalised jet taggers, with a focus on algorithms that leverage meta-learning. We study the discrimination of jets from boosted Z' bosons against a QCD background and evaluate the networks' performance at masses distant from those used in training. We find that a simple data augmentation strategy that standardises the angular scale of jets with different masses is sufficient to produce strong generalisation. The meta-learning algorithms provide only a small improvement in generalisation when combined with this augmentation.
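
For readers unfamiliar with the jet topics framework mentioned in the abstract, the following is a minimal sketch of the generic demixing idea on toy histograms. It is not the thesis's generative model or its CMS measurement. The function name `demix_topics`, the four-bin toy distributions, and the mixture fractions 0.8 and 0.3 are all illustrative assumptions. The sketch also assumes exact (infinite-statistics) histograms and mutually irreducible categories, meaning each pure distribution has at least one bin where the other vanishes.

```python
import numpy as np


def demix_topics(p_a, p_b):
    """Toy topic extraction from two mixed-sample histograms.

    Assumes p_a and p_b are normalised histograms of the same observable,
    each a mixture of the same two underlying categories, with sample A
    richer in the first category than sample B.
    """
    p_a = np.asarray(p_a, dtype=float)
    p_b = np.asarray(p_b, dtype=float)

    # Reducibility factors: the smallest bin-wise ratio of one sample to
    # the other, taken over bins where the denominator is non-zero.
    kappa_ab = np.min(p_a[p_b > 0] / p_b[p_b > 0])
    kappa_ba = np.min(p_b[p_a > 0] / p_a[p_a > 0])

    # Subtract as much of the other sample as possible, then renormalise.
    topic_1 = (p_a - kappa_ab * p_b) / (1.0 - kappa_ab)
    topic_2 = (p_b - kappa_ba * p_a) / (1.0 - kappa_ba)

    # Fraction of topic 1 in each sample, from the two reducibility factors.
    frac_a = (1.0 - kappa_ab) / (1.0 - kappa_ab * kappa_ba)
    frac_b = kappa_ba * frac_a
    return topic_1, topic_2, frac_a, frac_b


# Hypothetical pure distributions (illustrative numbers, not real jet data).
quark = np.array([0.5, 0.3, 0.2, 0.0])
gluon = np.array([0.0, 0.2, 0.3, 0.5])

# Two mixed samples with assumed quark fractions 0.8 and 0.3.
sample_a = 0.8 * quark + 0.2 * gluon
sample_b = 0.3 * quark + 0.7 * gluon

topic_1, topic_2, frac_a, frac_b = demix_topics(sample_a, sample_b)
print(topic_1, topic_2, frac_a, frac_b)
```

In this idealised setting the extracted topics coincide with the pure distributions and the fractions are recovered exactly. With finite statistics the minimum ratio is noisy, so a realistic analysis would need a more careful estimate of the reducibility factors.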