REU2019: Validation of Higgs to bb¯ tagging techniques with Z→bb¯
Goal
Use Z→bb¯ process to validate performance of techniques used to identify H→bb¯.

Data
- Proton-proton collisions from the 2017 LHC run, at a centre of mass energy of 13 TeV.
- An integrated luminosity of approximately 47 fb−1 was recorded by ATLAS.

Monte-Carlo samples
- Signal: Z(→bb¯)+γ

- Dominant background: photon + jets, where the jets contain gluons splitting into pairs of bottom quarks. Examples of leading-order diagrams:

- Smaller contributions from tt¯γ and Wγ.
- Negligible contributions from jets faking photons, electrons faking photons and tt¯.
Event selection
Some of these cuts are already applied at ntuple level.
- See this ATLAS paper draft, Section 8:
- Single-photon trigger: transverse energy above 140 GeV and loose photon identification requirements.
- Photon with ET>175 GeV.
- Primary vertex and jet-cleaning requirements.
- Exactly one photon and at least one large-R jet with pT>200 GeV, |η|<2.0, mass greater than 30 GeV.
- Jet-photon overlap removal: remove photons within ΔR=1.0 of the large-R jet.
- The Z→bb¯ candidate is the highest-pT large-R jet.
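The kinematic part of the selection above can be sketched in a few lines of Python. This is only an illustration, not the code in the skeleton: the event format (lists of dicts with keys "et", "pt", "m", "eta", "phi"), the function names, the order in which the cuts are applied, and the choice of the leading selected jet for the overlap removal are all assumptions. The trigger, primary-vertex and jet-cleaning requirements are not modelled.

```python
import math


def delta_r(eta1, phi1, eta2, phi2):
    """Angular distance dR = sqrt(d_eta^2 + d_phi^2), with d_phi wrapped to [-pi, pi)."""
    dphi = (phi1 - phi2 + math.pi) % (2.0 * math.pi) - math.pi
    return math.hypot(eta1 - eta2, dphi)


def select_z_candidate(photons, jets):
    """Return the Z->bb candidate jet, or None if the event is rejected.

    Hypothetical event format: each photon is a dict with "et" (GeV), "eta",
    "phi"; each large-R jet is a dict with "pt" and "m" (GeV), "eta", "phi".
    """
    # Large-R jets: pT > 200 GeV, |eta| < 2.0, mass > 30 GeV.
    good_jets = [j for j in jets
                 if j["pt"] > 200.0 and abs(j["eta"]) < 2.0 and j["m"] > 30.0]
    if not good_jets:
        return None

    # The candidate is the highest-pT selected jet.
    candidate = max(good_jets, key=lambda j: j["pt"])

    # Photons: ET > 175 GeV and (assumed ordering) at least dR = 1.0 away
    # from the candidate jet.
    good_photons = [
        p for p in photons
        if p["et"] > 175.0
        and delta_r(p["eta"], p["phi"], candidate["eta"], candidate["phi"]) >= 1.0
    ]

    # Require exactly one surviving photon.
    if len(good_photons) != 1:
        return None
    return candidate
```

Note that the azimuthal difference has to be wrapped before computing ΔR, otherwise a photon and a jet on either side of φ = ±π would look far apart.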
Ntuples
- Basic pre-selection cuts (defined here and already applied):
- GRL: select only events on good runs list
- GOODCALO: ensure that the LAr and Tile are working properly for this event (no noise bursts)
- PRIVTX: ensure that at least one primary vertex exists in the event.
- Names and physics processes: the ntuples are split into data and simulation (MC). The MC ntuples are further split by the type of process that was generated, with associated dataset IDs:
- 361039-361062 correspond to photon + jets samples (generated in several bins of transverse momentum).
- 305435-305439 correspond to W(qq) + photon
- 305440-305444 correspond to Z(qq) + photon
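When looping over the MC samples it can be handy to map a dataset ID to its physics process. The sketch below encodes only the three ranges listed above; the function name and the process labels are hypothetical and are not part of the skeleton.

```python
# Dataset ID ranges (inclusive) copied from the list above.
# The labels are illustrative names chosen for this example.
DSID_RANGES = [
    (361039, 361062, "gamma+jets"),
    (305435, 305439, "W(qq)+gamma"),
    (305440, 305444, "Z(qq)+gamma"),
]


def process_for_dsid(dsid):
    """Return the process label for a dataset ID, or "unknown" if not listed."""
    for first, last, label in DSID_RANGES:
        if first <= dsid <= last:
            return label
    return "unknown"
```

For example, process_for_dsid(305441) gives the Z(qq) + photon label, which is the signal sample in this study.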
How to start
- A code skeleton that reads ntuples and creates histograms of relevant variables is here:
/data/users/miochoa/REU2019/reu-2019-skeleton
- Copy the entire folder above into your user area in /data/users/<your-username>. If your user area does not yet exist, create it and cd into it, then:
cp -r /data/users/miochoa/REU2019/reu-2019-skeleton .
- Every time you start a session, you need to set up the required tools by running:
cd reu-2019-skeleton
source setup.sh
- The event and object selection, as well as the histogram definitions, take place here:
ZbbAnalysisCode/src/MyZbbAnalysis.C
Follow the existing examples to add new histograms.
- How to run a test after any modification:
cd run/
root -l -b
.L runAnalysis.C
runAnalysis("<dataset name>", "12", "local")
Replace <dataset name> with one of the datasets listed in:
run/inputs/data.txt or run/inputs/mc.txt
This step will produce an output.root file. Inspect it to make sure your histograms are properly filled.
- How to run on the full list of data and MC samples:
cd run/
./localRun.sh inputs/data.txt
./localRun.sh inputs/mc.txt
These two steps produce one output ROOT file per data or MC sample (around 250 files). You can combine all the data files into a single file with the following command:
hadd output_data.root output_data_*root
The MC files cannot be combined in the same way, because each one has to be scaled individually by its corresponding cross-section, which is done in the next step.
- There is an example Python script for making plots from the histograms produced in the earlier steps:
cd Plotting/
python new_plotting_example.py -b
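As a rough illustration of the per-sample MC scaling mentioned above, the usual normalisation weight is cross-section times integrated luminosity divided by the sum of generated event weights. The sketch below is not taken from new_plotting_example.py; the function name, the argument names, and the optional filter-efficiency and k-factor terms are assumptions, so check them against what the script actually does.

```python
def mc_scale_factor(xsec_pb, lumi_fb, sum_of_weights, filter_eff=1.0, k_factor=1.0):
    """Weight that normalises an MC histogram to the data luminosity.

    xsec_pb        : sample cross-section in pb
    lumi_fb        : integrated luminosity of the data in fb^-1
    sum_of_weights : sum of generator weights over all generated events
    filter_eff     : generator filter efficiency (assumed 1 if not applicable)
    k_factor       : higher-order correction (assumed 1 if not applicable)
    """
    if sum_of_weights <= 0.0:
        raise ValueError("sum_of_weights must be positive")
    lumi_pb = lumi_fb * 1000.0  # 1 fb^-1 = 1000 pb^-1
    return xsec_pb * filter_eff * k_factor * lumi_pb / sum_of_weights
```

Each MC histogram would be multiplied by its own factor before the samples are stacked, which is why the unscaled MC files should not be merged with hadd.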
Questions
- How are photons identified and reconstructed in ATLAS?
- E.g. in the paper cited above, a loose photon requirement is mentioned. What does it consist of?
- This work uses two ‘types’ of jets: large-R calorimeter jets and variable-R track-jets.
- How are jets defined and built in ATLAS?
- We also use b-tagging techniques to identify jets that contain b-hadrons: what properties of the b-quark are useful for this tagging?
- What goes into the scaling of the MC samples before you make your plots?
Studies
- Study data/MC agreement in different variables and selections:
- e.g. before and after requiring two b-tagged jets associated with the large-R jet.
- What is the signal efficiency for different b-tagging selections?
- Are there distributions that provide discriminating power between γ+jets and Z→bb¯?
- How does the efficiency to find a Z→bb¯ object in data compare to the efficiency in MC?
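For the efficiency questions above, a minimal starting point is the ratio of events passing a selection to all events, with a simple binomial uncertainty. The sketch below assumes unweighted event counts; for weighted MC events or efficiencies close to 0 or 1 a more careful treatment is needed. The function names are hypothetical.

```python
import math


def efficiency(n_pass, n_total):
    """Return (efficiency, binomial uncertainty) for unweighted event counts."""
    if n_total <= 0 or not 0 <= n_pass <= n_total:
        raise ValueError("need 0 <= n_pass <= n_total and n_total > 0")
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


def data_mc_ratio(eff_data, eff_mc):
    """Ratio of the efficiency measured in data to the efficiency in MC."""
    if eff_mc <= 0.0:
        raise ValueError("MC efficiency must be positive")
    return eff_data / eff_mc
```

A data/MC ratio close to 1 indicates that the simulation models the tagging performance well for the selection under study.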