Exercise 7: Picking a physics cut
(15 minutes)
Make a histogram of chi2, then a scatterplot of chi2 vs ebeam.
Hint
If you’ve forgotten how to figure out the axis limits for ebeam, look at Walkthrough: Making scatterplots again for a clue.
Note
The chi2 distribution and the scatterplot hint that something interesting may be going on.
The chi2 histogram looks unusual: there’s a peak around 1, but the x-axis extends far beyond that, up to chi2 > 18. Evidently there are some events with a large chi2, but not enough of them to show up on the plot.
On the scatterplot, we can see a dark band that represents the main peak of the chi2 distribution, and a scattering of dots that represents a group of events with anomalously high chi2.
The chi2 represents a confidence level in reconstructing the particle’s trajectory. If chi2 is high, the trajectory reconstruction was poor. It would be acceptable to apply a cut of “chi2 < 1.5”, but let’s see if we can correlate a large chi2 with anything else.
Make a scatterplot of chi2 versus theta.
Note
Take a careful look at the scatterplot. It looks like all the large-chi2 values are found in the region theta > 0.15 radians. It may be that our trajectory-finding code has a problem with large angles. Let’s put in both a theta cut and a chi2 cut to be certain we’re looking at a sample of events with good reconstructed trajectories.
Repeat the above plots with a Filter() so that your histograms are only filled if chi2 < 1.5 and theta < 0.15. Change the bin limits of your histograms to reflect these cuts; for example, there’s no point in putting bins above 1.5 in your chi2 histograms, since you know there won’t be any events in those bins after the cuts.
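To see why the axis limits can be tightened, here is a minimal pure-Python sketch of the idea. It does not call ROOT: the event values, the fill_histogram helper, and the choice of 6 bins are all invented for illustration. It only shows that once both cuts are applied, every surviving chi2 falls inside the range 0 to 1.5, so nothing is left for bins above that.

```python
# Toy (chi2, theta) pairs; these values are invented for illustration,
# they are not taken from the tutorial's tree.
events = [
    (0.8, 0.02), (1.1, 0.05), (0.9, 0.11), (1.4, 0.14),
    (6.3, 0.17), (12.5, 0.19), (1.2, 0.16), (18.2, 0.18),
]


def fill_histogram(values, nbins, low, high):
    """Hypothetical helper: count values in nbins equal-width bins on [low, high).

    Returns (counts, overflow), where overflow counts values >= high.
    Values below low are ignored.
    """
    counts = [0] * nbins
    overflow = 0
    width = (high - low) / nbins
    for value in values:
        if value >= high:
            overflow += 1
        elif value >= low:
            counts[int((value - low) / width)] += 1
    return counts, overflow


# Apply both cuts, mirroring the condition given to Filter() in the exercise.
selected = [chi2 for chi2, theta in events if chi2 < 1.5 and theta < 0.15]

# With an upper axis limit of 1.5, no selected event can land in overflow.
counts, overflow = fill_histogram(selected, nbins=6, low=0.0, high=1.5)
print("bin contents:", counts)
print("overflow:", overflow)
```

In your own plots, the same reasoning applies to the theta axis: after the cut, an upper limit of 0.15 covers every event that remains.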
Tip
You may ask which is better:
.Filter("chi2 < 1.5").Filter("theta < 0.15")
.Filter("chi2 < 1.5 && theta < 0.15")
On the relatively small scale of this example, it doesn’t make much of a difference. For a large-scale analysis, the second expression is more efficient, since RDataFrame only has to invoke the overhead of the Filter() method once (including compiling the C++ expression within the quotes) instead of twice.
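Whichever form you pick, both select the same events; the difference is only in overhead. The following pure-Python sketch checks that equivalence on a few invented events. It models only the selection logic and says nothing about how RDataFrame actually schedules or compiles its filters.

```python
# Toy events; the values are invented for illustration.
events = [
    {"chi2": 0.9, "theta": 0.03},
    {"chi2": 1.2, "theta": 0.16},
    {"chi2": 7.5, "theta": 0.18},
    {"chi2": 1.3, "theta": 0.10},
    {"chi2": 2.1, "theta": 0.17},
]

# Analogue of .Filter("chi2 < 1.5").Filter("theta < 0.15"):
# two separate tests, applied one after the other.
after_chi2 = [e for e in events if e["chi2"] < 1.5]
chained = [e for e in after_chi2 if e["theta"] < 0.15]

# Analogue of .Filter("chi2 < 1.5 && theta < 0.15"):
# a single combined test.
combined = [e for e in events if e["chi2"] < 1.5 and e["theta"] < 0.15]

print("same selection:", chained == combined)
print("events kept:", len(combined))
```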
I must confess: I cheated when I pointed you directly to theta as the cause of the high-chi2 events. I knew this because I wrote the program that created the tree. If you want to look at this program yourself, go to the UNIX window and type:
> less ~seligman/root-class/CreateTree.C