Walkthrough: Defining our own variables
(10 minutes)
So far we’ve just made histograms of variables that were already in
the n-tuple. With RDataFrame, we can introduce calculations based on the
columns within the n-tuple. If you picture the n-tuple as a
spreadsheet, this is like adding a new column in the spreadsheet with
values based on a formula.
Note
Let’s consider the quantity \(p_{T}\), which is defined by:
\[ p_{T} = \sqrt{p_{x}^{2} + p_{y}^{2}} \]
This is the transverse momentum of the particle, that is, the component of the particle’s momentum that’s perpendicular to the z-axis.
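As a quick numeric check of this definition, here is a minimal sketch in plain Python (no ROOT involved). The momentum components are made-up values chosen only so the arithmetic comes out evenly; they are not taken from the n-tuple.

```python
import math

# Hypothetical momentum components, chosen only for illustration.
px, py, pz = 3.0, 4.0, 12.0

# Transverse momentum: only the x and y components enter,
# i.e. sqrt(px*px + py*py). pz plays no role here.
pt = math.hypot(px, py)

# For comparison, the magnitude of the full momentum uses all three components.
p = math.sqrt(px*px + py*py + pz*pz)

print(pt)  # 5.0
print(p)   # 13.0
```

Changing pz in this sketch changes p but leaves pt untouched, which is what "perpendicular to the z-axis" means in practice.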
Let’s calculate the value of \(p_{T}\) as a new column in the n-tuple:
definept = dataframe.Define("pt","sqrt(px*px + py*py)")
This new column, pt, behaves exactly like any other column in
the dataframe. As you can see on the RDataFrame page on the ROOT web
site,
the general format of the Define method is:
Define(newname, formula)
where:
newname is a string with the name of the new column;
formula is a string containing a C++ formula based on the names of any existing columns in the n-tuple (including ones created with previous Defines).
Warning
If you’re a Python programmer, it’s important to note the formula
has to be expressed within a string using C++ syntax. In the above example, you might be tempted
to try:
import math
definept = dataframe.Define("pt","math.sqrt(px**2 + py**2)")
Not only is the import math unnecessary here (though you might
need it elsewhere in your code), but ROOT won’t recognize math.sqrt; nor will it recognize
px**2, because C++ has no ** exponentiation operator. The contents of the
formula string are interpreted by ROOT’s C++ interpreter, not Python!
If you’re a C++ programmer, you might note that there’s an effective using namespace std; within ROOT’s interpreter.
The next few exercises stick to basic math (sqrt, sin, arithmetic, exponents),
so you shouldn’t have a problem with formulas for now.
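If you’re translating a Python expression into a formula string, one way to gain confidence is to check in plain Python that the C++-friendly spelling gives the same number. The sketch below does not call ROOT at all; the values of px and py are hypothetical, and the formula string is simply the one used earlier on this page.

```python
import math

# Hypothetical momentum components, chosen only for illustration.
px, py = 3.0, 4.0

# Python spelling: uses ** and the math module.
# This form is NOT valid inside a Define formula string.
python_value = math.sqrt(px**2 + py**2)

# The same arithmetic using only operators that C++ also understands.
# It is still evaluated by Python here; only the spelling matters.
cpp_style_value = math.sqrt(px*px + py*py)

# What gets handed to Define is an ordinary Python string holding C++ code.
formula = "sqrt(px*px + py*py)"

print(python_value, cpp_style_value)  # 5.0 5.0
print(formula)
```

The point of the comparison is only that replacing px**2 with px*px does not change the result; the string itself is never evaluated by Python.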
Note
When you want to move on to more powerful math functions, you can always use ROOT’s TMath functions within a formula. For an extreme example, the above formula could be expressed as:
definept = dataframe.Define("pt","TMath::Sqrt(TMath::Power(px,2) + TMath::Power(py,2))")
This verbose example is not something you’d actually do, since ROOT’s C++ is good
enough for sqrt and such. But if you have to compute the relativistic Breit-Wigner function,
TMath::BreitWignerRelativistic has you covered.
What if what you want to do is not one of the ROOT built-in functions? We’ll get to that!
At the moment we have a “new” dataframe called definept, but we haven’t done anything
concrete with it. We’ll do that on the next page.