Walkthrough: Defining our own variables
(10 minutes)
So far we’ve just made histograms of variables that were already in
the n-tuple. With RDataFrame, we can introduce calculations based on the
columns within the n-tuple. If you picture the n-tuple as a
spreadsheet, this is like adding a new column in the spreadsheet with
values based on a formula.
A derived quantity
Let’s consider the quantity \(p_{T}\), which is defined by:
\[ p_{T} = \sqrt{p_{x}^{2} + p_{y}^{2}} \]
This is the transverse momentum of the particle, that is, the component of the particle’s momentum that’s perpendicular to the z-axis.
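To make the spreadsheet picture concrete, here is a minimal sketch in plain Python (no ROOT involved) that computes \(p_{T}\) row by row. The `rows` list and its momentum values are made up purely for illustration; they are not taken from the n-tuple.

```python
import math

# Made-up example rows; each dict stands in for one row of the n-tuple.
rows = [
    {"px": 3.0, "py": 4.0},
    {"px": -6.0, "py": 8.0},
    {"px": 0.0, "py": 2.5},
]

# Add a "pt" entry to every row, like filling in a new spreadsheet column
# whose value comes from a formula on the existing columns.
for row in rows:
    row["pt"] = math.sqrt(row["px"] ** 2 + row["py"] ** 2)

for row in rows:
    print(row)
```

RDataFrame does the equivalent for every row of the n-tuple, without you having to write the loop yourself.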
Let’s calculate the value of \(p_{T}\) as a new column in the n-tuple:
definept = dataframe.Define("pt","sqrt(px*px + py*py)")
This new column, pt, behaves exactly like any other column in
the dataframe.  As you can see on the RDataFrame page on the ROOT web
site,
the general format of the Define method is:
Define(newname, formula)
where:
- newname is a string with the name of the new column;
- formula is a string containing a C++ formula based on the names of any existing columns in the n-tuple (including ones created with previous Defines).
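Because a formula may refer to columns created by earlier Defines, the calls can be chained. The following is only a sketch: the helper name `add_momentum_columns` is hypothetical, and it assumes the n-tuple also has a column named pz in addition to px and py.

```python
def add_momentum_columns(dataframe):
    """Return a dataframe with two extra columns, pt and p.

    Assumes `dataframe` is an RDataFrame with columns px, py, and pz
    (pz is an assumption made for this sketch).
    """
    # First new column: the transverse momentum.
    with_pt = dataframe.Define("pt", "sqrt(px*px + py*py)")
    # Second new column: uses pt, which exists only because of the first Define.
    with_p = with_pt.Define("p", "sqrt(pt*pt + pz*pz)")
    return with_p
```

Each Define returns a new dataframe object, which is why the second call is made on `with_pt` rather than on the original `dataframe`.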
Formula syntax
It’s important to note the formula has to be expressed within a
string using C++ syntax. If you’re a Python programmer, in the above
example you might be tempted to do something like this:
import math
definept = dataframe.Define("pt","math.sqrt(px**2 + py**2)")
This fails for two reasons. First, the import math does nothing useful here (though you might
need it elsewhere in your code): ROOT’s C++ interpreter knows nothing about Python modules, so it won’t recognize math.sqrt.
Second, px**2 would not work because C++ doesn’t use ** as an exponentiation operator. The contents of the
formula string are interpreted by ROOT’s C++ interpreter, not Python!
If you’re a C++ programmer, you might note that there’s an effective using namespace std; within ROOT’s interpreter.
The next few exercises stick to basic math (sqrt, sin, arithmetic, exponents),
so you shouldn’t have a problem with formulas for now.
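Since ** is not available, an exponent inside a formula is usually written either as an explicit multiplication or with the C++ pow function. A small sketch of the second form, assuming `dataframe` is the same RDataFrame used earlier; the helper name `define_pt_with_pow` is hypothetical.

```python
# C++-style formula string: pow(x, 2) takes the place of Python's x**2.
PT_FORMULA = "sqrt(pow(px, 2) + pow(py, 2))"


def define_pt_with_pow(dataframe):
    """Return a dataframe with a pt column defined via C++ pow()."""
    # The string is handed to ROOT unchanged; Python never evaluates it.
    return dataframe.Define("pt", PT_FORMULA)
```

For squares, px*px is just as readable; pow becomes more convenient for higher or non-integer exponents.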
Power formulas
When you want to move on to more powerful math functions, you can always use ROOT’s TMath functions within a formula. For an extreme example, the above formula could be expressed as:
definept = dataframe.Define("pt","TMath::Sqrt(TMath::Power(px,2) + TMath::Power(py,2))")
This verbose example is not something you’d actually do, since ROOT’s C++ is good
enough for sqrt and such. But if you have to compute the relativistic Breit-Wigner function,
TMath::BreitWignerRelativistic has you covered.
What if what you want to do is not one of the ROOT built-in functions? We’ll get to that!
At the moment we have a “new” dataframe called definept, but we haven’t done anything
concrete with it. We’ll do that on the next page.