Walkthrough: Defining our own variables
(10 minutes)
So far we’ve just made histograms of variables that were already in
the n-tuple. With RDataFrame, we can introduce calculations based on the
columns within the n-tuple. If you picture the n-tuple as a
spreadsheet, this is like adding a new column in the spreadsheet with
values based on a formula.
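To make the spreadsheet picture concrete, here is a minimal plain-Python sketch that does not use ROOT at all. The toy rows and the helper name add_column are made up for illustration; they only mimic the idea of computing a new column, row by row, from columns that already exist.

```python
import math

# Hypothetical toy "n-tuple": each dict is one row (one event).
rows = [
    {"px": 3.0, "py": 4.0, "pz": 12.0},
    {"px": 1.0, "py": 0.0, "pz": 5.0},
]

def add_column(table, name, formula):
    """Return a new table with one extra column, computed row by row.

    The original table is left untouched, much like a spreadsheet
    formula that fills a new column without altering the old ones.
    """
    return [{**row, name: formula(row)} for row in table]

# Add a derived column from the existing px and py columns.
with_pt = add_column(
    rows, "pt", lambda r: math.sqrt(r["px"] * r["px"] + r["py"] * r["py"])
)

for row in with_pt:
    print(row)
```

The original rows list keeps its three columns; only the returned table has the extra one.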
A derived quantity
Let’s consider the quantity \(p_{T}\), which is defined by:
\[p_{T} = \sqrt{p_{x}^{2} + p_{y}^{2}}\]
This is the transverse momentum of the particle, that is, the component of the particle’s momentum that’s perpendicular to the z-axis.
Let’s calculate the value of \(p_{T}\) as a new column in the n-tuple:
definept = dataframe.Define("pt","sqrt(px*px + py*py)")
This new column, pt, behaves exactly like any other column in
the dataframe. As you can see on the RDataFrame page on the ROOT web
site, the general format of the Define method is:
Define(newname, formula)
where:
newname is a string with the name of the new column;
formula is a string containing a C++ formula based on the names of any existing columns in the n-tuple (including ones created with previous Defines).
Formula syntax
It’s important to note that the formula has to be expressed within a
string using C++ syntax. If you’re a Python programmer, in the above
example you might be tempted to do something like this:
import math
definept = dataframe.Define("pt","math.sqrt(px**2 + py**2)")
Not only is the import math unnecessary here (though you might
need it elsewhere in your code), but ROOT’s C++ interpreter won’t recognize math.sqrt,
and px**2 would not work because C++ doesn’t use ** as an exponentiation operator.
The contents of the formula string are interpreted by ROOT’s C++ interpreter, not Python!
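Since the formula is only a Python string until ROOT interprets it, one way to keep Python syntax out of it is to build the string with a small helper. The sketch below is an illustration, not part of ROOT: the helper name quadrature_formula is hypothetical, and it simply writes each square as x*x instead of x**2.

```python
def quadrature_formula(*columns):
    """Build a C++ formula string for the quadrature sum of the given columns.

    Each square is written as x*x, because ** is not a C++ operator.
    """
    terms = " + ".join(f"{c}*{c}" for c in columns)
    return f"sqrt({terms})"

# The resulting string is plain text; nothing is evaluated in Python.
pt_formula = quadrature_formula("px", "py")
print(pt_formula)
```

Assuming the dataframe variable from earlier in this walkthrough, the string could then be passed along as dataframe.Define("pt", pt_formula).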
If you’re a C++ programmer, you might note that there’s an effective using namespace std; within ROOT’s interpreter.
The next few exercises stick to basic math (sqrt, sin, arithmetic, exponents),
so you shouldn’t have a problem with formulas for now.
Power formulas
When you want to move on to more powerful math functions, you can always use ROOT’s TMath functions within a formula. For an extreme example, the above formula could be expressed as:
definept = dataframe.Define("pt","TMath::Sqrt(TMath::Power(px,2) + TMath::Power(py,2))")
This verbose example is not something you’d actually do, since ROOT’s C++ is good
enough for sqrt and such. But if you have to compute the relativistic Breit-Wigner function,
TMath::BreitWignerRelativistic has you covered.
What if what you want to do is not one of the ROOT built-in functions? We’ll get to that!
At the moment we have a “new” dataframe called definept, but we haven’t done anything
concrete with it. We’ll do that on the next page.