Walkthrough: Defining our own variables

(10 minutes)

So far we’ve just made histograms of variables that were already in the n-tuple. With RDataFrame, we can introduce calculations based on the columns within the n-tuple. If you picture the n-tuple as a spreadsheet, this is like adding a new column in the spreadsheet with values based on a formula.

A derived quantity

Let’s consider the quantity \(p_{T}\), which is defined by:

\[p_{T} = \sqrt{ p_{x}^{2} + p_{y}^{2} }\]

This is the transverse momentum of the particle, that is, the component of the particle’s momentum that’s perpendicular to the z-axis.
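To make the spreadsheet picture concrete, here is a minimal sketch in plain Python (no ROOT involved) that adds a pt "column" to a couple of rows. The momentum values are made up for illustration, and the list-of-dictionaries layout is only a stand-in for the n-tuple, not how RDataFrame stores data.

```python
import math

# Made-up momentum components (arbitrary units), one dictionary per row.
rows = [
    {"px": 3.0, "py": 4.0, "pz": 12.0},
    {"px": -1.0, "py": 0.0, "pz": 5.0},
]

# Add a new "column" whose value in each row comes from a formula,
# just like filling a spreadsheet column from the px and py columns.
for row in rows:
    row["pt"] = math.sqrt(row["px"] * row["px"] + row["py"] * row["py"])

pt_values = [row["pt"] for row in rows]
print(pt_values)  # [5.0, 1.0]
```

Note that pz plays no part in the result: only the components perpendicular to the z-axis enter the formula.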

Let’s calculate the value of \(p_{T}\) as a new column in the n-tuple:

definept = dataframe.Define("pt","sqrt(px*px + py*py)")

This new column, pt, behaves exactly like any other column in the dataframe. As you can see on the RDataFrame page on the ROOT web site, the general format of the Define method is:

Define(newname, formula)

where:

  • newname is a string with the name of the new column;

  • formula is a string containing a C++ formula based on the names of any existing columns in the n-tuple (including ones created with previous Defines).

Formula syntax

It’s important to note the formula has to be expressed within a string using C++ syntax. If you’re a Python programmer, in the above example you might be tempted to do something like this:

import math
definept = dataframe.Define("pt","math.sqrt(px**2 + py**2)")

The import math is wasted here (though you might need it elsewhere in your code): ROOT’s C++ interpreter never sees it and won’t recognize math.sqrt. Likewise, px**2 will not work, because C++ has no ** exponentiation operator. The contents of the formula string are interpreted by ROOT’s C++ interpreter, not Python!

If you’re a C++ programmer, you might note that there’s an effective using namespace std; within ROOT’s interpreter.

The next few exercises stick to basic math (sqrt, sin, arithmetic, exponents), so you shouldn’t have a problem with formulas for now.

Power formulas

When you want to move on to more powerful math functions, you can always use ROOT’s TMath functions within a formula. For an extreme example, the above formula could be expressed as:

definept = dataframe.Define("pt","TMath::Sqrt(TMath::Power(px,2) + TMath::Power(py,2))")

This verbose example is not something you’d actually do, since ROOT’s C++ is good enough for sqrt and such. But if you have to compute the relativistic Breit-Wigner function, TMath::BreitWignerRelativistic has you covered.

What if what you want to do is not one of the ROOT built-in functions? We’ll get to that!

At the moment we have a “new” dataframe called definept, but we haven’t done anything concrete with it. We’ll do that on the next page.