Walkthrough: Defining an RDataFrame
(10 minutes)
Defining an RDataFrame is usually simple. Here’s how to do it in both C++ and Python:
// C++
auto ntupleName = "tree1";
auto fileName = "experiment.root";
auto dataframe = ROOT::RDataFrame(ntupleName,fileName);
# Python
import ROOT
ntupleName = "tree1"
fileName = "experiment.root"
dataframe = ROOT.RDataFrame(ntupleName,fileName)
Note
Actually, unless I’m writing a program that accepts the name of the n-tuple or
its file as arguments, I usually don’t define separate variables like ntupleName
or fileName the way I do in the above examples. I’m more likely to simply do:
dataframe = ROOT.RDataFrame("tree1","experiment.root")
I’m doing this the long way so you can get a sense of what the arguments mean.
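For the other case mentioned above, a program that accepts the name of the n-tuple or its file as arguments, here is a minimal sketch in Python. The option names --ntuple and --file and the helper functions parse_arguments and make_dataframe are names I made up for this illustration; they’re not part of ROOT.

```python
import argparse


def parse_arguments(argv=None):
    """Read the n-tuple name and the file name from the command line."""
    parser = argparse.ArgumentParser(description="Open an n-tuple as an RDataFrame")
    # The defaults are the names used in this walkthrough.
    parser.add_argument("--ntuple", default="tree1",
                        help="name of the n-tuple inside the file")
    parser.add_argument("--file", default="experiment.root",
                        help="ROOT file that contains the n-tuple")
    return parser.parse_args(argv)


def make_dataframe(ntupleName, fileName):
    """Create the dataframe from the two names."""
    # ROOT is imported here so the argument parsing above can be
    # tried out even on a system where ROOT isn't installed.
    import ROOT
    return ROOT.RDataFrame(ntupleName, fileName)


# In a script you would then do something like:
#   args = parse_arguments()
#   dataframe = make_dataframe(args.ntuple, args.file)
```

Here the separate variables earn their keep: the two arguments to RDataFrame come from whatever the user typed, rather than being fixed in the code.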
The name dataframe in this example is arbitrary. If you visit the ROOT
website’s RDataFrame page, you’ll see they typically use a short name like df
to save on typing. Since I know how to use copy-and-paste, I’ve opted to use a
longer variable name for clarity.
Note
For now, I’m showing both C++ and Python examples of the code. Eventually,
when I think I’ve shown enough examples so you can convert one to the
other, I’ll stop showing both in parallel. You’ve probably already noticed
how, at least for RDataFrame, the code is very similar.
Note
I assume that you’re working through The RDataFrame Path interactively,
probably in a notebook.
You only have to define your dataframe once per session. I’m not usually
going to include the above commands in the listings below. If you restart
ROOT or the notebook kernel, be sure to initialize dataframe again.
If you’d like to see the names of the columns in the dataframe, it’s easy to do interactively: [1]
dataframe.Describe()
If you’d like a peek at the first few values (roughly equivalent to the TTree::Scan() method in The C++ Path or The Python Path): [2]
dataframe.Display()->Print();   // C++
dataframe.Display().Print()     # Python
Give these commands a try to see what they tell you about the n-tuple tree1.
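If Describe isn’t available in your version of ROOT (see the first footnote), one way to cope is to check for the method before calling it. This is only a sketch: show_columns is a helper name I made up, and the fallback relies on GetColumnNames(), an RDataFrame method that I believe is also present in older ROOT versions.

```python
def show_columns(dataframe):
    """Show the column names, preferring Describe() when it exists."""
    if hasattr(dataframe, "Describe"):
        # Newer ROOT: print the full description of the dataframe.
        dataframe.Describe().Print()
        return "Describe"
    # Assumed fallback for older ROOT: just list the column names.
    for name in dataframe.GetColumnNames():
        print(name)
    return "GetColumnNames"


# Usage, once dataframe has been defined as shown earlier:
#   show_columns(dataframe)
```

The return value only reports which route was taken; you can ignore it in interactive use.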

Figure 45: https://xkcd.com/2620/ by Randall Munroe
[1] If the Describe or Display methods don’t work for you, don’t panic. These were added to the very latest version of ROOT. While I try to keep the ROOT versions up-to-date for everyone on the Nevis particle-physics systems, sometimes (due to complex reasons that are beyond irrelevant to you, trust me) I can’t offer you the latest-and-greatest.

[2] If you’re using C++: You’ll have to observe via my examples when an RDataFrame method returns a pointer; that is, when you have to use -> to access a method. Remember that you had to deal with this back when you were fitting histograms.