Walkthrough: Apply a cut and a count
(15 minutes)
Applying cuts is an important part of any physics analysis. There’ll be some events you want to analyze and others which are not important to your study. A “cut” is a calculation that separates the two categories.
In RDataFrame, the method that applies a cut is Filter. For example, suppose that we’re only interested in events with pz less than 145 GeV. One way to express this for our example n-tuple is:
pzcut = dataframe.Filter("pz < 145")
C++ syntax
Again, the string passed to the Filter method is interpreted as a C++ expression, not a Python expression, even if you’re working in Python. You’ll get an error if you try this:
pzcut = dataframe.Filter("pz lt 145")
You can also apply a cut on any new columns you’ve defined:
ptcut = definept.Filter("pt > 50")
Define vs. Filter
There’s an important operational difference between Define and Filter. Define is a column-wise operation; that is, it operates on columns and adds a new one. Filter is a row-wise operation; it essentially removes rows from the n-tuple that don’t pass its criteria.
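To make the column-wise versus row-wise distinction concrete, here is a plain-Python analogy that does not use ROOT at all. The three rows, their values, and the helper names define and filter_rows are made up for this illustration, and the pt formula is only an assumption for the sake of the sketch. The real Define and Filter operate on the n-tuple itself rather than on Python lists.

```python
import math

# Three made-up "events"; each dict is one row of a toy n-tuple.
rows = [
    {"px": 30.0, "py": 40.0, "pz": 140.0},
    {"px": 3.0, "py": 4.0, "pz": 150.0},
    {"px": 60.0, "py": 80.0, "pz": 144.0},
]

def define(rows, name, func):
    """Column-wise: return the same number of rows, each with one extra column."""
    return [{**row, name: func(row)} for row in rows]

def filter_rows(rows, passes):
    """Row-wise: return only the rows that pass; the columns are unchanged."""
    return [row for row in rows if passes(row)]

# Hypothetical column definition, assumed here to be the transverse momentum.
with_pt = define(rows, "pt", lambda r: math.hypot(r["px"], r["py"]))

# Keep only the rows with pz below 145.
pzcut = filter_rows(with_pt, lambda r: r["pz"] < 145)

# Row counts: original, after define, after filter.
print(len(rows), len(with_pt), len(pzcut))

# The filtered rows still carry every column, including the new one.
print(sorted(pzcut[0].keys()))
```

In this toy version, define leaves the number of rows alone and widens each row, while filter_rows leaves the columns alone and shortens the list.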
You’ve probably already guessed that you can plot any column from the filtered n-tuple; e.g.,
pzcut_hist = pzcut.Histo1D("ebeam")
The above line would accumulate a histogram of ebeam for those rows with pz less than 145 GeV.
If you just want to know the number of n-tuple rows that pass a cut, the method to use is Count. For example:
pzcut_count = pzcut.Count()
Count syntax
Unlike Define and Filter, Count never takes an argument. However, you can’t omit the parentheses, since Count is a function; it’s always Count() and not Count in program code.
This may seem a bit counter-intuitive at first: You can’t just print out the value of pzcut_count. That’s because it’s still an RDataFrame variable, in the same sense that histchi2 was earlier. In the case of histchi2, you had to Draw it to see anything. The corresponding method to use with Count is GetValue(); e.g.,
pzcut = dataframe.Filter("pz < 145");
pzcount = pzcut.Count();
std::cout << "The number of events with pz < 145 is "
<< pzcount.GetValue() << std::endl;
pzcut = dataframe.Filter("pz < 145")
pzcount = pzcut.Count()
print("The number of events with pz < 145 is",pzcount.GetValue())
Result
When I run either of the above code examples, I get
The number of events with pz < 145 is 14962
Give it a try. Hopefully you’ll get the same answer.
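One way to picture why the count has to be asked for its value is the following plain-Python sketch. It is only an analogy, not ROOT code and not a description of how ROOT is implemented: the class name LazyCount and the list of pz values are invented for this illustration. The idea it shows is that the counting loop is postponed until GetValue() is called.

```python
class LazyCount:
    """Toy stand-in for the object returned by Count(); this is not ROOT code."""

    def __init__(self, values, passes):
        self._values = values
        self._passes = passes
        self._result = None  # nothing has been counted yet

    def GetValue(self):
        # The loop over the rows runs on the first request only;
        # later calls reuse the stored result.
        if self._result is None:
            self._result = sum(1 for v in self._values if self._passes(v))
        return self._result


pz_values = [140.0, 150.0, 144.0, 146.5]  # made-up numbers
pzcount = LazyCount(pz_values, lambda pz: pz < 145)

print(pzcount)             # an object description, not a number
print(pzcount.GetValue())  # the count is computed at this point
```

Printing the toy object directly shows only that it is an object; the number appears once GetValue() is called, which mirrors the pattern in the examples above.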

Figure 38: https://xkcd.com/562/ by Randall Munroe. This is another way to apply a cut.