(df_define)=
# Walkthrough: Defining an RDataFrame

**(10 minutes)**

Defining an `RDataFrame` is usually simple. Here's how to do it in both C++ and Python:

:::{code-block} c++
:name: cpp-rdf-def
:caption: RDataFrame definition (C++)

auto ntupleName = "tree1";
auto fileName = "experiment.root";
auto dataframe = ROOT::RDataFrame(ntupleName, fileName);
:::

:::{code-block} python
:name: python-rdf-def
:caption: RDataFrame definition (Python)

import ROOT
ntupleName = "tree1"
fileName = "experiment.root"
dataframe = ROOT.RDataFrame(ntupleName, fileName)
:::

:::{note}
Actually, unless I'm writing a program that accepts the name of the n-tuple or its file as arguments, I usually don't define separate variables like **`ntupleName`** or **`fileName`** the way I do in the above examples. I'm more likely to simply do:

    dataframe = ROOT.RDataFrame("tree1", "experiment.root")

I'm doing this the long way so you can get a sense of what the arguments mean.

The name **`dataframe`** in this example is arbitrary. If you visit the ROOT website's [RDataFrame](https://root.cern/doc/master/classROOT_1_1RDataFrame.html) page, you'll see they typically use a short name like **`df`** to save on typing. Since I know how to use copy-and-paste, I've opted to use a longer variable name for clarity.
:::

:::{note}
For now, I'm showing both C++ and Python examples of the code. Eventually, when I think I've shown enough examples for you to convert one to the other, I'll stop showing both in parallel. You've probably already noticed that, at least for `RDataFrame`, the code is very similar.
:::

:::{note}
I assume that you're working through {ref}`rdataframe` interactively, probably in a {ref}`notebook `. You only have to define your dataframe once per session, so I'm not usually going to include the above commands in the listings below. If you restart ROOT or the notebook kernel, be sure to initialize **`dataframe`** again.
:::

If you'd like to see the names of the columns in the dataframe, it's easy to do interactively:[^describe]

[^describe]: If the `Describe` or `Display` methods don't work for you, don't panic. These methods were added in recent versions of ROOT. While I try to keep the ROOT versions up-to-date for everyone on the Nevis particle-physics systems, sometimes (due to complex reasons that are beyond irrelevant to you, trust me) I can't offer you the latest-and-greatest.

:::{code-block} python
:name: rdf-describe
:caption: RDataFrame description

dataframe.Describe()
:::

If you'd like a peek at the first few values (roughly equivalent to the `TTree::Scan()` method in {ref}`cpath` or {ref}`pythonpath`):[^pointers]

[^pointers]: If you're using C++: You'll have to observe from my examples when an `RDataFrame` method returns a {ref}`pointer `; that is, when you have to use `->` to access a method. Remember that you had to deal with this back when you were {ref}`fitting histograms `.

:::{code-block} c++
:name: cpp-rdf-display
:caption: RDataFrame - displaying the first few rows (C++)

dataframe.Display()->Print();
:::

:::{code-block} python
:name: python-rdf-display
:caption: RDataFrame - displaying the first few rows (Python)

dataframe.Display().Print()
:::

Give these commands a try to see what they tell you about the n-tuple `tree1`.

:::{figure-md} health_data-fig
:align: center

xkcd health_data by Randall Munroe
:::
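
Earlier I mentioned that I only bother with separate variables like **`ntupleName`** and **`fileName`** when I'm writing a program that accepts them as arguments. In case you're curious what that might look like, here's a minimal sketch in Python. The option names (`--ntuple`, `--file`), their defaults, and the helper functions `parse_args` and `make_dataframe` are my own illustrative choices, not anything defined by ROOT; only `ROOT.RDataFrame` itself comes from the examples above.

```python
import argparse


def parse_args(argv=None):
    """Read the n-tuple name and the file name from the command line."""
    parser = argparse.ArgumentParser(description="Open an n-tuple as an RDataFrame.")
    # Hypothetical option names; the defaults match the walkthrough's examples.
    parser.add_argument("--ntuple", default="tree1", help="name of the n-tuple")
    parser.add_argument("--file", default="experiment.root", help="ROOT file to read")
    return parser.parse_args(argv)


def make_dataframe(ntupleName, fileName):
    """Define the dataframe, exactly as in the interactive example."""
    # ROOT is imported here so the argument parsing can be tried without it.
    import ROOT
    return ROOT.RDataFrame(ntupleName, fileName)


# Typical use in a script (requires ROOT):
#   args = parse_args()
#   dataframe = make_dataframe(args.ntuple, args.file)
#   dataframe.Display().Print()
```

If you saved this as a script, you could run it with no arguments to get `tree1` from `experiment.root`, or supply `--ntuple` and `--file` to point it at a different n-tuple.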