(cpp-approach)=
# The C++ approach

:::{admonition} A warning for Python programmers
:class: warning
Don't skip this section. Some of the concepts apply to you as well.
:::

Here's a function that takes two inputs, the x-momentum and the y-momentum, and returns one output, the transverse momentum.

:::{code-block} c++
:name: c-pt-func
:caption: A simple C++ function to be used with RDataFrame.

float pt_func( float xmom, float ymom ) {
    return sqrt( xmom*xmom + ymom*ymom );
}
:::

Following the usual C++ standard, you'd define this function before you defined your main routine.[^forward]

[^forward]: You could also define this function *after* your main routine, and just include a [forward declaration](https://stackoverflow.com/questions/4757565/what-are-forward-declarations-in-c) before the main routine.

:::{admonition} Too simple?
:class: note
Obviously, this function is so simple that you're not likely to define it separately just to pass it to `RDataFrame::Define()` (though see the section on {ref}`lambda expressions ` later on). The point is to start with something simple as a "skeleton" for you to see how to create more complex functions of your own.
:::

In order to use this function `pt_func` on a dataframe, you could do:

:::{code-block} c++
:name: define-func-c
:caption: How to apply `pt_func` to each entry in an n-tuple

auto pt_dataframe = dataframe.Define("pt",pt_func,{"px","py"});
:::

Note how this differs from what we've done before:

:::{code-block} c++
:name: define-jit-c
:caption: Our earlier approach to defining a new column in our n-tuple.

auto pt_dataframe = dataframe.Define("pt","sqrt(px*px + py*py)");
:::

In {numref}`Listing %s <define-jit-c>`, we supply the function in the form of a text string, to which ROOT applies its internal compiler to jit the string. In {numref}`Listing %s <define-func-c>`, we let C++ compile the function from {numref}`Listing %s <c-pt-func>` and pass that function's C++ "programming layer" name to the `Define()` method.
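
Because `pt_func` is ordinary compiled C++, it can be exercised on its own, without ROOT. The following is only a sketch under that assumption: it repeats `pt_func`, adds the `<cmath>` header that a standalone program needs for `std::sqrt`, and includes a hypothetical helper, `pt_func_looks_right`, that is not part of the tutorial or of ROOT.

```cpp
#include <cmath>

// Same function as in the text, with the header that declares std::sqrt.
float pt_func( float xmom, float ymom ) {
    return std::sqrt( xmom*xmom + ymom*ymom );
}

// Hypothetical helper for a quick sanity check: momentum components of
// 3 and 4 form a 3-4-5 right triangle, so the result should be 5.
bool pt_func_looks_right() {
    return std::fabs( pt_func( 3.0f, 4.0f ) - 5.0f ) < 1e-6f;
}
```
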
However, passing the function's name is not enough for `RDataFrame::Define()` to use `pt_func`. It has to be told which n-tuple columns to supply as arguments to the function. That's why we also have to provide a list `{"px","py"}` as a third argument to `Define`.[^short]{sup}`,`[^default]

[^short]: Could we have avoided the need to specify `{"px","py"}` to `Define` if we'd used those names in the definition of `pt_func`? For example,

    :::{code-block} c++
    float pt_func( float px, float py ) {
        return sqrt( px*px + py*py );
    }
    :::

    You've probably already guessed that the answer is no. Remember, names that are defined in the programming layer have no meaning to ROOT's internal layer. Even if we choose to use the same name in the programming layer as in the internal layer, ROOT has no direct way of matching those names between layers.

[^default]: If we omit the list of columns in `Define`, ROOT will assume that the user function takes every column in the n-tuple as an argument. For the extremely simple n-tuple `tree1`, you might be able to live with that; e.g.,

    :::{code-block} c++
    float pt_func( float c2, float eb, int ev, float xmom, float ymom, float zmom, float zvertex ) {
        return sqrt( xmom*xmom + ymom*ymom );
    }
    :::

    Then we could omit that third argument to `Define`:

    :::{code-block} c++
    auto pt_dataframe = dataframe.Define("pt",pt_func);
    :::

    However:

    - The compiler will toss out a lot of warning messages about "unused variables". This is accurate, since our function does not refer (for example) to `zvertex` in its body.
    - You can't always control the order of the columns in an n-tuple. In particular, if you look at the n-tuple that you created using {ref}`Snapshot `, you may see that the method did not necessarily add the new columns to the end of the n-tuple.
    - The n-tuples in real experiments often have hundreds of columns. It's impractical to list them all in the function definition.
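
To see why the position in the column list matters and the parameter names do not, here is a toy model of the idea. It is a sketch, not ROOT's implementation: `ToyTable` and `toy_define` are hypothetical names invented for this illustration, and the table is just a map from column names to vectors of values.

```cpp
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// A toy stand-in for an n-tuple: each column name maps to one value per entry.
// This is NOT how ROOT stores data; it only illustrates the name lookup.
using ToyTable = std::map<std::string, std::vector<float>>;

// Same function as in the text. The parameter names are private to C++.
float pt_func( float xmom, float ymom ) {
    return std::sqrt( xmom*xmom + ymom*ymom );
}

// Hypothetical helper: add a new column by calling a two-argument function
// on every entry. The column names in `inputs` are matched to the function's
// parameters by position, never by name.
ToyTable toy_define( ToyTable table,
                     const std::string& new_name,
                     float (*func)( float, float ),
                     const std::vector<std::string>& inputs ) {
    // Look up the two input columns by their string names.
    const std::vector<float>& first  = table.at( inputs.at(0) );
    const std::vector<float>& second = table.at( inputs.at(1) );

    // Call the function once per entry: first column -> first parameter,
    // second column -> second parameter.
    std::vector<float> result;
    for ( std::size_t i = 0; i < first.size(); ++i ) {
        result.push_back( func( first[i], second.at(i) ) );
    }

    table[new_name] = result;
    return table;
}
```

Calling `toy_define( table, "pt", pt_func, {"px","py"} )` hands the `px` values to `xmom` and the `py` values to `ymom` purely because of their order in the list. The names `xmom` and `ymom` never appear outside the function.
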

If you don't provide that list, you may get "function not found" error messages when you compile your program; the number of arguments in your function (like the two in `pt_func`) won't match the number of arguments assumed by the compiler (hundreds?).

This gives us a recipe:

- Define a function that returns a value; e.g.,

      float some_function( float value1, float value2, ... ) {
          // Lines of code that use value1, value2, ...
          // to calculate a result.
          return result;
      }

- Use that function in a `Define`, supplying the n-tuple columns to be passed to the function as a list of strings:

      auto new_dataframe = dataframe.Define("new_column",some_function,{"column1", "column2", ...});

If you're writing a function that will be called by `Filter`, the recipe is almost the same, except that the function has to return a [boolean](https://www.w3schools.com/cpp/cpp_booleans.asp) result (`true`, `false`). For example:

:::{code-block} c++
:caption: An example of a function that could be used as an argument to `Filter`

bool energy_cut( float energy ) {
    return energy < 145;
}
:::
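
To illustrate what the boolean result is used for, here is a toy version of a filter that needs only the standard library. `toy_filter` is a hypothetical helper, not a ROOT method; it keeps the values for which the cut function returns `true`, which is roughly the decision `Filter` makes for each entry.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Same cut as in the text: keep an entry only if its energy is below 145.
bool energy_cut( float energy ) {
    return energy < 145;
}

// Hypothetical helper, not part of ROOT: return only the values for which
// the cut function is true, in their original order.
std::vector<float> toy_filter( const std::vector<float>& energies,
                               bool (*cut)( float ) ) {
    std::vector<float> kept;
    std::copy_if( energies.begin(), energies.end(),
                  std::back_inserter( kept ), cut );
    return kept;
}
```

With RDataFrame, the call would follow the same pattern as `Define`, along the lines of `dataframe.Filter(energy_cut,{"ebeam"})`, assuming the n-tuple stores the energy in a `float` column named `ebeam`.
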