(fitting-histogram)=
# Walkthrough: Fitting to a Gaussian distribution **(10 minutes)**

I created a file with a couple of histograms in it for you to play with. Switch to your UNIX window and copy this file into your directory:[^f31]

    > cp ~seligman/root-class/histogram.root $PWD

Go back to your TBrowser window. (If you've quit ROOT, just start it again and start a new browser.) Click on the folder in the left-hand pane with the same name as your home directory. Double-click on `histogram.root`. You can see that I've created two histograms with the names `hist1` and `hist2`. Double-click on `hist1`; you may have to move or switch windows around, or click on the `Canvas 1` tab, to see the `c1` canvas displayed.

:::{admonition} What's a 'fit'?
:class: note
You can guess from the x-axis label that I created this histogram from a Gaussian distribution, but what were the parameters? In physics, to answer this question we typically perform a "fit" on the histogram: you assume a functional form that depends on one or more parameters, and then try to find the values of those parameters that make the function best fit the histogram.
:::

Right-click on the histogram and select {guilabel}`FitPanel`. Under {guilabel}`Fit Function`, make sure that {guilabel}`Predef-1D` is selected. Then make sure {guilabel}`gaus` is selected in the pop-up menu next to it, and {guilabel}`Chi-square` is selected in the {menuselection}`Fit Settings --> Method` pop-up menu. Click on {guilabel}`Fit` at the bottom of the panel. You'll see two changes: a function is drawn on top of the histogram, and the fit results are printed in the ROOT command window.

:::{admonition} What do these options mean?
:class: note
The {guilabel}`Fit Function` selects which mathematical function is going to be used to fit the histogram.
{guilabel}`Predef-1D` means that the function comes from one of ROOT's pre-defined one-dimensional math functions; as you will learn in just a bit, you can define functions of your own. {guilabel}`Chi-square` refers to a fitting method; for any fit that you're likely to do with the {guilabel}`FitPanel`, this is the method you'll use.
:::

(statistics-question)=
:::::{admonition} Fit results
:class: note
Interpreting fit results takes a bit of practice. Recall that a Gaussian has 3 parameters ($P_0$, $P_1$, and $P_2$); these are labeled "Constant", "Mean", and "Sigma" on the fit output. ROOT determined that the best value for the "Mean" was 5.98±0.03, and the best value for the "Sigma" was 2.43±0.02. Compare these with the Mean and RMS printed in the box in the upper right-hand corner of the histogram.

:::{admonition} Statistics questions
:class: tip
Why are these values almost the same as the results from the fit? Why aren't they identical?
:::
:::::

On the canvas, select {menuselection}`Options --> Fit Parameters`. You'll see the fit parameters displayed on the plot.

:::{admonition} Fit parameters
:class: note
As a general rule, whenever you do a fit, you want to show the fit parameters on the plot. They give you some idea of whether your "theory" (which is often some function) agrees with the "data" (the points on the plot).
:::

:::{figure-md} gaussian-fit-fig
:class: align-center

gaussian fit

The resulting plot should look something like this.
:::

As a check, click on {guilabel}`landau` (which vaguely resembles the plot in {numref}`Figure %s `) on the FitPanel's {guilabel}`Fit Function` pop-up menu and click on {guilabel}`Fit` again; then try {guilabel}`expo` and fit again.

:::{admonition} Working with FitPanel
:class: note
You may have to click on the {guilabel}`Fit` button more than once for the button to "pick up" the click.

Of these three choices (Gaussian, landau, exponential), the Gaussian looks like the best functional form for this histogram.
Take a look at the "Chi2 / ndf" value in the statistics box on the histogram ("Chi2 / ndf" is pronounced "kye-squared per [number of] degrees of freedom"). Do the fits again and observe how this number changes. Typically, you know you have a good fit if this ratio is about 1.[^f32]

The FitPanel is good for Gaussian distributions and other simple fits. But for fitting large numbers of histograms (as you'd do in the {ref}`advanced exercises` and the {ref}`expert exercises`) or for more complex functions, you'll want to learn the following ROOT commands.
:::

To fit `hist1` to a Gaussian, type the following command:[^f33]

    [] hist1->Fit("gaus")

This does the same thing as using the FitPanel. You can close the FitPanel; we won't be using it anymore.

:::{figure-md} mu-fig
:class: align-center

xkcd mu

by Randall Munroe. If your fit function looks like this, something has gone wrong. This would be a poor fit for your data.
:::

[^f31]: If you're going through this class and you can't log in to a system on the Nevis particle-physics Linux cluster, you'll have to get the files from [my web site](https://www.nevis.columbia.edu/~seligman/root-class/files/). If you want to get all the files from that directory at once, one way is to use this UNIX command:

        wget -r -np -nH --cut-dirs=2 -R "index.html*" \
          https://www.nevis.columbia.edu/~seligman/root-class/files/

    You may have to install the `wget` command on your system, since it's often not installed by default.

    Be aware that in that directory there are a lot of work files I created to test things. There's more in there than just the files I reference in my tutorials.

[^f32]: If you're not familiar with terms like "chi2" or "chi-squared", there's a brief introduction to {ref}`statistics ` in this tutorial.

[^f33]: What's the deal with the arrow "->" instead of the period? It's because when you read in a histogram from a file, you get a pointer instead of an object. This only matters in C++, not in Python.
See the section on {ref}`pointers ` for more information.
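
To make the "Chi2 / ndf" number from the walkthrough a little more concrete, here is a minimal sketch in plain C++ (no ROOT needed) of how such a value can be computed for a binned Gaussian. It rests on a few assumptions that are mine, not statements about ROOT's internals: the function is evaluated at each bin center, each bin's uncertainty is taken to be the square root of its content, and empty bins are skipped. The names `gaus_value`, `Chi2Result`, and `chi2_per_ndf` are hypothetical helpers invented for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Gaussian with the three parameters named in the fit output:
// constant * exp(-0.5 * ((x - mean) / sigma)^2).
double gaus_value(double x, double constant, double mean, double sigma) {
    const double z = (x - mean) / sigma;
    return constant * std::exp(-0.5 * z * z);
}

// Holds the chi-squared sum and the number of degrees of freedom.
struct Chi2Result {
    double chi2;
    int ndf;
};

// Sum of ((content - function) / error)^2 over the non-empty bins.
// Assumption: error^2 = content (simple counting uncertainty).
// ndf = (number of bins used) - (3 Gaussian parameters).
Chi2Result chi2_per_ndf(const std::vector<double>& centers,
                        const std::vector<double>& contents,
                        double constant, double mean, double sigma) {
    assert(centers.size() == contents.size());
    double chi2 = 0.0;
    int bins_used = 0;
    for (std::size_t i = 0; i < centers.size(); ++i) {
        if (contents[i] <= 0.0) {
            continue;  // skip empty bins: no usable uncertainty
        }
        const double residual =
            contents[i] - gaus_value(centers[i], constant, mean, sigma);
        chi2 += residual * residual / contents[i];
        ++bins_used;
    }
    return Chi2Result{chi2, bins_used - 3};
}
```

If every bin sits exactly on the curve, `chi2` is 0; five such bins leave 5 - 3 = 2 degrees of freedom. As bins scatter around the curve by roughly their uncertainties, `chi2 / ndf` moves toward 1, which is why a ratio near 1 is the usual sign of a reasonable fit.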