(why_git)=
# Why Git

:::{admonition} Intended audience
:class: note

If you're wondering why we need a [version-control
system](https://www.atlassian.com/git/tutorials/what-is-version-control)
in the work we do, this section tries to answer that. If you already
know why, you can skip this page and go to the next one.

:::

## Saving a file

Let's start with an example that's almost too simple: Assume you're
working on a document using a text editor. Every once in a while you
hit {menuselection}`File --> Save` to preserve your work. [^memory]$^{,}$[^often]$^{,}$[^emacs]

[^memory]: What you're actually doing is moving information from
    [volatile
    memory](https://www.geeksforgeeks.org/difference-between-volatile-memory-and-non-volatile-memory/)
    to non-volatile storage. If you didn't do this, you'd lose your work when your computer turns off. 

[^often]: Since I'm paranoid about losing my work, I save any file I'm
    working on about once every sentence (or couple of lines of code,
    depending on what I'm writing). For example, I saved the file
    you're reading now about once per footnote.

[^emacs]: Since I prefer the keyboard version of the {ref}`emacs
    <other-commands>` text editor, actually I'm typing **Ctrl-x Ctrl-s**
    every few seconds.

You can think of this as "progress-based" saving: You're saving your
progress as you go along so you don't lose your changes.

What if you reach a point in your writing when you realize that you've
made a mistake? Most editors have some form of {menuselection}`Edit -->
Undo` command. However, that feature usually only works for your
current editing session; if you quit your editor and restart it,
you lose the "undos".

## Saving versions of a file

So far, so good. Now let's think about a different approach to saving
your work: task-based saving.

Assume you've been working on a document to accomplish a certain
task. What counts as a "task" is up to you. All that matters is that it
represents a step in your work on the file.

Here are some random examples off the top of my head:

- You're writing an article, and you've added a new section, along
  with bibliographic references.

<p />

- You're working on {file}`Analyze.C` from this tutorial, and you've
  modified the code to generate histograms. You'd like to "bookmark"
  this change, before you move on to the next step in your work:
  writing those histograms to a file.

<p />

- You wrote a ROOT tutorial, and you're adding material on git,
  possibly with footnotes and xkcd cartoons.[^reason]

[^reason]: For some reason this last item is on my mind as I type these words. I wonder why?

    :::{figure-md} footnote_labyrinths-fig
    :align: center

    <img src="https://imgs.xkcd.com/comics/footnote_labyrinths.png" alt="xkcd footnote_labyrinths" width="50%">

    <https://xkcd.com/1208/> by Randall Munroe
    :::

If we're only talking about one file, a potential solution is to make
a copy of the file at these significant points, perhaps with a suffix
to indicate why you made the bookmark; e.g.,

```text
Analyze.C
Analyze-histograms.C
Analyze-write-histograms.C
Analyze-write-histograms-and-ntuple.C
Analyze-derive-pt.C
... and so on
```

This can become awkward[^atleast], but it works, at least for as long
as you're willing to come up with suffixes.
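
If you'd like to see what this copy-based bookmarking looks like in practice, here's a minimal sketch. It works in a scratch directory, and the file contents are stand-ins I made up; only the file names come from the list above.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"

# A stand-in for the real Analyze.C (the contents are made up).
printf 'void Analyze() {}\n' > Analyze.C

# "Bookmark" the current state by copying it under a descriptive name.
cp Analyze.C Analyze-histograms.C

# Keep working on the original...
printf '// next step: write the histograms to a file\n' >> Analyze.C

# ...and compare against the bookmark to see what changed since then.
# (diff exits with a non-zero status when files differ, hence '|| true'.)
diff Analyze-histograms.C Analyze.C || true
```

Note that it's up to you to remember which copy is which, and to run `diff` on the right pair of files.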

[^atleast]: ...at least in the way I've chosen to name the files. I'm
    sure you can think of a better scheme, at least for a single file.

What if there were a way to have some program keep track of these
"bookmarks" for you? With a way to track what changes you made, and
perhaps a way to revert to an earlier version if you wished?

You've already guessed that this is what git can do for you.

## Git management for one file

In the above example with {file}`Analyze.C`, what one might do instead is:[^main]

[^main]: If you follow some git guides, they'll suggest just using this command
    to initialize the repository:

        git init

    If you do this, the default name `master` will be used for the main branch. Many
    in the software community are trying to get away from using names like "master"
    to describe coding relationships. I agree with those reasons, so I'm going to
    encourage the use of the word `main` in these instructions.

```
# Once only: Set up the current directory to be managed by git
git init -b main

# Tell git to keep track of Analyze.C
git add Analyze.C

# Edit Analyze.C to do something different, then use a
# command like this. The argument to '-m' is your comment
# on your change.
git commit Analyze.C -m "create histograms"

# Do some more work on Analyze.C, then commit those additional changes:
git commit Analyze.C

# Because I've omitted the '-m' option, git will prompt me with
# an editor session to add a comment. One reason to do this is that the
# comment can be several lines long.

# Make more changes to Analyze.C, then:
git commit Analyze.C -m "derive pt and include it in the ntuple"
```
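
To try this sequence without touching any real work, here's a minimal sketch that runs in a scratch directory. The file contents and the name and e-mail settings are placeholders I made up; the git commands are the same ones as in the example above, plus `git log --oneline` to list the resulting "bookmarks".

```shell
# Create a scratch directory, so we don't turn a real directory
# into a git repository.
scratch=$(mktemp -d)
cd "$scratch"

git init -q -b main

# git wants to know who is making the commits; these are placeholders.
git config user.name "Example User"
git config user.email "user@example.com"

# A stand-in for Analyze.C; the contents are made up.
printf 'void Analyze() {}\n' > Analyze.C
git add Analyze.C
git commit -q Analyze.C -m "create histograms"

# Simulate more work on the file, then make a second "bookmark".
printf '// derive pt\n' >> Analyze.C
git commit -q Analyze.C -m "derive pt and include it in the ntuple"

# List the bookmarks, newest first, one line per commit.
git log --oneline
```

Each line of that last command's output starts with a short ID that git assigned to the commit, followed by the comment you supplied with `-m`.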

## Git management for many files

This might seem like overkill for a single file. But in a real project, you
might want to keep track of changes you made to many files in a coordinated way:

- You might write a physics paper with each section in its own file,
  to make it easier for different people to review only those sections
  relevant to their work. These sections might have a shared
  bibliography file, so as each person adds additional references you
  (or they) would add lines to both their section and to the
  bibliography.

<p />

- You're working on analysis code that requires several programs to
  function. There's an example of this in {ref}`friends`.

  Another example comes from my own work: I have a program that requires a
  code file, a file of options to control its analysis logic, and a
  shell script to execute it.

<p />

- To take another completely random example that comes to mind, suppose
  I'm adding a section on using git to a ROOT tutorial's appendices. I'd
  have to change the file that defines the {ref}`appendix table of
  contents <appendices>`, include a new {ref}`file with the table of
  contents of the git sub-sections <git>`, and a file for each
  sub-section that explains an aspect of git (for example, a page like
  the one you're reading now).[^source]

[^source]: If you're interested in what the original files that I
    create look like, click on "View page source" in the upper
    right-hand corner of any of the pages in this tutorial. If you do,
    you'll get an idea of how many files I updated as I
    added the topic of git to the tutorial.

This is where a version-control system like git really shines: It
can track all the changes to all the files you've told it to control.

Let's set up a simplified example of using git to manage multiple files
in a project. I'll use the files from the {ref}`condor-tutorial` in
another one of the appendices.[^dontexpect]

[^dontexpect]: No, I don't expect you to read through the description
    of batch systems if that topic has no interest for you. It's
    convenient for me to use these files because they have a
    clear relationship to a single purpose. You don't have to know
    exactly what they do.

Let's assume that you followed the directions on {ref}`that page <condor-tutorial>`
and have these files in a directory.

```
condor-example.cmd
condor-example.sh
condor-example.py
```

For the sake of simplicity let's assume they're the _only_ files in that directory.

You could set up that directory to use git for version control:

    git init -b main

Now we tell git which files in that directory we want to manage:

    git add condor-example.py
    git add condor-example.sh
    git add condor-example.cmd

Note that we can use UNIX's command-line tools and do this in a single line:

    git add condor-example.py condor-example.sh condor-example.cmd

Or even:[^period]

    git add condor-example.*
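
Whichever form of `git add` you use, one way to check what git is now keeping track of is `git status --short`. Here's a minimal sketch in a scratch directory; the three files are empty stand-ins for the real ones, and `scratch.tmp` is a made-up name for a temporary work file.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q -b main

# Empty stand-ins for the three tutorial files, plus a temporary
# work file that we don't want git to manage.
touch condor-example.cmd condor-example.sh condor-example.py scratch.tmp

git add condor-example.*

# Newly-added files are marked 'A'; files that git isn't managing
# are marked '??'.
git status --short
```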

[^period]: The advantage of having these be the only files in your git-managed directory
    is that you could do this:

        git add .

    This means to have git manage _all_ the files in that directory.

    I don't like to use this form of the command. That's because when
    I do my work, I usually create a bunch of temporary work files
    that I don't want git to manage. I normally do a separate `git
    add` for each file I want git to control.

That trio of files defines a batch program that reads two
parameters on the command line: the mean of a gaussian, and the output
file to which to write a histogram of that gaussian
distribution.

Assume for the sake of the example that you want to add a third
argument: the standard deviation of that gaussian. That requires
making a coordinated change to all three files. As you worked on this,
you could "bookmark" your work with:

    git commit condor-example.py condor-example.sh condor-example.cmd -m "add a width argument"

Except that no one ever types something like that. They use the
following command; the `-a` means to commit changes to _all the files
that git manages_ in this directory and any sub-directories:

    git commit -a -m "add a width argument"
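
To make "files that git manages" concrete, here's a minimal sketch in a scratch directory. The file contents, the placeholder identity, and the work file `notes.tmp` are all made up for the illustration. The point is that `-a` picks up changes to files that were previously added, but not a file that was never given to `git add`.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q -b main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Stand-ins for the three tutorial files, committed once so that
# git is managing them.
touch condor-example.cmd condor-example.sh condor-example.py
git add condor-example.*
git commit -q -m "initial version"

# Change two managed files, and create a work file that was
# never passed to 'git add'.
echo "# width argument" >> condor-example.py
echo "# width argument" >> condor-example.sh
echo "scribbles" > notes.tmp

git commit -q -a -m "add a width argument"

# The files that changed in that commit: the two managed files we edited.
git diff --name-only HEAD~1 HEAD

# What's left over: notes.tmp is still unmanaged, marked with '??'.
git status --short
```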

## Simple version control

We've learned how to save versions of our files as we work on
them. How do we make use of this feature?

There are a large number of possibilities. I'll leave a
description of most of them to the [Git book](https://git-scm.com/book/). Here
are a few commands to get you started:

```
# To look over all commits I made:
git log

# An abbreviated listing of the previous command.
# If you want to revert back to an earlier commit, this
# output is more convenient.
git log --oneline

# I've made changes, then realize that I don't like any of the work I
# did. I can revert back to the files as they were at the last commit.
# Careful: this throws away all uncommitted changes to managed files.
git reset --hard HEAD

# I want to revert back to a specific commit, perhaps just to look at
# it.  Assume that 'git log --oneline' tells me that it's the commit
# with the ID 62ceb81.
git checkout 62ceb81

# If I want to make changes to this commit and make it the new starting
# point for my work, I create a new branch there ('rework' is just a
# placeholder name):
git switch -c rework

```
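
Here's a minimal sketch of visiting an earlier commit and then coming back, run in a scratch directory. The file `notes.txt`, its contents, and the placeholder identity are all made up for the illustration; instead of copying a commit ID by hand, the sketch asks git for the ID of the commit before the latest one.

```shell
scratch=$(mktemp -d)
cd "$scratch"
git init -q -b main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"   # placeholder identity

# Two commits to a made-up file, so there's some history to look at.
echo "first version" > notes.txt
git add notes.txt
git commit -q -m "first version"
echo "second version" > notes.txt
git commit -q -a -m "second version"

# Look up the short ID of the earlier commit.
old=$(git rev-parse --short HEAD~1)

# Visit that commit; the file goes back to how it was then.
git checkout -q "$old"
cat notes.txt        # prints "first version"

# Return to the latest commit on main.
git checkout -q main
cat notes.txt        # prints "second version"
```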

:::{admonition} I'm a lying liar who lies
:class: warning

There's a problem with all of the git examples above: I lied. For the
work that you're likely to do with git, you won't be using the `git
init` command; you'll do something more complex instead.

I'll get to that in a subsequent page. But my conscience demands that I
admit my lie before you rush off to your home directory and type `git
init`. You _don't_ want to do that.

Let me discuss a couple of other git-related topics first.  Then I'll
get back to my lie.

:::