Motivation¶
Greg Wilson’s excellent Script for Introduction to Version Control provides a detailed motivation why you will benefit greatly from using version control systems. Here follows a shorter motivation and a quick overview of the basic concepts.
Why not Dropbox or Google Drive?¶
The simplest services for hosting project files are Dropbox and Google Drive. It is very easy to get started with these systems, and they allow you to share files among laptops and mobile units with as many users as you want. The systems offer a kind of version control in that the files are stored frequently (several times per minute), and you can go back to previous versions for the last 30 days. However, it is challenging to find the right version from the past when there are so many of them and when the different versions are not annotated with sensible comments. Another deficiency of Dropbox and Google Drive is that they sync all your files in a folder, a feature you clearly do not want if there are many large files (simulation data, visualizations, movies, binaries from compilations, temporary scratch files, automatically generated copies) that can easily be regenerated.
However, the most serious problem with Dropbox and Google Drive arises when several people edit files simultaneously: it can be difficult detect who did what when, roll back to previous versions, and to manually merge the edits when these are incompatible. Then one needs more sophisticated tools, which means a true version control system. The following text aims at providing you with the minimum information to started with Git, the leading version control system, combined with project hosting services for file storage.
Repositories and local copies¶
The mentioned services host all your files in a specific project in what is known as a repository, or repo for short. When a copy of the files are wanted on a certain computer, one clones the repository on that computer. This creates a local copy of the files. Now files can be edited, new ones can be added, and files can be deleted. These changes are then brought back to the repository. If users at different computers synchronize their files frequently with the repository, most modern version control systems will be able to merge changes in files that have been edited simultaneously on different computers. This is perhaps one of the most useful features of project hosting services. However, the merge functionality clearly works best for pure text files and less well for binary files, such as PDF files, MS Word or Excel documents, and OpenOffice documents.
Installing Git¶
The installation of Git on various systems is described on the Git website under the Download section. Git involves compiled code so it is most convenient to download a precompiled binary version of the software on Windows, Mac and other Linux computers. On Ubuntu or any Debian-based system the relevant installation command is
Terminal> sudo apt-get install git gitk git-doc
This tutorial explains Git interaction through command-line applications in a terminal window. There are numerous graphical user interfaces to Git. Three examples are git-cola, TortoiseGit, and SourceTree.
Configuring Git¶
Make a file .gitconfig
in your home directory with information on
your full name, email address, your favorite text editor, and
the name of an “excludes file” which defines the file types that Git
should omit when bringing new directories under version control.
Here is a simplified version of the author’s .gitconfig
file:
[user]
name = Hans Petter Langtangen
email = hpl@simula.no
editor = emacs
[core]
excludesfile = ~/.gitignore
The excludesfile
variable is important: it points to a
file called .gitignore
, which must list,
using the Unix Shell Wildcard notation,
the type of files that you do not need to have under version control,
because they represent garbage or temporary information,
or they can easily be regenerated from
some other source files. A suggested .gitignore file looks like
# compiled files:
*.o
*.so
*.a
# temporary files:
*.bak
*.swp
*~
.*~
*.old
tmp*
.tmp*
temp*
.#*
\#*
# tex files:
*.log
*.dvi
*.aux
*.blg
*.idx
*.nav
*.out
*.toc
*.snm
*.vrb
*.bbl
*.ilg
*.ind
*.loe
# eclipse files:
*.cproject
*.project
# misc:
.DS_Store
Carefully judge what files to bring under version control
You should be critical to what kind of files you really need a full
history of. For example, you do not want to populate the repository
with big graphics files of the type that can easily be regenerated by
some program. The suggested .gitignore
file above lists typical
files that are not needed (usually because they are automatically
generated by some program).
In addition to a default .gitignore
file in your home directory, it
may be wise to have a .gitignore
file tailored your repo in the
root directory of the repo.
Large data files, even when you want to version them, fill up your repo and should be taken care of through the special service Git Large File Storage.