Getting started with Git

Git Going

© Lead Image © magiceyes, 123RF.com

Article from Issue 176/2015

Git is more than a version control system. We'll show you how to get started with this powerful creation of Linus Torvalds and the kernel developers.

What is Git? A nightmare? An idiot? It was Monty Python's Flying Circus that made the word, and later the project, well known. After the Vikings sang "Spam, Spam, Spam," another sketch, this time about a Mr. and Mrs. Git, went viral. But it was not just the idiot that inspired Linus Torvalds to choose the three-letter word for his version control and source code management system [1].

Git is close to "get" and follows the tradition of short and easy-to-remember, but pretty universal, Unix commands. Torvalds always meant Git to be simple, stupid, contemptible, and despicable. Take your own pick from the dictionary of slang to find the dominant meaning. Some definitions call Git a Global Information Tracker; others say it's just a combination of curse words ("goddamn idiotic truckload of s*!"); however, these are rumors and probably only apply when it breaks.

Another Finnish Invention

In 2005, BitKeeper, the Linux kernel developers' favorite source code management system, changed its license, and the crew around Linus had to find an alternative. When he couldn't find another tool that met his standards, Linus decided to create a new code management solution from scratch. The goals for Git were simple: It should be fast, have a simple design, and have strong support for non-linear development (i.e., thousands of parallel branches). Linus wanted a fully distributed system that could handle very large projects – like the Linux kernel.

After 10 years, Git has become a de facto standard among developers (Figure 1). At SUSE, most of the developers work with Git, and most of the upstream projects also rely on Git, which makes collaboration a lot easier.

Figure 1: The man page that pops up after you type git help glossary is very informative.

With Git, everything became a little easier for the documentation team at SUSE. With the website GitHub [2] (Figure 2) behind the local systems, the team is not only able to manage its own documentation projects but is hoping for far more contributions in the future – not only from developers who are already using Git and GitHub, but also from other interested people. "The website makes collaboration easier, while Git offers lots of advantages as a distributed project," says SUSE technical editor Thomas Schraitle.

Figure 2: GitHub is a company, a website, and a widespread platform for developers using Git as their version control system.

What Does Git Do?

Linus Torvalds has described Git as just a stupid content tracker that keeps track of files and folders. According to Torvalds, "I really, really designed it coming at the problem from the viewpoint of a filesystem person. I actually have absolutely zero interest in creating a traditional SCM system" [3]. Where Subversion sees the data as a list of changes (check-ins) over time, Git takes the data as a set of snapshots.

You'll find more details about Git and how it works in the project's reference documentation [4]. Git lets the individual developer work independently and locally, experimenting with changes until it is safe to commit the changes back to the Git repository. Git users always work in a working directory. When changes are made, they are first saved to a staging area. A commit command then incorporates the data from the staging area into the official Git repository (i.e., the .git directory).
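
The path from working directory to staging area to repository can be tried out in a scratch project. The following is a minimal sketch, not a listing from the article; the directory, file name, and identity values are placeholders chosen for illustration.

```shell
# Hypothetical scratch project; every name here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Identity for this repository only, so the commit below succeeds
# even if no global name and email are configured.
git config user.name "Example User"
git config user.email "user@example.com"

# 1. Working directory: create or edit files as usual.
echo "Hello, Git" > README.txt

# 2. Staging area: select the changes that belong in the next commit.
git add README.txt
git status --short        # an "A" in the first column marks a staged new file

# 3. Repository: record the staged snapshot in the .git directory.
git commit -q -m "Add README"
git log --oneline
```

Until git commit runs, nothing in the project history changes, which is what makes local experimenting safe.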

Behind the scenes, Git is carefully organized to manage the project efficiently and reliably. As I mentioned previously, Git is best envisioned as a series of snapshots. Each snapshot defines the state of the project at a single point in time when a commit operation occurred. Conceptually, you can think of a snapshot as a complete copy of the project, but in fact, files that haven't changed are referenced as pointers back to the previous snapshot. Because the snapshot is created through a commit command, the snapshot itself is commonly referred to as a "commit."
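
One way to watch the snapshot model at work is to compare the file lists of two commits. The sketch below assumes a throwaway repository with two hypothetical files, only one of which changes between commits; the file names and identity values are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# First snapshot: two files.
echo "stable content" > unchanged.txt
echo "version 1" > changing.txt
git add .
git commit -q -m "First snapshot"

# Second snapshot: only one file is modified.
echo "version 2" > changing.txt
git commit -q -am "Second snapshot"

# Each commit lists every file in the project together with an object ID.
git ls-tree HEAD~1
git ls-tree HEAD

# The ID of unchanged.txt is identical in both snapshots, so its content
# is stored once and referenced twice.
old_id=$(git rev-parse HEAD~1:unchanged.txt)
new_id=$(git rev-parse HEAD:unchanged.txt)
echo "unchanged.txt: $old_id -> $new_id"
```

Both commits describe the complete project, yet only changing.txt receives a new object ID in the second one.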

The line of development is thus represented as a chain of commits (Figure 3). This linear chain of snapshots is known as a branch. The main branch of the project is called the master.

Figure 3: A Git project consists of a series of snapshots (called "commits"). The user's working directory normally references the latest snapshot in the branch. Changes are saved to a staging area, then integrated into a new snapshot through a commit operation.

Suppose you want to make a major revision to your project. You want to be able to tinker with the code – make changes and test these changes – without affecting the stable codebase. In that case, you can create a new branch of the project for your testing and tinkering. The new branch begins a new, separate history of commits starting at the branch point (Fork in Figure 4). The user can switch between branches (and switch between commits within a branch) using Git commands. Git uses the term HEAD to refer to the snapshot and branch where the user's working directory is currently pointing. (The user's working directory and staging area typically point to the last commit within the active branch, although it is possible to use a "detached HEAD," which points to any arbitrary snapshot in the history of the project.)

Figure 4: A new branch allows the code to evolve separately from the master. When work on the revision is complete, a merge operation integrates the changes with the main code base.

In conventional revision control systems, a "checkout" operation copies and locks the files, as if you were checking out a book at the library. In Git, "checkout" simply means you update the working directory to reference a different snapshot.
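
Branching, HEAD, and checkout can be sketched together in a few commands. This is an illustration under assumptions, not a listing from the article: the branch name experiment and the file app.txt are hypothetical, and the first branch is explicitly named master because newer Git installations may use a different default.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, as in the article
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Stable code"

# Start a branch for experiments and switch to it.
git branch experiment
git checkout -q experiment
echo "risky idea" >> app.txt
git commit -q -am "Try something new"

# HEAD names the branch the working directory currently follows.
git symbolic-ref --short HEAD        # prints: experiment

# Checking out master points the working directory at the stable snapshot again.
git checkout -q master
cat app.txt                          # prints: stable

# Checking out a commit instead of a branch results in a detached HEAD.
git checkout -q --detach experiment
git rev-parse --abbrev-ref HEAD      # prints: HEAD
```

Nothing is locked or copied away during these checkouts; only the snapshot that the working directory references changes.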

A separate branch can evolve independently of the master. A whole team can work on a branch and perfect it for months, or even years, with numerous commits and hundreds of new files. At some point, when the code is stable and all new features are added, the branch can be merged back into the master branch. For a large project, the merging process requires many rules and steps handled within the software, as well as decisions by the maintainer. As you can imagine, whoever has the authority to merge has the ultimate control over the project. In the case of the Linux kernel, Linus Torvalds himself retains the authority to merge code back into the master branch, and much of his work for the Linux Foundation consists of evaluating code from other branches to determine if it is ready to merge back into the main branch.
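
On a very small scale, merging a branch back into the master might look like the following sketch. The branch name feature and the file names are assumptions made for the example; real projects add review and testing steps that are not shown here.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, as in the article
git config user.name "Example User"
git config user.email "user@example.com"

echo "core" > core.txt
git add core.txt
git commit -q -m "Initial release"

# Work happens on a separate branch ...
git checkout -q -b feature
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# ... while the master moves on independently.
git checkout -q master
echo "fix" >> core.txt
git commit -q -am "Fix on master"

# Merging creates a new commit on master that has both histories as parents.
git merge -q -m "Merge feature into master" feature
git log --oneline --graph
```

Because the two branches touched different files, Git combines them without asking for help; overlapping changes would stop the merge and leave the decision to the maintainer.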

Figure 5 shows a typical Git development process. The master branch (on the far right – in blue) represents the official release history of the product. A team of developers wants to start working on upgrading the code, so they start the develop branch (in yellow). A development branch typically uses frequent commits, where new changes are integrated, tested, and revised. Separate feature branches allow small teams of developers to work on specific new features. When the feature is finished, it is merged back into the development branch.

Figure 5: A typical Git development scenario using the Gitflow workflow.

At some point, when the new features are successfully integrated and the development branch is looking stable, the team will use a snapshot of the development branch to start a release branch for final testing (in green). The release branch typically focuses on bug hunting and minor repairs, with a freeze on any new features (as you could probably guess, this phase is typically equivalent to a beta or release candidate version of the project). When the release branch is ready, it is merged back into the master branch, an event that is known to the world as a "new release" of the software. In this case, the release branch is also merged back into the development branch, so the development branch will gain the benefit of the pre-release bug fixing.

As you also see in Figure 5, a maintainer who wants to fix a specific severe problem within the master branch (such as a security flaw) also has the option to launch a short-term hotfix branch, to fix the specific problem, then merge the changes back to the master. (As you will learn later in this article, the workflow described in Figure 5 is known as the Gitflow workflow.)
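
The branch structure in Figure 5 can be approximated with plain Git commands. The following is a rough sketch under assumptions: the branch names (develop, feature/login, release/1.1, hotfix/security), the file names, and the save helper function are all hypothetical, and --no-ff is used only so that every merge stays visible as its own commit in the history.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master, as in the article
git config user.name "Example User"
git config user.email "user@example.com"

# Hypothetical helper (not part of Git): append a line to a file and commit it.
save() {
    echo "$2" >> "$1"
    git add "$1"
    git commit -q -m "$2"
}

save app.txt "Release 1.0"                  # master: official release history

git checkout -q -b develop                  # development branch
git checkout -q -b feature/login develop    # feature branch starts from develop
save login.txt "Add login feature"
git checkout -q develop
git merge -q --no-ff -m "Merge feature/login into develop" feature/login

git checkout -q -b release/1.1 develop      # release branch: bug fixes only
save app.txt "Fix bug found in testing"
git checkout -q master
git merge -q --no-ff -m "Release 1.1" release/1.1
git checkout -q develop
git merge -q --no-ff -m "Merge release/1.1 back into develop" release/1.1

git checkout -q -b hotfix/security master   # short-term hotfix branch
save app.txt "Patch security flaw"
git checkout -q master
git merge -q --no-ff -m "Merge hotfix/security into master" hotfix/security

git log --oneline --graph --all
```

The final log command draws the branches side by side, which makes it easy to compare the result with the figure.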

Getting Started

Git resides in the standard package repository for most Linux distributions. The commands for installing Git through Zypper are shown in Listing 1. If you use another package tool, see the project documentation for your own Linux distro.

Listing 1

Installing Git and Other Tools

 

Git uses several configuration files: Whereas the system-wide /etc/gitconfig might not be available on all distros, the user's ~/.gitconfig or ~/.config/git/config files contain settings affecting all the user's Git projects.

Probably the most convenient way to add, change, or delete Git configuration settings is with the git config command, followed by --system, --global, or --local to select which configuration file is changed. For example, the first three lines in Listing 2 change your personal data and add a GPG key (which will be used to sign commits). The next three lines define a standard editor, a pager, and a color scheme (auto is the default). The next eight lines use the following form to create aliases:

--global alias.<shortcut> "<command>"

Listing 2

Configuration Settings

 

If you're used to working with the command line, you know how handy aliases can be. Git aliases are used in addition to Bash aliases. You can check the Git aliases that have been set up with the command in the final line of the listing.
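
Settings of the kind described above might look like the following sketch. It is not the article's listing: the name, email address, key ID, editor, and the three aliases are placeholder assumptions. The first two lines redirect HOME to a scratch directory so that --global writes to a throwaway ~/.gitconfig.

```shell
# Sandbox: point HOME at a scratch directory so --global does not touch
# your real ~/.gitconfig. Remove these two lines to change your own settings.
export HOME=$(mktemp -d)
unset XDG_CONFIG_HOME

# Personal data (placeholder values) and the key used to sign commits.
git config --global user.name "Example User"
git config --global user.email "user@example.com"
git config --global user.signingkey "0123456789ABCDEF"

# Editor, pager, and color output.
git config --global core.editor "vim"
git config --global core.pager "less"
git config --global color.ui auto

# Aliases follow the form: --global alias.<shortcut> "<command>"
git config --global alias.st "status"
git config --global alias.ci "commit"
git config --global alias.lg "log --oneline --graph"

# List the aliases that are now defined.
git config --global --get-regexp '^alias\.'
```

With these aliases in place, git st runs git status and git lg prints a compact history graph.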

A handy Bash script will make your Git prompt more visual and informative. If you are an expert in Git already, you can fetch the helpful script using git clone (Listing 3; Figure 6). Once installed, load the script with the source command or its shortcut, a single dot (.).

Listing 3

Getting a Nicer Prompt

 

Figure 6: Use bash-git-prompt for a nicer prompt; the source command activates it, git init initializes a directory, and bash-git-prompt gives interactive information.
