Installing software the Debian way
Apt Mastery
Dependency tangles fall away with the Debian package system.
Rarely do you hear about "dependency hell" any more, the term used when the installation of an application failed because it needed a development library that wasn't installed and when each library installed could thrust you further into a proliferation of non-installed libraries, leaving you a whimpering mass of frustration.
Fortunately, this scenario is largely a thing of the past, thanks to the Debian dpkg and apt-get package system. More than a decade ago, Debian developers hit on the idea of installing software arranged in packages of files that included scripts to grab missing libraries automatically and configure the software. The idea was eventually copied by other package systems, so now you generally encounter dependency hell only when trying to install unpackaged applications still in development.
The Debian package system is used by many of the most popular GNU/Linux distributions. Technically, dpkg is the tool that manages the software, and apt-get is the tool through which users interact with dpkg most of the time. Together, they install software from the online repositories listed in /etc/apt/sources.list.
Additionally, you can create /etc/apt/apt.conf, the file that configures apt-get in more detail. However, most people prefer to control apt-get through sub-commands and options. The basics of these commands are easy to learn, whether you want to add or remove software or maintain your system. Also, you'll find that a whole sub-system of related utilities has grown up around apt-get to increase your control.
Adding and Removing
Unlike most commands, apt-get consists of three parts separated by spaces: the basic apt-get command, a sub-command, and the packages involved. For example, to install the Pysol solitaire game, you would type apt-get (the basic command), followed by install (the sub-command) and pysol (the name of the package).
A sub-command must always be present, but for some purposes – especially for maintenance – you do not need a specific list of packages. To include multiple packages, either list the packages separated by a space or use regular expressions, such as the asterisk, although doing so can make troubleshooting harder and sometimes lead to unforeseen results.
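The three-part structure can be sketched as a small shell function. The helper name apt_command is made up for this illustration; it only prints the command line it assembles and never runs apt-get.

```shell
#!/bin/sh
# Hypothetical helper: print the apt-get command line for a sub-command
# and an optional list of packages. Nothing is executed.
apt_command() {
    subcommand=$1
    shift
    if [ "$#" -gt 0 ]; then
        # "$*" joins the remaining arguments (the packages) with spaces.
        echo "apt-get $subcommand $*"
    else
        echo "apt-get $subcommand"
    fi
}

apt_command install pysol           # one package
apt_command install pysol digikam   # several packages, separated by spaces
apt_command clean                   # maintenance: no package list needed
```

To run a printed command for real, type it at a prompt as root or through sudo.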
Instead, if you need to install multiple packages, you are probably better off looking for a metapackage. A metapackage is a dummy package meant to simplify the installation of large applications that are split into more than one package. For instance, in Debian, kde-minimal installs the smallest number of packages needed to run the KDE desktop. To see whether a metapackage exists for your purposes, search online in your distribution's repositories; if all else fails, guess its name and see whether you are successful. As long as you exercise some common sense, guessing a metapackage's name is unlikely to produce any results that you can't uninstall as easily as you installed them.
The basic command for adding or upgrading a software package is simply apt-get install package-name. As soon as you enter the command, you usually get a complete summary of what will happen if you go through with the installation, including the dependencies that will be installed, the packages that will be upgraded and removed, and the amount of disk space that will be required. Unless the action can proceed automatically without affecting anything else, you then have the choice to continue the process or not (Figure 1). Usually, you should read the summary carefully before continuing, just to be sure that what you typed doesn't include any unpleasant surprises.
If you are using a non-standard online repository, it might not be verified automatically as a valid source. When that happens, you should only continue if you are absolutely sure that you can trust the repository.
As apt-get works, it shows which package is downloading and its progress, as well as the download speed and the amount of time required to finish the operation. The times are only estimates and will change as the Internet connection speed changes. Once the downloads are complete, apt-get installs the software, sometimes pausing to ask questions about how you want it installed. After everything is done, apt-get then exits with a summary of any problems that it encountered, if necessary. As a final touch, the software you just installed is added to desktop menus.
The basic command for installing software can be modified with a number of options. For example, you might want to use -s to simulate the installation without actually doing anything, just to make sure you uncover any problems before the real installation. If the installation reports any problems, you can run the command again, this time with the -f option, in the hope that apt-get can intelligently provide a solution to the problem, or with -m to skip any packages that cannot be retrieved, in the hope that you will get results that you can live with. If you don't want to answer questions about the installation, you can use -y to have all questions answered with "yes" – a dangerous option that you should avoid if you don't know what you are doing. If you want to re-install the same version of a piece of software, use --reinstall.
However, perhaps the most useful option for installation is -t repository, which allows you to specify the online repository from which you want to install the packages and all their dependencies. This option is especially useful in Debian, whose main repositories – stable, testing, and unstable – describe the state of the software. For instance, if you want the very latest version of Gnome, even if it has not been tested, you might enter apt-get -t unstable install gnome-desktop-environment. Similarly, in other Debian-based distributions, you might have added a development branch of the software to your repositories or a privately developed version of software that you only want to use occasionally. With this option, you can also downgrade a package when the most recent version is buggy or not working.
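One cautious way to combine these options is to simulate first and install only if the simulation succeeds. The following is a minimal sketch, not a standard tool: the function name checked_install and the APT variable are inventions for this example.

```shell
#!/bin/sh
# APT is an assumption of this sketch. It defaults to apt-get but can be
# set to something else (for example "sudo apt-get") before calling.
APT=${APT:-apt-get}

# Hypothetical helper: simulate with -s, then install only on success.
checked_install() {
    if $APT -s install "$@"; then
        # The simulation found no problems, so run the real installation.
        $APT install "$@"
    else
        echo "simulation failed; nothing installed" >&2
        return 1
    fi
}
```

After loading the function as root, checked_install pysol would print the simulated actions and then start the real installation, which still asks for confirmation when other packages are affected.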
Alternatively, you could add /repository to the end of a package name. However, this form takes only the specified packages from the specified source, while their dependencies are still chosen by the usual rules, so it might not work as well.
If you are an expert, you could also download a single package to your hard drive for installation. In that case, you would go directly to dpkg. For example, if you downloaded a development version of the digiKam image manager, you could install it by changing to the directory containing the package and entering dpkg -i followed by the full file name of the downloaded .deb package.
For other sub-commands, the command structure is the same; only the sub-command changes. Even the available options are the same, although some might not make sense with every sub-command. The remove sub-command uninstalls software but leaves its configuration files in place, whereas the purge sub-command removes the configuration files as well (neither, however, removes dependencies, which is why you might need to run some of the maintenance sub-commands described below). If you want to upgrade every package on your computer, you can use the upgrade sub-command rather than entering every package individually. It never installs new packages or removes existing ones, so any package whose upgrade would require such a change is held back. To apply those upgrades as well, you can use dist-upgrade, which installs or removes packages as changed dependencies require.
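The two ways of naming a repository look like this side by side. The run wrapper is an assumption of this sketch: by default it only prints each command, so you can try the script safely.

```shell
#!/bin/sh
# Hypothetical wrapper: print commands by default; set DRY_RUN=0 (as root)
# to execute them instead.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Take the package and its dependencies from unstable.
run apt-get -t unstable install gnome-desktop-environment

# Take only the named package from unstable.
run apt-get install gnome-desktop-environment/unstable
```

Both forms assume that unstable is already listed in /etc/apt/sources.list; otherwise, apt-get has no such repository to choose from.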
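Because dpkg does not fetch dependencies on its own, a local installation is often followed by apt-get with the -f option. Here is a minimal sketch of that pairing; the function name install_local_deb and the DPKG and APT variables are made up for the example.

```shell
#!/bin/sh
# DPKG and APT are assumptions of this sketch; they default to the real
# tools but can be replaced with stand-ins for a trial run.
DPKG=${DPKG:-dpkg}
APT=${APT:-apt-get}

# Hypothetical helper: install a package file that is already on disk.
install_local_deb() {
    if [ ! -f "$1" ]; then
        echo "no such file: $1" >&2
        return 1
    fi
    # If dpkg stops, typically because dependencies are missing,
    # apt-get -f install tries to supply them and finish the job.
    $DPKG -i "$1" || $APT -f install
}
```

You would call the function as root with the path to the downloaded .deb file as its only argument.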
Most people use the Debian package system to install pre-compiled binary files. However, if you want to ensure that all your software runs as efficiently as possible on your system, you can use the source sub-command to download source packages and the -b option to compile them on your computer. If the source requires dependencies, you can use the build-dep sub-command to provide them. Note, however, that compiling source packages can take considerable time, particularly with a large application – perhaps even a matter of hours with an application like OpenOffice.org.
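The two steps of a source build can be chained so that compilation starts only when the build dependencies are in place. This is a sketch under assumptions: build_from_source and the APT variable are invented names, and your sources.list must include a repository of source packages.

```shell
#!/bin/sh
# APT is an assumption of this sketch; it defaults to apt-get.
APT=${APT:-apt-get}

# Hypothetical helper: fetch a source package and compile it.
build_from_source() {
    # Install the packages needed to compile this one (requires root).
    $APT build-dep "$1" || return 1
    # Download the source into the current directory and build it.
    $APT source -b "$1"
}
```

Run it from an empty working directory, because the downloaded source and the resulting package files are written to wherever you happen to be.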
Maintaining Software
To help with these basic operations, dpkg and apt-get include a number of utilities. When you run into difficulties and are seeking information, the dpkg-query command can give you detailed information about the packages involved. For example, if you type dpkg-query -p kdepim, you receive a description of the package that lists contact information for the developers who maintain it, the package's dependencies, size, and description, as well as the homepage for the development team (Figure 2). Similarly, you can use the -s option to determine the status of a package or -L to see a list of all the files included in the application's package. All this information can be invaluable if you run into trouble, regardless of whether you want to solve the problem yourself or find someone to help you.
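The status report from dpkg-query -s is easy to use in scripts, because an installed package carries the line "Status: install ok installed". The following sketch tests for that line; the function name is_installed and the DPKG_QUERY variable are assumptions made for the example.

```shell
#!/bin/sh
# DPKG_QUERY is an assumption of this sketch; it defaults to dpkg-query.
DPKG_QUERY=${DPKG_QUERY:-dpkg-query}

# Hypothetical helper: succeed if the named package is fully installed.
is_installed() {
    # Errors for unknown packages are discarded; grep decides the result.
    $DPKG_QUERY -s "$1" 2>/dev/null | grep -q '^Status: install ok installed'
}
```

For example, is_installed kdepim && dpkg-query -L kdepim lists the package's files only if the package is actually on the system.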
The apt-get command includes several other utilities in the form of sub-commands that are issued without referring to any packages. Just as you might use fsck to investigate and repair the structure of a filesystem, you can use apt-get check to ensure that the package system is working properly.
The more you install and uninstall, the more regularly you should consider running apt-get with the clean and autoclean sub-commands. The clean sub-command removes all the downloaded package files that apt-get keeps in its local cache after installation, and autoclean removes only those cached package files that can no longer be downloaded from the repositories. Neither touches the installed software, so by running either one occasionally, you can free up extra space on your hard drive without affecting the system.
Another useful maintenance sub-command for apt-get is autoremove, which removes orphaned packages (i.e., ones that serve no purpose because they were added as dependencies for an application that you have since removed). Because these orphans do nothing but fill space on your hard drive, you might as well remove them. Unlike the leftovers handled by clean and autoclean, orphans are tracked by the Debian package system, which will remind you that they exist when you run apt-get for some other purpose.
Yet another bit of maintenance you might want to perform is adding or removing online repositories in /etc/apt/sources.list. These edits can be done in vim, emacs, or any other text editor. The sources.list file points to all the online repositories that apt-get uses. Each repository is listed on its own line according to a simple system. The entry for each repository begins with deb if it is a repository of binaries and deb-src if it is a repository of source packages. This information is followed by the repository URL, the name of the distribution (such as stable), and the sections to use (such as main). Sources are disabled with a hash mark (#) at the start of the line. Typically, hash marks also are used to add comments that humans can use to identify the source.
When you add or remove a repository from sources.list, you must then run apt-get update to refresh the lists of packages available from the repositories. Otherwise, the Debian package system continues to work from the lists it downloaded previously. Editing then updating takes a few minutes to complete each time but has the advantage of ensuring that you know precisely which sources you are using. For this reason, some users prefer editing sources.list to specifying the -t option for setting which sources they install from. In this way, your chances of making a mistake are fewer.
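The maintenance sub-commands can be strung together into an occasional housekeeping run. This is only a sketch: the function name apt_housekeeping and the APT variable are made up, and the order of the steps is a matter of taste.

```shell
#!/bin/sh
# APT is an assumption of this sketch; it defaults to apt-get.
APT=${APT:-apt-get}

# Hypothetical routine: each step runs only if the previous one succeeded.
apt_housekeeping() {
    $APT check &&       # make sure the package system is consistent
    $APT autoclean &&   # drop cached files that can no longer be downloaded
    $APT autoremove     # remove orphaned dependencies
}
```

Run as root, autoremove still shows its usual summary and waits for your confirmation before removing anything.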
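To see the format in action without touching the real file, the following sketch writes a sample file to a temporary location and lists only its active entries. The mirror URL and sections are examples rather than recommendations, and active_sources is a made-up helper name.

```shell
#!/bin/sh
# Write a sample file in sources.list format to a temporary location.
sample=$(mktemp)
cat > "$sample" <<'EOF'
# Main Debian repository: binaries, then sources
deb http://deb.debian.org/debian stable main contrib
deb-src http://deb.debian.org/debian stable main contrib
# Disabled with a leading hash mark:
# deb http://deb.debian.org/debian unstable main
EOF

# Hypothetical helper: print only the entries that are not commented out.
# With no argument, it reads the real /etc/apt/sources.list.
active_sources() {
    grep -E '^(deb|deb-src)[[:space:]]' "${1:-/etc/apt/sources.list}"
}

active_sources "$sample"
rm -f "$sample"
```

The script prints the deb and deb-src lines for stable and skips both the comments and the disabled unstable entry.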
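After editing sources.list, one way to judge the effect of the change is to update and then simulate an upgrade. The sketch below assumes that approach suits you; refresh_and_preview and the APT variable are names invented for the example.

```shell
#!/bin/sh
# APT is an assumption of this sketch; it defaults to apt-get.
APT=${APT:-apt-get}

# Hypothetical helper: reload the package lists, then preview an upgrade.
refresh_and_preview() {
    # Re-read sources.list and download the current package lists (root).
    $APT update || return 1
    # Simulate an upgrade to see what the changed sources would bring in.
    $APT -s upgrade
}
```

Because the second step uses -s, nothing is installed; if the preview shows surprises, you can correct sources.list and update again.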
Related Utilities
Depending on your distribution, apt-get and dpkg can have a number of other utilities with which they are associated. For example, most distributions will probably have apt-cdrom, which can specify a source for installing from CD or DVD, although the utility is most useful when you first install.
A less common utility is apt-spy, which you can use to determine the fastest repository for you to use. The only drawback is that connection speeds can vary depending on the number of users, and you might need to run apt-spy several times before using its results to edit /etc/apt/sources.list.
If you are cautious, you might want to see whether your distribution includes apt-listbugs before installing a package. The apt-listbugs utility looks for any recent bugs that have been filed against the version you are about to download.
By far, the most useful package utility is apt-cache, which offers a treasury of information about packages and your system. For example, apt-cache showpkg packagename shows which version you have installed, the latest version available in the repositories you are using, and the reverse dependencies of the package (i.e., which packages depend on it).
Similarly, apt-cache dump lists every package in the package cache, whether installed or not, and apt-cache stats offers statistics about the cache, such as the total number of package names and the total number of dependencies. An especially useful sub-command is apt-cache search followed by a keyword, which searches package names and descriptions to track down the exact name of a package or packages that you might want to install.
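Because apt-cache search prints one line per match in the form "name - short description", its output is easy to trim down to bare package names. A minimal sketch follows; find_packages and the APT_CACHE variable are hypothetical names chosen for the example.

```shell
#!/bin/sh
# APT_CACHE is an assumption of this sketch; it defaults to apt-cache.
APT_CACHE=${APT_CACHE:-apt-cache}

# Hypothetical helper: search for a keyword and print only package names.
find_packages() {
    # The name is the first space-separated field of each result line.
    $APT_CACHE search "$1" | cut -d' ' -f1
}
```

No root access is needed here, so find_packages solitaire is a safe first step before you decide what to pass to apt-get install.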