Compiling Bash shell scripts
Wrapped
The Bash Shell Script Compiler converts shell scripts directly into binaries. Compiling your scripts provides protection against accidental changes, but you will have to contend with some quirks.
The Shell Script Compiler tool (SHC [1]) brings one advantage of compiled scripts to Bash: the ability to hide source code and prevent future modifications.
Other advantages of compiled scripts include speed and portability, but in this case, portability and faster run time are not the focus. In fact, programs that you compile with SHC still require Bash, and speed gains are hardly noticeable.
If you have a need to protect your Bash scripts from prying eyes, though, SHC might be your best option. It is currently the most popular free tool for converting (Bash) shell scripts into executable programs (see the "Installation" box).
Installation
On Ubuntu: The PPAs for many versions and variants (Xubuntu, Mint, etc.) of this distribution have more-or-less recent versions of SHC.
On Arch Linux: The current version of the package is available from the AUR (user repositories). You will have to navigate two obstacles during installation. The first issue relates to an incorrect checksum in the PKGBUILD file (Listing 1). The correct checksum is computed by the specified tool (sha256sums) and inserted during the install (Listing 2). If you answer y to Edit PKGBUILD?, an editor opens in which you can make the necessary changes.
Additionally, the SHC package includes a series of test scripts (pru.sh, test.bash) that are not installed on Arch Linux, even though the original archive contains them. These scripts verify that SHC works properly, so you should compile and run them before relying on the compiler.
Alternatively, you could always build SHC directly from source code and then install. After unpacking the archive, a simple make handles the compilation; make install installs the program below /usr/local/. The make test command no longer works in version 3.8.9; instead, you can use shc -f test.bash (Listing 3).
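The source build described above could be scripted roughly as follows. This is only a sketch: the function name build_and_test_shc and its directory argument are hypothetical, the use of sudo assumes that /usr/local/ is not writable by your user, and running the compiled test binary at the end is an addition to the steps named in the text.

```shell
#!/bin/bash
# Build, install, and smoke-test SHC from an unpacked source tree.
# The body runs in a subshell so the caller's working directory is unchanged.
# $1: path to the unpacked SHC sources (hypothetical, adjust to your setup).
build_and_test_shc() (
    cd "$1" &&
    make &&                 # compile shc
    sudo make install &&    # install below /usr/local/ (assumes root is needed)
    shc -f test.bash &&     # replaces the broken "make test"
    ./test.bash.x           # run the binary that SHC just produced
)
```

Because the steps are chained with &&, the function stops at the first failing step and returns its exit code.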
Listing 1
Checksum
Listing 2
Unsupported Package
Listing 3
Test SHC Install
Understanding SHC
SHC uses a two-step process (Figure 1): It first generates fairly extensive, highly specialized C source code from the shell script, which the C compiler then turns into a binary program.
In the first step, SHC generates a file with the extension .x.c; this is then compiled in the second step by the C compiler defined in the $CC environment variable to create a file with the extension .x.
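To make the naming scheme concrete, the following sketch prints the two file names that result from a given script and runs both steps with an explicitly chosen compiler. The helper names shc_outputs and compile_with are hypothetical, and gcc in the usage note is only an example value for $CC.

```shell
#!/bin/bash
# Print the two files SHC produces for a script:
# the generated C source (.x.c) and the compiled binary (.x).
shc_outputs() {
    local script=$1
    printf '%s\n' "${script}.x.c" "${script}.x"
}

# Run both steps with the C compiler given as the first argument.
compile_with() {
    local compiler=$1 script=$2
    command -v shc >/dev/null 2>&1 || { echo "shc not found" >&2; return 127; }
    CC="$compiler" shc -f "$script" && ls -l "${script}.x.c" "${script}.x"
}
```

A call such as compile_with gcc hello.sh should leave hello.sh.x.c and hello.sh.x next to the original script.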
Obfuscation of the shell script source code in the C code relies on the use of an array that contains the contents of the script. During the build, SHC progressively accesses the (encrypted) entries of the array and integrates them into the executable program.
How arrays are processed in detail and how the binary program is implemented is described online [2], where you can also learn something about password security in scripts. The very informative blog also discusses the options for subsequently decrypting programs created by SHC.
Hands On
To begin, I take the classic "Hello World" script and output Hello SHC. The command
shc -f hello.sh
will refuse to compile and will output the message: shc: invalid first line in script:… if the first line of your script is not the shebang line:
#! /bin/sh
If you include this line, the build proceeds without complaint. Adding -v to the shc command outputs compilation comments. Table 1 lists some of the most important command-line options.
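A small pre-flight check can catch a missing shebang before SHC does. The sketch below only tests whether the first line starts with #!; the check that SHC itself performs may be stricter, and the function name has_shebang is hypothetical.

```shell
#!/bin/bash
# Return 0 if the first line of the file in $1 starts with "#!".
has_shebang() {
    local first_line=""
    # read still fills the variable if the file lacks a trailing newline
    IFS= read -r first_line < "$1"
    [[ $first_line == '#!'* ]]
}
```

Used as has_shebang hello.sh && shc -v -f hello.sh, it skips the compile step for scripts that would be rejected anyway.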
Table 1
SHC Options and Variables
Option/Variable | Meaning | Function |
---|---|---|
Options | | |
-e <date> | Expire | Allows the program to run only until the specified date. The date is expected in the format <dd>/<mm>/<yyyy>. After the deadline expires, the following warning is output: …: has expired! |
-m <message> | Message | Specifies the message that appears after the expiration set by -e. |
-f <script> | File | A mandatory option whose argument names the script that SHC needs to build. |
-i <shell options> | Inline | Options passed into Bash and enabled by the binary program when Bash starts. |
-x <command> | Exec | The binary program starts the script using an exec, followed by $@ (all command-line options and arguments) by default. |
-l <option> | Last | Defines the last command-line option, normally --, which is also the default. |
-r | Relax | Loosens the security settings for compiling so that the binary program will also run on other computers with the same operating system. |
-v | Verbose | Displays extensive messages while compiling, which is useful for fault diagnosis. |
-d | Debug | Enables debug mode in the binary program. This creates a large amount of additional information about the command-line options, arguments, paths, external programs, and so on. |
-T | Traceable | Creates a program that can be traced with strace or the like. |
-h | Help | Displays a short help message. |
-C | Copyright license | Displays the copyright license. |
-A | Abstract | Displays brief abstract information and terminates processing without compiling the script. |
Environment Variables | | |
$CC | C compiler | Contains the C compiler (cc by default). |
$CFLAGS | C Flags | Defines compiler options. |
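As an illustration of how -e and -m from Table 1 combine, the following sketch computes an expiration date in the <dd>/<mm>/<yyyy> format and passes it to SHC together with a message. The helper names expiry_in_days and build_expiring are hypothetical, and the date calculation assumes GNU date.

```shell
#!/bin/bash
# Print the date N days from now as dd/mm/yyyy, the format -e expects.
# Relies on the -d option of GNU date.
expiry_in_days() {
    date -d "+$1 days" +%d/%m/%Y
}

# Compile a script that stops working after the given number of days.
build_expiring() {
    local script=$1 days=$2 message=$3
    command -v shc >/dev/null 2>&1 || { echo "shc not found" >&2; return 127; }
    shc -e "$(expiry_in_days "$days")" -m "$message" -f "$script"
}
```

For example, build_expiring hello.sh 30 "Please ask for a new copy" would produce a binary that shows the message once the 30 days are over.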
The source code produced in Listing 4, hello.sh.x.c, is almost 9KB and largely incomprehensible at first glance, but the greater part of it has to do with encrypting the script. The executable program (.x) weighs in at 11KB; this is not exactly small and can cause problems on various platforms.
Listing 4
Hello SHC
For example, programs generated in Arch Linux and Ubuntu only run on Arch Linux if they are created with the -T option set, which ensures that program flow can be monitored by external diagnostic programs such as strace. On Ubuntu, this option is not required. Binaries generated on either system run on Ubuntu without problems.
Shell scripts have some special properties that the compiler needs to "understand" and implement, or at least keep. For example, arguments passed in to a script are available within the script as positional parameters. SHC has no trouble handling them. This is true even if set -- ... reassigns the positional parameters.
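The behavior in question can be reproduced with a few lines of Bash. The sketch wraps the logic in a hypothetical function, show_params, so that it can be tried without touching the arguments of the calling shell; in a script, the same statements would act on the script's own arguments.

```shell
#!/bin/bash
# Show the positional parameters before and after set -- replaces them.
show_params() {
    echo "count=$# first=${1:-none}"
    set -- alpha beta gamma      # discard the old parameters, assign new ones
    echo "count=$# first=$1"
}

show_params one two
# prints:
# count=2 first=one
# count=3 first=alpha
```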
The next important point that needs to be clarified in shell scripts is the evaluation of return values (exit codes), which are produced by both internal and external commands and are supported by SHC. In Bash, $? contains the exit code for the last command executed in the foreground; to display this value, type echo $?.
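A minimal sketch of capturing $? follows; the wrapper name report_status is hypothetical. The value has to be saved immediately, because every further command overwrites $?.

```shell
#!/bin/bash
# Run a command silently, report its exit code, and pass the code on.
report_status() {
    "$@" >/dev/null 2>&1
    local status=$?              # save at once; the next command resets $?
    echo "'$*' exited with code $status"
    return "$status"
}

report_status true     # prints: 'true' exited with code 0
report_status false    # prints: 'false' exited with code 1
```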
With shell scripts in particular, return values are often used for conditional linking of commands (Listing 5). Short circuit logic is a special case: && connects two commands, the second of which is only evaluated if the first runs without error (i.e., it terminates with a return value of 0). Alternatively, the command that follows the || is only evaluated if the previous command produces an error (i.e., returns a value other than zero). Short circuit tests can be used easily with SHC.
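As a brief illustration of both operators (independent of the article's own listing), the following sketch uses a directory test as the first command; check_dir is a hypothetical name.

```shell
#!/bin/bash
# Print one line depending on whether the directory in $1 exists.
check_dir() {
    [ -d "$1" ] && echo "$1 exists"       # echo runs only if the test succeeds
    [ -d "$1" ] || echo "$1 is missing"   # echo runs only if the test fails
}

check_dir /               # prints: / exists
check_dir /no/such/dir    # prints: /no/such/dir is missing
```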
Listing 5
Conditional Logic
Another cause of concern involves inputs and outputs. Without the addition of dialog programs, the shell provides only very limited options directly associated with the terminal. For example, what happens if the Zenity dialog tool (or the newer variant YAD) is used to accept input – and perhaps also to produce output? As the example in Listing 6 shows, this is no problem.
Listing 6
YAD Dialog
The environment variable $INPUT is set by YAD. The input dialog is initialized in YAD with the string "input", which can be modified or replaced by the user (--editable allows this to happen). Then, the second call to YAD shows the current content of the $INPUT environment variable.
Note that when using external programs, SHC does not integrate them into the binary program but still calls the programs like a script; in other words, the external programs need to exist in $PATH. Absolute paths are also considered when executing the script, and the same applies to calling Bash, which should exist below the expected path.
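Because the binary depends on the same external programs as the script, it can help to check for them at the start. The following sketch, with the hypothetical function name missing_commands, lists every required command that cannot be found in $PATH.

```shell
#!/bin/bash
# Print each command from the argument list that is not available;
# return 1 if at least one is missing, 0 otherwise.
missing_commands() {
    local cmd missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
            missing=1
        fi
    done
    return "$missing"
}
```

A line such as missing_commands bash yad || exit 1 near the top of a script makes the compiled program fail early and visibly on systems where a dependency is absent.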
Alternatives
A web search for shell script encrypt or shell script obfuscate reveals a number of alternatives for concealing the content of shell scripts – aptly implemented as shell scripts in part – that use a variety of ways to make the script code unreadable (e.g., obfsh [3] or ShellCrypt [4]).
Whereas obfsh makes the source illegible by inserting or removing spaces and lines and adding extra "garbage," ShellCrypt takes things a step further: The program creates a truly encrypted program with a .bin extension that only becomes executable again after decryption.
The program needed for decrypting is used as an interpreter (Figure 2). The disadvantage of this method is that the ShellCrypt package always needs to be installed on the executing system. You could use gpg-encrypted scripts in a similar way; they would first need to be symmetrically encrypted (using -c), then they can be decrypted with the -d option and passed to the executing shell.
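The gpg variant could look roughly like this. The function names encrypt_script and run_encrypted are hypothetical; the sketch assumes that gpg prompts for the passphrase interactively and that Bash is the executing shell.

```shell
#!/bin/bash
# Symmetrically encrypt a script; gpg writes the result to <script>.gpg.
encrypt_script() {
    gpg -c "$1"
}

# Decrypt to standard output and hand the plain text straight to Bash.
# Arguments after the file name are passed on to the decrypted script.
run_encrypted() {
    local file=$1
    shift
    gpg -d "$file" | bash -s -- "$@"
}
```

After encrypt_script hello.sh, a call to run_encrypted hello.sh.gpg asks for the passphrase and then executes the script without writing the decrypted text to disk.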