Making e-books with Bash PubKit

Instant E-Books

Article from Issue 179/2015
Generate e-books from Markdown-formatted text files quickly and reliably using Bash PubKit.

When it comes to producing ready-to-publish e-books, you have plenty of tools to choose from. You could write an entire book in a word processor like LibreOffice Writer and then convert the final result to the EPUB format. If you prefer a more direct approach, you can opt for a dedicated e-book editor like Sigil. And Calibre comes in handy when you need to convert an existing HTML document into an e-book.

However, if you happen to use a text editor as your writing tool of choice and Markdown as the preferred formatting system, then you have another option: You can use Pandoc [1] to convert Markdown-formatted text files into an e-book. This powerful and flexible tool is perfect for generating e-books in the EPUB and other formats from the command line. In fact, you can generate an e-book in the EPUB format complete with a cover image, a table of contents, and a custom layout with a simple shell one-liner:

pandoc -f markdown -t epub \
  --epub-cover-image=cover.jpg \
  -o foo.epub --toc \
  --epub-stylesheet=stylesheet.css foo.md

This command uses several parameters to convert the input file foo.md into the EPUB e-book foo.epub. The -f parameter specifies the format of the input file (Markdown in this case), and the -t and -o options specify the output format (EPUB) and file (foo.epub). The --epub-cover-image option adds the cover image, the --toc parameter enables the table of contents, and the --epub-stylesheet option points to a CSS stylesheet file. Note that Pandoc 2.0 and later dropped --epub-stylesheet in favor of the --css option.

So, if you have a short and simple Markdown-formatted text file, you can convert it into an EPUB book in just one quick step. This approach has several advantages compared with using a regular word processor or a graphical e-book editor. Generating EPUB files using Pandoc is fast and efficient. More importantly, the final result doesn't need any post-processing or tweaking. Using Markdown-formatted files means that you can use practically any text editor on any platform, so you are not confined to a specific application or format. Working with text files and minimal formatting markup enables you to focus on what's important, namely content.

The described command makes it possible to generate a ready-to-publish EPUB book with a minimum of effort, but having an entire book in a single file is not very practical, especially if the book has a complex structure and contains figures. In most cases, it would make more sense to split the book into separate chapters and keep all its elements neatly organized into dedicated folders (i.e., pages for chapters, images for figures, etc.). You could then use Pandoc to assemble the parts into a single EPUB file, but this would require some command-line wizardry.
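To give an idea of what that wizardry might involve, here is a minimal sketch of a shell function that hands all chapter files to Pandoc in one go. The function name assemble_epub, the PANDOC override variable, and the cover.jpg file name are assumptions made for illustration; the pages folder follows the layout described above. This is not how Bash PubKit itself does it.

```shell
# Sketch: assemble all chapters of a book into one EPUB file.
# assemble_epub, PANDOC, and cover.jpg are illustrative assumptions.
assemble_epub() {
    root=$1 out=$2

    # Collect the chapters; the glob expands in sorted order.
    set -- "$root"/pages/*.md
    [ -e "$1" ] || { echo "no chapters found in $root/pages" >&2; return 1; }

    # Add the cover option only if the image is actually there.
    if [ -f "$root/cover.jpg" ]; then
        set -- --epub-cover-image="$root/cover.jpg" "$@"
    fi

    # Set PANDOC=echo for a dry run that only prints the arguments.
    ${PANDOC:-pandoc} -f markdown -t epub --toc -o "$out" "$@"
}

# Usage (run from the book's root folder so image paths resolve):
#   assemble_epub . ../mybook.epub
```

Pandoc accepts several input files and concatenates them before conversion, which is why the chapter list can simply be appended to the command.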

Enter Bash PubKit [2], a Pandoc-based tool for assembling e-books from a set of Markdown-formatted pages and accompanying files. Bash PubKit is essentially a Bash script that reduces the process of generating e-books to a simple command and a few options. It is a fork of BASC-eBookGenerator [3] that adds several tweaks and improvements. Notably, Bash PubKit uses the Calibre application to generate e-books in the MOBI format.

The first step therefore is to install the required components. If you happen to use Ubuntu or any of its derivatives, you can deploy Bash PubKit and its dependencies on your system using the install-bash-pubkit.sh Bash script. Use the following command to download the script:

wget https://raw.githubusercontent.com/dmpop/bash-pubkit/master/install-bash-pubkit.sh

Make the script executable by running chmod +x install-bash-pubkit.sh, then run the script by executing the ./install-bash-pubkit.sh command.

E-Book Structure

Bash PubKit expects to find all source files in specific locations inside a directory that acts as the book's root folder. A typical root folder file structure is shown in Listing 1.

Listing 1

Typical Root Folder File

 

So, all Markdown-formatted text files go into the pages sub-folder, and all figures and graphics are neatly tucked under the images sub-folder. Keep in mind that Bash PubKit assembles pages in alphabetical order. This means that, for example, the about.md page will appear before the colophon.md page in the final e-book, even if you want the colophon to come first. To take complete control over the page sequence, use number prefixes:

pages/
    001-colophon.md
    002-about-this-book.md
    003-chapter1.md
    004-chapter2.md
    005-chapter3.md
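The effect of the prefixes is easy to check from the shell. The following sketch uses a hypothetical helper, list_pages, that prints the Markdown files of a folder in the order the shell's glob expansion returns them; it assumes that this sorted order matches the order in which the pages are assembled.

```shell
# Sketch: print the .md files of a folder in sorted glob order.
# list_pages is a hypothetical helper, not part of Bash PubKit.
list_pages() {
    for f in "$1"/*.md; do
        if [ -e "$f" ]; then
            basename "$f"
        fi
    done
}

# Demonstration with prefixes that are not zero-padded.
demo=$(mktemp -d)
touch "$demo/2-about.md" "$demo/10-chapter.md"
list_pages "$demo"    # 10-chapter.md is listed before 2-about.md
```

Because names are compared character by character, 10 sorts before 2. Zero-padded prefixes such as 001, 002, and 010 avoid this surprise.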

The metadata.yaml file in the root folder contains key book info (i.e., title, author, copyright, etc.) and points to a book cover image and an optional stylesheet. The cover image must be in the PNG or JPEG format, and the file must reside in the same folder as the metadata.yaml file. If the book cover image doesn't exist, the e-book will be compiled without a cover.

How much book metadata you want to enter in the metadata.yaml file is up to you, but you need to provide at least three key pieces of info: title, author (or contributors), and copyright. The title item in the metadata.yaml file has two identifiers: type, which indicates the title's type, and text, which specifies the title of the book:

title:
- type: main
  text: How to Code Like a Monkey

If the book has a subtitle, you can add the appropriate identifiers:

title:
- type: main
  text: How to Code Like a Monkey
- type: subtitle
  text: A Guide to Sloppy Coding

To specify authors and contributors, use the creator item as follows:

creator:
- role: author
  text: John Smith
- role: editor
  text: Sarah Jones

Finally, the publisher and rights items can be used to provide publisher and copyright info:

publisher:  Monkey Press
rights:  (c) 2015 Monkeys & Co., CC BY-NC-SA

All images used in the book must be located in the images folder, and you can insert them into the text using the regular Markdown ![]() image syntax:

![](images/foo.jpg)
![Caption goes here](images/foo.jpg)

To keep track of images and graphics files, you can organize them into subfolders – for example, images/figures, images/tables, images/schematics, and so on. Just remember to specify the correct paths when inserting images (e.g., ![](images/figures/foo.jpg)).

If no custom CSS file is provided, the Bash PubKit e-book compilation script falls back on Pandoc's default stylesheet. The e-book template in Bash PubKit comes with a sample stylesheet file that you can use as a starting point.
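If you prefer to start a book from scratch, the structure described in this section can be created with a few commands. The sketch below is only an illustration: the function name new_book is made up, and the metadata file contains just the three required items with the sample values used above. For real projects, the e-book template supplied with Bash PubKit is probably the better starting point.

```shell
# Sketch: create an empty book skeleton (pages, images, metadata.yaml).
# new_book is a hypothetical helper; the values are the article's samples.
new_book() {
    root=$1
    mkdir -p "$root/pages" "$root/images"

    # Never overwrite an existing metadata file.
    if [ -e "$root/metadata.yaml" ]; then
        return 0
    fi

    cat > "$root/metadata.yaml" <<'EOF'
title:
- type: main
  text: How to Code Like a Monkey
creator:
- role: author
  text: John Smith
rights: (c) 2015 Monkeys & Co., CC BY-NC-SA
EOF
}

# Usage:
#   new_book mybook
```

The entries for the cover image and the stylesheet are deliberately left out here, because their exact keys are not shown above.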

Bash PubKit Usage

The compile-ebook.sh script supplied with Bash PubKit is the tool that assembles the source files into EPUB and MOBI e-books. Using the script couldn't be easier: Run the

./compile-ebook.sh SOURCE

command, where SOURCE is the path to the directory containing book files. This generates an EPUB file in the same directory. The script supports two options. The -m option generates an e-book in the MOBI format, whereas the -o switch lets you specify a different target directory for generated files.

The script uses several global variables that specify paths to the required folders, so you can edit the default folder names. For example, if you want to use the chapters folder instead of the pages folder, edit the PAGES_FOLDER variable accordingly.

Hacking Bash PubKit

The compile-ebook.sh shell script that does all the heavy lifting is not very complex, so you can easily hack it even if your Bash scripting skills are relatively modest. For example, the script can generate EPUB and MOBI e-books, but with a bit of tweaking, you can add other formats, too.

Say you want to add the HTML format to the mix. The script can be roughly divided into three blocks. The first block defines options, reads the options specified in the currently issued command, and evaluates them. The following statement in the script defines options:

OPTS=`getopt -o mo: -l \
  generate-mobi,output-folder: -- "$@"`

Options in the script are defined using the getopt tool, and the statement above defines two options: -m (and its --generate-mobi long form) and -o (--output-folder). So, to specify a new option, you need to add its short and long forms to the statement:

OPTS=`getopt -o mho: -l generate-mobi,generate-html,output-folder: -- "$@"`

In this case, the -h option (or its long form, --generate-html) enables the HTML format.

The next code block consists of a while loop and the case condition that enables specific options based on supplied command-line parameters. For example, the snippet shown in Listing 2 enables the MOBI format by setting the COMPILE_MOBI variable to true if the ./compile-ebook.sh command is issued with the -m parameter:

Listing 2

Enable MOBI Format

 

To add the HTML option, copy the appropriate code fragment and edit it as shown in Listing 3.

Listing 3

Add HTML Option
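As a rough illustration of the while loop and case construct described above, the following sketch parses the three options, including the new HTML one. It assumes the enhanced getopt from util-linux. The variable names COMPILE_MOBI, COMPILE_HTML, and OUTPUT_FOLDER follow the text; the parse_options wrapper and the SOURCE variable are illustrative, and the code is not a copy of compile-ebook.sh.

```shell
# Sketch of an option-parsing block (assumes util-linux getopt).
# parse_options and SOURCE are illustrative names.
parse_options() {
    # Defaults: every optional format is off.
    COMPILE_MOBI=false
    COMPILE_HTML=false
    OUTPUT_FOLDER=""

    OPTS=$(getopt -o mho: \
        -l generate-mobi,generate-html,output-folder: -- "$@") || return 1
    # Replace the positional parameters with getopt's normalized list.
    eval set -- "$OPTS"

    while true; do
        case "$1" in
            -m|--generate-mobi) COMPILE_MOBI=true; shift ;;
            -h|--generate-html) COMPILE_HTML=true; shift ;;
            -o|--output-folder) OUTPUT_FOLDER=$2; shift 2 ;;
            --) shift; break ;;
            *) return 1 ;;
        esac
    done

    # What remains is the path to the book's root folder.
    SOURCE=$1
}

# Usage:
#   parse_options -h -o output mybook
```

The colon after o in the option string tells getopt that -o takes an argument, which is why that branch shifts by two positions instead of one.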

 

The script generates the e-book in the target format using an appropriate command. For the EPUB format, the command is:

awk 'FNR==1{print ""}1' metadata.md \
  "$PAGES_FOLDER"/*.md | pandoc -o \
  "$EBOOK_FOLDER.epub" --toc

To add support for the HTML format, all you have to do is to tweak the command slightly and put it inside an if condition:

if [ "$COMPILE_HTML" = true ] ; then
    awk 'FNR==1{print ""}1' metadata.md "$PAGES_FOLDER"/*.md | \
    pandoc -f markdown -t html -o "$EBOOK_FOLDER.html" --toc
fi
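The awk expression in these commands deserves a closer look: FNR==1 is true for the first line of every input file, so awk prints an empty line before each file and then passes all lines through unchanged. The sketch below wraps the expression in a hypothetical join_pages function and runs it on two throwaway files to show the difference from plain cat.

```shell
# Sketch: join Markdown files, starting each one with an empty line.
# join_pages is a hypothetical wrapper around the article's awk expression.
join_pages() {
    awk 'FNR==1{print ""}1' "$@"
}

demo=$(mktemp -d)
printf 'Last line of chapter 1' > "$demo/001.md"   # no trailing newline
printf '# Chapter 2\n' > "$demo/002.md"

cat "$demo"/*.md          # heading is glued to the previous line
join_pages "$demo"/*.md   # heading starts on its own line after a blank one
```

The separating empty line matters because Markdown only treats a heading or a new paragraph as such when it is set apart from the preceding text.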

If the ./compile-ebook.sh command uses the -o option to specify a target folder, the following code block moves the generated file to the specified directory:

if [[ -n "$OUTPUT_FOLDER" ]]; then
    if $( any_with_ext epub ); then
        mv *.epub ../"$OUTPUT_FOLDER"
    fi
fi

Additionally, here you need to add code that handles the HTML files:

if [[ -n "$OUTPUT_FOLDER" ]]; then
    if $( any_with_ext epub ); then
        mv *.epub ../"$OUTPUT_FOLDER"
    fi
    if $( any_with_ext html ); then
        mv *.html ../"$OUTPUT_FOLDER"
    fi
fi
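The any_with_ext helper is defined elsewhere in the script. One plausible way to write such a helper is sketched below, together with a hypothetical move_outputs function that loops over a list of extensions, so that adding a format means adding one word instead of another if block. Both functions are assumptions for illustration, not the script's actual code.

```shell
# Hypothetical helper: succeeds if the current directory holds at least
# one file with the given extension. It prints nothing, so it works both
# as "if any_with_ext epub" and in the "if $( ... )" form.
any_with_ext() {
    for f in ./*."$1"; do
        if [ -e "$f" ]; then
            return 0
        fi
    done
    return 1
}

# Sketch: move all generated files with the listed extensions.
move_outputs() {
    target=$1
    shift
    mkdir -p "$target"
    for ext in "$@"; do
        if any_with_ext "$ext"; then
            mv ./*."$ext" "$target"
        fi
    done
}

# Usage:
#   move_outputs ../"$OUTPUT_FOLDER" epub mobi html
```

The existence check is needed because a glob that matches nothing is passed on as the literal pattern, which would make mv fail.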

Finally, you need to add the COMPILE_HTML=false statement at the beginning of the script to initialize the COMPILE_HTML variable. That's all there is to it. You can now generate e-books in the HTML format using the -h (or --generate-html) option. And, of course, you can use the same procedure to add any format supported by Pandoc or Calibre.
