Binary format for the web

Classy New Build

Lead Image © hasselblad15, photocase.com


Article from Issue 203/2017

The WebAssembly project defines a portable binary format for browsers, with a focus on minimizing size and load time. C and C++ programs serve as the source, which makes it possible to compile virtually any application for the web.

Relocating applications to the browser is not exactly an innovation. However, WebAssembly [1] has announced a format that intelligently combines the desktop and the web. The Emscripten SDK [2] plays a leading role by offering an LLVM-based compiler [3] that translates C and C++ programs to standard JavaScript, making extensive porting redundant and thus building a simple bridge to the web.

Figure 1 illustrates the relationship between the tools and toolchains involved and shows that WebAssembly not only supports applications in the browser, but also runs programs locally.

Figure 1: Various tools and toolchains translate C and C++ to WebAssembly [4] (CC BY-SA 3.0 [5]).

ASM Meets WASM

The Emscripten developers originally set their sights on asm.js [6], an optimizable, low-level subset of JavaScript. Nowadays, the SDK is also used for WebAssembly. WebAssembly and asm.js have quite similar ideas and goals, and some developers are involved in both projects; however, the asm.js and WebAssembly designs differ significantly.

Asm.js aims to define a formally specified subset of JavaScript that a compiler such as Emscripten's emcc [3] then generates. Restricting the language scope makes it possible to guarantee improvements in areas such as type checking and array handling, which improves run time compared with handwritten JavaScript code. Another advantage of asm.js is that almost all browsers interpret JavaScript, so developers don't need any new virtual machines for the code; all necessary extensions can be built into existing JavaScript engines (e.g., Google's V8).

WebAssembly is also designed to run native code in the browser, but it goes one step further by including applications outside the browser, as well. The differences manifest themselves in the output format. Whereas asm.js always generates JavaScript code, with WebAssembly, C and C++ programs compile to a dedicated binary format (.wasm). JavaScript can be involved, but it doesn't have to be if the code is not intended to run in the browser.

The benchmarks mentioned in this article show that the new binary format runs more efficiently than asm.js-optimized JavaScript code and loads (browser) applications faster. For more arguments in favor of WebAssembly and information about how it differs from asm.js, see the WebAssembly FAQs [7].

A Miracle of Translation

As a rule, porting C and C++ programs to JavaScript by hand takes developers longer than letting Emscripten do the work, and Emscripten also generates optimized JavaScript code that the browser executes efficiently. Because it uses LLVM as a basis, it can compile not only C and C++ programs, but any code that can first be converted to LLVM bitcode [8]. The documentation [9] lists other benefits, including:

  • WebAssembly defines its own sandbox per module – that is, its own memory area that it does not share with other modules.
  • As a module in the browser, WebAssembly can interact with JavaScript and use the browser functions via a JavaScript API. As a result, WebAssembly and JavaScript share some code.
  • As already mentioned, WebAssembly modules do not just run in the browser, but also as native code outside the browser.

State of Play

The WebAssembly website provides an online demo [10]. According to the roadmap [11], developers from four browser projects (Firefox, Chrome, Edge, and WebKit) are collaborating on WebAssembly, which a W3C Community Group is also developing as an open standard [12] independent of individual manufacturers. Recently, a Minimum Viable Product (MVP) has implemented and tested all major design decisions relating to the WebAssembly API and the binary format, although other features are still in the design phase [13].

If you want to convert programs into WebAssembly format, you will need a compiler. Instead of GCC and G++, you use emcc [3], the Emscripten compiler front end. Developers can check it out from Git, together with the Emscripten SDK, and install the software according to the instructions [14]. Because the project initially only supported asm.js, you need to pass the -s WASM=1 flag to emcc at the command line to produce WASM binary code. The default extension for WebAssembly programs is .wasm. The command

emcc hello.c -s WASM=1 -o hello.html

builds the Hello World program in Listing 1. Called with the arguments shown, emcc creates three files: hello.wasm, hello.js, and hello.html. The WASM and JavaScript files contain the WebAssembly binary code and the WebAssembly JavaScript module, whereas emcc only generates the HTML file if the argument for the -o option ends in .html. With the HTML page created in this way, you can proceed to test the WebAssembly module in the browser. Figure 2 shows the (simple) results.

Listing 1

hello.c

 

Figure 2: An emcc-generated HTML page for the browser that displays the output of the Hello World program converted from C.

To host these HTML pages, the SDK includes its own tool, Emrun [15], a simple HTTP server. The

emrun --no_browser --port 8080

command serves up all files in the current directory via HTTP.

Figure 3 shows another example that uses scanf() to retrieve user input. The associated listing (fib.c) and all other listings are available online [16].

Figure 3: A scanf() statement in the C program produces the user input dialog box.

The list of projects that Emscripten has successfully built and made available for browsers is extensive [17]: from games with graphics rendering and audio output, through frameworks such as Qt and Unity, to C and C++ run-time environments for Lua and Python that allow the indirect use of Python in the browser. In addition to the standard C and C++ libraries, Emscripten can cope with Simple DirectMedia Layer (SDL) and OpenGL libraries, removing the need for additional libraries for the converted WebAssembly and JavaScript code.

Although Emscripten converts almost any native C and C++ code to JavaScript, the runtime environment differs in some respects from its native counterpart. Adjustments to the code are occasionally required before building; however, the changes are generally limited to the main() loop and to accessing the filesystem.
