Programming Snapshot – Multilingual Programming

Tower of Babylon

Lead Image © Maksym Shevchenko, 123RF.com

Article from Issue 201/2017
Author(s):

We show you how to whip up a script that pulls an HTTP document off the web and how to find out which language offers the easiest approach.

Few programming tasks illuminate the differences between commonly used languages as clearly as retrieving a web document. In shell scripts, admins often turn to the curl utility, which fetches the data behind a URL without much ado and writes it to standard output.

But what if the URL points to a black hole? Or the server denies access? And what if the server returns a redirect? For example, curl http://google.com does not return the expected HTML page with the search form but just a note that the desired page may be available on www.google.com. Armed with the -L option, however, curl follows the redirect and then returns the data it finds at the new location.
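
To make these failure modes concrete, here is a minimal sketch in Python that uses only the standard library's urllib. The function name fetch and its timeout default are hypothetical choices for illustration and do not come from the article. The sketch relies on urlopen following redirects by default, much like curl -L, and it separates "the server answered with an error" from "no answer at all."

```python
import urllib.error
import urllib.request


def fetch(url, timeout=10.0):
    """Return (final_url, status, body); raise RuntimeError with a readable reason.

    Hypothetical helper for illustration, not taken from the article.
    """
    try:
        # urlopen follows HTTP redirects by default, similar to curl -L
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.geturl(), resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403 or 404
        raise RuntimeError(f"server refused {url}: HTTP {err.code}") from err
    except OSError as err:
        # No usable answer: DNS failure, refused connection, or timeout
        # (URLError is a subclass of OSError and carries a .reason attribute)
        reason = getattr(err, "reason", err)
        raise RuntimeError(f"cannot reach {url}: {reason}") from err
```

Calling fetch("http://google.com") should hand back the final URL after any redirects along with the status code and body, while a dead host or a denied request surfaces as a RuntimeError. Note that resp.read() pulls the whole body into memory, which is fine for a web page but not for very large files.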

What happens with a huge file, such as a 4K movie weighing in at many gigabytes? Will the process exhaust your RAM because it attempts to swallow everything in a single gulp? Does encryption work automatically for an HTTPS URL, which relies on the TLS (formerly SSL) protocol, and does the utility check the server's certificate correctly so that it does not fall victim to a man-in-the-middle attack? Like good old curl, popular programming languages offer all of this, although often only as an add-on package and often with quirky approaches.
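
The two concerns raised here, memory use and certificate checking, can be sketched in Python with the standard library alone. The helper names copy_stream and download, the 64KB chunk size, and the timeout are assumptions made for this illustration, not something the article prescribes. The idea is to read the response in fixed-size chunks instead of one gulp and to pass an SSL context that verifies the certificate chain and the host name.

```python
import ssl
import urllib.request


def copy_stream(src, dst, chunk_size=64 * 1024):
    """Copy src to dst in fixed-size chunks; return the number of bytes written."""
    written = 0
    # read() returns b"" at end of stream, which ends the loop
    while chunk := src.read(chunk_size):
        dst.write(chunk)
        written += len(chunk)
    return written


def download(url, path, timeout=30.0):
    """Stream the document behind url into the file at path (hypothetical helper)."""
    # The default context requires a valid certificate and a matching host name
    context = ssl.create_default_context()
    with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
        with open(path, "wb") as out:
            # At most one chunk is held in memory at a time
            return copy_stream(resp, out)
```

With this split, download only ever holds one chunk in memory, so a multi-gigabyte file should not exhaust RAM. If the server presents a certificate that fails verification, urlopen raises an error rather than silently continuing.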

[...]
