Using voice-controlled interfaces via Amazon Alexa

Listening to the Word

Article from Issue 230/2020

Want to add voice activation to your IoT environment? Create an Alexa skill.

If you want to control your own home automation environment with Amazon Alexa using natural language, you have two options. Either resort to a prebuilt Alexa skill, as offered by the vendors of some automation components, or write a skill of your own.

If you can find a prebuilt skill that performs the task you want to automate, you can accomplish the automation with just a few short steps; however, the possibilities are limited to the set of options that have already been provided by third-party programmers. If you want to reach other devices, or even if you just want to execute a series of actions that don't fall easily within Alexa's existing skill set, you need to write the skill yourself.

This article shows how to build the front end of your Alexa automation by getting Alexa to communicate with a Raspberry Pi. Once you establish the link to the RaspPi device, you can train the Pi to perform any number of basic functions on your IoT home network.

Before you jump in and start from scratch, however, it pays to take a careful look at the prebuilt options. Alexa supports a number of prebuilt skills that provide easy access to existing automation systems.

Prebuilt Skills

Alexa's prebuilt skills are the fastest and easiest way to automate – if you can find a skill that does what you need. Many prebuilt skills tie in with existing automation systems and IoT environments. (See the box entitled "Alexa in Harmony.")

Alexa in Harmony

One example of a ready-made skill is Logitech's Harmony universal remote control with its hub. The hub is a central transmitter that sends signals via infrared, Bluetooth, or WLAN to the devices that the user wants to operate. The remote control you hold in your hand no longer talks directly to the TV, stereo, or video player, but to the hub, which in turn talks to the devices. Thanks to an Alexa skill, this hub can now be operated by voice, which gives you the ability to talk to a wide and diverse range of home electronics devices.

The basic switch-on and switch-off commands can be combined to create actions. For example, if – as a TV viewer – you use a sound bar for better sound quality or surround sound, you can always switch it on and off along with the TV set by linking the virtual on/off switches of both devices in a single sequence (Figure 1).

Figure 1: Two screens of the Harmony app to define a start sequence for the TV set and sound bar.

These actions can then be triggered again using an Alexa voice command; in other words, a terse "Alexa, good night" is enough to switch off the TV and sound bar, dim the lights, lower the blinds, and lock the apartment door.

What do you need to do to achieve this? The first step is to define the actions in the Harmony app on your smartphone or tablet. The app is available for Android and iOS. Harmony controls over 270,000 different devices from most major manufacturers, including Bose, Philips, Denon, Sonos, Hue, and Deutsche Telekom. The actions can also be assigned to buttons on the remote control, so that pressing a button triggers a whole cascade of commands. In this example, however, Alexa will trigger the actions.

You can obtain the Harmony skill from the Amazon Alexa App Store in the Alexa app on your smartphone or tablet. The skill sports a blue logo. Watch out! An older version of this skill, named Harmony Second Hub and sporting a red logo, is no longer recommended because it forces you to add the words "with Harmony" to every command.

After you download the skill, you need to enable it and log in with the same credentials that you use for your Logitech account. The Alexa skill then automatically gathers information about the Harmony actions.

If you want, you can fine-tune the pre-stored wording for the voice commands or the device name, but this is not absolutely necessary. Voice commands like "Alexa, switch on the TV" or "Alexa, switch on the Xbox" will work – provided you previously defined a corresponding action in the Harmony app. You can say "Alexa, switch to channel 3" or "Alexa, turn up the volume by four increments" and so on.

Prebuilt Alexa Skills are also available for wireless socket outlets, but only for certain manufacturers. For example, there is a skill for the Kasa device series (wireless socket outlets, cameras, lamps) by TP-Link. You first need to set up the socket outlet in a TP-Link Kasa app and assign a name. You can then select the device and add it to the Alexa app. If you named the socket outlet "reading lamp" in the first step because it operates the standard lamp next to your comfortable armchair, you can then turn on the lights by saying "Alexa, reading lamp on."

The setup for a prebuilt skill is usually simple and convenient. If you are happy with the basic functions of popular devices, you will not be motivated to become a skilled programmer yourself. At times, however, you might want to combine several actions or use functions that are not included in the repertoire of ready-made skills. In this case, you will have to program a skill yourself.

Keep in mind that, if you don't trust Alexa when it comes to data protection, programming the skill yourself will not help much. Whether you use a prebuilt skill or program the skill yourself, everything you tell Alexa is routed through Amazon's servers and stored there.

DIY Alexa Skill

The skill programmer faces two quite different challenges. First, you need to ensure that the computer that will execute the actions, a Raspberry Pi in this example, receives the voice command from Alexa, interprets it, and knows what to do. In addition to the RaspPi, you need one of Amazon's smart Echo series speakers to receive the spoken instructions and forward them to Amazon's servers.

Second, you need to program the action that you want carried out. The example in this article only looks at the first challenge. Communication between Alexa and your RaspPi is the foundation; from there, you can program your Pi to perform any task that makes sense for your network. In this example, the Raspberry Pi will switch on one of its eight LEDs to indicate that it has received and understood an instruction (Figure 2).

Figure 2: The schematic for the eight LEDs.
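
As a rough illustration of the confirmation step, the following sketch builds the calls to the gpio command-line utility that ships with the wiringpi package. The pin assignment (wiringPi pins 0 to 7) and the function names are assumptions made for this example; the actual wiring in Figure 2 may differ.

```python
import subprocess

# Assumption: the eight LEDs are wired to wiringPi pins 0-7.
# Adjust this list to match your own circuit.
LED_PINS = [0, 1, 2, 3, 4, 5, 6, 7]


def led_commands(led_index, on=True):
    """Return the gpio command lines that switch one LED on or off."""
    if not 0 <= led_index < len(LED_PINS):
        raise ValueError("LED index must be between 0 and %d" % (len(LED_PINS) - 1))
    pin = str(LED_PINS[led_index])
    return [
        ["gpio", "mode", pin, "out"],                # configure the pin as an output
        ["gpio", "write", pin, "1" if on else "0"],  # drive the pin high or low
    ]


def switch_led(led_index, on=True, runner=subprocess.check_call):
    """Run the commands; pass another runner to dry-run without hardware."""
    for command in led_commands(led_index, on):
        runner(command)
```

On a machine without GPIO pins, calling switch_led(0, runner=print) only prints the commands instead of executing them, which helps when testing the logic away from the Pi.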

A first-generation Raspberry Pi is powerful enough for the experiment; the one used here runs the latest version of Raspbian "Stretch" Lite. The Lite version of the operating system does without a graphical user interface and therefore copes well with the frugal hardware resources. A GUI is not necessary for this project anyway. The Lite version of Raspbian lacks some libraries and tools that need to be installed before you can install the Flask web framework (written in Python), which is required for this example.

These installations are handled by the commands in the first four lines of Listing 1. Line 4 uses Pip, the Python module management tool, to install Flask-Ask, which pulls in Flask as a dependency. The process takes some time – this might be a good time for you to take a coffee break.

Listing 1

Setting up Flask and Ngrok

01 $ sudo apt update
02 $ sudo apt upgrade
03 $ sudo apt install python2.7-dev python-dev python-pip wiringpi
04 $ sudo pip install flask-ask
05 $ unzip /home/pi/ngrok-stable-linux-arm.zip
06 $ ./ngrok http 5000

Tunnel Builder

The Raspberry Pi still lacks a tool for building a secure tunnel between the RaspPi and the Amazon servers. This is where the Ngrok tunneling service comes in. Ngrok is very easy to install and configure. With the free version of Ngrok, however, the public URL that the tool assigns changes after each restart. If this is too much of an annoyance for you, you have to bite the bullet and go for a commercial version. The basic cost is around $5 per month.

To install the program, download the Linux ARM version from the Ngrok website and transfer the zip archive to the Raspberry Pi. Then unpack the archive and call Ngrok directly in a terminal (Listing 1, Lines 5 and 6). The options of the command in Line 6 tell Ngrok to open an HTTP tunnel and forward incoming requests to port 5000 on localhost, which is the port Flask listens on by default. The output should be similar to Figure 3.
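
To give an idea of what will eventually arrive on port 5000, here is a minimal sketch of a handler for Alexa's JSON requests. It deliberately uses only the Python 3 standard library instead of Flask-Ask, so it is a stand-in for illustration, not the skill code itself. The intent name LedIntent and its number slot are hypothetical, and a production skill would also need to verify that requests really come from Amazon.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_response(text):
    """Wrap plain text in the JSON envelope that Alexa expects back."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": True,
        },
    }


def handle_alexa_request(payload):
    """Map an incoming Alexa request (already parsed from JSON) to a reply."""
    request = payload.get("request", {})
    if request.get("type") == "LaunchRequest":
        return build_response("Which LED should I switch on?")
    if request.get("type") == "IntentRequest":
        intent = request.get("intent", {})
        # "LedIntent" and its "number" slot are hypothetical names.
        if intent.get("name") == "LedIntent":
            number = intent.get("slots", {}).get("number", {}).get("value")
            if number is not None:
                return build_response("Switching on LED %s." % number)
    return build_response("Sorry, I did not understand that.")


class AlexaHandler(BaseHTTPRequestHandler):
    """Answer each POST with the JSON reply for the request it carries."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length).decode("utf-8"))
        body = json.dumps(handle_alexa_request(payload)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json;charset=UTF-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=5000):
    """Listen on the local port that Ngrok forwards to (blocks until stopped)."""
    HTTPServer(("127.0.0.1", port), AlexaHandler).serve_forever()
```

Calling serve() starts the listener. Keeping the request logic in handle_alexa_request(), separate from the HTTP plumbing, means you can exercise it with hand-written dictionaries before any tunnel is in place.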

Figure 3: The output from Ngrok. The HTTPS URL is required later on.

Each time it starts a new tunnel, the free version of Ngrok generates new random URLs. You need to make a note of the HTTPS address marked red in the figure: you will need it later on as the communication endpoint in the Alexa skill.
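
Because the address changes with every restart, copying it by hand gets tedious. While it is running, Ngrok also offers a local web interface and API, by default on port 4040, that reports the active tunnels. The following Python 3 sketch reads the HTTPS address from that API; the function names are made up for this example, and the port is an assumption that holds only if you have not changed Ngrok's defaults.

```python
import json
from urllib.request import urlopen

# Assumption: Ngrok's local API listens on its default port 4040.
NGROK_API = "http://127.0.0.1:4040/api/tunnels"


def https_url(tunnels_json):
    """Return the first HTTPS public URL in the API's JSON reply, or None."""
    for tunnel in json.loads(tunnels_json).get("tunnels", []):
        url = tunnel.get("public_url", "")
        if url.startswith("https://"):
            return url
    return None


def current_endpoint(api_url=NGROK_API):
    """Ask the running Ngrok process for its current HTTPS address."""
    with urlopen(api_url) as reply:
        return https_url(reply.read().decode("utf-8"))
```

With the tunnel from Listing 1 running, print(current_endpoint()) should show the address to enter as the skill's endpoint.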
