Finding a path to energy-efficient software

Clean It Up

Article from Issue 258/2022

Sustainability studies for the IT industry often ignore the contributions of software. This article explores what developers and admins can do to create and maintain more energy-efficient systems.

The digital transformation has taken hold in every corner of our culture, from business to our personal lives. Virtually nothing works without digital technologies, and for that reason, it is clear that we won't achieve the sustainable development goals (SDGs) set out by the United Nations unless we rethink our approach to our digital life.

The amount of energy required to operate the world's countless devices, networks, applications, and data centers is immense. Numerous studies have shed light on the IT industry's increasing appetite for energy, and experts point to the need to massively improve the energy efficiency and sustainability of IT systems.

The problem of IT's increasing CO2 footprint has been known for some time and has led to various initiatives that fall under the green IT umbrella. However, the main focus of this movement, especially in the data center sector, is on reducing the waste of natural resources in digital devices, using renewable energy sources, and calling for digital fasting – the practice of spending part of the day or week away from digital technology.

From the beginning, the emphasis has been on hardware. As early as 1992, the U.S. Environmental Protection Agency (EPA) introduced the Energy Star label for energy-efficient IT equipment and computers; the EU Commission followed in 2003. Despite all efforts, energy consumption has continued to rise steeply. The testing procedures for the Energy Star label are neither sufficiently rigorous nor thorough, and the rapidly increasing share of carbon dioxide emissions caused by the software and digital applications running on these devices is largely ignored. Algorithmic efficiency and sustainable computing are proving to be a major blind spot in most green IT initiatives, which focus on the production and operation of devices but ignore the daily emissions that far-too-energy-hungry software generates over a device's lifetime.

A study by The Shift Project shows that, as early as 2017, the energy consumed in using digital technologies exceeded the energy consumed in producing digital devices by more than five percent, with a steadily increasing share. This suggests that future measures to reduce the CO2 footprint of digital technologies need to focus more on improving the energy footprint of software systems and their interaction with hardware, rather than just on the energy footprint of the hardware itself [1].

The quest for more sustainable IT will require us to systematically study the energy consumption of digital systems and make the figures comparable. Then we need to develop methods for designing IT systems that are more energy-efficient. To this end, we need to combine the principles of algorithmic efficiency and sustainable computing in a fundamental paradigm of Sustainability by Design.

Massive Rise

In recent decades, digital technologies have been hailed as the "clean" counterpart to the "dirty" industries of manufacturing, agriculture, and energy production. Digital devices, products, and services were believed to contribute little or nothing to global CO2 emissions because of their intangible nature. This assumption is wrong: Globally, digital devices and applications have a very significant CO2 footprint.

All data traffic requires energy. The total annual Internet traffic volume has increased exponentially in recent years and continues to rise steeply. In 2007, only 54 exabytes of data passed through the Internet. By 2017, this number had increased by a factor of 20, to 1.1 zettabytes, according to the International Energy Agency. By the end of 2022, annual data traffic is projected to reach 4.2 zettabytes [2]. Already, carbon emissions from digital technologies exceed those from global air travel by a factor of two. In 2019, for example, all air travel was responsible for about 1 billion tons of carbon dioxide emissions, about two percent of total emissions. In the same year, digital technologies emitted about 2 billion tons, about four percent of total human-generated carbon dioxide [3].

Artificial intelligence is an example of a technology that is considered progressive but is actually quite regressive in its energy usage. Researchers at the University of Massachusetts Amherst have studied the energy consumption of modern AI systems and found that the training phases for new neural networks, in particular, consume a significant amount of energy and, as a result, emit huge amounts of carbon dioxide. For example, training a common AI model with Big Data produces about 300 tons of carbon dioxide equivalents, which is like the CO2 lifecycle emissions of five cars, including fuel, or 300 round-trip flights from New York City to San Francisco [4]. (Cryptomining is another huge consumer of electrical power – see the article on cryptomining elsewhere in this issue.)

But before you rule out AI, keep in mind that many believe AI and Big Data are fundamentally important for reducing CO2 emissions overall because of their importance for optimizing processes in energy production, manufacturing, agriculture, and other industries. The solution is not to eliminate or "fast" from these technologies but to make them more efficient.

Sustainability by Design

Programmers can help reduce the growing CO2 footprint by making more efficient software systems. The goal is to develop software that uses less energy to deliver practically equivalent results – with as few, and as simple, computing operations as possible. Further potential for reducing energy consumption lies in the implementation of the underlying algorithms.

When developing innovative software architectures, you can consider trade-offs between precision in the computational results and reduced energy usage. This is especially true of systems that are used millions of times – even the smallest savings in the individual computational processes add up to significant savings.

Because the use of digital technologies already accounts for the largest share of the digital CO2 footprint, and will continue to rise sharply, it is particularly important to make algorithms more efficient. Greater consideration must be given to the trade-off of precision and speed on one hand and energy consumption on the other. Weighing this balance must become a core principle in the design of digital systems.

To solve the apparent paradox of more digitalization using less energy, we need to develop new paradigms in algorithm design and programming, and we need widespread implementation of these paradigms in practice. Principles such as energy-efficient algorithms and sustainable computing must also play a major role in the education of computer scientists and IT engineers. The focus must be on raising awareness of the issue of energy efficiency in software systems and making the principle of sustainability by design the basis for IT development practice.

Programming and the Quest for Cleaner IT

Typically, computer scientists and IT engineers develop software that aims to solve different classes of problems using algorithms. By their very nature, different algorithms can be developed for very similar problems. If these algorithms need to compute their results very fast and with 100 percent accuracy, they can require a significant amount of computation and long runtimes. Computational overhead and long runtimes translate to high energy requirements, and algorithms for optimal solutions are often extremely complex. For very large-scale problems, such as complex climate models or traffic predictions, the computation time can grow without bound as the desired accuracy increases.

One important field of research in algorithm engineering focuses on solving this problem with tradeoffs between accuracy and runtime. Appropriate algorithms use heuristics and randomized approaches to produce a result that is as close as possible to the optimal solution but that can be computed with far less overhead and runtime. These "second-best" algorithms, which usually come very close to the exact solution, can shorten runtimes by factors of between 100 and 10,000, depending on the problem class.
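As a minimal sketch of this tradeoff (in Python, with made-up data): computing an exact mean touches every element, whereas a random sample comes very close at a tiny fraction of the work, and therefore of the energy.

Listing 1: Sampling Instead of Scanning

import random

def exact_mean(values):
    # The exact answer touches every element.
    return sum(values) / len(values)

def sampled_mean(values, sample_size=1000):
    # The "second-best" answer inspects a random sample and trades a
    # little accuracy for a fraction of the work (and energy).
    sample = random.sample(values, min(sample_size, len(values)))
    return sum(sample) / len(sample)

data = [random.gauss(100, 15) for _ in range(1_000_000)]
print(exact_mean(data), sampled_mean(data))  # close results, 0.1 percent of the cost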

Experiments at the Hasso Plattner Institute for Digital Engineering (HPI, Figure 1) have shown that applying heuristic algorithms to the optimization of submodular functions, which underlie solutions for optimizing traffic, allocating raw materials in production, or allocating goods to markets, can reduce runtimes by three-digit factors compared to traditional algorithms. Whereas traditional software might take two days to compute the solution to such a problem, a heuristic algorithm might complete the task in just 10 minutes, reducing energy consumption to less than one percent.
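The HPI experiments rely on specialized heuristics; as a simplified illustration of the underlying idea (not the HPI algorithm itself), Listing 2 shows the classic greedy heuristic for maximum coverage, a textbook submodular optimization problem, on an invented toy instance.

Listing 2: Greedy Submodular Optimization

def greedy_max_coverage(candidate_sets, k):
    # Repeatedly pick the set with the largest marginal gain. For
    # monotone submodular objectives like coverage, this greedy choice
    # is guaranteed to achieve at least (1 - 1/e), about 63 percent,
    # of what the exponential-time optimal selection would cover.
    covered, chosen = set(), []
    for _ in range(k):
        best = max(candidate_sets, key=lambda s: len(s - covered))
        if not best - covered:
            break  # no remaining set adds anything new
        chosen.append(best)
        covered |= best
    return chosen, covered

# Toy instance (invented): markets reachable from candidate depot sites
markets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen, covered = greedy_max_coverage(markets, k=2)
print(covered)  # {1, 2, 3, 4, 5, 6, 7}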

Figure 1: Researchers at the Hasso Plattner Institute are working on energy-efficient data profiling. © HPI

The following are three examples of programming techniques that could lead to significant energy savings if widely implemented.

1. Energy-Efficient Data Profiling

Digital applications, like many new smart technologies, require perfectly organized mass data sets. However, the larger the volume of data, the more time and energy you need to process it. One of the main tasks of data engineers is to automatically organize data in a meaningful way so that the data sets can be used for artificial intelligence applications and other forms of analysis.

"Pure" data has no value. To make an impact, you need to categorize data in meaningful ways (explore the nature of the data), normalize the data (homogeneous structure), get rid of redundancies, and so on. Metadata plays an important role in this kind of task because it helps organize data for the value chain. In this sense, organizing data sets is the basis for all data-driven digital products and services.

Unique column combinations (UCCs) are an important aspect of data profiling. Successfully identifying UCCs can lead to more efficient processing in database systems. Until recently, it was only possible to identify UCCs for small to medium-sized data sets – and with considerable time overhead. For large data sets, UCCs were difficult to discover due to runtime and memory restrictions. Researchers at HPI have developed a novel algorithm for discovering UCCs (HPIValid) that reduces the UCC discovery runtime by several orders of magnitude while also reducing memory requirements compared with other algorithms.

On medium-sized data sets, HPIValid was 5 to 100 times faster and consumed only 5 to 20 percent as much memory on average, although the advantage narrowed for larger data sets. Whereas the previously used HyUCC algorithm was not able to detect UCCs in extremely large data sets at all, HPIValid can perform the computation within a reasonable runtime and with reasonable memory consumption, making it possible to identify UCCs for massive data sets [5].
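The HPIValid algorithm itself is beyond the scope of this article, but Listing 4 shows what UCC discovery means in the simplest possible terms: a brute-force search over an invented toy table. A column combination is a UCC if projecting every row onto it yields no duplicates; the exponential cost of enumerating combinations is precisely what HPIValid avoids.

Listing 4: Brute-Force UCC Discovery

from itertools import combinations

def find_minimal_uccs(rows, num_cols):
    # Enumerate column combinations by increasing size; a combination
    # is a UCC if no two rows agree on all of its columns.
    minimal = []
    for size in range(1, num_cols + 1):
        for combo in combinations(range(num_cols), size):
            if any(set(m) <= set(combo) for m in minimal):
                continue  # supersets of a known UCC are never minimal
            projection = [tuple(row[c] for c in combo) for row in rows]
            if len(set(projection)) == len(rows):
                minimal.append(combo)
    return minimal

table = [("alice", "berlin", 1),
         ("bob", "berlin", 2),
         ("alice", "potsdam", 3)]
print(find_minimal_uccs(table, 3))  # [(2,), (0, 1)]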

2. Energy Efficiency for Artificial Intelligence

The development of machine learning techniques and deep neural networks has brought crucial advances in the field of AI research. However, ever-improving deep learning algorithms translate to an ever-increasing need for energy, during training in particular, but also during execution.

Modern machine learning systems train neural networks, such as ResNet, using 32-bit arithmetic. However, reducing the complexity of AI models using quantization and pruning techniques can save energy: Rounding data values in deep learning models drastically reduces the energy consumption of AI systems.
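Listing 5 sketches the quantization idea in its simplest form, assuming NumPy and a random weight matrix as stand-in data: Weights are rounded to 8-bit integers plus a single scale factor. Production frameworks use more sophisticated schemes, but the principle is the same.

Listing 5: Post-Training Weight Quantization

import numpy as np

def quantize_int8(weights):
    # Map 32-bit floats to 8-bit integers plus one scale factor,
    # so that the largest weight magnitude lands on +/-127.
    scale = np.abs(weights).max() / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_int8(w)
print("max rounding error:", np.abs(w - dequantize(q, s)).max())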

In the extreme case, deep learning networks can be executed as binary neural networks (1-bit models). This approach reduces the effort in the individual computing steps and immediately generates energy savings by a factor of 20. Although binary neural networks are currently still about five percent less precise than today's best AI systems, they impress with their 95 percent reduction in energy consumption. When used millions of times a day, they can save enormous amounts of energy.
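Listing 6 is a minimal sketch of the binarization step, again assuming NumPy and random stand-in data.

Listing 6: Binarized Weights and Activations

import numpy as np

def binarize(x):
    # Reduce real values to {-1, +1} with the sign function.
    return np.where(x >= 0, 1, -1).astype(np.int8)

# With both operands in {-1, +1}, a multiply-accumulate collapses to an
# XNOR plus a popcount in hardware; plain integer arithmetic stands in
# for that kernel here.
w = binarize(np.random.randn(256))
a = binarize(np.random.randn(256))
print("binary dot product:", int(np.dot(w.astype(np.int32), a.astype(np.int32))))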

Table 1 shows the significant reduction in model size and number of operations for three variants of binary neural networks compared to a full-precision 32-bit ResNet network [6]. The loss of accuracy is moderate.

Table 1: Energy Saving in Deep Learning

Model Name          Accuracy        Size      Operations
ResNet (32-bit)     69.3 percent    46.8MB    1.61x10^9
BinaryDenseNet45    63.7 percent    7.4MB     3.43x10^8
BinaryDenseNet37    62.5 percent    5.1MB     2.70x10^8
BinaryDenseNet28    60.7 percent    4.0MB     2.58x10^8

3. Energy-Aware Computing at the Data Center

Data centers are considered to be the heart of digitalization. Cloud applications, streaming, complex simulations – everything runs in a data center. The resulting, ever-increasing energy consumption contributes significantly to the global CO2 footprint. Next-generation data centers include an increasingly diverse landscape of accelerators and hardware architectures, each offering advantages for specific classes of algorithms or application areas.

However, today's data center architecture and software largely ignore this level of heterogeneity. Running workloads on the most appropriate hardware can significantly improve energy efficiency. In a preliminary assessment at a research seminar on energy efficiency, participants were able to improve energy efficiency by a factor of more than 10 for weather simulation models by using Field Programmable Gate Array (FPGA) accelerators instead of general-purpose processors.

In this case, a weather simulation ran on different processors with different results (see Table 2). For this particular task, the Xilinx FPGA showed higher energy efficiency than the other processors. However, other tasks require different specialized hardware. Similarly, Qasaimeh et al. [7] demonstrated a 20-fold improvement in energy efficiency when using FPGAs for certain computer vision tasks. In contrast, computational tasks that rely heavily on floating-point math can often achieve higher performance and better energy efficiency on GPUs [8].

Table 2: More Efficiency with Special Processors

Device                    NVIDIA Tesla K20Xm    Intel E5-2630 v4    Xilinx XCKU060
Problem size (MCells)     256                   256                 1024
Throughput (MCells/s)     1127.88               1435.48             2209.35
Consumption (Watts)       60                    85                  9.5
Efficiency (MCells/Ws)    18.80                 16.89               232.56
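The efficiency row is simply throughput divided by power draw: The FPGA's 2209.35 MCells/s at 9.5 watts works out to 232.56 MCells/Ws, roughly 12 times the GPU's 18.80 MCells/Ws.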

This research opens up the question of how to optimally distribute computing operations in data centers based on heterogeneous computing resources in order to reduce energy consumption.
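As a thought experiment, Listing 7 sketches such a placement policy: Each workload class goes to the device with the best measured throughput per watt. Only the stencil figures come from Table 2; the workload classes and the float-heavy numbers are hypothetical, and a real scheduler would have to weigh far more factors.

Listing 7: Energy-Aware Workload Placement

def assign_tasks(tasks, efficiency):
    # Send each workload to the device with the best measured
    # throughput per watt (MCells/Ws) for its workload class.
    return {task: max(efficiency, key=lambda d: efficiency[d].get(kind, 0.0))
            for task, kind in tasks}

# Only the stencil column comes from Table 2; the float-heavy numbers
# are invented for illustration.
efficiency = {
    "Xilinx XCKU060 (FPGA)": {"stencil": 232.56, "float-heavy": 5.0},
    "NVIDIA Tesla K20Xm":    {"stencil": 18.80,  "float-heavy": 40.0},
    "Intel E5-2630 v4":      {"stencil": 16.89,  "float-heavy": 10.0},
}
tasks = [("weather-sim", "stencil"), ("fluid-sim", "float-heavy")]
print(assign_tasks(tasks, efficiency))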
