  linuxpromagazine.com » Issues » 2008 » 94 » OpenMP  

OpenMP Hands On

To use OpenMP in your own programs, you need a computer with more than one CPU or a multi-core CPU, plus an OpenMP-capable compiler. GCC has supported OpenMP since version 4.2. Also, the Sun compiler for Linux is free [2], and the Intel compiler is free for non-commercial use [3].

Listing 5 shows an OpenMP version of the classic Hello World program. To enable OpenMP, pass the -fopenmp option to GCC. Listing 8 shows the commands for building the program along with the output.

Listing 5

Hello, World

/* helloworld.c (OpenMP version) */
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the program also builds without OpenMP */
#define omp_get_thread_num() 0
#define omp_get_num_threads() 1
#endif

int main(void)
{
  int i;
  /* Distribute the four loop iterations across the thread team */
#pragma omp parallel for
  for (i = 0; i < 4; ++i)
  {
    int id = omp_get_thread_num();
    printf("Hello, World from thread %d\n", id);
    if (id == 0)
      printf("There are %d threads\n", omp_get_num_threads());
  }
  return 0;
}

Listing 8

Building Hello World

$ gcc -Wall -fopenmp helloworld.c
$ export OMP_NUM_THREADS=4
[...]
$ ./a.out
Hello, World from thread 3
Hello, World from thread 0
Hello, World from thread 1
Hello, World from thread 2
There are 4 threads

If you are using the Sun compiler, the compiler option is -xopenmp. With the Intel compiler, the option is -openmp (recent Intel releases spell it -qopenmp). The Intel compiler also tells the programmer when a loop has been parallelized (Listing 9).

Listing 9

Notification

$ icc -openmp helloworld.c
helloworld.c(8): (col. 1) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.

Benefits?

For an example of a performance boost with OpenMP, I'll look at a test that calculates pi [4] with the Gregory-Leibniz series (Listing 7 and Figure 5). This method is by no means the most efficient way to calculate pi; the goal here is not efficiency but to make the CPUs work hard.

Listing 7

Calculating Pi

/* pi-openmp.c (OpenMP version) */
#include <stdio.h>

#define STEPCOUNTER 1000000000

int main(void)
{
  long i;
  double pi = 0;
  /* Each thread sums into a private copy of pi;
     the copies are added together after the loop. */
#pragma omp parallel for reduction(+: pi)
  for (i = 0; i < STEPCOUNTER; i++) {
    /* pi/4 = 1/1 - 1/3 + 1/5 - 1/7 + ...
       To avoid the need to continually change
       the sign (s=1; in each step s=s*-1),
       we add two elements at the same time. */
    pi += 1.0/(i*4.0 + 1.0);
    pi -= 1.0/(i*4.0 + 3.0);
  }
  pi = pi * 4.0;
  printf("Pi = %f\n", pi);
  return 0;
}

Parallelizing the for() loop with OpenMP pays off (Listing 6): The program runs about twice as fast with two CPUs as with one (15.8 seconds instead of 31.4), because more or less the whole calculation can be parallelized.

Listing 6

Parallel Pi

$ gcc -Wall -fopenmp -o pi-openmp pi-openmp.c
$ export OMP_NUM_THREADS=1 ; time ./pi-openmp
Pi = 3.141593
real    0m31.435s
user    0m31.430s
sys     0m0.004s
$ export OMP_NUM_THREADS=2 ; time ./pi-openmp
Pi = 3.141593
real    0m15.792s
user    0m31.414s
sys     0m0.012s

If you monitor the program with the top tool, you will see that both CPUs are busy and that the pi-openmp program uses 200 percent CPU.

The speedup will not be as pronounced for every problem. If a large proportion of a program has to run serially, the second CPU is not much help, and the performance boost will be less significant. Amdahl's Law [5] (see the "Amdahl's Law" box for an explanation) applies here.

Amdahl's Law

"Speedup" describes the factor by which a program can be accelerated with parallelization. In an ideal case, program execution with N processors would take just 1/N of the time required by a serial program. This ideal case is known as linear speedup. In the real world, linear speedup often is impossible to achieve because some parts of a program do not particularly lend themselves to parallelization.

Given the fraction of a program that can be parallelized, P (so that 1 – P is the serial fraction), and the number of processors available, N, the maximum speedup is calculated by the formula in Figure 6.

If the serial part of the program (1 – P) is 1/4, the speedup cannot be greater than 4, no matter how many processors you use.


Figure 6: Amdahl's Law.

Glossary

SMP

Symmetric multi-processor system. All of the machine's CPUs can access the shared main memory – in contrast to cluster systems, in which separate machines exchange data over the wire. OpenMP is suitable for parallel programming on SMP systems.

Thread

One popular definition of a thread is a "lightweight process." A Unix process has a separate memory area, and various resources are assigned to it, such as environment variables, network connections, or device access. A thread shares memory and certain other resources with the other threads in its process. This reduces the management overhead compared with processes and makes switching between threads cheaper. Pressing Shift+H in the top tool toggles the thread display.



Comments

Please reference wiki-diagrams

User_A1 Apr 19, 2011 3:18pm GMT

I know this is old, but you have used a picture of mine.

The fork-join diagram was released under a CC-BY-SA licence. Please attribute to the wiki-page.
