Creating parallel applications with the Julia programming language

Initialization

In addition to the functions for creating arrays, Julia supports complicated initialization functions, such as:

DArray(init, dims[, procs, dist])

Programmers need to supply the init function, which is given a tuple of index ranges for which the array is then initialized. dims specifies the array's dimensions.

Programmers can use dist to determine how the array will be distributed across the various processes. However, this manual declaration isn't generally necessary because Julia sorts it automatically. The example in Listing 1 creates a distributed random matrix and then returns the part for which Process 2 is responsible.

Listing 1

Distributed Arrays

01 julia> @everywhere using DistributedArrays
02
03 julia> d=drand(10,10);
04
05 julia> r=@spawnat 2 begin
06  localpart(d)
07  end
08
09 RemoteRef(2,1,23)
10
11 julia> fetch(r)
12 5x5 Array{Float64,2}:
13  0.932366 0.181326 0.0261475 0.211363 0.308226
14  0.330008 0.924271 0.543386 0.895825 0.617452
15  0.51471 0.801718 0.786854 0.174585 0.413264
16  0.840219 0.750775 0.126154 0.853618 0.899762
17  0.866529 0.804654 0.19808 0.49411 0.951355

The practical thing about these distributed arrays is that Julia hides all communications from the user. If, for example, a user wants to calculate the sum of all the elements of the d matrix, that user just needs to call sum(d). sum(d) delivers the result directly. Julia adds all the elements of the subarray with each individual process. All partial results to the total are then added.

Julia also offers structures for easily distributing loops. The following few lines, for example, add generated random numbers using rand():

julia> r = @parallel (+) for i=1:200000000
 rand()
 end

The @parallel macro distributes the loop to all the available processes. The return value for each loop is the last expression in the loop, in this case, the rand() statement for generating a random number. The example above then adds the results for each loop run. Programmers can also omit the reduction operator (in this case, the addition). Julia then distributes the loop in parallel without reducing the results at the end.

The pmap() function is useful for such cases. pmap() simply executes a function for a specific object (i.e., a typical map task). The following example uses the pmap() statement to compute the rank of matrices:

julia> M = [rand(1000,1000) for i=1:4];
julia> pmap(Rang,M)
4-element Array{Any,1}:
 1000
 1000
 1000
 1000

The first statement here creates four 1000x1000 matrices. pmap() then calculates the rank for each matrix. Because Julia is started with -p 4, a worker process is responsible for each matrix. It is just as easy to calculate the characteristics of other matrices. For example, the det() function calculates the determinant, and the inv() function calculates the inverse of a matrix.

Monte Carlo Pi Calculation

Monte Carlo calculations are generally very easy to parallelize because the individual calculation steps are mostly independent of each other.

For Monte Carlo-based calculations with the number pi, programmers need to generate a lot of random numbers in a square surrounding the unit circle (radius 1). These random numbers are generated uniformly between -1 and +1 for the x- and y-coordinates.

The ratio of the area of the unit circle to the square is now just pi*1*1/(2*2). This means that exactly the proportion pi/4 of the random numbers should be in the circle. If you now count them in a loop, you will get the Monte Carlo estimate for pi. In the following example:

pi_MC = N_in / N_tot * 4

N_in is the number of random numbers in the unit circle, and N_tot the total number of random numbers generated in the x-y plane. The higher N_tot, the closer pi_MC is to pi. The best thing is, therefore, to generate loads of random numbers to obtain a very precise value for pi.

Listing 2 shows an implementation of this method in Julia. The program executes the central loop using @parallel and adds the loop result (1 or 0) at the end in a reduction step. This code runs on a processor in less than two minutes on the test system:

time = 116.90136814117432 seconds
pi estimate = 3.14162308

Anyone who uses two cores will speed up the example by nearly a factor of 2:

time = 73.63914084434509 seconds
pi estimate = 3.141607444

The method converges very slowly, but the first two digits of pi (3.14) have already been calculated correctly.

Listing 2

Monte Carlo Estimation

01 N_tot = 1000000000
02
03 tstart = time()
04
05 N_in = @parallel (+) for n=1:N_tot
06         x = rand() * 2 - 1
07         y = rand() * 2 - 1
08
09         r2 = x*x + y*y
10         if r2 < 1.0
11                 1
12         else
13                 0
14         end
15 end
16
17 tstop = time()
18
19 pi_MC = N_in/N_tot*4.0
20
21 println("time        = ", tstop-tstart, " seconds")
22 println("pi estimate = ", pi_MC)

A Second Example: Julia with Julia

The parallel calculation of a Julia fractal with Julia is a more complicated example. Martin Rupp originally wrote the program based on the official Julia Mandelbrot example [2]. However, the version shown in Listing 3 is modified because the syntax has changed.

Listing 3

Julia Set

01 #DistributedArrays, WIDTH, HEIGHT and MAXITER are global
02 @everywhere using DistributedArrays
03 @everywhere WIDTH=1000
04 @everywhere HEIGHT=500
05 @everywhere MAXITER=100
06
07 #Images is local
08 using Images
09
10
11
12 # Julia function
13 @everywhere function julia(z, maxiter::Int64)
14  c=-0.8+0.156im
15  for n = 1:maxiter
16  if abs(z) > 2.0
17  return n-1
18  end
19  z = z^2 + c
20  end
21  return maxiter
22 end
23
24
25
26 # Init function for creating the array
27 @everywhere function parJuliaInit(I)
28  # local patch
29  d=(size(I[1], 1), size(I[2], 1))
30  m = Array(Int, d)
31
32  xmin=I[2][1]
33  ymin=I[1][1]
34
35  for x=I[2], y=I[1]
36  c = complex((x-WIDTH/2)/(HEIGHT/2), (y-HEIGHT/2)/(HEIGHT/2))
37  m[y-ymin+1, x-xmin+1] = julia(c, MAXITER)
38  end
39  return m
40 end
41
42
43
44 # creates distributed array
45 Dm = DArray(parJuliaInit, (HEIGHT, WIDTH))
46
47 # fetches distributed array on the local process
48 m = convert(Array, Dm)
49
50 saves #image as PNG
51 imwrite(grayim(transpose(m)/(1.0*MAXITER)),"juliaset.png")

The mathematical idea of a Julia fractal is very simple. Consider the sequence of complex numbers zn+1 = zn + c for an arbitrary complex constant c. Each c yields a particular Julia set. Typical Julia images are created by using a counter. The more iterations are needed in the Zn-series to reach a certain threshold value, the brighter the associated item appears. The two-dimensional complex plane can be easily mapped to the x- and y-coordinate of a pixel grid through which the known pictures are created.

Julia can easily deal with complex numbers, which is why the implementation is particularly simple. The code is complicated by parallelization because distributed arrays are used. Depending on the process ID, part of the Julia image is calculated by another process. An advantage of parallelization is that the calculation of a certain pixel is independent of its surroundings. The situation would be more complicated if the characteristic of one pixel also depended on the environment of the pixel. So, using

~/julia/julia -p 4 juliaset.jl

it is possible to easily execute code (see Figure 2).

Figure 2: Julia calculates Julia sets with distributed arrays.

Buy this article as PDF

Express-Checkout as PDF
Price $2.95
(incl. VAT)

Buy Linux Magazine

SINGLE ISSUES
 
SUBSCRIPTIONS
 
TABLET & SMARTPHONE APPS
Get it on Google Play

US / Canada

Get it on Google Play

UK / Australia

Related content

  • Flam3 and Fractal Fr0st

    Few fractal algorithms create as beautiful and ethereal structures as Flam3. The Fr0st GUI helps you master the complex software.

  • Linux News
    • Android gains market share
    • VMware Workstation
    • Perforce on Demand
    • openSUSE 12.2 released
    • CUDA 5 rc
    • Web Foundation index measures world
    • New Z shell release first since 2004
  • Linux Mint 10 "Julia" is now available!

    The popular, Ubuntu 10.10 based, Linux distribution released Linux Mint 10 today.

  • OpenMP

    OpenMP brings the power of multiprocessing to your C, C++, and Fortran programs.

  • New C++ Features in GCC

    Recent versions of the GNU compiler include new features from the next C++ standard.

comments powered by Disqus

Direct Download

Read full article as PDF:

Price $2.95

News