Application development for the Cell processor
Random
The PPE program contains a loop (Listing 2) that distributes the workload over the participating SPEs and derives a seed for the pseudo-random number generator from the current system time. Launching a program on an SPE takes three steps. First, the spe_context_create() function (line 7) creates an SPE context. Second, the spe_program_load() function (line 8) specifies the program to execute; for this, the programmer must declare a variable of type spe_program_handle_t in the PPE program. This variable is always declared as external, that is, outside of any function, and its name must match the name the SPE program is given later at build time.
Listing 2
For Loop for Controlling the SPEs
Third, the spe_context_run() function launches the program. Because this call blocks the PPE program for as long as the SPE program is running, calling it directly would prevent any other SPE programs from being launched in parallel. A POSIX thread avoids this problem: The thread executes the spu_pthread() function (line 10), which in turn launches the SPE program without stalling the PPE program flow.
Now the SPE program needs to know where the parameters for the forthcoming calculations are located. Each SPE has a mailbox for incoming messages (four 32-bit words) and a mailbox for outgoing messages (one 32-bit word). Another mailbox triggers a software interrupt when data is available. Here, the PPE program calls spe_in_mbox_write() (line 13) to pass in the start address of the array in which the parameters for the calculations are stored. The SPE context, passed as the first function argument, determines which SPE receives the message.
When all SPE programs have terminated, the PPE program releases the memory occupied by each SPE context. Finally, the PPE program prints the SPEs' results to the console.
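The workload distribution described above can be sketched in portable C, without any libspe2 calls. This is only an illustration under stated assumptions, not the article's Listing 2: the spe_par_t layout, the distribute() helper, and the choice of giving each SPE a distinct seed (base seed plus index) are all hypothetical. The block is padded to 16 bytes because DMA transfers on the Cell work best with 16-byte multiples.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical parameter block, one per SPE (16 bytes in total). */
typedef struct {
    uint32_t seed;   /* seed for the SPE's random number generator */
    uint32_t pairs;  /* number of random pairs this SPE should test */
    float    pi;     /* result written back by the SPE */
    uint32_t pad;    /* padding so the block is 16 bytes long */
} spe_par_t;

/* Split total_pairs as evenly as possible over num_spes blocks and
 * give every block its own seed derived from base_seed. */
void distribute(spe_par_t *par, int num_spes,
                uint32_t total_pairs, uint32_t base_seed)
{
    assert(par != 0 && num_spes > 0);

    uint32_t n     = (uint32_t)num_spes;
    uint32_t share = total_pairs / n;  /* pairs every SPE gets */
    uint32_t rest  = total_pairs % n;  /* leftovers for the first SPEs */

    for (uint32_t i = 0; i < n; i++) {
        par[i].seed  = base_seed + i;
        par[i].pairs = share + (i < rest ? 1u : 0u);
        par[i].pi    = 0.0f;
        par[i].pad   = 0u;
    }
}
```

In a real PPE program, base_seed could come from time(NULL), and the address of each par[i] would be what the PPE writes to the corresponding SPE's mailbox.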
SPE Culture
The SPE's work is done by the compute_pi() function (Listing 3). It expects two arguments: a seed, which it uses to generate random numbers, and the number of pairs of numbers to calculate. Its return value is an approximation of pi. To obtain these arguments, the main() function (Listing 4) first reads the main memory address at which the structure with the parameters for the current SPE program is located. This address is also referred to as an effective address.
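For readers without the listing at hand, the following portable C sketch shows how a function with this interface could work. It assumes the usual Monte Carlo approach (count how many random points in the unit square fall inside the quarter circle), which the text suggests but does not spell out. The name compute_pi_sketch() and the small linear congruential generator are assumptions made for this illustration; they are not taken from Listing 3.

```c
#include <assert.h>
#include <stdint.h>

/* Minimal linear congruential generator (illustrative only). */
static uint32_t lcg_next(uint32_t *state)
{
    *state = *state * 1664525u + 1013904223u;
    return *state;
}

/* Map the top 24 bits of the next random value to [0, 1). */
static float lcg_unit(uint32_t *state)
{
    return (float)(lcg_next(state) >> 8) / 16777216.0f;
}

/* Estimate pi from `pairs` random (x, y) points in the unit square:
 * the fraction that lands inside the quarter circle approaches pi/4. */
float compute_pi_sketch(uint32_t seed, uint32_t pairs)
{
    assert(pairs > 0);

    uint32_t state = seed;
    uint32_t hits  = 0;

    for (uint32_t i = 0; i < pairs; i++) {
        float x = lcg_unit(&state);
        float y = lcg_unit(&state);
        if (x * x + y * y <= 1.0f)
            hits++;
    }
    return 4.0f * (float)hits / (float)pairs;
}
```

Because the generator is deterministic, the same seed and pair count always yield the same estimate, which is why each SPE needs its own seed if the partial results are to be independent.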
Listing 3
compute_pi Function
Listing 4
Main Function in the SPE Program
Because the spu_read_in_mbox() function reads only a single 32-bit word per call, it must be called twice to retrieve the full 64-bit address (lines 7 and 8). Because the Cell processor uses a big-endian architecture, the first word contains the high-order bits and the second word the low-order bits. All variables declared inside the SPE program reside in the SPE's local memory, and pointers likewise reference addresses in local memory. The SPE program therefore cannot simply dereference the effective address; it has to fetch the data from main memory by DMA.
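How the two mailbox words relate to the 64-bit effective address can be shown in plain C. This is a sketch with hypothetical helper names (ea_join() and ea_split()); on the SPE, the two words would come from two spu_read_in_mbox() calls, and on the PPE they would be written in the same order, high word first.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the high and low 32-bit words into one 64-bit address. */
uint64_t ea_join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Split a 64-bit address into the two words sent via the mailbox. */
void ea_split(uint64_t ea, uint32_t *hi, uint32_t *lo)
{
    assert(hi != 0 && lo != 0);
    *hi = (uint32_t)(ea >> 32);          /* sent first  */
    *lo = (uint32_t)(ea & 0xFFFFFFFFu);  /* sent second */
}
```

Keeping the address as two separate words also matches the DMA call described next, which takes the high and low halves of the effective address as separate arguments.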
Next, the SPE program must reserve a tag ID to identify its DMA transfers between main memory and local memory (line 10). An SPE can manage up to 32 tag IDs. Following this, the spu_mfcdma64() function transfers the parameter block located at the main memory address previously retrieved from the mailbox to the spe_par variable in local memory (line 12). This function handles both read and write DMA transfers; the sixth argument defines the transfer direction, as a comparison with line 18 shows.
The spu_mfcdma64() function does not wait for the memory transfer to complete. To ensure data integrity, the SPE program must wait until the DMA controller (Memory Flow Controller, MFC) has finished; the mfc_read_tag_status_all() function (line 14) takes care of this. The mfc_write_tag_mask() function (lines 19 and 20) tells the MFC which of the 32 possible parallel DMA transfers to wait for.
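The tag mask is a 32-bit word with one bit per tag ID. The following portable sketch only shows how such a mask is built; tag_mask() is a hypothetical helper, and on the SPE the resulting value would be what the program hands to mfc_write_tag_mask() before waiting.

```c
#include <assert.h>
#include <stdint.h>

/* Return the mask bit that selects the given tag ID (0 to 31). */
uint32_t tag_mask(unsigned tag_id)
{
    assert(tag_id < 32u);
    return (uint32_t)1 << tag_id;
}
```

Masks for several tag IDs can be combined with a bitwise OR, for example tag_mask(3) | tag_mask(5), to wait for more than one group of transfers at once.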
Now the calculations can start, and the results, which are again stored in the spe_par structure, make their way back into main memory. Finally, line 22 releases the tag ID.
Instilling Life
Creating the object files is the next step. Because the PPE and SPE processor cores use different instruction sets, two different compilers must be used to build the source:
/opt/cell/toolchain/bin/spu-gcc -o pi_libspe_spe.spuo pi_libspe_spe.c
/opt/cell/toolchain/bin/ppu-gcc -c pi_libspe_ppe.c
The .spuo suffix indicates an object file based on the SPE instruction set. To create a single executable, the ppu-embedspu tool converts the SPE program's object code into a format that the PPE can read:
/opt/cell/toolchain/bin/ppu-embedspu pi_libspe_spe pi_libspe_spe.spuo pi_libspe_spe.o
The first parameter is the name used by the PPE to address the SPE program; it is identical to the name of the spe_program_handle_t type variable, which is declared in the pi_libspe_ppe.c source file.
The second parameter is the name of the file containing the SPE object code, and the third refers to the file where ppu-embedspu will write the PPE-readable object code. Finally, the developer must link the PPE and SPE programs with the libspe2 library to create an executable:
/opt/cell/toolchain/bin/ppu-gcc -o pi_libspe pi_libspe_ppe.o pi_libspe_spe.o -lspe2
If you have access to a computer with Cell hardware, you can simply copy the pi_libspe executable to it and execute the program. If you are using the simulator, you will need to take a small detour.