Parallel and Distributed Computer Architecture

Computer Architecture and Systems

Parallel computing is the simultaneous execution of the same task (split up and specially adapted) on multiple processors in order to obtain faster results. There are many different kinds of parallel computers (or "parallel processors"). Flynn's taxonomy classifies parallel (and serial) computers according to whether all processors execute the same instructions at the same time (single instruction/multiple data -- SIMD) or each processor executes different instructions (multiple instruction/multiple data -- MIMD). They are also distinguished by the mode used to communicate values between processors. Distributed memory machines communicate by explicit message passing, while shared memory machines have a global memory address space, through which values can be read and written by the various processors.
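The two communication modes described above can be sketched in a few lines of Python. This is only an illustration of the programming models, not of real parallel hardware: the helper names (`chunks`, `sum_shared_memory`, `sum_message_passing`) are hypothetical. Python threads all live in one address space, so the message-passing version merely mimics the discipline a distributed memory machine would enforce. CPython's global interpreter lock also means this sketch will not actually run faster than a serial sum.

```python
import queue
import threading


def chunks(data, n_workers):
    """Split data into n_workers nearly equal contiguous slices."""
    size, extra = divmod(len(data), n_workers)
    parts, start = [], 0
    for i in range(n_workers):
        end = start + size + (1 if i < extra else 0)
        parts.append(data[start:end])
        start = end
    return parts


def run_workers(worker, parts):
    """Start one thread per slice and wait for all of them to finish."""
    threads = [threading.Thread(target=worker, args=(p,)) for p in parts]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def sum_shared_memory(data, n_workers=4):
    """Shared-memory style: every worker writes to one shared total."""
    total = [0]               # a location all workers can read and write
    lock = threading.Lock()   # synchronization to avoid lost updates

    def worker(part):
        local = sum(part)
        with lock:
            total[0] += local

    run_workers(worker, chunks(data, n_workers))
    return total[0]


def sum_message_passing(data, n_workers=4):
    """Message-passing style: workers send partial sums explicitly."""
    mailbox = queue.Queue()   # stands in for the interconnection network

    def worker(part):
        mailbox.put(sum(part))  # explicit "send" of a private result

    run_workers(worker, chunks(data, n_workers))
    # The coordinator "receives" exactly one message per worker.
    return sum(mailbox.get() for _ in range(n_workers))
```

In the shared-memory version the workers coordinate through a lock around a common variable. In the message-passing version no worker touches another's data, and all communication is an explicit send and receive.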

The fastest supercomputers are parallel computers that use hundreds or thousands of processors, and the largest use many more. In June 2008, the fastest computer in the world was a machine called "Roadrunner," built by IBM for the Los Alamos National Laboratory. It has more than 100,000 processors, and can compute more than one quadrillion (10^15) floating point operations per second (one petaflop/s). Of course, only very large problems can take advantage of such a machine, and they require significant programming effort. One of the research challenges in parallel computing is how to make such programs easier to develop.
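To put the petaflop/s figure in perspective, the following back-of-the-envelope sketch divides a machine rate of 10^15 floating point operations per second across 100,000 processors, then compares run times. The function names and the 10^18-operation workload are hypothetical. The calculation assumes perfect scaling with no communication overhead, which real programs do not achieve.

```python
PETAFLOP_PER_S = 10**15   # one petaflop/s, in floating point operations per second
N_PROCESSORS = 100_000    # round figure taken from the text


def average_rate_per_processor(machine_rate, n_processors):
    """Average rate each processor must sustain, assuming an even split."""
    return machine_rate / n_processors


def seconds_to_finish(n_operations, rate):
    """Idealized run time: total operations divided by sustained rate."""
    return n_operations / rate


rate_each = average_rate_per_processor(PETAFLOP_PER_S, N_PROCESSORS)
workload = 10**18  # hypothetical problem size, in floating point operations

print(rate_each)                                    # 1e10 operations per second
print(seconds_to_finish(workload, PETAFLOP_PER_S))  # 1000.0 seconds on the full machine
print(seconds_to_finish(workload, rate_each))       # 1e8 seconds on one processor
```

Under these idealized assumptions, a job that would occupy a single processor for roughly three years finishes in under twenty minutes on the full machine. That is why only very large problems justify such a system.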

The challenge for parallel computer architects is to provide hardware and software mechanisms to extract and exploit parallelism for performance on a broad class of applications, not just the huge scientific applications used by supercomputers. Reaching this goal requires advances in processors, interconnection networks, memory systems, compilers, programming languages, and operating systems. Some mechanisms allow processors to share data, communicate, and synchronize more efficiently. Others make it easier for programmers to write correct programs. Still others enable the system to maximize performance while minimizing power consumption.

With the development of multicore processors, which integrate multiple processing cores on a single chip, parallel computing is increasingly important for more affordable machines: desktops, laptops, and embedded systems. Dual-core and quad-core chips are common today, and we expect to see tens or hundreds of cores in the near future. These chips require the same sorts of architectural advances discussed above for supercomputers, but with even more emphasis on low cost, low power, and low temperature.

Associated Courses