The HELIX Research Project
By conventional definition, a parallel program is either expressed in terms of explicit parallel threads or tasks, or else is heavily annotated to guide compilers in mapping its data and control structures to parallel hardware.
Research in recent years (such as the Liberty project at Princeton), however, has shown that in a very practical sense, every program is a parallel program, even one that has been designed and implemented with sequential semantics. Every long-running program depends on loops, and an increasing body of work demonstrates that automatic parallelization of loops, without help from the programmer, can lead to substantial speedup of the overall program. Since multicore microprocessors are now at the heart of devices from cell phones to supercomputers, it is important that these research demonstrations be translated soon into mainstream compilers and run-time software.
That is the goal of our HELIX project. HELIX starts with a simple idea for loop transformation: to run a loop in parallel, assign its separate iterations to separate processing elements (cores). In general, the cores that handle separate iterations must communicate, both to synchronize and to exchange data. So successful parallelization of a loop depends on whether the benefit of running it in parallel outweighs the communication costs. When the separate iterations are independent, or nearly so, this simple approach to loop parallelization scales well with the number of available cores.
The reason this approach has not been more widely used is that historically the cost of communication between processing elements has swamped the benefits of running in parallel. Now that a powerful multiprocessor can come on a single chip, inter-core communication costs are greatly reduced, and the trend is towards even greater reduction.
To show that the HELIX loop transformation is practical on a current commodity processor, we implemented a prototype compiler that parallelizes ordinary sequential code, including programs with irregular control and data behavior.