Two giants back initiatives toward new multicore models
Microsoft stressed its vision for adding new layers to its system software stack and pointed to extensions to its .Net environment. Meanwhile, Intel said it plans extensions to its x86 instruction set and showed progress on Ct, a set of extensions to the C++ language aimed at supporting greater parallelism.
From the advent of computing, software got a free ride as Moore's Law drove serial processors to ever greater performance. "But growing problems of power leakage in MPUs have driven a shift to putting more cores on a die, forcing a historic transition to a parallel programming model yet to be invented," said David Callahan, who leads Microsoft's parallel computing initiative, announced late last year.
Microsoft and Intel are backing various academic research initiatives to help plow the way forward. At IDF, they shared some of the progress from their internal corporate teams.
As if this job were not ambitious enough, Microsoft hopes to use the parallel shift to enable advances in computer interfaces.
"This is really about a new set of natural and immersive experiences we want to deliver," said Callahan. "The parallel computing shift is like an accident along the way," he added.
The underlying software plumbing needs an overhaul before such work can begin. Callahan noted that the next generation of system software will be split into more distinct layers, including new runtime environments that sit in user space below application libraries and above hypervisors and the core OS kernel.
The runtime environments will act as schedulers, working cooperatively with hypervisors that map virtual to physical resources and OSes that manage access to physical hardware. "This represents a refactoring of traditional OS services," he added.
The objective is to better handle the increasing number of competing requests in multicore environments. Even today's PCs host a "terrifying number" of processes running in parallel, creating sequential-processing bottlenecks and losses in data locality, he said.
Need for further collaboration
Microsoft will expose its runtime layer to third parties including Intel because it expects there will be a need for many kinds of interoperable software abstractions from different vendors to serve different application types. Tomorrow's software also calls for improved techniques in cooperative scheduling, better thread-level performance and enhanced message passing.
"There are a deep set of changes before you can even get to rebuilding libraries and rewriting applications," said Callahan.
"This is an ambitious shift and this is simply the first cut," said Michael McCool, chief scientist, RapidMind, which sells parallel programming tools for the x86 and other processors.
"Initially, they have done some of the obvious things supporting parallel tasks, but I haven't seen anything about efforts to abstract data," he added.
"The future's parallel programming model will need new categories for sorting data so that it can be marshalled into appropriate locations in cache at the right time," he noted. He stressed that Intel's latest high speed processor interconnect significantly reduces latency, but if the wrong data appears in cache, latency can shoot up dramatically.
In the area of programming tools, Callahan said Microsoft is making extensions to its .Net environment based on its C# 3.0 language. Intel said it will release beta versions of four new parallel programming tools in November.
"Programmers will need a whole new toolset to help debug, optimize and validate parallel codes," Callahan said. Meanwhile, McCool said, "Debugging has to move from single-step to visualization tools that capture trends in thousands of synchronized tasks,"
On the language front, Intel talked about Ct, an extension of C++ for multicore processors. The language seeks to automate the job of splitting processing tasks across many cores without the programmer knowing the details of x86 architecture.
The language delivers speedups of 1.7 to 3.7 times on code running on four-processor systems, according to data shown by Anwar Ghuloum, principal engineer in Intel's corporate technology group. Ct was initially geared toward Intel's general-purpose Nehalem quad-core chips, but is now up and running on its prototype 16-core Larrabee graphics processors.
"RapidMind and Ct are pointing in the same direction, but we have been around longer as a mature commercial offering while Ct is still essentially a research API," said McCool.
Intel also discussed its Advanced Vector Extensions, instruction set extensions that will replace the Streaming SIMD Extensions currently used in its processors.
AVX is expected to provide a superior environment for parallel programming compared with SSE, boosting floating-point performance and adding wider SIMD units. However, it is not expected to be fully implemented until Intel's Sandy Bridge processors, a 32nm family debuting probably in 2010, a full two generations out from today's Nehalem CPUs.
Separately, Intel disclosed a new feature in its Nehalem processors to optimize performance when some of their cores are not being used. The feature can automatically shut off one or more cores when they are not being used and bolster the amount of chip-level power available to the remaining cores that are running.
The technique involves a new transistor design with high off-state resistance that helps reduce even the leakage current from a core that is turned off. It also employs a million-transistor on-chip controller and sensors on the processor.
"The more powder constrained you are, the bigger the performance boost," said Rajesh Kumar, an Intel fellow heading up power management on Nehalem.
Archrival AMD has had the capability to run cores independently via separate power planes on its processors. Previously, Intel had said such features do not deliver significant power savings.
- Rick Merritt