Redefine algorithm design for energy-efficient machines
Chip and application developers ought to work together as smaller, cheaper, faster systems continue to raise the bar on energy efficiency, said Mark Horowitz in his keynote at the International Solid-State Circuits Conference (ISSCC). The Stanford professor believes that addressing stalled clock frequencies and rising power consumption requires a combination of specialised silicon and better algorithms.
Thermal limits halted power scaling in the mid-2000s, capping chips at roughly 100W in desktops and servers, 30W in laptops, and up to 3W in cell phones, according to Horowitz.
He added that, with power now capped, the only way to increase performance is to decrease the energy per operation in new ways. The veteran researcher said he isn't hopeful that engineers will find a new technology to replace power-limited CMOS and, instead, advocated for specialised processing across multiple cores.
"I can see why people get multi-core processors. Instead of one processor, put four processors so the performance curve is going to shift. By backing off on peak performance on a per processor basis, we can lower the energy per op which allows us to put more processors per die in the same energy budget," said Horowitz.
Dedicated hardware for specialised application processing can be 1,000 times more energy efficient than a general-purpose processor, Horowitz said. He pointed to stencil applications, where the output of one operation is forwarded directly into the next compute-intensive stage.
Achieving the highest energy efficiency requires a very specific combination of very low-energy operations and extreme locality, according to Horowitz's paper. These levels are possible only if the application works on short integer data (8 to 16 bits), completes tens of operations for each local memory fetch, and completes roughly 1,000 operations for every DRAM fetch.
"If you think about applications that don't access memory very much, you can see why specialisation can help," Horowitz continued. "Specialisation is not so much about hardware, but you have to move algorithms to a much more restricted space."
By restricting algorithms to specific kinds of processing, Horowitz believes developers will be able to build a general engine that can handle those tasks more efficiently. To get to this point, Horowitz said the industry will have to change the way it does algorithm design.
"If we want algorithm designers to play and create better computing devices, we have to minimise the cost to them to do exploration. We have to give them a much higher level development platform in which to play," he said, adding that it's possible to make an app store for hardware. "I don't think it's inconceivable that we could do the same thing [as the Apple store]. We can take sets of hardware and build a strong environment so designers can write code."
Not all problems require the "bleeding edge" of efficiency, however, and many applications will continue to work with current processors. In his paper, Horowitz said the people who know how to use current, adaptable parts are "likely a distinct group from the people who have applications that they want to implement."
"If technology is scaling more slowly, there's not going to be a killer microprocessor that outdoes you in two years when your product first comes out," Horowitz said. "The techniques that we need to do design are stabilising, so let's codify those and make them easier to do."
- Jessica Lipsky