« Previously: Upstart energy fuels Hot Chips event  

Start-up InVisage Technologies has been working for a decade on a higher-quality image sensor using quantum dots that could replace today’s CMOS sensors. It believes its QuantumFilm is ready to roll for multispectral cameras in drones, VR headsets, self-driving cars and other systems.

Figure 1: QuantumFilm (Source: InVisage)

The company uses front-side illumination and a thin coat of quantum dots that takes less space than today’s imagers (above). The highly sensitive dots provide wider dynamic range and work across a broader spectrum than silicon, Emanuele Mandelli, the company’s vice president of engineering, said in a talk.

It’s been a long slog over many hurdles, he said. The company had to identify the right materials, create a fab that can evenly spin-coat standard TSMC wafers with the quantum dot fluid and create a protective but electrically readable layer for the dots. “It took a lot of time and money, but we are at the point where we solved all the technical problems,” he said.

Samsung describes custom ARM core

Samsung described the Exynos M1, its first custom ARM core, running at 2.6GHz on less than 3W in a 14nm process. The smartphone chip is rumoured to have had its start in an ARM server project in Austin that was later cancelled, shifting the IP to the mobile group.

“There are various areas we improved over the ARM A57, and other areas we could do more, but for a first effort, we did well putting out a competitive part for a next-generation phone,” said Brad Burgess, chief architect of the chip. “We’re not sitting on our laurels—there’s more to come,” said Burgess, a microprocessor veteran with designs in all three major game consoles and the Mars Rover.

For its part, ARM described its next-generation Mali GPU core headed for a market where it shipped 750 million cores last year. The G71 is the first member of the new Bifrost GPU generation sporting a new instruction set architecture, ALU and CPU-to-GPU cache coherency.

“We’ve taken the best bits of our old architecture and added a whole bunch of new features with new shader core that’s more scalable and a new GPU architecture that will last for years–it has much better energy efficiency, bandwidth and better silicon area utilisation,” said Jem Davies, an ARM fellow and vice president of technology.

Big iron clashes on the network

Intel described its Omnipath systems, the result of multiple acquisitions of networking technology for high-performance computing. The systems pack switch ASICs that drive dense arrays of 100Gbit/second links at latencies as low as 100 nanoseconds in director-class systems that reach up to 20U in size.

Figure 2: Omnipath systems (Source: Intel)

The same day as the Intel talk, the rival Infiniband Trade Association released its long-term road map, with plans for 200G HDR links in 2018 and concepts for three generations beyond. Like Intel, Infiniband proponent Mellanox has been bulking up to become a vertically integrated supplier of networking chips and systems, including through its latest deal to buy EZChip, which acquired network processor designer Tilera in 2014.

Both sides face a significant shift in next-generation systems, which will migrate from 28 to 56Gbit/s serdes. The new latency-sensitive serdes will need substantial forward error correction capabilities, forcing engineers to sharpen their pencils.

China’s Phytium shows working server silicon

Figure 3: Mars chip (Source: Phytium)

A year after it announced plans for a massive 64-core ARM server processor, start-up Phytium of Guangzhou came to Hot Chips with working silicon of its 2GHz Mars chip. The company hopes yields of the chip in a 28nm TSMC process are high enough that it can ship parts before the end of the year.

For a year, the start-up has been shipping four- and 16-core Earth variants of the design for Linux laptops and desktops, respectively. So far sales have not cracked five digits, but the company hopes it can ride the popularity of Kylin Linux and productivity apps, as well as government mandates for state-run companies to switch to China-made computers.

Phytium’s 200 engineers have their work cut out for them. The Earth chips significantly lag the performance of Intel Haswell and Skylake processors, so the company has started work on a next-generation core. Meanwhile, it is also building symmetric multiprocessing into its next-generation Mars, which so far supports only single-socket systems. Mars-2 also needs to integrate the memory controllers and L3 cache that are currently external.

Server silicon beefs up security

Server processors are dedicating more gates to security. AMD, IBM, Intel and Oracle all mentioned expansions, although the function still occupies only a tiny patch of silicon area.

Oracle’s Sparc M7 uses hardware and system calls to paint one of 14 “colours” on all data in memory. Data can be accessed only through a pointer that carries the matching colour.
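The colour check can be pictured with a small software model. This is only a sketch of the general idea, not Oracle’s implementation: the class and method names (`ColouredMemory`, `Pointer`, `paint`, `load`, `store`) are hypothetical, and the one-colour-per-cell granularity is an assumption made to keep the example short.

```python
from dataclasses import dataclass

NUM_COLOURS = 14  # figure quoted in the talk


class ColourMismatch(Exception):
    """Raised when a pointer's colour does not match the memory it targets."""


@dataclass(frozen=True)
class Pointer:
    """Hypothetical pointer that carries a colour alongside its address."""
    address: int
    colour: int


class ColouredMemory:
    """Toy memory in which every cell has a colour (granularity is assumed)."""

    def __init__(self, size: int) -> None:
        self._data = [0] * size
        self._colours = [0] * size  # every cell starts with colour 0

    def paint(self, start: int, length: int, colour: int) -> Pointer:
        """Colour a region and return a pointer to its first cell."""
        if not 0 <= colour < NUM_COLOURS:
            raise ValueError("colour out of range")
        for address in range(start, start + length):
            self._colours[address] = colour
        return Pointer(start, colour)

    def _check(self, ptr: Pointer) -> None:
        # The access is refused unless pointer colour and cell colour agree.
        if self._colours[ptr.address] != ptr.colour:
            raise ColourMismatch(f"bad colour at address {ptr.address}")

    def store(self, ptr: Pointer, value: int) -> None:
        self._check(ptr)
        self._data[ptr.address] = value

    def load(self, ptr: Pointer) -> int:
        self._check(ptr)
        return self._data[ptr.address]
```

In this model, a pointer that runs past the end of its painted region, or a stale pointer to a region that has since been repainted, fails the check instead of silently reading someone else’s data.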

The move is part of a broader push to accelerate the Oracle database stack in hardware, part of the rationale for the company’s acquisition of chip and system builder Sun Microsystems. The M7 also supports acceleration for SQL and compression functions in native Oracle code and via APIs for third-party programs.

For its part, Intel described the new Software Guard Extensions (SGX) in Skylake, which protect memory regions from privileged software and malware. IBM’s Power9 will sport new hardware-enforced trusted execution capabilities that the company would not describe. And AMD’s Zen will include two AES units to improve encryption performance.

New thinking, up-and-coming designers

Two academic projects showed fresh thinking and solid engineering chops from an up-and-coming generation of microprocessor architects including Michael McKeown of Princeton’s Piton project.

Piton is a 25-core chip designed as a tile for an array of up to 8,000 chips that can form flexible coherence domains across multiple tiles. “We look toward flattening the data centre so communications don’t go through Ethernet or Infiniband but one on-chip interconnect,” said McKeown.

The research chip is based on an OpenSparc T1 core and has been taped out in a 32nm IBM SOI process. At 460 million transistors, it is one of the largest academic chips to date with RTL already available online as open source code.

Separately, students at the University of California at Davis built a 1,000-core chip they claim is the most core-rich device to date. The KiloCore is aimed at work as a co-processor that can be programmed at runtime.

The chip’s novel approach of giving each core micro-tasks handled in a tiny 128-word memory space would no doubt make programming complex. That said, the architecture hits some notable metrics, including a potential maximum of 1.78 trillion instructions per second at 40W.
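To put those headline numbers in perspective, the short calculation below breaks them down per core and per watt. It is a back-of-envelope sketch that assumes exactly 1,000 cores and that the peak rate and the 40W figure describe the same operating point; neither assumption was confirmed in the talk.

```python
PEAK_INSTRUCTIONS_PER_SECOND = 1.78e12  # quoted potential maximum
POWER_WATTS = 40.0                      # quoted power at that rate
CORES = 1000                            # assumed: exactly 1,000 cores

# Peak rate each core would need to sustain.
per_core = PEAK_INSTRUCTIONS_PER_SECOND / CORES

# Throughput delivered for each watt consumed.
per_watt = PEAK_INSTRUCTIONS_PER_SECOND / POWER_WATTS

# Energy spent on each instruction, in picojoules.
picojoules_per_instruction = POWER_WATTS / PEAK_INSTRUCTIONS_PER_SECOND * 1e12

print(f"{per_core / 1e9:.2f} billion instructions/s per core")
print(f"{per_watt / 1e9:.1f} billion instructions/s per watt")
print(f"{picojoules_per_instruction:.1f} pJ per instruction")
```

Under these assumptions each core would run at about 1.78 billion instructions per second, and the chip as a whole would spend roughly 22.5 picojoules on each instruction.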

Perhaps even more noteworthy, the young team showed great design dexterity when given just two months’ notice of the opportunity to tape out the design in a 32nm IBM process.

Presenter Brent Bohnenstiehl said he had a "toy processor" available as a starting point that gave him a 20% head start on defining a core. The physical design team had an even shorter window—they got access to libraries to start their work just 34 days before the fab run.
