SAN JOSE, Calif. — The world’s seven largest data center operators, known as the hyperscalers, have created huge chip markets, driving the semiconductor industry to new performance heights and cost floors. Some fear that they are also distracting chip vendors from the needs of smaller, more diverse users.

Even more worrying, most hyperscalers have set up their own world-class design teams, threatening to become rivals. They are already designing leading-edge AI accelerators and smart network controllers, and they are just starting to release open-source RTL, which they claim will be a growing trend.  

Competing with these leviathans is tough, but losing their business altogether is even more frightening. That could happen as regulators in Europe and the U.S. now target a handful of the top hyperscalers, threatening to break them up.  

The most likely scenario is that Amazon, Google, Microsoft, Facebook, Alibaba, Tencent, and Baidu will continue to grow as chip customers and designers. Slowly, the semiconductor industry is learning how to swim alongside these whales.  

Hyperscalers will operate 628 data centers by 2021 and carry more than half of all data center traffic, up from 338 data centers in 2016, according to figures from Cisco. The largest hyperscalers are said to have installed at least 3 million and maybe more than 8 million servers.  
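For a rough sense of pace, the Cisco figures above can be turned into an annual growth rate. The sketch below assumes steady compound growth between 2016 and 2021, which Cisco's figures do not themselves state; the helper name `compound_annual_growth` is purely illustrative.

```python
def compound_annual_growth(start_count, end_count, years):
    """Annual growth rate implied by two counts, assuming steady compounding."""
    return (end_count / start_count) ** (1 / years) - 1


# Cisco figures quoted in the article: 338 hyperscale data centers in 2016,
# 628 projected for 2021.
hyperscale_growth = compound_annual_growth(338, 628, 2021 - 2016)

# Works out to roughly 13% a year under the steady-growth assumption.
print(f"Implied annual growth: {hyperscale_growth:.1%}")
```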

Plowing a new sales channel, the hyperscalers bought directly (mainly from Intel and AMD) 35% of all server processors in 2018, up from just 15% in 2013, according to International Data Corp. (IDC). “They have become king makers by selecting x86 processors, for example,” said Shane Rau, a research vice president at IDC. 

Hyperscalers have created a new, broad channel for server processors. (Source: IDC)

Overall, the hyperscalers manage about 70% of all public cloud services and buy about a third of all data center gear, said Vlad Galabov, a principal analyst at IHS Markit. Amazon, Google, and Microsoft were the first customers for Intel’s latest Xeon Scalable chips, he noted.  

Those chips support twice the main memory of the previous Intel generation. As a result, the hyperscalers practically cornered the DRAM market in a surge of purchases over the last two years, helping drive a historic spike in memory-chip prices, said Baron Fung, an analyst with Dell’Oro Group.  

The 10 largest hyperscalers spent nearly $100 billion on data center gear last year, up about 45% from 2017, said Fung, who includes Apple, IBM, and Oracle on his list. The top eight hyperscalers (including Apple) bought 71% of all cloud computing gear last year, said Alan Weckel of 650 Group, who pegs the overall total at $109 billion. The largest of them, Amazon, ate up 12% of the pie all by itself.
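Weckel's percentages translate into dollar figures with simple arithmetic. The sketch below assumes that both the 71% and the 12% shares apply to his $109 billion total, which is how the paragraph reads but is not spelled out; the variable names are illustrative only.

```python
# 650 Group figures quoted in the article (billions of U.S. dollars).
TOTAL_CLOUD_GEAR_BILLION = 109.0
TOP_EIGHT_SHARE = 0.71  # top eight hyperscalers, including Apple
AMAZON_SHARE = 0.12     # Amazon alone


def share_to_billions(total_billion, share):
    """Convert a fractional market share into billions of dollars."""
    return total_billion * share


top_eight_billion = share_to_billions(TOTAL_CLOUD_GEAR_BILLION, TOP_EIGHT_SHARE)
amazon_billion = share_to_billions(TOTAL_CLOUD_GEAR_BILLION, AMAZON_SHARE)
everyone_else_billion = TOTAL_CLOUD_GEAR_BILLION - top_eight_billion

print(f"Top eight: ${top_eight_billion:.1f}B")
print(f"Amazon alone: ${amazon_billion:.1f}B")
print(f"All other buyers: ${everyone_else_billion:.1f}B")
```

On those assumptions, the top eight account for about $77 billion, Amazon for about $13 billion, and every other buyer combined for about $32 billion.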

“Amazon’s installed base of servers is larger than that of the whole telco industry,” said Weckel, who covers both markets. By contrast, China’s three hyperscalers together are smaller than Amazon, and the smallest — Baidu — is about a tenth the size of Amazon.  

Some count Apple, IBM and Oracle among the hyperscalers. (Source: Dell'Oro)

The spending seems huge, but so are the companies. Amazon, Google, and Facebook alone raked in more than $400 billion in revenues last year.  

Defining the roadmap for copper and optical nets
Because their data centers need to connect tens of thousands of servers at high speeds, the hyperscalers also drive networking standards, products, and acquisitions in Ethernet controllers, switches, and optical modules. As early as 2010, Facebook engineers were clamoring for vendors to start work on a Terabit Ethernet standard that even today is not seen as practical.

“Ten years ago, the big guys in networking were Cisco and Juniper — they were kings of the hill,” said Brad Booth, manager of network hardware engineering for Microsoft’s Azure cloud service. “Now, everybody comes to us and the other hyperscalers first.” In the race to stay ahead in cloud services, “we’re constantly needing to evaluate new technology to intercept it at the right point.”

That’s why in 2015, Booth helped form the Consortium for On-Board Optics (COBO). COBO specified a way to squeeze ASICs and optics in modules for top-of-rack switches that form the spine of a data center network. The modules aim to lower power and cost and ease a transition to 400-Gbits/s switches at the bleeding edge of today’s networks.  

Microsoft is evaluating its first COBO modules now, with hopes of deploying them in 2021. First, it has to test them out in a “canary cluster” that uses neural-networking apps to drive high levels of test traffic.  

Hyperscalers specified a way to slam ASICs and optics together. (Source: COBO)

The race to higher speeds continues. Earlier this year, Microsoft and Facebook announced a collaboration to define components for the next big step for switches — putting ASICs and optics in the same package.  

The whales “drive huge volumes … a handful of hyperscalers can justify a feature set and investment in a chip,” said Pete Del Vecchio, a product manager for Broadcom, the dominant maker of network switch chips for years.  

A handful of startups with innovative switch chips have challenged Broadcom in recent years, but “none have really made a significant dent [in Broadcom’s market share] so far,” said Bob Wheeler, networking analyst for The Linley Group. 

Hyperscalers’ “volumes ramp amazingly quickly,” Wheeler explained. “Ten years ago, when carriers dominated the market, they used to buy 10,000 switch ports a quarter, but now hyperscalers need 100,000 ports a quarter. The scale and speed is unprecedented, and that’s a challenge for startups.”

The whales are also quick to cut off sales. “When we made transitions like from 10G to 40G, we stopped buying 10G — there’s no long tail,” said Booth, suggesting that the upcoming move to 400G switches may leave today’s bleeding-edge 100G products without much of a market.  

Hyperscalers also drive the Ethernet controllers on servers that form the end points of the network. Back in 2014, Booth and others helped form an ad hoc group that defined 25G and 50G specifications to evolve more rapidly from the 10G controllers of the day. That threw something of a monkey wrench into the IEEE process that had just defined 40G and 100G standards in part for telcos.  

Today, Microsoft is already migrating its servers to 50G Ethernet links — a spec that didn’t exist a few years back. And it is already planning for a shift to 100G controllers in a couple of years.  

Some say that the hyperscalers have hijacked the Ethernet standards process, driving requirements that may not serve traditional business switches and servers. In addition, their need to get to the next-fastest speed quickly is much greater than their need for interoperability given that they focus on a short list of vendors.  

Hyperscalers are also highly influential buyers of networking and storage. (Source: 650 Group)

“They have definitely pushed the industry,” said a veteran leader of Ethernet standards who asked not to be named. “The easiest way to get a project approved is to say it’s needed by a hyperscaler.

“The downside is [that] they don’t like the time it takes to develop the thoroughness of Ethernet standards. Their churn moving to the next interconnect is quicker than others … but not everyone is a gorilla, and there’s a growing disparity between the needs of the gorillas and smaller companies.”

When big customers become small rivals
Taking a different direction toward better networks, Amazon bought startup Annapurna for its smart controllers that can process network protocols. “It didn’t find a satisfactory Ethernet part from Intel or others, so it bought Israel’s Annapurna with chips based on Arm cores, and now, it’s become a top five Ethernet adapter vendor” behind Intel, Mellanox, Broadcom, and Marvell, said Galabov of IHS Markit.  

Annapurna had an estimated 200 engineers when Amazon bought it, a figure thought to have doubled today. “The top four Ethernet card vendors have been there forever, so to become a top five player so quickly is very significant,” said Galabov, estimating that Amazon’s internal designs now make up 8% of worldwide Ethernet controller ports and revenues.  

The desire to forge deeper ties with hyperscalers drove Nvidia to bid $6.9 billion for networking specialist Mellanox, its biggest acquisition effort to date. Mellanox makes a third of its sales to large- and medium-sized cloud service providers with design wins at Alibaba, Baidu, Facebook, and Tencent, said Fung of Dell’Oro.  

Expect more quick shifts, said Galabov. Hyperscalers are the first to feel when computing workloads shift. To compete with their rivals, they need to react quickly. That’s exactly what’s happening now with the rise of deep learning, a new style of AI sparked by research advances in 2012.

Jumping on the trend, Amazon, Alibaba, and Baidu have announced separate accelerators for deep learning, some getting deployed this year. Google is ahead of the pack, with three generations of its TPU already running in its networks.  

Google's latest TPU runs so fast it needs liquid cooling. (Google)

“By designing the TPU, Google helped spur an industry of AI accelerators,” said Rau of IDC.  

“When hyperscalers don’t find what they want off the shelf, they have such deep pockets and high performance needs, they will design it themselves,” he said. “It’s a kick in the pants to the semiconductor industry to provide a standard component at a cheaper cost.”

That tension is creating some angst for many startups and established chipmakers about to field their first AI accelerators. They fear that the whales’ own internally designed chips may satisfy what’s expected to be the largest available market.  

Long term, deep learning will be widely used, and private companies will turn to the cloud giants to train their neural networks because the work can take weeks on banks of the world’s largest processors. “If that happens, we’ll see an even more disproportionate push of computing toward the hyperscalers,” said Weckel of the 650 Group.  

Weckel worries that the whales will forge their own semiconductor supply chains in a scenario (complicated by the U.S./China trade wars) in which no one has enough volume to make a decent profit. “If they continue to invest in in-house engineering, hyperscalers could work directly with foundries like Samsung and TSMC,” said Galabov.  

Hyperscalers “will produce their own ASICs for a time, but they will always feel pressure to get back to a commodity solution to save costs,” said Rau.  

“It’s not as bad as it looks,” said Linley Gwennap of The Linley Group. Amazon and others have talked only about inference ASICs, not training ASICs, and many of the projects may not see the light of day. “It’s more that they are not getting what they need rather than they really want to design their own chips.”

The view from Facebook’s hardware team
In one nightmare scenario, hyperscalers will go around chip vendors, just as a decade ago, they cut out server makers like Dell and HP. Google was among the first to go public with stripped-down server specs that it shared directly with ODMs such as Taiwan’s Quanta and Wistron, cutting out branded OEMs as unnecessary profit-hogging middlemen.  

The result is that the server market “has become more cutthroat,” and China’s Huawei and Inspur have risen among the top six players, Weckel said.  

In small ways, this is already happening. “We’ve started buying optical modules directly instead of from switch vendors because we were paying too high a markup,” said Booth of Microsoft.  

Taking a bigger step, Microsoft released in March open-source RTL for a new data-compression standard through the Open Compute Project (OCP) founded by Facebook in 2011 to define system specs for hyperscalers. “Among cloud providers, we are first to … set a new precedent of contributing RTL, which I hope others will follow,” said Kushagra Vaid, general manager of server engineering at Microsoft.  

Another OCP effort is defining open standards for chiplets that many see as the future of semiconductor design as Moore’s Law slows. One member, chip vendor Netronome, aims to offer to the group the RTL for the 800-Gbits/s fabric used in its multicore network processors.  

The Open Compute Project aims to set open specs for chiplets. (Source: OCP)

Separately, Google was an early backer of the RISC-V movement that is publishing open-source code for an instruction set at the heart of many new processors.  

Long term, an accumulation of open-source IP blocks and tools could transform the chip industry, but that won’t come anytime soon.  

Open-source hardware “has the same potential as open-source software, but a critical difference is [that] software can be developed with a clean sheet of paper, not hardware,” said Aaron Sullivan, director of hardware engineering for Facebook’s new silicon group. “With hardware, you have to make something that works, and it’s not as simple as compiling open source to binary.”

“We don’t see us having a core competency of building EDA tools and IP,” said Vijay Rao, Facebook’s director of infrastructure who spent 15 years at AMD and Intel designing processors and EDA flows before joining the social network. “At Facebook, we look at building better infrastructure — that’s where we keep an eye. If it requires building a better EDA tool or chip, we will do it.

“Disintermediating the server OEMs didn’t take much investment upfront, but designing data center chips is a big investment and getting bigger as we advance to 7 nm and below, so Facebook is more interested in advocacy than designing its own chips.”

For example, “I spoke to 20 companies and no one had a good video encoder for the data center, so we seeded the market with ideas and helped companies go build them — Broadcom and Verisilicon are two [that] we partnered with,” he said.

Similarly, Facebook partnered with Broadcom, Qualcomm, and others to define an inference processor. Facebook is actively helping design its compiler but not the silicon, he added. “We might do some silicon ourselves if the industry doesn’t want to because the volumes are too small. We have AR/VR products, a Telco Infra Project that may need some pieces of an EDA flow, a lot of things we may choose to do in the future that may include a mix of stuff we build in- and out-of-house.”

Microsoft’s Booth agrees: “I have teams of engineers who define [system-level] switches, optics, and components. Cisco has thousands of engineers for switch designs, but Microsoft hasn’t hired thousands of engineers — it’s more cost-effective for us to use OEM or white-box systems.” 

The seven whales don’t swim as one pod
Although we group them together, the hyperscalers have diverse businesses and workloads in social networking, e-commerce, and media. “They compete fiercely with each other, so they need to innovate and differentiate themselves,” said Galabov of IHS Markit.  

For example, “hyperscalers drove a new way of working with Intel,” he said. “Intel didn’t used to have custom CPUs, but by 2018, half the CPUs it sold to cloud customers were custom, up from 20% in 2013 — Intel had to build completely new capabilities.”

The trend also plays out in optical communications, where Google uses the CWDM4 specs to link its data centers. Relative latecomers Facebook and Microsoft were able to leverage cheaper single-mode fiber. “We are driving vendors nuts; they would love us all to have a common vision and direction,” said Microsoft’s Booth.

In their efforts to make something unique for each whale, semiconductor companies have forgotten the traditional discipline of marketing. According to Booth, they stopped “pulling together requirements from multiple markets and building one thing for them all — they are bifurcating their own markets. It’s a weird transition.”

Others point to more than a dozen Ethernet specifications in flight as an example of the fragmentation in standards efforts.  

AI, too, is expected to fragment into separate target sectors as well as separate requirements from the hyperscalers. “Acceleration for hosting is very different from acceleration for Facebook, which has no plans for hosting,” said Rao. “We have a lot of insight into what we want, and we want to build accelerators tightly coupled to our problems. Our model sizes are quite different from other hyperscalers, and our types of models and their parameters are quite different, too.”

Rao’s team has talked with most of the AI silicon startups, and “they are not all attacking the same area — some are in edge training, or Alexa-like devices, different parts of the problems.”

Microsoft’s Booth agreed. “The machine learning we want to do may be very different from what Google or Amazon wants to do, and right now, some of that is secret sauce to differentiate our AI-as-a-service, so you can’t make one device to sell to everyone.”

The one thing consistent across all the hyperscalers from the start has been and continues to be a laser-like focus on squeezing out every penny of unnecessary cost.

“It’s hard as a silicon provider to really understand the scale of the operations we work at,” said Booth, who worked for Intel, Applied Micro, and Dell before joining Microsoft six years ago. “Things that seem small, a half-watt per part, explode at the hyperscale where you are burning megawatts of power.”
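Booth's point about half a watt can be checked with back-of-the-envelope arithmetic. The sketch below assumes one such part per server and borrows the article's earlier estimate of 3 million to 8 million installed servers for the largest hyperscalers; neither assumption comes from Booth, and the function name is illustrative.

```python
WATTS_SAVED_PER_PART = 0.5  # the "half-watt per part" in Booth's example


def fleet_power_megawatts(server_count, watts_per_server):
    """Total power in megawatts for a fleet, given a per-server wattage."""
    return server_count * watts_per_server / 1_000_000


# Assumed fleet sizes: the 3 million to 8 million servers cited earlier
# in the article, with one half-watt part in each server.
low_estimate_mw = fleet_power_megawatts(3_000_000, WATTS_SAVED_PER_PART)
high_estimate_mw = fleet_power_megawatts(8_000_000, WATTS_SAVED_PER_PART)

print(f"Half a watt per server adds up to {low_estimate_mw} to {high_estimate_mw} MW")
```

Under those assumptions, a half-watt difference in a single component adds up to between 1.5 and 4 megawatts across a fleet, before counting any additional cooling load.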

One of Google's dozens of data centers. (Google)

Will regulators make the whales extinct?
The data center whales face many threats. After scandals about privacy breaches and their role as pawns in the 2016 U.S. presidential election, public sentiment has turned against them. In addition, the European Commission and the U.S. Department of Justice and Federal Trade Commission are conducting or considering antitrust investigations.  

At the center of much of the scrutiny, Facebook’s chief executive has made numerous pledges that the company is moving in new directions to protect privacy and guard against abuse of the social networks that it runs. But its vast and highly automated networks remain vulnerable to leaks and attacks.  

Anecdotally, some point to signs that the hyperscalers have gotten knocked off of their pedestals as media darlings. One analyst notes that promising graduates increasingly see them, and Silicon Valley in general, as less attractive employers.  

But despite grassroots movements to turn off Facebook accounts, the social network and other hyperscalers continue to grow. It’s impossible to predict what future elections, security breaches, and regulatory sanctions may bring, but the smart money is betting that the whales will continue to grow.  

Chipmakers got a taste of what life without hyperscalers would look like this year, when the giants pulled back spending after two years of heady growth. Rau of IDC predicts a return to growth starting this fall, with overall server processor sales to hyperscalers recovering to a growth rate of about 3.5%.

Over the next five years, hyperscalers’ server consumption should even out to a bit more than 7.5% growth in units and revenues, according to Galabov of IHS. The much smaller second-tier cloud providers and telcos will grow more quickly in their server spending, he added.   

Galabov sees telcos and a second tier of cloud providers growing faster than hyperscalers. (Source: IHS Markit)

Fung of Dell’Oro is more bullish, predicting that hyperscalers will return to growth rates in the teens. Most signs indicate that companies are continuing to move more of their work to public cloud services like AWS, Azure, and Google Cloud.

The whales’ next surge could come as early as next year as they scoop up new server platforms, faster networks, and the first batch of AI accelerators. Chipmakers would be well-advised to get their swimming gear ready for more big waves to come. 

At Semicon West, a Google engineer said that the search giant has reinvented everything from accelerators to interconnects and floating-point formats in its quest for more AI performance. He invited chipmakers to join its search for what will drive its fourth-generation TPU — perhaps a whole new transistor. 

It’s just one of many invitations to heady parties hosted by hyperscalers. The semiconductor industry needs to RSVP.