Sunday, June 29, 2014

Exascale in perspective: RSC's 1.2 petaflop rack

Russian supercomputing manufacturer RSC generated some buzz at ISC'14 last week when it showed off its 1.2 PF-per-rack Xeon Phi-based platform.  I had been aware of this system since it was first announced a few months earlier, and I referenced it in part of a blog post I was writing about the scarier aspects of exascale computing.  Given my impending career change, though, it is unclear whether I will ever have the time to finish that post before it becomes outdated.  Since RSC is back in the spotlight, I thought I'd post the piece I wrote to illustrate how wacky this 1.2 PF rack really is in terms of power consumption.  Power consumption, of course, is the limiting factor standing between today and the era of exascale computing.

So, to put a 400 kW, 1.2 PF rack into perspective, here is that piece:



The Importance of Energy Efficiency

Up through the petascale era in which we currently live, the raw performance of high-performance components--processors, RAM, and interconnect--was what limited the ultimate performance of a given high-end machine.  The first petaflop machine, Los Alamos' Roadrunner, derived most of its FLOPs from high-speed PowerXCell 8i processors pushing 3.2 GHz per core.  Similarly, the first 10 PF supercomputer, RIKEN's K computer, derived its performance from its sheer size of 864 cabinets.  Although I don't mean to diminish the work done by the engineers who actually got these systems to deliver this performance, the petascale era really was made possible by making really big systems out of really fast processors.

By contrast, exascale represents the first milestone where the limitation does not lie in making these high-performance components faster; rather, performance is limited by the amount of electricity that can be physically delivered to a processor and the amount of heat that can be extracted from it.  This limitation is what has given rise to massively parallel processors that eschew a few fast cores for a larger number of low-powered ones.  By keeping clock speeds low and densely packing many (dozens or hundreds of) compute cores on a single silicon die, these massively parallel processors are now realizing power efficiencies (flops per watt) that are an order of magnitude higher than what traditional CPUs can deliver.

The technologies on the market today that most closely resemble what future exaflop machines will probably look like are based on accelerators--either NVIDIA GPUs or Intel's MICs.  The goal will be to jam as many of these massively parallel processors into as small a space, and with as tight an integration, as possible.  Recognizing this trend, NERSC has opted to build what I would call the first "pre-exascale" machine in its NERSC-8 procurement, which will feature a homogeneous system of manycore processors.

However, such pre-exascale hardware doesn't actually exist yet, and NERSC-8 won't appear until 2016.  What does exist, though, is a product by Russia's RSC Group called PetaStream: a rack packed with 1024 current-generation Xeon Phi (Knights Corner) coprocessors that has a peak performance of 1.2 PF/rack.  While this sounds impressive, it also highlights the principal challenge of exascale computing: power consumption.  One rack of RSC PetaStream is rated for 400 kW, delivering 3 GFLOPs/watt peak.  Let's put this into perspective.
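
The 3 GFLOPs/watt figure is just the rack's peak performance divided by its rated power.  As a quick sanity check, here is a minimal Python sketch of that arithmetic using only the two numbers quoted above (the constant and function names are mine, purely for illustration):

```python
# Peak energy efficiency of one RSC PetaStream rack, from the figures quoted above.
PEAK_FLOPS_PER_RACK = 1.2e15    # 1.2 PF peak per rack
POWER_WATTS_PER_RACK = 400e3    # 400 kW rated power per rack


def gflops_per_watt(peak_flops: float, power_watts: float) -> float:
    """Return peak efficiency in GFLOPS per watt."""
    return peak_flops / power_watts / 1e9


efficiency = gflops_per_watt(PEAK_FLOPS_PER_RACK, POWER_WATTS_PER_RACK)
print(f"{efficiency:.1f} GFLOPS/watt peak")
```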

Kilowatts, megawatts, and gigawatts in perspective

During a recent upgrade to our data center infrastructure, three MQ DCA220SS-series diesel generators were brought in for the critical systems.  Each is capable of producing 220 kVA according to the spec sheets.

Three 220 kVA diesel generators plugged in during a PM at SDSC

It would take three of these diesel generators to power a single rack of RSC's PetaStream.  Of course, these backup diesel generators aren't a very efficient way of generating commercial power, so this example is a bit skewed.
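
One subtlety in the generator count is that kVA measures apparent power, not the real power (kW) a load actually draws; the two are related by a power factor.  The sketch below works through the count assuming a power factor of 0.8, which is my assumption (a common rating convention for generator sets) and not a number taken from the spec sheets mentioned above.

```python
import math

RACK_KW = 400.0              # rated power of one PetaStream rack (from the post)
GENERATOR_KVA = 220.0        # apparent-power rating of one generator (from the post)
ASSUMED_POWER_FACTOR = 0.8   # ASSUMPTION: not stated in the post


def generators_needed(load_kw: float, rating_kva: float, power_factor: float) -> int:
    """Number of generators required to supply load_kw of real power."""
    real_kw_each = rating_kva * power_factor   # usable kW per generator
    return math.ceil(load_kw / real_kw_each)


count = generators_needed(RACK_KW, GENERATOR_KVA, ASSUMED_POWER_FACTOR)
print(f"Generators needed per rack: {count}")
```

With an ideal power factor of 1.0 the same function returns two, so the count of three hinges on that assumption.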

Let's look at something that is used to generate large quantities of commercial power instead.  A GE 1.5-77 wind turbine, which is GE's most popular model, is advertised as delivering 1.5 megawatts at wind speeds above 15 miles per hour.

GE 1.5 MW wind turbine.  Source: NREL

Doing the math, this means that the turbine pictured above would be able to power only three racks of RSC PetaStream on a breezy day.

To create a supercomputer with a peak capability of an exaflop using RSC's platform, you'd need over 800 racks of PetaStream and over 300 MW of power to turn it all on.  That's over 200 of the above GE wind turbines and enough electricity to power about 290,000 homes in the U.S.  Wind farms of this size do exist; for example, the Stateline Wind Farm, which was built on the border between Oregon and Washington, has a capacity of about 300 MW.

300 MW Stateline Wind Farm.  Source: Wikimedia Commons

Of course, wind farms of this capacity cannot be built in any old place.
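
For anyone who wants to check the exaflop numbers, here is a rough Python sketch of the scaling.  It uses only figures already quoted in this post (1.2 PF and 400 kW per rack, 1.5 MW per turbine, about 290,000 homes); the per-home figure it prints is derived from those numbers rather than taken from any outside source, and all names are illustrative.

```python
import math

EXAFLOP_TF = 1_000_000       # 1 exaflop expressed in teraflops
RACK_PEAK_TF = 1_200         # 1.2 PF peak per PetaStream rack
RACK_KW = 400                # rated power per rack
TURBINE_KW = 1_500           # GE 1.5 MW turbine at rated output
HOMES_QUOTED = 290_000       # number of U.S. homes quoted in the post

racks = math.ceil(EXAFLOP_TF / RACK_PEAK_TF)      # whole racks needed for 1 EF peak
total_kw = racks * RACK_KW                        # power to run all of them
turbines = math.ceil(total_kw / TURBINE_KW)       # turbines running at full output
kw_per_home = total_kw / HOMES_QUOTED             # average draw implied per home

print(f"Racks:    {racks}")
print(f"Power:    {total_kw / 1000:.1f} MW")
print(f"Turbines: {turbines}")
print(f"Implied average draw per home: {kw_per_home:.2f} kW")
```

Note that this treats every turbine as running at its full rated output all the time, which is the most optimistic case.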

Commercial nuclear power plants can be built in a variety of places though, and they typically generate on the order of 1 gigawatt (GW) of power per reactor.  In my home state of New Jersey, the Hope Creek Nuclear Generating Station has a single reactor that was built to deliver about 1.2 GW of power:

1.2 GW Hope Creek nuclear power station.  The actual reactor is housed in the concrete cylinder to the bottom left.  Courtesy of the Nuclear Regulatory Commission.

This is enough to power almost 4 exaflops of PetaStream.  Of course, building a nuclear reactor for every exaflop supercomputer would be extremely costly, given the multibillion-dollar cost of building reactors like this.  Clearly, the energy efficiency (flops/watt) of computing technology needs to improve substantially before we can arrive at the exascale era.
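
As a final back-of-the-envelope check on the "almost 4 exaflops" figure, the sketch below divides the reactor's 1.2 GW by the power needed for one exaflop of PetaStream.  It assumes whole racks at their rated 400 kW and ignores cooling and facility overhead, which this post does not quantify; the names are illustrative.

```python
import math

REACTOR_KW = 1_200_000       # Hope Creek: about 1.2 GW
EXAFLOP_TF = 1_000_000       # 1 exaflop expressed in teraflops
RACK_PEAK_TF = 1_200         # 1.2 PF peak per PetaStream rack
RACK_KW = 400                # rated power per rack

racks_per_exaflop = math.ceil(EXAFLOP_TF / RACK_PEAK_TF)
kw_per_exaflop = racks_per_exaflop * RACK_KW
exaflops_per_reactor = REACTOR_KW / kw_per_exaflop

print(f"Power per exaflop:    {kw_per_exaflop / 1000:.1f} MW")
print(f"Exaflops per reactor: {exaflops_per_reactor:.2f}")
```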