Monday, November 4, 2013

On the road to SC'13

It's been about a year since I decided to quit the life of a research scientist and move across the country to get into the supercomputing business, and coincidentally, it's also time for this year's SC conference.  I note the coincidence because hearing about SC'09 from pals in attendance was the first time I realized there was a whole industry surrounding what I considered to be a fun hobby.  I'd argue that hearing about SC antics was a major force in putting me on the path that led me to becoming a part of the supercomputing industry--for example, I specifically recall thinking that my dream job would be one that paid for me to attend SC.

It seems like I've found that job now, and luckier still, I get to attend SC during my first year in the business. I've had this post open as a draft for the past two weeks and initially intended to write a reflective post on how far I've come in the last year, but as it turns out, going to SC is a lot of work, and I don't really have time to do all that writing.

My Schedule

Instead, I'm going to do a bit of self-promotion and let people know where I'll be presenting and what I'll be doing.  I am giving three 7-10 minute public booth talks on two different topics, all of which will be on Wednesday, November 20:

  1. At 11:00 AM, I will be presenting some of the work we supported to enable large-scale genomic analysis using Gordon at the SDSC booth (#3313)
  2. At 11:45 AM, I will be talking about our early benchmarks using SR-IOV and Mellanox Virtual-IQ, the key technologies underpinning Comet, the world's first fully virtualized HPC cluster, at the Mellanox booth (#2722)
  3. At around 2:45 PM, I will be presenting the same talk on Mellanox Virtual-IQ and SR-IOV, but this time at the SDSC booth (#3313)

The exact times are still tentative, and I don't have finalized (read: company-approved) titles for these talks yet, but I will update this post when I do.  Also, a map of the expo floor is already online; both the SDSC and Mellanox booths are in the lower-right quadrant.

In addition to these speaking events, I will definitely be at the SDSC booth (#3313) for these events:
  • On Tuesday at 2:00 PM, the official Comet announcement will be made by SDSC's director, deputy directors, and a bunch of dignitaries from the vendor partners involved.  I will be on hand to discuss the project with whoever is interested in learning more and to consume the free refreshments to be served.
  • On Thursday at 10:00 AM, we will be concluding a special event (not sure how much more I can/should say).  I will be there for the good times.

Until SDSC's final SC'13 agenda is published, though, none of this is set in stone, so please treat it as tentative.  Also, if it's of interest to anyone, I will be flying in early (arriving Sunday at 6:30 PM) and leaving late (Friday at 10:00 PM).

SDSC's Schedule

UPDATE: I've posted the exact schedule in a subsequent post.

SDSC has some cool stuff planned for the booth centered around Gordon, SDSC's data-intensive production machine, and Comet, SDSC's upcoming machine designed to address the needs of the 99% of supercomputing users who aren't running ultra-massive hero jobs.

Right now there are two separate "lightning round" talks being given: one on Tuesday morning, and one on Tuesday afternoon.  I am presenting different talks at both, as are a number of my colleagues.  I think there is also another set of lightning talks, but as far as I know (and I may be wrong), I'm not giving one there.  Once the SDSC schedule is available for public consumption, I'll post an update here.

Finally, there is a really neat special event for students spanning the entire exhibition which concludes on Thursday morning.  I'm not sure how secretive this is, so I won't say more until I know more.

My Talks

I'm presenting on two topics, both of which I think are really neat.  

The Gordon Talk

So as to not leave this post without any fun pictures, here is one that describes my talk on large-scale genomic analysis:

Lustre filesystem capacity and number of jobs in flight over time

In brief, a project came to us that involved processing over four hundred complete human genomes straight out of a next-generation sequencer.  The input data came in the form of eight 6-terabyte RAID0 devices which we needed to somehow plug into Lustre to upload, and the ensuing adventure was quite challenging because of the nine-step processing pipeline that each genome needed to go through.  It proved to be a fantastic example of a non-traditional use of HPC that, despite requiring only modest CPU power, is wholly intractable without a lot of high-end compute capability.

We wound up exploiting almost every aspect of Gordon's flexible architecture to solve the problem within the parameters necessitated by the project's sponsors:

The fundamental building block of SDSC's Gordon resource

The Comet Talk

I'm also presenting on some of the early benchmarking work I did with coworkers while architecting Comet, SDSC's upcoming supercomputer.  In particular, Comet will rely heavily upon Mellanox's Virtual-IQ technology and SR-IOV to allow users and gateways to dynamically provision virtual clusters with their own software stacks and operating environments on SDSC's hardware.

SR-IOV is a really interesting technology that greatly simplifies the process of virtualizing cluster interconnects (i.e., InfiniBand) without the significant loss of performance that has been characteristic of virtualized clusters.  This talk will be a bit more data-oriented, and I will be presenting hard performance numbers comparing the following types of clusters:

  • Bare metal
  • Virtualized with PCIe passthrough
  • Virtualized with Mellanox's SR-IOV-capable InfiniBand adapters
  • Virtualized on Amazon EC2 with cluster compute instances (sound familiar?)

We ran both standard benchmarks and real-life applications across these different types of clusters, and the results were quite interesting.