AMD is finally becoming more aggressive in its bid to win a share of the lucrative HPC market, a market that is set to explode with this week's SC'12 supercomputing conference in Salt Lake City, Utah.
At the AMD Fusion Developer Summit 2012 (AFDS), held in June 2012 in Bellevue, WA, Chief Technology Officer Mark Papermaster showed a card described as "the next generation FirePro for workstations and servers." The card was quite a departure from familiar-looking FirePro cards, since it featured no fewer than three 80mm fans and purportedly two GPUs. In conversations with the PR team, we were told that Mark Papermaster was mistaken, and that the part in question was the long-delayed and pretty much shelved Radeon HD 7990 (PowerColor launched the product on its own). Today we see that AMD's CTO was not mistaken in any way, and that he had indeed shown the server part.
The FirePro S10000 (S10K) represents the best that Southern Islands GPUs can offer to give "optimal efficiency for HPC calculations, ideal for virtual desktop infrastructure (VDI) and workstation graphics deployments." This is also (on paper) the highest performing accelerator on the market. AMD cites 5.91 TFLOPS single-precision and 1.48 TFLOPS double-precision floating-point performance. This is higher than what either Intel or Nvidia can offer with their Xeon Phi 3110/5110P and Tesla K10 and K20/K20X boards, respectively. In fact, in terms of peak compute performance, this is the highest performing accelerator card of all time.
As you might have figured out, the S10K consists of two Tahiti XT GPUs, packing no fewer than 4096 stream cores, while the dual 384-bit memory interface delivers 480GB/s (240GB/s per GPU). On paper, nothing compares to this part. We believe that AMD should have launched a 12GB version (6GB per GPU instead of "just" 3) and completely thrashed the competition, but there are objective reasons why the company did not dare to offer the maximum memory configuration. Probably the biggest concern was populating the back of the card with memory on a server board, as these parts have to endure 100% computational load for most of their lives. Still, the S10000 has the same amount of memory as single-GPU products such as Sapphire's Toxic HD 7970 6GB and the K20X (6GB), and less memory than Intel's Xeon Phi 5110P (8GB) and Nvidia's Tesla K10 (8GB).
Furthermore, AMD is the only vendor that doesn’t offer a passively cooled option, which will prevent system integrators from fitting this part into their standard 1U and 2U servers. In a high-occupancy 4U configuration, you should be able to install up to eight FirePro S10000 boards, for a 16-GPU/48GB GDDR5 configuration. We spoke with one of the vendors that will unveil its FirePro S10000 system at SC'12 in a few hours' time, and it cites record performance from a 4U box:
- 8x FirePro S10000
- 16 Tahiti XT GPUs
- 48GB GDDR5 Memory
- 3.75 TB/s Aggregate GPU bandwidth
- 47.28 TFLOPS Single Precision (peak)
- 11.84 TFLOPS Double Precision (peak)
In the standard 42U rack, you could configure up to seven systems for a grand total of:
- 56 FirePro S10000
- 112 Tahiti XT GPUs
- 336GB GDDR5 Memory
- 26.25 TB/s Aggregate GPU bandwidth
- 330.96 TFLOPS Single Precision (peak)
- 82.88 TFLOPS Double Precision (peak)
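For readers who want to check the arithmetic behind the two lists above, here is a minimal Python sketch that scales the per-card figures AMD quotes (two GPUs, 6GB, 480GB/s, 5.91 and 1.48 TFLOPS) up to a 4U box and a 42U rack. The names `CARD`, `scale`, `CARDS_PER_4U` and `SYSTEMS_PER_RACK` are our own, and we assume the TB/s values were obtained by dividing GB/s by 1024, since that is the convention that reproduces 3.75 and 26.25. These are peak, on-paper numbers derived purely from the per-card figures, so a total that a vendor rounded or quoted differently may not match exactly.

```python
# Per-card figures quoted in the article for the FirePro S10000.
CARD = {
    "gpus": 2,
    "memory_gb": 6,
    "bandwidth_gb_s": 480,
    "sp_tflops": 5.91,
    "dp_tflops": 1.48,
}

CARDS_PER_4U = 8        # cards in the high-occupancy 4U box
SYSTEMS_PER_RACK = 7    # 4U systems per 42U rack, as stated in the article


def scale(cards):
    """Multiply the per-card peak figures by the number of cards."""
    return {
        "cards": cards,
        "gpus": cards * CARD["gpus"],
        "memory_gb": cards * CARD["memory_gb"],
        # Assumption: TB/s is GB/s divided by 1024, which matches the lists.
        "bandwidth_tb_s": cards * CARD["bandwidth_gb_s"] / 1024,
        "sp_tflops": round(cards * CARD["sp_tflops"], 2),
        "dp_tflops": round(cards * CARD["dp_tflops"], 2),
    }


box = scale(CARDS_PER_4U)
rack = scale(CARDS_PER_4U * SYSTEMS_PER_RACK)

for name, totals in (("4U box", box), ("42U rack", rack)):
    print(name, totals)
```

Changing `CARDS_PER_4U` or `SYSTEMS_PER_RACK` shows how the totals would move for a denser or sparser deployment; none of this says anything about sustained performance.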
All in all, it is good to see that AMD is becoming more aggressive and is launching products for this space, but we cannot help but wonder whether this is the reason why the HD 7990 never saw the light of day. Furthermore, AMD could have made a passively cooled part without DisplayPort, HDMI and DVI outputs, as these are really unnecessary in a server configuration.