New GPUs will have even wider and faster memory buses than their predecessors, with AMD moving from 256 bits to 384 bits, and Nvidia hopping even further, to 512 bits, in its high-end SKUs. But will that benefit the usual graphics tasks more, or the vector FP math jobs of future GPGPU operations?
You must have noticed by now that the next generation of high-end GPUs, both the AMD 'Tahiti' 7900 series and the Nvidia 'Kepler' GK100 processors, further increase memory bandwidth and bus width too. AMD goes one step up from the 6900 series, widening its 256-bit GDDR5 memory bus to 384 bits in the 7900 series coming early next month. The Nvidia GK100, a quarter or two later, will have a 512-bit memory bus, a hop from the 384 bits seen now in the GTX580 and its Quadro and Tesla brethren. Now, this doesn't seem as complicated a job as widening the CTE over and over again for our LTA here in Singapore, but it is still quite a lot of work, as wider buses require more pins and more power to drive them, and they present board layout challenges in getting both the speed and the width.
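To put rough numbers on those bus widths, here is a minimal sketch of the usual peak-bandwidth arithmetic: bus width in bits, times the per-pin data rate, divided by 8 bits per byte. The function name and the 5.0 Gbps per-pin rate are assumptions made purely for illustration, not figures for any of the cards above; the rate is held constant so that only the width changes.

```python
def peak_bandwidth_gbs(bus_width_bits: int, data_rate_gbps_per_pin: float) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    Each data pin moves data_rate_gbps_per_pin gigabits per second,
    so the whole bus moves width * rate gigabits, or that / 8 gigabytes.
    """
    return bus_width_bits * data_rate_gbps_per_pin / 8.0


# Hypothetical per-pin rate, kept the same for every width (an assumption).
ASSUMED_RATE_GBPS = 5.0

if __name__ == "__main__":
    for width in (256, 384, 512):
        bw = peak_bandwidth_gbs(width, ASSUMED_RATE_GBPS)
        print(f"{width}-bit bus at {ASSUMED_RATE_GBPS} Gbps/pin: {bw:.0f} GB/s peak")
```

At a fixed per-pin rate, peak bandwidth scales linearly with width, so 256 to 384 bits is a 50% gain and 384 to 512 bits is about 33%. Real cards also change memory clocks between generations, so the actual bandwidth gap can be larger or smaller than the width alone suggests.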
Now, the obvious use is in graphics: 4K resolution displays are, hopefully, finally coming to the PC world, and stereoscopic 3D adds extra refresh load. However, the current memory subsystems on high-end GPUs actually take quite good care of that, aside from, of course, the higher frame rates that faster memory brings. What, then, is the other use for this width increase?
Well, GPUs have very high peak FP compute rates, but real performance in FP-heavy code is greatly affected by the available memory bandwidth. While a game may get a 5% FPS boost from a 33% memory bus width jump, a vector FP run may gain over 20% from the same enhancement, all other things staying the same. So, whatever the vendors' primary intentions are for widening the local graphics memory buses on new GPUs, the extra bandwidth (and yes, the extra capacity headroom from being able to add more chips and keep things in local RAM) will benefit GPU compute apps the most. The impact? More HPC-like stuff running oh so fast on your PC, and more true HPC apps running faster on GPGPU arrays.
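One way to see why bandwidth matters so much more for compute is the simple roofline model, in which attainable throughput is the lesser of the peak FP rate and the memory bandwidth multiplied by the kernel's arithmetic intensity (FLOPs per byte moved). The sketch below is an illustration only: the peak rate, bandwidths and intensities are made-up numbers, not measurements of any GPU, and the model ignores caches, latency and everything else that separates a real 20% gain from the ideal figure.

```python
def attainable_gflops(peak_gflops: float, bandwidth_gbs: float,
                      flops_per_byte: float) -> float:
    """Roofline estimate: capped by compute or by memory traffic."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)


def speedup_from_wider_bus(peak_gflops: float, old_bw_gbs: float,
                           new_bw_gbs: float, flops_per_byte: float) -> float:
    """Ratio of roofline throughput after and before the bandwidth increase."""
    before = attainable_gflops(peak_gflops, old_bw_gbs, flops_per_byte)
    after = attainable_gflops(peak_gflops, new_bw_gbs, flops_per_byte)
    return after / before


# All values below are hypothetical, chosen only to show the two regimes.
PEAK_GFLOPS = 3000.0          # assumed peak FP rate
OLD_BW, NEW_BW = 240.0, 320.0  # assumed GB/s before and after a 33% wider bus

if __name__ == "__main__":
    for label, intensity in (("streaming vector kernel", 0.25),
                             ("compute-heavy kernel", 20.0)):
        gain = speedup_from_wider_bus(PEAK_GFLOPS, OLD_BW, NEW_BW, intensity)
        print(f"{label} ({intensity} FLOP/byte): {gain:.2f}x")
```

Under these assumptions the low-intensity kernel is bandwidth-bound before and after, so it picks up the whole bandwidth increase, while the high-intensity kernel already sits at the compute ceiling and gains nothing. Real vector FP codes tend to fall somewhere in between.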