0x0031 - The CUDA Compute Monster is 20x the Power/Performance Envelope of CPU Computing!
Was Nvidia's CUDA computing platform named after the monster engine of the very rare 1971 Plymouth Barracuda?
It should be. Here is why:
Take a stock CPU, the AMD Ryzen 5 2600X, with a respectable PassMark CPU score of 14,000. Here are its AIDA64 benchmarks:
If we build it from stock parts in a bare-bones configuration, and getting it all working costs $400, we are looking at about $1.04 / GFLOP. It is about the cheapest CPU out there with a respectable performance envelope, selling used for as little as $100 CAD as of December 2022.
Naturally, a GPU cannot work without a base system either, so we take the cost of the same system and add the GPU itself at a street price of $500 (say, for an RTX 3060 Ti). Now we are talking 16 teraflops of compute capability on the same single-precision problem set. As per the TechPowerUp page:
$900 / 16000 GFLOPs = $0.056 / GFLOP, or about 5.6 cents (single-card-mode)
We could run two GPUs in the same setup, adding only the price of the second card to the cost but getting us to an insane 32 TFLOPs, and we get:
$1400 / 32000 GFLOPs = $0.044 / GFLOP, or about 4.4 cents (dual-card-mode)
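As a quick sanity check on the arithmetic, here is a minimal Python sketch that recomputes the cost per GFLOP for both GPU setups. The prices and the 16,000 GFLOP figure are the ones quoted above; the function and constant names are made up for illustration.

```python
def dollars_per_gflop(system_cost, gflops):
    """Return system cost divided by peak throughput, in dollars per GFLOP."""
    if gflops <= 0:
        raise ValueError("gflops must be positive")
    return system_cost / gflops


BASE_SYSTEM = 400    # bare-bones system cost in dollars, from the text
GPU_PRICE = 500      # street price of one 3060 Ti, from the text
GPU_GFLOPS = 16_000  # single-precision figure quoted in the text

# One card: base system plus one GPU.
single = dollars_per_gflop(BASE_SYSTEM + GPU_PRICE, GPU_GFLOPS)
# Two cards: same base system, twice the GPU cost and twice the throughput.
dual = dollars_per_gflop(BASE_SYSTEM + 2 * GPU_PRICE, 2 * GPU_GFLOPS)

print(f"single card: ${single:.4f}/GFLOP ({single * 100:.1f} cents)")
print(f"dual card:   ${dual:.4f}/GFLOP ({dual * 100:.1f} cents)")
```

Note the assumption baked into the dual-card line: throughput is simply doubled, which only holds for workloads that split cleanly across two cards.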
On a $/GFLOP basis, the CUDA compute platform utterly destroys standard CPU computing in the simulation arena.
In compute terms alone, one would need to build about 41 Ryzen 5 2600X compute nodes to match the output of a single Nvidia RTX 3060 Ti!
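The node count can be reproduced from the figures already quoted. This is a rough sketch, not a benchmark: the CPU throughput is not stated directly in the text, so it is back-derived here from the $400 system cost and the $1.04 / GFLOP figure, and all variable names are illustrative.

```python
CPU_SYSTEM_COST = 400         # dollars, from the text
CPU_DOLLARS_PER_GFLOP = 1.04  # from the text
GPU_SYSTEM_COST = 900         # base system plus one GPU, from the text
GPU_GFLOPS = 16_000           # single-precision figure quoted in the text

# Assumption: CPU throughput implied by the two CPU figures above (~385 GFLOPs).
cpu_gflops = CPU_SYSTEM_COST / CPU_DOLLARS_PER_GFLOP

# How many CPU nodes it takes to match one GPU on raw throughput.
nodes_needed = GPU_GFLOPS / cpu_gflops

# How many times cheaper per GFLOP the single-GPU system is.
cost_ratio = CPU_DOLLARS_PER_GFLOP / (GPU_SYSTEM_COST / GPU_GFLOPS)

print(f"implied CPU throughput: {cpu_gflops:.0f} GFLOPs")
print(f"CPU nodes to match one GPU: {nodes_needed:.1f}")
print(f"cost advantage per GFLOP: {cost_ratio:.1f}x")
```

The cost ratio works out a little under 20 for a single card, so the 20:1 figure is best read as a round number.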
So if we can get to the CUDA compute environment, we are looking at an infrastructure cost savings of about 20:1. But the reality is that many scientists would shy away simply because of the complexity and challenge of learning to code for it.