0x0031 - The Cuda Compute Monster is 20x the Power/Performance Envelope of CPU Computing!

Was Nvidia's Cuda computing platform named after the monster engine of the very rare 1971 Plymouth Barracuda?

It should be. Here is why:

Take a stock CPU, the AMD Ryzen 5 2600X, with a respectable CPU PassMark score of about 14,000. Here are its AIDA64 benchmarks:

If we build a bare-bones system from stock parts, getting it all working costs about $400, so we are looking at about $1.04 / GFLOP. It is the cheapest CPU out there with a respectable performance envelope, selling used for as little as $100 CAD as of December 2022.

Naturally, a GPU cannot work without a base configuration either, so we simply take the cost of the same system and add the GPU itself at a street price of $500 (say, for a 3060 Ti). Now we are talking about 16 teraflops of compute capability on the same single-precision problem set, as per the TechPowerUp page:

$900 / 16000 GFLOPs = $0.056 / GFLOP, or about 5.6 cents (single-card-mode)

We could run two GPUs in the same setup, adding only the second card's $500 to the cost while getting us to an insane 32 TFLOPs:

$1400 / 32000 GFLOPs = $0.044 / GFLOP, or about 4.4 cents (dual-card-mode)
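
The arithmetic for the three configurations can be double-checked with a few lines of Python. This is only a sketch built from the figures quoted in this post: the function name dollars_per_gflop is mine, and the CPU throughput of roughly 385 GFLOPS is an assumption backed out of the $400 and $1.04 / GFLOP numbers, not a measured benchmark.

```python
def dollars_per_gflop(system_cost_usd, gflops):
    """Return the system cost divided by its single-precision GFLOPS."""
    return system_cost_usd / gflops


# Assumption: CPU throughput is inferred from the post's own figures
# ($400 system at $1.04 / GFLOP), which works out to roughly 385 GFLOPS.
CPU_GFLOPS = 400 / 1.04

# (system cost in dollars, single-precision GFLOPS), as quoted in the post.
configs = {
    "Ryzen 5 2600X system": (400, CPU_GFLOPS),
    "single 3060 Ti": (900, 16000),
    "dual 3060 Ti": (1400, 32000),
}

for name, (cost, gflops) in configs.items():
    print(f"{name}: ${dollars_per_gflop(cost, gflops):.3f} / GFLOP")
```

Note that the GPU results come out in dollars per GFLOP, so the single-card figure is a bit under 6 cents, not a fraction of a cent.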

On a $/GFLOP basis, the Cuda compute platform utterly destroys standard CPU computing in the simulation arena.

In raw compute terms alone, one would need to build about 41 Ryzen 5 2600X compute nodes to match the output of a single Nvidia 3060 Ti!

So if we can get to the Cuda compute environment, we are looking at an infrastructure cost savings of about 20:1 (roughly 18:1 with one card and 24:1 with two). But the reality is that many scientists would shy away simply because of the complexity and challenge of learning to code for it.
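
Here is a quick Python sketch of where the node count and the roughly 20:1 ratio come from. It uses only the numbers in this post; the variable names are mine, and the CPU throughput is again an assumption inferred from $400 at $1.04 / GFLOP rather than a benchmark result.

```python
# Figures quoted in the post.
CPU_SYSTEM_COST = 400          # dollars for the bare-bones Ryzen system
CPU_DOLLARS_PER_GFLOP = 1.04
GPU_GFLOPS = 16000             # one 3060 Ti, single precision

# Assumption: CPU throughput is inferred from the two CPU figures above.
cpu_gflops = CPU_SYSTEM_COST / CPU_DOLLARS_PER_GFLOP

# How many CPU nodes it takes to match one GPU in raw throughput.
cpu_nodes_per_gpu = GPU_GFLOPS / cpu_gflops

# Cost-per-GFLOP advantage of the GPU builds over the CPU build.
savings_single = CPU_DOLLARS_PER_GFLOP / (900 / 16000)
savings_dual = CPU_DOLLARS_PER_GFLOP / (1400 / 32000)

print(f"CPU nodes to match one GPU: {cpu_nodes_per_gpu:.1f}")
print(f"Savings, single card: {savings_single:.1f}:1")
print(f"Savings, dual card:   {savings_dual:.1f}:1")
```

The single-card and dual-card ratios land on either side of 20, which is why "about 20:1" is a fair round number.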

Linux Rocks Every Day