The performance of BlochSolver was measured on several GPUs and CPUs, including the latest GPU (RTX 3090), using two pulse sequences.

RF spoiled 3D gradient echo (256×256×32):

 

Multislice fast spin echo (256×256, 24 slices):

 

GPUs: RTX 3090: 10496 CUDA cores, RTX 3080: 8704 CUDA cores, RTX 2080Ti: 4352 CUDA cores, GTX 1080Ti: 3584 CUDA cores, RTX 2070: 2560 CUDA cores, GTX 1070: 1920 CUDA cores, GTX 1050Ti: 768 CUDA cores.
CPUs: Xeon E5-2696 × 2: 36 cores, Core i7 5960X: 8 cores, Core i7 8750H: 6 cores, Core i7 7700HQ: 4 cores.
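
As a rough aid for reading the measurements, the GPU core counts listed above can be expressed relative to the smallest GPU in the list. The following is a minimal sketch, not part of BlochSolver: the names `CUDA_CORES` and `core_ratios` are hypothetical, and the numbers are copied from the list above. Core count alone does not determine simulation speed, so the ratios are only a reference point, not a performance prediction.

```python
# CUDA core counts copied from the GPU list above.
# CUDA_CORES and core_ratios are illustrative names, not BlochSolver APIs.
CUDA_CORES = {
    "RTX 3090": 10496,
    "RTX 3080": 8704,
    "RTX 2080Ti": 4352,
    "GTX 1080Ti": 3584,
    "RTX 2070": 2560,
    "GTX 1070": 1920,
    "GTX 1050Ti": 768,
}


def core_ratios(cores, baseline):
    """Return each device's core count divided by the baseline device's count."""
    base = cores[baseline]
    return {name: count / base for name, count in cores.items()}


if __name__ == "__main__":
    # Print each GPU with its core count and its ratio to the GTX 1050Ti.
    ratios = core_ratios(CUDA_CORES, "GTX 1050Ti")
    for name, ratio in ratios.items():
        print(f"{name:<11} {CUDA_CORES[name]:>6} cores  {ratio:5.2f}x")
```

Comparing these ratios with the measured calculation times shows how closely the speedup follows the number of cores for each pulse sequence.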