Following the two Ampere accelerator cards A100 and A40, Nvidia presented three more models on Monday: the A30, A16, and A10. The latter is closely related to the top gaming model GeForce RTX 3090 but focuses on high efficiency instead of maximum computing power. According to the data sheet it uses the same GA102 graphics chip, restricted, however, to 150 watts (Thermal Design Power, TDP).
The 200-watt TDP difference (the GeForce RTX 3090 is rated at 350 watts) shows up in the clock frequencies: Nvidia quotes 31.2 FP32 TFlops, a good 13 percent less than the GeForce RTX 3090 (36 TFlops). These figures refer to the boost clock, which the A10 is unlikely to sustain under heavy load as well as the GeForce model does. In addition, Nvidia installed slower but less power-hungry GDDR6 memory instead of GDDR6X SDRAM, so the transfer rate between GPU and memory drops from 936 to 600 GB/s.
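The percentages above can be checked with a few lines of arithmetic; this minimal Python sketch only re-derives the ratios from the TFlops and bandwidth figures already quoted:

```python
# Plausibility check of the figures quoted above (all inputs are
# the data-sheet values cited in the text; nothing is measured here).
a10_fp32_tflops = 31.2      # Nvidia A10, FP32, boost clock
rtx3090_fp32_tflops = 36.0  # GeForce RTX 3090, FP32, boost clock

deficit_pct = (1 - a10_fp32_tflops / rtx3090_fp32_tflops) * 100
print(f"A10 FP32 deficit vs. RTX 3090: {deficit_pct:.1f} %")  # 13.3 %

# GDDR6 (A10) vs. GDDR6X (RTX 3090) memory bandwidth
bandwidth_drop_pct = (1 - 600 / 936) * 100
print(f"Memory bandwidth drop: {bandwidth_drop_pct:.1f} %")  # 35.9 %
```

So the "good 13 percent" compute deficit is noticeably smaller than the roughly 36 percent loss in memory bandwidth.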
The A10 is intended as a GPU accelerator without display outputs for servers and data centers. For the applications found there, Nvidia enables processing of data formats such as Bfloat16 via the integrated tensor cores. In principle, all Ampere models should be well suited to training neural networks.
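What makes Bfloat16 attractive for neural-network training is that it keeps float32's full 8-bit exponent (and thus its dynamic range) while cutting the mantissa to 7 bits. A minimal pure-Python sketch of the conversion, not Nvidia's hardware implementation, is simple bit truncation:

```python
import struct

def float32_to_bfloat16_bits(x: float) -> int:
    """Truncate an IEEE-754 float32 to a bfloat16 bit pattern.

    bfloat16 reuses float32's sign and 8 exponent bits but keeps
    only the top 7 mantissa bits, so (ignoring rounding) the
    conversion is just dropping the low 16 bits.
    """
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits32 >> 16

def bfloat16_bits_to_float32(b: int) -> float:
    """Widen a bfloat16 bit pattern back to float32 (exact)."""
    return struct.unpack("<f", struct.pack("<I", b << 16))[0]

pi_bf16 = bfloat16_bits_to_float32(float32_to_bfloat16_bits(3.141592653589793))
print(pi_bf16)  # 3.140625 -- only ~3 significant decimal digits survive
```

The precision loss is acceptable for gradient-based training, while the preserved exponent range avoids the overflow/underflow issues that plague FP16.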
HBM2 stacked memory with 165-watt TDP
The A30 corresponds to a slimmed-down A100 with HBM2 stacked memory; there are no comparable GeForce graphics cards. Its computing power is almost halved, at 5.2 FP64 TFlops and 10.3 FP32 TFlops, and it carries less and slower memory: 24 GB of HBM2 reach a total of 933 GB/s, which would be feasible with three 8 GB stacks. The fastest version of the A100 uses 80 GB of HBM2e at a little more than 2 TB/s. The A30 consumes a maximum of 165 watts; the A100, depending on the version, 250 to 400 watts.
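The three-stack guess can be sanity-checked from the aggregate bandwidth; this short sketch only divides the figures quoted above (the stack count is the article's assumption, not a confirmed spec):

```python
# A30 memory figures from the text above
total_capacity_gb = 24   # GB HBM2
total_bandwidth_gbs = 933  # GB/s aggregate

stacks = 3  # assumed: three 8 GB stacks, as the article suggests
per_stack_capacity = total_capacity_gb / stacks
per_stack_bandwidth = total_bandwidth_gbs / stacks
print(f"{per_stack_capacity:.0f} GB and {per_stack_bandwidth:.0f} GB/s per stack")
```

311 GB/s per stack is a plausible figure for a single HBM2 stack, which is why the three-stack configuration is a reasonable reading of the data sheet.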
Nvidia remains largely silent about the third newcomer in the group. The A16 is a multi-GPU card with four mid-range graphics chips and a total of 64 GB of GDDR6 memory; maximum power consumption is 250 watts. Such a card would be uninteresting for gamers because games lack multi-GPU optimization; compute workloads, on the other hand, scale well.
Servers with lots of HBM2(e)
At its virtual in-house conference GTC 2021, Nvidia also announced two complete server systems. The DGX Station 320G relies on four 80 GB versions of the A100 accelerator, whose combined 320 GB of HBM2e RAM give the system its name. On the CPU side everything stays the same with AMD's 64-core Epyc 7742 from the Zen 2 generation. Price: US$149,000. The DGX SuperPOD comes as a series of server cabinets with a total of 1,125 A100-80G GPUs.