So, I managed to get a set of six NVIDIA Tesla T4s for $500 each. If I had set out to do this a month earlier, the price would have been $400 to $450 each, but bitcoin hit new highs while I waited for Black Friday deals.
The six T4s give the system a total of 390 TFLOPs of fp16, 780 TOPs of INT8, and 1560 TOPs of INT4, with 96GB of GPU memory. There is also an RTX 3080 Ti, which adds 273 TFLOPs of fp16 Tensor, 546 TOPs of INT8 Tensor, and 1091 TOPs of INT4 Tensor. So the total is 663 TFLOPs of fp16, 1326 TOPs of INT8, and 2651 TOPs of INT4, with 108GB of GPU memory.
The total power draw from the 7 GPUs is only 750 watts, with each T4 drawing a max of 75 watts.
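For anyone who wants to check the arithmetic, here is a minimal Python sketch that re-adds the numbers above. The per-T4 figures are an assumption: they are just the six-card totals divided by six. The 12GB for the RTX 3080 Ti is likewise the difference between the 108GB and 96GB totals. The variable names are made up for illustration.

```python
# Per-card T4 figures: ASSUMED to be the post's six-card totals divided by 6.
T4 = {"fp16_tflops": 65, "int8_tops": 130, "int4_tops": 260, "mem_gb": 16}

# RTX 3080 Ti Tensor figures as quoted in the post; 12GB is 108GB - 96GB.
RTX_3080_TI = {"fp16_tflops": 273, "int8_tops": 546, "int4_tops": 1091, "mem_gb": 12}

NUM_T4 = 6
T4_MAX_WATTS = 75          # per-card max quoted in the post
TOTAL_GPU_WATTS = 750      # total for all 7 GPUs quoted in the post


def rig_totals(num_t4=NUM_T4):
    """Sum each metric over all T4s plus the single RTX 3080 Ti."""
    return {key: num_t4 * T4[key] + RTX_3080_TI[key] for key in T4}


totals = rig_totals()
t4_power_w = NUM_T4 * T4_MAX_WATTS
# What the quoted 750 W total leaves for the RTX 3080 Ti (implied, not a spec).
remaining_power_w = TOTAL_GPU_WATTS - t4_power_w

for key, value in totals.items():
    print(f"{key}: {value}")
print(f"T4 power: {t4_power_w} W, left for the 3080 Ti: {remaining_power_w} W")
```

The six T4s account for 450 watts of the budget, so the quoted 750 watt total leaves 300 watts for the RTX 3080 Ti.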
I will probably upgrade the graphics card from the RTX 3080 Ti to an RTX 3090, or maybe a 4090, to get more memory, and move the RTX 3080 Ti to a VR headset rig.
The CPU is a Ryzen Threadripper 3960X with 256GB of RAM. I purchased it about 2 years ago, and there are some PCIe issues: the motherboard distributes the bandwidth as 16x, 8x, 16x, 8x, not a full 16x across all slots! To fit six T4s I need splitter cards; I have one, and two more are on order. So 4 T4s will have 8x bandwidth, while the other 2 will have 4x bandwidth. This is OK since each GPU will be dedicated to a specific specialized model. Feeding text or audio data doesn't need as much bandwidth, so those models will be loaded onto the T4s on the 4x+4x slot.
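To put rough numbers on the lane split, here is a small Python sketch. It assumes the T4s link at PCIe 3.0 (8 GT/s per lane with 128b/130b encoding, about 0.985 GB/s per lane in each direction). It also assumes one plausible layout: both 16x slots split into 8x+8x and one 8x slot split into 4x+4x. The slot labels and the exact layout are my guess for illustration, and real-world throughput will be lower than these theoretical peaks.

```python
# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding, 8 bits per byte.
PCIE3_GBS_PER_LANE = 8 * (128 / 130) / 8  # ~0.985 GB/s per lane, per direction

# HYPOTHETICAL layout: which physical slot feeds which splitter is an assumption.
T4_SLOT_PLAN = [
    ("first 16x slot", [8, 8]),   # splitter: two T4s at 8x each
    ("second 16x slot", [8, 8]),  # splitter: two T4s at 8x each
    ("one 8x slot", [4, 4]),      # splitter: two T4s at 4x each
]


def link_bandwidth_gbs(lanes):
    """Theoretical one-direction bandwidth in GB/s for a PCIe 3.0 link."""
    return lanes * PCIE3_GBS_PER_LANE


# Flatten the plan into one lane count per T4.
t4_links = [lanes for _, split in T4_SLOT_PLAN for lanes in split]

for slot, split in T4_SLOT_PLAN:
    for lanes in split:
        print(f"{slot}: T4 at {lanes}x, ~{link_bandwidth_gbs(lanes):.2f} GB/s")
```

Under those assumptions an 8x card tops out near 7.9 GB/s and a 4x card near 3.9 GB/s, which is why the text and audio models are the ones going on the 4x+4x slot.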
You might ask why I don't just upgrade the motherboard. Well, the TRX40 motherboards don't support a full 16x on all slots, but the Threadripper Pro motherboards do and have 7 PCIe slots, just what I need! But the cost of the CPU and motherboard is pretty high right now, so maybe in the future if prices come down. I'm happy with the rig and it should work as intended.
Below is a pic of the T4s. Two are installed where I experimented with cooling them, since they are passively cooled cards. The top solution cools best and keeps temps below 50C under load. You'll notice the splitter card and how I have to space the cards across top and bottom stages. The miner frame is decently priced at $38 on Amazon. The reason for this config is that the extension ribbon cables are expensive: the lowest-priced 20cm length is $20 to $25 and the 40cm ones are $50 plus, so I opted for just one 40cm cable and the rest are 20cm.
The cooling solution is pretty simple: remove the plastic face cover of the card, place thermal pads on the copper radiator that comes with the T4, place the aluminum heat sink on top, secure it with tie wraps, and simply screw on the fans.
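If you want to check that a cooling mod like this actually holds temps under load, here is a rough Python sketch that reads temperature and power from nvidia-smi. The function names and the 50C threshold default are mine for illustration (the threshold just mirrors the figure mentioned above), and it assumes nvidia-smi is on the PATH with the NVIDIA driver installed.

```python
import subprocess

# Query fields supported by nvidia-smi's --query-gpu option.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,name,temperature.gpu,power.draw",
    "--format=csv,noheader,nounits",
]


def _to_float(text):
    """Return a float, or None when nvidia-smi reports something like [N/A]."""
    try:
        return float(text)
    except ValueError:
        return None


def parse_readings(csv_text):
    """Parse 'index, name, temp, power' CSV lines into a list of dicts."""
    readings = []
    for line in csv_text.strip().splitlines():
        index, name, temp_c, power_w = [field.strip() for field in line.split(",")]
        readings.append({
            "index": int(index),
            "name": name,
            "temp_c": _to_float(temp_c),
            "power_w": _to_float(power_w),
        })
    return readings


def hot_gpus(readings, limit_c=50.0):
    """Return the indices of GPUs at or above limit_c (hypothetical threshold)."""
    return [r["index"] for r in readings
            if r["temp_c"] is not None and r["temp_c"] >= limit_c]


def read_gpus():
    """Run nvidia-smi once and return parsed readings for every GPU."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_readings(result.stdout)
```

Calling hot_gpus(read_gpus()) while the cards are loaded lists any GPU that has reached the threshold, so an empty list means the heat sink and fans are keeping up.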