Custom server Tesla K80 running an LLM

Цифровая Виртуозность
I have built a multi-GPU system for this from eBay scraps:

- 4x Tesla K80 GPU, 24 GB VRAM each, 130 USD/EUR a piece
- X9DRi-LN4F+ server board, dual Xeon, 128 GB RAM, bought on eBay for 160 USD/EUR
- custom frame built from aluminium profiles and a piece of MDF (total cost about 80 USD/EUR)
- Alibaba mining PSU, 1800 W currently, will upgrade to 2000 W (used, 70 USD/EUR)

Total money spent for one node is about 1000 USD/EUR.

GPU

The basic driver was the K80: slow, but it has the best VRAM/money ratio. I want to run large models later on. I don't mind if inference takes 5x longer; it will still be significantly faster than CPU inference. K80s sell for about 100-140 USD on eBay. I got mine for a little less than that because I bought batches of 4; however, since I am in Europe I had to pay for shipping and taxes... meh.

Cooling

Forget about all these 3D-printed gizmos trying to emulate server airflow: they are super loud, don't work very well, and are expensive. Just tape two 80/90 mm fans on with aluminium tape (see link above). The cards do not get hotter than 65 °C, which is perfectly fine.

Mainboard/CPU/RAM

I got a bundle with 2x Intel Xeon E5-2650 0 @ 2.00 GHz and 128 GB RAM, which runs fine. A CPU with fewer cores and a higher clock would probably be a better fit for the purpose. I think the minimum RAM requirement for this setup would be 64 GB.

Power supply

Power requirements for the thing are 4 x 300 W for the GPUs + 500 W for the rest (if you just go for a...
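The VRAM-per-dollar argument and the PSU sizing above can be sketched as a quick back-of-the-envelope calculation. The figures below are the rough numbers quoted in the post (batch price, nominal K80 TDP, the 500 W estimate for the rest of the system), not measured values:

```python
# Back-of-the-envelope budget for the K80 build described above.
# All constants are the approximate figures from the post.
GPU_COUNT = 4
GPU_PRICE_USD = 130     # per Tesla K80, eBay batch price
GPU_VRAM_GB = 24        # per K80 (2x 12 GB GK210 dies on one card)
GPU_TDP_W = 300         # nominal board power per K80
REST_W = 500            # board, dual Xeons, RAM, fans (author's estimate)

vram_total_gb = GPU_COUNT * GPU_VRAM_GB            # total pool of VRAM
usd_per_gb = GPU_PRICE_USD / GPU_VRAM_GB           # the "VRAM/money ratio"
peak_power_w = GPU_COUNT * GPU_TDP_W + REST_W      # worst-case draw

print(f"{vram_total_gb} GB VRAM, ~{usd_per_gb:.2f} USD/GB, ~{peak_power_w} W peak")
# 96 GB of VRAM at roughly 5.4 USD/GB, with a ~1700 W worst case --
# which is why the 1800 W mining PSU is marginal and a 2000 W upgrade
# leaves sensible headroom.
```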
