Definition: Project Colossus


An xAI datacenter for AI training (xAI is Elon Musk's AI company). The name is fitting: at an estimated total cost of $4 billion, Colossus is the largest AI datacenter in the world, according to Musk. More advanced chips from NVIDIA will be added, and by 2030, Colossus is expected to operate with the equivalent computing power of millions of H100 GPUs.

Built in a former Electrolux appliance factory in Memphis, Tennessee, Project Colossus will eventually house more than a hundred thousand AI servers. Also known as the "Memphis Supercluster," it was used to train xAI's Grok-3 chatbot. See xAI and AI datacenter.

Colossus and Terafab
Colossus is the xAI datacenter and Terafab is Musk's semiconductor manufacturing facility (see Terafab). See Project Stargate.






100,000 Liquid Cooled GPU Modules
In September 2024, the Colossus datacenter went into full operation with 100,000 NVIDIA H100 modules (top). Colossus is expected to double in size to 200,000 modules. Supermicro's liquid cooling keeps the modules at the required temperature. See H100 and Blackwell. (Image courtesy of xAI, Supermicro Computer, Inc. and NVIDIA.)







A Massive Network
Networking all this equipment is no simple feat. An RDMA fabric ties all the servers together in one huge Ethernet network running at 400 Gbps (see RDMA and Terabit Ethernet). (Images courtesy of xAI and Supermicro Computer, Inc.)






Launched in Only Four Months
Phase I requires 150 MW (megawatts) of electricity. To launch the datacenter in July 2024 with a third of the H100s, 14 transportable VoltaGrid generators were installed to supplement the 8 MW available from the local grid. To keep the power uniform, all electricity flows into batteries and then to the equipment. (Image courtesy of VoltaGrid LLC.)
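
The power figures above lend themselves to a quick back-of-the-envelope check. The following is a minimal sketch, not a description of how the site was actually engineered. It assumes that the 150 MW Phase I figure covers all 100,000 modules, that power demand scales linearly with the number of modules, and that the generators supplied everything the grid did not at launch. The function name power_budget and its result keys are hypothetical, chosen only for illustration.

```python
# Figures quoted in this entry.
PHASE_1_POWER_MW = 150    # total Phase I requirement
PHASE_1_MODULES = 100_000 # H100 modules in Phase I
GRID_POWER_MW = 8         # supplied by the local grid at launch
GENERATOR_COUNT = 14      # transportable generators installed
LAUNCH_DIVISOR = 3        # launch used a third of the modules


def power_budget():
    """Rough launch power estimate (assumes linear scaling with module count)."""
    # All-in power per module: GPU plus its share of servers, cooling and network.
    kw_per_module = PHASE_1_POWER_MW * 1000 / PHASE_1_MODULES

    # Assumption: a third of the modules draws a third of the Phase I power.
    launch_mw = PHASE_1_POWER_MW / LAUNCH_DIVISOR

    # Assumption: generators cover whatever the grid cannot.
    generators_mw = launch_mw - GRID_POWER_MW

    return {
        "kw_per_module": kw_per_module,
        "launch_mw": launch_mw,
        "generators_mw": generators_mw,
        "mw_per_generator": generators_mw / GENERATOR_COUNT,
    }


if __name__ == "__main__":
    for name, value in power_budget().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions, the estimate works out to about 1.5 kW per module all-in and roughly 3 MW per generator at launch. These are illustrative numbers derived only from the figures in this entry, not vendor ratings for the actual equipment.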