A GPU-based chip designed to train and run AI systems. The training side (deep learning) demands the most processing: large language models can require quadrillions of calculations per second, sustained for days, weeks or months.
The execution side, called "inference," also requires high-performance chips. When people type a prompt into a chatbot, they expect results within a few seconds, and GPUs are used for inference processing as well. NVIDIA is the world leader in AI chips (see Blackwell). See Tensor core, neural processing unit, deep learning and Cerebras AI computer.
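The training/inference distinction above can be sketched in code. The following is a minimal toy example (a hypothetical one-layer linear model in NumPy, not a real AI workload): training repeats a forward pass, a backward pass and a weight update millions of times, while inference is a single forward pass per query, which is why it needs far less compute but must still return an answer within seconds.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))      # model weights (toy scale)
x = rng.standard_normal((8, 4))      # a batch of training inputs
y = rng.standard_normal((8, 3))      # target outputs

# --- Training: forward pass, gradient, weight update, repeated many times.
# On a real AI chip these matrix multiplies run massively in parallel.
for _ in range(500):
    pred = x @ W                     # forward pass (matrix multiply)
    grad = x.T @ (pred - y) / len(x) # backward pass (gradient of MSE loss)
    W -= 0.2 * grad                  # gradient-descent step

# --- Inference: just one forward pass for a single query.
query = rng.standard_normal((1, 4))
answer = query @ W                   # shape (1, 3)
```

The compute asymmetry is visible even here: training performed 500 iterations of two matrix products each, while inference needed one.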
The Xilinx Versal System-on-Chip
Today's chips often include AI processing. This Versal SoC contains more than 30 billion transistors and provides the parallel processing required for AI (green). It also contains programmable hardware, a rarity on any SoC (red) (see SoC and FPGA). See Versal.
(Image courtesy of Xilinx.)