Definition: Tensor Processing Unit


A custom chip from Google built for machine learning. Introduced in 2016 and found only in Google data centers, the Tensor Processing Unit (TPU) is optimized for matrix multiplications, the core computations of AI training and inference.

Running models built with Google's TensorFlow AI library, TPUs are designed to process huge volumes of low-precision math very quickly. They do not support NVIDIA's TF32 floating point format (see TF32).
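The precision-for-throughput trade can be sketched in plain Python. TPUs use the bfloat16 format, which keeps float32's sign bit and 8-bit exponent but truncates the mantissa to 7 bits, while sums of products are accumulated at higher precision. The sketch below is illustrative only; it mimics bfloat16 rounding by masking float32 bits and is not Google's hardware implementation.

```python
import struct

def to_bfloat16(x: float) -> float:
    """Reduce a value to bfloat16 precision by truncating its
    float32 encoding to the top 16 bits (sign, 8-bit exponent,
    7-bit mantissa)."""
    bits = struct.unpack('>I', struct.pack('>f', x))[0]
    return struct.unpack('>f', struct.pack('>I', bits & 0xFFFF0000))[0]

def matmul_bf16(a, b):
    """Multiply two matrices with bfloat16 inputs but full-precision
    accumulation -- the mixed-precision pattern used in AI hardware."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0  # accumulator stays at full precision
            for k in range(inner):
                acc += to_bfloat16(a[i][k]) * to_bfloat16(b[k][j])
            out[i][j] = acc
    return out
```

For example, `to_bfloat16(1.1)` returns 1.09375: the 7-bit mantissa cannot represent 1.1 exactly, which is the precision loss TPUs accept in exchange for packing far more multiply units onto the chip.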

TPU Versions and Edge Computing
TPUv1 through TPUv4 were the first four versions of the chip, released from 2016 to 2021. In 2018, Google introduced the lighter-weight Edge v1 chip, designed for on-device AI processing, which was used in Google's Pixel 4 smartphone (see Tensor chip).

Ironwood
In 2025, Google introduced Ironwood, its seventh-generation TPU and the first Google TPU designed specifically for inference. The chips are liquid cooled, and up to 9,216 Ironwood TPUs can be linked with Google's InterChip Interconnect (ICI). See Tensor core, TensorFlow, PaLM and neural network.

TPUs vs. Tensor Cores
Tensor Processing Units (TPUs) are stand-alone custom chips for AI workloads, whereas Tensor cores are processing units built into NVIDIA GPUs. See Tensor core.