A custom-built chip for machine learning from Google. Introduced in 2016 and deployed in Google's datacenters, the Tensor Processing Unit (TPU) is optimized for the matrix multiplications that dominate AI training and inference.
Built to run models from Google's TensorFlow AI library, TPUs are designed to process huge volumes of low-precision math very quickly. They do not support the TF32 floating point format (see
TF32).
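Software typically reaches a TPU through a compiler stack such as XLA. Below is a minimal sketch, assuming a Python environment with JAX installed and a TPU attached (array shapes are illustrative), of the kind of low-precision matrix math a TPU accelerates:

```python
# Minimal sketch: a bfloat16 matrix multiply, the core operation TPUs
# are built around. Assumes JAX with TPU support; shapes are illustrative.
import jax
import jax.numpy as jnp

print(jax.devices())  # on a TPU host, lists the attached TPU cores

# bfloat16 is the low-precision format TPU hardware favors (not TF32).
a = jnp.ones((4096, 4096), dtype=jnp.bfloat16)
b = jnp.ones((4096, 4096), dtype=jnp.bfloat16)

# jax.jit compiles through XLA, which maps the multiply onto the
# TPU's matrix multiply units.
matmul = jax.jit(jnp.matmul)
c = matmul(a, b)
```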
TPU Versions and Edge Computing
TPUv1 through TPUv4 were the first four versions of the chip, released from 2016 to 2021. In 2018, Google introduced a lighter-weight Edge TPU for on-device AI processing; a derivative of it powered the Pixel Neural Core in Google's Pixel 4 smartphone (see
Tensor chip).
Ironwood
In 2025, Google introduced Ironwood, its seventh-generation TPU and the first Google TPU designed specifically for inference. Liquid cooled, Ironwood chips can be linked, up to 9,216 at a time, over Google's Inter-Chip Interconnect (ICI); a sketch of how software addresses such a multi-chip slice follows the cross-references below. See
Tensor core,
TensorFlow,
PaLM and
neural network.
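As a minimal sketch, assuming JAX on a multi-chip TPU slice (the array size is illustrative, and the axis name "chips" is invented for the example), software typically sees the ICI-linked chips as a device mesh over which arrays are sharded:

```python
# Minimal sketch: sharding one array across the TPU chips in a slice.
# Assumes JAX with TPU support; sizes and axis name are illustrative.
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec

devices = np.array(jax.devices())            # every TPU chip in the slice
mesh = Mesh(devices, axis_names=("chips",))
sharding = NamedSharding(mesh, PartitionSpec("chips"))

# One logical array, physically split across the linked chips. Reductions
# such as the sum below move partial results over the interconnect.
x = jax.device_put(jnp.ones(len(devices) * 1024, dtype=jnp.bfloat16), sharding)
total = x.sum()
```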
TPUs vs. Tensor Cores
Tensor Processing Units (TPUs) are standalone custom chips for AI workloads, whereas Tensor cores are matrix-math units built into NVIDIA GPUs. See
Tensor core.