Also called an "intelligent processing unit" (IPU), a neural processing unit (NPU) is a set of circuits or an independent chip designed to accelerate the execution of trained AI models, a process known as "inference." Although GPUs are primarily used in the initial training phase, an NPU may also speed up processing at that early stage (see
AI training vs. inference). See
AI PC.
The NPU accelerates the matrix multiplications and non-linear functions at the heart of neural network processing, which is especially valuable in desktop and mobile devices that have limited computing power compared to the datacenter.
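The two operations mentioned above can be sketched in a few lines: a neural-network layer is essentially a matrix multiplication followed by a non-linear activation function. This is an illustrative sketch only; a real NPU performs the same arithmetic in fixed-function hardware, often using low-precision integer formats.

```python
import numpy as np

def relu(z):
    # Non-linear activation: the element-wise function an NPU
    # applies after each matrix multiply.
    return np.maximum(z, 0)

# Hypothetical layer weights and input vector for illustration.
W = np.array([[1.0, -2.0],
              [0.5,  3.0]])
x = np.array([2.0, 1.0])

# Matrix multiplication followed by the non-linearity --
# the core workload an NPU is designed to accelerate.
y = relu(W @ x)
print(y)  # [0. 4.]
```

An inference pass through a full network simply repeats this pattern layer after layer, which is why accelerating these two operations speeds up AI applications as a whole.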
In 2016, Qualcomm introduced its Zeroth Machine Intelligence Platform and Snapdragon Neural Processing Engine, which enabled mobile devices to execute AI applications on their own rather than sending data to the cloud. See
neural network and
neuromorphic computing.