The software in an AI system that does the processing for the user. The name is peculiar; however, the term "inference" dates back to very early AI systems and has not gone away.
Also called the "AI runtime," the "inference engine" uses the AI model to answer questions, generate all kinds of content, make forecasts, translate languages and more.
In a non-AI system, the counterpart to inference is simply an application processing data. However, machine learning is a two-part system: training and execution, the latter called inference (see the sketch below). See
inference engine and
AI training vs. inference.
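The two phases can be seen in a minimal sketch, assuming scikit-learn is installed; the tiny dataset and the choice of model are illustrative only, not a specific inference engine.

```python
# Training phase: the model learns its parameters from labeled examples.
from sklearn.linear_model import LogisticRegression

X_train = [[0.0], [1.0], [2.0], [3.0]]   # feature values (illustrative)
y_train = [0, 0, 1, 1]                   # labels (illustrative)
model = LogisticRegression()
model.fit(X_train, y_train)

# Inference phase: the trained model is applied to new, unseen input.
# In a deployed system, this step is what the inference engine performs.
print(model.predict([[2.5]]))            # e.g., predicts class 1
```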