Training an AI model is performed in two stages: pre-training and fine-tuning.
Training Stage 1 - Pre-Training
In the pre-training phase, the model is fed millions or even billions of text samples to learn general language patterns and facts. Massive amounts of content on the Internet have been culled into huge datasets that are used to pre-train large language models (LLMs). See
large language model.
Text is typically converted into numeric tokens, and trillions of tokens can be generated from massive datasets. Text pre-training can take from days to months and consumes enormous amounts of computing time. In contrast, when image models are pre-trained on labeled data (this is a dog, this is a cat), training can take only hours or days. See
AI model and
AI token.
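As a rough illustration of tokenization, the sketch below assigns an integer ID to each word in a tiny corpus. Real LLM tokenizers use subword schemes such as byte-pair encoding, and the function names here (build_vocab, tokenize) are hypothetical, not from any particular library.

```python
def build_vocab(corpus):
    """Assign an integer ID to each unique word in a list of sentences."""
    vocab = {}
    for sentence in corpus:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

def tokenize(text, vocab):
    """Map each word to its token ID; unknown words map to -1."""
    return [vocab.get(word, -1) for word in text.lower().split()]

corpus = ["the cat sat", "the dog ran"]
vocab = build_vocab(corpus)
print(tokenize("the cat ran", vocab))  # [0, 1, 4]
```

A production tokenizer would split words into smaller subword pieces so that unseen words never fall back to an "unknown" ID.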
Training Stage 2 - Fine-Tuning
Fine-tuning is more targeted, and the data examples are geared to more specific subjects. For example, companies can fine-tune AI models using their own data. Fine-tuning a model may take only hours or days, in contrast to the weeks or months of pre-training.
Supervised Fine-Tuning and RLHF
Supervised fine-tuning uses input-output pairs; for example, if the input is this, the output should be that. Reinforcement Learning from Human Feedback (RLHF) uses supervised fine-tuning along with human ranking of the generated answers.
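A minimal sketch of the supervised idea, with a one-parameter linear model standing in for an LLM: the parameters are nudged until each input produces its paired output. The function name and learning-rate values are illustrative, not from any real framework.

```python
def fine_tune(pairs, w=0.0, lr=0.1, epochs=100):
    """Gradient descent on squared error over (input, output) pairs."""
    for _ in range(epochs):
        for x, y in pairs:
            pred = w * x
            grad = 2 * (pred - y) * x   # derivative of (w*x - y)^2
            w -= lr * grad
    return w

# The training data says: if the input is x, the output should be 2x.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
w = fine_tune(pairs)
print(round(w, 3))  # converges toward 2.0
```

An actual LLM has billions of parameters rather than one, but the loop is the same in spirit: compare the model's output with the desired output and adjust the weights to shrink the difference.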
Unsupervised Fine-Tuning
The model learns patterns on its own without any labeled data or question/answer pairs. See
AI training vs. inference and
AI cost function.
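To show how patterns can emerge from raw, unlabeled text, the toy bigram model below simply counts which word follows which; no labels or question/answer pairs are involved. The helper names (learn_bigrams, predict_next) are hypothetical.

```python
from collections import defaultdict, Counter

def learn_bigrams(raw_text):
    """Count word-pair frequencies in unlabeled text."""
    counts = defaultdict(Counter)
    words = raw_text.lower().split()
    for a, b in zip(words, words[1:]):
        counts[a][b] += 1
    return counts

def predict_next(counts, word):
    """Return the most common follower of a word, if any."""
    followers = counts.get(word.lower())
    return followers.most_common(1)[0][0] if followers else None

model = learn_bigrams("the claim was filed the claim was approved")
print(predict_next(model, "claim"))  # "was"
```

Continued next-word training on a company's own documents works the same way at vastly larger scale: the model absorbs domain vocabulary and phrasing purely from the statistics of the raw text.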