Custom models

Training a custom model

Training a large language model is only necessary when you need to teach it something niche, such as your company's internal knowledge base or other specialized domain knowledge. Common knowledge, like the colour of the sky, does not require training.

There are several reasons one might choose to train a custom language model:

  1. Domain-specific language: If the language used in a domain is unique and differs significantly from general language, a custom model trained for that domain can give better results than a general-purpose model.
  2. Better accuracy: A custom model that is trained on a large, diverse and high-quality dataset that is specific to the task can achieve higher accuracy compared to using a pre-trained general-purpose model.
  3. Privacy and data security: Training a custom model allows you to keep the training data within your own organization, which can be important in industries where data privacy is a concern.
  4. Fine-tuning: Pre-trained language models can be fine-tuned to a specific task using a smaller, task-specific dataset. Training a custom model from scratch, however, gives you more control over the training data and the model architecture, which can lead to better performance.
  5. Control over the generated output: Training a custom model allows you to tailor the generated output to specific needs or preferences, such as ensuring the output is in a specific tone or style.

Comparing a baseline model to a custom model

Token likelihood is a useful tool for model evaluation. For instance, if you've trained a custom model and want to know how much it has improved over the default model, you can use token likelihoods to compare how the two models perform on some held-out text.
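As a rough illustration of that comparison, the sketch below assumes you have already obtained per-token log-likelihoods for the same held-out text from both models. How you retrieve them depends on your API or library and is not shown here. The function names and the numbers are hypothetical, and the per-token averages are only comparable if both models use the same tokenizer.

```python
import math
from typing import Dict, Sequence


def mean_log_likelihood(token_log_likelihoods: Sequence[float]) -> float:
    """Average log-likelihood per token (closer to 0 is better)."""
    if not token_log_likelihoods:
        raise ValueError("need at least one token log-likelihood")
    return sum(token_log_likelihoods) / len(token_log_likelihoods)


def perplexity(token_log_likelihoods: Sequence[float]) -> float:
    """Perplexity is exp of the negative mean log-likelihood (lower is better)."""
    return math.exp(-mean_log_likelihood(token_log_likelihoods))


def compare_models(
    baseline: Sequence[float], custom: Sequence[float]
) -> Dict[str, float]:
    """Summarize how the custom model scores the held-out text vs. the baseline."""
    baseline_ll = mean_log_likelihood(baseline)
    custom_ll = mean_log_likelihood(custom)
    return {
        "baseline_mean_ll": baseline_ll,
        "custom_mean_ll": custom_ll,
        # Positive means the custom model finds the text more likely.
        "improvement": custom_ll - baseline_ll,
        "baseline_perplexity": perplexity(baseline),
        "custom_perplexity": perplexity(custom),
    }


# Hypothetical per-token log-likelihoods for the same held-out text.
baseline_lls = [-2.0, -1.0, -3.0, -2.0]
custom_lls = [-1.0, -0.5, -1.5, -1.0]

report = compare_models(baseline_lls, custom_lls)
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

A higher mean log-likelihood (or, equivalently, a lower perplexity) on held-out text suggests the custom model has learned the target domain better than the baseline. Averaging over a larger held-out set gives a more reliable estimate than a single short passage.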