Fine-tuning is the process of training an existing AI model on your own custom dataset so it learns specific behaviors, knowledge, or writing styles. While AI Engine does support fine-tuning through OpenAI, we strongly recommend using embeddings and context-based approaches instead.
Why We Don’t Recommend Fine-Tuning
Fine-tuning might sound appealing, but in practice it comes with significant challenges:
- High effort: You need a large, high-quality dataset (at least 500 rows, ideally 1,000+). Each row must be carefully crafted with realistic questions and well-written answers.
- Difficult to maintain: Every time your content changes, you would need to retrain the model, which is unrealistic for most businesses.
- Unpredictable results: If the dataset isn’t perfect, the model may produce random, poorly formatted, or completely wrong answers.
- Cost: Fine-tuning consumes significant resources and the resulting model costs more to run than a standard one.
- Limited flexibility: A fine-tuned model only knows what it has been taught. It can’t adapt to new information without retraining.
Use Embeddings Instead
Modern AI models have large context windows that can process thousands of tokens at once. Combined with embeddings, this means you can dynamically provide relevant knowledge to the AI at query time, without any training.
Here’s why embeddings are better for most use cases:
- Easy to update: Add, edit, or remove content anytime without retraining.
- Accurate: The AI references your actual content, reducing hallucinations.
- Fast to set up: You can be up and running in minutes, not days.
- Cost-effective: Uses standard models with no additional training costs.
To learn how to set up embeddings, check the Knowledge (Embeddings) section and the Add Embeddings guide.
If You Still Want to Fine-Tune
If you have a specific use case that truly requires fine-tuning (e.g., teaching the model a very specific tone or behavior that can’t be achieved through instructions), here’s what you need to know.
Preparing Your Dataset
- Your dataset needs at least 500 rows, but 1,000 to 2,500 is recommended.
- Each row is a prompt/completion pair: the prompt is a question (as a visitor would ask it), and the completion is the ideal answer. A few example rows are shown after this list.
- Answers should be well-written, in the tone you want your AI to have.
- One question per topic is not enough. Train the model with varied and creative phrasings of similar questions.
- Include questions that are out of scope too, with answers that politely decline or redirect.
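For example, a few rows might look like this (the content is purely hypothetical, for an imaginary store):

Prompt: What are your shipping times?
Completion: We usually ship within 24 hours, and delivery takes 3 to 5 business days.

Prompt: How long will my order take to arrive?
Completion: Orders leave our warehouse within 24 hours and typically arrive within 3 to 5 business days.

Prompt: Can you help me fix my car?
Completion: That's a bit outside what I can help with! I'm happy to answer anything about our products or your orders, though.

Note how the first two rows rephrase the same topic, while the last one politely declines an out-of-scope request.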
Using Fine-Tuning in AI Engine
AI Engine includes a Dataset Builder that helps you prepare your data for fine-tuning through OpenAI. When using it:
- Build your dataset in the Dataset Builder.
- Click Format with Defaults to apply the correct formatting. Every row in the dataset must be green (properly formatted).
- Submit the dataset for fine-tuning through OpenAI (the data is sent in OpenAI's fine-tuning format, shown after this list).
- Once complete, select your fine-tuned model in the chatbot settings.
- Do not set the context or content-aware parameter — the fine-tuned model already contains its knowledge.
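For reference, each row of a submitted dataset becomes one line of the JSONL file that OpenAI expects for fine-tuning. The exact layout depends on the model (older models used plain prompt/completion pairs), but for current chat models a row looks roughly like this; AI Engine prepares it for you, so you never have to write it by hand:

{"messages": [{"role": "user", "content": "What are your shipping times?"}, {"role": "assistant", "content": "We usually ship within 24 hours, and delivery takes 3 to 5 business days."}]}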
Using Dynamic Values
If you need your fine-tuned model to include dynamic information (like an email address or a URL that may change), train the model to reply with placeholders. For example, train it to output {EMAIL} instead of a real email, then replace it dynamically using the mwai_ai_reply filter:
add_filter( 'mwai_ai_reply', function ( $reply, $query ) {
    // Swap the placeholder used during training for the real, current value.
    $reply->replace( '{EMAIL}', '[email protected]' ); // use your actual email address here
    return $reply;
}, 10, 2 );
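Like any WordPress filter, this snippet can be placed in your theme's functions.php file or in a small custom plugin. The same callback should also work for several placeholders (for example a {URL}), with one replace() call per placeholder.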
For more details on modifying replies, see Modify a Reply.
Troubleshooting
If your fine-tuned model produces random or broken answers:
- Make sure every row in the dataset was green (properly formatted) before training.
- Lower the temperature to 0.1 or 0.2 to make the model more deterministic.
- Make your dataset more comprehensive — cover more question variations and edge cases.
- Ensure the fine-tuned model is selected in your chatbot settings and that the context/content-aware parameters are disabled.
Summary
For the vast majority of use cases, embeddings + good instructions will give you better results than fine-tuning, with far less effort. Fine-tuning should only be considered as a last resort for very specific behavioral needs that can’t be solved through prompting and context.