Sovereign AI

Fine-tune a sovereign LLM

Specialize an open-source LLM on your own data, on GPU infrastructure hosted in Europe: LoRA, datasets, and sizing, without your data training a third-party model.

Updated June 2026


A generic model knows the world but not your business. Fine-tuning teaches it your vocabulary, your formats, your tone: a model that drafts your reports the way you do, classifies your tickets with your categories, or answers in your industry's jargon.

Run on European-hosted infrastructure, training keeps your data under your control, whereas consumer platforms may reuse what you hand them.

Fine-tuning or RAG: which to choose

The two approaches solve different problems, and often combine.

RAG (retrieval-augmented generation): the model fetches information from your documents at query time. Ideal when knowledge changes often (catalog, document base, living FAQ).
Fine-tuning: you adjust the model so it internalizes a style, a format, or a lasting behavior. Ideal when the shape of the answer matters as much as the content.

In practice, many projects start with RAG, then add light fine-tuning when the tone or format doesn't follow.
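
To make the RAG side of the comparison concrete, here is a minimal sketch of the retrieval step, assuming a toy word-overlap score in place of a real embedding search. The sample documents and the `retrieve` and `build_prompt` helpers are hypothetical and purely illustrative. The point is that the knowledge stays in the documents and is injected at query time, while the model's weights do not change.

```python
import re

def tokenize(text):
    """Lowercase a text and return its set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, top_k=1):
    """Rank documents by word overlap with the query.

    Toy stand-in for a vector search; a real system would use embeddings.
    """
    query_words = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]

def build_prompt(query, documents, top_k=1):
    """Assemble the prompt sent to the model: retrieved context, then the question."""
    context = "\n".join(retrieve(query, documents, top_k))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical document base, for illustration only.
DOCS = [
    "Returns are accepted within 30 days of delivery.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is open Monday to Friday.",
]

print(build_prompt("How long does shipping take?", DOCS))
```

Updating the catalog or the FAQ only means editing `DOCS`; nothing is retrained. Fine-tuning, by contrast, changes how the model writes its answer, not what it can look up.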

LoRA: specialize without retraining everything

Retraining a whole model is expensive in GPU time. LoRA (and its variants) freezes the base weights and trains only a small set of additional parameters, which changes everything on a tight budget:

training fits on one or two GPUs instead of a cluster;
the adapters produced typically weigh a few megabytes to a few hundred megabytes, depending on the rank and the layers targeted, so they are easy to version and swap;
you keep the base model intact and stack several specializations.

It's the default approach to specialize a Mistral or a Llama without a lab budget.
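
The savings can be checked with a little arithmetic. LoRA leaves a weight matrix of shape d_out x d_in frozen and learns two small matrices of shapes d_out x r and r x d_in, so the trainable parameters per matrix number r * (d_in + d_out). The sketch below assumes an illustrative 4096 x 4096 projection, rank 8, and 64 adapted matrices stored in 16 bits; these figures are assumptions chosen for the example, not measurements of a specific model.

```python
def lora_param_count(d_in, d_out, rank):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return rank * (d_in + d_out)

def adapter_size_mb(n_matrices, d_in, d_out, rank, bytes_per_param=2):
    """Approximate adapter size on disk, in megabytes (16-bit storage by default)."""
    total_params = n_matrices * lora_param_count(d_in, d_out, rank)
    return total_params * bytes_per_param / 1e6

# Illustrative assumptions: one 4096 x 4096 projection, rank 8.
full_params = 4096 * 4096
lora_params = lora_param_count(4096, 4096, rank=8)

print(lora_params)                # 65536
print(lora_params / full_params)  # 0.00390625, i.e. about 0.4% of the full matrix
print(adapter_size_mb(64, 4096, 4096, rank=8))  # roughly 8.4 MB for 64 adapted matrices
```

Raising the rank or adapting more layers grows the adapter proportionally, which is why adapter sizes vary from one project to another.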

Preparing the dataset

Fine-tuning quality depends on the data first, the hardware second.

Gather representative examples of the task: input/output pairs that show exactly the expected behavior.
Clean and format into a consistent instruction format. A few hundred to a few thousand well-chosen examples beat a large noisy volume.
Keep a validation set aside to measure whether the model truly improves and isn't just memorizing.
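
The three steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration, assuming raw data arrives as (input, output) pairs; the `instruction` and `response` field names, the 10% validation ratio, and the helper names are hypothetical choices, not a required format.

```python
import json
import random

def clean_pairs(pairs):
    """Strip whitespace, drop empty or duplicate pairs, and format as records."""
    seen = set()
    records = []
    for source, target in pairs:
        source, target = source.strip(), target.strip()
        if not source or not target or (source, target) in seen:
            continue
        seen.add((source, target))
        records.append({"instruction": source, "response": target})
    return records

def split_train_validation(records, validation_ratio=0.1, seed=42):
    """Shuffle reproducibly, then hold out a validation slice."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_ratio))
    return shuffled[n_validation:], shuffled[:n_validation]

def to_jsonl(records):
    """Serialize records as JSON Lines, one example per line."""
    return "\n".join(json.dumps(r, ensure_ascii=False) for r in records)

# Hypothetical examples, for illustration only.
raw = [
    ("Classify: my invoice is wrong", "billing"),
    ("Classify: my invoice is wrong ", "billing"),  # duplicate after stripping
    ("Classify: I cannot log in", "access"),
    ("", "billing"),                                 # empty input, dropped
]
train, validation = split_train_validation(clean_pairs(raw))
print(to_jsonl(train))
```

The validation records are never shown to the model during training; comparing outputs on them before and after fine-tuning is what tells improvement apart from memorization.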

Your data doesn't leave the European infrastructure throughout the process.

Sizing the training

Fine-tuning needs more memory than inference, because gradients, optimizer states, and activations have to be kept in memory alongside the weights. A few reference points:

a LoRA on a 7B model fits on a 24 GB GPU in most cases;
a larger model needs quantization (QLoRA) or several GPUs;
duration depends on dataset size and the number of passes, from a few minutes to a few hours for a reasonable LoRA.
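
These reference points follow from a rough rule of thumb, sketched below. It assumes mixed-precision training with the Adam optimizer, which costs about 16 bytes per trained parameter (16-bit weights and gradients, plus 32-bit master weights and two optimizer moments), while frozen base weights cost 2 bytes per parameter in 16 bits or about 0.5 bytes in 4 bits. Activations, which depend on batch size and sequence length, are deliberately left out, and the 20 million trainable LoRA parameters are an illustrative assumption.

```python
BYTES_PER_TRAINED_PARAM = 16  # assumption: mixed precision + Adam

def to_gb(n_params, bytes_per_param):
    """Memory in gigabytes for n_params stored at bytes_per_param each."""
    return n_params * bytes_per_param / 1e9

def full_finetune_gb(n_params):
    """Every parameter is trained: weights, gradients, and optimizer states."""
    return to_gb(n_params, BYTES_PER_TRAINED_PARAM)

def lora_gb(n_params, n_trainable, base_bytes_per_param=2.0):
    """Frozen base weights plus the training state of the small adapter."""
    return to_gb(n_params, base_bytes_per_param) + to_gb(n_trainable, BYTES_PER_TRAINED_PARAM)

n_params = 7e9      # a 7B model
n_trainable = 20e6  # illustrative adapter size (assumption)

print(full_finetune_gb(n_params))                                 # about 112 GB
print(lora_gb(n_params, n_trainable))                             # about 14.3 GB
print(lora_gb(n_params, n_trainable, base_bytes_per_param=0.5))   # about 3.8 GB (4-bit base, QLoRA-style)
```

Under these assumptions, a 16-bit LoRA on a 7B model leaves roughly 10 GB of a 24 GB GPU for activations, which is why it fits in most cases but not with very long sequences or large batches.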

Bunker provides the GPU and operates the training; you provide the data and the task definition.

Frequently asked questions

Does my training data feed a third-party model?

No. Training happens on dedicated infrastructure in Europe. Your data and the resulting adapter belong to you and are shared with no one.

Do I need a lot of examples?

Not necessarily. For a LoRA, a few hundred to a few thousand quality examples are often enough. Consistency beats volume.

What's the concrete difference from a well-written prompt?

A good prompt takes you far. Fine-tuning takes over when you want stable behavior, without re-explaining the format on every request, across a large number of calls.

Can I take back the specialized model?

Yes. The base model is open source and the adapter belongs to you. You can bring the whole thing back in-house, on your own hardware.

Next steps

Deploy Mistral on-premise: serve the model once specialized.
Host a private LLM in Europe: the general sovereign-AI picture.

Specialize your model on your data

LoRA on GPUs in Europe: your data and your adapter stay yours.
