# https://github.com/mistralai/mistral-common Project Manual

Generated at: 2026-06-10 13:53:16 UTC

## Table of Contents

- [Overview, Installation & Architecture](#page-1)
- [Tokenizers: Tekken, SentencePiece & Multimodal Encoders](#page-2)
- [Instruct Protocol, Validation & Multimodal Requests](#page-3)
- [Experimental Server, Tool Decoding, Guidance & HF Chat Templates](#page-4)

<a id='page-1'></a>

## Overview, Installation & Architecture

### Related Pages

Related topics: [Tokenizers: Tekken, SentencePiece & Multimodal Encoders](#page-2), [Instruct Protocol, Validation & Multimodal Requests](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/mistralai/mistral-common/blob/main/README.md)
- [pyproject.toml](https://github.com/mistralai/mistral-common/blob/main/pyproject.toml)
- [src/mistral_common/__init__.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/__init__.py)
- [src/mistral_common/tokens/tokenizers/base.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/base.py)
- [src/mistral_common/tokens/tokenizers/instruct.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/instruct.py)
- [src/mistral_common/protocol/instruct/messages.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/messages.py)
- [src/mistral_common/protocol/instruct/request.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/request.py)
- [src/mistral_common/protocol/instruct/normalize.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/normalize.py)
- [src/mistral_common/integrations/chat_templates/template_generator.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/integrations/chat_templates/template_generator.py)
- [src/mistral_common/guidance/grammar_factory.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/guidance/grammar_factory.py)
</details>

# Overview, Installation & Architecture

## Purpose and Scope

**mistral-common** is the open-source companion library released by Mistral AI that ships the reference tokenizers, validators, and request-normalization logic used to interact with Mistral models. The README states explicitly that the project "open-source[s] the tokenizers, validation and normalization code that can be used with our models", with the goal of guaranteeing that downstream users "can take full advantage of our models" for tokenization (text, images, audio, tool calls) and for validating and normalizing chat-style requests built on top of [Pydantic](https://docs.pydantic.dev/latest/). Source: [README.md:7-15]()

The library is intentionally scoped to two audiences:

1. **Application developers** who want to call Mistral models (locally or via API) and need the exact same prompt formatting and special tokens the models were trained on.
2. **Model builders** who want to reproduce Mistral's tokenization and validation behavior in their own training stacks. Source: [README.md:19-24]()

Versioning of tokenizers is a deliberate design choice: each Mistral model release ships with a pinned tokenizer version so that "backward compatibility for the models that we release" is preserved even as the surrounding Python code evolves. Source: [README.md:15-17]()

## Installation and Optional Dependencies

The base package installs with a single command. Optional extras (`image`, `audio`, `hf-hub`, `sentencepiece`, and the experimental `server`) add further capabilities on top of it:

```sh
pip install mistral-common
```

Each extra can be installed independently. The `sentencepiece` extra is now optional because "we only release `Tekken` tokenizers for recent models"; SentencePiece is therefore only needed when targeting older checkpoints. The `server` extra is marked `[Experimental]`. Source: [README.md:28-45]()

### Optional dependency matrix

| Extra | Purpose | When it is needed |
|---|---|---|
| `image` | Image tokenizers | Vision-capable models |
| `audio` | Audio tokenizers | Audio-capable models |
| `hf-hub` | Download tokenizers from Hugging Face | Loading checkpoints from `mistralai/*` repos |
| `sentencepiece` | SentencePiece tokenizer support | Older (pre-Tekken) models |
| `server` (experimental) | Run tokenizers in a server mode | Hosting tokenization as a service |

Source: [README.md:35-44]()
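When diagnosing an installation, it can help to check which optional modules are importable and which package versions are actually installed. The sketch below uses only the standard library. The extra-to-module mapping is an assumption made for illustration (only `soundfile` is mentioned elsewhere in this manual) and should be checked against `pyproject.toml`; `probe_modules` and `installed_version` are hypothetical helper names.

```python
from importlib import metadata
from importlib.util import find_spec
from typing import Optional

# Assumed mapping from an extra to one top-level module it provides.
# These module names are illustrative guesses, not read from pyproject.toml.
ASSUMED_EXTRA_MODULES = {
    "audio": "soundfile",
    "hf-hub": "huggingface_hub",
    "sentencepiece": "sentencepiece",
}


def probe_modules(extra_to_module: dict[str, str]) -> dict[str, bool]:
    """Report, per extra, whether its module can be located without importing it."""
    return {extra: find_spec(module) is not None for extra, module in extra_to_module.items()}


def installed_version(package: str) -> Optional[str]:
    """Return the installed distribution version, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print("pydantic:", installed_version("pydantic"))
    print("mistral-common:", installed_version("mistral-common"))
    for extra, available in probe_modules(ASSUMED_EXTRA_MODULES).items():
        print(f"{extra}: {'available' if available else 'missing'}")
```

Printing the `pydantic` version next to the `mistral-common` version is a quick first step when a downstream pin and the library's constraint disagree.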

> **Community note (Issue #15):** Pydantic version conflicts are a recurring source of install failures. Downstream projects that pin a specific `pydantic` version can collide with `mistral-common`'s Pydantic constraint because the library's validation layer is built directly on Pydantic models. Source: [README.md:11-13]()

## High-Level Architecture

The codebase is organized into three loosely-coupled layers: a **protocol layer** (Pydantic data classes for messages, tools, and requests), a **tokenizer layer** (concrete tokenizer implementations driven by a versioned base class), and an **integration layer** (chat-template generation, grammar/guidance, and OpenAI compatibility shims).

```mermaid
flowchart TB
    A[ChatCompletionRequest<br/>OpenAI / dict] -->|from_openai| B[Protocol Layer<br/>messages, tools, requests]
    B --> C[InstructRequestNormalizer<br/>v1..v15]
    C --> D[InstructTokenizer<br/>encode_user / encode_assistant / encode_tool]
    D --> E[Base Tokenizer<br/>SentencePiece or Tekken]
    E --> F[Tokenized<br/>tokens + text + images/audios]

    G[TemplateConfig] --> H[template_generator<br/>Jinja2]
    B --> H
    H --> I[Grammar factory<br/>Lark grammars]

    style B fill:#eef
    style D fill:#efe
    style H fill:#fee
```

The `Tokenizer` abstract base class defines the vocabulary surface every implementation must expose (`n_words`, `vocab`, `bos_id`, `eos_id`, `pad_id`, `unk_id`, `encode`, `decode`, `is_special`, etc.) and carries a `model_settings_builder` reference so the same tokenizer object can be queried for version-specific behavior. Source: [src/mistral_common/tokens/tokenizers/base.py:32-100]()

## Core Modules

### Tokenizer layer

`Tokenizer` is an `ABC` whose subclasses (`SentencePieceTokenizer`, `Tekkenizer`, etc.) plug into an `InstructTokenizer` that knows how to render the prompt template for a specific tokenizer version (`v1`, `v2`, `v3`, `v7`, `v11`, `v13`, `v15`...). Each `InstructTokenizer` provides `encode_user_message`, `encode_assistant_message`, and `encode_tool_message` methods that consume the strongly-typed messages from the protocol layer and emit `Tokenized` outputs (token ids, text, optional images/audios). Source: [src/mistral_common/tokens/tokenizers/instruct.py:46-91](), [src/mistral_common/tokens/tokenizers/base.py:55-58]()

The `SpecialTokens` enum centralizes all control tokens used across versions (`[INST]`, `[/INST]`, `[AVAILABLE_TOOLS]`, `[TOOL_RESULTS]`, `<s>`, `</s>`, plus image, audio, and thinking markers such as `[THINK]`/`[/THINK]`). Source: [src/mistral_common/tokens/tokenizers/base.py:14-67]()

> **Community note (Issue #3):** A regression was reported where a missing space between `<s>` and `[INST]` caused SentencePiece tokenizers to emit malformed instruct prefixes. The tokenizer layer is precisely the place such template-formatting bugs surface, since prompt rendering is performed at the boundary between the abstract `Tokenizer` and the per-version `InstructTokenizer`.

### Protocol layer

Pydantic models define the canonical chat schema. `UserMessage`, `AssistantMessage`, `ToolMessage`, and `SystemMessage` each carry a `to_openai` / `from_openai` pair so the library can interop with any OpenAI-compatible client. The `Roles` enum locks down the four legal message roles. Source: [src/mistral_common/protocol/instruct/messages.py:62-138]()
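The round-trip idea can be illustrated with a small stand-in. This is a simplified sketch, not the library's classes: `Role` and `SketchMessage` are hypothetical names, the real message types are Pydantic models with more fields, and only text content is modeled here.

```python
from dataclasses import dataclass
from enum import Enum


class Role(str, Enum):
    """The four roles named above (simplified stand-in for `Roles`)."""

    system = "system"
    user = "user"
    assistant = "assistant"
    tool = "tool"


@dataclass(frozen=True)
class SketchMessage:
    """Hypothetical text-only message used to show the conversion pair."""

    role: Role
    content: str

    def to_openai(self) -> dict[str, str]:
        return {"role": self.role.value, "content": self.content}

    @classmethod
    def from_openai(cls, payload: dict[str, str]) -> "SketchMessage":
        # Role(...) raises ValueError for anything outside the four legal roles.
        return cls(role=Role(payload["role"]), content=payload["content"])


original = SketchMessage(Role.user, "What is the capital of France?")
round_tripped = SketchMessage.from_openai(original.to_openai())
```

Because the role is an enum rather than a free-form string, an unknown role fails at conversion time instead of reaching the tokenizer.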

`InstructRequest` wraps a list of messages plus an optional `available_tools` list and exposes a `to_openai` helper used by the chat-completion entry point. Source: [src/mistral_common/protocol/instruct/request.py:18-50]()

### Normalization layer

`InstructRequestNormalizer` and its versioned subclasses (`InstructRequestNormalizerV7`, etc.) validate that a `ChatCompletionRequest` is well-formed and aggregate consecutive same-role messages before they reach the tokenizer. For pre-v15 normalizers, `build_settings` returns `ModelSettings` with all fields `None`; for v15+ it actually constructs a settings object from the request. Source: [src/mistral_common/protocol/instruct/normalize.py:29-90]()
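The aggregation step can be sketched in a few lines. This is an illustrative approximation under stated assumptions: messages are plain text-only dicts, `aggregate_same_role` is a hypothetical name, and the `\n\n` joiner is borrowed from the chat-template description later in this manual rather than confirmed for every normalizer version.

```python
from itertools import groupby


def aggregate_same_role(messages: list[dict[str, str]], joiner: str = "\n\n") -> list[dict[str, str]]:
    """Merge each run of adjacent messages that share a role into one message."""
    merged = []
    # groupby only groups *consecutive* items, which is exactly the behavior wanted.
    for role, run in groupby(messages, key=lambda m: m["role"]):
        merged.append({"role": role, "content": joiner.join(m["content"] for m in run)})
    return merged


conversation = [
    {"role": "user", "content": "Hello"},
    {"role": "user", "content": "Are you there?"},
    {"role": "assistant", "content": "Yes."},
    {"role": "user", "content": "Great."},
]
normalized = aggregate_same_role(conversation)
```

Note that the final user message is not merged with the first two, because an assistant turn separates them.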

### Integration layer

`template_generator` emits a Jinja2 chat template from a `TemplateConfig` dataclass that captures the tokenizer version, `spm` flag, image/audio/thinking support, and whether BOS/EOS should be emitted as Jinja variables (`bos_token`/`eos_token`) or as literal strings. The config object enforces mutual exclusivity (`image_support` and `audio_support` cannot both be true; `thinking_support` and `plain_thinking_support` cannot both be true) by raising `ValueError` at construction time. Source: [src/mistral_common/integrations/chat_templates/template_generator.py:1-58]()
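The construction-time checks can be sketched with a frozen dataclass. `TemplateConfigSketch` is a hypothetical stand-in that keeps only the fields named above; the real `TemplateConfig` may have additional fields, different defaults, and different error messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateConfigSketch:
    """Simplified stand-in showing the two mutual-exclusivity rules."""

    version: str
    spm: bool = False
    image_support: bool = False
    audio_support: bool = False
    thinking_support: bool = False
    plain_thinking_support: bool = False

    def __post_init__(self) -> None:
        # Reject contradictory feature combinations as soon as the object is built.
        if self.image_support and self.audio_support:
            raise ValueError("image_support and audio_support are mutually exclusive")
        if self.thinking_support and self.plain_thinking_support:
            raise ValueError("thinking_support and plain_thinking_support are mutually exclusive")


valid_config = TemplateConfigSketch(version="v13", image_support=True, thinking_support=True)
```

Validating in `__post_init__` means an invalid combination can never reach the template-rendering code.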

The `guidance.grammar_factory` module consumes the rendered Jinja template plus a tool list and produces a Lark grammar suitable for constrained decoding, honoring the tokenizer's tool-calling mode (`auto`, `any`, `none`, or a `NamedToolChoice`). Source: [src/mistral_common/guidance/grammar_factory.py:21-48]()

## Typical Request Flow

1. A user constructs or receives an OpenAI-style `ChatCompletionRequest` (or a list of dicts).
2. `InstructRequest.from_openai` (or the equivalent normalizer entry point) validates the payload and turns it into a strongly-typed `InstructRequest`. Source: [src/mistral_common/protocol/instruct/request.py:18-50]()
3. The version-appropriate `InstructTokenizer` encodes each message and concatenates the resulting tokens according to that tokenizer version's template. Source: [src/mistral_common/tokens/tokenizers/instruct.py:46-91]()
4. The `Tokenized` output is fed to the model; the same library can re-emit it as OpenAI JSON via `to_openai` for logging or round-tripping. Source: [src/mistral_common/protocol/instruct/messages.py:62-138]()

## See Also

- [Tokenizers & Special Tokens](tokenizers-and-special-tokens.md)
- [Instruct Protocol & Messages](instruct-protocol.md)
- [Chat Templates & Grammar Generation](chat-templates-and-grammars.md)
- [Versioning & Backward Compatibility](versioning.md)

---

<a id='page-2'></a>

## Tokenizers: Tekken, SentencePiece & Multimodal Encoders

### Related Pages

Related topics: [Overview, Installation & Architecture](#page-1), [Instruct Protocol, Validation & Multimodal Requests](#page-3), [Experimental Server, Tool Decoding, Guidance & HF Chat Templates](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/mistral_common/tokens/tokenizers/base.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/base.py)
- [src/mistral_common/tokens/tokenizers/tekken.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/tekken.py)
- [src/mistral_common/tokens/tokenizers/sentencepiece.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/sentencepiece.py)
- [src/mistral_common/tokens/tokenizers/instruct.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/instruct.py)
- [src/mistral_common/tokens/tokenizers/mistral.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/mistral.py)
- [src/mistral_common/tokens/tokenizers/image.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/image.py)
- [src/mistral_common/tokens/tokenizers/utils.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/utils.py)
- [src/mistral_common/experimental/think.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/experimental/think.py)
- [src/mistral_common/guidance/grammar_factory.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/guidance/grammar_factory.py)
- [src/mistral_common/integrations/chat_templates/template_generator.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/integrations/chat_templates/template_generator.py)
- [README.md](https://github.com/mistralai/mistral-common/blob/main/README.md)
</details>

# Tokenizers: Tekken, SentencePiece & Multimodal Encoders

## Overview

`mistral-common` exposes the open-source tokenization, validation, and normalization stack that powers Mistral AI models. At its core sits a pluggable `Tokenizer` abstraction implemented by `Tekkenizer` (a fast BPE tokenizer based on [tiktoken](https://github.com/openai/tiktoken)) and `SentencePieceTokenizer` (a wrapper over SentencePiece). On top of those base encoders, an `InstructTokenizer` layer applies chat templates, tool-call syntax, and multimodal encoding, with versioned behaviors from V1 through V15. Source: [src/mistral_common/tokens/tokenizers/base.py:64-150]().

The library ships first-class multimodal encoders for images and audio. Recent models are released with `Tekken` tokenizers, and the `sentencepiece` install extra is now marked optional in the README. Source: [README.md:8-20]().

## Class Hierarchy

```mermaid
classDiagram
    class Tokenizer {
        <<abstract>>
        +n_words: int
        +special_ids: set
        +vocab() list
        +encode(s, bos, eos) list
        +decode(tokens) str
    }
    class Tekkenizer
    class SentencePieceTokenizer
    class InstructTokenizerBase {
        +tokenizer: Tokenizer
        +image_encoder
        +audio_encoder
        +encode_instruct()
    }
    class InstructTokenizerV1
    class InstructTokenizerV3
    class InstructTokenizerV7
    class InstructTokenizerV11
    class InstructTokenizerV13
    class InstructTokenizerV15
    Tokenizer <|-- Tekkenizer
    Tokenizer <|-- SentencePieceTokenizer
    InstructTokenizerBase <|-- InstructTokenizerV1
    InstructTokenizerBase <|-- InstructTokenizerV3
    InstructTokenizerBase <|-- InstructTokenizerV7
    InstructTokenizerBase <|-- InstructTokenizerV11
    InstructTokenizerBase <|-- InstructTokenizerV13
    InstructTokenizerBase <|-- InstructTokenizerV15
    InstructTokenizerBase o-- Tokenizer
    InstructTokenizerBase o-- ImageEncoder
    InstructTokenizerBase o-- AudioEncoder
```

## Base Tokenizer Interface

The `Tokenizer` ABC defines the contract every implementation must satisfy: vocabulary queries (`n_words`, `vocab`, `id_to_piece`), special-token lookups (`bos_id`, `eos_id`, `pad_id`, `unk_id`), and the round-trip methods `encode(s, bos, eos)` / `decode(tokens, special_token_policy)`. Source: [src/mistral_common/tokens/tokenizers/base.py:64-150]().

The `Tokenized` dataclass is the canonical return type. It carries `tokens: list[int]`, an optional `text` preview, FIM `prefix_ids`, and parallel `images: list[np.ndarray]` / `audios: list[Audio]` lists. Pydantic's `ConfigDict(arbitrary_types_allowed=True)` is used so that numpy arrays and audio dataclasses can sit alongside primitive token ids. Source: [src/mistral_common/tokens/tokenizers/base.py:30-60]().
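To make the contract concrete, here is a toy implementation. All class names are hypothetical, the interface is reduced to three members, and the whitespace "tokenizer" exists only to show how an implementation plugs into the abstract base; it is not how `Tekkenizer` or `SentencePieceTokenizer` work.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TokenizedSketch:
    """Hypothetical stand-in for `Tokenized` (the real one relies on Pydantic)."""

    tokens: list[int]
    text: Optional[str] = None
    images: list = field(default_factory=list)


class TokenizerSketch(ABC):
    """Reduced version of the abstract contract described above."""

    @property
    @abstractmethod
    def n_words(self) -> int: ...

    @abstractmethod
    def encode(self, s: str, bos: bool, eos: bool) -> list[int]: ...

    @abstractmethod
    def decode(self, tokens: list[int]) -> str: ...


class ToyWordTokenizer(TokenizerSketch):
    """Toy whitespace tokenizer; ids 0 and 1 are reserved for BOS and EOS."""

    bos_id, eos_id = 0, 1

    def __init__(self, words: list[str]) -> None:
        self._id_to_word = ["<s>", "</s>", *words]
        self._word_to_id = {w: i for i, w in enumerate(self._id_to_word)}

    @property
    def n_words(self) -> int:
        return len(self._id_to_word)

    def encode(self, s: str, bos: bool, eos: bool) -> list[int]:
        ids = [self._word_to_id[w] for w in s.split()]  # KeyError on unknown words
        return ([self.bos_id] if bos else []) + ids + ([self.eos_id] if eos else [])

    def decode(self, tokens: list[int]) -> str:
        # Drop the two special ids so single-space-separated input round-trips.
        return " ".join(self._id_to_word[t] for t in tokens if t > 1)


tok = ToyWordTokenizer(["hello", "world"])
result = TokenizedSketch(tokens=tok.encode("hello world", bos=True, eos=False), text="hello world")
```

Code that depends only on `TokenizerSketch` works with any implementation, which is the property the real `InstructTokenizer` layer relies on.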

The `SpecialTokens` namespace enumerates every control token the encoders know about: `bos = "<s>"`, `eos = "</s>"`, `begin_inst = "[INST]"`, `begin_tools = "[AVAILABLE_TOOLS]"`, `tool_calls`, `img`, `img_break`, `img_end`, `begin_audio`, `begin_think`, `end_think`, FIM markers (`prefix`, `suffix`, `middle`), and audio streaming markers (`streaming_pad`, `streaming_word`, `text_to_audio`, `audio_to_text`). Source: [src/mistral_common/tokens/tokenizers/base.py:155-210]().

A high-engagement community report (Issue #3) flagged that the SentencePiece path was missing a separator between `<s>` and `[INST]`, which prevented the model from emitting the proper instruct template; the fix lives in [src/mistral_common/tokens/tokenizers/sentencepiece.py]() and is one reason chat-templated decoding must remain versioned.

## Tekken Tokenizer

`Tekkenizer` is the high-throughput BPE tokenizer used by recent Mistral models. It loads a `tekken.json` artifact containing a `ModelData` record with `vocab: list[TokenInfo]`, `special_tokens: list[SpecialTokenInfo]`, a `TekkenConfig` (regex `pattern`, `num_vocab_tokens`, `default_vocab_size`, `default_num_special_tokens`, and a `version` string), and per-modality `image: ImageConfig` / `audio: AudioConfig` blocks. Source: [src/mistral_common/tokens/tokenizers/tekken.py:50-110]().

Each `TokenInfo` stores `rank` plus base64-encoded `token_bytes`; `SpecialTokenInfo` adds an `is_control` flag. A `DEPRECATED_SPECIAL_TOKENS` tuple preserves backward compatibility for the legacy control-token list. Source: [src/mistral_common/tokens/tokenizers/tekken.py:30-50]().
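As a rough illustration of how such entries can be consumed, the sketch below decodes base64 `token_bytes` into a bytes-to-rank table, which is the shape tiktoken-style BPE encoders take as input. The dict layout of each entry and the `build_rank_table` helper are assumptions for this example; the loader in `tekken.py` may differ.

```python
import base64


def build_rank_table(vocab: list[dict]) -> dict[bytes, int]:
    """Map raw token bytes to their BPE rank, rejecting duplicate byte strings."""
    table: dict[bytes, int] = {}
    for entry in vocab:
        token = base64.b64decode(entry["token_bytes"])
        if token in table:
            raise ValueError(f"duplicate token bytes: {token!r}")
        table[token] = entry["rank"]
    return table


sample_vocab = [
    {"rank": 0, "token_bytes": "YQ=="},  # base64 of b"a"
    {"rank": 1, "token_bytes": "aGk="},  # base64 of b"hi"
]
ranks = build_rank_table(sample_vocab)
```

Storing tokens as base64 keeps arbitrary byte sequences, including invalid UTF-8 fragments, safe inside a JSON file.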

The factory helper `download_tokenizer_from_hf_hub` (imported by `mistral.py`) pulls these JSON files from the Hugging Face Hub, gated behind the `hf-hub` extra. Source: [src/mistral_common/tokens/tokenizers/mistral.py:1-30]().

## SentencePiece Tokenizer

`SentencePieceTokenizer` is a thin adapter over the SentencePiece runtime. Valid tokenizer filenames are detected by the suffix scheme `<name>.model.<version>[mm]` together with a bare `tekken.json`; if multiple files are present, `tekken.json` wins, otherwise the highest versioned SPM model is selected. Source: [src/mistral_common/tokens/tokenizers/utils.py:30-60]().
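The selection rule described above can be approximated as follows. The regular expression is an assumption about the filename scheme (for example `tokenizer.model.v3`, with an optional `m<N>` multimodal suffix), and `pick_tokenizer_file` is a hypothetical helper, not the function in `utils.py`.

```python
import re

# Assumed pattern for versioned SentencePiece files, e.g. "tokenizer.model.v3"
# or "tokenizer.model.v3m1" (the optional "m<N>" marks a multimodal variant).
_SPM_PATTERN = re.compile(r"\.model\.v(\d+)(?:m(\d+))?$")


def pick_tokenizer_file(filenames: list[str]) -> str:
    """Prefer tekken.json; otherwise return the highest-versioned SPM file."""
    if "tekken.json" in filenames:
        return "tekken.json"
    candidates = []
    for name in filenames:
        match = _SPM_PATTERN.search(name)
        if match:
            version = int(match.group(1))
            multimodal = int(match.group(2) or 0)
            # Compare versions numerically so that v11 ranks above v3.
            candidates.append(((version, multimodal), name))
    if not candidates:
        raise ValueError("no tokenizer file found")
    return max(candidates)[1]
```

Parsing the version as an integer matters: a plain string comparison would rank `v3` above `v11`.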

The tokenizer exposes a `get_image_config` helper and an `is_sentencepiece` predicate used to dispatch to the SPM-specific instruct branches. For example, the v3-SPM chat template differs from the v3-Tekken template only in its tool-call branch (`uses_v2_v3spm_tool_branch`). Source: [src/mistral_common/integrations/chat_templates/template_generator.py:240-260]().

## Multimodal Encoders

`InstructTokenizerBase` composes a base `Tokenizer` with optional `ImageEncoder` and `AudioEncoder` instances. The `mistral.py` factory wires the correct special ids: image encoding uses `img`, `img_break`, and `img_end` (collected in a `SpecialImageIDs` dataclass), while audio uses `audio`, `begin_audio`, plus transcribe/text-to-audio/audio-to-text tokens when present, stored in `SpecialAudioIDs`. Source: [src/mistral_common/tokens/tokenizers/mistral.py:18-50]().

The audio encoder is only loaded for `Tekkenizer` instances, reflecting Mistral's current release policy. Image chunks are emitted with `[IMG]` markers and processed by `ImageEncoder`; the v15+ template also accepts thinking chunks via the experimental `ThinkChunk` API, and the `_split_content_and_think_chunks` helper reconstructs them by scanning token streams for `begin_think` / `end_think` markers, raising on nested or unbalanced think blocks. Source: [src/mistral_common/experimental/think.py:1-40]().
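The scanning logic for think blocks can be sketched as a single pass over the token ids. This is a minimal approximation, not the code in `think.py`: `split_content_and_think` is a hypothetical name, the marker ids are passed in explicitly, and the sketch treats an unclosed block as an error, which the real helper may handle differently.

```python
def split_content_and_think(
    tokens: list[int], begin_think: int, end_think: int
) -> list[tuple[str, list[int]]]:
    """Split a token stream into ("content", ids) and ("think", ids) segments."""
    segments: list[tuple[str, list[int]]] = []
    current: list[int] = []
    in_think = False
    for tok in tokens:
        if tok == begin_think:
            if in_think:
                raise ValueError("nested think block")
            if current:
                segments.append(("content", current))
            current, in_think = [], True
        elif tok == end_think:
            if not in_think:
                raise ValueError("end_think without a matching begin_think")
            segments.append(("think", current))
            current, in_think = [], False
        else:
            current.append(tok)
    if in_think:
        raise ValueError("unclosed think block")
    if current:
        segments.append(("content", current))
    return segments


# 100 and 101 are made-up ids standing in for the begin_think / end_think tokens.
parts = split_content_and_think([1, 100, 2, 3, 101, 4], begin_think=100, end_think=101)
```

The marker tokens themselves are dropped from the output; only the ids between them are kept in each segment.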

## Instruct Tokenizer Variants

`InstructTokenizerBase` is the versioned entry point. `find_first_last_user` locates the first and last user messages, `_truncate_for_max_tokens` enforces context limits, and `encode_instruct` walks the conversation applying role-specific encoders. Release v1.11.3 fixed `continue_final_message` so that prepended reasoning chunks are not lost on continuation. Source: [src/mistral_common/tokens/tokenizers/instruct.py:30-90]().

Versioned subclasses (V1, V3, V7, V11, V13, V15) differ in tool-call syntax, image/audio support, and the placement of `ThinkChunk` markers. The `TemplateConfig` class centralizes the rules — for example, `forbids_assistant_content_with_tools` is True only for V2/V3, and `system_supports_thinking` is pre-v15 only. Source: [src/mistral_common/integrations/chat_templates/template_generator.py:200-260]().

The Guidance integration reuses these templates to build Lark grammars. `GrammarFactory.from_template` renders the Jinja template (passing `fcall`, `mode`, `json_schema`, `parallel_tool_calls`) and combines the result with optional `begin_think` / `end_think` token allow-lists when the model supports `model_settings`. Source: [src/mistral_common/guidance/grammar_factory.py:80-110]().

## Community Notes

- **Issue #15 (Pydantic version conflict)** — Because tokenizers depend on Pydantic v2 models such as `Tokenized` and `InstructRequest`, a tight `pydantic<2.7` constraint can break downstream consumers; pin `pydantic` ranges carefully when mixing with other packages. Source: [src/mistral_common/tokens/tokenizers/base.py:30-60]().
- **Issue #3 (`<s>` / `[INST]` spacing)** — Tracked the missing separator in the SPM tokenizer; fixed in `src/mistral_common/tokens/tokenizers/sentencepiece.py`.
- **Release v1.11.3** — Added multi-format reasoning parsing to `from_openai` and preserved zero OpenAI seeds in chat request conversion; also fixed `continue_final_message` so prepended reasoning chunks survive continuation.

## See Also

- Chat template generator: versioned Jinja templates and `TemplateConfig`
- Pydantic protocol layer: validation and normalization of `InstructRequest` / `AssistantMessage`
- Multimodal encoders: image, audio, and FIM encoding internals
- Guidance integration: Lark grammar generation from chat templates

---

<a id='page-3'></a>

## Instruct Protocol, Validation & Multimodal Requests

### Related Pages

Related topics: [Tokenizers: Tekken, SentencePiece & Multimodal Encoders](#page-2), [Experimental Server, Tool Decoding, Guidance & HF Chat Templates](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/mistral_common/protocol/instruct/request.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/request.py)
- [src/mistral_common/protocol/instruct/messages.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/messages.py)
- [src/mistral_common/protocol/instruct/chunk.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/chunk.py)
- [src/mistral_common/protocol/instruct/tool_calls.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/tool_calls.py)
- [src/mistral_common/protocol/instruct/validator.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/validator.py)
- [src/mistral_common/protocol/instruct/normalize.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/normalize.py)
- [src/mistral_common/tokens/tokenizers/instruct.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/instruct.py)
- [src/mistral_common/tokens/tokenizers/mistral.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/mistral.py)
- [src/mistral_common/tokens/tokenizers/base.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/base.py)
- [src/mistral_common/integrations/chat_templates/template_generator.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/integrations/chat_templates/template_generator.py)
- [README.md](https://github.com/mistralai/mistral-common/blob/main/README.md)
</details>

# Instruct Protocol, Validation & Multimodal Requests

## Overview

The **Instruct Protocol** is the canonical data model used by `mistral-common` to represent, validate, and normalize chat-style requests sent to Mistral AI models. According to the [README.md](https://github.com/mistralai/mistral-common/blob/main/README.md), the library provides "validation and normalization of requests, messages, tool calls, and responses" built on top of Pydantic, plus tokenization of text, images, and tool calls. Community issue [#15](https://github.com/mistralai/mistral-common/issues/15) highlights a recurring concern about Pydantic version compatibility, so users should pin compatible Pydantic versions when depending on this layer.

The protocol layers three concerns:

1. **Schema** — Pydantic models defining what a valid request looks like (`ChatCompletionRequest`, `InstructRequest`, message and tool types).
2. **Validation / Normalization** — rules that reject malformed requests and massage them into a canonical form ready for tokenization.
3. **Multimodal content** — `Chunk` types that let a single message carry text, images, audio, tool calls, and "thinking" segments together.

## Request and Message Schemas

The top-level entry points are `ChatCompletionRequest` and the simpler `InstructRequest`, both defined in [src/mistral_common/protocol/instruct/request.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/request.py). `ChatCompletionRequest` extends the OpenAI-style shape with Mistral-specific fields:

| Field | Purpose |
|-------|---------|
| `messages` | Ordered list of `ChatMessageType` (user, assistant, system, tool). |
| `tools` / `tool_choice` | Function-calling definitions and selection policy. |
| `response_format` | Text vs. JSON-object response formatting. |
| `truncate_for_context_length` | Auto-trim messages if they exceed the model window. |
| `continue_final_message` | Resume generation from the last assistant turn (fixed in v1.11.3 per release notes). |
| `reasoning_effort` | Controls how much reasoning effort the model should apply. |

`to_openai()` on the request converts messages and tools into the OpenAI wire format and accepts an optional `reasoning_field_format` argument. In the opposite direction, v1.11.3 extended `from_openai` to parse multiple reasoning formats.

Message types live in [src/mistral_common/protocol/instruct/messages.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/messages.py) (`UserMessage`, `AssistantMessage`, `ToolMessage`, `SystemMessage`). The `AssistantMessage` is notable for carrying both `content` and an optional `prefix` flag — when `prefix=True`, generation continues the assistant's text without re-emitting a closing turn marker. The `tool_calls` field is defined in [src/mistral_common/protocol/instruct/tool_calls.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/tool_calls.py) and supports typed function calls with JSON-schema parameters.

## Multimodal Content and Chunks

A message's `content` may be a plain string or a list of `Chunk` objects, defined in [src/mistral_common/protocol/instruct/chunk.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/chunk.py). The `ChunkTypes` enum discriminates the supported variants:

- `text` — plain text with a `text` field.
- `image` — base64 or URL reference, lazily decoded into a `SerializableImage`.
- `audio` — base64/URL reference; format is auto-detected via `soundfile` (see `_detect_audio_format`).
- `tool_call` — emitted alongside assistant text when a tool is invoked.
- `think` — reasoning segment emitted between `<think>`/`</think>`-style markers.

Special tokens used by chunk processing are defined in [src/mistral_common/tokens/tokenizers/base.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/base.py) as `SpecialTokens` literals: `begin_inst` (`[INST]`), `begin_tools` (`[AVAILABLE_TOOLS]`), `img`, `img_break`, `img_end`, `begin_audio`, `begin_think`, `end_think`, plus `bos` (`<s>`) and `eos` (`</s>`). Community issue [#3](https://github.com/mistralai/mistral-common/issues/3) noted a missing space between `<s>` and `[INST]` in the SentencePiece tokenizer — a subtle formatting bug that prevented models from emitting the proper instruct template.

## Validation and Normalization Pipeline

The pipeline converts an incoming `ChatCompletionRequest` into a tokenizer-ready `InstructRequest` (and eventually token IDs). The orchestration happens in [src/mistral_common/tokens/tokenizers/mistral.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/mistral.py), which wires together an `InstructTokenizer`, a `MistralRequestValidator`, and an `InstructRequestNormalizer`:

```mermaid
flowchart LR
    A[ChatCompletionRequest] --> B[MistralRequestValidator]
    B --> C[InstructRequestNormalizer]
    C --> D[InstructRequest]
    D --> E[InstructTokenizer]
    E --> F[Tokenized: tokens + images + prefix_ids]
```

- **Validator** ([src/mistral_common/protocol/instruct/validator.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/validator.py)) enforces structural invariants — non-empty assistant turns, role ordering, tool-result semantics, and `continue_message` requiring `prefix=False`.
- **Normalizer** ([src/mistral_common/protocol/instruct/normalize.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/normalize.py)) materializes the version-specific `InstructRequest` subclass and, for v15+ tokenizers, populates `ModelSettings` (e.g. truncation policy). For pre-v15 normalizers, `build_settings` returns all-`None` defaults.
- **Encoder** ([src/mistral_common/tokens/tokenizers/instruct.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/tokens/tokenizers/instruct.py)) turns the normalized request into token IDs by emitting role-specific headers, tool definitions, and content chunks. It returns a `Tokenized` record whose `images` field carries the loaded `np.ndarray` arrays associated with image chunks.
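The validator stage can be sketched as a function that raises on the first violated invariant. Everything here is a simplified assumption: `Msg`, `validate_request`, and `InvalidRequestSketchError` are hypothetical names, only three of the rules above are modeled, and the real `MistralRequestValidator` uses its own exception types and more detailed checks.

```python
from dataclasses import dataclass, field


class InvalidRequestSketchError(ValueError):
    """Hypothetical error type used only in this sketch."""


@dataclass
class Msg:
    role: str
    content: str = ""
    tool_calls: list = field(default_factory=list)
    prefix: bool = False


def validate_request(messages: list[Msg], continue_final_message: bool = False) -> None:
    """Raise if the conversation breaks one of the modeled structural rules."""
    if not messages:
        raise InvalidRequestSketchError("request has no messages")
    for i, msg in enumerate(messages):
        # An assistant turn must carry text or at least one tool call.
        if msg.role == "assistant" and not msg.content and not msg.tool_calls:
            raise InvalidRequestSketchError(f"assistant message {i} is empty")
        # A tool result needs some earlier assistant message that made a tool call.
        if msg.role == "tool" and not any(
            m.role == "assistant" and m.tool_calls for m in messages[:i]
        ):
            raise InvalidRequestSketchError(f"tool message {i} has no preceding tool call")
    last = messages[-1]
    # Continuation resumes the last assistant turn, which must not be a prefix message.
    if continue_final_message and (last.role != "assistant" or last.prefix):
        raise InvalidRequestSketchError(
            "continuation needs a final assistant message with prefix=False"
        )
```

Running validation before normalization keeps the later stages simple, since they can assume a structurally sound conversation.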

## Chat Template Generation

For open-source Mistral releases, [src/mistral_common/integrations/chat_templates/template_generator.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/integrations/chat_templates/template_generator.py) generates a Jinja2 chat template that mirrors the in-Python encoder. A `TemplateConfig` selects the tokenizer version (`v1`, `v2`, `v3`, `v7`, `v11`, `v13`, `v15`) and feature toggles (`spm`, `image_support`, `audio_support`, `thinking_support`, `plain_thinking_support`). The generator emits message-aggregation logic that joins consecutive same-role messages with `\n\n`, supports tool-call inline branches for v2/v3-SPM, and routes `<think>` chunks either through special tokens or plain `<think>`/`</think>` markers. This file is what enables downstream runtimes (vLLM, TGI, llama.cpp) to render prompts that round-trip with `mistral-common`'s own encoder.

## Failure Modes and Edge Cases

Common pitfalls when working with this layer include:

- **Version-specific tool rules.** Per the `forbids_assistant_content_with_tools` property, v2 and v3 tokenizers reject assistant messages that carry both `content` and `tool_calls` simultaneously.
- **Continue-final-message guard.** The encoder raises `InvalidAssistantMessageException` when `continue_message=True` is combined with `prefix=True`, since continuation requires `prefix=False`.
- **Audio/image exclusivity.** The template generator raises `ValueError` if `image_support` and `audio_support` are both enabled.
- **SPM compatibility.** Recent models release `Tekken` tokenizers; the `sentencepiece` extra is now optional (per [README.md](https://github.com/mistralai/mistral-common/blob/main/README.md)).

## See Also

- [Tokenizer Base & Special Tokens](./tokenizer-base.md)
- [Mistral Tokenizer & Encoding](./mistral-tokenizer.md)
- [Jinja Chat Template Generator](./chat-templates.md)
- [Tool Calls & Function Calling](./tool-calls.md)

---

<a id='page-4'></a>

## Experimental Server, Tool Decoding, Guidance & HF Chat Templates

### Related Pages

Related topics: [Tokenizers: Tekken, SentencePiece & Multimodal Encoders](#page-2), [Instruct Protocol, Validation & Multimodal Requests](#page-3)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this page:

- [src/mistral_common/experimental/app/main.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/experimental/app/main.py)
- [src/mistral_common/experimental/app/models.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/experimental/app/models.py)
- [src/mistral_common/experimental/app/routers.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/experimental/app/routers.py)
- [src/mistral_common/experimental/tools.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/experimental/tools.py)
- [src/mistral_common/experimental/think.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/experimental/think.py)
- [src/mistral_common/experimental/utils.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/experimental/utils.py)
- [src/mistral_common/integrations/chat_templates/template_generator.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/integrations/chat_templates/template_generator.py)
- [src/mistral_common/guidance/grammar_factory.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/guidance/grammar_factory.py)
- [src/mistral_common/protocol/instruct/normalize.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/normalize.py)
- [src/mistral_common/protocol/instruct/messages.py](https://github.com/mistralai/mistral-common/blob/main/src/mistral_common/protocol/instruct/messages.py)
- [README.md](https://github.com/mistralai/mistral-common/blob/main/README.md)
</details>

## Overview

`mistral-common` exposes four closely related capabilities across its `experimental` namespace and integration layer: a FastAPI-based tokenizer server, tool/function-call decoding helpers, a `guidance`-backed grammar factory for constrained generation, and a programmatic generator for the chat template stored in a Hugging Face `tokenizer_config.json`. The README explicitly marks the server mode as `[Experimental]` and ships it behind the `server` extra (`pip install "mistral-common[server]"`), while the optional `hf-hub` and `sentencepiece` extras cover the chat-template and SentencePiece paths used by the integrations code. Source: [README.md:23-43]()

These features share a common goal: let third-party runtimes (vLLM, SGLang, TGI, raw `transformers`) consume Mistral instruct formats without re-implementing the special-token grammar or the version-aware tool/thinking conventions. The `experimental/` package is the staging area where new behaviors land before being promoted into the stable `protocol/` and `tokens/` layers.

## Experimental Server

The experimental server lives under `src/mistral_common/experimental/app/` and follows a standard FastAPI layout: `main.py` wires the ASGI app, `routers.py` defines the HTTP routes, and `models.py` declares the request/response Pydantic schemas. Source: [src/mistral_common/experimental/app/main.py](), [src/mistral_common/experimental/app/routers.py](), [src/mistral_common/experimental/app/models.py]()

Its purpose is to expose tokenization over HTTP, backed by `MistralTokenizer`, so that the heavy tokenizer state is decoupled from the inference process. The server is available only when the `server` extra is installed (`pip install "mistral-common[server]"`). The README tags it `[Experimental]`, which means the API surface and payload schema may change between minor versions. Source: [README.md:25-32]()

```mermaid
flowchart LR
    Client[HTTP Client] -->|POST /v1/tokenize| Router[routers.py]
    Router --> Models[models.py Pydantic schemas]
    Router --> Server[main.py FastAPI app]
    Server --> MT[MistralTokenizer]
    MT --> Tekken[Tekken / SentencePiece backend]
    Server -->|tokens, text, prefix_ids| Client
```
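Assuming the `POST /v1/tokenize` route shown in the diagram, a client call could be prepared as in the sketch below. The base URL, the payload shape (`{"messages": [...]}`), and the helper name `build_tokenize_request` are all assumptions made for illustration; check `routers.py` and `models.py` for the actual paths and schemas before relying on them.

```python
import json
from urllib import request

# Assumption: a locally running experimental server on this host and port.
BASE_URL = "http://127.0.0.1:8000"


def build_tokenize_request(messages, base_url=BASE_URL):
    """Build (but do not send) a JSON POST request for the tokenize route.

    The payload shape is hypothetical and only mirrors the diagram above.
    """
    body = json.dumps({"messages": messages}).encode("utf-8")
    return request.Request(
        url=f"{base_url}/v1/tokenize",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_tokenize_request([{"role": "user", "content": "Hello"}])
# To send it against a running server: request.urlopen(req)
```

Building the request separately from sending it keeps the sketch runnable without a live server.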

The utility helpers in `experimental/utils.py` and the thinking-aware logic in `experimental/think.py` are consumed by the routers to keep the server endpoints aligned with the latest `ReasoningFieldFormat` enum (`thinking_chunks | reasoning | reasoning_content`) used by `AssistantMessage.to_openai`. Source: [src/mistral_common/experimental/utils.py](), [src/mistral_common/experimental/think.py](), [src/mistral_common/protocol/instruct/messages.py:ReasoningFieldFormat]()

## Tool Decoding

`src/mistral_common/experimental/tools.py` provides the tool-decoding primitives used both by the server and by external callers that want to parse Mistral tool-call streams back into structured `ToolCall` objects. The companion `experimental/think.py` complements it by separating reasoning traces from tool-call deltas. Source: [src/mistral_common/experimental/tools.py](), [src/mistral_common/experimental/think.py]()

Tool decoding must respect the same version constraints that the chat template generator encodes. The `forbids_assistant_content_with_tools` property returns `True` for `TokenizerVersion.v2` and `v3`, meaning the decoder must reject assistant messages that mix `content` and `tool_calls` on those versions. Conversely, `validates_assistant_non_empty` is enabled for v3 non-SPM and for v7 and later, and `uses_v2_v3spm_tool_branch` selects the inline `elif` branch used for the `v2` and `v3`-SPM tool syntax. Source: [src/mistral_common/integrations/chat_templates/template_generator.py:forbids_assistant_content_with_tools,validates_assistant_non_empty,uses_v2_v3spm_tool_branch]()
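The three version predicates can be summarized in a small sketch. `VersionRules` is a hypothetical stand-in that uses plain integers for versions; it only restates the rules given in this section and is not the library's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VersionRules:
    """Hypothetical mirror of the version predicates described above."""

    version: int       # e.g. 2, 3, 7, 11, 13
    spm: bool = False  # True for SentencePiece-based tokenizers

    @property
    def forbids_assistant_content_with_tools(self) -> bool:
        # v2 and v3 reject assistant messages with both content and tool_calls.
        return self.version in (2, 3)

    @property
    def validates_assistant_non_empty(self) -> bool:
        # Enabled for v3 non-SPM and for v7 and later.
        return (self.version == 3 and not self.spm) or self.version >= 7

    @property
    def uses_v2_v3spm_tool_branch(self) -> bool:
        # The inline tool branch applies to v2 and to v3 with SPM.
        return self.version == 2 or (self.version == 3 and self.spm)


v3_spm = VersionRules(version=3, spm=True)
v11 = VersionRules(version=11)
```

Deriving every rule from `version` and `spm` keeps the version logic in one place instead of scattering comparisons through the decoder.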

The protocol-level definition of a `Tool` (a `Function` with JSON-schema `parameters`) is the contract `experimental/tools.py` is expected to honor. The `InstructRequest(...).to_openai()` docstring demonstrates this round-trip with a `get_current_weather` tool. Source: [src/mistral_common/protocol/instruct/request.py:InstructRequest to_openai docstring]()
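As an illustration of that contract, the sketch below writes a `get_current_weather` tool as a plain dictionary and checks a JSON-encoded arguments string against the schema's `required` list. The description text, the parameter names, and the helper `missing_required_arguments` are assumptions chosen for the example; they are not copied from the library.

```python
import json

# Illustrative tool definition: a function with JSON-schema parameters.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string"},
                "format": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location", "format"],
        },
    },
}


def missing_required_arguments(tool: dict, arguments_json: str) -> list:
    """Return the required parameter names absent from a decoded call.

    Hypothetical helper: it checks presence only, not types or enums.
    """
    arguments = json.loads(arguments_json)
    required = tool["function"]["parameters"].get("required", [])
    return [name for name in required if name not in arguments]


incomplete = missing_required_arguments(weather_tool, '{"location": "Paris"}')
complete = missing_required_arguments(
    weather_tool, '{"location": "Paris", "format": "celsius"}'
)
```

A full validator would also check types and enum values, for example with a JSON Schema library; this sketch covers only the presence check.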

## Guidance Grammar Factory

`src/mistral_common/guidance/grammar_factory.py` exposes `GrammarFactory`, which renders a Lark grammar string for a given `MistralTokenizer` so that `guidance`/`llguidance` engines can perform token-constrained decoding. The factory supports three function-calling modes (`auto`, `any`, `none`), `NamedToolChoice`, optional `parallel_tool_calls`, and an optional `json_schema` that is unioned into the grammar. Source: [src/mistral_common/guidance/grammar_factory.py:GrammarFactory.encode_for_guided_decoding]()

The constructor asserts that both `llguidance` and `jinja2` are installed (optional dependencies) and refuses to operate on tokenizers it does not understand. `GrammarFactory.is_supported` requires a Tekken tokenizer with `version >= TokenizerVersion.v11`, so legacy SentencePiece models must be upgraded before guided decoding is available. Source: [src/mistral_common/guidance/grammar_factory.py:is_supported,__init__]()

When a tokenizer exposes the `begin_think`/`end_think` special tokens, the factory injects them into the generated grammar so that thinking traces can be emitted alongside JSON tool calls (`think_with_json` branch). The `_convert_tool_calls` helper produces a per-tool `TOOL_CALL_GRAMMAR` fragment, parenthesizes and alternates each entry, and suffixes a `+` when `parallel_tool_calls=True`. Source: [src/mistral_common/guidance/grammar_factory.py:_convert_tool_calls]()
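The parenthesize, alternate, and repeat steps can be sketched as plain string assembly. `combine_tool_fragments` and the placeholder fragment names are hypothetical; the exact spacing and grouping emitted by `_convert_tool_calls` may differ.

```python
def combine_tool_fragments(fragments, parallel_tool_calls: bool) -> str:
    """Join per-tool grammar fragments into one alternation.

    Each fragment is parenthesized, the entries are alternated with '|',
    and a '+' repetition is appended when parallel calls are allowed.
    """
    alternation = " | ".join(f"({fragment})" for fragment in fragments)
    grouped = f"({alternation})"
    return grouped + "+" if parallel_tool_calls else grouped


# Placeholder fragments standing in for rendered TOOL_CALL_GRAMMAR pieces.
fragments = ["call_get_weather", "call_get_time"]
single = combine_tool_fragments(fragments, parallel_tool_calls=False)
parallel = combine_tool_fragments(fragments, parallel_tool_calls=True)
```

The outer group matters: without it, the `+` would bind only to the last alternative rather than to the whole set of tools.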

## HF Chat Template Generator

The integration under `src/mistral_common/integrations/chat_templates/template_generator.py` programmatically builds a tokenizer-version-aware Jinja2 chat template that can be embedded into a Hugging Face `tokenizer_config.json`. It is driven by a `TemplateConfig` dataclass that captures the `TokenizerVersion`, `spm` flag, and the boolean toggles for `image_support`, `audio_support`, `thinking_support`, `plain_thinking_support`, and `use_special_token_variables`. Source: [src/mistral_common/integrations/chat_templates/template_generator.py:TemplateConfig]()

| Config flag | Effect on generated template |
|---|---|
| `version` (v1–v15) | Selects special-token layout, tool-call branch, and message-aggregation rule |
| `spm` | Adds trailing spaces after special tokens; blocked on v11+ and with audio |
| `image_support` (v3+) | Adds `[IMG]` chunk processing; mutually exclusive with audio |
| `audio_support` (v7+) | Adds `[AUDIO]` chunk processing; mutually exclusive with image |
| `thinking_support` (v13+) | Emits `[THINK]/[/THINK]` around reasoning chunks |
| `plain_thinking_support` (v11 only) | Uses `<think>`/`</think>` literal tags instead of special tokens |
| `use_special_token_variables` | Emits `bos_token`/`eos_token` as Jinja variables rather than literals |
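The constraints in the table can be read as a validation routine. The sketch below uses a hypothetical `SketchTemplateConfig` with integer versions. Only the image/audio `ValueError` is stated elsewhere in this manual; raising `ValueError` for the other rules is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchTemplateConfig:
    """Hypothetical stand-in for the flags listed in the table above."""

    version: int
    spm: bool = False
    image_support: bool = False
    audio_support: bool = False
    thinking_support: bool = False
    plain_thinking_support: bool = False

    def validate(self) -> None:
        if self.image_support and self.audio_support:
            raise ValueError("image_support and audio_support are mutually exclusive")
        if self.spm and (self.version >= 11 or self.audio_support):
            raise ValueError("spm is blocked on v11+ and with audio")
        if self.image_support and self.version < 3:
            raise ValueError("image_support requires v3 or later")
        if self.audio_support and self.version < 7:
            raise ValueError("audio_support requires v7 or later")
        if self.thinking_support and self.version < 13:
            raise ValueError("thinking_support requires v13 or later")
        if self.plain_thinking_support and self.version != 11:
            raise ValueError("plain_thinking_support applies to v11 only")


def is_valid(config: SketchTemplateConfig) -> bool:
    """Return True when the flag combination passes every check."""
    try:
        config.validate()
    except ValueError:
        return False
    return True
```

For example, `is_valid(SketchTemplateConfig(version=13, thinking_support=True))` passes, while enabling both `image_support` and `audio_support` fails regardless of version.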

Template generation is split into composable helpers: `_generate_header` emits the BOS, `_generate_system_prompt_handling` branches on `uses_system_prompt_tokens` (pre-v7 extracts and merges, v7+ keeps messages inline), `_generate_message_aggregation` coalesces same-role messages using a sentinel-flush loop, and `_generate_system_message_handling` writes the `[SYSTEM_PROMPT]…[/SYSTEM_PROMPT]` envelope. Source: [src/mistral_common/integrations/chat_templates/template_generator.py:_generate_header,_generate_system_prompt_handling,_generate_message_aggregation,_generate_system_message_handling]()
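The sentinel-flush idea behind message aggregation can be shown in plain Python. This is a sketch under stated assumptions, not the generated Jinja: `aggregate_messages` and the sentinel role name are hypothetical, contents are assumed to be strings, and every role is merged with a `\n\n` separator.

```python
# Hypothetical sentinel whose role never matches a real message role.
SENTINEL = {"role": "__sentinel__", "content": ""}


def aggregate_messages(messages):
    """Merge consecutive same-role messages, joining contents with two newlines.

    Appending a sentinel forces the last real group to be flushed inside
    the loop, so no special-case code is needed after it.
    """
    merged = []
    current_role = None
    buffer = []
    for message in list(messages) + [SENTINEL]:
        if message["role"] != current_role:
            if buffer:
                merged.append({"role": current_role, "content": "\n\n".join(buffer)})
            current_role = message["role"]
            buffer = []
        buffer.append(message["content"])
    return merged


result = aggregate_messages([
    {"role": "user", "content": "Hi"},
    {"role": "user", "content": "Are you there?"},
    {"role": "assistant", "content": "Yes."},
])
```

The sentinel's own buffer is never flushed, so it does not appear in the output. A template language without convenient post-loop state benefits from the same trick.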

Tool-call rendering leverages `_emit_call_id_resolution`, which prefers `message['call_id']` and falls back to `message['tool_call_id']`, raising a Jinja exception when neither is a 9-character string. Numeric tool-result content is auto-coerced through `int` then `float` parsing branches so downstream integer schemas accept stringly-typed payloads. Source: [src/mistral_common/integrations/chat_templates/template_generator.py:_emit_call_id_resolution,_emit_tool_content_int_or_float_parsing]()
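Both behaviors can be sketched in Python. The helpers `resolve_call_id` and `coerce_numeric` are hypothetical, and the sketch takes one reading of the rule above: the first key holding a 9-character string wins, and an error is raised when neither qualifies.

```python
def resolve_call_id(message: dict) -> str:
    """Prefer 'call_id', fall back to 'tool_call_id'; require 9 characters."""
    for key in ("call_id", "tool_call_id"):
        candidate = message.get(key)
        if isinstance(candidate, str) and len(candidate) == 9:
            return candidate
    raise ValueError("tool message needs a 9-character call_id or tool_call_id")


def coerce_numeric(content):
    """Try int parsing, then float parsing; otherwise return the input as-is."""
    if not isinstance(content, str):
        return content
    try:
        return int(content)
    except ValueError:
        pass
    try:
        return float(content)
    except ValueError:
        return content


call_id = resolve_call_id({"call_id": None, "tool_call_id": "abc123def"})
as_int = coerce_numeric("42")
as_float = coerce_numeric("3.5")
as_text = coerce_numeric("sunny")
```

Trying `int` before `float` keeps whole numbers as integers, which matters when the downstream schema declares an integer type.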

## See Also

- [Tokenizers and Special Tokens](./Tokenizers-and-Special-Tokens.md) — the `Tokenizer` ABC and `SpecialTokens` enum that underpin the server and grammar factory
- [Instruct Protocol and Normalization](./Instruct-Protocol-and-Normalization.md) — the `InstructRequest` / `InstructRequestNormalizer` surface the server exposes
- [Multimodal (Image & Audio) Encoding](./Multimodal-Image-and-Audio-Encoding.md) — the `image_support` / `audio_support` branches of the template generator

---


## Pitfall Log

Project: mistralai/mistral-common

Summary: Found 9 structured pitfall items, including 1 high-severity (blocking) item. Top priority: a security or permission risk that requires verification.

## 1. Security or permission risk - Security or permission risk requires verification

- Severity: high
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/148

## 2. Capability evidence risk - Capability evidence risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: capability.assumptions | github_repo:786756993 | https://github.com/mistralai/mistral-common

## 3. Maintenance risk - Maintenance risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common

## 4. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: downstream_validation.risk_items | github_repo:786756993 | https://github.com/mistralai/mistral-common

## 5. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: risks.scoring_risks | github_repo:786756993 | https://github.com/mistralai/mistral-common

## 6. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/232

## 7. Security or permission risk - Security or permission risk requires verification

- Severity: medium
- Evidence strength: source_linked
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/229

## 8. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common

## 9. Maintenance risk - Maintenance risk requires verification

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common

