Doramagic Project Pack · Human Manual

mistral-common

Official inference library for pre-processing of Mistral models

Overview, Installation & Architecture

Related topics: Tokenizers: Tekken, SentencePiece & Multimodal Encoders, Instruct Protocol, Validation & Multimodal Requests

Purpose and Scope

mistral-common is the open-source companion library released by Mistral AI that ships the reference tokenizers, validators, and request-normalization logic used to interact with Mistral models. The README states explicitly that the project "open-source[s] the tokenizers, validation and normalization code that can be used with our models", with the goal of guaranteeing that downstream users "can take full advantage of our models" for tokenization (text, images, audio, tool calls) and for validating and normalizing chat-style requests built on top of Pydantic. Source: README.md:7-15

The library is intentionally scoped to two audiences:

  1. Application developers who want to call Mistral models (locally or via API) and need the exact same prompt formatting and special tokens the models were trained on.
  2. Model builders who want to reproduce Mistral's tokenization and validation behavior in their own training stacks. Source: README.md:19-24

Versioning of tokenizers is a deliberate design choice: each Mistral model release ships with a pinned tokenizer version so that "backward compatibility for the models that we release" is preserved even as the surrounding Python code evolves. Source: README.md:15-17

Installation and Optional Dependencies

The base package installs with a single command. Optional extras cover image, audio, hf-hub, sentencepiece, and an experimental server:

pip install mistral-common

Each extra can be installed independently. The sentencepiece extra is now optional because "we only release Tekken tokenizers for recent models"; SentencePiece is therefore only needed when targeting older checkpoints. The server extra is marked [Experimental]. Source: README.md:28-45

Optional dependency matrix

| Extra | Purpose | When it is needed |
|---|---|---|
| image | Image tokenizers | Vision-capable models |
| audio | Audio tokenizers | Audio-capable models |
| hf-hub | Download tokenizers from Hugging Face | Loading checkpoints from mistralai/* repos |
| sentencepiece | SentencePiece tokenizer support | Older (pre-Tekken) models |
| server (experimental) | Run tokenizers in a server mode | Hosting tokenization as a service |

Source: README.md:35-44

Community note (Issue #15): Pydantic version conflicts are a recurring source of install failures. Downstream projects that pin a specific pydantic version can collide with mistral-common's Pydantic constraint because the library's validation layer is built directly on Pydantic models. Source: README.md:11-13

High-Level Architecture

The codebase is organized into three loosely coupled layers: a protocol layer (Pydantic data classes for messages, tools, and requests, together with the request normalizers that live under protocol/instruct/normalize.py), a tokenizer layer (concrete tokenizer implementations driven by a versioned base class), and an integration layer (chat-template generation, grammar/guidance, and OpenAI compatibility shims).

flowchart TB
    A[ChatCompletionRequest<br/>OpenAI / dict] -->|from_openai| B[Protocol Layer<br/>messages, tools, requests]
    B --> C[InstructRequestNormalizer<br/>v1..v15]
    C --> D[InstructTokenizer<br/>encode_user / encode_assistant / encode_tool]
    D --> E[Base Tokenizer<br/>SentencePiece or Tekken]
    E --> F[Tokenized<br/>tokens + text + images/audios]

    G[TemplateConfig] --> H[template_generator<br/>Jinja2]
    B --> H
    H --> I[Grammar factory<br/>Lark grammars]

    style B fill:#eef
    style D fill:#efe
    style H fill:#fee

The Tokenizer abstract base class defines the vocabulary surface every implementation must expose (n_words, vocab, bos_id, eos_id, pad_id, unk_id, encode, decode, is_special, etc.) and carries a model_settings_builder reference so the same tokenizer object can be queried for version-specific behavior. Source: src/mistral_common/tokens/tokenizers/base.py:32-100

Core Modules

Tokenizer layer

Tokenizer is an ABC whose subclasses (SentencePieceTokenizer, Tekkenizer, etc.) plug into an InstructTokenizer that knows how to render the prompt template for a specific tokenizer version (v1, v2, v3, v7, v11, v13, v15...). Each InstructTokenizer provides encode_user_message, encode_assistant_message, and encode_tool_message methods that consume the strongly-typed messages from the protocol layer and emit Tokenized outputs (token ids, text, optional images/audios). Source: src/mistral_common/tokens/tokenizers/instruct.py:46-91, src/mistral_common/tokens/tokenizers/base.py:55-58

The SpecialTokens enum centralizes all control tokens used across versions ([INST], [/INST], [AVAILABLE_TOOLS], [TOOL_RESULTS], <s>, </s>, plus image, audio, and thinking markers such as [THINK]/[/THINK]). Source: src/mistral_common/tokens/tokenizers/base.py:14-67

Community note (Issue #3): A regression was reported where a missing space between <s> and [INST] caused SentencePiece tokenizers to emit malformed instruct prefixes. The tokenizer layer is precisely the place such template-formatting bugs surface, since prompt rendering is performed at the boundary between the abstract Tokenizer and the per-version InstructTokenizer.

Protocol layer

Pydantic models define the canonical chat schema. UserMessage, AssistantMessage, ToolMessage, and SystemMessage each carry a to_openai / from_openai pair so the library can interop with any OpenAI-compatible client. The Roles enum locks down the four legal message roles. Source: src/mistral_common/protocol/instruct/messages.py:62-138
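
The to_openai / from_openai pairing can be pictured with a small stand-alone sketch. It uses a plain dataclass rather than the library's Pydantic models, and the class name UserMessageSketch is hypothetical; only the method names and the idea of a role-tagged dict come from the text above.

```python
from dataclasses import dataclass


@dataclass
class UserMessageSketch:
    """Hypothetical, simplified stand-in for a protocol message class."""

    content: str
    role: str = "user"

    def to_openai(self) -> dict:
        # Emit the role-tagged dict shape used by OpenAI-compatible clients.
        return {"role": self.role, "content": self.content}

    @classmethod
    def from_openai(cls, payload: dict) -> "UserMessageSketch":
        # Reject payloads whose role does not match this message type.
        if payload.get("role") != "user":
            raise ValueError(f"expected role 'user', got {payload.get('role')!r}")
        return cls(content=payload["content"])


original = UserMessageSketch(content="What is the weather in Paris?")
round_tripped = UserMessageSketch.from_openai(original.to_openai())
```

The real classes carry more fields (for example chunked content), so this only illustrates the round-trip contract, not the full schema.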

InstructRequest wraps a list of messages plus an optional available_tools list and exposes a to_openai helper used by the chat-completion entry point. Source: src/mistral_common/protocol/instruct/request.py:18-50

Normalization layer

InstructRequestNormalizer and its versioned subclasses (InstructRequestNormalizerV7, etc.) validate that a ChatCompletionRequest is well-formed and aggregate consecutive same-role messages before they reach the tokenizer. For pre-v15 normalizers, build_settings returns ModelSettings with all fields None; for v15+ it actually constructs a settings object from the request. Source: src/mistral_common/protocol/instruct/normalize.py:29-90

Integration layer

template_generator emits a Jinja2 chat template from a TemplateConfig dataclass that captures the tokenizer version, spm flag, image/audio/thinking support, and whether BOS/EOS should be emitted as Jinja variables (bos_token/eos_token) or as literal strings. The config object enforces mutual exclusivity (image_support and audio_support cannot both be true; thinking_support and plain_thinking_support cannot both be true) by raising ValueError at construction time. Source: src/mistral_common/integrations/chat_templates/template_generator.py:1-58
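
A minimal sketch of the construction-time checks described above, assuming a dataclass with a __post_init__ hook. TemplateConfigSketch and is_valid are hypothetical names, the version field is simplified to an integer, and the real TemplateConfig may validate more than these two rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TemplateConfigSketch:
    """Hypothetical stand-in for TemplateConfig; flag names follow the text."""

    version: int  # tokenizer version number, e.g. 13 for v13 (simplified)
    spm: bool = False
    image_support: bool = False
    audio_support: bool = False
    thinking_support: bool = False
    plain_thinking_support: bool = False

    def __post_init__(self) -> None:
        # Reject the flag combinations described as mutually exclusive.
        if self.image_support and self.audio_support:
            raise ValueError("image_support and audio_support cannot both be true")
        if self.thinking_support and self.plain_thinking_support:
            raise ValueError(
                "thinking_support and plain_thinking_support cannot both be true"
            )


def is_valid(**flags: bool) -> bool:
    """Return True if the flag combination constructs without error."""
    try:
        TemplateConfigSketch(version=13, **flags)
        return True
    except ValueError:
        return False
```

Failing at construction time means an invalid combination never reaches template rendering, which is presumably why the library raises there rather than later.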

The guidance.grammar_factory module consumes the rendered Jinja template plus a tool list and produces a Lark grammar suitable for constrained decoding, honoring the tokenizer's tool-calling mode (auto, any, none, or a NamedToolChoice). Source: src/mistral_common/guidance/grammar_factory.py:21-48

Typical Request Flow

  1. A user constructs or receives an OpenAI-style ChatCompletionRequest (or a list of dicts).
  2. InstructRequest.from_openai (or the equivalent normalizer entry point) validates the payload and turns it into a strongly-typed InstructRequest. Source: src/mistral_common/protocol/instruct/request.py:18-50
  3. The version-appropriate InstructTokenizer encodes each message and concatenates the resulting tokens according to that tokenizer version's template. Source: src/mistral_common/tokens/tokenizers/instruct.py:46-91
  4. The Tokenized output is fed to the model; the same library can re-emit it as OpenAI JSON via to_openai for logging or round-tripping. Source: src/mistral_common/protocol/instruct/messages.py:62-138

See Also

  • Tokenizers & Special Tokens
  • Instruct Protocol & Messages
  • Chat Templates & Grammar Generation
  • Versioning & Backward Compatibility

Source: https://github.com/mistralai/mistral-common / Human Manual

Tokenizers: Tekken, SentencePiece & Multimodal Encoders

Related topics: Overview, Installation & Architecture, Instruct Protocol, Validation & Multimodal Requests, Experimental Server, Tool Decoding, Guidance & HF Chat Templates

Overview

mistral-common exposes the open-source tokenization, validation, and normalization stack that powers Mistral AI models. At its core sits a pluggable Tokenizer abstraction implemented by Tekkenizer (a fast BPE tokenizer based on tiktoken) and SentencePieceTokenizer (a wrapper over SentencePiece). On top of those base encoders, an InstructTokenizer layer applies chat templates, tool-call syntax, and multimodal encoding, with versioned behaviors from V1 through V15. Source: src/mistral_common/tokens/tokenizers/base.py:64-150.

The library ships first-class multimodal encoders for images and audio. Recent models are released with Tekken tokenizers, and the sentencepiece install extra is now marked optional in the README. Source: README.md:8-20.

Class Hierarchy

classDiagram
    class Tokenizer {
        <<abstract>>
        +n_words: int
        +special_ids: set
        +vocab() list
        +encode(s, bos, eos) list
        +decode(tokens) str
    }
    class Tekkenizer
    class SentencePieceTokenizer
    class InstructTokenizerBase {
        +tokenizer: Tokenizer
        +image_encoder
        +audio_encoder
        +encode_instruct()
    }
    class InstructTokenizerV1
    class InstructTokenizerV3
    class InstructTokenizerV7
    class InstructTokenizerV11
    class InstructTokenizerV13
    class InstructTokenizerV15
    Tokenizer <|-- Tekkenizer
    Tokenizer <|-- SentencePieceTokenizer
    InstructTokenizerBase <|-- InstructTokenizerV1
    InstructTokenizerBase <|-- InstructTokenizerV3
    InstructTokenizerBase <|-- InstructTokenizerV7
    InstructTokenizerBase <|-- InstructTokenizerV11
    InstructTokenizerBase <|-- InstructTokenizerV13
    InstructTokenizerBase <|-- InstructTokenizerV15
    InstructTokenizerBase o-- Tokenizer
    InstructTokenizerBase o-- ImageEncoder
    InstructTokenizerBase o-- AudioEncoder

Base Tokenizer Interface

The Tokenizer ABC defines the contract every implementation must satisfy: vocabulary queries (n_words, vocab, id_to_piece), special-token lookups (bos_id, eos_id, pad_id, unk_id), and the round-trip methods encode(s, bos, eos) / decode(tokens, special_token_policy). Source: src/mistral_common/tokens/tokenizers/base.py:64-150.
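
To make the contract concrete, here is a reduced sketch of such an abstract base class with a toy byte-level implementation. TokenizerSketch and ByteTokenizer are hypothetical; the id layout (0 for BOS, 1 for EOS, bytes shifted by 2) is an assumption chosen for illustration and has nothing to do with the real vocabularies.

```python
from abc import ABC, abstractmethod
from typing import List


class TokenizerSketch(ABC):
    """Reduced, hypothetical version of the interface described above."""

    @property
    @abstractmethod
    def n_words(self) -> int: ...

    @property
    @abstractmethod
    def bos_id(self) -> int: ...

    @property
    @abstractmethod
    def eos_id(self) -> int: ...

    @abstractmethod
    def encode(self, s: str, bos: bool, eos: bool) -> List[int]: ...

    @abstractmethod
    def decode(self, tokens: List[int]) -> str: ...


class ByteTokenizer(TokenizerSketch):
    """Toy implementation: ids 0 and 1 are BOS/EOS, ids 2..257 are raw bytes."""

    _OFFSET = 2

    @property
    def n_words(self) -> int:
        return 256 + self._OFFSET

    @property
    def bos_id(self) -> int:
        return 0

    @property
    def eos_id(self) -> int:
        return 1

    def encode(self, s: str, bos: bool, eos: bool) -> List[int]:
        ids = [b + self._OFFSET for b in s.encode("utf-8")]
        return ([self.bos_id] if bos else []) + ids + ([self.eos_id] if eos else [])

    def decode(self, tokens: List[int]) -> str:
        # Special ids are dropped; everything else maps back to a byte.
        data = bytes(t - self._OFFSET for t in tokens if t >= self._OFFSET)
        return data.decode("utf-8")


tok = ByteTokenizer()
ids = tok.encode("hi", bos=True, eos=True)
```

Code written against TokenizerSketch never needs to know which concrete backend it received, which is the property the instruct layer relies on.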

The Tokenized model is the canonical return type. It is a Pydantic class that carries tokens: list[int], an optional text preview, FIM prefix_ids, and parallel images: list[np.ndarray] / audios: list[Audio] lists. Pydantic's ConfigDict(arbitrary_types_allowed=True) is used so that numpy arrays and audio dataclasses can sit alongside primitive token ids. Source: src/mistral_common/tokens/tokenizers/base.py:30-60.

The SpecialTokens namespace enumerates every control token the encoders know about: bos = "<s>", eos = "</s>", begin_inst = "[INST]", begin_tools = "[AVAILABLE_TOOLS]", tool_calls, img, img_break, img_end, begin_audio, begin_think, end_think, FIM markers (prefix, suffix, middle), and audio streaming markers (streaming_pad, streaming_word, text_to_audio, audio_to_text). Source: src/mistral_common/tokens/tokenizers/base.py:155-210.

A high-engagement community report (Issue #3) flagged that the SentencePiece path was missing a separator between <s> and [INST], which prevented the model from emitting the proper instruct template; the fix lives in src/mistral_common/tokens/tokenizers/sentencepiece.py and is one reason chat-templated decoding must remain versioned.

Tekken Tokenizer

Tekkenizer is the high-throughput BPE tokenizer used by recent Mistral models. It loads a tekken.json artifact containing a ModelData record with vocab: list[TokenInfo], special_tokens: list[SpecialTokenInfo], a TekkenConfig (regex pattern, num_vocab_tokens, default_vocab_size, default_num_special_tokens, and a version string), and per-modality image: ImageConfig / audio: AudioConfig blocks. Source: src/mistral_common/tokens/tokenizers/tekken.py:50-110.

Each TokenInfo stores rank plus base64-encoded token_bytes; SpecialTokenInfo adds an is_control flag. A DEPRECATED_SPECIAL_TOKENS tuple preserves backward compatibility for the legacy control-token list. Source: src/mistral_common/tokens/tokenizers/tekken.py:30-50.

The factory helper download_tokenizer_from_hf_hub (imported by mistral.py) pulls these JSON files from the Hugging Face Hub, gated behind the hf-hub extra. Source: src/mistral_common/tokens/tokenizers/mistral.py:1-30.

SentencePiece Tokenizer

SentencePieceTokenizer is a thin adapter over the SentencePiece runtime. Valid tokenizer filenames are detected by the suffix scheme <name>.model.<version>[mm] together with a bare tekken.json; if multiple files are present, tekken.json wins, otherwise the highest versioned SPM model is selected. Source: src/mistral_common/tokens/tokenizers/utils.py:30-60.
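
The selection rule can be sketched as follows. This is not the code in utils.py: pick_tokenizer_file is a hypothetical helper, and the regular expression assumes that the version part looks like "v" followed by digits, with any multimodal marker after it ignored.

```python
import re
from typing import List, Optional

# Assumed reading of "<name>.model.<version>[mm]": a "v<digits>" version,
# optionally followed by a multimodal marker that this sketch ignores.
_SPM_PATTERN = re.compile(r"\.model\.v(\d+)\w*$")


def pick_tokenizer_file(filenames: List[str]) -> Optional[str]:
    """Prefer tekken.json, else the highest-versioned SentencePiece model."""
    if "tekken.json" in filenames:
        return "tekken.json"
    versioned = []
    for name in filenames:
        match = _SPM_PATTERN.search(name)
        if match:
            versioned.append((int(match.group(1)), name))
    if not versioned:
        return None
    # Tuples compare by version number first, so max() picks the newest.
    return max(versioned)[1]
```

Comparing versions as integers rather than strings matters here: a string comparison would rank "v3" above "v11".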

The tokenizer exposes a get_image_config helper and an is_sentencepiece predicate used to dispatch to the SPM-specific instruct branches. For example, the v3-SPM chat template differs from the v3-Tekken template only in its tool-call branch (uses_v2_v3spm_tool_branch). Source: src/mistral_common/integrations/chat_templates/template_generator.py:240-260.

Multimodal Encoders

InstructTokenizerBase composes a base Tokenizer with optional ImageEncoder and AudioEncoder instances. The mistral.py factory wires the correct special ids: image encoding uses img, img_break, and img_end (collected in a SpecialImageIDs dataclass), while audio uses audio, begin_audio, plus transcribe/text-to-audio/audio-to-text tokens when present, stored in SpecialAudioIDs. Source: src/mistral_common/tokens/tokenizers/mistral.py:18-50.

The audio encoder is only loaded for Tekkenizer instances, reflecting Mistral's current release policy. Image chunks are emitted with [IMG] markers and processed by ImageEncoder; the v15+ template also accepts thinking chunks via the experimental ThinkChunk API, and the _split_content_and_think_chunks helper reconstructs them by scanning token streams for begin_think / end_think markers, raising on nested or unbalanced think blocks. Source: src/mistral_common/experimental/think.py:1-40.
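
The think-block scan can be illustrated with a stand-alone function. split_think_chunks is a hypothetical name and the marker ids are passed in by the caller; the library's _split_content_and_think_chunks may differ in its return type and error classes.

```python
from typing import List, Tuple


def split_think_chunks(
    tokens: List[int], begin_think: int, end_think: int
) -> List[Tuple[str, List[int]]]:
    """Split token ids into ("content", ids) and ("think", ids) segments."""
    segments: List[Tuple[str, List[int]]] = []
    current: List[int] = []
    inside = False
    for tok in tokens:
        if tok == begin_think:
            if inside:
                raise ValueError("nested think block")
            if current:
                segments.append(("content", current))
            current, inside = [], True
        elif tok == end_think:
            if not inside:
                raise ValueError("end_think without begin_think")
            segments.append(("think", current))
            current, inside = [], False
        else:
            current.append(tok)
    if inside:
        raise ValueError("unclosed think block")
    if current:
        segments.append(("content", current))
    return segments
```

A single boolean is enough state because nesting is rejected outright; supporting nested blocks would require a depth counter instead.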

Instruct Tokenizer Variants

InstructTokenizerBase is the versioned entry point. find_first_last_user locates the first and last user messages, _truncate_for_max_tokens enforces context limits, and encode_instruct walks the conversation applying role-specific encoders. The v15 release fixed continue_final_message so that prepended reasoning chunks are not lost on continuation. Source: src/mistral_common/tokens/tokenizers/instruct.py:30-90.

Versioned subclasses (V1, V3, V7, V11, V13, V15) differ in tool-call syntax, image/audio support, and the placement of ThinkChunk markers. The TemplateConfig class centralizes the rules — for example, forbids_assistant_content_with_tools is True only for V2/V3, and system_supports_thinking is pre-v15 only. Source: src/mistral_common/integrations/chat_templates/template_generator.py:200-260.

The Guidance integration reuses these templates to build Lark grammars. GrammarFactory.from_template renders the Jinja template (passing fcall, mode, json_schema, parallel_tool_calls) and combines the result with optional begin_think / end_think token allow-lists when the model supports model_settings. Source: src/mistral_common/guidance/grammar_factory.py:80-110.

Community Notes

  • Issue #15 (Pydantic version conflict) — Because tokenizers depend on Pydantic v2 models such as Tokenized and InstructRequest, a tight pydantic<2.7 constraint can break downstream consumers; pin pydantic ranges carefully when mixing with other packages. Source: src/mistral_common/tokens/tokenizers/base.py:30-60.
  • Issue #3 (<s> / [INST] spacing) — Tracked the missing separator in the SPM tokenizer; fixed in src/mistral_common/tokens/tokenizers/sentencepiece.py.
  • Release v1.11.3 — Added multi-format reasoning parsing to from_openai and preserved zero OpenAI seeds in chat request conversion; also fixed continue_final_message so prepended reasoning chunks survive continuation.

See Also

  • Chat template generator: versioned Jinja templates and TemplateConfig
  • Pydantic protocol layer: validation and normalization of InstructRequest / AssistantMessage
  • Multimodal encoders: image, audio, and FIM encoding internals
  • Guidance integration: Lark grammar generation from chat templates

Instruct Protocol, Validation & Multimodal Requests

Related topics: Tokenizers: Tekken, SentencePiece & Multimodal Encoders, Experimental Server, Tool Decoding, Guidance & HF Chat Templates

Overview

The Instruct Protocol is the canonical data model used by mistral-common to represent, validate, and normalize chat-style requests sent to Mistral AI models. According to the README.md, the library provides "validation and normalization of requests, messages, tool calls, and responses" built on top of Pydantic, plus tokenization of text, images, and tool calls. Community issue #15 highlights a recurring concern about Pydantic version compatibility, so users should pin compatible Pydantic versions when depending on this layer.

The protocol layers three concerns:

  1. Schema: Pydantic models defining what a valid request looks like (ChatCompletionRequest, InstructRequest, message and tool types).
  2. Validation / Normalization: rules that reject malformed requests and massage them into a canonical form ready for tokenization.
  3. Multimodal content: Chunk types that let a single message carry text, images, audio, tool calls, and "thinking" segments together.

Request and Message Schemas

The top-level entry points are ChatCompletionRequest and the simpler InstructRequest, both defined in src/mistral_common/protocol/instruct/request.py. ChatCompletionRequest extends the OpenAI-style shape with Mistral-specific fields:

| Field | Purpose |
|---|---|
| messages | Ordered list of ChatMessageType (user, assistant, system, tool). |
| tools / tool_choice | Function-calling definitions and selection policy. |
| response_format | Text vs. JSON-object response formatting. |
| truncate_for_context_length | Auto-trim messages if they exceed the model window. |
| continue_final_message | Resume generation from the last assistant turn (fixed in v1.11.3 per release notes). |
| reasoning_effort | Controls how much reasoning effort the model should apply. |

to_openai() on the request converts messages and tools into the OpenAI wire format and accepts an optional reasoning_field_format argument. In the other direction, from_openai was extended in v1.11.3 to parse multiple reasoning formats.

Message types live in src/mistral_common/protocol/instruct/messages.py (UserMessage, AssistantMessage, ToolMessage, SystemMessage). The AssistantMessage is notable for carrying both content and an optional prefix flag — when prefix=True, generation continues the assistant's text without re-emitting a closing turn marker. The tool_calls field is defined in src/mistral_common/protocol/instruct/tool_calls.py and supports typed function calls with JSON-schema parameters.

Multimodal Content and Chunks

A message's content may be a plain string or a list of Chunk objects, defined in src/mistral_common/protocol/instruct/chunk.py. The ChunkTypes enum discriminates the supported variants:

  • text — plain text with a text field.
  • image — base64 or URL reference, lazily decoded into a SerializableImage.
  • audio — base64/URL reference; format is auto-detected via soundfile (see _detect_audio_format).
  • tool_call — emitted alongside assistant text when a tool is invoked.
  • think — reasoning segment emitted between <think>/</think>-style markers.

Special tokens used by chunk processing are defined in src/mistral_common/tokens/tokenizers/base.py as SpecialTokens literals: begin_inst ([INST]), begin_tools ([AVAILABLE_TOOLS]), img, img_break, img_end, begin_audio, begin_think, end_think, plus bos (<s>) and eos (</s>). Community issue #3 noted a missing space between <s> and [INST] in the SentencePiece tokenizer — a subtle formatting bug that prevented models from emitting the proper instruct template.

Validation and Normalization Pipeline

The pipeline converts an incoming ChatCompletionRequest into a tokenizer-ready InstructRequest (and eventually token IDs). The orchestration happens in src/mistral_common/tokens/tokenizers/mistral.py, which wires together an InstructTokenizer, a MistralRequestValidator, and an InstructRequestNormalizer:

flowchart LR
    A[ChatCompletionRequest] --> B[MistralRequestValidator]
    B --> C[InstructRequestNormalizer]
    C --> D[InstructRequest]
    D --> E[InstructTokenizer]
    E --> F[Tokenized: tokens + images + prefix_ids]

  • Validator (src/mistral_common/protocol/instruct/validator.py) enforces structural invariants — non-empty assistant turns, role ordering, tool-result semantics, and continue_message requiring prefix=False.
  • Normalizer (src/mistral_common/protocol/instruct/normalize.py) materializes the version-specific InstructRequest subclass and, for v15+ tokenizers, populates ModelSettings (e.g. truncation policy). For pre-v15 normalizers, build_settings returns all-None defaults.
  • Encoder (src/mistral_common/tokens/tokenizers/instruct.py) turns the normalized request into token IDs by emitting role-specific headers, tool definitions, and content chunks. It returns a Tokenized record whose images field carries the loaded np.ndarray arrays associated with image chunks.

Chat Template Generation

For open-source Mistral releases, src/mistral_common/integrations/chat_templates/template_generator.py generates a Jinja2 chat template that mirrors the in-Python encoder. A TemplateConfig selects the tokenizer version (v1, v2, v3, v7, v11, v13, v15) and feature toggles (spm, image_support, audio_support, thinking_support, plain_thinking_support). The generator emits message-aggregation logic that joins consecutive same-role messages with \n\n, supports tool-call inline branches for v2/v3-SPM, and routes <think> chunks either through special tokens or plain <think>/</think> markers. This file is what enables downstream runtimes (vLLM, TGI, llama.cpp) to render prompts that round-trip with mistral-common's own encoder.
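
The same-role aggregation rule is simple enough to sketch in a few lines. The function below works on plain dicts with string content and is only an illustration of the joining behavior; the generated Jinja template and the Python normalizer also handle chunked content and tool messages, which this sketch does not.

```python
from typing import Dict, List


def aggregate_messages(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Merge consecutive messages that share a role, joining content with a blank line."""
    merged: List[Dict[str, str]] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += "\n\n" + msg["content"]
        else:
            merged.append(dict(msg))  # copy so the caller's dicts are not mutated
    return merged


conversation = [
    {"role": "user", "content": "Hello."},
    {"role": "user", "content": "Are you there?"},
    {"role": "assistant", "content": "Yes."},
]
aggregated = aggregate_messages(conversation)
```

Aggregating before rendering keeps the prompt in the strictly alternating shape that instruct templates expect, regardless of how the client split its turns.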

Failure Modes and Edge Cases

Common pitfalls when working with this layer include:

  • Version-specific tool rules. Per the forbids_assistant_content_with_tools property, v2 and v3 tokenizers reject assistant messages that carry both content and tool_calls simultaneously.
  • Continue-final-message guard. The encoder raises InvalidAssistantMessageException when continue_message=True is combined with an assistant message that has prefix=True; continuation requires prefix=False.
  • Audio/image exclusivity. The template generator raises ValueError if image_support and audio_support are both enabled.
  • SPM compatibility. Recent models release Tekken tokenizers; the sentencepiece extra is now optional (per README.md).

See Also

  • Tokenizer Base & Special Tokens
  • Mistral Tokenizer & Encoding
  • Jinja Chat Template Generator
  • Tool Calls & Function Calling

Experimental Server, Tool Decoding, Guidance & HF Chat Templates

Related topics: Tokenizers: Tekken, SentencePiece & Multimodal Encoders, Instruct Protocol, Validation & Multimodal Requests

Overview

mistral-common exposes four closely related capabilities under its experimental namespace and integration layer: a FastAPI-based tokenizer server, tool/function-call decoding helpers, a guidance-backed grammar factory for constrained generation, and a programmatic Hugging Face tokenizer_config.json chat template generator. The README explicitly marks the server mode as [Experimental] and ships it behind the server extra (pip install "mistral-common[server]"), while the optional [hf-hub] and [sentencepiece] extras cover the chat-template and SentencePiece paths used by the integrations code. Source: README.md:23-43

These features share a common goal: let third-party runtimes (vLLM, SGLang, TGI, raw transformers) consume Mistral instruct formats without re-implementing the special-token grammar or the version-aware tool/thinking conventions. The experimental/ package is the staging area where new behaviors land before being promoted into the stable protocol/ and tokens/ layers.

Experimental Server

The experimental server lives under src/mistral_common/experimental/app/ and follows a standard FastAPI layout: main.py wires the ASGI app, routers.py defines the HTTP routes, and models.py declares the request/response Pydantic schemas. Source: src/mistral_common/experimental/app/main.py, src/mistral_common/experimental/app/routers.py, src/mistral_common/experimental/app/models.py

Its purpose is to expose a tokenization HTTP endpoint backed by MistralTokenizer, decoupling the heavy tokenizer state from the inference process. The server is available only when the user installs the server extra (pip install "mistral-common[server]"). The README tags it [Experimental], meaning the API surface and payload schema may change between minor versions. Source: README.md:25-32

flowchart LR
    Client[HTTP Client] -->|POST /v1/tokenize| Router[routers.py]
    Router --> Models[models.py Pydantic schemas]
    Router --> Server[main.py FastAPI app]
    Server --> MT[MistralTokenizer]
    MT --> Tekken[Tekken / SentencePiece backend]
    Server -->|tokens, text, prefix_ids| Client

The utility helpers in experimental/utils.py and the thinking-aware logic in experimental/think.py are consumed by the routers to keep the server endpoints aligned with the latest ReasoningFieldFormat enum (thinking_chunks | reasoning | reasoning_content) used by AssistantMessage.to_openai. Source: src/mistral_common/experimental/utils.py, src/mistral_common/experimental/think.py, src/mistral_common/protocol/instruct/messages.py:ReasoningFieldFormat

Tool Decoding

src/mistral_common/experimental/tools.py provides the tool-decoding primitives used both by the server and by external callers that want to parse Mistral tool-call streams back into structured ToolCall objects. The companion experimental/think.py complements it by separating reasoning traces from tool-call deltas. Source: src/mistral_common/experimental/tools.py, src/mistral_common/experimental/think.py

Tool decoding must respect the same version constraints enforced by the normalizer. TemplateConfig.forbids_assistant_content_with_tools returns True for TokenizerVersion.v2 and v3, meaning the decoder must reject assistant messages that mix content and tool_calls on those versions. Separately, validates_assistant_non_empty is enabled for v3 non-SPM and for v7+, and uses_v2_v3spm_tool_branch selects the inline elif branch for v2 and v3-SPM tool syntax. Source: src/mistral_common/integrations/chat_templates/template_generator.py:forbids_assistant_content_with_tools,validates_assistant_non_empty,uses_v2_v3spm_tool_branch
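
The three version predicates named above can be written out as plain functions. This is one reading of the prose, not the library's code: versions are simplified to integers, the spm flag is passed explicitly, and check_assistant_message is a hypothetical helper showing how a decoder might apply the first rule.

```python
from typing import Optional, Sequence


def forbids_assistant_content_with_tools(version: int) -> bool:
    # True only for v2 and v3, per the text.
    return version in (2, 3)


def validates_assistant_non_empty(version: int, spm: bool) -> bool:
    # Enabled for v3 without SentencePiece, and for v7 and later.
    return version >= 7 or (version == 3 and not spm)


def uses_v2_v3spm_tool_branch(version: int, spm: bool) -> bool:
    # The inline tool branch applies to v2 and to v3 with SentencePiece.
    return version == 2 or (version == 3 and spm)


def check_assistant_message(
    version: int, content: Optional[str], tool_calls: Sequence[dict]
) -> None:
    """Raise if content and tool calls are mixed on a version that forbids it."""
    if forbids_assistant_content_with_tools(version) and content and tool_calls:
        raise ValueError(
            f"v{version} assistant messages cannot carry both content and tool_calls"
        )
```

Keeping each rule as a small predicate makes the version matrix easy to test in isolation, which is presumably why the source exposes them as named properties.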

The protocol-level definition of a Tool (a Function with JSON-schema parameters) is the contract experimental/tools.py is expected to honor. The InstructRequest(...).to_openai() docstring demonstrates this round-trip with a get_current_weather tool. Source: src/mistral_common/protocol/instruct/request.py:InstructRequest to_openai docstring

Guidance Grammar Factory

src/mistral_common/guidance/grammar_factory.py exposes GrammarFactory, which renders a Lark grammar string for a given MistralTokenizer so that guidance/llguidance engines can perform token-constrained decoding. The factory supports three function-calling modes (auto, any, none), NamedToolChoice, optional parallel_tool_calls, and an additional json_schema to union in. Source: src/mistral_common/guidance/grammar_factory.py:GrammarFactory.encode_for_guided_decoding.

The constructor asserts that both llguidance and jinja2 are installed (optional dependencies) and refuses to operate on tokenizers it does not understand. GrammarFactory.is_supported requires a Tekken tokenizer with version >= TokenizerVersion.v11, so legacy SentencePiece models must be upgraded before guided decoding is available. Source: src/mistral_common/guidance/grammar_factory.py:is_supported,__init__

When a tokenizer exposes the begin_think/end_think special tokens, the factory injects them into the generated grammar so that thinking traces can be emitted alongside JSON tool calls (think_with_json branch). The _convert_tool_calls helper produces a per-tool TOOL_CALL_GRAMMAR fragment, parenthesizes and alternates each entry, and suffixes a + when parallel_tool_calls=True. Source: src/mistral_common/guidance/grammar_factory.py:_convert_tool_calls
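
The parenthesize, alternate, and repeat steps can be sketched with string operations. build_tool_call_rule is a hypothetical helper and the fragment strings in the usage line are placeholders; the actual TOOL_CALL_GRAMMAR fragments produced by _convert_tool_calls are not reproduced here.

```python
from typing import List


def build_tool_call_rule(fragments: List[str], parallel_tool_calls: bool) -> str:
    """Combine per-tool grammar fragments into one alternation rule."""
    if not fragments:
        raise ValueError("at least one tool fragment is required")
    # Parenthesize each fragment so the alternation operator binds correctly.
    alternation = " | ".join(f"({frag})" for frag in fragments)
    rule = f"({alternation})"
    # A trailing "+" permits one or more tool calls in sequence.
    return rule + "+" if parallel_tool_calls else rule


single = build_tool_call_rule(["call_a", "call_b"], parallel_tool_calls=False)
parallel = build_tool_call_rule(["call_a", "call_b"], parallel_tool_calls=True)
```

Wrapping the whole alternation in an outer group ensures the repetition suffix applies to the full choice rather than only to the last alternative.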

HF Chat Template Generator

The integration under src/mistral_common/integrations/chat_templates/template_generator.py programmatically builds a tokenizer-version-aware Jinja2 chat template that can be embedded into a Hugging Face tokenizer_config.json. It is driven by a TemplateConfig dataclass that captures the TokenizerVersion, spm flag, and the boolean toggles for image_support, audio_support, thinking_support, plain_thinking_support, and use_special_token_variables. Source: src/mistral_common/integrations/chat_templates/template_generator.py:TemplateConfig

| Config flag | Effect on generated template |
|---|---|
| version (v1–v15) | Selects special-token layout, tool-call branch, and message-aggregation rule |
| spm | Adds trailing spaces after special tokens; blocked on v11+ and with audio |
| image_support (v3+) | Adds [IMG] chunk processing; mutually exclusive with audio |
| audio_support (v7+) | Adds [AUDIO] chunk processing; mutually exclusive with image |
| thinking_support (v13+) | Emits [THINK]/[/THINK] around reasoning chunks |
| plain_thinking_support (v11 only) | Uses <think>/</think> literal tags instead of special tokens |
| use_special_token_variables | Emits bos_token/eos_token as Jinja variables rather than literals |

Template generation is split into composable helpers: _generate_header emits the BOS, _generate_system_prompt_handling branches on uses_system_prompt_tokens (pre-v7 extracts and merges, v7+ keeps messages inline), _generate_message_aggregation coalesces same-role messages using a sentinel-flush loop, and _generate_system_message_handling writes the [SYSTEM_PROMPT]…[/SYSTEM_PROMPT] envelope. Source: src/mistral_common/integrations/chat_templates/template_generator.py:_generate_header,_generate_system_prompt_handling,_generate_message_aggregation,_generate_system_message_handling

Tool-call rendering leverages _emit_call_id_resolution, which prefers message['call_id'] and falls back to message['tool_call_id'], raising a Jinja exception when neither is a 9-character string. Numeric tool-result content is auto-coerced through int then float parsing branches so downstream integer schemas accept stringly-typed payloads. Source: src/mistral_common/integrations/chat_templates/template_generator.py:_emit_call_id_resolution,_emit_tool_content_int_or_float_parsing
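A Python analogue of the two behaviors described here may make them easier to follow. Both function names are hypothetical, the generated template does this work in Jinja2, and two details are assumptions: that an invalid call_id falls through to tool_call_id, and that Python's int and float parsing rules are a close enough match for the template's branches.

```python
from typing import Any, Mapping, Union


def resolve_call_id(message: Mapping[str, Any]) -> str:
    """Prefer call_id, fall back to tool_call_id, and require 9 characters.

    Hypothetical helper mirroring the described resolution order.
    """
    for key in ("call_id", "tool_call_id"):
        candidate = message.get(key)
        if isinstance(candidate, str) and len(candidate) == 9:
            return candidate
    raise ValueError("tool message needs a 9-character call_id or tool_call_id")


def coerce_numeric(content: str) -> Union[int, float, str]:
    """Try int first, then float, and otherwise keep the original string."""
    try:
        return int(content)
    except ValueError:
        pass
    try:
        return float(content)
    except ValueError:
        return content


if __name__ == "__main__":
    print(resolve_call_id({"tool_call_id": "abc123XYZ"}))
    print(coerce_numeric("42"), coerce_numeric("3.5"), coerce_numeric("sunny"))
```

Trying int before float keeps whole-number results as integers, which is what lets an integer schema accept a payload that arrived as a string.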

See Also

  • Tokenizers and Special Tokens — the Tokenizer ABC and SpecialTokens enum that underpin the server and grammar factory
  • Instruct Protocol and Normalization — the InstructRequest / InstructRequestNormalizer surface the server exposes
  • Multimodal (Image & Audio) Encoding — the image_support / audio_support branches of the template generator

Source: https://github.com/mistralai/mistral-common / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Found 9 structured pitfall items, including 1 high/blocking item. Top priority: Security or permission risk, which requires verification.

1. Security or permission risk: Security or permission risk requires verification

  • Severity: high
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/148

2. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | github_repo:786756993 | https://github.com/mistralai/mistral-common

3. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common

4. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | github_repo:786756993 | https://github.com/mistralai/mistral-common

5. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | github_repo:786756993 | https://github.com/mistralai/mistral-common

6. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/232

7. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/229

8. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common

9. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: release_recency=unknown.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (count of project-level external discussion links exposed on this manual page).

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using mistral-common with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence