Doramagic Project Pack · Human Manual
mistral-common
Official inference library for pre-processing of Mistral models
Overview, Installation & Architecture
Related topics: Tokenizers: Tekken, SentencePiece & Multimodal Encoders, Instruct Protocol, Validation & Multimodal Requests
Purpose and Scope
mistral-common is the open-source companion library released by Mistral AI that ships the reference tokenizers, validators, and request-normalization logic used to interact with Mistral models. The README states explicitly that the project "open-source[s] the tokenizers, validation and normalization code that can be used with our models", with the goal of guaranteeing that downstream users "can take full advantage of our models". It covers tokenization (text, images, audio, tool calls) and Pydantic-based validation and normalization of chat-style requests. Source: README.md:7-15
The library is intentionally scoped to two audiences:
- Application developers who want to call Mistral models (locally or via API) and need the exact same prompt formatting and special tokens the models were trained on.
- Model builders who want to reproduce Mistral's tokenization and validation behavior in their own training stacks. Source: README.md:19-24
Versioning of tokenizers is a deliberate design choice: each Mistral model release ships with a pinned tokenizer version so that "backward compatibility for the models that we release" is preserved even as the surrounding Python code evolves. Source: README.md:15-17
Installation and Optional Dependencies
The base package installs with a single command; optional features are exposed through the image, audio, hf-hub, sentencepiece, and experimental server extras:
pip install mistral-common
Each extra can be installed independently. The sentencepiece extra is now optional because "we only release Tekken tokenizers for recent models"; SentencePiece is therefore only needed when targeting older checkpoints. The server extra is marked [Experimental]. Source: README.md:28-45
Optional dependency matrix
| Extra | Purpose | When it is needed |
|---|---|---|
| image | Image tokenizers | Vision-capable models |
| audio | Audio tokenizers | Audio-capable models |
| hf-hub | Download tokenizers from Hugging Face | Loading checkpoints from mistralai/* repos |
| sentencepiece | SentencePiece tokenizer support | Older (pre-Tekken) models |
| server (experimental) | Run tokenizers in a server mode | Hosting tokenization as a service |
Source: README.md:35-44
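A quick way to see which optional extras are usable in the current environment is to probe for a module that each extra is expected to provide. The sketch below is a minimal illustration using only the standard library; the mapping from extra name to module name is an assumption made for this example and is not taken from the README.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to a top-level module it is expected to install.
# These module names are illustrative guesses, not taken from the README.
EXTRA_MODULES = {
    "sentencepiece": "sentencepiece",
    "hf-hub": "huggingface_hub",
}


def available_extras(extra_modules: dict[str, str]) -> dict[str, bool]:
    """Report, per extra, whether its marker module can be found on the import path."""
    return {
        extra: find_spec(module) is not None
        for extra, module in extra_modules.items()
    }
```

Calling `available_extras(EXTRA_MODULES)` returns a dictionary of booleans, which can be logged at startup to explain why a SentencePiece or Hub code path is unavailable.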
Community note (Issue #15): Pydantic version conflicts are a recurring source of install failures. Downstream projects that pin a specific pydantic version can collide with mistral-common's Pydantic constraint because the library's validation layer is built directly on Pydantic models. Source: README.md:11-13
High-Level Architecture
The codebase is organized into three loosely-coupled layers: a protocol layer (Pydantic data classes for messages, tools, and requests), a tokenizer layer (concrete tokenizer implementations driven by a versioned base class), and an integration layer (chat-template generation, grammar/guidance, and OpenAI compatibility shims).
flowchart TB
A[ChatCompletionRequest<br/>OpenAI / dict] -->|from_openai| B[Protocol Layer<br/>messages, tools, requests]
B --> C[InstructRequestNormalizer<br/>v1..v15]
C --> D[InstructTokenizer<br/>encode_user / encode_assistant / encode_tool]
D --> E[Base Tokenizer<br/>SentencePiece or Tekken]
E --> F[Tokenized<br/>tokens + text + images/audios]
G[TemplateConfig] --> H[template_generator<br/>Jinja2]
B --> H
H --> I[Grammar factory<br/>Lark grammars]
style B fill:#eef
style D fill:#efe
style H fill:#fee
The Tokenizer abstract base class defines the vocabulary surface every implementation must expose (n_words, vocab, bos_id, eos_id, pad_id, unk_id, encode, decode, is_special, etc.) and carries a model_settings_builder reference so the same tokenizer object can be queried for version-specific behavior. Source: src/mistral_common/tokens/tokenizers/base.py:32-100
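To make the idea of a shared vocabulary surface concrete, here is a reduced sketch of an abstract tokenizer and a toy byte-level implementation. `ToyTokenizer` and `ByteTokenizer` are hypothetical names invented for this example; they cover only a few of the members listed above and do not reproduce the library's actual class.

```python
from abc import ABC, abstractmethod


class ToyTokenizer(ABC):
    """Hypothetical, reduced stand-in for the Tokenizer ABC (not the library class)."""

    @property
    @abstractmethod
    def n_words(self) -> int:
        """Vocabulary size, including special tokens."""

    @property
    @abstractmethod
    def bos_id(self) -> int:
        """Id of the beginning-of-sequence token."""

    @property
    @abstractmethod
    def eos_id(self) -> int:
        """Id of the end-of-sequence token."""

    @abstractmethod
    def encode(self, s: str, bos: bool, eos: bool) -> list[int]:
        """Turn text into token ids, optionally wrapped in BOS/EOS."""

    @abstractmethod
    def decode(self, tokens: list[int]) -> str:
        """Turn token ids back into text."""


class ByteTokenizer(ToyTokenizer):
    """Toy implementation: ids 0 and 1 are BOS/EOS, ids 2..257 are raw bytes."""

    @property
    def n_words(self) -> int:
        return 258

    @property
    def bos_id(self) -> int:
        return 0

    @property
    def eos_id(self) -> int:
        return 1

    def encode(self, s: str, bos: bool, eos: bool) -> list[int]:
        ids = [b + 2 for b in s.encode("utf-8")]
        return ([self.bos_id] if bos else []) + ids + ([self.eos_id] if eos else [])

    def decode(self, tokens: list[int]) -> str:
        # Special ids carry no text, so they are skipped when decoding.
        payload = bytes(t - 2 for t in tokens if t >= 2)
        return payload.decode("utf-8")
```

Because every concrete tokenizer answers the same questions, code higher in the stack can stay agnostic about which backend produced the ids.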
Core Modules
Tokenizer layer
Tokenizer is an ABC whose subclasses (SentencePieceTokenizer, Tekkenizer, etc.) plug into an InstructTokenizer that knows how to render the prompt template for a specific tokenizer version (v1, v2, v3, v7, v11, v13, v15...). Each InstructTokenizer provides encode_user_message, encode_assistant_message, and encode_tool_message methods that consume the strongly-typed messages from the protocol layer and emit Tokenized outputs (token ids, text, optional images/audios). Source: src/mistral_common/tokens/tokenizers/instruct.py:46-91, src/mistral_common/tokens/tokenizers/base.py:55-58
The SpecialTokens enum centralizes all control tokens used across versions ([INST], [/INST], [AVAILABLE_TOOLS], [TOOL_RESULTS], <s>, </s>, plus image, audio, and thinking markers such as [THINK]/[/THINK]). Source: src/mistral_common/tokens/tokenizers/base.py:14-67
Community note (Issue #3): A regression was reported where a missing space between <s> and [INST] caused SentencePiece tokenizers to emit malformed instruct prefixes. The tokenizer layer is precisely the place such template-formatting bugs surface, since prompt rendering is performed at the boundary between the abstract Tokenizer and the per-version InstructTokenizer.
Protocol layer
Pydantic models define the canonical chat schema. UserMessage, AssistantMessage, ToolMessage, and SystemMessage each carry a to_openai / from_openai pair so the library can interop with any OpenAI-compatible client. The Roles enum locks down the four legal message roles. Source: src/mistral_common/protocol/instruct/messages.py:62-138
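The to_openai / from_openai pairing can be illustrated with a small round-trip sketch. It uses standard-library dataclasses instead of Pydantic, and `Role` and `SimpleMessage` are hypothetical stand-ins for the library's message classes, so field handling is deliberately simplified.

```python
from dataclasses import dataclass
from enum import Enum


class Role(str, Enum):
    """The four legal message roles named in the text."""

    system = "system"
    user = "user"
    assistant = "assistant"
    tool = "tool"


@dataclass(frozen=True)
class SimpleMessage:
    """Hypothetical, text-only message used to show the round trip."""

    role: Role
    content: str

    def to_openai(self) -> dict[str, str]:
        return {"role": self.role.value, "content": self.content}

    @classmethod
    def from_openai(cls, payload: dict[str, str]) -> "SimpleMessage":
        # Role(...) raises ValueError for any role outside the four legal ones.
        return cls(role=Role(payload["role"]), content=payload["content"])
```

Restricting roles through an enum means an unknown role is rejected at parse time, before any prompt is rendered.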
InstructRequest wraps a list of messages plus an optional available_tools list and exposes a to_openai helper used by the chat-completion entry point. Source: src/mistral_common/protocol/instruct/request.py:18-50
Normalization layer
InstructRequestNormalizer and its versioned subclasses (InstructRequestNormalizerV7, etc.) validate that a ChatCompletionRequest is well-formed and aggregate consecutive same-role messages before they reach the tokenizer. For pre-v15 normalizers, build_settings returns ModelSettings with all fields None; for v15+ it actually constructs a settings object from the request. Source: src/mistral_common/protocol/instruct/normalize.py:29-90
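Aggregating consecutive same-role messages can be sketched as a single pass over the conversation. This is an illustration of the idea only: the function name is hypothetical, messages are plain dictionaries, and the "\n\n" separator is an assumption for this sketch.

```python
def aggregate_same_role(
    messages: list[dict[str, str]], sep: str = "\n\n"
) -> list[dict[str, str]]:
    """Merge neighbouring messages that share a role into one message."""
    merged: list[dict[str, str]] = []
    for msg in messages:
        if merged and merged[-1]["role"] == msg["role"]:
            merged[-1]["content"] += sep + msg["content"]
        else:
            merged.append(dict(msg))  # copy so the caller's input is not mutated
    return merged
```

After this step the conversation strictly alternates wherever the input contained runs of the same role, which keeps the per-role encoders simple.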
Integration layer
template_generator emits a Jinja2 chat template from a TemplateConfig dataclass that captures the tokenizer version, spm flag, image/audio/thinking support, and whether BOS/EOS should be emitted as Jinja variables (bos_token/eos_token) or as literal strings. The config object enforces mutual exclusivity (image_support and audio_support cannot both be true; thinking_support and plain_thinking_support cannot both be true) by raising ValueError at construction time. Source: src/mistral_common/integrations/chat_templates/template_generator.py:1-58
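Construction-time validation of mutually exclusive flags can be sketched with a frozen dataclass and `__post_init__`. `ToyTemplateConfig` is a hypothetical reduction of the config described above; only the two exclusivity rules stated in the text are enforced, and the error messages are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTemplateConfig:
    """Hypothetical reduced config; field names follow the text above."""

    version: str
    spm: bool = False
    image_support: bool = False
    audio_support: bool = False
    thinking_support: bool = False
    plain_thinking_support: bool = False

    def __post_init__(self) -> None:
        # Reject contradictory flag combinations before any template is built.
        if self.image_support and self.audio_support:
            raise ValueError("image_support and audio_support are mutually exclusive")
        if self.thinking_support and self.plain_thinking_support:
            raise ValueError(
                "thinking_support and plain_thinking_support are mutually exclusive"
            )
```

Failing at construction means an invalid combination never reaches template generation, so the generator itself can assume a consistent config.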
The guidance.grammar_factory module consumes the rendered Jinja template plus a tool list and produces a Lark grammar suitable for constrained decoding, honoring the tokenizer's tool-calling mode (auto, any, none, or a NamedToolChoice). Source: src/mistral_common/guidance/grammar_factory.py:21-48
Typical Request Flow
- A user constructs or receives an OpenAI-style ChatCompletionRequest (or a list of dicts).
- InstructRequest.from_openai (or the equivalent normalizer entry point) validates the payload and turns it into a strongly-typed InstructRequest. Source: src/mistral_common/protocol/instruct/request.py:18-50
- The version-appropriate InstructTokenizer encodes each message and concatenates the resulting tokens according to that tokenizer version's template. Source: src/mistral_common/tokens/tokenizers/instruct.py:46-91
- The Tokenized output is fed to the model; the request's messages can be re-emitted as OpenAI JSON via to_openai for logging or round-tripping. Source: src/mistral_common/protocol/instruct/messages.py:62-138
See Also
- Tokenizers & Special Tokens
- Instruct Protocol & Messages
- Chat Templates & Grammar Generation
- Versioning & Backward Compatibility
Source: https://github.com/mistralai/mistral-common / Human Manual
Tokenizers: Tekken, SentencePiece & Multimodal Encoders
Related topics: Overview, Installation & Architecture, Instruct Protocol, Validation & Multimodal Requests, Experimental Server, Tool Decoding, Guidance & HF Chat Templates
Overview
mistral-common exposes the open-source tokenization, validation, and normalization stack that powers Mistral AI models. At its core sits a pluggable Tokenizer abstraction implemented by Tekkenizer (a fast BPE tokenizer based on tiktoken) and SentencePieceTokenizer (a wrapper over SentencePiece). On top of those base encoders, an InstructTokenizer layer applies chat templates, tool-call syntax, and multimodal encoding, with versioned behaviors from V1 through V15. Source: src/mistral_common/tokens/tokenizers/base.py:64-150.
The library ships first-class multimodal encoders for images and audio. Recent models are released with Tekken tokenizers, and the sentencepiece install extra is now marked optional in the README. Source: README.md:8-20.
Class Hierarchy
classDiagram
class Tokenizer {
<<abstract>>
+n_words: int
+special_ids: set
+vocab() list
+encode(s, bos, eos) list
+decode(tokens) str
}
class Tekkenizer
class SentencePieceTokenizer
class InstructTokenizerBase {
+tokenizer: Tokenizer
+image_encoder
+audio_encoder
+encode_instruct()
}
class InstructTokenizerV1
class InstructTokenizerV3
class InstructTokenizerV7
class InstructTokenizerV11
class InstructTokenizerV13
class InstructTokenizerV15
Tokenizer <|-- Tekkenizer
Tokenizer <|-- SentencePieceTokenizer
InstructTokenizerBase <|-- InstructTokenizerV1
InstructTokenizerBase <|-- InstructTokenizerV3
InstructTokenizerBase <|-- InstructTokenizerV7
InstructTokenizerBase <|-- InstructTokenizerV11
InstructTokenizerBase <|-- InstructTokenizerV13
InstructTokenizerBase <|-- InstructTokenizerV15
InstructTokenizerBase o-- Tokenizer
InstructTokenizerBase o-- ImageEncoder
InstructTokenizerBase o-- AudioEncoderBase
Tokenizer Interface
The Tokenizer ABC defines the contract every implementation must satisfy: vocabulary queries (n_words, vocab, id_to_piece), special-token lookups (bos_id, eos_id, pad_id, unk_id), and the round-trip methods encode(s, bos, eos) / decode(tokens, special_token_policy). Source: src/mistral_common/tokens/tokenizers/base.py:64-150.
Tokenized is the canonical return type. It is a Pydantic model that carries tokens: list[int], an optional text preview, FIM prefix_ids, and parallel images: list[np.ndarray] / audios: list[Audio] lists. Pydantic's ConfigDict(arbitrary_types_allowed=True) is used so that numpy arrays and audio dataclasses can sit alongside primitive token ids. Source: src/mistral_common/tokens/tokenizers/base.py:30-60.
The SpecialTokens namespace enumerates every control token the encoders know about: bos = "<s>", eos = "</s>", begin_inst = "[INST]", begin_tools = "[AVAILABLE_TOOLS]", tool_calls, img, img_break, img_end, begin_audio, begin_think, end_think, FIM markers (prefix, suffix, middle), and audio streaming markers (streaming_pad, streaming_word, text_to_audio, audio_to_text). Source: src/mistral_common/tokens/tokenizers/base.py:155-210.
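A string-valued enum makes it straightforward to recognise control tokens inside rendered text. The sketch below is a partial, hypothetical mirror of the namespace: the literals come from this manual, while the `end_inst` member name and the pairing of `begin_think` / `end_think` with `[THINK]` / `[/THINK]` are assumptions made for illustration.

```python
import re
from enum import Enum


class ToySpecialTokens(str, Enum):
    """Partial, hypothetical mirror of the control-token namespace."""

    bos = "<s>"
    eos = "</s>"
    begin_inst = "[INST]"
    end_inst = "[/INST]"  # member name assumed for this sketch
    begin_tools = "[AVAILABLE_TOOLS]"
    begin_think = "[THINK]"  # name/literal pairing assumed
    end_think = "[/THINK]"  # name/literal pairing assumed


# One capturing group so re.split keeps the control tokens in its output.
_SPECIAL_RE = re.compile(
    "(" + "|".join(re.escape(t.value) for t in ToySpecialTokens) + ")"
)


def split_on_special(text: str) -> list[str]:
    """Split text into control tokens and plain-text pieces, dropping empties."""
    return [piece for piece in _SPECIAL_RE.split(text) if piece]


def is_special(piece: str) -> bool:
    """True when the piece is exactly one of the known control tokens."""
    return piece in {t.value for t in ToySpecialTokens}
```

Splitting a rendered prompt this way is a convenient debugging aid when checking whether whitespace around control tokens matches expectations.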
A high-engagement community report (Issue #3) flagged that the SentencePiece path was missing a separator between <s> and [INST], which prevented the model from emitting the proper instruct template; the fix lives in src/mistral_common/tokens/tokenizers/sentencepiece.py and is one reason chat-templated decoding must remain versioned.
Tekken Tokenizer
Tekkenizer is the high-throughput BPE tokenizer used by recent Mistral models. It loads a tekken.json artifact containing a ModelData record with vocab: list[TokenInfo], special_tokens: list[SpecialTokenInfo], a TekkenConfig (regex pattern, num_vocab_tokens, default_vocab_size, default_num_special_tokens, and a version string), and per-modality image: ImageConfig / audio: AudioConfig blocks. Source: src/mistral_common/tokens/tokenizers/tekken.py:50-110.
Each TokenInfo stores rank plus base64-encoded token_bytes; SpecialTokenInfo adds an is_control flag. A DEPRECATED_SPECIAL_TOKENS tuple preserves backward compatibility for the legacy control-token list. Source: src/mistral_common/tokens/tokenizers/tekken.py:30-50.
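Since token bytes are stored base64-encoded, a loader has to decode them before building a byte-to-rank table. The sketch below shows that step in isolation; `ToyTokenInfo` is a hypothetical record that keeps only the two fields named above.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTokenInfo:
    """Hypothetical record with the two fields named in the text."""

    rank: int
    token_bytes: str  # base64-encoded raw bytes of the token


def build_rank_table(vocab: list[ToyTokenInfo]) -> dict[bytes, int]:
    """Decode each entry and map its raw bytes to its BPE rank."""
    return {base64.b64decode(info.token_bytes): info.rank for info in vocab}
```

Base64 keeps arbitrary byte sequences, including ones that are not valid UTF-8, safe inside a JSON file.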
The factory helper download_tokenizer_from_hf_hub (imported by mistral.py) pulls these JSON files from the Hugging Face Hub, gated behind the hf-hub extra. Source: src/mistral_common/tokens/tokenizers/mistral.py:1-30.
SentencePiece Tokenizer
SentencePieceTokenizer is a thin adapter over the SentencePiece runtime. Valid tokenizer filenames are detected by the suffix scheme <name>.model.<version>[mm] together with a bare tekken.json; if multiple files are present, tekken.json wins, otherwise the highest versioned SPM model is selected. Source: src/mistral_common/tokens/tokenizers/utils.py:30-60.
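The selection rule described above (tekken.json wins, otherwise the highest versioned SPM model) can be sketched as follows. The regular expression is an assumption: it reads the version as `v<digits>` with an optional `m<digits>` multimodal suffix, which is one plausible reading of the `<name>.model.<version>[mm]` scheme and not a copy of the library's pattern.

```python
import re

# Assumed filename shape: "<name>.model.v3" or "<name>.model.v3m1".
_SPM_PATTERN = re.compile(r"\.model\.v(\d+)(?:m\d+)?$")


def pick_tokenizer_file(filenames: list[str]) -> str:
    """Prefer tekken.json; otherwise return the highest versioned SPM model."""
    if "tekken.json" in filenames:
        return "tekken.json"
    versioned: list[tuple[int, str]] = []
    for name in filenames:
        match = _SPM_PATTERN.search(name)
        if match:
            versioned.append((int(match.group(1)), name))
    if not versioned:
        raise ValueError("no tokenizer file found")
    # Tuples compare by version first, so max() selects the newest model.
    return max(versioned)[1]
```

Comparing the parsed integer rather than the raw string avoids ordering `v11` before `v3`.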
The tokenizer exposes a get_image_config helper and an is_sentencepiece predicate used to dispatch to the SPM-specific instruct branches. For example, the v3-SPM chat template differs from the v3-Tekken template only in its tool-call branch (uses_v2_v3spm_tool_branch). Source: src/mistral_common/integrations/chat_templates/template_generator.py:240-260.
Multimodal Encoders
InstructTokenizerBase composes a base Tokenizer with optional ImageEncoder and AudioEncoder instances. The mistral.py factory wires the correct special ids: image encoding uses img, img_break, and img_end (collected in a SpecialImageIDs dataclass), while audio uses audio, begin_audio, plus transcribe/text-to-audio/audio-to-text tokens when present, stored in SpecialAudioIDs. Source: src/mistral_common/tokens/tokenizers/mistral.py:18-50.
The audio encoder is only loaded for Tekkenizer instances, reflecting Mistral's current release policy. Image chunks are emitted with [IMG] markers and processed by ImageEncoder; the v15+ template also accepts thinking chunks via the experimental ThinkChunk API, and the _split_content_and_think_chunks helper reconstructs them by scanning token streams for begin_think / end_think markers, raising on nested or unbalanced think blocks. Source: src/mistral_common/experimental/think.py:1-40.
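Scanning a token stream for think markers, while rejecting nested or unbalanced blocks, can be sketched as a small state machine. The function name, the return shape, and the error messages below are assumptions for illustration; the library's helper may differ in all three.

```python
def split_think_spans(
    tokens: list[int], begin_think: int, end_think: int
) -> list[tuple[str, list[int]]]:
    """Split ids into ("content", ids) and ("think", ids) spans, in order."""
    spans: list[tuple[str, list[int]]] = []
    current: list[int] = []
    inside = False
    for tok in tokens:
        if tok == begin_think:
            if inside:
                raise ValueError("nested think block")
            if current:
                spans.append(("content", current))
            current, inside = [], True
        elif tok == end_think:
            if not inside:
                raise ValueError("end_think without a matching begin_think")
            spans.append(("think", current))
            current, inside = [], False
        else:
            current.append(tok)
    if inside:
        raise ValueError("unclosed think block")
    if current:
        spans.append(("content", current))
    return spans
```

The marker ids themselves are dropped from the spans, so each span can be decoded to text independently.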
Instruct Tokenizer Variants
InstructTokenizerBase is the versioned entry point. find_first_last_user locates the first and last user messages, _truncate_for_max_tokens enforces context limits, and encode_instruct walks the conversation applying role-specific encoders. Release v1.11.3 fixed continue_final_message so that prepended reasoning chunks are not lost on continuation. Source: src/mistral_common/tokens/tokenizers/instruct.py:30-90.
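Locating the first and last user turns is a single linear scan. The sketch below works on a list of role strings and uses -1 as a "not found" sentinel; both the input shape and the sentinel are assumptions for this example, not the library's signature.

```python
def find_first_last_user(roles: list[str]) -> tuple[int, int]:
    """Return indices of the first and last "user" entries, or -1 if none exist."""
    first = last = -1
    for index, role in enumerate(roles):
        if role == "user":
            if first == -1:
                first = index
            last = index
    return first, last
```

Knowing both positions lets an encoder treat the opening and the most recent user turn differently from the turns in between.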
Versioned subclasses (V1, V3, V7, V11, V13, V15) differ in tool-call syntax, image/audio support, and the placement of ThinkChunk markers. The TemplateConfig class centralizes the rules — for example, forbids_assistant_content_with_tools is True only for V2/V3, and system_supports_thinking is pre-v15 only. Source: src/mistral_common/integrations/chat_templates/template_generator.py:200-260.
The Guidance integration reuses these templates to build Lark grammars. GrammarFactory.from_template renders the Jinja template (passing fcall, mode, json_schema, parallel_tool_calls) and combines the result with optional begin_think / end_think token allow-lists when the model supports model_settings. Source: src/mistral_common/guidance/grammar_factory.py:80-110.
Community Notes
- Issue #15 (Pydantic version conflict): Because tokenizers depend on Pydantic v2 models such as Tokenized and InstructRequest, a tight pydantic<2.7 constraint can break downstream consumers; pin pydantic ranges carefully when mixing with other packages. Source: src/mistral_common/tokens/tokenizers/base.py:30-60.
- Issue #3 (<s> / [INST] spacing): Tracked the missing separator in the SPM tokenizer; fixed in src/mistral_common/tokens/tokenizers/sentencepiece.py.
- Release v1.11.3: Added multi-format reasoning parsing to from_openai and preserved zero OpenAI seeds in chat request conversion; also fixed continue_final_message so prepended reasoning chunks survive continuation.
See Also
- Chat template generator: versioned Jinja templates and TemplateConfig
- Pydantic protocol layer: validation and normalization of InstructRequest / AssistantMessage
- Multimodal encoders: image, audio, and FIM encoding internals
- Guidance integration: Lark grammar generation from chat templates
Instruct Protocol, Validation & Multimodal Requests
Related topics: Tokenizers: Tekken, SentencePiece & Multimodal Encoders, Experimental Server, Tool Decoding, Guidance & HF Chat Templates
Overview
The Instruct Protocol is the canonical data model used by mistral-common to represent, validate, and normalize chat-style requests sent to Mistral AI models. According to the README.md, the library provides "validation and normalization of requests, messages, tool calls, and responses" built on top of Pydantic, plus tokenization of text, images, and tool calls. Community issue #15 highlights a recurring concern about Pydantic version compatibility, so users should pin compatible Pydantic versions when depending on this layer.
The protocol layers three concerns:
- Schema: Pydantic models defining what a valid request looks like (ChatCompletionRequest, InstructRequest, message and tool types).
- Validation / Normalization: rules that reject malformed requests and massage them into a canonical form ready for tokenization.
- Multimodal content: Chunk types that let a single message carry text, images, audio, tool calls, and "thinking" segments together.
Request and Message Schemas
The top-level entry points are ChatCompletionRequest and the simpler InstructRequest, both defined in src/mistral_common/protocol/instruct/request.py. ChatCompletionRequest extends the OpenAI-style shape with Mistral-specific fields:
| Field | Purpose |
|---|---|
| messages | Ordered list of ChatMessageType (user, assistant, system, tool). |
| tools / tool_choice | Function-calling definitions and selection policy. |
| response_format | Text vs. JSON-object response formatting. |
| truncate_for_context_length | Auto-trim messages if they exceed the model window. |
| continue_final_message | Resume generation from the last assistant turn (fixed in v1.11.3 per release notes). |
| reasoning_effort | Controls how much reasoning effort the model should apply. |
Calling to_openai() on the request converts messages and tools into the OpenAI wire format and accepts an optional reasoning_field_format argument. In the other direction, from_openai gained multi-format reasoning parsing in v1.11.3.
Message types live in src/mistral_common/protocol/instruct/messages.py (UserMessage, AssistantMessage, ToolMessage, SystemMessage). The AssistantMessage is notable for carrying both content and an optional prefix flag — when prefix=True, generation continues the assistant's text without re-emitting a closing turn marker. The tool_calls field is defined in src/mistral_common/protocol/instruct/tool_calls.py and supports typed function calls with JSON-schema parameters.
Multimodal Content and Chunks
A message's content may be a plain string or a list of Chunk objects, defined in src/mistral_common/protocol/instruct/chunk.py. The ChunkTypes enum discriminates the supported variants:
- text: plain text with a text field.
- image: base64 or URL reference, lazily decoded into a SerializableImage.
- audio: base64/URL reference; format is auto-detected via soundfile (see _detect_audio_format).
- tool_call: emitted alongside assistant text when a tool is invoked.
- think: reasoning segment emitted between <think>/</think>-style markers.
Special tokens used by chunk processing are defined in src/mistral_common/tokens/tokenizers/base.py as SpecialTokens literals: begin_inst ([INST]), begin_tools ([AVAILABLE_TOOLS]), img, img_break, img_end, begin_audio, begin_think, end_think, plus bos (<s>) and eos (</s>). Community issue #3 noted a missing space between <s> and [INST] in the SentencePiece tokenizer — a subtle formatting bug that prevented models from emitting the proper instruct template.
Validation and Normalization Pipeline
The pipeline converts an incoming ChatCompletionRequest into a tokenizer-ready InstructRequest (and eventually token IDs). The orchestration happens in src/mistral_common/tokens/tokenizers/mistral.py, which wires together an InstructTokenizer, a MistralRequestValidator, and an InstructRequestNormalizer:
flowchart LR
A[ChatCompletionRequest] --> B[MistralRequestValidator]
B --> C[InstructRequestNormalizer]
C --> D[InstructRequest]
D --> E[InstructTokenizer]
E --> F[Tokenized: tokens + images + prefix_ids]
- Validator (src/mistral_common/protocol/instruct/validator.py) enforces structural invariants: non-empty assistant turns, role ordering, tool-result semantics, and continue_message requiring prefix=False.
- Normalizer (src/mistral_common/protocol/instruct/normalize.py) materializes the version-specific InstructRequest subclass and, for v15+ tokenizers, populates ModelSettings (e.g. truncation policy). For pre-v15 normalizers, build_settings returns all-None defaults.
- Encoder (src/mistral_common/tokens/tokenizers/instruct.py) turns the normalized request into token IDs by emitting role-specific headers, tool definitions, and content chunks. It returns a Tokenized record whose images field carries the loaded np.ndarray arrays associated with image chunks.
Chat Template Generation
For open-source Mistral releases, src/mistral_common/integrations/chat_templates/template_generator.py generates a Jinja2 chat template that mirrors the in-Python encoder. A TemplateConfig selects the tokenizer version (v1, v2, v3, v7, v11, v13, v15) and feature toggles (spm, image_support, audio_support, thinking_support, plain_thinking_support). The generator emits message-aggregation logic that joins consecutive same-role messages with \n\n, supports tool-call inline branches for v2/v3-SPM, and routes <think> chunks either through special tokens or plain <think>/</think> markers. This file is what enables downstream runtimes (vLLM, TGI, llama.cpp) to render prompts that round-trip with mistral-common's own encoder.
Failure Modes and Edge Cases
Common pitfalls when working with this layer include:
- Version-specific tool rules. Per the forbids_assistant_content_with_tools property, v2 and v3 tokenizers reject assistant messages that carry both content and tool_calls simultaneously.
- Continue-final-message guard. Continuation requires prefix=False, so the encoder raises InvalidAssistantMessageException when continue_message=True but prefix=True.
- Audio/image exclusivity. The template generator raises ValueError if image_support and audio_support are both enabled.
- SPM compatibility. Recent models ship with Tekken tokenizers; the sentencepiece extra is now optional (per README.md).
See Also
- Tokenizer Base & Special Tokens
- Mistral Tokenizer & Encoding
- Jinja Chat Template Generator
- Tool Calls & Function Calling
Experimental Server, Tool Decoding, Guidance & HF Chat Templates
Related topics: Tokenizers: Tekken, SentencePiece & Multimodal Encoders, Instruct Protocol, Validation & Multimodal Requests
Overview
mistral-common exposes four closely related capabilities under its experimental namespace and integration layer: a FastAPI-based tokenizer server, tool/function-call decoding helpers, a guidance-backed grammar factory for constrained generation, and a programmatic Hugging Face tokenizer_config.json chat template generator. The README explicitly marks the server mode as [Experimental] and exposes it via the pip install "mistral-common[server]" extra, while optional [hf-hub] and [sentencepiece] extras cover the chat-template and SentencePiece paths used by the integrations code. Source: README.md:23-43
These features share a common goal: let third-party runtimes (vLLM, SGLang, TGI, raw transformers) consume Mistral instruct formats without re-implementing the special-token grammar or the version-aware tool/thinking conventions. The experimental/ package is the staging area where new behaviors land before being promoted into the stable protocol/ and tokens/ layers.
Experimental Server
The experimental server lives under src/mistral_common/experimental/app/ and follows a standard FastAPI layout: main.py wires the ASGI app, routers.py defines the HTTP routes, and models.py declares the request/response Pydantic schemas. Source: src/mistral_common/experimental/app/main.py, src/mistral_common/experimental/app/routers.py, src/mistral_common/experimental/app/models.py
Its purpose is to expose a tokenization HTTP endpoint backed by MistralTokenizer, decoupling the heavy tokenizer state from the inference process. The server is available only when the server extra is installed (pip install "mistral-common[server]"), and the README explicitly tags it [Experimental], meaning the API surface and payload schema may change between minor versions. Source: README.md:25-32
flowchart LR
Client[HTTP Client] -->|POST /v1/tokenize| Router[routers.py]
Router --> Models[models.py Pydantic schemas]
Router --> Server[main.py FastAPI app]
Server --> MT[MistralTokenizer]
MT --> Tekken[Tekken / SentencePiece backend]
Server -->|tokens, text, prefix_ids| Client
The utility helpers in experimental/utils.py and the thinking-aware logic in experimental/think.py are consumed by the routers to keep the server endpoints aligned with the latest ReasoningFieldFormat enum (thinking_chunks | reasoning | reasoning_content) used by AssistantMessage.to_openai. Source: src/mistral_common/experimental/utils.py, src/mistral_common/experimental/think.py, src/mistral_common/protocol/instruct/messages.py:ReasoningFieldFormat
Tool Decoding
src/mistral_common/experimental/tools.py provides the tool-decoding primitives used both by the server and by external callers that want to parse Mistral tool-call streams back into structured ToolCall objects. The companion experimental/think.py complements it by separating reasoning traces from tool-call deltas. Source: src/mistral_common/experimental/tools.py, src/mistral_common/experimental/think.py
Tool decoding must respect the same version constraints that the chat-template configuration encodes. TemplateConfig.forbids_assistant_content_with_tools returns True for TokenizerVersion.v2 and v3, meaning the decoder must reject assistant messages that mix content and tool_calls on those versions. Separately, validates_assistant_non_empty is enabled for v3 non-SPM and for v7+, and uses_v2_v3spm_tool_branch selects the inline elif branch for v2 and v3-SPM tool syntax. Source: src/mistral_common/integrations/chat_templates/template_generator.py:forbids_assistant_content_with_tools,validates_assistant_non_empty,uses_v2_v3spm_tool_branch
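The three version predicates named above can be paraphrased as plain functions. This is a reading of the prose rather than the library's code: `ToyVersion` is a hypothetical enum covering the versions this manual lists, and the functions take the spm flag as an explicit argument.

```python
from enum import IntEnum


class ToyVersion(IntEnum):
    """Hypothetical stand-in for the tokenizer versions listed in this manual."""

    v1 = 1
    v2 = 2
    v3 = 3
    v7 = 7
    v11 = 11
    v13 = 13
    v15 = 15


def forbids_assistant_content_with_tools(version: ToyVersion) -> bool:
    # Only v2 and v3 reject assistant messages carrying both content and tool_calls.
    return version in (ToyVersion.v2, ToyVersion.v3)


def validates_assistant_non_empty(version: ToyVersion, spm: bool) -> bool:
    # Enabled for v3 without SPM, and for every version from v7 upward.
    return (version == ToyVersion.v3 and not spm) or version >= ToyVersion.v7


def uses_v2_v3spm_tool_branch(version: ToyVersion, spm: bool) -> bool:
    # The inline tool branch applies to v2 and to the SPM flavour of v3.
    return version == ToyVersion.v2 or (version == ToyVersion.v3 and spm)
```

Expressing the rules as pure predicates keeps version checks in one place, so encoder, decoder, and template code can share them.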
The protocol-level definition of a Tool (a Function with JSON-schema parameters) is the contract experimental/tools.py is expected to honor. The InstructRequest(...).to_openai() docstring demonstrates this round-trip with a get_current_weather tool. Source: src/mistral_common/protocol/instruct/request.py:InstructRequest to_openai docstring
Guidance Grammar Factory
src/mistral_common/guidance/grammar_factory.py exposes GrammarFactory, which renders a Lark grammar string for a given MistralTokenizer so that guidance/llguidance engines can perform token-constrained decoding. The factory supports three function-calling modes (auto, any, none), NamedToolChoice, optional parallel_tool_calls, and an additional json_schema to union in. Source: src/mistral_common/guidance/grammar_factory.py:GrammarFactory.encode_for_guided_decoding.
The constructor asserts that both llguidance and jinja2 are installed (optional dependencies) and refuses to operate on tokenizers it does not understand. GrammarFactory.is_supported requires a Tekken tokenizer with version >= TokenizerVersion.v11, so legacy SentencePiece models must be upgraded before guided decoding is available. Source: src/mistral_common/guidance/grammar_factory.py:is_supported,__init__
When a tokenizer exposes the begin_think/end_think special tokens, the factory injects them into the generated grammar so that thinking traces can be emitted alongside JSON tool calls (think_with_json branch). The _convert_tool_calls helper produces a per-tool TOOL_CALL_GRAMMAR fragment, parenthesizes and alternates each entry, and suffixes a + when parallel_tool_calls=True. Source: src/mistral_common/guidance/grammar_factory.py:_convert_tool_calls
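The "parenthesize, alternate, optionally repeat" step can be shown on plain strings. The sketch below assumes each per-tool fragment is already a grammar snippet; the exact spacing, the outer grouping, and the function name are choices made for this illustration and are not taken from the library.

```python
def alternate_tool_calls(fragments: list[str], parallel_tool_calls: bool) -> str:
    """Wrap each fragment in parentheses, join with "|", and add "+" if parallel."""
    if not fragments:
        raise ValueError("at least one tool fragment is required")
    body = " | ".join(f"({fragment})" for fragment in fragments)
    # The outer group makes the "+" apply to the whole alternation.
    return f"({body})" + ("+" if parallel_tool_calls else "")
```

For two hypothetical fragments `a` and `b`, the result is `((a) | (b))` for a single call and `((a) | (b))+` when parallel tool calls are allowed.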
HF Chat Template Generator
The integration under src/mistral_common/integrations/chat_templates/template_generator.py programmatically builds a tokenizer-version-aware Jinja2 chat template that can be embedded into a Hugging Face tokenizer_config.json. It is driven by a TemplateConfig dataclass that captures the TokenizerVersion, spm flag, and the boolean toggles for image_support, audio_support, thinking_support, plain_thinking_support, and use_special_token_variables. Source: src/mistral_common/integrations/chat_templates/template_generator.py:TemplateConfig
| Config flag | Effect on generated template |
|---|---|
| version (v1–v15) | Selects special-token layout, tool-call branch, and message-aggregation rule |
| spm | Adds trailing spaces after special tokens; blocked on v11+ and with audio |
| image_support (v3+) | Adds [IMG] chunk processing; mutually exclusive with audio |
| audio_support (v7+) | Adds [AUDIO] chunk processing; mutually exclusive with image |
| thinking_support (v13+) | Emits [THINK]/[/THINK] around reasoning chunks |
| plain_thinking_support (v11 only) | Uses <think>/</think> literal tags instead of special tokens |
| use_special_token_variables | Emits bos_token/eos_token as Jinja variables rather than literals |
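The version constraints in the table can be collected into a single checker that reports every violated rule. This is a sketch derived only from the table above: the function name, the integer version argument, and the message wording are assumptions, and the real config may enforce additional rules.

```python
def check_template_flags(
    version: int,
    *,
    spm: bool = False,
    image_support: bool = False,
    audio_support: bool = False,
    thinking_support: bool = False,
    plain_thinking_support: bool = False,
) -> list[str]:
    """Return one message per table rule that the given combination breaks."""
    problems: list[str] = []
    if spm and version >= 11:
        problems.append("spm is blocked on v11+")
    if spm and audio_support:
        problems.append("spm cannot be combined with audio_support")
    if image_support and version < 3:
        problems.append("image_support requires v3+")
    if audio_support and version < 7:
        problems.append("audio_support requires v7+")
    if image_support and audio_support:
        problems.append("image_support and audio_support are mutually exclusive")
    if thinking_support and version < 13:
        problems.append("thinking_support requires v13+")
    if plain_thinking_support and version != 11:
        problems.append("plain_thinking_support is v11 only")
    return problems
```

Returning a list rather than raising on the first failure makes it easier to report every conflict in a configuration at once.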
Template generation is split into composable helpers: _generate_header emits the BOS, _generate_system_prompt_handling branches on uses_system_prompt_tokens (pre-v7 extracts and merges, v7+ keeps messages inline), _generate_message_aggregation coalesces same-role messages using a sentinel-flush loop, and _generate_system_message_handling writes the [SYSTEM_PROMPT]…[/SYSTEM_PROMPT] envelope. Source: src/mistral_common/integrations/chat_templates/template_generator.py:_generate_header,_generate_system_prompt_handling,_generate_message_aggregation,_generate_system_message_handling
Tool-call rendering leverages _emit_call_id_resolution, which prefers message['call_id'] and falls back to message['tool_call_id'], raising a Jinja exception when neither is a 9-character string. Numeric tool-result content is auto-coerced through int then float parsing branches so downstream integer schemas accept stringly-typed payloads. Source: src/mistral_common/integrations/chat_templates/template_generator.py:_emit_call_id_resolution,_emit_tool_content_int_or_float_parsing
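The same two behaviours can be expressed in Python for readers who find Jinja harder to follow. This is an illustrative re-statement, with hypothetical function names and a ValueError standing in for the Jinja exception; Python's int and float parsers also accept a few inputs (such as underscores in digits) that a template may not.

```python
from typing import Union


def resolve_call_id(message: dict) -> str:
    """Prefer call_id, fall back to tool_call_id, and require a 9-character string."""
    for key in ("call_id", "tool_call_id"):
        value = message.get(key)
        if isinstance(value, str) and len(value) == 9:
            return value
    raise ValueError("tool message needs a 9-character call_id or tool_call_id")


def coerce_numeric(content: str) -> Union[int, float, str]:
    """Try int first, then float; leave non-numeric content unchanged."""
    for parse in (int, float):
        try:
            return parse(content)
        except ValueError:
            continue
    return content
```

Trying int before float matters: `"42"` stays an integer instead of becoming `42.0`, which is what an integer-typed JSON schema expects.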
See Also
- Tokenizers and Special Tokens: the Tokenizer ABC and SpecialTokens enum that underpin the server and grammar factory
- Instruct Protocol and Normalization: the InstructRequest / InstructRequestNormalizer surface the server exposes
- Multimodal (Image & Audio) Encoding: the image_support / audio_support branches of the template generator
Source: https://github.com/mistralai/mistral-common / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
Found 9 structured pitfall items, including 1 high/blocking item. Top priority: Security or permission risk - Security or permission risk requires verification.
1. Security or permission risk: Security or permission risk requires verification
- Severity: high
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/148
2. Capability evidence risk: Capability evidence risk requires verification
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: capability.assumptions | github_repo:786756993 | https://github.com/mistralai/mistral-common
3. Maintenance risk: Maintenance risk requires verification
- Severity: medium
- Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common
4. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: downstream_validation.risk_items | github_repo:786756993 | https://github.com/mistralai/mistral-common
5. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: no_demo
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: risks.scoring_risks | github_repo:786756993 | https://github.com/mistralai/mistral-common
6. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/232
7. Security or permission risk: Security or permission risk requires verification
- Severity: medium
- Finding: Project evidence flags a security or permission risk. Review the linked source before relying on this workflow.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: community_evidence:github | https://github.com/mistralai/mistral-common/issues/229
8. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: issue_or_pr_quality=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common
9. Maintenance risk: Maintenance risk requires verification
- Severity: low
- Finding: release_recency=unknown.
- User impact: May increase setup, validation, or first-run risk for the user.
- Recommended check: Reproduce the official install and quickstart path in an isolated environment.
- Evidence: evidence.maintainer_signals | github_repo:786756993 | https://github.com/mistralai/mistral-common
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using mistral-common with real data or production workflows.
- BUG: Bug in the definition of continue_final_message that the Transfore - github / github_issue
- BUG: Model name to tokenizer mapping is outdated - github / github_issue
- BUG: MistralCommonTokenizer from transformers is not supported by trl S - github / github_issue
- v1.11.3: Fix continue_final_message, add reasoning format to to_openai - github / github_release
- v1.11.2: Improve from_openai method. - github / github_release
- v1.11.1: Patch for agentic use - github / github_release
- v1.11.0: Mistral Guidance - github / github_release
- v1.10.0: Tokenizer v15, Reasoning Effort and Python 3.14 - github / github_release
- v1.9.1 Patch Release - github / github_release
- v1.9.0 - Stream my audio 🎙️ - github / github_release
- v1.8.8: Backward comp - github / github_release
- v1.8.7: Refactoring and bug fixes. - github / github_release
Source: Project Pack community evidence and pitfall evidence