Doramagic Project Pack · Human Manual

outlines

Introduction to Outlines

Related topics: System Architecture, Quickstart Guide


Outlines is a Python library that enables structured text generation with Large Language Models (LLMs). It guarantees that model outputs conform to a specified structure during generation, eliminating the need for post-processing, regex parsing, or fragile code that breaks easily. Source: README.md

What Problem Does Outlines Solve?

LLMs are powerful, but their outputs are unpredictable. Traditional approaches repair bad outputs after generation with parsing and regular expressions, which is fragile. Outlines instead enforces structure during generation rather than after it. Source: README.md

The core philosophy follows Python's own type system pattern: simply specify the desired output type, and Outlines ensures the generated data matches that structure exactly. Source: README.md

Core Philosophy

Outlines follows a simple pattern that mirrors Python's own type system:

  • For yes/no responses, use Literal["Yes", "No"]
  • For numerical values, use int
  • For complex objects, define a structure with a Pydantic model

This type-driven API makes structured generation feel natural to Python developers. Source: README.md
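To make the type-driven idea concrete, here is a minimal sketch of how a Python type could be turned into a textual constraint. It is an illustration only: the helper `type_to_regex` is hypothetical, covers just `Literal` and `int`, and is not the conversion code Outlines itself uses.

```python
import re
from typing import Literal, get_args, get_origin


def type_to_regex(output_type) -> str:
    """Map a small subset of Python types to a regular expression (illustrative only)."""
    if get_origin(output_type) is Literal:
        # A Literal becomes an alternation of its escaped choices.
        choices = (re.escape(str(choice)) for choice in get_args(output_type))
        return "(" + "|".join(choices) + ")"
    if output_type is int:
        # Optional sign followed by one or more digits.
        return r"[+-]?\d+"
    raise TypeError(f"unsupported type in this sketch: {output_type!r}")


yes_no = type_to_regex(Literal["Yes", "No"])
print(yes_no)                                          # (Yes|No)
print(bool(re.fullmatch(yes_no, "Yes")))               # True
print(bool(re.fullmatch(type_to_regex(int), "100")))   # True
```

Any output that fully matches the derived pattern is, by construction, a valid value of the requested type.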

Key Features

Universal Model Support

Outlines works with any model provider with minimal code changes:

| Model Type | Description | Documentation |
|---|---|---|
| Server Support | vLLM and Ollama | Server Integrations |
| Local Model Support | transformers and llama.cpp | Model Integrations |
| API Support | OpenAI, Gemini, and Dottxt | API Integrations |

Source: README.md

Guaranteed Valid Structure

Outlines provides several key guarantees:

  • Works with any model - Same code runs across OpenAI, Ollama, vLLM, and more
  • Simple integration - Just pass your desired output type: model(prompt, output_type)
  • Guaranteed valid structure - No more parsing headaches or broken JSON
  • Provider independence - Switch models without changing code

Source: README.md

Architecture Overview

Layer Stack

Outlines follows a multi-layered architecture that transforms Python types into generation constraints:

User API (outlines.models)
    ↓
Generator Classes (SteerableGenerator, BlackBoxGenerator)
    ↓
Type System (types/dsl.py: Pydantic → JsonSchema → Regex)
    ↓
FSM Compilation (outlines-core: regex → FSM via interegular)
    ↓
Guide System (processors/guide.py: FSM state management)
    ↓
Logits Processing (processors/structured.py: token masking)
    ↓
Model Providers (transformers, OpenAI, etc.)

Source: llm.txt

Key Design Decisions

  1. FSM-based constraints: For local models, constraints compile to finite state machines that track valid next tokens
  2. Provider abstraction: Same constraint system works across local models (transformers) and APIs (OpenAI)
  3. Lazy compilation: FSMs are compiled on first use and cached persistently
  4. Token-level control: Constraints apply at the token level, not character level
  5. Type-driven API: Python types are the primary interface for specifying constraints

Source: llm.txt
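The first, third, and fourth design decisions can be sketched with a toy example. The code below hand-writes a finite state machine for the pattern `-?[0-9]+`, then computes which vocabulary tokens are allowed from each state on first use and caches the answer. The vocabulary, the state numbering, and the function names are all assumptions made for illustration; the cache here is in-memory only, whereas the source describes a persistent one.

```python
from functools import lru_cache

# Toy vocabulary: tokens may span several characters, as real tokenizers do.
VOCAB = ["-", "1", "12", "7", "a", "3x"]
DIGITS = "0123456789"


def step(state, char):
    """Advance the FSM for -?[0-9]+ by one character; None means rejected.

    States: 0 = start, 1 = after a leading '-', 2 = inside the digits (accepting).
    """
    if char in DIGITS:
        return 2
    if char == "-" and state == 0:
        return 1
    return None


def walk(state, token):
    """Feed a whole token through the FSM, character by character."""
    for char in token:
        state = step(state, char)
        if state is None:
            return None
    return state


@lru_cache(maxsize=None)
def allowed_tokens(state):
    """Tokens that keep the output valid from `state`; computed lazily, then cached."""
    return tuple(token for token in VOCAB if walk(state, token) is not None)


print(allowed_tokens(0))  # ('-', '1', '12', '7')
print(allowed_tokens(1))  # ('1', '12', '7')
```

The constraint is checked per token rather than per character: the token `3x` is rejected as a whole even though its first character is a valid digit.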

Model Class Hierarchy

Outlines distinguishes between two types of model implementations:

graph TD
    BaseModel[BaseModel]
    SteerableModel[SteerableModel - Controls logits]
    BlackBoxModel[BlackBoxModel - Uses provider's structured output]
    
    BaseModel --> SteerableModel
    BaseModel --> BlackBoxModel
    
    SteerableModel --> Transformers[Transformers]
    SteerableModel --> LlamaCpp[LlamaCpp]
    
    BlackBoxModel --> OpenAI[OpenAI]
    BlackBoxModel --> Gemini[Gemini]
    BlackBoxModel --> Anthropic[Anthropic]

Source: llm.txt

Getting Started

Installation

Install Outlines using pip:

pip install outlines

Source: README.md

Basic Usage

#### 1. Connect to a Model

import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM


MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)

Source: README.md

#### 2. Simple Structured Outputs

from typing import Literal
from pydantic import BaseModel


# Simple classification
sentiment = model(
    "Analyze: 'This product completely changed my life!'",
    Literal["Positive", "Negative", "Neutral"]
)
print(sentiment)  # "Positive"

# Extract specific types
temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature)  # 100

Source: README.md

#### 3. Complex Structures with Pydantic

from pydantic import BaseModel
from enum import Enum

class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4

class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str

Source: README.md

Using Templates

Outlines supports Jinja-based templates for dynamic prompt generation:

import outlines
from typing import List, Literal
from transformers import AutoTokenizer, AutoModelForCausalLM


MODEL_NAME = "microsoft/phi-4"
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)


# Create a reusable template with Jinja syntax
sentiment_template = outlines.Template.from_string("""
<|im_start|>user
Analyze the sentiment of the following {{ content_type }}:

{{ text }}

Provide your analysis as either "Positive", "Negative", or "Neutral".
<|im_end|>
<|im_start|>assistant
""")

# Generate prompts with different parameters
review = "This restaurant exceeded all my expectations. Fantastic service!"
prompt = sentiment_template(content_type="review", text=review)

# Use with structured generation
result = model(prompt, Literal["Positive", "Negative", "Neutral"])

Source: README.md

Generator Pattern

The Generator class provides a reusable way to apply structured generation:

from outlines import Generator, from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# Create a generator with a specific output type
# (MyOutputType is a placeholder for any supported type, e.g. a Literal or a Pydantic model)
generator = Generator(model, MyOutputType)

# Reuse the generator multiple times
result1 = generator("Prompt 1")
result2 = generator("Prompt 2")

The Generator class was introduced in v1 to address the need for reusable generation with a fixed output type, where output type compilation happens only once. Source: outlines/release_note.md
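The compile-once idea can be shown without any model at all. The sketch below uses a hypothetical `ReusableMatcher` class as a stand-in for a generator: the constraint (here just a regular expression) is built in the constructor, and every later call reuses it. The class and its counter are assumptions for illustration, not part of Outlines.

```python
import re


class ReusableMatcher:
    """Hypothetical stand-in for a generator: compiles its constraint once."""

    compile_calls = 0  # class-level counter, for demonstration only

    def __init__(self, pattern: str):
        ReusableMatcher.compile_calls += 1
        self._regex = re.compile(pattern)  # the expensive step happens here, once

    def __call__(self, candidate: str) -> bool:
        # Each call reuses the already-compiled constraint.
        return self._regex.fullmatch(candidate) is not None


matcher = ReusableMatcher(r"(pizza|burger|tacos)")
print(matcher("pizza"), matcher("sushi"))  # True False
print(ReusableMatcher.compile_calls)       # 1
```

However many times the object is called, the construction cost is paid a single time, which is the property the Generator class is described as providing.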

Async Support

Outlines provides comprehensive async support for both generation and streaming:

Async Model Methods

# Direct model calling with async
from pydantic import BaseModel
from outlines import from_openai
from openai import AsyncOpenAI

class Character(BaseModel):
    name: str

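model = from_openai(AsyncOpenAI(), "gpt-4o")
# `await` needs an async context: an `async def` function run with asyncio.run(), or a notebook cell
result = await model("Create a character", Character)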

Source: outlines/models/base.py

Async Streaming

async for chunk in model.stream("prompt", OutputType):
    print(chunk)

Source: outlines/models/base.py

Batch Processing

# Batch generation
results = await model.batch(["prompt1", "prompt2"], OutputType)

Source: outlines/models/base.py

API Model Integration

OpenAI

from openai import OpenAI
from pydantic import BaseModel
from outlines import from_openai

class Character(BaseModel):
    name: str

model = from_openai(OpenAI(), "gpt-4o")
result = model("Create a character", Character)

Source: outlines/release_note.md

Google Gemini

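from google import genai
from outlines import from_gemini

# Client from the google-genai package; it needs a Gemini API key to be configured
client = genai.Client()
model = from_gemini(client, "gemini-pro")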

Source: outlines/models/gemini.py

Exception Handling

Outlines provides a comprehensive exception hierarchy for error handling:

OutlinesError
├── APIError
│   ├── AuthenticationError
│   ├── PermissionDeniedError
│   ├── NotFoundError
│   ├── RateLimitError
│   ├── BadRequestError
│   ├── ServerError
│   ├── APITimeoutError
│   ├── APIConnectionError
│   └── ProviderResponseError
└── GenerationError

Source: outlines/exceptions.py

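All public exceptions inherit from OutlinesError, which itself derives from Exception; provider-related errors additionally pass through APIError (APIError → OutlinesError → Exception). The normalize_provider_exception function converts raw provider SDK exceptions into the appropriate Outlines type. Source: outlines/exceptions.py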
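The value of a shared hierarchy is that calling code can catch one base class regardless of provider. The sketch below declares local stand-in classes that reuse a few names from the tree above, plus a hypothetical `normalize_by_status` helper that picks a type from an HTTP status code. The helper and its status mapping are assumptions; they do not reproduce the signature or logic of normalize_provider_exception.

```python
# Local stand-ins mirroring part of the documented tree (not imported from outlines).
class OutlinesError(Exception): ...
class APIError(OutlinesError): ...
class AuthenticationError(APIError): ...
class RateLimitError(APIError): ...
class ServerError(APIError): ...
class GenerationError(OutlinesError): ...


# Hypothetical mapping from HTTP status codes to exception types.
_STATUS_TO_ERROR = {401: AuthenticationError, 429: RateLimitError}


def normalize_by_status(status_code: int, message: str) -> APIError:
    """Hypothetical helper: choose an exception type from an HTTP status code."""
    if status_code in _STATUS_TO_ERROR:
        return _STATUS_TO_ERROR[status_code](message)
    if 500 <= status_code < 600:
        return ServerError(message)
    return APIError(message)


try:
    raise normalize_by_status(429, "too many requests")
except APIError as exc:  # one handler covers every provider-side failure
    print(type(exc).__name__)  # RateLimitError
```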

Deployment Example

Outlines can be deployed on various platforms. Here's an example using Modal:

import modal

app = modal.App(name="outlines-app")


outlines_image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "outlines==1.0.0",
    "transformers==4.38.2",
    "datasets==2.18.0",
    "accelerate==0.27.2",
)


def import_model():
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-Instruct-v0.2"
    _ = AutoTokenizer.from_pretrained(model_id)
    _ = AutoModelForCausalLM.from_pretrained(model_id)


outlines_image = outlines_image.run_function(import_model)

Source: examples/modal_example.py

Migration from v0 to v1

Key changes in the v1 API:

| Feature | v0 | v1 |
|---|---|---|
| Model initialization | models.openai("gpt-4o") | from_openai(OpenAI(), "gpt-4o") |
| Generation | generate.json(model, Character) | Generator(model, Character) |
| Direct calling | N/A | model("prompt", OutputType) |
| Streaming | Separate method | model.stream("prompt", OutputType) |

Source: outlines/release_note.md

Deprecated Features

  • Exllamav2 model has been removed due to interface incompatibility
  • function module and Function class replaced by Application
  • load_lora methods on VLLM and LlamaCpp models deprecated in favor of direct initialization parameters
  • TransformersVision replaced by TransformersMultiModal

Source: outlines/release_note.md

Source: https://github.com/dottxt-ai/outlines / Human Manual

Quickstart Guide

Related topics: Introduction to Outlines, Installation, Output Types Overview


This guide provides a comprehensive introduction to Outlines, a structured text generation library for Large Language Models (LLMs). It covers installation, model setup, basic usage patterns, and common workflows to help you get started with guaranteed structured outputs from any LLM.

Overview

Outlines ensures structured outputs during generation—directly from any LLM. Unlike post-processing approaches that parse and validate outputs after generation, Outlines enforces structure constraints at generation time. This eliminates parsing headaches, broken JSON, and fragile regex-based solutions.

Core capabilities:

  • Works with any model (OpenAI, Ollama, vLLM, transformers, and more)
  • Simple integration using Python type annotations
  • Guaranteed valid structure output
  • Provider independence for easy model switching

Source: README.md:1-10

Installation

Install Outlines using pip:

pip install outlines

Source: README.md:31

Optional Dependencies

Depending on your model provider, you may need additional packages:

| Provider | Required Dependencies |
|---|---|
| transformers | transformers, accelerate |
| OpenAI | openai |
| Anthropic | anthropic |
| Gemini | google-genai |
| vLLM | vllm |
| Ollama | ollama |

Connecting to Models

Outlines provides factory functions to create model instances from various providers. The from_transformers function initializes a model using Hugging Face transformers.

import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)

Source: README.md:37-46

Available Model Integrations

Outlines supports multiple model providers through factory functions:

| Model Type | Factory Function | Description |
|---|---|---|
| Local (Transformers) | from_transformers() | Hugging Face transformers models |
| OpenAI | from_openai() | OpenAI API models |
| Anthropic | from_anthropic() | Anthropic Claude models |
| Gemini | from_gemini() | Google Gemini models |
| vLLM | from_vllm() | vLLM server deployment |
| Ollama | from_ollama() | Ollama local server |
| llama.cpp | from_llamacpp() | llama.cpp based models |

Source: llm.txt:1-20

Basic Structured Generation

Simple Classification

Use Literal types for classification tasks with predefined categories:

from typing import Literal

# Simple classification
sentiment = model(
    "Analyze: 'This product completely changed my life!'",
    Literal["Positive", "Negative", "Neutral"]
)
print(sentiment)  # "Positive"

Source: README.md:55-62

Numerical Values

Generate structured numerical outputs by passing Python types:

# Extract numerical values
temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature)  # 100

Source: README.md:64-68

State Flow for Basic Generation

graph TD
    A[User Prompt + Type] --> B[Outlines Model Call]
    B --> C{Output Type}
    C -->|Literal| D[Enum FSM Compilation]
    C -->|int/float| E[Number FSM Compilation]
    C -->|Pydantic| F[JSON Schema FSM Compilation]
    D --> G[Token Masking]
    E --> G
    F --> G
    G --> H[Constrained Generation]
    H --> I[Valid Structured Output]
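The flow in the diagram can be traced end to end with a stub in place of a real model. In the sketch below, `SCORES` is a fixed table of made-up token scores standing in for a model's logits, and `constrained_generate` is a hypothetical greedy loop that masks out any token that cannot lead to one of the `Literal` choices. None of these names come from Outlines.

```python
from typing import Literal, get_args

# Fixed token scores standing in for a model's logits (hypothetical values).
SCORES = {
    "maybe": 9.0, "!": 8.0, "Neg": 3.0, "Pos": 2.0,
    "Neu": 1.0, "ative": 0.5, "itive": 0.4, "tral": 0.3,
}


def constrained_generate(output_type) -> str:
    """Greedy decoding where invalid tokens are masked out at every step."""
    choices = get_args(output_type)
    text = ""
    while text not in choices:
        # Token masking: keep only tokens that extend `text` toward a valid choice.
        allowed = [t for t in SCORES if any(c.startswith(text + t) for c in choices)]
        # Greedy pick among the surviving tokens.
        text += max(allowed, key=SCORES.get)
    return text


print(max(SCORES, key=SCORES.get))  # maybe  (what an unconstrained pick would be)
print(constrained_generate(Literal["Positive", "Negative", "Neutral"]))  # Negative
```

The highest-scoring token overall is never emitted because it is masked before selection, so the result is always one of the declared choices.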

Complex Structures with Pydantic

For complex objects, define a structure using Pydantic models:

from pydantic import BaseModel
from enum import Enum

class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4

class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str

# Generate structured review
review = model(
    "Review a smartphone with great camera but poor battery life",
    ProductReview
)

Source: README.md:70-88

Enum Types

Enums constrain outputs to specific string values:

from enum import Enum
from pydantic import BaseModel

class EventType(str, Enum):
    conference = "conference"
    webinar = "webinar"
    workshop = "workshop"
    meetup = "meetup"
    other = "other"

class EventInfo(BaseModel):
    name: str
    event_type: EventType
    topics: list[str]

Source: README.md:1-50

Prompt Templates

Outlines supports Jinja-based templates for dynamic prompt generation:

# Create a reusable template with Jinja syntax
sentiment_template = outlines.Template.from_string("""
<|im_start|>user
Analyze the sentiment of the following {{ content_type }}:

{{ text }}

Provide your analysis as either "Positive", "Negative", or "Neutral".
<|im_end|>
<|im_start|>assistant
""")

# Generate prompts with different parameters
review = "This restaurant exceeded all my expectations. Fantastic service!"
prompt = sentiment_template(content_type="review", text=review)

# Use with structured generation
result = model(prompt, Literal["Positive", "Negative", "Neutral"])

Source: README.md:1-50

Loading Templates from Files

Templates can be loaded from external files for better organization:

# Load template from file
example_template = outlines.Template.from_file("templates/few_shot.txt")

# Use with examples for few-shot learning
examples = [
    ("The food was cold", "Negative"),
    ("The staff was friendly", "Positive")
]
few_shot_prompt = example_template(examples=examples, query="Service was slow")

Source: README.md:1-50

Handling Incomplete Data with Union Types

Use Union types to handle cases where data might be incomplete:

from typing import Literal, Union

# Create a union type that can either be a structured response or fallback
EventResponse = Union[EventInfo, Literal["I don't know"]]

# Parse event details - returns EventInfo or "I don't know"
result = model(
    "Join us for DevCon 2024 in San Francisco on March 15th",
    EventResponse
)

Source: README.md:1-50
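Downstream code then has to branch on which arm of the union came back. The following sketch assumes the model returns raw text that is either a JSON object or the fallback sentence (possibly wrapped in quotes); both that assumption and the helper `parse_event_response` are illustrative rather than documented behaviour.

```python
import json

FALLBACK = "I don't know"


def parse_event_response(raw: str):
    """Return a dict for structured output, or None for the fallback literal."""
    if raw.strip().strip('"') == FALLBACK:
        return None
    return json.loads(raw)


print(parse_event_response("I don't know"))  # None

structured = '{"name": "DevCon 2024", "event_type": "conference", "topics": []}'
print(parse_event_response(structured)["name"])  # DevCon 2024
```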

Using the Generator Class

The Generator class encapsulates a model with a specific output type, allowing reusable structured generation:

from outlines import Generator, from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
from typing import Literal

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# Create a reusable generator
choice_generator = Generator(model, Literal["pizza", "burger", "tacos"])

# Use the generator multiple times
result1 = choice_generator("What should I eat for lunch?")
result2 = choice_generator("Dinner options?")

Source: outlines/release_note.md:1-50

Generator vs Direct Model Calling

| Aspect | Direct Model Call | Generator |
|---|---|---|
| Output type | Specified per call | Fixed at initialization |
| Compilation | Occurs each call | Occurs once |
| Reusability | Single use | Multiple uses |
| Best for | Varying output types | Consistent output types |

Source: outlines/release_note.md:1-50

Function Calling with Applications

The Application class provides a way to define functions with typed parameters that LLMs can call:

from outlines import Application, Template, from_transformers
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from typing import List, Optional
from datetime import date

# Define a function with typed parameters
def schedule_meeting(
    title: str,
    date: date,
    duration_minutes: int,
    attendees: List[str],
    location: Optional[str] = None,
    agenda_items: Optional[List[str]] = None
):
    """Schedule a meeting with the specified details"""
    meeting = {
        "title": title,
        "date": date,
        "duration_minutes": duration_minutes,
        "attendees": attendees,
        "location": location,
        "agenda_items": agenda_items
    }
    return f"Meeting '{title}' scheduled for {date}"

# Create model and template
model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/phi-4"),
    AutoTokenizer.from_pretrained("microsoft/phi-4")
)

template = Template.from_string("""
Extract meeting details from: {{ request }}
""")

# Create application
app = Application(template, schedule_meeting)

# Natural language request
user_request = """
I need to set up a team sync next Monday at 2pm for 30 minutes.
Include John and Sarah. We'll discuss the Q1 roadmap.
"""

# Execute
result = app(model, {"request": user_request})

Source: README.md:1-50

Application Pattern Workflow

graph TD
    A[Natural Language Request] --> B[Application]
    B --> C[Template Variables]
    C --> D[Structured Prompt]
    D --> E[LLM Generation]
    E --> F[Function Schema]
    F --> G[Parameter Extraction]
    G --> H[Typed Function Call]
    H --> I[Structured Result]
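The "Function Schema", "Parameter Extraction", and "Typed Function Call" steps can be approximated with the standard library alone. In this sketch, `describe_parameters` and `call_with_json` are hypothetical helpers, and the JSON string plays the role of text a model might generate; how Application performs these steps internally is not shown here.

```python
import inspect
import json


def schedule(title: str, duration_minutes: int, location: str = "TBD") -> str:
    return f"{title}: {duration_minutes} min at {location}"


def describe_parameters(func) -> dict:
    """Derive a simple parameter description from a function signature."""
    params = {}
    for name, param in inspect.signature(func).parameters.items():
        params[name] = {
            "type": param.annotation.__name__,
            "required": param.default is inspect.Parameter.empty,
        }
    return params


def call_with_json(func, raw_json: str):
    """Parse generated JSON arguments and call the function with them."""
    kwargs = json.loads(raw_json)
    inspect.signature(func).bind(**kwargs)  # raises TypeError on missing or unknown arguments
    return func(**kwargs)


print(describe_parameters(schedule)["location"])
# {'type': 'str', 'required': False}
print(call_with_json(schedule, '{"title": "Team sync", "duration_minutes": 30}'))
# Team sync: 30 min at TBD
```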

Generation Parameters

Pass additional inference arguments to model calls:

# Beam search for better quality
result = model(
    "Write a short story about a cat",
    str,
    num_beams=2
)

# Streaming responses
for chunk in model.stream("Tell me a joke", str):
    print(chunk, end="", flush=True)

Source: outlines/release_note.md:1-50

Deployment Examples

Modal Deployment

Outlines can be deployed on Modal for serverless inference:

import modal

app = modal.App(name="outlines-app")

outlines_image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "outlines==1.0.0",
    "transformers==4.38.2",
    "datasets==2.18.0",
    "accelerate==0.27.2",
)

def import_model():
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model_id = "mistralai/Mistral-7B-Instruct-v0.2"
    _ = AutoTokenizer.from_pretrained(model_id)
    _ = AutoModelForCausalLM.from_pretrained(model_id)

outlines_image = outlines_image.run_function(import_model)

Source: examples/modal_example.py:1-30

Next Steps

| Topic | Description |
|---|---|
| Installation Guide | Detailed installation instructions for all providers |
| Model Integrations | Complete reference for all supported models |
| Output Types | Deep dive into type system and constraints |
| Templates | Advanced template usage and patterns |
| Applications | Building reusable structured applications |
| Architecture | Understanding the internal design |

Quick Reference

# Complete minimal example
import outlines
from transformers import AutoModelForCausalLM, AutoTokenizer
from pydantic import BaseModel

# 1. Load model
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# 2. Define output structure
class Answer(BaseModel):
    response: str
    confidence: float

# 3. Generate with structure guarantee
result = model("What is the capital of France?", Answer)

This quickstart covers the essential patterns for using Outlines. The library's type-driven approach ensures that outputs always match your specified structure, eliminating the need for fragile post-processing.

Installation

Related topics: Quickstart Guide, Structured Generation Backends


This guide covers all methods for installing Outlines, a structured text generation library for Large Language Models.

Overview

Outlines provides structured output generation for LLMs by ensuring outputs match specified types during generation. The installation process is straightforward via pip, with optional dependencies for specific model backends. Source: README.md:1-10

Prerequisites

Python Version

Outlines requires Python 3.10 or later. Ensure your environment has an appropriate Python installation before proceeding.

Core Dependencies

The following table lists the core dependencies required by Outlines:

| Package | Version | Purpose |
|---|---|---|
| interegular | Latest | FSM compilation for regex-based constraints |
| jinja2 | Latest | Template processing |
| pydantic | 2.x | Data validation and structure definitions |

Installation Methods

Standard Installation (pip)

The simplest way to install Outlines is via pip:

pip install outlines

Source: README.md:15-18

Development Installation

For contributors or those wanting the latest unreleased features, install from source:

git clone https://github.com/dottxt-ai/outlines.git
cd outlines
pip install -e ".[dev]"

Optional Dependencies by Model Backend

Outlines supports multiple model providers. Install backend-specific dependencies based on your use case.

Hugging Face Transformers

For local model inference using Hugging Face Transformers:

pip install outlines[transformers]

Required packages:

| Package | Purpose |
|---|---|
| transformers | Model loading and inference |
| accelerate | GPU acceleration support |
| datasets | Dataset utilities |
| torch | Deep learning framework |

Source: examples/modal_example.py:5-9

import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)

OpenAI Models

For OpenAI API integration:

pip install outlines[openai]

Required packages:

| Package | Purpose |
|---|---|
| openai | Official OpenAI Python client |

Source: README.md:20-35

from openai import OpenAI
from outlines import from_openai
from pydantic import BaseModel

class Character(BaseModel):
    name: str

model = from_openai(OpenAI(), "gpt-4o")
result = model("Create a character", Character)

Anthropic Models

For Anthropic Claude integration:

pip install outlines[anthropic]

Google Gemini Models

pip install outlines[gemini]

Source: outlines/models/gemini.py:80-95

Local Model Backends

For running models locally via vLLM, llama.cpp, or SGLang:

pip install outlines[vllm]   # For vLLM backend
pip install outlines[sglang] # For SGLang backend

Source: llm.txt:30-50

Async Support

For asynchronous inference with async model backends:

pip install outlines[async]

The async backends available are:

| Backend | Class | Purpose |
|---|---|---|
| AsyncSGLang | AsyncSGLang | Async SGLang inference |
| AsyncTGI | AsyncTGI | Async Text Generation Inference |
| AsyncVLLM | AsyncVLLM | Async vLLM inference |

Source: release_note.md:45-60

import outlines
from huggingface_hub import AsyncInferenceClient

async_model = outlines.from_tgi(AsyncInferenceClient("http://localhost:11434"))

Quick Start After Installation

Once installed, you can begin using Outlines immediately:

import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load a model
MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)

# Generate structured output
from typing import Literal

sentiment = model(
    "Analyze: 'This product completely changed my life!'",
    Literal["Positive", "Negative", "Neutral"]
)

Source: README.md:40-60

GPU Setup

For optimal performance with local models, configure GPU acceleration:

# Device mapping for multi-GPU setups
model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, 
        device_map="auto"
    ),
    AutoTokenizer.from_pretrained(MODEL_NAME)
)

The device_map="auto" parameter enables automatic GPU allocation across available devices.

Verifying Installation

Verify your installation by running:

import outlines
print(outlines.__version__)  # Check installed version
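If importing the package fails outright, the check above cannot run. A more defensive sketch, using only the standard library, queries the installed distribution metadata and probes which optional backend modules are importable. The helper names and the default module list are assumptions chosen for illustration.

```python
from importlib import metadata, util


def installed_version(package: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def available_backends(modules=("transformers", "openai", "vllm", "ollama")):
    """List which optional backend modules can be found in this environment."""
    return [name for name in modules if util.find_spec(name) is not None]


print("outlines:", installed_version("outlines"))
print("backends:", available_backends())
```

A result of `None` for `outlines` means the package is not installed in the active environment, which often points to a wrong virtual environment rather than a broken install.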

Common Installation Issues

Missing Dependencies

If you encounter import errors, ensure all required dependencies are installed for your specific use case. Reinstall with the appropriate extras:

pip install --upgrade outlines[<backend>]

CUDA/GPU Compatibility

For CUDA support with transformers, ensure accelerate is installed:

pip install accelerate

Version Conflicts

If upgrading from v0.x to v1.x, note the following breaking changes:

| v0.x | v1.x |
|---|---|
| models.openai("gpt-4o") | from_openai(OpenAI(), "gpt-4o") |
| generate.json(model, schema) | model(prompt, schema) |
| Function class | Application class |

Source: release_note.md:100-130

Next Steps

After installation, explore these topics:

Migration Guide

Related topics: Introduction to Outlines, Quickstart Guide


This guide provides comprehensive documentation for migrating from Outlines v0 to v1. The v1 release introduces significant API changes that improve consistency, usability, and maintainability while preserving the core functionality of structured text generation with LLMs.

Overview

Outlines v1 represents a major evolution of the library's architecture, focusing on:

  • Unified Model Interface: All model providers now share a consistent calling pattern
  • Simplified Output Type Handling: The Generator class replaces multiple specialized generate functions
  • Type-driven API: Python types remain the primary interface for specifying constraints
  • Enhanced Streaming: All models now support streaming as a first-class feature

Source: outlines/release_note.md:1-50

High-Level Architecture Changes

The following diagram illustrates the architectural changes between v0 and v1:

graph TD
    subgraph v0_Architecture
        A0[User Code] --> B0[models]
        B0 --> C0[generate.json/choice/...]
        C0 --> D0[Generator with fixed output type]
    end
    
    subgraph v1_Architecture
        A1[User Code] --> B1[from_transformers/from_openai/...]
        B1 --> C1[Model Instance]
        A1 --> D1[Generator Model, OutputType]
        D1 --> E1[Reusable Generator]
    end
    
    style v0_Architecture fill:#ffcccc
    style v1_Architecture fill:#ccffcc

Migration by Component

Model Initialization

#### Transformers Models

| Aspect | v0 | v1 |
|---|---|---|
| Entry point | models.transformers() | outlines.from_transformers() |
| Model loading | Inline with Outlines initialization | Separately via HuggingFace |
| Tokenizer | Passed to Outlines | Explicitly loaded and passed |
| Configuration | Scattered across model_kwargs | Standard HuggingFace initialization |

v0 (Deprecated):

from outlines import models
from transformers import BertForSequenceClassification, BertTokenizer

model = models.transformers(
    model_name="prajjwal1/bert-tiny",
    model_class=BertForSequenceClassification,
    tokenizer_class=BertTokenizer,
    model_kwargs={"use_cache": False},
    tokenizer_kwargs={"model_max_length": 512},
)

v1 (Current):

import outlines
from transformers import BertForSequenceClassification, BertTokenizer

hf_model = BertForSequenceClassification.from_pretrained(
    "prajjwal1/bert-tiny", 
    use_cache=False
)
hf_tokenizer = BertTokenizer.from_pretrained(
    "prajjwal1/bert-tiny", 
    model_max_length=512
)
model = outlines.from_transformers(hf_model, hf_tokenizer)

Source: outlines/release_note.md:45-60

#### OpenAI Models

| Aspect | v0 | v1 |
|---|---|---|
| Init signature | OpenAI(client, OpenAIConfig()) | OpenAI(client, model_name) |
| Inference args | In OpenAIConfig | In model call |
| Recommendation | Direct initialization | Use from_openai() |

v0 (Deprecated):

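from outlines import generate, models

# `config` is a v0 OpenAIConfig instance holding the inference arguments
model = models.openai("gpt-4o", config)
generator = generate.json(model, Character)
result = generator("Create a character")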

v1 (Current):

from openai import OpenAI
from outlines import from_openai

client = OpenAI()
model = from_openai(client, "gpt-4o")
result = model("Create a character", Character)

Source: outlines/release_note.md:65-80

Generation API Changes

#### Generator Class Introduction

The Generator class provides a reusable interface where the output type is compiled only once:

classDiagram
    class Generator {
        +model: Model
        +output_type: OutputType
        +__init__(model, output_type)
        +__call__(prompt, **kwargs) Any
        +stream(prompt, **kwargs) Iterator
    }
    
    class Model {
        <<interface>>
        +__call__(prompt, output_type, **kwargs) Any
        +stream(prompt, output_type, **kwargs) Iterator
    }
    
    class OutputType {
        <<union>>
    }
    
    Generator --> Model : uses
    Generator --> OutputType : compiles

Usage Pattern:

from outlines import Generator, from_transformers
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

class Character(BaseModel):
    name: str
    age: int

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# Create a reusable generator
generator = Generator(model, Character)

# Use multiple times without recompiling the output type
result1 = generator("Create a hero character")
result2 = generator("Create a villain character")

Source: outlines/release_note.md:25-45

#### Direct Model Calling

All models can now be called directly with a prompt and output type:

from typing import Literal
from outlines import from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# Direct calling with output type
result = model("Pizza or burger", Literal["pizza", "burger"])

# Streaming support
for chunk in model.stream("Tell me a story", str):
    print(chunk, end="", flush=True)

Source: outlines/release_note.md:18-25

Function to Application Migration

The Function class has been deprecated in favor of the Application class:

| Aspect | v0 (Function) | v1 (Application) |
| --- | --- | --- |
| Model binding | At initialization | At call time |
| Template variables | As **kwargs | As dictionary |
| Reusability | Single model/output type | Multiple models supported |

v0 (Deprecated):

from pydantic import BaseModel
from outlines import Function, Template

class Character(BaseModel):
    name: str

template = Template.from_string("Create a {{ gender }} character.")
fn = Function(template, Character, "hf-internal-testing/tiny-random-GPTJForCausalLM")
response = fn(gender="female")

v1 (Current):

from pydantic import BaseModel
from outlines import Application, Template, from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

class Character(BaseModel):
    name: str

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

template = Template.from_string("Create a {{ gender }} character.")
app = Application(template, Character)
response = app(model, {"gender": "female"})

Source: outlines/release_note.md:80-95
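
The call-time binding described in the table can be illustrated without any model backend. The sketch below is a minimal stand-in, not the Outlines implementation: `MiniTemplate`, `MiniApplication`, and `echo_model` are hypothetical names invented for illustration, and the placeholder substitution handles only the simple `{{ name }}` form rather than full Jinja syntax.

```python
import re


class MiniTemplate:
    """Tiny stand-in for a Jinja-style template (handles only {{ name }} placeholders)."""

    def __init__(self, source: str):
        self.source = source

    def render(self, variables: dict) -> str:
        # Replace each {{ name }} with the matching value from the dictionary.
        return re.sub(
            r"\{\{\s*(\w+)\s*\}\}",
            lambda match: str(variables[match.group(1)]),
            self.source,
        )


class MiniApplication:
    """Binds a template and an output type; the model is supplied at call time."""

    def __init__(self, template: MiniTemplate, output_type):
        self.template = template
        self.output_type = output_type

    def __call__(self, model, variables: dict):
        prompt = self.template.render(variables)
        return model(prompt, self.output_type)


def echo_model(prompt: str, output_type) -> str:
    """Fake model that echoes the prompt it receives instead of generating text."""
    return f"[{output_type.__name__}] {prompt}"


app = MiniApplication(MiniTemplate("Create a {{ gender }} character."), str)
print(app(echo_model, {"gender": "female"}))  # [str] Create a female character.
```

Because the model is an argument of the call rather than of the constructor, the same `app` object can be reused with different models.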

Generate Module Deprecation

The generate module and its specialized functions have been consolidated:

| v0 Function | v1 Equivalent |
| --- | --- |
| generate.json() | Generator(model, PydanticModel) |
| generate.choice() | Generator(model, Literal["A", "B"]) |
| generate.regex() | Generator(model, Regex(pattern)) |

v0 (Deprecated):

from pydantic import BaseModel
from outlines import generate, models

class Character(BaseModel):
    name: str

model = models.openai("gpt-4o")
generator = generate.json(model, Character)
result = generator("Create a character")

v1 (Current):

from openai import OpenAI
from pydantic import BaseModel
from outlines import Generator, from_openai

class Character(BaseModel):
    name: str

client = OpenAI()
model = from_openai(client, "gpt-4o")
generator = Generator(model, Character)
result = generator("Create a character")

Source: outlines/release_note.md:95-110

Async Model Support

v1 introduces new async model providers for asynchronous inference:

| Model | Factory Function | Description |
| --- | --- | --- |
| AsyncSGLang | from_sglang() | SGLang async backend |
| AsyncTGI | from_tgi() | Text Generation Inference |
| AsyncVLLM | from_vllm() | vLLM async backend |

Usage:

import outlines
from huggingface_hub import AsyncInferenceClient

async_model = outlines.from_tgi(AsyncInferenceClient("http://localhost:11434"))

Source: outlines/release_note.md:10-18

Deprecated Features

Exllamav2 Model

The Exllamav2 model has been deprecated without replacement:

  • Reason: Interface incompatibility with Outlines' constraint system
  • Impact: Required cumbersome runtime patching
  • Action: Migrate to supported local inference backends (transformers, llama.cpp, vLLM)

Quick Reference

Import Changes

| Old Import | New Import |
| --- | --- |
| from outlines import models | from outlines import from_transformers, from_openai, ... |
| from outlines import generate | from outlines import Generator |
| from outlines import Function | from outlines import Application |

Common Migration Patterns

# JSON output - v0 to v1
# v0: generate.json(model, MySchema)
# v1: Generator(model, MySchema)

# Choice selection - v0 to v1
# v0: generate.choice(model, ["option1", "option2"])
# v1: model(prompt, Literal["option1", "option2"])

# Streaming - v0 to v1
# v0: generator = generate.json(model, Schema); result = generator.stream(prompt)
# v1: for chunk in model.stream(prompt, Schema): process(chunk)

Documentation References

For additional information, consult the project repository:

Source: https://github.com/dottxt-ai/outlines / Human Manual

System Architecture

Related topics: Structured Generation Backends, Core Concepts

Overview

Outlines is a structured text generation library for Large Language Models (LLMs). Its core architectural purpose is to guarantee that LLM outputs conform to developer-specified schemas and constraints during generation, rather than attempting to parse and fix invalid outputs after generation.

The architecture achieves this through a multi-layered system that compiles output type specifications into finite state machines (FSMs), which are then used to constrain token selection at the logits processing level. This approach provides provider independence, allowing the same constraint system to work across both local models (where logits can be directly controlled) and API-based models (where structured output APIs are leveraged when available).

Source: llm.txt

Layer Stack Architecture

Outlines follows a strict layered architecture where each layer has a specific responsibility and communicates with adjacent layers through well-defined interfaces.

graph TD
    A["User API<br/>(outlines.models)"] --> B["Generator Classes<br/>(SteerableGenerator, BlackBoxGenerator)"]
    B --> C["Type System<br/>(types/dsl.py)"]
    C --> D["FSM Compilation<br/>(outlines-core: regex → FSM)"]
    D --> E["Guide System<br/>(processors/guide.py)"]
    E --> F["Logits Processing<br/>(processors/structured.py)"]
    F --> G["Model Providers<br/>(transformers, OpenAI, etc.)"]
    
    style A fill:#e1f5ff
    style G fill:#fff3e1
    style C fill:#e8f5e9
    style D fill:#f3e5f5

Layer Descriptions

| Layer | File Location | Purpose |
| --- | --- | --- |
| User API | outlines/models/ | Entry point for model initialization and generation calls |
| Generator Classes | outlines/generator.py | Manages reusable generation with cached FSM compilation |
| Type System | outlines/types/dsl.py | Converts Python types and Pydantic models to JSON Schema and Regex |
| FSM Compilation | outlines-core | Transforms regex patterns into finite state machines via interegular |
| Guide System | processors/guide.py | Manages FSM state transitions during token generation |
| Logits Processing | processors/structured.py | Masks invalid tokens by modifying logits before sampling |
| Model Providers | outlines/models/*.py | Provider-specific adapters for different LLM backends |

Source: llm.txt

Core Components

Model Classes

The model layer defines two abstract base classes that reflect the fundamental difference in how Outlines interacts with LLM backends.

#### SteerableModel

SteerableModel is the base class for models where Outlines has direct control over the sampling process. This includes:

  • Local models via the Transformers backend
  • llama.cpp-based models via the Llamacpp backend
  • MLX-accelerated models via the MlxLm backend

For steerable models, Outlines can apply logits processors that mask invalid tokens based on the compiled FSM, ensuring constraint satisfaction at generation time.

Source: outlines/models/base.py:1-50

#### BlackBoxModel

BlackBoxModel is the base class for API-based models where the model provider controls the generation process. This includes:

  • OpenAI models
  • Anthropic models
  • Google Gemini models
  • Ollama (when used as an API)

For black box models, Outlines leverages provider-specific structured output APIs when available, or falls back to prompting strategies. The constraint system cannot directly mask tokens, so the approach adapts based on provider capabilities.

Source: llm.txt

Generator System

The Generator class provides a reusable abstraction for generation that encapsulates both the model and output type specification.

from outlines import Generator, from_transformers
from pydantic import BaseModel

class Character(BaseModel):
    name: str
    age: int

model = from_transformers(...)
generator = Generator(model, Character)

# FSM compilation happens once
result = generator("Create a character")

Key benefits of the Generator abstraction:

  1. Lazy Compilation: FSMs are compiled on first use and cached persistently
  2. Reusability: The same generator can be called multiple times without re-specifying the output type
  3. Separation of Concerns: Model configuration and output type specification are decoupled

Source: outlines/generator.py Source: outlines/release_note.md

Async Model Support

All model classes inherit from AsyncModelMixin, providing consistent async interfaces across all providers:

async def __call__(self, model_input, output_type=None, backend=None, **inference_kwargs)
async def batch(self, model_inputs, output_type=None, backend=None, **inference_kwargs)
async def stream(self, model_input, output_type=None, backend=None, **inference_kwargs)

Source: outlines/models/base.py

Provider Abstraction

The architecture uses a factory pattern with provider-specific adapter classes that handle input and output format conversion.

graph LR
    A[User Code] --> B["from_transformers() / from_openai() / etc."]
    B --> C["Model Instance<br/>(Transformers / OpenAI / etc.)"]
    C --> D["Provider Adapter"]
    D --> E["Generation Method"]
    
    style B fill:#e8f5e9

Supported Providers

| Provider | Factory Function | Model Class | Control Type |
| --- | --- | --- | --- |
| Hugging Face Transformers | from_transformers() | Transformers | Steerable |
| OpenAI | from_openai() | OpenAI | BlackBox |
| Anthropic | from_anthropic() | Anthropic | BlackBox |
| Google Gemini | from_gemini() | Gemini | BlackBox |
| Ollama | from_ollama() | Ollama | BlackBox |
| llama.cpp | from_llamacpp() | Llamacpp | Steerable |
| MLX-LM | from_mlxlm() | MlxLm | Steerable |
| vLLM | from_vllm() | VLLM | Steerable |
| SGLang | from_sglang() | SGLang | Steerable |

Source: outlines/models/gemini.py

FSM Compilation Pipeline

The FSM (Finite State Machine) compilation is the core mechanism that enables structured generation for steerable models.

graph LR
    A["Python Type<br/>(Pydantic, Literal, etc.)"] --> B["JSON Schema"]
    B --> C["Regex Pattern"]
    C --> D["FSM<br/>(via interegular)"]
    D --> E["Token Mask"]
    
    style A fill:#e1f5ff
    style E fill:#fff3e1

Type System Transformations

The type system in outlines/types/dsl.py handles the conversion pipeline:

  1. Python Types → JSON Schema: Pydantic models and Python types are converted to JSON Schema
  2. JSON Schema → Regex: Complex types are converted to regex patterns
  3. Regex → FSM: The outlines-core library (using interegular) compiles regex to finite state machines

Source: llm.txt
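
For the simplest case, the step from a Python type to a regex can be sketched with the standard library alone. The helper `literal_to_regex` below is a hypothetical name used for illustration and is not part of Outlines; the real conversion in outlines/types/dsl.py covers many more types and goes through JSON Schema for Pydantic models.

```python
import re
from typing import Literal, get_args


def literal_to_regex(literal_type) -> str:
    """Build a regex alternation that matches exactly the Literal's values."""
    choices = get_args(literal_type)
    # Escape each choice so characters such as "." or "?" are matched literally.
    return "(" + "|".join(re.escape(str(choice)) for choice in choices) + ")"


pattern = literal_to_regex(Literal["Yes", "No"])
print(pattern)  # (Yes|No)
print(bool(re.fullmatch(pattern, "Yes")))    # True
print(bool(re.fullmatch(pattern, "Maybe")))  # False
```

Once a type has been reduced to a regex in this way, the remaining step (regex to FSM) no longer depends on the original Python type.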

Key Design Decisions

1. FSM-Based Constraints

For local models where logits are accessible, constraints compile to finite state machines that track valid next tokens. The FSM maintains a current state and can determine, for any given state, which tokens are valid next tokens.

This approach provides:

  • Complete coverage: All valid continuations are allowed, all invalid are blocked
  • Efficiency: State transitions are O(1) lookup
  • Correctness: Guarantees well-formed outputs matching the schema

Source: llm.txt
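
The lookup behaviour described above can be pictured with a hand-written FSM stored as nested dictionaries. This is a toy sketch: the state names, the three-token vocabulary, and the helper functions are assumptions made for illustration and do not reflect the data structures used by outlines-core.

```python
# Toy FSM for the pattern (Yes|No), expressed directly over a tiny token vocabulary.
# transitions[state][token] -> next state; a missing entry means the token is invalid.
transitions = {
    "start": {"Yes": "done", "No": "done"},
    "done": {"<eos>": "final"},
    "final": {},
}


def valid_tokens(state: str) -> set[str]:
    """Return the tokens allowed in `state` (a single dictionary lookup)."""
    return set(transitions[state])


def step(state: str, token: str) -> str:
    """Advance the FSM; raises KeyError if the token is not allowed."""
    return transitions[state][token]


state = "start"
for token in ["No", "<eos>"]:
    assert token in valid_tokens(state)
    state = step(state, token)
print(state)  # final
```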

2. Token-Level Control

Constraints apply at the token level, not the character level. This is critical because LLMs generate text token-by-token, and constraining at the character level would be both inefficient and potentially incorrect.

graph TD
    A["Token 1: 'Hello'"] --> B["Token 2: 'World'"]
    B --> C["Token 3: '!'"]
    
    subgraph FSM_State
        D["Current State: q3"]
        E["Valid Tokens: [END, '!', '.']"]
    end
    
    D --> E

Source: llm.txt
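
One way to see why token-level control matters is to derive, from a character-level automaton, which multi-character tokens are valid in each state. The sketch below does this for a pattern of exactly two digits; the automaton, the vocabulary, and the function names are hypothetical and chosen only to keep the example small.

```python
def char_step(state: int, ch: str):
    """Character-level DFA for exactly two digits: 0 -> 1 -> 2 (accepting)."""
    if ch.isdigit() and state < 2:
        return state + 1
    return None  # no transition: the character is not allowed here


def walk_token(state: int, token: str):
    """Feed every character of a multi-character token through the DFA."""
    for ch in token:
        state = char_step(state, ch)
        if state is None:
            return None
    return state


def build_index(vocabulary, states):
    """For each state, map every valid token to the state it leads to."""
    index = {}
    for state in states:
        index[state] = {}
        for token in vocabulary:
            target = walk_token(state, token)
            if target is not None:
                index[state][token] = target
    return index


vocabulary = ["1", "2", "12", "123", "a"]
index = build_index(vocabulary, states=range(3))
print(index[0])  # {'1': 1, '2': 1, '12': 2}
print(index[1])  # {'1': 2, '2': 2}
print(index[2])  # {}
```

The token "12" is valid from state 0 but not from state 1, which a purely character-by-character check on the first character would not reveal.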

3. Lazy Compilation

FSM compilation is deferred until first use and the resulting FSM is cached persistently. This avoids expensive compilation overhead on module import or model loading, and allows the system to handle dynamic type specifications efficiently.

Source: llm.txt
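
The deferral can be sketched with an in-process cache. This is an illustration under stated assumptions: `compile_constraint` and `LazyGenerator` are hypothetical names, the "compiled FSM" is a placeholder tuple, and `functools.lru_cache` only caches in memory, whereas the text above describes a persistent cache.

```python
from functools import lru_cache

compile_calls = 0  # counts how often the expensive step actually runs


@lru_cache(maxsize=None)
def compile_constraint(pattern: str) -> tuple:
    """Stand-in for an expensive regex-to-FSM compilation step."""
    global compile_calls
    compile_calls += 1
    return ("fsm", pattern)  # placeholder for a compiled FSM


class LazyGenerator:
    """Hypothetical generator that defers compilation until the first call."""

    def __init__(self, pattern: str):
        self.pattern = pattern  # nothing is compiled here

    def __call__(self, prompt: str) -> tuple:
        fsm = compile_constraint(self.pattern)  # compiled once, then served from cache
        return (prompt, fsm)


generator = LazyGenerator("(Yes|No)")
calls_after_init = compile_calls  # construction did not trigger compilation
generator("first prompt")
generator("second prompt")
print(calls_after_init, compile_calls)  # 0 1
```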

4. Type-Driven API

Python types are the primary interface for specifying constraints, aligning with how developers already specify data structures in Python code.

from pydantic import BaseModel
from typing import Literal

class Review(BaseModel):
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float

# `model` is any Outlines model, e.g. one created with from_openai()
result = model("I love this product!", Review)

Source: outlines/release_note.md

Exception Handling Architecture

Outlines defines a hierarchical exception system for consistent error handling across providers.

graph TD
    A["Exception"] --> B["OutlinesError"]
    B --> C["APIError"]
    
    C --> D["AuthenticationError"]
    C --> E["PermissionDeniedError"]
    C --> F["NotFoundError"]
    C --> G["RateLimitError"]
    C --> H["BadRequestError"]
    C --> I["ServerError"]
    
    B --> J["ProviderResponseError"]
    B --> K["GenerationError"]

All public exceptions inherit from OutlinesError, and provider API errors additionally inherit from APIError. The normalize_provider_exception function converts raw provider SDK exceptions into the appropriate Outlines exception types.

Source: outlines/exceptions.py
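
A hierarchy like this lets callers handle errors at whatever level of detail they need. The sketch below uses local stand-in classes that mirror a few names from the diagram; they are defined here for illustration and are not imported from outlines/exceptions.py, and `call_provider` and `describe` are hypothetical helpers.

```python
class OutlinesError(Exception):
    """Local stand-in for the base class shown in the diagram."""


class APIError(OutlinesError):
    """Stand-in for provider API errors."""


class RateLimitError(APIError):
    """Stand-in for a specific provider error."""


class GenerationError(OutlinesError):
    """Stand-in for a non-API error."""


def call_provider(fail_with=None) -> str:
    """Fake provider call that raises the given exception, if any."""
    if fail_with is not None:
        raise fail_with
    return "ok"


def describe(fail_with=None) -> str:
    """Handle errors from most specific to most general."""
    try:
        return call_provider(fail_with)
    except RateLimitError:
        return "retry later"
    except APIError:
        return "provider error"
    except OutlinesError:
        return "outlines error"


print(describe())                         # ok
print(describe(RateLimitError("429")))    # retry later
print(describe(APIError("500")))          # provider error
print(describe(GenerationError("oops")))  # outlines error
```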

Generation Workflow

The following diagram illustrates the complete generation workflow for a structured output request:

sequenceDiagram
    participant User
    participant Model
    participant Generator
    participant TypeSystem
    participant FSM
    participant Guide
    participant LogitsProcessor
    
    User->>Model: model(prompt, OutputType)
    Model->>Generator: create Generator(model, OutputType)
    Generator->>TypeSystem: convert(OutputType)
    TypeSystem->>FSM: compile to FSM
    Note over FSM: FSM cached after first use
    
    loop For each token
        Model->>LogitsProcessor: logits
        LogitsProcessor->>Guide: get valid tokens
        Guide->>LogitsProcessor: token mask
        LogitsProcessor->>Model: masked logits
        Model->>Guide: next token
        Guide->>FSM: transition state
    end
    
    Generator->>Model: final output
    Model-->>User: structured result

Source: outlines/generator.py Source: outlines/models/base.py
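
The per-token loop in the diagram can be sketched end to end with a fake model. Everything in this example is an assumption made for illustration: the four-token vocabulary, the hand-written FSM, the fixed logits, and greedy selection stand in for the real model, compiled FSM, and sampler.

```python
import math

vocabulary = ["Yes", "No", "Maybe", "<eos>"]

# Token-level FSM for the output type Literal["Yes", "No"] (toy, hand-written).
transitions = {
    0: {"Yes": 1, "No": 1},
    1: {"<eos>": 2},
}
FINAL_STATE = 2


def fake_model_logits(generated: list[str]) -> list[float]:
    """Stand-in for a language model that always prefers the invalid token 'Maybe'."""
    return [1.0, 2.0, 5.0, 0.5]


def generate() -> list[str]:
    state, generated = 0, []
    while state != FINAL_STATE:
        logits = fake_model_logits(generated)
        allowed = transitions[state]
        # Logits processing: mask every token the FSM does not allow in this state.
        masked = [x if tok in allowed else -math.inf for tok, x in zip(vocabulary, logits)]
        # Greedy selection over the masked logits.
        token = vocabulary[masked.index(max(masked))]
        generated.append(token)
        # Guide: advance the FSM with the chosen token.
        state = allowed[token]
    return generated


print(generate())  # ['No', '<eos>']
```

Even though the fake model assigns its highest logit to "Maybe", that token is masked in every state, so the output can only be one of the allowed choices.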

Backend System

The backend system abstracts the structured generation engine that compiles an output type into a logits processor for steerable models.

# Backend selection via generator
generator = Generator(model, OutputType, backend="outlines_core")

# Or via direct model call
result = model(prompt, OutputType, backend="xgrammar")

Available backends for steerable models include:

| Backend | Description |
| --- | --- |
| outlines_core | Default backend using the interegular library for FSM compilation |
| xgrammar | Optimized backend using the xgrammar library for faster compilation |
| llguidance | Specialized backend using llguidance for high-performance generation |

Source: outlines/backends/__init__.py

Version 1 Interface Changes

Outlines v1 introduced significant architectural changes to the model interface:

Before (v0)

from outlines import generate, models

model = models.openai("gpt-4o")
generator = generate.json(model, Character)
result = generator("Create a character")

After (v1)

from openai import OpenAI
from outlines import from_openai

# Character is a Pydantic model, as in the earlier examples
model = from_openai(OpenAI(), "gpt-4o")
result = model("Create a character", Character)

Key changes:

  • Models can now be called directly with prompt and output type
  • All models have a stream() method callable by users
  • Generator class provides reusable generation with caching
  • Application class replaces deprecated Function class for templated generation

Source: outlines/release_note.md

Documentation Architecture

The project uses MkDocs with automatic API reference generation:

graph TD
    A["scripts/gen_ref_pages.py"] --> B["mkdocs.yml"]
    B --> C["mkdocs-gen-files"]
    C --> D["API Reference Pages"]
    
    subgraph "Documentation Structure"
        E["docs/guide/"] --> F["User Guides"]
        G["docs/features/"] --> H["Feature Documentation"]
        I["api_reference/"] --> J["Auto-generated API Docs"]
    end

Source: scripts/gen_ref_pages.py Source: mkdocs.yml

Core Concepts

Related topics: System Architecture, Output Types Overview, Structured Generation Backends

Outlines is a structured text generation library that guarantees type-safe, constrained outputs from Large Language Models (LLMs). Rather than post-processing model outputs with fragile parsing logic, Outlines integrates constraint enforcement directly into the generation process, ensuring outputs conform to specified structures from the moment generation begins.

Architecture Overview

Outlines employs a layered architecture that transforms high-level type specifications into low-level token masking operations. The system bridges the gap between Python type annotations and the token-level mechanics of language model inference.

graph TD
    A[User API<br/>outlines.models] --> B[Generator Classes<br/>SteerableGenerator<br/>BlackBoxGenerator]
    B --> C[Type System<br/>Pydantic → JsonSchema → Regex]
    C --> D[FSM Compilation<br/>outlines-core<br/>regex → FSM via interegular]
    D --> E[Guide System<br/>processors/guide.py<br/>FSM state management]
    E --> F[Logits Processing<br/>processors/structured.py<br/>token masking]
    F --> G[Model Providers<br/>transformers<br/>OpenAI<br/>Anthropic<br/>etc.]

Source: llm.txt:layer-stack

Layer Stack

1. User API Layer (`outlines.models`)

The topmost layer provides the primary interface for developers. Users instantiate a model using provider-specific functions and call it directly with prompts and output types.

Source: llm.txt:architecture

2. Generator Classes

Two generator abstractions handle different model categories:

| Generator Class | Model Type | Control Level | Constraint Application |
| --- | --- | --- | --- |
| SteerableGenerator | Local models (transformers, llama.cpp) | Full logits control | FSM-based token masking |
| BlackBoxGenerator | API models (OpenAI, Anthropic) | API-level constraints | Provider's native structured output |

Source: llm.txt:key-design-decisions

3. Type System (`types/dsl.py`)

The type system transforms Python types into machine-processable constraints:

graph LR
    A[Pydantic Models<br/>BaseModel] --> B[JSON Schema]
    B --> C[Regex Pattern]
    C --> D[Finite State Machine]

Source: llm.txt:layer-stack

4. FSM Compilation (`outlines-core`)

The compilation layer converts regex patterns into finite state machines using the interegular library. This transformation enables efficient constraint checking at token boundaries.

Source: llm.txt:layer-stack

5. Guide System (`processors/guide.py`)

The guide system manages FSM state transitions during generation. It tracks which states are valid given the current token sequence and determines allowable next tokens.

Source: llm.txt:layer-stack

6. Logits Processing (`processors/structured.py`)

For steerable models, this layer applies token masking by setting the logits of invalid tokens to negative infinity, so that they receive zero probability and cannot be selected during sampling.

Source: llm.txt:layer-stack
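
The effect of masking on the sampling distribution can be checked with a few lines of plain Python. This is a minimal sketch, not the code in processors/structured.py: `mask_logits` and `softmax` are hypothetical helpers, and the logits and allowed token ids are made-up values.

```python
import math


def mask_logits(logits: list[float], allowed: set[int]) -> list[float]:
    """Keep logits of allowed token ids; set all others to negative infinity."""
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]


def softmax(logits: list[float]) -> list[float]:
    """Convert logits to probabilities; exp(-inf) evaluates to 0.0."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5, 3.0]
masked = mask_logits(logits, allowed={0, 2})
probs = softmax(masked)
print(probs[1], probs[3])  # 0.0 0.0
```

Token 3 has the highest raw logit, yet after masking its probability is exactly zero, and the remaining probability mass is renormalized over the allowed tokens.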

Key Design Decisions

FSM-Based Constraints

For local models where Outlines controls the sampling process, constraints compile to finite state machines. These FSMs track valid next tokens at each generation step, enabling efficient constraint enforcement without enumerating all possible sequences.

Source: llm.txt:key-design-decisions

Provider Abstraction

The same constraint system works across different model providers:

  • Local models: Outlines controls sampling, applying FSM-based masking
  • API models: Outlines uses provider-native structured output support when available, or falls back to completion with validation

Source: llm.txt:key-design-decisions

Lazy Compilation

FSM compilation occurs on first use and results are cached persistently. This approach avoids upfront compilation overhead while ensuring subsequent generations with the same type are fast.

Source: llm.txt:key-design-decisions

Token-Level Control

Constraints apply at the token level rather than the character level. This design choice ensures that the constraint system works correctly with subword tokenization schemes used by modern language models.

Source: llm.txt:key-design-decisions

Type-Driven API

Python types serve as the primary interface for specifying output constraints. This design choice provides:

  • Familiar syntax for Python developers
  • Static type checking support
  • Integration with Pydantic for complex validation
  • Support for Literal types, enums, and nested structures

Source: README.md:philosophy

Model Integration

Base Model Architecture

The Model base class defines the contract for all provider implementations. Concrete implementations inherit from this base and implement provider-specific input/output handling.

# Simplified base class structure
from abc import ABC, abstractmethod

class Model(ABC):
    @abstractmethod
    def __call__(self, prompt, output_type):
        pass
    
    @abstractmethod
    def stream(self, prompt, output_type):
        pass

Source: outlines/models/base.py

Supported Providers

Outlines provides integrations for multiple model providers:

| Provider | Function | Control Type |
| --- | --- | --- |
| OpenAI | from_openai() | BlackBox |
| Anthropic | from_anthropic() | BlackBox |
| Google Gemini | from_gemini() | BlackBox |
| Transformers | from_transformers() | Steerable |
| Ollama | from_ollama() | Steerable/BlackBox |
| vLLM | from_vllm() | Steerable |
| SGLang | from_sglang() | Steerable |
| Llama.cpp | from_llamacpp() | Steerable |

Source: mkdocs.yml:navigation

Gemini Integration Example

from google.genai import Client
from outlines import from_gemini

client = Client()  # google.genai.Client
model = from_gemini(client, model_name="gemini-pro")
result = model("What is 2 + 2?", int)  # constrained to an integer, e.g. 4

Source: outlines/models/gemini.py

Error Handling

Exception Hierarchy

Outlines defines a comprehensive exception hierarchy for structured error handling:

OutlinesError (base)
├── APIError (provider API errors)
│   ├── AuthenticationError
│   ├── PermissionDeniedError
│   ├── NotFoundError
│   ├── RateLimitError
│   ├── BadRequestError
│   └── ServerError
├── APITimeoutError
├── APIConnectionError
├── ProviderResponseError
└── GenerationError

Source: outlines/exceptions.py:outlines-exception-hierarchy

Exception Normalization

The normalize_provider_exception function converts raw provider SDK exceptions into appropriate Outlines types, preserving original exceptions for debugging:

def normalize_provider_exception(
    exception: Exception, 
    provider: Optional[str] = None
) -> OutlinesError

Source: outlines/exceptions.py

Output Types

Basic Python Types

Outlines supports primitive Python types as output specifications:

| Type | Generated Output |
| --- | --- |
| int | Integer numbers |
| float | Decimal numbers |
| bool | True/False |
| str | Arbitrary strings |

Source: README.md:quickstart

Literal Types

For constrained choices, Literal types specify exact valid outputs:

from typing import Literal

result = model("Is this positive or negative?", Literal["Positive", "Negative", "Neutral"])

Source: README.md:philosophy

Pydantic Models

Complex nested structures use Pydantic for specification:

from pydantic import BaseModel
from enum import Enum

class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4

class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]

Source: README.md:complex-structures

Regex Patterns

The Regex type constrains outputs to match specific patterns:

from outlines.types import Regex

phone_number = model("Contact:", Regex(r"\d{3}-\d{3}-\d{4}"))

Source: outlines/release_note.md:regex-dsl

JSON Schema

For language-agnostic type definitions, JsonSchema accepts raw JSON Schema strings:

from outlines.types import JsonSchema

schema = '{"type": "object", "properties": {"answer": {"type": "number"}}}'
result = model("What's 2 + 2?", JsonSchema(schema))

Source: outlines/release_note.md:regex-dsl

Generator Pattern

The Generator class encapsulates reusable generation with a fixed output type:

from outlines import Generator, from_transformers
from pydantic import BaseModel

class Character(BaseModel):
    name: str

model = from_transformers(...)
generator = Generator(model, Character)

# Reuse without recompiling the output type
result1 = generator("Create a male character")
result2 = generator("Create a female character")

Source: outlines/release_note.md:generator-constructor

Application Pattern

The Application class combines templates with structured output types:

from pydantic import BaseModel
from outlines import Application, Template

class Character(BaseModel):
    name: str

template = Template.from_string("Create a {{ gender }} character.")
app = Application(template, Character)

# `model` is any Outlines model, e.g. one created with from_transformers(...)
result = app(model, {"gender": "female"})

Source: outlines/release_note.md:application-class

The Outlines Philosophy

Outlines mirrors Python's type system philosophy: specify what you want, and the system ensures it. Rather than validating and parsing outputs after generation, Outlines guarantees structurally valid outputs from the start.

Source: README.md:philosophy

Design Principles

  1. Constraint at generation time: Validity is enforced during token selection, not after
  2. Fail fast: Invalid outputs are impossible by construction
  3. Provider independence: Same API works across all supported models
  4. Type familiarity: Use standard Python types and Pydantic models

Source: README.md:why-outlines

Structured Generation Backends

Related topics: System Architecture

Structured Generation Backends are the underlying engine components in Outlines that handle the compilation of structured constraints (such as JSON schemas and regular expressions) into efficient token-level generation guides. These backends abstract away the complexity of constraint compilation, providing a unified interface for steering language model outputs while allowing users to choose the most appropriate implementation for their use case.

Architecture Overview

Outlines supports multiple backend implementations for structured generation, each with different trade-offs in terms of performance, memory usage, and feature support. The backend system follows a pluggable architecture where users can select or let Outlines choose the optimal backend automatically.

graph TD
    User[User Code] --> API[Outlines API]
    API --> Backends[Backend Selection]
    Backends --> Core[OutlinesCore]
    Backends --> XGrammar[XGrammar]
    Backends --> LLGuidance[LLGuidance]
    
    Core --> FSM[FSM Compilation]
    XGrammar --> XG_Engine[XGrammar Engine]
    LLGuidance --> LL_Engine[LLGuidance Engine]
    
    FSM --> Guide[Generation Guide]
    XG_Engine --> Guide
    LL_Engine --> Guide
    
    Guide --> Tokens[Token Masking]
    Tokens --> Model[Language Model]
    Model --> Output[Structured Output]

The backend system sits between the high-level Outlines API and the underlying language model, transforming structural constraints into actionable token-level guidance during generation. Source: outlines/backends/__init__.py:1-60

Available Backends

Outlines provides three main backend implementations for structured generation, each optimized for different scenarios.

| Backend | Module | Description |
| --- | --- | --- |
| OutlinesCore | outlines_core | Default backend using the interegular library for FSM compilation |
| XGrammar | xgrammar | Optimized backend using the xgrammar library for faster compilation |
| LLGuidance | llguidance | Specialized backend using llguidance for high-performance generation |

Source: outlines/backends/__init__.py:20-30

OutlinesCore Backend

The OutlinesCore backend is the default implementation that uses the interegular library to compile regular expressions and JSON schemas into Finite State Machines (FSMs). This backend provides comprehensive support for all Outlines features and serves as the reference implementation.

Key characteristics:

  • Pure Python implementation using interegular for regex parsing
  • Persistent caching of compiled FSMs
  • Full support for JSON schemas and regex constraints
  • Memory-efficient for moderate-sized schemas

The backend converts structured constraints through the following pipeline:

  1. Parse the JSON schema or regex pattern
  2. Compile to an intermediate FSM representation using interegular
  3. Optimize the FSM for token-level generation
  4. Cache the compiled result for reuse

Source: outlines/backends/outlines_core.py

XGrammar Backend

The XGrammar backend provides an optimized implementation that leverages the xgrammar library for faster constraint compilation. This backend is particularly useful for applications requiring quick iteration cycles where compilation speed matters.

Key characteristics:

  • Faster compilation times compared to OutlinesCore
  • Optimized token masking operations
  • Good balance between performance and memory usage
  • Requires xgrammar as an additional dependency

Source: outlines/backends/xgrammar.py

LLGuidance Backend

The LLGuidance backend uses the llguidance library to provide high-performance structured generation. This backend is designed for production workloads where generation speed is critical.

Key characteristics:

  • Maximum generation throughput
  • Low-latency token selection
  • Specialized for constrained generation scenarios
  • Requires llguidance as an additional dependency

Source: outlines/backends/llguidance.py

Backend Selection

Outlines provides two mechanisms for backend selection: automatic default selection and explicit user specification.

Default Backend Configuration

When no backend is explicitly specified, Outlines uses the default backends defined in the configuration:

| Constraint Type | Default Backend |
| --- | --- |
| JSON Schema | outlines_core |
| Regex | outlines_core |

Source: outlines/backends/__init__.py:25-28

Explicit Backend Selection

Users can specify a backend explicitly using the backend parameter when calling model generation methods:

import outlines
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

# Use specific backend
result = model("Create a person", Person, backend="xgrammar")

Backend names are case-insensitive and map to the following implementations:

  • "outlines_core" or "outlinescore": OutlinesCore backend
  • "xgrammar": XGrammar backend
  • "llguidance": LLGuidance backend

Source: outlines/backends/__init__.py:55-58
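
The name handling described above amounts to a normalized dictionary lookup with a default. The sketch below is an assumption about how such a lookup could be written, not the code in outlines/backends/__init__.py: `BACKEND_ALIASES`, `DEFAULT_BACKEND`, and `resolve_backend` are hypothetical names, and the error type is chosen for illustration.

```python
# Accepted spellings (lower-cased) mapped to the backend they select.
BACKEND_ALIASES = {
    "outlines_core": "OutlinesCore",
    "outlinescore": "OutlinesCore",
    "xgrammar": "XGrammar",
    "llguidance": "LLGuidance",
}
DEFAULT_BACKEND = "outlines_core"


def resolve_backend(name=None) -> str:
    """Return the backend for `name`, ignoring case; fall back to the default."""
    key = (name or DEFAULT_BACKEND).lower()
    try:
        return BACKEND_ALIASES[key]
    except KeyError:
        raise ValueError(f"Unknown backend: {name!r}") from None


print(resolve_backend("XGrammar"))  # XGrammar
print(resolve_backend())            # OutlinesCore
```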

Backend Factory Functions

The backend system provides factory functions that create the appropriate logits processor based on the constraint type and selected backend.

JSON Schema Logits Processor

The get_json_schema_logits_processor function creates a logits processor for JSON schema constraints:

def get_json_schema_logits_processor(
    backend_name: str | None,
    model: SteerableModel,
    json_schema: str,
) -> LogitsProcessorType:
    """Create a logits processor from a JSON schema.
    
    Parameters

Source: https://github.com/dottxt-ai/outlines / Human Manual

Output Types Overview

Related topics: JSON Schema and Pydantic Support, Regex Patterns

Outlines provides a comprehensive type system for structured generation with Large Language Models (LLMs). Output types define the expected structure of generated text, and Outlines ensures that generated outputs match these specifications exactly during inference.

Purpose and Scope

Output types in Outlines serve as the primary interface for specifying constraints on LLM outputs. Rather than attempting to parse and fix invalid outputs after generation, Outlines enforces structure during the generation process itself. This approach eliminates fragile parsing logic and guarantees valid outputs on the first attempt.

The type system supports various complexity levels:

| Type Category | Examples | Use Case |
| --- | --- | --- |
| Basic Python | int, float, str | Simple values |
| Literal Types | Literal["Yes", "No"] | Enumerated choices |
| Pydantic Models | BaseModel subclasses | Complex nested structures |
| Regex Patterns | Regex(...), JsonSchema(...) | Custom format constraints |
| Context-Free Grammars | CFG(...) | Formal language definitions |

Source: README.md:1-20

Architecture

Type Conversion Pipeline

Outlines converts output types into finite state machines (FSMs) that guide token selection during generation. This conversion follows a layered approach:

graph TD
    A[Python Type / Pydantic Model] --> B[JSON Schema]
    B --> C[Regex Pattern]
    C --> D[FSM / State Machine]
    D --> E[Token Masking Guide]
    E --> F[Constrained Generation]
    
    G[DSL Terms] --> C
    H[CFG Grammar] --> C

The type system handles three primary conversion pathways:

  1. Python Types to Terms: Basic types (int, str, float) and Literal types convert to intermediate Term representations
  2. Pydantic Models to JSON Schema: Complex models generate JSON Schema definitions
  3. Terms to Regex: Intermediate representations other than CFG terms ultimately convert to regex patterns; a CFG term carries its grammar definition instead

Source: outlines/types/dsl.py:1-30
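
The first and third pathways can be illustrated with the standard library alone. This is a minimal sketch, assuming illustrative regex patterns: `type_to_regex` and `_PRIMITIVE_PATTERNS` are hypothetical names, and the patterns Outlines actually emits may differ.

```python
import re
from typing import Literal, get_args, get_origin

# Illustrative patterns only; not taken from the Outlines source.
_PRIMITIVE_PATTERNS = {
    int: r"[+-]?(0|[1-9][0-9]*)",
    float: r"[+-]?(0|[1-9][0-9]*)\.[0-9]+",
    bool: r"(True|False)",
}


def type_to_regex(tp) -> str:
    """Map a small subset of Python types to a regex string."""
    if get_origin(tp) is Literal:
        # Each allowed value becomes one escaped alternative.
        return "(" + "|".join(re.escape(str(v)) for v in get_args(tp)) + ")"
    if tp in _PRIMITIVE_PATTERNS:
        return _PRIMITIVE_PATTERNS[tp]
    raise TypeError(f"Unsupported type: {tp!r}")


pattern = type_to_regex(Literal["Yes", "No"])
print(pattern)                                        # (Yes|No)
print(bool(re.fullmatch(pattern, "Yes")))             # True
print(bool(re.fullmatch(type_to_regex(int), "042")))  # False
```

Once every supported type reduces to a regex string, a single regex-to-FSM compiler can serve all of them, which is the motivation for the layered design.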

Core Components

| Component | File Location | Responsibility |
| --- | --- | --- |
| Term Classes | outlines/types/dsl.py | Define regex DSL elements |
| JSON Schema Utilities | outlines/types/json_schema_utils.py | Pydantic to schema conversion |
| Type Adapters | outlines/types/utils.py | Type introspection helpers |
| FSM Compilation | outlines-core package | Regex to finite state machines |

Source: llm.txt:1-50

Basic Python Types

Outlines supports native Python types as output specifications. These types are automatically converted to appropriate constraints.

Supported Types

import outlines
from transformers import AutoModelForCausalLM, AutoTokenizer

model = outlines.from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# Integer output
temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature)  # 100

# String output (constrained format)
name = model("Generate a valid email address", str)

| Type | Constraint Applied |
| --- | --- |
| int | Matches integer patterns |
| float | Matches decimal numbers |
| str | General text with type hints |

Source: README.md:45-60

Literal Types

Literal types define exact enumerated values the model can output:

from typing import Literal

sentiment = model(
    "Analyze: 'This product completely changed my life!'",
    Literal["Positive", "Negative", "Neutral"]
)
print(sentiment)  # "Positive"

This approach is ideal for classification tasks, yes/no questions, and any scenario requiring outputs from a fixed set of options.

Source: README.md:50-55

Pydantic Models

For complex structured outputs, Pydantic models provide a declarative interface to define nested schemas with validation.

Defining Models

from pydantic import BaseModel
from enum import Enum

class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4

class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str

review = model("Review the latest iPhone", ProductReview)

Model Conversion Process

graph LR
    A[Pydantic BaseModel] --> B[JSON Schema]
    B --> C[Regex via interegular]
    C --> D[FSM]
    D --> E[Guided Generation]
    
    F[TypeAdapter] --> A
    G[GetJsonSchemaHandler] --> B

The conversion process uses Pydantic's schema generation hooks:

from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema as cs

class CustomType(BaseModel):
    @classmethod
    def __get_pydantic_core_schema__(
        cls,
        source_type: Any,
        handler: GetCoreSchemaHandler
    ) -> cs.CoreSchema:
        # Describe the value as a string constrained by a regex pattern
        return cs.string_schema(
            pattern=r"^[A-Z]{2}\d{4}$"  # Format: XX0000
        )

Source: outlines/types/dsl.py:40-80

Regular Expression DSL

The Regex DSL provides fine-grained control over output formats through composable term classes.

Term Classes

| Term Class | Description | Example |
| --- | --- | --- |
| Regex | Base regex wrapper | Regex(r"\d{3}-\d{4}") |
| String | Literal string match | String("yes") |
| Integer | Integer numbers | Integer() |
| Alternatives | Choice between patterns | either(pattern1, pattern2) |
| KleeneStar | Zero or more repetitions | repeat(pattern) |
| Optional | Optional pattern | optional(pattern) |

Source: outlines/types/dsl.py:20-60

Building Complex Patterns

from outlines.types import Regex, either, optional

# Phone number pattern: dashed form or parenthesised area code
phone = either(
    Regex(r"\d{3}-\d{3}-\d{4}"),
    Regex(r"\(\d{3}\) \d{3}-\d{4}")
)

# Date-like format with an optional day part: YYYY-MM or YYYY-MM-DD.
# Terms are concatenated with "+"; plain strings act as literals.
date_format = (
    Regex(r"\d{4}")                      # Year
    + "-"
    + Regex(r"\d{2}")                    # Month
    + optional("-" + Regex(r"\d{2}"))    # Optional day
)

Term Functions

The DSL includes utility functions for pattern composition:

  • either(*terms): Match any one of multiple terms
  • optional(term): Make a pattern optional
  • at_least(n, term): Require at least n repetitions
  • one_of(*choices): Synonym for either
  • literal(text): Match exact text

Source: outlines/release_note.md:40-80

JsonSchema Type

The JsonSchema type allows direct use of JSON Schema definitions for complex validation:

from outlines.types import JsonSchema

json_schema = '''
{
    "type": "object",
    "properties": {
        "answer": {"type": "number"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1}
    },
    "required": ["answer"]
}
'''

result = model("What's 2 + 2? Respond in JSON.", JsonSchema(json_schema))

This approach is useful when:

  • Migrating existing JSON Schema definitions
  • Working with API specifications (OpenAPI, etc.)
  • Defining schemas separately from code

Source: outlines/release_note.md:50-65

Context-Free Grammars

Outlines supports Context-Free Grammars (CFG) for formal language generation:

from outlines.types import CFG

grammar = CFG("""
    expression ::= number op number
    op ::= "+" | "-" | "*" | "/"
    number ::= [0-9]+
""")

math_result = model("Calculate 5 + 3", grammar)

CFGs are particularly valuable for:

  • Programming language generation
  • Mathematical expression evaluation
  • Structured domain-specific languages

Source: outlines/release_note.md:60-70

Union Types

Union types enable conditional or alternative output structures:

from typing import Literal, Union

from pydantic import BaseModel

class SuccessResponse(BaseModel):
    data: str
    timestamp: str

UnknownResponse = Literal["I don't know", "Unable to determine"]

Response = Union[SuccessResponse, UnknownResponse]

result = model("What is the capital of France?", Response)

Handling Incomplete Data

Union types excel at scenarios where partial information is acceptable:

class EventInfo(BaseModel):
    name: str
    date: str
    location: str

EventResponse = Union[EventInfo, Literal["I don't know"]]

result = model(
    "Extract event details: 'Join us for the meeting next week!'",
    EventResponse
)

Source: README.md:100-120
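
A caller of a Union-typed generation still has to tell the two branches apart. The sketch below assumes the model returns a plain string that is either a JSON object or the fallback literal; `parse_event_response` is a hypothetical helper, and a dataclass stands in for the Pydantic model so that the example needs only the standard library.

```python
import json
from dataclasses import dataclass
from typing import Optional

FALLBACK = "I don't know"


@dataclass
class EventInfo:  # stand-in for the Pydantic model of the same name
    name: str
    date: str
    location: str


def parse_event_response(raw: str) -> Optional[EventInfo]:
    """Return an EventInfo, or None when the model chose the fallback."""
    text = raw.strip()
    # The literal branch may arrive bare or as a JSON-quoted string.
    if text in (FALLBACK, json.dumps(FALLBACK)):
        return None
    return EventInfo(**json.loads(text))


print(parse_event_response("I don't know"))  # None
print(parse_event_response(
    '{"name": "Meetup", "date": "2024-05-01", "location": "Berlin"}'
).location)  # Berlin
```

Returning `None` for the fallback branch lets downstream code use an ordinary `if result is None` check instead of comparing strings.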

Generator Integration

The Generator class encapsulates output types for reusable constrained generation:

from outlines import Generator, from_transformers
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer

class Character(BaseModel):
    name: str
    species: str

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# Constraints are prepared once here and reused on every call
character_generator = Generator(model, Character)
result = character_generator("Create a fantasy character")

Benefits of using generators:

| Feature | Benefit |
| --- | --- |
| Cached compilation | FSM compiled once, reused across calls |
| Type inference | Output type specified at construction |
| Consistent behavior | Same constraints applied to all generations |

Source: outlines/release_note.md:20-40

Type System Internals

Conversion Pipeline Details

graph TD
    subgraph "Type Definition"
        A[Pydantic / Python Type] --> B[Term Representation]
        B --> C[Regex Pattern]
    end
    
    subgraph "Compilation"
        C --> D[FSM via interegular]
        D --> E[Cached FSM]
    end
    
    subgraph "Generation"
        E --> F[Guide Processor]
        F --> G[Token Masking]
        G --> H[Constrained Sampling]
    end

Key Files

| File | Purpose |
| --- | --- |
| outlines/types/__init__.py | Public API exports |
| outlines/types/dsl.py | Term classes and regex DSL |
| outlines/types/json_schema_utils.py | Schema conversion utilities |
| outlines/types/utils.py | Type introspection helpers |

Source: llm.txt:30-60

Best Practices

Choosing Output Types

  1. Use Literal types for classification and yes/no responses
  2. Use Python primitives (int, float) for simple numerical outputs
  3. Use Pydantic models for complex, nested structures
  4. Use Regex DSL when you need precise format control
  5. Use CFG for formal language generation

Performance Considerations

| Complexity | Compilation Time | Runtime Overhead |
| --- | --- | --- |
| Literal types | Minimal | Negligible |
| Simple regex | Low | Low |
| Pydantic models | Moderate | Moderate |
| Complex grammars | Higher | Higher |

FSM compilation happens once on first use and is cached for subsequent calls, minimizing repeated overhead.

Source: llm.txt:40-50

Summary

Outlines' output types system provides a unified, Pythonic interface for structured generation:

  • Type-driven API: Python types, Pydantic models, and custom DSLs
  • Guaranteed validity: Constraints enforced during generation, not after
  • Flexible composition: Union types, regex patterns, and grammars
  • Performance optimized: FSM caching and lazy compilation

The type system transforms high-level type specifications into optimized finite state machines, enabling reliable structured generation across diverse LLM providers.

Source: https://github.com/dottxt-ai/outlines / Human Manual

JSON Schema and Pydantic Support

Related topics: Output Types Overview

Outlines provides robust support for JSON Schema and Pydantic models as first-class output type specifications. This enables developers to define complex structured output schemas using familiar Python type annotations, which are then compiled into efficient finite state machines (FSMs) for guided text generation.

Overview

The JSON Schema and Pydantic support in Outlines serves three primary purposes:

  1. Type Definition - Allows users to define output structures using Python-native type hints
  2. Schema Conversion - Provides bidirectional conversion between JSON Schema, Pydantic, TypedDict, and dataclass representations
  3. Constraint Compilation - Transforms schema definitions into regular expressions and FSMs for guided generation

Source: outlines/types/dsl.py:1-30

Architecture

Layer Stack for Structured Output

graph TD
    A[User API: Pydantic Model / JSON Schema] --> B[Type System: python_types_to_terms]
    B --> C[JsonSchema Term / CFG Term]
    C --> D[Regex Compilation: to_regex]
    D --> E[FSM Generation via outlines-core]
    E --> F[Logits Processor: Token Masking]
    F --> G[Model Providers]
    
    H[Pydantic / TypedDict / Dataclass] -->|json_schema_dict_to_pydantic| B
    I[JSON Schema String] -->|JsonSchema class| B

The type system in Outlines follows a layered architecture where Python types are progressively converted into machine-executable constraints:

| Layer | Component | Responsibility |
| --- | --- | --- |
| User API | Pydantic models, JSON Schema | Define desired output structure |
| Type System | python_types_to_terms | Convert Python types to Term instances |
| Schema Representation | JsonSchema, CFG classes | Represent constraints as Terms |
| Regex Compilation | to_regex | Transform Terms to regular expressions |
| FSM Generation | outlines-core (interegular) | Convert regex to finite state machine |
| Generation Control | Logits Processors | Mask invalid tokens during generation |

Source: outlines/types/dsl.py:46-80

JsonSchema Class

The JsonSchema class is the core abstraction for representing JSON Schema-based output types.

Class Definition

class JsonSchema(Term):
    """Represents a JSON Schema constraint for structured generation."""
    
    def __init__(self, schema: Union[str, dict], whitespace_pattern: Optional[str] = None):
        """
        Args:
            schema: JSON Schema as string or dict
            whitespace_pattern: Optional regex for whitespace handling
        """

Key Methods

| Method | Description | Returns |
| --- | --- | --- |
| to_format(target_types) | Convert schema to Pydantic, TypedDict, or dataclass | Converted type, or raises ValueError |
| from_file(path) | Create JsonSchema from a .json file | JsonSchema instance |
| _display_node() | Get string representation | str |

Source: outlines/types/dsl.py:50-90

Schema Conversion

The to_format method supports converting JSON Schema to multiple Python type formats:

def to_format(self, target_types: List[str]) -> Any:
    """Convert JSON Schema to target format(s).
    
    Supported targets: 'pydantic', 'typeddict', 'dataclass', 'str', 'dict'
    """

This method iterates through the requested target types and attempts conversion, returning the first successful result.

Source: outlines/types/dsl.py:55-80

Schema Validation and Comparison

def __eq__(self, other) -> bool:
    """Compare two JsonSchema instances by parsing and comparing their contents."""
    self_dict = json.loads(self.schema)
    other_dict = json.loads(other.schema)
    return self_dict == other_dict

Source: outlines/types/dsl.py:100-108
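
Comparing parsed contents rather than raw strings makes equality insensitive to whitespace and key order. A short standard-library illustration, using two hypothetical schema strings:

```python
import json

a = '{"type": "object", "properties": {"x": {"type": "integer"}}}'
b = """
{
    "properties": {"x": {"type": "integer"}},
    "type": "object"
}
"""

print(a == b)                          # False: the raw strings differ
print(json.loads(a) == json.loads(b))  # True: the parsed schemas are equal
```

Note that list order still matters in this comparison, so two schemas whose `required` arrays list the same fields in a different order compare as unequal.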

JSON Schema Utilities

The json_schema_utils.py module provides bidirectional conversion between JSON Schema and Python type systems.

Schema Type Mapping

JSON Schema types are mapped to Python types as follows:

| JSON Schema Type | Python Type |
| --- | --- |
| string | str |
| integer | int |
| number | float |
| boolean | bool |
| array | List[item_type] |
| object | Pydantic / TypedDict / Dataclass |

Source: outlines/types/json_schema_utils.py:1-50

Conversion Functions

json_schema_dict_to_pydantic

Converts a JSON Schema dictionary to a Pydantic model:

def json_schema_dict_to_pydantic(
    schema: dict,
    name: Optional[str] = None
) -> Type[BaseModel]:
    """Convert JSON Schema dict to Pydantic BaseModel.
    
    Args:
        schema: JSON Schema dictionary
        name: Optional name for the model
        
    Returns:
        Pydantic BaseModel class
    """

json_schema_dict_to_typeddict

Converts JSON Schema to a TypedDict:

def json_schema_dict_to_typeddict(
    schema: dict,
    name: Optional[str] = None
) -> _TypedDictMeta:
    """Convert JSON Schema dict to TypedDict class."""

The conversion process:

  1. Extracts the required fields from the schema
  2. Maps properties to typed annotations
  3. Wraps optional fields in Optional[]
  4. Recursively handles nested objects

Source: outlines/types/json_schema_utils.py:80-120
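
The four steps can be sketched with the functional `TypedDict` syntax from the standard library. This is an illustration under stated assumptions rather than the code in json_schema_utils.py: `schema_to_type` and `schema_to_typeddict` are hypothetical names, and only the scalar, array, and object types from the mapping table are handled.

```python
from typing import Any, List, Optional, TypedDict

_SCALARS = {"string": str, "integer": int, "number": float, "boolean": bool}


def schema_to_type(schema: dict, name: str = "Model") -> Any:
    """Map a JSON Schema fragment to a Python type (small subset only)."""
    kind = schema.get("type")
    if kind in _SCALARS:
        return _SCALARS[kind]
    if kind == "array":
        return List[schema_to_type(schema.get("items", {}), name + "Item")]
    if kind == "object":
        return schema_to_typeddict(schema, name)  # step 4: recurse
    return Any


def schema_to_typeddict(schema: dict, name: str = "Model") -> type:
    required = set(schema.get("required", []))                   # step 1
    fields = {}
    for prop, sub in schema.get("properties", {}).items():
        tp = schema_to_type(sub, name + prop.title())            # step 2
        fields[prop] = tp if prop in required else Optional[tp]  # step 3
    return TypedDict(name, fields)


User = schema_to_typeddict(
    {
        "type": "object",
        "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        "required": ["name"],
    },
    "User",
)
print(User.__annotations__["name"])  # <class 'str'>
```

Keeping the per-type mapping in one function means the TypedDict, dataclass, and Pydantic builders could share it and differ only in how they assemble the final class.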

json_schema_dict_to_dataclass

Converts JSON Schema to a dataclass:

def json_schema_dict_to_dataclass(
    schema: dict,
    name: Optional[str] = None
) -> type:
    """Convert JSON Schema dict to dataclass."""

schema_type_to_python

Recursively converts JSON Schema type definitions to Python types:

def schema_type_to_python(
    schema: dict,
    caller_target_type: str = "pydantic"
) -> Any:
    """Convert JSON Schema type to Python type.
    
    Args:
        schema: JSON Schema dict or nested schema
        caller_target_type: Target format ('pydantic', 'typeddict', 'dataclass')
    """

Source: outlines/types/json_schema_utils.py:40-75

Backend Integration

Different inference backends handle JSON Schema constraints through specialized logits processors.

Backend Selection

graph LR
    A[Model Instance] --> B[_get_backend]
    B --> C{Backend Name}
    C -->|outlines_core| D[OutlinesCoreBackend]
    C -->|xgrammar| E[XGrammarBackend]
    C -->|llguidance| F[LLGuidanceBackend]
    
    D --> G[get_json_schema_logits_processor]
    E --> G
    F --> G

The get_json_schema_logits_processor function creates the appropriate processor:

def get_json_schema_logits_processor(
    backend_name: str | None,
    model: SteerableModel,
    json_schema: str,
) -> LogitsProcessorType:
    """Create a logits processor from a JSON schema."""
    backend = _get_backend(
        backend_name or JSON_SCHEMA_DEFAULT_BACKEND,
        model,
    )
    return backend.get_json_schema_logits_processor(json_schema)

Source: outlines/backends/__init__.py:1-50

VLLM Offline Backend

The VLLM offline backend converts JsonSchema terms to vLLM's GuidedDecodingParams:

def _get_guided_decoding_params(self, output_type) -> dict:
    """Convert output type to guided decoding parameters."""
    if output_type is None:
        return {}

    term = python_types_to_terms(output_type)
    if isinstance(term, CFG):
        return {"grammar": term.definition}
    elif isinstance(term, JsonSchema):
        guided_decoding_params = {"json": json.loads(term.schema)}
        if term.whitespace_pattern:
            guided_decoding_params["whitespace_pattern"] = term.whitespace_pattern
        return guided_decoding_params
    else:
        return {"regex": to_regex(term)}

Source: outlines/models/vllm_offline.py:50-80

Usage Patterns

Basic Pydantic Model Usage

from pydantic import BaseModel
from enum import Enum
from outlines import from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4

class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

review = model("Amazing laptop! Great battery life, fast processor.", ProductReview)

Source: README.md:1-50

Using json_schema Function

import outlines

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "integer"},
        "email": {"type": "string", "format": "email"}
    },
    "required": ["name", "email"]
}

result = model("Generate a user profile", outlines.json_schema(schema))

Union Types for Flexible Output

from typing import Union, List, Literal
from pydantic import BaseModel

class EventInfo(BaseModel):
    name: str
    date: str
    location: str

EventResponse = Union[EventInfo, Literal["I don't know"]]

result = model("What event is mentioned?", EventResponse)

Type System Functions

The DSL module provides utility functions for type checking and conversion:

Type Check Functions

| Function | Purpose |
| --- | --- |
| is_int(t) | Check if type is int |
| is_float(t) | Check if type is float |
| is_str(t) | Check if type is str |
| is_bool(t) | Check if type is bool |
| is_datetime(t) | Check if type is datetime |
| is_date(t) | Check if type is date |
| is_time(t) | Check if type is time |
| is_pydantic_model(t) | Check if type is Pydantic BaseModel |
| is_enum(t) | Check if type is Enum |
| is_literal(t) | Check if type is Literal |
| is_union(t) | Check if type is Union |
| is_typing_list(t) | Check if type is List |
| is_typed_dict(t) | Check if type is TypedDict |

Source: outlines/types/dsl.py:80-150
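
Checks of this kind are typically written with `typing.get_origin`. The sketch below reuses four of the names from the table, but the bodies are assumptions made for illustration and may not match the Outlines source.

```python
import enum
from typing import List, Literal, Union, get_origin


def is_literal(tp) -> bool:
    return get_origin(tp) is Literal


def is_union(tp) -> bool:
    # Covers typing.Union and Optional[...]; "int | str" is not handled here.
    return get_origin(tp) is Union


def is_typing_list(tp) -> bool:
    return get_origin(tp) is list


def is_enum(tp) -> bool:
    # The isinstance guard avoids a TypeError for non-class arguments.
    return isinstance(tp, type) and issubclass(tp, enum.Enum)


class Color(enum.Enum):
    RED = 1


print(is_literal(Literal["a", "b"]))  # True
print(is_union(Union[int, str]))      # True
print(is_typing_list(List[int]))      # True
print(is_enum(Color), is_enum(3))     # True False
```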

python_types_to_terms

The main conversion function that transforms Python types into Term instances:

def python_types_to_terms(
    output_type,
    whitespace_pattern: Optional[str] = None
) -> Term:
    """Convert Python types to Term instances for guided generation.
    
    Handles:
    - Primitive types (int, float, str, bool)
    - Collections (List, Dict, Tuple)
    - Pydantic models
    - Enums and Literals
    - Union types
    - JSON Schema strings
    - TypedDict and dataclasses
    """

Source: outlines/types/dsl.py:200-280

Error Handling

Schema Conversion Failures

When schema conversion fails, Outlines provides informative warnings:

except Exception as e:  # pragma: no cover
    warnings.warn(
        f"Cannot convert schema type {type(schema)} to {target_type}: {e}"
    )
    continue

If no valid conversion is found, a ValueError is raised:

raise ValueError(
    f"Cannot convert schema type {type(schema)} to any of the target "
    f"types {target_types}"
)

Source: outlines/types/dsl.py:75-82
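
Put together, the two fragments above form a "first successful conversion" loop. The following self-contained sketch shows that control flow; `convert_first` and the `converters` mapping are hypothetical and stand in for the real per-format conversion functions.

```python
import json
import warnings
from typing import Any, Callable, Dict, List


def convert_first(
    schema: Any,
    target_types: List[str],
    converters: Dict[str, Callable[[Any], Any]],
) -> Any:
    """Try each target in order and return the first successful result."""
    for target_type in target_types:
        try:
            return converters[target_type](schema)
        except Exception as e:
            # Report the failure, then move on to the next target.
            warnings.warn(
                f"Cannot convert schema type {type(schema)} to {target_type}: {e}"
            )
            continue
    raise ValueError(
        f"Cannot convert schema type {type(schema)} to any of the target "
        f"types {target_types}"
    )


# Hypothetical converters: parse as JSON first, fall back to plain text.
converters = {"dict": json.loads, "str": str}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = convert_first("not json", ["dict", "str"], converters)
print(result, len(caught))  # not json 1
```

Warning on each failed attempt and raising only when every target fails keeps partial failures visible without aborting a conversion that a later target can still satisfy.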

Best Practices

  1. Use Pydantic for Complex Schemas - Pydantic models provide validation and IDE autocomplete
  2. Define Required Fields - JSON Schema required array ensures critical fields are generated
  3. Use Optional for Nullable Fields - Mark non-required fields with Optional[] or = None
  4. Leverage Union Types - Return fallback values when data is incomplete
  5. Cache Compiled FSMs - Outlines caches compiled state machines for reuse

See Also

  • CFG Support - Context-free grammar constraints for complex syntax
  • Regex DSL - Direct regular expression specifications
  • Template System - Jinja-based prompt templating
  • Generator Class - Reusable generator objects with pre-compiled constraints

Source: https://github.com/dottxt-ai/outlines / Human Manual

Regex Patterns

Related topics: Output Types Overview

Regex patterns in Outlines provide a powerful Domain-Specific Language (DSL) for defining structured output constraints. The regex DSL allows developers to build complex pattern constraints by composing simple terms, which are then compiled into finite state machines (FSMs) that guide the language model's token generation process.

The core insight behind Outlines' regex support is that instead of generating text and hoping it matches a format, Outlines makes it impossible for the model to generate invalid outputs by masking invalid tokens during generation. Source: llm.txt:1-10

Overview

Outlines supports regex patterns at multiple levels of abstraction:

  1. Direct Regex Patterns: Use the Regex class to define patterns, which can also serve as Pydantic field types
  2. Regex DSL: Compose complex patterns using term functions such as either, optional, and at_least, together with the integer and string helpers
  3. JSON Schema Integration: The JsonSchema term accepts JSON schema strings and converts them to regex constraints
  4. Context-Free Grammars: The CFG term provides grammar-based constraints for more complex languages

Source: outlines/release_note.md:1-20

Type Conversion Pipeline

The regex system follows a well-defined conversion pipeline:

Pydantic Model / Python Type → Term DSL → Regex → FSM → Token Masking

This pipeline ensures that high-level type specifications are progressively transformed into low-level token constraints that the generation process can enforce.

Source: llm.txt:1-15

The Term Classes

The Term class hierarchy forms the foundation of Outlines' regex DSL. All terms implement a common interface that supports composition operations and conversion to regex patterns.

Source: outlines/types/dsl.py:1-30

Core Term Classes

| Term Class | Description | Standalone Usage |
| --- | --- | --- |
| Regex | Represents a raw regex pattern | Yes |
| String | Literal string matching | Yes |
| JsonSchema | JSON schema to regex conversion | Yes |
| CFG | Context-free grammar constraints | Yes |
| Sequence | Concatenation of multiple terms | No |
| Alternatives | Choice between multiple terms | No |
| KleeneStar | Zero or more repetitions | No |
| KleenePlus | One or more repetitions | No |
| Optional | Zero or one occurrence | No |

Source: outlines/types/dsl.py:30-80

Regex Class

The Regex class is a Pydantic-compatible type that represents a regular expression pattern. It can be used directly as a field type in Pydantic models.

from outlines.types import Regex
from pydantic import BaseModel

age_type = Regex("[0-9]+")

class User(BaseModel):
    name: str
    age: age_type

Source: outlines/types/dsl.py:85-95

The Regex class provides the following composition operators:

| Operator | Method | Description |
| --- | --- | --- |
| + | __add__ | Concatenate patterns (sequence) |
| \| | __or__ | Create alternatives (choice) |
| r+ | __radd__ | Right-side concatenation |
| r\| | __ror__ | Right-side alternatives |

Source: outlines/types/dsl.py:97-115
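
To show how the four operator methods cooperate, here is a toy term class built on plain regex strings. It is a sketch for illustration only: this `Term` is a simplified stand-in, not the Outlines class, and it builds a pattern string directly instead of Sequence and Alternatives nodes.

```python
import re


class Term:
    """Toy stand-in for a DSL term; not the Outlines class."""

    def __init__(self, pattern: str):
        self.pattern = pattern

    @staticmethod
    def _coerce(other) -> "Term":
        # Plain strings are treated as literals and escaped.
        return other if isinstance(other, Term) else Term(re.escape(other))

    def __add__(self, other) -> "Term":   # term + other
        return Term(f"(?:{self.pattern})(?:{self._coerce(other).pattern})")

    def __radd__(self, other) -> "Term":  # "text" + term
        return self._coerce(other) + self

    def __or__(self, other) -> "Term":    # term | other
        return Term(f"(?:{self.pattern})|(?:{self._coerce(other).pattern})")

    def __ror__(self, other) -> "Term":   # "text" | term
        return self._coerce(other) | self


area = Term(r"\d{3}")
line = Term(r"\d{3}-\d{4}")
# "+" binds tighter than "|", so this reads as (form 1) | (form 2).
phone = "(" + area + ") " + line | area + "-" + line

print(bool(re.fullmatch(phone.pattern, "(555) 123-4567")))  # True
print(bool(re.fullmatch(phone.pattern, "555-123-4567")))    # True
print(bool(re.fullmatch(phone.pattern, "5551234567")))      # False
```

The reflected methods (`__radd__`, `__ror__`) are what allow a plain string to appear on the left of an operator, since `str` itself does not know how to combine with a term.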

JsonSchema Term

The JsonSchema term accepts a JSON schema string and converts it into regex constraints. This allows seamless integration with existing JSON schema definitions.

from outlines import from_transformers
from outlines.types import JsonSchema
from transformers import AutoModelForCausalLM, AutoTokenizer

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

json_schema = '{"type": "object", "properties": {"answer": {"type": "number"}}}'
result = model("What's 2 + 2? Respond in JSON", JsonSchema(json_schema))

Source: outlines/release_note.md:1-25

CFG Term

The CFG (Context-Free Grammar) term allows definition of constraints using context-free grammar notation. This is useful for complex languages where regex alone is insufficient.

Source: outlines/types/dsl.py:75-80

Regex DSL Functions

The regex DSL provides utility functions for building complex patterns by combining simpler terms.

Composition Functions

| Function | Description |
| --- | --- |
| either(*terms) | Create alternatives from multiple terms |
| optional(term) | Make a term optional (zero or one) |
| at_least(n, term) | Require at least n occurrences |
| integer() | Match integer patterns |
| float() | Match floating-point number patterns |
| boolean() | Match boolean patterns |

Source: outlines/release_note.md:1-30

Building Complex Patterns

The following example demonstrates building a complex regex pattern using the DSL:

from outlines import from_transformers
from outlines.types import at_least, either, integer, optional
from transformers import AutoModelForCausalLM, AutoTokenizer

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

# Build a pattern that matches one of two literal email addresses
# (placeholder addresses on the reserved example.com domain)
pattern = either("user@example.com", "admin@example.com")

Source: outlines/release_note.md:25-35

Architecture

FSM Compilation Flow

graph TD
    A[User Pattern Definition] --> B[Pydantic Model / Regex DSL]
    B --> C[JSON Schema Extraction]
    C --> D[Regex Generation]
    D --> E[FSM Compilation via interegular]
    E --> F[Token-Level Constraints]
    F --> G[Logits Masking during Generation]
    G --> H[Valid Output Generation]

Source: llm.txt:15-25

Layer Stack

The regex system integrates with Outlines' layered architecture:

User API (outlines.models)
    ↓
Generator Classes (SteerableGenerator, BlackBoxGenerator)
    ↓
Type System (types/dsl.py: Pydantic → JsonSchema → Regex)
    ↓
FSM Compilation (outlines-core: regex → FSM via interegular)
    ↓
Guide System (processors/guide.py: FSM state management)
    ↓
Logits Processing (processors/structured.py: token masking)
    ↓
Model Providers (transformers, OpenAI, etc.)

Source: llm.txt:30-45

Usage Patterns

Simple Classification with Literals

While not strictly regex, Outlines uses the same constraint infrastructure for literal choices:

from typing import Literal
from outlines import from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer

model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)

result = model("Pizza or burger", Literal["pizza", "burger"])

Source: outlines/release_note.md:60-75

Regex as Pydantic Field Types

For more complex validation, use Regex as a Pydantic field type:

from outlines.types import Regex
from pydantic import BaseModel

class ProductCode(BaseModel):
    code: Regex(r"^[A-Z]{3}-[0-9]{4}$")

The Regex class implements Pydantic's schema generation and validation interfaces:

| Method | Purpose |
| --- | --- |
| __get_validators__ | Pydantic validator for input validation |
| __get_pydantic_core_schema__ | Core schema for Pydantic v2 integration |
| __get_pydantic_json_schema__ | JSON schema generation |

Source: outlines/types/dsl.py:117-130

Integration with Type System

Python Types to Terms Conversion

The python_types_to_terms function maps Python types to their corresponding Term representations:

| Python Type | Term Equivalent |
| --- | --- |
| int | integer() |
| float | float() |
| str | string |
| bool | boolean() |
| List[T] | Pattern for lists |
| Literal[...] | Alternatives |

Source: outlines/types/dsl.py:1-60

Schema Utilities

The type system includes utilities for converting between different schema formats:

  • json_schema_dict_to_pydantic(): Convert JSON schema to Pydantic model
  • json_schema_dict_to_typeddict(): Convert to TypedDict
  • json_schema_dict_to_dataclass(): Convert to dataclass

Source: outlines/types/dsl.py:50-55

Key Design Decisions

Token-Level Control

Outlines' regex constraints operate at the token level, not character level. This means:

  1. FSMs are compiled from regex patterns using the interegular library
  2. State transitions map (state, token) → next_state
  3. For each state, invalid tokens are masked by setting their logits to negative infinity

Source: llm.txt:45-50
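
The three points above can be made concrete with a toy vocabulary. The sketch below hand-writes a character-level automaton for the pattern `ab*`, lifts it to a (state, token) table, and masks logits with negative infinity. It is illustrative only: real vocabularies are far larger, and the automaton would come from a regex compiler rather than being written by hand.

```python
import math

# Character-level DFA for "ab*": state 0 --a--> 1, state 1 --b--> 1.
CHAR_TRANSITIONS = {(0, "a"): 1, (1, "b"): 1}
STATES = (0, 1)
VOCAB = ["a", "b", "ab", "bb", "c"]  # multi-character tokens are allowed


def walk(state, token):
    """Feed a token's characters through the DFA; None means rejected."""
    for ch in token:
        state = CHAR_TRANSITIONS.get((state, ch))
        if state is None:
            return None
    return state


# Token-level table: (state, token_id) -> next_state, built once up front.
TOKEN_TRANSITIONS = {}
for s in STATES:
    for i, tok in enumerate(VOCAB):
        nxt = walk(s, tok)
        if nxt is not None:
            TOKEN_TRANSITIONS[(s, i)] = nxt


def mask_logits(state, logits):
    """Set the logit of every token that is invalid in `state` to -inf."""
    return [
        x if (state, i) in TOKEN_TRANSITIONS else -math.inf
        for i, x in enumerate(logits)
    ]


print(mask_logits(0, [0.1, 0.2, 0.3, 0.4, 0.5]))  # [0.1, -inf, 0.3, -inf, -inf]
print(mask_logits(1, [0.1, 0.2, 0.3, 0.4, 0.5]))  # [-inf, 0.2, -inf, 0.4, -inf]
```

Because a token can span several characters, the table is indexed by token rather than by character; after softmax, a logit of negative infinity gives the token zero probability.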

Lazy Compilation

FSMs are compiled on first use and cached persistently. This ensures:

  • No compilation cost is paid until a constraint is actually used
  • Repeated generation with the same schema reuses the cached FSM and skips recompilation
  • Each distinct constraint is compiled and stored only once

Source: llm.txt:50-55

API Reference

Regex Class

class Regex(Term):
    def __init__(self, pattern: str):
        """Initialize with a regex pattern string."""
        
    def __add__(self, other: Term) -> Sequence:
        """Concatenate patterns."""
        
    def __or__(self, other: Term) -> Alternatives:
        """Create alternatives."""
        
    def validate(self, value: Any) -> Any:
        """Validate a value against the pattern."""

DSL Functions

def either(*terms: Term) -> Alternatives:
    """Create alternatives from multiple terms."""
    
def optional(term: Term) -> Term:
    """Make a term optional (zero or one occurrence)."""
    
def at_least(n: int, term: Term) -> Term:
    """Require at least n occurrences."""
    
def integer() -> Term:
    """Match integer patterns."""
    
def boolean() -> Term:
    """Match boolean patterns."""

Source: outlines/types/dsl.py:30-75
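
To show how operator-based composition of this kind can work, the sketch below builds a tiny stand-alone DSL on top of Python's re module. It is not the Outlines implementation: ToyTerm, toy_either, toy_optional, and toy_at_least are hypothetical names chosen to mirror the signatures listed above, and each term is just a wrapped regex string.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyTerm:
    """Toy term wrapping a regex fragment (not the Outlines Term class)."""
    pattern: str

    def __add__(self, other: "ToyTerm") -> "ToyTerm":
        # Concatenation: match self, then other.
        return ToyTerm(f"(?:{self.pattern})(?:{other.pattern})")

    def __or__(self, other: "ToyTerm") -> "ToyTerm":
        # Alternation: match self or other.
        return ToyTerm(f"(?:{self.pattern}|{other.pattern})")

    def matches(self, value: str) -> bool:
        return re.fullmatch(self.pattern, value) is not None


def toy_either(*terms: ToyTerm) -> ToyTerm:
    return ToyTerm("(?:" + "|".join(t.pattern for t in terms) + ")")


def toy_optional(term: ToyTerm) -> ToyTerm:
    return ToyTerm(f"(?:{term.pattern})?")


def toy_at_least(n: int, term: ToyTerm) -> ToyTerm:
    return ToyTerm(f"(?:{term.pattern}){{{n},}}")


# Build "optional minus sign followed by one or more digits", or yes/no.
digit = ToyTerm("[0-9]")
signed_int = toy_optional(ToyTerm("-")) + toy_at_least(1, digit)
answer = signed_int | toy_either(ToyTerm("yes"), ToyTerm("no"))
```

Wrapping every fragment in a non-capturing group keeps operator precedence predictable when terms are nested.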

Best Practices

  1. Pre-compile complex patterns: If using the same pattern multiple times, consider using the Generator class to cache the FSM compilation
  2. Use Pydantic models for complex structures: JSON schema conversion provides a cleaner API for nested objects
  3. Leverage composition operators: Build complex patterns from simple terms using + and | operators
  4. Test patterns separately: Validate regex patterns independently before using them in generation
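
For the last point, a pattern can be checked against known-good and known-bad strings before any model is involved. The helper below is a hypothetical sketch that uses Python's re module; check_pattern and the ZIP_CODE pattern are illustrative names, and it assumes the pattern uses only syntax that behaves the same in re and in the regex engine used for generation.

```python
import re


def check_pattern(pattern, should_match, should_reject):
    """Return failure messages; an empty list means the pattern passed."""
    compiled = re.compile(pattern)  # raises re.error on a malformed pattern
    failures = []
    for text in should_match:
        if compiled.fullmatch(text) is None:
            failures.append(f"expected match: {text!r}")
    for text in should_reject:
        if compiled.fullmatch(text) is not None:
            failures.append(f"expected rejection: {text!r}")
    return failures


ZIP_CODE = r"[0-9]{5}(-[0-9]{4})?"  # example pattern for illustration
```

Using fullmatch rather than search mirrors constrained generation, where the entire output has to conform to the pattern.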

Source: https://github.com/dottxt-ai/outlines / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: [Feature] Streaming structured generation with partial validation

  • Severity: high
  • Finding: Installation risk is backed by a source signal: [Feature] Streaming structured generation with partial validation. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1856

2. Configuration risk: 📝 Integration Proposal: CAJAL — Structured Scientific Paper Generation

  • Severity: high
  • Finding: Configuration risk is backed by a source signal: 📝 Integration Proposal: CAJAL — Structured Scientific Paper Generation. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1859

3. Maintenance risk: Add more custom types

  • Severity: high
  • Finding: Maintenance risk is backed by a source signal: Add more custom types. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1303

4. Security or permission risk: Add function calling and MCP support

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: Add function calling and MCP support. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1626

5. Security or permission risk: [Feature Request] Add streaming support for structured generation

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: [Feature Request] Add streaming support for structured generation. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1842

6. Installation risk: Feature request: OWASP ASI06 memory poisoning defense for structured generation

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: Feature request: OWASP ASI06 memory poisoning defense for structured generation
  • User impact: Developers may fail before the first successful local run: Feature request: OWASP ASI06 memory poisoning defense for structured generation
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Feature request: OWASP ASI06 memory poisoning defense for structured generation. Context: Observed when using python
  • Evidence: failure_mode_cluster:github_issue | fmev_aafbb33fe2e219639553f4d4275e0223 | https://github.com/dottxt-ai/outlines/issues/1864 | Feature request: OWASP ASI06 memory poisoning defense for structured generation

7. Installation risk: Incompatibility with vllm==0.19 because of some api changes

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: Incompatibility with vllm==0.19 because of some api changes
  • User impact: Developers may fail before the first successful local run: Incompatibility with vllm==0.19 because of some api changes
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Incompatibility with vllm==0.19 because of some api changes. Context: Observed when using python, cuda
  • Evidence: failure_mode_cluster:github_issue | fmev_9f23e49bc91e3f8af003ddcdedec3e72 | https://github.com/dottxt-ai/outlines/issues/1854 | Incompatibility with vllm==0.19 because of some api changes

8. Installation risk: Outlines v1.2.6

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: Outlines v1.2.6
  • User impact: Upgrade or migration may change expected behavior: Outlines v1.2.6
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Outlines v1.2.6. Context: Observed during installation or first-run setup.
  • Evidence: failure_mode_cluster:github_release | fmev_e917f6640a48bc54b76cbbbfcfd2b346 | https://github.com/dottxt-ai/outlines/releases/tag/1.2.6 | Outlines v1.2.6

9. Installation risk: Outlines v1.2.8

  • Severity: medium
  • Finding: Developers should check this installation risk before relying on the project: Outlines v1.2.8
  • User impact: Upgrade or migration may change expected behavior: Outlines v1.2.8
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Outlines v1.2.8. Context: Observed when using python
  • Evidence: failure_mode_cluster:github_release | fmev_802eb50b3a54cd87f585ac14e899b4bc | https://github.com/dottxt-ai/outlines/releases/tag/1.2.8 | Outlines v1.2.8

10. Installation risk: Feature request: OWASP ASI06 memory poisoning defense for structured generation

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: Feature request: OWASP ASI06 memory poisoning defense for structured generation. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1864

11. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:615403340 | https://github.com/dottxt-ai/outlines | README/documentation is current enough for a first validation pass.

12. Maintenance risk (migration): Outlines v1.2.10

  • Severity: medium
  • Finding: Developers should check this migration risk before relying on the project: Outlines v1.2.10
  • User impact: Upgrade or migration may change expected behavior: Outlines v1.2.10
  • Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Outlines v1.2.10. Context: Observed when using python
  • Evidence: failure_mode_cluster:github_release | fmev_75fc0fce3c200ef68083c6815dfb1b11 | https://github.com/dottxt-ai/outlines/releases/tag/1.2.10 | Outlines v1.2.10

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources: 12 (count of project-level external discussion links exposed on this manual page)

Use: Review before install. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using outlines with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence