Doramagic Project Pack · Human Manual
outlines
Introduction to Outlines
Related topics: System Architecture, Quickstart Guide
Outlines is a Python library that enables structured text generation with Large Language Models (LLMs). It guarantees that model outputs conform to a specified structure during generation, eliminating the need for post-processing, regex parsing, or fragile code that breaks easily. Source: README.md
What Problem Does Outlines Solve?
LLMs are powerful but produce unpredictable outputs. Traditional approaches attempt to fix bad outputs after generation using parsing and regex, which is fragile and breaks easily. Outlines takes a different approach by ensuring structured outputs during generation rather than after. Source: README.md
The core philosophy follows Python's own type system pattern: simply specify the desired output type, and Outlines ensures the generated data matches that structure exactly. Source: README.md
Core Philosophy
Outlines follows a simple pattern that mirrors Python's own type system:
- For yes/no responses, use Literal["Yes", "No"]
- For numerical values, use int
- For complex objects, define a structure with a Pydantic model
This type-driven API makes structured generation feel natural to Python developers. Source: README.md
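One way to picture the type-driven idea is as a mapping from a Python type to a pattern that every valid output must match. The sketch below is a simplified illustration and not Outlines' implementation: type_to_pattern and conforms are hypothetical helpers, and only Literal and int are handled.

```python
import re
from typing import Literal, get_args, get_origin


def type_to_pattern(output_type) -> str:
    """Hypothetical helper: map a small set of Python types to a regex."""
    if get_origin(output_type) is Literal:
        # A Literal becomes an alternation of its escaped choices.
        return "|".join(re.escape(str(choice)) for choice in get_args(output_type))
    if output_type is int:
        # An optional sign followed by one or more digits.
        return r"[+-]?\d+"
    raise TypeError(f"unsupported type in this sketch: {output_type!r}")


def conforms(text: str, output_type) -> bool:
    """Check whether a complete string matches the pattern for the type."""
    return re.fullmatch(type_to_pattern(output_type), text) is not None


print(conforms("Yes", Literal["Yes", "No"]))    # True
print(conforms("Maybe", Literal["Yes", "No"]))  # False
print(conforms("100", int))                     # True
```

The sketch only checks finished strings. Outlines applies the constraint while tokens are being generated, so a non-conforming string is never produced in the first place.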
Key Features
Universal Model Support
Outlines works with any model provider with minimal code changes:
| Model Type | Description | Documentation |
|---|---|---|
| Server Support | vLLM and Ollama | Server Integrations |
| Local Model Support | transformers and llama.cpp | Model Integrations |
| API Support | OpenAI, Gemini, and Dottxt | API Integrations |
Source: README.md
Guaranteed Valid Structure
Outlines provides several key guarantees:
- Works with any model - Same code runs across OpenAI, Ollama, vLLM, and more
- Simple integration - Just pass your desired output type: model(prompt, output_type)
- Guaranteed valid structure - No more parsing headaches or broken JSON
- Provider independence - Switch models without changing code
Source: README.md
Architecture Overview
Layer Stack
Outlines follows a multi-layered architecture that transforms Python types into generation constraints:
User API (outlines.models)
↓
Generator Classes (SteerableGenerator, BlackBoxGenerator)
↓
Type System (types/dsl.py: Pydantic → JsonSchema → Regex)
↓
FSM Compilation (outlines-core: regex → FSM via interegular)
↓
Guide System (processors/guide.py: FSM state management)
↓
Logits Processing (processors/structured.py: token masking)
↓
Model Providers (transformers, OpenAI, etc.)
Source: llm.txt
Key Design Decisions
- FSM-based constraints: For local models, constraints compile to finite state machines that track valid next tokens
- Provider abstraction: Same constraint system works across local models (transformers) and APIs (OpenAI)
- Lazy compilation: FSMs are compiled on first use and cached persistently
- Token-level control: Constraints apply at the token level, not character level
- Type-driven API: Python types are the primary interface for specifying constraints
Source: llm.txt
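The first and fourth design decisions (state tracking and token-level masking) can be illustrated with a toy decoder. This is a sketch under simplifying assumptions, not the library's code: the vocabulary, the biased_scores stand-in model, and all function names are hypothetical, and the "state" is simply the text generated so far rather than a compiled FSM state.

```python
import math

# Toy vocabulary; real tokenizers have tens of thousands of entries.
VOCAB = ["Pos", "Neg", "Neu", "itive", "ative", "tral", "Maybe", "!"]
CHOICES = ["Positive", "Negative", "Neutral"]


def allowed_token_ids(generated: str) -> list[int]:
    """Return ids of tokens that keep `generated` a prefix of some choice."""
    return [
        i for i, tok in enumerate(VOCAB)
        if any(choice.startswith(generated + tok) for choice in CHOICES)
    ]


def mask_logits(logits: list[float], allowed: list[int]) -> list[float]:
    """Set the score of every disallowed token to -inf."""
    keep = set(allowed)
    return [x if i in keep else -math.inf for i, x in enumerate(logits)]


def constrained_decode(score_fn) -> str:
    """Greedy decoding where each step can only pick an allowed token."""
    text = ""
    while text not in CHOICES:
        masked = mask_logits(score_fn(text), allowed_token_ids(text))
        text += VOCAB[max(range(len(VOCAB)), key=masked.__getitem__)]
    return text


def biased_scores(text: str) -> list[float]:
    """Stand-in model that always prefers the invalid token 'Maybe'."""
    return [0.1, 0.3, 0.2, 0.5, 0.4, 0.6, 9.0, 1.0]


print(constrained_decode(biased_scores))  # Negative
```

Even though the stand-in model scores "Maybe" highest at every step, masking removes it from consideration, so the decoder can only emit one of the three permitted strings.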
Model Class Hierarchy
Outlines distinguishes between two types of model implementations:
graph TD
BaseModel[BaseModel]
SteerableModel[SteerableModel - Controls logits]
BlackBoxModel[BlackBoxModel - Uses provider's structured output]
BaseModel --> SteerableModel
BaseModel --> BlackBoxModel
SteerableModel --> Transformers[Transformers]
SteerableModel --> LlamaCpp[LlamaCpp]
BlackBoxModel --> OpenAI[OpenAI]
BlackBoxModel --> Gemini[Gemini]
BlackBoxModel --> Anthropic[Anthropic]
Source: llm.txt
Getting Started
Installation
Install Outlines using pip:
pip install outlines
Source: README.md
Basic Usage
#### 1. Connect to a Model
import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM
MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
AutoTokenizer.from_pretrained(MODEL_NAME)
)
Source: README.md
#### 2. Simple Structured Outputs
from typing import Literal
from pydantic import BaseModel
# Simple classification
sentiment = model(
"Analyze: 'This product completely changed my life!'",
Literal["Positive", "Negative", "Neutral"]
)
print(sentiment) # "Positive"
# Extract specific types
temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature) # 100
Source: README.md
#### 3. Complex Structures with Pydantic
from pydantic import BaseModel
from enum import Enum
class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4
class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str
Source: README.md
Using Templates
Outlines supports Jinja-based templates for dynamic prompt generation:
import outlines
from typing import List, Literal
from transformers import AutoTokenizer, AutoModelForCausalLM
MODEL_NAME = "microsoft/phi-4"
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
AutoTokenizer.from_pretrained(MODEL_NAME)
)
# Create a reusable template with Jinja syntax
sentiment_template = outlines.Template.from_string("""
<|im_start|>user
Analyze the sentiment of the following {{ content_type }}:
{{ text }}
Provide your analysis as either "Positive", "Negative", or "Neutral".
<|im_end|>
<|im_start|>assistant
""")
# Generate prompts with different parameters
review = "This restaurant exceeded all my expectations. Fantastic service!"
prompt = sentiment_template(content_type="review", text=review)
# Use with structured generation
result = model(prompt, Literal["Positive", "Negative", "Neutral"])
Source: README.md
Generator Pattern
The Generator class provides a reusable way to apply structured generation:
from outlines import Generator, from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
# Create a generator with a specific output type
# (MyOutputType is a placeholder for any supported type, e.g. a Pydantic model)
generator = Generator(model, MyOutputType)
# Reuse the generator multiple times
result1 = generator("Prompt 1")
result2 = generator("Prompt 2")
The Generator class was introduced in v1 to address the need for reusable generation with a fixed output type, where output type compilation happens only once. Source: outlines/release_note.md
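The benefit of compiling once can be illustrated without a real model. The sketch below is a toy and not Outlines' Generator: ToyGenerator, compile_constraint, and toy_model are hypothetical names, and the constraint is checked after the fact only to keep the example short, whereas Outlines applies constraints during generation.

```python
import re

compile_calls = 0


def compile_constraint(pattern: str) -> re.Pattern:
    """Stand-in for an expensive compilation step (counts invocations)."""
    global compile_calls
    compile_calls += 1
    return re.compile(pattern)


class ToyGenerator:
    """Hypothetical class: compiles the constraint once, at construction."""

    def __init__(self, model, pattern: str):
        self.model = model
        self.constraint = compile_constraint(pattern)

    def __call__(self, prompt: str) -> str:
        output = self.model(prompt)
        if not self.constraint.fullmatch(output):
            raise ValueError(f"output {output!r} violates the constraint")
        return output


def toy_model(prompt: str) -> str:
    """Stand-in model that answers every prompt with 'pizza'."""
    return "pizza"


generator = ToyGenerator(toy_model, r"pizza|burger|tacos")
results = [generator(p) for p in ("Lunch?", "Dinner?", "Snack?")]
print(results, compile_calls)  # ['pizza', 'pizza', 'pizza'] 1
```

Three calls trigger a single compilation, which is the property the Generator class is described as providing for a fixed output type.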
Async Support
Outlines provides comprehensive async support for both generation and streaming:
Async Model Methods
# Direct model calling with async
from pydantic import BaseModel
from outlines import from_openai
from openai import AsyncOpenAI
class Character(BaseModel):
    name: str
model = from_openai(AsyncOpenAI(), "gpt-4o")
# `await` must run inside an async function (or an async-aware REPL)
result = await model("Create a character", Character)
Source: outlines/models/base.py
Async Streaming
async for chunk in model.stream("prompt", OutputType):
    print(chunk)
Source: outlines/models/base.py
Batch Processing
# Batch generation
results = await model.batch(["prompt1", "prompt2"], OutputType)
Source: outlines/models/base.py
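The await and async for statements in the snippets above are only valid inside a coroutine. The following sketch shows the surrounding scaffolding with the standard asyncio module; fake_model is a hypothetical stand-in that sleeps instead of calling a provider, so the example runs without any API key.

```python
import asyncio


async def fake_model(prompt: str) -> str:
    """Stand-in for an awaitable model call; sleeps instead of calling an API."""
    await asyncio.sleep(0.01)
    return f"answer to {prompt!r}"


async def main() -> list[str]:
    prompts = ["prompt1", "prompt2", "prompt3"]
    # gather() runs the coroutines concurrently and preserves input order.
    return await asyncio.gather(*(fake_model(p) for p in prompts))


results = asyncio.run(main())
print(results)
```

With a real async model, the body of main would await the model call in place of fake_model.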
API Model Integration
OpenAI
from openai import OpenAI
from pydantic import BaseModel
from outlines import from_openai
class Character(BaseModel):
    name: str
model = from_openai(OpenAI(), "gpt-4o")
result = model("Create a character", Character)
Source: outlines/release_note.md
Google Gemini
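from google import genai
from outlines import from_gemini
client = genai.Client()  # google-genai client; assumes an API key in the environment
model = from_gemini(client, "gemini-pro")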
Source: outlines/models/gemini.py
Exception Handling
Outlines provides a comprehensive exception hierarchy for error handling:
OutlinesError
├── APIError
│ ├── AuthenticationError
│ ├── PermissionDeniedError
│ ├── NotFoundError
│ ├── RateLimitError
│ ├── BadRequestError
│ ├── ServerError
│ ├── APITimeoutError
│ ├── APIConnectionError
│ └── ProviderResponseError
└── GenerationError
Source: outlines/exceptions.py
API-related exceptions inherit from APIError → OutlinesError → Exception, while GenerationError derives directly from OutlinesError, as the tree above shows. The normalize_provider_exception function converts raw provider SDK exceptions into the appropriate Outlines type. Source: outlines/exceptions.py
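A hierarchy like this lets calling code react to one failure class and let the rest propagate. The sketch below redefines a few of the classes locally so that it runs on its own; in real code they would come from the library. call_with_retry and flaky are hypothetical names used for illustration.

```python
import time


# Local stand-ins that mirror part of the hierarchy above; in real code
# these would be imported from the library rather than redefined.
class OutlinesError(Exception): ...
class APIError(OutlinesError): ...
class RateLimitError(APIError): ...
class AuthenticationError(APIError): ...
class GenerationError(OutlinesError): ...


def call_with_retry(fn, retries: int = 3, delay: float = 0.0):
    """Retry only on rate limits; let every other error propagate."""
    for attempt in range(retries):
        try:
            return fn()
        except RateLimitError:
            if attempt == retries - 1:
                raise
            time.sleep(delay * (2 ** attempt))  # exponential backoff


attempts = []


def flaky():
    """Hypothetical call that is rate-limited twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RateLimitError("slow down")
    return "ok"


print(call_with_retry(flaky))  # ok
```

Catching RateLimitError specifically, rather than APIError, keeps authentication and permission failures from being retried pointlessly.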
Deployment Example
Outlines can be deployed on various platforms. Here's an example using Modal:
import modal
app = modal.App(name="outlines-app")
outlines_image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "outlines==1.0.0",
    "transformers==4.38.2",
    "datasets==2.18.0",
    "accelerate==0.27.2",
)
def import_model():
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model_id = "mistralai/Mistral-7B-Instruct-v0.2"
    _ = AutoTokenizer.from_pretrained(model_id)
    _ = AutoModelForCausalLM.from_pretrained(model_id)
outlines_image = outlines_image.run_function(import_model)
Source: examples/modal_example.py
Migration from v0 to v1
Key changes in the v1 API:
| Feature | v0 | v1 |
|---|---|---|
| Model initialization | models.openai("gpt-4o") | from_openai(OpenAI(), "gpt-4o") |
| Generation | generate.json(model, Character) | Generator(model, Character) |
| Direct calling | N/A | model("prompt", OutputType) |
| Streaming | Separate method | model.stream("prompt", OutputType) |
Source: outlines/release_note.md
Deprecated Features
- Exllamav2 model has been removed due to interface incompatibility
- function module and Function class replaced by Application
- load_lora methods on VLLM and LlamaCpp models deprecated in favor of direct initialization parameters
- TransformersVision replaced by TransformersMultiModal
Source: outlines/release_note.md
See Also
- Model Integrations - Detailed documentation for each model provider
- Generator - Generator class documentation
- Application - Application class for templated prompts
- Templates - Jinja-based template system
Source: https://github.com/dottxt-ai/outlines / Human Manual
Quickstart Guide
Related topics: Introduction to Outlines, Installation, Output Types Overview
This guide provides a comprehensive introduction to Outlines, a structured text generation library for Large Language Models (LLMs). It covers installation, model setup, basic usage patterns, and common workflows to help you get started with guaranteed structured outputs from any LLM.
Overview
Outlines ensures structured outputs during generation—directly from any LLM. Unlike post-processing approaches that parse and validate outputs after generation, Outlines enforces structure constraints at generation time. This eliminates parsing headaches, broken JSON, and fragile regex-based solutions.
Core capabilities:
- Works with any model (OpenAI, Ollama, vLLM, transformers, and more)
- Simple integration using Python type annotations
- Guaranteed valid structure output
- Provider independence for easy model switching
Source: README.md:1-10
Installation
Install Outlines using pip:
pip install outlines
Source: README.md:31
Optional Dependencies
Depending on your model provider, you may need additional packages:
| Provider | Required Dependencies |
|---|---|
| transformers | transformers, accelerate |
| OpenAI | openai |
| Anthropic | anthropic |
| Gemini | google-genai |
| vLLM | vllm |
| Ollama | ollama |
Connecting to Models
Outlines provides factory functions to create model instances from various providers. The from_transformers function initializes a model using Hugging Face transformers.
import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM
MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
AutoTokenizer.from_pretrained(MODEL_NAME)
)
Source: README.md:37-46
Available Model Integrations
Outlines supports multiple model providers through factory functions:
| Model Type | Factory Function | Description |
|---|---|---|
| Local (Transformers) | from_transformers() | Hugging Face transformers models |
| OpenAI | from_openai() | OpenAI API models |
| Anthropic | from_anthropic() | Anthropic Claude models |
| Gemini | from_gemini() | Google Gemini models |
| vLLM | from_vllm() | vLLM server deployment |
| Ollama | from_ollama() | Ollama local server |
| llama.cpp | from_llamacpp() | llama.cpp based models |
Source: llm.txt:1-20
Basic Structured Generation
Simple Classification
Use Literal types for classification tasks with predefined categories:
from typing import Literal
# Simple classification
sentiment = model(
"Analyze: 'This product completely changed my life!'",
Literal["Positive", "Negative", "Neutral"]
)
print(sentiment) # "Positive"
Source: README.md:55-62
Numerical Values
Generate structured numerical outputs by passing Python types:
# Extract numerical values
temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature) # 100
Source: README.md:64-68
State Flow for Basic Generation
graph TD
A[User Prompt + Type] --> B[Outlines Model Call]
B --> C{Output Type}
C -->|Literal| D[Enum FSM Compilation]
C -->|int/float| E[Number FSM Compilation]
C -->|Pydantic| F[JSON Schema FSM Compilation]
D --> G[Token Masking]
E --> G
F --> G
G --> H[Constrained Generation]
H --> I[Valid Structured Output]
Complex Structures with Pydantic
For complex objects, define a structure using Pydantic models:
from pydantic import BaseModel
from enum import Enum
class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4
class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str
# Generate structured review
review = model(
    "Review a smartphone with great camera but poor battery life",
    ProductReview
)
Source: README.md:70-88
Enum Types
Enums constrain outputs to specific string values:
from enum import Enum
from pydantic import BaseModel
class EventType(str, Enum):
    conference = "conference"
    webinar = "webinar"
    workshop = "workshop"
    meetup = "meetup"
    other = "other"
class EventInfo(BaseModel):
    name: str
    event_type: EventType
    topics: list[str]
Source: README.md:1-50
Prompt Templates
Outlines supports Jinja-based templates for dynamic prompt generation:
# Create a reusable template with Jinja syntax
sentiment_template = outlines.Template.from_string("""
<|im_start|>user
Analyze the sentiment of the following {{ content_type }}:
{{ text }}
Provide your analysis as either "Positive", "Negative", or "Neutral".
<|im_end|>
<|im_start|>assistant
""")
# Generate prompts with different parameters
review = "This restaurant exceeded all my expectations. Fantastic service!"
prompt = sentiment_template(content_type="review", text=review)
# Use with structured generation
result = model(prompt, Literal["Positive", "Negative", "Neutral"])
Source: README.md:1-50
Loading Templates from Files
Templates can be loaded from external files for better organization:
# Load template from file
example_template = outlines.Template.from_file("templates/few_shot.txt")
# Use with examples for few-shot learning
examples = [
("The food was cold", "Negative"),
("The staff was friendly", "Positive")
]
few_shot_prompt = example_template(examples=examples, query="Service was slow")
Source: README.md:1-50
Handling Incomplete Data with Union Types
Use Union types to handle cases where data might be incomplete:
from typing import Literal, Union
# Create a union type that can either be a structured response or fallback
EventResponse = Union[EventInfo, Literal["I don't know"]]
# Parse event details - returns EventInfo or "I don't know"
result = model(
"Join us for DevCon 2024 in San Francisco on March 15th",
EventResponse
)
Source: README.md:1-50
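Code that consumes a union result has to branch on which alternative came back. The sketch below assumes the call yields raw text, either the fallback string or a JSON object. That is an assumption to verify against the version you run. EventInfoSketch and interpret are hypothetical names, and a dataclass stands in for the Pydantic model so the example needs only the standard library.

```python
import json
from dataclasses import dataclass

FALLBACK = "I don't know"


@dataclass
class EventInfoSketch:
    """Hypothetical stand-in for the EventInfo model above."""
    name: str
    event_type: str
    topics: list[str]


def interpret(raw: str):
    """Return the fallback string unchanged, or parse JSON into a record."""
    if raw.strip().strip('"') == FALLBACK:
        return FALLBACK
    return EventInfoSketch(**json.loads(raw))


print(interpret('{"name": "DevCon 2024", "event_type": "conference", "topics": ["dev"]}'))
print(interpret("I don't know"))
```

Downstream code can then use isinstance(result, EventInfoSketch) to decide whether structured data is available.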
Using the Generator Class
The Generator class encapsulates a model with a specific output type, allowing reusable structured generation:
from outlines import Generator, from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
from typing import Literal
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
# Create a reusable generator
choice_generator = Generator(model, Literal["pizza", "burger", "tacos"])
# Use the generator multiple times
result1 = choice_generator("What should I eat for lunch?")
result2 = choice_generator("Dinner options?")
Source: outlines/release_note.md:1-50
Generator vs Direct Model Calling
| Aspect | Direct Model Call | Generator |
|---|---|---|
| Output type | Specified per call | Fixed at initialization |
| Compilation | Occurs each call | Occurs once |
| Reusability | Single use | Multiple uses |
| Best for | Varying output types | Consistent output types |
Source: outlines/release_note.md:1-50
Function Calling with Applications
The Application class provides a way to define functions with typed parameters that LLMs can call:
from outlines import Application, Template, from_transformers
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
from typing import List, Optional
from datetime import date
# Define a function with typed parameters
def schedule_meeting(
    title: str,
    date: date,
    duration_minutes: int,
    attendees: List[str],
    location: Optional[str] = None,
    agenda_items: Optional[List[str]] = None
):
    """Schedule a meeting with the specified details"""
    meeting = {
        "title": title,
        "date": date,
        "duration_minutes": duration_minutes,
        "attendees": attendees,
        "location": location,
        "agenda_items": agenda_items
    }
    return f"Meeting '{title}' scheduled for {date}"
# Create model and template
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/phi-4"),
AutoTokenizer.from_pretrained("microsoft/phi-4")
)
template = Template.from_string("""
Extract meeting details from: {{ request }}
""")
# Create application
app = Application(template, schedule_meeting)
# Natural language request
user_request = """
I need to set up a team sync next Monday at 2pm for 30 minutes.
Include John and Sarah. We'll discuss the Q1 roadmap.
"""
# Execute
result = app(model, {"request": user_request})
Source: README.md:1-50
Application Pattern Workflow
graph TD
A[Natural Language Request] --> B[Application]
B --> C[Template Variables]
C --> D[Structured Prompt]
D --> E[LLM Generation]
E --> F[Function Schema]
F --> G[Parameter Extraction]
G --> H[Typed Function Call]
H --> I[Structured Result]
Generation Parameters
Pass additional inference arguments to model calls:
# Beam search for better quality
result = model(
"Write a short story about a cat",
str,
num_beams=2
)
# Streaming responses
for chunk in model.stream("Tell me a joke", str):
    print(chunk, end="", flush=True)
Source: outlines/release_note.md:1-50
Deployment Examples
Modal Deployment
Outlines can be deployed on Modal for serverless inference:
import modal
app = modal.App(name="outlines-app")
outlines_image = modal.Image.debian_slim(python_version="3.11").pip_install(
    "outlines==1.0.0",
    "transformers==4.38.2",
    "datasets==2.18.0",
    "accelerate==0.27.2",
)
def import_model():
    from transformers import AutoModelForCausalLM, AutoTokenizer
    model_id = "mistralai/Mistral-7B-Instruct-v0.2"
    _ = AutoTokenizer.from_pretrained(model_id)
    _ = AutoModelForCausalLM.from_pretrained(model_id)
outlines_image = outlines_image.run_function(import_model)
Source: examples/modal_example.py:1-30
Next Steps
| Topic | Description |
|---|---|
| Installation Guide | Detailed installation instructions for all providers |
| Model Integrations | Complete reference for all supported models |
| Output Types | Deep dive into type system and constraints |
| Templates | Advanced template usage and patterns |
| Applications | Building reusable structured applications |
| Architecture | Understanding the internal design |
Quick Reference
# Complete minimal example
import outlines
from transformers import AutoModelForCausalLM, AutoTokenizer
from pydantic import BaseModel
# 1. Load model
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
# 2. Define output structure
class Answer(BaseModel):
    response: str
    confidence: float
# 3. Generate with structure guarantee
result = model("What is the capital of France?", Answer)
This quickstart covers the essential patterns for using Outlines. The library's type-driven approach ensures that outputs always match your specified structure, eliminating the need for fragile post-processing.
Installation
Related topics: Quickstart Guide, Structured Generation Backends
This guide covers all methods for installing Outlines, a structured text generation library for Large Language Models.
Overview
Outlines provides structured output generation for LLMs by ensuring outputs match specified types during generation. The installation process is straightforward via pip, with optional dependencies for specific model backends. Source: README.md:1-10
Prerequisites
Python Version
Outlines requires Python 3.10 or later. Ensure your environment has an appropriate Python installation before proceeding.
Core Dependencies
The following table lists the core dependencies required by Outlines:
| Package | Version | Purpose |
|---|---|---|
| interegular | Latest | FSM compilation for regex-based constraints |
| jinja2 | Latest | Template processing |
| pydantic | 2.x | Data validation and structure definitions |
Installation Methods
Standard Installation (pip)
The simplest way to install Outlines is via pip:
pip install outlines
Source: README.md:15-18
Development Installation
For contributors or those wanting the latest unreleased features, install from source:
git clone https://github.com/dottxt-ai/outlines.git
cd outlines
pip install -e ".[dev]"
Optional Dependencies by Model Backend
Outlines supports multiple model providers. Install backend-specific dependencies based on your use case.
Hugging Face Transformers
For local model inference using Hugging Face Transformers:
pip install outlines[transformers]
Required packages:
| Package | Purpose |
|---|---|
| transformers | Model loading and inference |
| accelerate | GPU acceleration support |
| datasets | Dataset utilities |
| torch | Deep learning framework |
Source: examples/modal_example.py:5-9
import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM
MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
AutoTokenizer.from_pretrained(MODEL_NAME)
)
OpenAI Models
For OpenAI API integration:
pip install outlines[openai]
Required packages:
| Package | Purpose |
|---|---|
| openai | Official OpenAI Python client |
Source: README.md:20-35
from openai import OpenAI
from pydantic import BaseModel
from outlines import from_openai
class Character(BaseModel):
    name: str
model = from_openai(OpenAI(), "gpt-4o")
result = model("Create a character", Character)
Anthropic Models
For Anthropic Claude integration:
pip install outlines[anthropic]
Google Gemini Models
pip install outlines[gemini]
Source: outlines/models/gemini.py:80-95
Local Model Backends
For running models locally via vLLM, llama.cpp, or SGLang:
pip install outlines[vllm] # For vLLM backend
pip install outlines[sglang] # For SGLang backend
Source: llm.txt:30-50
Async Support
For asynchronous inference with async model backends:
pip install outlines[async]
The async backends available are:
| Backend | Class | Purpose |
|---|---|---|
| AsyncSGLang | AsyncSGLang | Async SGLang inference |
| AsyncTGI | AsyncTGI | Async Text Generation Inference |
| AsyncVLLM | AsyncVLLM | Async vLLM inference |
Source: release_note.md:45-60
import outlines
from huggingface_hub import AsyncInferenceClient
async_model = outlines.from_tgi(AsyncInferenceClient("http://localhost:11434"))
Quick Start After Installation
Once installed, you can begin using Outlines immediately:
import outlines
from transformers import AutoTokenizer, AutoModelForCausalLM
# Load a model
MODEL_NAME = "microsoft/Phi-3-mini-4k-instruct"
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained(MODEL_NAME, device_map="auto"),
AutoTokenizer.from_pretrained(MODEL_NAME)
)
# Generate structured output
from typing import Literal
sentiment = model(
"Analyze: 'This product completely changed my life!'",
Literal["Positive", "Negative", "Neutral"]
)
Source: README.md:40-60
GPU Setup
For optimal performance with local models, configure GPU acceleration:
# Device mapping for multi-GPU setups
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained(
MODEL_NAME,
device_map="auto"
),
AutoTokenizer.from_pretrained(MODEL_NAME)
)
The device_map="auto" parameter enables automatic GPU allocation across available devices.
Verifying Installation
Verify your installation by running:
import outlines
print(outlines.__version__) # Check installed version
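A slightly broader check can also report which optional backends are importable in the current environment. This is a sketch using only the standard library; installed_version and backend_status are hypothetical helper names, and the module list is taken from the provider table in this manual. The Gemini client is left out because it lives under a namespace package and needs a different check.

```python
from importlib import metadata, util

# Import names taken from the provider table in this manual; adjust as needed.
BACKENDS = ["transformers", "openai", "anthropic", "vllm", "ollama"]


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def backend_status(names=BACKENDS) -> dict[str, bool]:
    """Map each top-level module name to whether it can be imported here."""
    return {name: util.find_spec(name) is not None for name in names}


print("outlines:", installed_version("outlines"))
for name, available in backend_status().items():
    print(f"{name}: {'available' if available else 'missing'}")
```

Because find_spec only locates modules without importing them, the check stays fast even when heavy packages such as transformers are installed.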
Common Installation Issues
Missing Dependencies
If you encounter import errors, ensure all required dependencies are installed for your specific use case. Reinstall with the appropriate extras:
pip install --upgrade outlines[<backend>]
CUDA/GPU Compatibility
For CUDA support with transformers, ensure accelerate is installed:
pip install accelerate
Version Conflicts
If upgrading from v0.x to v1.x, note the following breaking changes:
| v0.x | v1.x |
|---|---|
| models.openai("gpt-4o") | from_openai(OpenAI(), "gpt-4o") |
| generate.json(model, schema) | model(prompt, schema) |
| Function class | Application class |
Source: release_note.md:100-130
Next Steps
After installation, explore these topics:
- Quick Start Guide - Generate your first structured output
- Model Integrations - Configure different model backends
- Structured Generation - Learn about type-constrained generation
Migration Guide
Related topics: Introduction to Outlines, Quickstart Guide
This guide provides comprehensive documentation for migrating from Outlines v0 to v1. The v1 release introduces significant API changes that improve consistency, usability, and maintainability while preserving the core functionality of structured text generation with LLMs.
Overview
Outlines v1 represents a major evolution of the library's architecture, focusing on:
- Unified Model Interface: All model providers now share a consistent calling pattern
- Simplified Output Type Handling: The Generator class replaces multiple specialized generate functions
- Type-driven API: Python types remain the primary interface for specifying constraints
- Enhanced Streaming: All models now support streaming as a first-class feature
Source: outlines/release_note.md:1-50
High-Level Architecture Changes
The following diagram illustrates the architectural changes between v0 and v1:
graph TD
subgraph v0_Architecture
A0[User Code] --> B0[models]
B0 --> C0[generate.json/choice/...]
C0 --> D0[Generator with fixed output type]
end
subgraph v1_Architecture
A1[User Code] --> B1[from_transformers/from_openai/...]
B1 --> C1[Model Instance]
A1 --> D1[Generator Model, OutputType]
D1 --> E1[Reusable Generator]
end
style v0_Architecture fill:#ffcccc
style v1_Architecture fill:#ccffcc
Migration by Component
Model Initialization
#### Transformers Models
| Aspect | v0 | v1 |
|---|---|---|
| Entry point | models.transformers() | outlines.from_transformers() |
| Model loading | Inline with Outlines initialization | Separately via HuggingFace |
| Tokenizer | Passed to Outlines | Explicitly loaded and passed |
| Configuration | Scattered across model_kwargs | Standard HuggingFace initialization |
v0 (Deprecated):
from outlines import models
from transformers import BertForSequenceClassification, BertTokenizer
model = models.transformers(
model_name="prajjwal1/bert-tiny",
model_class=BertForSequenceClassification,
tokenizer_class=BertTokenizer,
model_kwargs={"use_cache": False},
tokenizer_kwargs={"model_max_length": 512},
)
v1 (Current):
import outlines
from transformers import BertForSequenceClassification, BertTokenizer
hf_model = BertForSequenceClassification.from_pretrained(
"prajjwal1/bert-tiny",
use_cache=False
)
hf_tokenizer = BertTokenizer.from_pretrained(
"prajjwal1/bert-tiny",
model_max_length=512
)
model = outlines.from_transformers(hf_model, hf_tokenizer)
Source: outlines/release_note.md:45-60
#### OpenAI Models
| Aspect | v0 | v1 |
|---|---|---|
| Init signature | OpenAI(client, OpenAIConfig()) | OpenAI(client, model_name) |
| Inference args | In OpenAIConfig | In model call |
| Recommendation | Direct initialization | Use from_openai() |
v0 (Deprecated):
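from outlines import generate, models
# `config` is an OpenAIConfig instance carrying the inference arguments
model = models.openai("gpt-4o", config)
generator = generate.json(model, Character)
result = generator("Create a character")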
v1 (Current):
from openai import OpenAI
from outlines import from_openai
client = OpenAI()
model = from_openai(client, "gpt-4o")
result = model("Create a character", Character)
Source: outlines/release_note.md:65-80
Generation API Changes
#### Generator Class Introduction
The Generator class provides a reusable interface where the output type is compiled only once:
classDiagram
class Generator {
+model: Model
+output_type: OutputType
+__init__(model, output_type)
+__call__(prompt, **kwargs) Any
+stream(prompt, **kwargs) Iterator
}
class Model {
<<interface>>
+__call__(prompt, output_type, **kwargs) Any
+stream(prompt, output_type, **kwargs) Iterator
}
class OutputType {
<<union>>
}
Generator --> Model : uses
Generator --> OutputType : compiles
Usage Pattern:
from outlines import Generator, from_transformers
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
class Character(BaseModel):
    name: str
    age: int
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
# Create a reusable generator
generator = Generator(model, Character)
# Use multiple times without recompiling the output type
result1 = generator("Create a hero character")
result2 = generator("Create a villain character")
Source: outlines/release_note.md:25-45
#### Direct Model Calling
All models can now be called directly with a prompt and output type:
from typing import Literal
from outlines import from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
# Direct calling with output type
result = model("Pizza or burger", Literal["pizza", "burger"])
# Streaming support
for chunk in model.stream("Tell me a story", str):
    print(chunk, end="", flush=True)
Source: outlines/release_note.md:18-25
Function to Application Migration
The Function class has been deprecated in favor of the Application class:
| Aspect | v0 (Function) | v1 (Application) |
|---|---|---|
| Model binding | At initialization | At call time |
| Template variables | As **kwargs | As dictionary |
| Reusability | Single model/output type | Multiple models supported |
v0 (Deprecated):
from pydantic import BaseModel
from outlines import Function, Template
class Character(BaseModel):
    name: str
template = Template.from_string("Create a {{ gender }} character.")
fn = Function(template, Character, "hf-internal-testing/tiny-random-GPTJForCausalLM")
response = fn(gender="female")
v1 (Current):
from pydantic import BaseModel
from outlines import Application, Template, from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
class Character(BaseModel):
    name: str
model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
template = Template.from_string("Create a {{ gender }} character.")
app = Application(template, Character)
response = app(model, {"gender": "female"})
Source: outlines/release_note.md:80-95
Generate Module Deprecation
The generate module and its specialized functions have been consolidated:
| v0 Function | v1 Equivalent |
|---|---|
| generate.json() | Generator(model, PydanticModel) |
| generate.choice() | Generator(model, Literal["A", "B"]) |
| generate.regex() | Generator(model, Regex(pattern)), with Regex imported from outlines.types |
v0 (Deprecated):
from pydantic import BaseModel
from outlines import generate, models
class Character(BaseModel):
    name: str
model = models.openai("gpt-4o")
generator = generate.json(model, Character)
result = generator("Create a character")
v1 (Current):
from openai import OpenAI
from pydantic import BaseModel
from outlines import Generator, from_openai
class Character(BaseModel):
    name: str
client = OpenAI()
model = from_openai(client, "gpt-4o")
generator = Generator(model, Character)
result = generator("Create a character")
Source: outlines/release_note.md:95-110
Async Model Support
v1 introduces new async model providers for asynchronous inference:
| Model | Factory Function | Description |
|---|---|---|
| AsyncSGLang | from_sglang() | SGLang async backend |
| AsyncTGI | from_tgi() | Text Generation Inference |
| AsyncVLLM | from_vllm() | vLLM async backend |
Usage:
import outlines
from huggingface_hub import AsyncInferenceClient
async_model = outlines.from_tgi(AsyncInferenceClient("http://localhost:11434"))
Source: outlines/release_note.md:10-18
Deprecated Features
Exllamav2 Model
The Exllamav2 model has been deprecated without replacement:
- Reason: Interface incompatibility with Outlines' constraint system
- Impact: Required cumbersome runtime patching
- Action: Migrate to supported local inference backends (transformers, llama.cpp, vLLM)
Quick Reference
Import Changes
| Old Import | New Import |
|---|---|
| from outlines import models | from outlines import from_transformers, from_openai, ... |
| from outlines import generate | from outlines import Generator |
| from outlines import Function | from outlines import Application |
Common Migration Patterns
# JSON output - v0 to v1
# v0: generate.json(model, MySchema)
# v1: Generator(model, MySchema)
# Choice selection - v0 to v1
# v0: generate.choice(model, ["option1", "option2"])
# v1: model(prompt, Literal["option1", "option2"])
# Streaming - v0 to v1
# v0: generator = generate.json(model, Schema); result = generator.stream(prompt)
# v1: for chunk in model.stream(prompt, Schema): process(chunk)
Documentation References
For additional information, consult the following resources:
Source: https://github.com/dottxt-ai/outlines / Human Manual
System Architecture
Related topics: Structured Generation Backends, Core Concepts
Overview
Outlines is a structured text generation library for Large Language Models (LLMs). Its core architectural purpose is to guarantee that LLM outputs conform to developer-specified schemas and constraints during generation, rather than attempting to parse and fix invalid outputs after generation.
The architecture achieves this through a multi-layered system that compiles output type specifications into finite state machines (FSMs), which are then used to constrain token selection at the logits processing level. This approach provides provider independence, allowing the same constraint system to work across both local models (where logits can be directly controlled) and API-based models (where structured output APIs are leveraged when available).
Source: llm.txt
Layer Stack Architecture
Outlines follows a strict layered architecture where each layer has a specific responsibility and communicates with adjacent layers through well-defined interfaces.
graph TD
A["User API<br/>(outlines.models)"] --> B["Generator Classes<br/>(SteerableGenerator, BlackBoxGenerator)"]
B --> C["Type System<br/>(types/dsl.py)"]
C --> D["FSM Compilation<br/>(outlines-core: regex → FSM)"]
D --> E["Guide System<br/>(processors/guide.py)"]
E --> F["Logits Processing<br/>(processors/structured.py)"]
F --> G["Model Providers<br/>(transformers, OpenAI, etc.)"]
style A fill:#e1f5ff
style G fill:#fff3e1
style C fill:#e8f5e9
style D fill:#f3e5f5
Layer Descriptions
| Layer | File Location | Purpose |
|---|---|---|
| User API | outlines/models/ | Entry point for model initialization and generation calls |
| Generator Classes | outlines/generator.py | Manages reusable generation with cached FSM compilation |
| Type System | outlines/types/dsl.py | Converts Python types and Pydantic models to JSON Schema and Regex |
| FSM Compilation | outlines-core | Transforms regex patterns into finite state machines via interegular |
| Guide System | processors/guide.py | Manages FSM state transitions during token generation |
| Logits Processing | processors/structured.py | Masks invalid tokens by modifying logits before sampling |
| Model Providers | outlines/models/*.py | Provider-specific adapters for different LLM backends |
Source: llm.txt
Core Components
Model Classes
The model layer defines two abstract base classes that reflect the fundamental difference in how Outlines interacts with LLM backends: whether or not it can control sampling directly.
#### SteerableModel
SteerableModel is the base class for models where Outlines has direct control over the sampling process. This includes:
- Local models via the Transformers backend
- llama.cpp-based models via the Llamacpp backend
- MLX-accelerated models via the MlxLm backend
For steerable models, Outlines can apply logits processors that mask invalid tokens based on the compiled FSM, ensuring constraint satisfaction at generation time.
Source: outlines/models/base.py:1-50
#### BlackBoxModel
BlackBoxModel is the base class for API-based models where the model provider controls the generation process. This includes:
- OpenAI models
- Anthropic models
- Google Gemini models
- Ollama (when used as an API)
For black box models, Outlines leverages provider-specific structured output APIs when available, or falls back to prompting strategies. The constraint system cannot directly mask tokens, so the approach adapts based on provider capabilities.
Source: llm.txt
Generator System
The Generator class provides a reusable abstraction for generation that encapsulates both the model and output type specification.
from outlines import Generator, from_transformers
from pydantic import BaseModel
class Character(BaseModel):
    name: str
    age: int
model = from_transformers(...)
generator = Generator(model, Character)
# FSM compilation happens once
result = generator("Create a character")
Key benefits of the Generator abstraction:
- Lazy Compilation: FSMs are compiled on first use and cached persistently
- Reusability: The same generator can be called multiple times without re-specifying the output type
- Separation of Concerns: Model configuration and output type specification are decoupled
Source: outlines/generator.py Source: outlines/release_note.md
Async Model Support
All model classes inherit from AsyncModelMixin, providing consistent async interfaces across all providers:
async def __call__(self, model_input, output_type=None, backend=None, **inference_kwargs)
async def batch(self, model_inputs, output_type=None, backend=None, **inference_kwargs)
async def stream(self, model_input, output_type=None, backend=None, **inference_kwargs)
Source: outlines/models/base.py
Provider Abstraction
The architecture uses a factory pattern with provider-specific adapter classes that handle input and output format conversion.
graph LR
A[User Code] --> B["from_transformers() / from_openai() / etc."]
B --> C["Model Instance<br/>(Transformers / OpenAI / etc.)"]
C --> D["Provider Adapter"]
D --> E["Generation Method"]
style B fill:#e8f5e9
Supported Providers
| Provider | Factory Function | Model Class | Control Type |
|---|---|---|---|
| Hugging Face Transformers | from_transformers() | Transformers | Steerable |
| OpenAI | from_openai() | OpenAI | BlackBox |
| Anthropic | from_anthropic() | Anthropic | BlackBox |
| Google Gemini | from_gemini() | Gemini | BlackBox |
| Ollama | from_ollama() | Ollama | BlackBox |
| llama.cpp | from_llamacpp() | Llamacpp | Steerable |
| MLX-LM | from_mlxlm() | MlxLm | Steerable |
| vLLM | from_vllm() | VLLM | Steerable |
| SGLang | from_sglang() | SGLang | Steerable |
Source: outlines/models/gemini.py
FSM Compilation Pipeline
The FSM (Finite State Machine) compilation is the core mechanism that enables structured generation for steerable models.
graph LR
A["Python Type<br/>(Pydantic, Literal, etc.)"] --> B["JSON Schema"]
B --> C["Regex Pattern"]
C --> D["FSM<br/>(via interegular)"]
D --> E["Token Mask"]
style A fill:#e1f5ff
style E fill:#fff3e1
Type System Transformations
The type system in outlines/types/dsl.py handles the conversion pipeline:
- Python Types → JSON Schema: Pydantic models and Python types are converted to JSON Schema
- JSON Schema → Regex: Complex types are converted to regex patterns
- Regex → FSM: The outlines-core library (using interegular) compiles regex to finite state machines
Source: llm.txt
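The first steps of this pipeline can be illustrated with a small sketch. It is not the code in outlines/types/dsl.py: the function name type_to_regex and the patterns in PRIMITIVE_PATTERNS are assumptions chosen for illustration, and only Literal, int, and bool are handled.

```python
import re
from typing import Literal, get_args, get_origin

# Assumed, simplified patterns for two primitive types (illustration only).
PRIMITIVE_PATTERNS = {
    int: r"[+-]?(0|[1-9][0-9]*)",
    bool: r"(True|False)",
}


def type_to_regex(output_type):
    """Translate a small subset of Python types into a regex string."""
    # Literal["a", "b"] becomes an alternation of its escaped choices.
    if get_origin(output_type) is Literal:
        choices = (re.escape(str(choice)) for choice in get_args(output_type))
        return "(" + "|".join(choices) + ")"
    if output_type in PRIMITIVE_PATTERNS:
        return PRIMITIVE_PATTERNS[output_type]
    raise TypeError(f"unsupported output type: {output_type!r}")


answer_pattern = type_to_regex(Literal["Yes", "No"])
int_pattern = type_to_regex(int)
```

A regex produced this way is what a later stage would compile into a state machine; here it can at least be checked with re.fullmatch.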
Key Design Decisions
1. FSM-Based Constraints
For local models where logits are accessible, constraints compile to finite state machines that track valid next tokens. The FSM maintains a current state and can determine, for any given state, which tokens are valid next tokens.
This approach provides:
- Complete coverage: All valid continuations are allowed, all invalid are blocked
- Efficiency: State transitions are O(1) lookup
- Correctness: Guarantees well-formed outputs matching the schema
Source: llm.txt
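As a rough illustration of how an FSM can answer "which tokens are valid next", the sketch below lifts a hand-written character-level automaton for one or more digits to a per-state token index. The names DFA_TRANSITIONS, walk, and build_index and the five-token vocabulary are hypothetical; the real index is built by outlines-core, not by this code.

```python
# Toy character-level DFA for "one or more digits".
# State 0 is the start state, state 1 is the accepting state.
DIGITS = "0123456789"
DFA_TRANSITIONS = {(state, d): 1 for state in (0, 1) for d in DIGITS}
STATES = (0, 1)

# A made-up vocabulary; real tokenizers have tens of thousands of entries.
VOCABULARY = ["1", "42", "7a", "abc", "007"]


def walk(state, token):
    """Feed every character of a token through the DFA.

    Returns the state reached, or None if any character is rejected.
    """
    for char in token:
        state = DFA_TRANSITIONS.get((state, char))
        if state is None:
            return None
    return state


def build_index(states, vocabulary):
    """Precompute, for each state, the valid tokens and the state they lead to."""
    index = {}
    for state in states:
        index[state] = {}
        for token in vocabulary:
            end = walk(state, token)
            if end is not None:
                index[state][token] = end
    return index


INDEX = build_index(STATES, VOCABULARY)
allowed_at_start = sorted(INDEX[0])
```

Once the index exists, finding the valid tokens for the current state is a dictionary lookup, which is the property the efficiency claim above relies on.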
2. Token-Level Control
Constraints apply at the token level, not the character level. This is critical because LLMs generate text token-by-token, and constraining at the character level would be both inefficient and potentially incorrect.
graph TD
A["Token 1: 'Hello'"] --> B["Token 2: 'World'"]
B --> C["Token 3: '!'"]
subgraph FSM_State
D["Current State: q3"]
E["Valid Tokens: [END, '!', '.']"]
end
D --> E
Source: llm.txt
3. Lazy Compilation
FSM compilation is deferred until first use and the resulting FSM is cached persistently. This avoids expensive compilation overhead on module import or model loading, and allows the system to handle dynamic type specifications efficiently.
Source: llm.txt
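The deferred-compilation idea can be sketched with the standard library alone. This is an assumption-laden stand-in: compile_constraint and LazyGenerator are hypothetical names, re.compile replaces the real FSM build, and functools.lru_cache is an in-memory cache, whereas the text above describes a persistent one.

```python
import re
from functools import lru_cache

compile_calls = 0  # counts how often the "expensive" step actually runs


@lru_cache(maxsize=None)
def compile_constraint(pattern):
    """Stand-in for an expensive FSM build; runs once per distinct pattern."""
    global compile_calls
    compile_calls += 1
    return re.compile(pattern)


class LazyGenerator:
    """Stores a pattern and compiles it only when first needed."""

    def __init__(self, pattern):
        self.pattern = pattern  # nothing is compiled at construction time

    def matches(self, text):
        return compile_constraint(self.pattern).fullmatch(text) is not None


gen = LazyGenerator(r"\d{3}-\d{4}")
calls_before_use = compile_calls
first = gen.matches("555-1234")
second = gen.matches("not a number")
calls_after_use = compile_calls
```

Constructing the object costs nothing; the first call pays for compilation and the second reuses the cached result.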
4. Type-Driven API
Python types are the primary interface for specifying constraints, aligning with how developers already specify data structures in Python code.
from pydantic import BaseModel
from typing import Literal
class Review(BaseModel):
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float
result = model("I love this product!", Review)
Source: outlines/release_note.md
Exception Handling Architecture
Outlines defines a hierarchical exception system for consistent error handling across providers.
graph TD
A["Exception"] --> B["OutlinesError"]
B --> C["APIError"]
C --> D["AuthenticationError"]
C --> E["PermissionDeniedError"]
C --> F["NotFoundError"]
C --> G["RateLimitError"]
C --> H["BadRequestError"]
C --> I["ServerError"]
B --> J["ProviderResponseError"]
B --> K["GenerationError"]
All public exceptions inherit from OutlinesError; provider API errors additionally derive from APIError. The normalize_provider_exception function converts raw provider SDK exceptions into the appropriate Outlines types.
Source: outlines/exceptions.py
Generation Workflow
The following diagram illustrates the complete generation workflow for a structured output request:
sequenceDiagram
participant User
participant Model
participant Generator
participant TypeSystem
participant FSM
participant Guide
participant LogitsProcessor
User->>Model: model(prompt, OutputType)
Model->>Generator: create Generator(model, OutputType)
Generator->>TypeSystem: convert(OutputType)
TypeSystem->>FSM: compile to FSM
Note over FSM: FSM cached after first use
loop For each token
Model->>LogitsProcessor: logits
LogitsProcessor->>Guide: get valid tokens
Guide->>LogitsProcessor: token mask
LogitsProcessor->>Model: masked logits
Model->>Guide: next token
Guide->>FSM: transition state
end
Generator->>Model: final output
Model-->>User: structured result
Source: outlines/generator.py Source: outlines/models/base.py
Backend System
The backend system abstracts the constraint-compilation engine used with steerable models. A backend can be selected when constructing a generator or on a direct model call.
# Backend selection via generator
generator = Generator(model, OutputType, backend="outlines_core")
# Or via direct model call
result = model(prompt, OutputType, backend="xgrammar")
Available backends for steerable models include:
| Backend | Description |
|---|---|
| outlines_core | Default backend built on the outlines-core library |
| xgrammar | Backend built on the xgrammar library |
| llguidance | Backend built on the llguidance library |
Source: outlines/backends/__init__.py
Version 1 Interface Changes
Outlines v1 introduced significant architectural changes to the model interface:
Before (v0)
from outlines import generate, models
model = models.openai("gpt-4o")
generator = generate.json(model, Character)
result = generator("Create a character")
After (v1)
from openai import OpenAI
from outlines import from_openai
model = from_openai(OpenAI(), "gpt-4o")
result = model("Create a character", Character)
Key changes:
- Models can now be called directly with prompt and output type
- All models have a stream() method callable by users
- Generator class provides reusable generation with caching
- Application class replaces deprecated Function class for templated generation
Source: outlines/release_note.md
Documentation Architecture
The project uses MkDocs with automatic API reference generation:
graph TD
A["scripts/gen_ref_pages.py"] --> B["mkdocs.yml"]
B --> C["mkdocs-gen-files"]
C --> D["API Reference Pages"]
subgraph "Documentation Structure"
E["docs/guide/"] --> F["User Guides"]
G["docs/features/"] --> H["Feature Documentation"]
I["api_reference/"] --> J["Auto-generated API Docs"]
end
Source: scripts/gen_ref_pages.py Source: mkdocs.yml
Source: https://github.com/dottxt-ai/outlines / Human Manual
Core Concepts
Related topics: System Architecture, Output Types Overview, Structured Generation Backends
Outlines is a structured text generation library that guarantees type-safe, constrained outputs from Large Language Models (LLMs). Rather than post-processing model outputs with fragile parsing logic, Outlines integrates constraint enforcement directly into the generation process, ensuring outputs conform to specified structures from the moment generation begins.
Architecture Overview
Outlines employs a layered architecture that transforms high-level type specifications into low-level token masking operations. The system bridges the gap between Python type annotations and the token-level mechanics of language model inference.
graph TD
A[User API<br/>outlines.models] --> B[Generator Classes<br/>SteerableGenerator<br/>BlackBoxGenerator]
B --> C[Type System<br/>Pydantic → JsonSchema → Regex]
C --> D[FSM Compilation<br/>outlines-core<br/>regex → FSM via interegular]
D --> E[Guide System<br/>processors/guide.py<br/>FSM state management]
E --> F[Logits Processing<br/>processors/structured.py<br/>token masking]
F --> G[Model Providers<br/>transformers<br/>OpenAI<br/>Anthropic<br/>etc.]
Source: llm.txt:layer-stack
Layer Stack
1. User API Layer (`outlines.models`)
The topmost layer provides the primary interface for developers. Users instantiate a model using provider-specific functions and call it directly with prompts and output types.
Source: llm.txt:architecture
2. Generator Classes
Two generator abstractions handle different model categories:
| Generator Class | Model Type | Control Level | Constraint Application |
|---|---|---|---|
| SteerableGenerator | Local models (transformers, llama.cpp) | Full logits control | FSM-based token masking |
| BlackBoxGenerator | API models (OpenAI, Anthropic) | API-level constraints | Provider's native structured output |
Source: llm.txt:key-design-decisions
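The split between the two generator categories can be pictured as a dispatch on the model's base class. The sketch below uses toy stand-ins that only borrow the class names from the table; make_generator is a hypothetical helper, and none of this reproduces the logic in outlines/generator.py.

```python
class SteerableModel:
    """Toy stand-in: a model whose logits can be modified."""


class BlackBoxModel:
    """Toy stand-in: a model reachable only through a provider API."""


class SteerableGenerator:
    strategy = "FSM-based token masking"

    def __init__(self, model, output_type):
        self.model = model
        self.output_type = output_type


class BlackBoxGenerator:
    strategy = "provider-native structured output"

    def __init__(self, model, output_type):
        self.model = model
        self.output_type = output_type


def make_generator(model, output_type):
    """Pick the generator class that matches the model category."""
    if isinstance(model, SteerableModel):
        return SteerableGenerator(model, output_type)
    if isinstance(model, BlackBoxModel):
        return BlackBoxGenerator(model, output_type)
    raise TypeError(f"unsupported model type: {type(model).__name__}")


local = make_generator(SteerableModel(), int)
remote = make_generator(BlackBoxModel(), int)
```

Keeping the choice in one place lets calling code stay the same regardless of which category the model falls into.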
3. Type System (`types/dsl.py`)
The type system transforms Python types into machine-processable constraints:
graph LR
A[Pydantic Models<br/>BaseModel] --> B[JSON Schema]
B --> C[Regex Pattern]
C --> D[Finite State Machine]
Source: llm.txt:layer-stack
4. FSM Compilation (`outlines-core`)
The compilation layer converts regex patterns into finite state machines using the interegular library. This transformation enables efficient constraint checking at token boundaries.
Source: llm.txt:layer-stack
5. Guide System (`processors/guide.py`)
The guide system manages FSM state transitions during generation. It tracks which states are valid given the current token sequence and determines allowable next tokens.
Source: llm.txt:layer-stack
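A guide of this kind can be sketched as a small object that holds the current state and a precomputed transition table. The Guide class below and its method names are hypothetical and do not mirror processors/guide.py; the index is hand-written for the outputs "Yes" and "No" split into toy tokens.

```python
class Guide:
    """Tracks the current FSM state while tokens are generated."""

    def __init__(self, index, initial_state, final_states):
        self.index = index  # maps state -> {token: next_state}
        self.state = initial_state
        self.final_states = final_states

    def allowed_tokens(self):
        """Tokens that keep the output valid from the current state."""
        return sorted(self.index.get(self.state, {}))

    def advance(self, token):
        """Consume a generated token and move to the next state."""
        transitions = self.index.get(self.state, {})
        if token not in transitions:
            raise ValueError(f"token {token!r} is not allowed in state {self.state}")
        self.state = transitions[token]

    def is_finished(self):
        return self.state in self.final_states


# Hand-written index: state 0 is the start, state 3 is the only final state.
INDEX = {
    0: {"Y": 1, "N": 2, "Yes": 3, "No": 3},
    1: {"es": 3},
    2: {"o": 3},
}

guide = Guide(INDEX, initial_state=0, final_states={3})
start_tokens = guide.allowed_tokens()
guide.advance("Y")
after_y = guide.allowed_tokens()
guide.advance("es")
finished = guide.is_finished()
```

Note that both the single token "Yes" and the pair "Y" + "es" reach the final state, which is why the transition table is keyed by tokens rather than characters.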
6. Logits Processing (`processors/structured.py`)
For steerable models, this layer applies token masking by setting the logits of invalid tokens to negative infinity, so their sampling probability is zero and they cannot be selected.
Source: llm.txt:layer-stack
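The effect of masking can be checked numerically with plain Python. This is a minimal sketch rather than the processor in processors/structured.py: mask_logits and softmax are illustrative helpers, and the logits are made-up values over a four-token vocabulary.

```python
import math


def mask_logits(logits, allowed_ids):
    """Return a copy of logits with every disallowed position set to -inf."""
    return [
        value if i in allowed_ids else -math.inf
        for i, value in enumerate(logits)
    ]


def softmax(logits):
    """Numerically stable softmax; exp(-inf) evaluates to 0.0."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5, 3.0]
masked = mask_logits(logits, allowed_ids={0, 2})
probs = softmax(masked)
```

Token 3 had the highest raw logit, yet after masking only tokens 0 and 2 carry any probability mass, so no sampling strategy can pick a disallowed token.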
Key Design Decisions
FSM-Based Constraints
For local models where Outlines controls the sampling process, constraints compile to finite state machines. These FSMs track valid next tokens at each generation step, enabling efficient constraint enforcement without enumerating all possible sequences.
Source: llm.txt:key-design-decisions
Provider Abstraction
The same constraint system works across different model providers:
- Local models: Outlines controls sampling, applying FSM-based masking
- API models: Outlines uses provider-native structured output support when available, or falls back to completion with validation
Source: llm.txt:key-design-decisions
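The "completion with validation" fallback for API models might look roughly like the sketch below. Everything here is an assumption made for illustration: the retry loop, the helper names, the two-field schema, and fake_complete, which stands in for a real provider call.

```python
import json


def validate_character(text):
    """Parse JSON and check the two fields expected by this sketch."""
    data = json.loads(text)  # raises json.JSONDecodeError on malformed input
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if not isinstance(data.get("name"), str) or not isinstance(data.get("age"), int):
        raise ValueError("missing or mistyped field")
    return data


def generate_with_validation(complete, prompt, max_attempts=3):
    """Call `complete` until its output validates or attempts run out."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return validate_character(complete(prompt))
        except ValueError as error:  # JSONDecodeError is a ValueError subclass
            last_error = error
    raise RuntimeError(f"no valid output after {max_attempts} attempts") from last_error


# Hypothetical stand-in for a provider call: fails once, then succeeds.
_responses = iter(["not json", '{"name": "Ada", "age": 36}'])


def fake_complete(prompt):
    return next(_responses)


character = generate_with_validation(fake_complete, "Create a character")
```

Unlike FSM-based masking, this approach can only detect an invalid output after the fact, which is why it is a fallback rather than the default.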
Lazy Compilation
FSM compilation occurs on first use and results are cached persistently. This approach avoids upfront compilation overhead while ensuring subsequent generations with the same type are fast.
Source: llm.txt:key-design-decisions
Token-Level Control
Constraints apply at the token level rather than the character level. This design choice ensures that the constraint system works correctly with subword tokenization schemes used by modern language models.
Source: llm.txt:key-design-decisions
Type-Driven API
Python types serve as the primary interface for specifying output constraints. This design choice provides:
- Familiar syntax for Python developers
- Static type checking support
- Integration with Pydantic for complex validation
- Support for Literal types, enums, and nested structures
Source: README.md:philosophy
Model Integration
Base Model Architecture
The Model base class defines the contract for all provider implementations. Concrete implementations inherit from this base and implement provider-specific input/output handling.
# Simplified base class structure
from abc import ABC, abstractmethod
class Model(ABC):
    @abstractmethod
    def __call__(self, prompt, output_type):
        pass
    @abstractmethod
    def stream(self, prompt, output_type):
        pass
Source: outlines/models/base.py
Supported Providers
Outlines provides integrations for multiple model providers:
| Provider | Function | Control Type |
|---|---|---|
| OpenAI | from_openai() | BlackBox |
| Anthropic | from_anthropic() | BlackBox |
| Google Gemini | from_gemini() | BlackBox |
| Transformers | from_transformers() | Steerable |
| Ollama | from_ollama() | Steerable/BlackBox |
| vLLM | from_vllm() | Steerable |
| SGLang | from_sglang() | Steerable |
| Llama.cpp | from_llamacpp() | Steerable |
Source: mkdocs.yml:navigation
Gemini Integration Example
from google.genai import Client
from outlines import from_gemini
client = Client()  # google.genai.Client
model = from_gemini(client, model_name="gemini-pro")
result = model("What is 2 + 2?", int) # Returns 4
Source: outlines/models/gemini.py
Error Handling
Exception Hierarchy
Outlines defines a comprehensive exception hierarchy for structured error handling:
OutlinesError (base)
├── APIError (provider API errors)
│ ├── AuthenticationError
│ ├── PermissionDeniedError
│ ├── NotFoundError
│ ├── RateLimitError
│ ├── BadRequestError
│ └── ServerError
├── APITimeoutError
├── APIConnectionError
├── ProviderResponseError
└── GenerationError
Source: outlines/exceptions.py:outlines-exception-hierarchy
Exception Normalization
The normalize_provider_exception function converts raw provider SDK exceptions into appropriate Outlines types, preserving original exceptions for debugging:
def normalize_provider_exception(
    exception: Exception,
    provider: Optional[str] = None
) -> OutlinesError
Source: outlines/exceptions.py
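One way such normalization could work is sketched below. The class names follow the hierarchy above, but the HTTP status mapping, the original attribute, the normalize helper, and FakeProviderError are all assumptions made for illustration; the actual rules live in outlines/exceptions.py.

```python
class OutlinesError(Exception):
    """Root of the sketch hierarchy."""


class APIError(OutlinesError):
    """Provider API error that keeps the raw exception for debugging."""

    def __init__(self, message, original=None):
        super().__init__(message)
        self.original = original


class AuthenticationError(APIError):
    pass


class RateLimitError(APIError):
    pass


class ServerError(APIError):
    pass


class FakeProviderError(Exception):
    """Hypothetical SDK exception carrying an HTTP status code."""

    def __init__(self, status_code):
        super().__init__(f"provider returned HTTP {status_code}")
        self.status_code = status_code


# Assumed mapping from status codes to exception classes.
STATUS_TO_ERROR = {401: AuthenticationError, 429: RateLimitError}


def normalize(exc):
    """Map a raw provider exception onto the sketch hierarchy."""
    status = getattr(exc, "status_code", None)
    if status in STATUS_TO_ERROR:
        cls = STATUS_TO_ERROR[status]
    elif status is not None and status >= 500:
        cls = ServerError
    else:
        cls = APIError
    return cls(str(exc), original=exc)


raw = FakeProviderError(429)
normalized = normalize(raw)
```

Because every class derives from OutlinesError, callers can catch one base type across providers and still reach the raw exception through the original attribute.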
Output Types
Basic Python Types
Outlines supports primitive Python types as output specifications:
| Type | Generated Output |
|---|---|
| int | Integer numbers |
| float | Decimal numbers |
| bool | True/False |
| str | Arbitrary strings |
Source: README.md:quickstart
Literal Types
For constrained choices, Literal types specify exact valid outputs:
from typing import Literal
result = model("Is this positive or negative?", Literal["Positive", "Negative", "Neutral"])
Source: README.md:philosophy
Pydantic Models
Complex nested structures use Pydantic for specification:
from pydantic import BaseModel
from enum import Enum
class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4
class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
Source: README.md:complex-structures
Regex Patterns
The Regex type constrains outputs to match specific patterns:
from outlines.types import Regex
phone_number = model("Contact:", Regex(r"\d{3}-\d{3}-\d{4}"))
Source: outlines/release_note.md:regex-dsl
JSON Schema
For language-agnostic type definitions, JsonSchema accepts raw JSON Schema strings:
from outlines.types import JsonSchema
schema = '{"type": "object", "properties": {"answer": {"type": "number"}}}'
result = model("What's 2 + 2?", JsonSchema(schema))
Source: outlines/release_note.md:regex-dsl
Generator Pattern
The Generator class encapsulates reusable generation with a fixed output type:
from outlines import Generator, from_transformers
from pydantic import BaseModel
class Character(BaseModel):
    name: str
model = from_transformers(...)
generator = Generator(model, Character)
# Reuse without recompiling the output type; the generator takes only the prompt
result1 = generator("Create a male character")
result2 = generator("Create a female character")
Source: outlines/release_note.md:generator-constructor
Application Pattern
The Application class combines templates with structured output types:
from pydantic import BaseModel
from outlines import Application, Template
class Character(BaseModel):
    name: str
template = Template.from_string("Create a {{ gender }} character.")
app = Application(template, Character)
result = app(model, {"gender": "female"})
Source: outlines/release_note.md:application-class
The Outlines Philosophy
Outlines mirrors Python's type system philosophy: specify what you want, and the system ensures it. Rather than validating and parsing outputs after generation, Outlines guarantees structurally valid outputs from the start.
Source: README.md:philosophy
Design Principles
- Constraint at generation time: Validity is enforced during token selection, not after
- Fail fast: Invalid outputs are impossible by construction
- Provider independence: Same API works across all supported models
- Type familiarity: Use standard Python types and Pydantic models
Source: README.md:why-outlines
Source: https://github.com/dottxt-ai/outlines / Human Manual
Structured Generation Backends
Related topics: System Architecture
Structured Generation Backends are the underlying engine components in Outlines that handle the compilation of structured constraints (such as JSON schemas and regular expressions) into efficient token-level generation guides. These backends abstract away the complexity of constraint compilation, providing a unified interface for steering language model outputs while allowing users to choose the most appropriate implementation for their use case.
Architecture Overview
Outlines supports multiple backend implementations for structured generation, each with different trade-offs in terms of performance, memory usage, and feature support. The backend system follows a pluggable architecture where users can select or let Outlines choose the optimal backend automatically.
graph TD
User[User Code] --> API[Outlines API]
API --> Backends[Backend Selection]
Backends --> Core[OutlinesCore]
Backends --> XGrammar[XGrammar]
Backends --> LLGuidance[LLGuidance]
Core --> FSM[FSM Compilation]
XGrammar --> XG_Engine[XGrammar Engine]
LLGuidance --> LL_Engine[LLGuidance Engine]
FSM --> Guide[Generation Guide]
XG_Engine --> Guide
LL_Engine --> Guide
Guide --> Tokens[Token Masking]
Tokens --> Model[Language Model]
Model --> Output[Structured Output]
The backend system sits between the high-level Outlines API and the underlying language model, transforming structural constraints into actionable token-level guidance during generation. Source: outlines/backends/__init__.py:1-60
Available Backends
Outlines provides three main backend implementations for structured generation, each optimized for different scenarios.
| Backend | Module | Description |
|---|---|---|
| OutlinesCore | outlines_core | Default backend using the interegular library for FSM compilation |
| XGrammar | xgrammar | Optimized backend using the xgrammar library for faster compilation |
| LLGuidance | llguidance | Specialized backend using llguidance for high-performance generation |
Source: outlines/backends/__init__.py:20-30
OutlinesCore Backend
The OutlinesCore backend is the default implementation that uses the interegular library to compile regular expressions and JSON schemas into Finite State Machines (FSMs). This backend provides comprehensive support for all Outlines features and serves as the reference implementation.
Key characteristics:
- Pure Python implementation using interegular for regex parsing
- Persistent caching of compiled FSMs
- Full support for JSON schemas and regex constraints
- Memory-efficient for moderate-sized schemas
The backend converts structured constraints through the following pipeline:
- Parse the JSON schema or regex pattern
- Compile to an intermediate FSM representation using interegular
- Optimize the FSM for token-level generation
- Cache the compiled result for reuse
Source: outlines/backends/outlines_core.py
XGrammar Backend
The XGrammar backend provides an optimized implementation that leverages the xgrammar library for faster constraint compilation. This backend is particularly useful for applications requiring quick iteration cycles where compilation speed matters.
Key characteristics:
- Faster compilation times compared to OutlinesCore
- Optimized token masking operations
- Good balance between performance and memory usage
- Requires xgrammar as an additional dependency
Source: outlines/backends/xgrammar.py
LLGuidance Backend
The LLGuidance backend uses the llguidance library to provide high-performance structured generation. This backend is designed for production workloads where generation speed is critical.
Key characteristics:
- Maximum generation throughput
- Low-latency token selection
- Specialized for constrained generation scenarios
- Requires llguidance as an additional dependency
Source: outlines/backends/llguidance.py
Backend Selection
Outlines provides two mechanisms for backend selection: automatic default selection and explicit user specification.
Default Backend Configuration
When no backend is explicitly specified, Outlines uses the default backends defined in the configuration:
| Constraint Type | Default Backend |
|---|---|
| JSON Schema | outlines_core |
| Regex | outlines_core |
Source: outlines/backends/__init__.py:25-28
Explicit Backend Selection
Users can specify a backend explicitly using the backend parameter when calling model generation methods:
import outlines
from pydantic import BaseModel
class Person(BaseModel):
    name: str
    age: int
# Use specific backend
result = model("Create a person", Person, backend="xgrammar")
Backend names are case-insensitive and map to the following implementations:
- "outlines_core" or "outlinescore": OutlinesCore backend
- "xgrammar": XGrammar backend
- "llguidance": LLGuidance backend
Source: outlines/backends/__init__.py:55-58
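Case-insensitive name resolution with a default can be sketched as a dictionary lookup. The aliases come from the list above; BACKEND_ALIASES, DEFAULT_BACKEND, and resolve_backend are hypothetical names, and the real lookup in outlines/backends/__init__.py may be organized differently.

```python
# Aliases taken from the documented names; the registry is a stand-in.
BACKEND_ALIASES = {
    "outlines_core": "OutlinesCore",
    "outlinescore": "OutlinesCore",
    "xgrammar": "XGrammar",
    "llguidance": "LLGuidance",
}
DEFAULT_BACKEND = "outlines_core"


def resolve_backend(name=None):
    """Return the backend label for a name, ignoring case.

    Falls back to DEFAULT_BACKEND when no name is given.
    """
    key = (name or DEFAULT_BACKEND).lower()
    if key not in BACKEND_ALIASES:
        known = ", ".join(sorted(BACKEND_ALIASES))
        raise ValueError(f"unknown backend {name!r}; expected one of: {known}")
    return BACKEND_ALIASES[key]
```

For example, resolve_backend("XGrammar") and resolve_backend("xgrammar") select the same backend, while an unrecognized name fails early with a message listing the accepted values.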
Backend Factory Functions
The backend system provides factory functions that create the appropriate logits processor based on the constraint type and selected backend.
JSON Schema Logits Processor
The get_json_schema_logits_processor function creates a logits processor for JSON schema constraints:
def get_json_schema_logits_processor(
    backend_name: str | None,
    model: SteerableModel,
    json_schema: str,
) -> LogitsProcessorType:
    """Create a logits processor from a JSON schema.
    Parameters
Source: https://github.com/dottxt-ai/outlines / Human Manual
Output Types Overview
Related topics: JSON Schema and Pydantic Support, Regex Patterns
Outlines provides a comprehensive type system for structured generation with Large Language Models (LLMs). Output types define the expected structure of generated text, and Outlines ensures that generated outputs match these specifications exactly during inference.
Purpose and Scope
Output types in Outlines serve as the primary interface for specifying constraints on LLM outputs. Rather than attempting to parse and fix invalid outputs after generation, Outlines enforces structure during the generation process itself. This approach eliminates fragile parsing logic and guarantees valid outputs on the first attempt.
The type system supports various complexity levels:
| Type Category | Examples | Use Case |
|---|---|---|
| Basic Python | int, float, str | Simple values |
| Literal Types | Literal["Yes", "No"] | Enumerated choices |
| Pydantic Models | BaseModel subclasses | Complex nested structures |
| Regex Patterns | Regex(...) | Custom format constraints |
| JSON Schema | JsonSchema(...) | Existing schema definitions |
| Context-Free Grammars | CFG(...) | Formal language definitions |
Source: README.md:1-20
Architecture
Type Conversion Pipeline
Outlines converts output types into finite state machines (FSMs) that guide token selection during generation. This conversion follows a layered approach:
graph TD
A[Python Type / Pydantic Model] --> B[JSON Schema]
B --> C[Regex Pattern]
C --> D[FSM / State Machine]
D --> E[Token Masking Guide]
E --> F[Constrained Generation]
G[DSL Terms] --> C
H[CFG Grammar] --> C
The type system handles three primary conversion pathways:
- Python Types to Terms: Basic types (int, str, float) and Literal types convert to intermediate Term representations
- Pydantic Models to JSON Schema: Complex models generate JSON Schema definitions
- Terms to Regex: All intermediate representations ultimately convert to regex patterns
Source: outlines/types/dsl.py:1-30
Core Components
| Component | File Location | Responsibility |
|---|---|---|
| Term Classes | outlines/types/dsl.py | Define regex DSL elements |
| JSON Schema Utilities | outlines/types/json_schema_utils.py | Pydantic to schema conversion |
| Type Adapters | outlines/types/utils.py | Type introspection helpers |
| FSM Compilation | outlines-core package | Regex to finite state machines |
Source: llm.txt:1-50
Basic Python Types
Outlines supports native Python types as output specifications. These types are automatically converted to appropriate constraints.
Supported Types
import outlines
from transformers import AutoModelForCausalLM, AutoTokenizer
model = outlines.from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
# Integer output
temperature = model("What's the boiling point of water in Celsius?", int)
print(temperature) # 100
# String output (constrained format)
name = model("Generate a valid email address", str)
| Type | Constraint Applied |
|---|---|
| int | Matches integer patterns |
| float | Matches decimal numbers |
| str | General text with type hints |
Source: README.md:45-60
Literal Types
Literal types define exact enumerated values the model can output:
from typing import Literal
sentiment = model(
"Analyze: 'This product completely changed my life!'",
Literal["Positive", "Negative", "Neutral"]
)
print(sentiment) # "Positive"
This approach is ideal for classification tasks, yes/no questions, and any scenario requiring outputs from a fixed set of options.
Source: README.md:50-55
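One way to picture how a Literal type becomes a constraint is to turn its values into a regex alternation. The sketch below uses only the standard library; `literal_to_regex` is a hypothetical helper written for this illustration and is not the library's own conversion code.

```python
import re
from typing import Literal, get_args


def literal_to_regex(literal_type) -> str:
    """Build an alternation that matches exactly the Literal's string values."""
    choices = get_args(literal_type)
    if not choices or not all(isinstance(c, str) for c in choices):
        raise TypeError("this sketch only handles Literal types with string values")
    # re.escape keeps characters such as '.' or '?' from acting as regex syntax.
    return "(" + "|".join(re.escape(c) for c in choices) + ")"


Sentiment = Literal["Positive", "Negative", "Neutral"]
pattern = literal_to_regex(Sentiment)

print(pattern)                                  # (Positive|Negative|Neutral)
print(bool(re.fullmatch(pattern, "Positive")))  # True
print(bool(re.fullmatch(pattern, "Mixed")))     # False
```

Because the pattern admits nothing outside the listed values, any generation constrained by it can only end on one of the three labels.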
Pydantic Models
For complex structured outputs, Pydantic models provide a declarative interface to define nested schemas with validation.
Defining Models
from pydantic import BaseModel
from enum import Enum
class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4
class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str
review = model("Review the latest iPhone", ProductReview)
Model Conversion Process
graph LR
A[Pydantic BaseModel] --> B[JSON Schema]
B --> C[Regex via interegular]
C --> D[FSM]
D --> E[Guided Generation]
F[TypeAdapter] --> A
G[GetJsonSchemaHandler] --> B
The conversion process uses Pydantic's schema generation hooks:
from typing import Any
from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema as cs
class CustomType(BaseModel):
    @classmethod
    def __get_pydantic_core_schema__(
        cls,
        source_type: Any,
        handler: GetCoreSchemaHandler
    ) -> cs.CoreSchema:
        # Constrain the value to two uppercase letters followed by four digits
        return cs.str_schema(
            pattern=r"^[A-Z]{2}\d{4}$"  # Format: XX0000
        )
Source: outlines/types/dsl.py:40-80
Regular Expression DSL
The Regex DSL provides fine-grained control over output formats through composable term classes.
Term Classes
| Term Class | Description | Example |
|---|---|---|
| Regex | Base regex wrapper | Regex(r"\d{3}-\d{4}") |
| String | Literal string match | String("yes") |
| Integer | Integer numbers | Integer() |
| Alternatives | Choice between patterns | either(pattern1, pattern2) |
| KleeneStar | Zero or more repetitions | repeat(pattern) |
| Optional | Optional pattern | optional(pattern) |
Source: outlines/types/dsl.py:20-60
Building Complex Patterns
from outlines.types import Regex, at_least, either, optional
# Phone number pattern: accept either of two formats
phone = either(
    Regex(r"\d{3}-\d{3}-\d{4}"),
    Regex(r"\(\d{3}\) \d{3}-\d{4}")
)
# Date-like format with an optional day; the + operator concatenates terms
digit = Regex(r"\d")
date_format = (
    Regex(r"\d{4}")  # Year
    + "-"
    + at_least(1, digit)  # Month: at least one digit
    + optional("-" + at_least(1, digit))  # Optional day
)
Term Functions
The DSL includes utility functions for pattern composition:
- either(*terms): Match any one of multiple terms
- optional(term): Make a pattern optional
- at_least(n, term): Require at least n repetitions
- one_of(*choices): Synonym for either
- literal(text): Match exact text
Source: outlines/release_note.md:40-80
JsonSchema Type
The JsonSchema type allows direct use of JSON Schema definitions for complex validation:
from outlines.types import JsonSchema
json_schema = '''
{
"type": "object",
"properties": {
"answer": {"type": "number"},
"confidence": {"type": "number", "minimum": 0, "maximum": 1}
},
"required": ["answer"]
}
'''
result = model("What's 2 + 2? Respond in JSON.", JsonSchema(json_schema))
This approach is useful when:
- Migrating existing JSON Schema definitions
- Working with API specifications (OpenAPI, etc.)
- Defining schemas separately from code
Source: outlines/release_note.md:50-65
Context-Free Grammars
Outlines supports Context-Free Grammars (CFG) for formal language generation:
from outlines.types import CFG
grammar = CFG("""
expression ::= number op number
op ::= "+" | "-" | "*" | "/"
number ::= [0-9]+
""")
math_result = model("Calculate 5 + 3", grammar)
CFGs are particularly valuable for:
- Programming language generation
- Mathematical expression evaluation
- Structured domain-specific languages
Source: outlines/release_note.md:60-70
Union Types
Union types enable conditional or alternative output structures:
from typing import Literal, Union
from pydantic import BaseModel
class SuccessResponse(BaseModel):
    data: str
    timestamp: str
UnknownResponse = Literal["I don't know", "Unable to determine"]
Response = Union[SuccessResponse, UnknownResponse]
result = model("What is the capital of France?", Response)
Handling Incomplete Data
Union types excel at scenarios where partial information is acceptable:
class EventInfo(BaseModel):
    name: str
    date: str
    location: str
EventResponse = Union[EventInfo, Literal["I don't know"]]
result = model(
"Extract event details: 'Join us for the meeting next week!'",
EventResponse
)
Source: README.md:100-120
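Downstream code usually has to tell the two branches of such a union apart. The following sketch assumes the raw output is either a JSON object or the bare fallback string; that output shape is an assumption, and `parse_event_response` is a hypothetical helper. A dataclass stands in for the Pydantic model so the example runs with the standard library alone.

```python
import json
from dataclasses import dataclass
from typing import Union

FALLBACK = "I don't know"


@dataclass
class EventInfo:
    name: str
    date: str
    location: str


def parse_event_response(raw: str) -> Union[EventInfo, str]:
    """Return EventInfo for a JSON object, or the fallback string unchanged."""
    text = raw.strip()
    if text == FALLBACK:
        return FALLBACK
    data = json.loads(text)  # raises ValueError on malformed JSON
    return EventInfo(**data)


unknown = parse_event_response("I don't know")
event = parse_event_response(
    '{"name": "Meeting", "date": "next week", "location": "unknown"}'
)

print(unknown)
print(event)
```

Checking for the fallback first keeps the JSON parser away from text that was never meant to be JSON.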
Generator Integration
The Generator class encapsulates output types for reusable constrained generation:
from outlines import Generator, from_transformers
from pydantic import BaseModel
from transformers import AutoModelForCausalLM, AutoTokenizer
class Character(BaseModel):
    name: str
    species: str
model = from_transformers(
    AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
    AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
character_generator = Generator(model, Character)
result = character_generator("Create a fantasy character")
Benefits of using generators:
| Feature | Benefit |
|---|---|
| Cached compilation | FSM compiled once, reused across calls |
| Type inference | Output type specified at construction |
| Consistent behavior | Same constraints applied to all generations |
Source: outlines/release_note.md:20-40
Type System Internals
Conversion Pipeline Details
graph TD
subgraph "Type Definition"
A[Pydantic / Python Type] --> B[Term Representation]
B --> C[Regex Pattern]
end
subgraph "Compilation"
C --> D[FSM via interegular]
D --> E[Cached FSM]
end
subgraph "Generation"
E --> F[Guide Processor]
F --> G[Token Masking]
G --> H[Constrained Sampling]
end
Key Files
| File | Purpose |
|---|---|
| outlines/types/__init__.py | Public API exports |
| outlines/types/dsl.py | Term classes and regex DSL |
| outlines/types/json_schema_utils.py | Schema conversion utilities |
| outlines/types/utils.py | Type introspection helpers |
Source: llm.txt:30-60
Best Practices
Choosing Output Types
- Use Literal types for classification and yes/no responses
- Use Python primitives (int, float) for simple numerical outputs
- Use Pydantic models for complex, nested structures
- Use Regex DSL when you need precise format control
- Use CFG for formal language generation
Performance Considerations
| Complexity | Compilation Time | Runtime Overhead |
|---|---|---|
| Literal types | Minimal | Negligible |
| Simple regex | Low | Low |
| Pydantic models | Moderate | Moderate |
| Complex grammars | Higher | Higher |
FSM compilation happens once on first use and is cached for subsequent calls, minimizing repeated overhead.
Source: llm.txt:40-50
Summary
Outlines' output types system provides a unified, Pythonic interface for structured generation:
- Type-driven API: Python types, Pydantic models, and custom DSLs
- Guaranteed validity: Constraints enforced during generation, not after
- Flexible composition: Union types, regex patterns, and grammars
- Performance optimized: FSM caching and lazy compilation
The type system transforms high-level type specifications into optimized finite state machines, enabling reliable structured generation across diverse LLM providers.
Source: https://github.com/dottxt-ai/outlines / Human Manual
JSON Schema and Pydantic Support
Related topics: Output Types Overview
Outlines provides robust support for JSON Schema and Pydantic models as first-class output type specifications. This enables developers to define complex structured output schemas using familiar Python type annotations, which are then compiled into efficient finite state machines (FSMs) for guided text generation.
Overview
The JSON Schema and Pydantic support in Outlines serves three primary purposes:
- Type Definition - Allows users to define output structures using Python-native type hints
- Schema Conversion - Provides bidirectional conversion between JSON Schema, Pydantic, TypedDict, and dataclass representations
- Constraint Compilation - Transforms schema definitions into regular expressions and FSMs for guided generation
Source: outlines/types/dsl.py:1-30
Architecture
Layer Stack for Structured Output
graph TD
A[User API: Pydantic Model / JSON Schema] --> B[Type System: python_types_to_terms]
B --> C[JsonSchema Term / CFG Term]
C --> D[Regex Compilation: to_regex]
D --> E[FSM Generation via outlines-core]
E --> F[Logits Processor: Token Masking]
F --> G[Model Providers]
H[Pydantic / TypedDict / Dataclass] -->|json_schema_dict_to_pydantic| B
I[JSON Schema String] -->|JsonSchema class| B
The type system in Outlines follows a layered architecture where Python types are progressively converted into machine-executable constraints:
| Layer | Component | Responsibility |
|---|---|---|
| User API | Pydantic models, JSON Schema | Define desired output structure |
| Type System | python_types_to_terms | Convert Python types to Term instances |
| Schema Representation | JsonSchema, CFG classes | Represent constraints as Terms |
| Regex Compilation | to_regex | Transform Terms to regular expressions |
| FSM Generation | outlines-core (interegular) | Convert regex to finite state machine |
| Generation Control | Logits Processors | Mask invalid tokens during generation |
Source: outlines/types/dsl.py:46-80
JsonSchema Class
The JsonSchema class is the core abstraction for representing JSON Schema-based output types.
Class Definition
class JsonSchema(Term):
    """Represents a JSON Schema constraint for structured generation."""
    def __init__(self, schema: Union[str, dict], whitespace_pattern: Optional[str] = None):
        """
        Args:
            schema: JSON Schema as string or dict
            whitespace_pattern: Optional regex for whitespace handling
        """
Key Methods
| Method | Description | Returns |
|---|---|---|
| to_format(target_type) | Convert schema to Pydantic, TypedDict, or dataclass | Converted type or raises ValueError |
| from_file(path) | Create JsonSchema from .json file | JsonSchema instance |
| _display_node() | Get string representation | str |
Source: outlines/types/dsl.py:50-90
Schema Conversion
The to_format method supports converting JSON Schema to multiple Python type formats:
def to_format(self, target_types: List[str]) -> Any:
    """Convert JSON Schema to target format(s).
    Supported targets: 'pydantic', 'typeddict', 'dataclass', 'str', 'dict'
    """
This method iterates through the requested target types and attempts conversion, returning the first successful result.
Source: outlines/types/dsl.py:55-80
Schema Validation and Comparison
def __eq__(self, other) -> bool:
    """Compare two JsonSchema instances by parsing and comparing their contents."""
    self_dict = json.loads(self.schema)
    other_dict = json.loads(other.schema)
    return self_dict == other_dict
Source: outlines/types/dsl.py:100-108
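The practical effect of comparing parsed content is that formatting and key order stop mattering. The sketch below reproduces that comparison with the standard library only; `schemas_equal` is a hypothetical stand-in for the method above, not a function exported by Outlines.

```python
import json

schema_a = '{"type": "object", "properties": {"answer": {"type": "number"}}}'
schema_b = """
{
    "properties": {"answer": {"type": "number"}},
    "type": "object"
}
"""


def schemas_equal(left: str, right: str) -> bool:
    """Compare two JSON Schema strings by content rather than by text."""
    return json.loads(left) == json.loads(right)


print(schema_a == schema_b)               # False: the raw strings differ
print(schemas_equal(schema_a, schema_b))  # True: the parsed documents match
```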
JSON Schema Utilities
The json_schema_utils.py module provides bidirectional conversion between JSON Schema and Python type systems.
Schema Type Mapping
JSON Schema types are mapped to Python types as follows:
| JSON Schema Type | Python Type |
|---|---|
| string | str |
| integer | int |
| number | float |
| boolean | bool |
| array | List[item_type] |
| object | Pydantic / TypedDict / Dataclass |
Source: outlines/types/json_schema_utils.py:1-50
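The mapping in the table can be sketched as a small recursive function. The name `schema_to_python_type` is hypothetical and deliberately differs from the library's own helper; objects are simplified to `Dict[str, Any]` here, whereas a full converter would build a Pydantic model, TypedDict, or dataclass.

```python
from typing import Any, Dict, List

_PRIMITIVES = {"string": str, "integer": int, "number": float, "boolean": bool}


def schema_to_python_type(schema: dict) -> Any:
    """Map a JSON Schema fragment to a Python type annotation."""
    kind = schema.get("type")
    if kind in _PRIMITIVES:
        return _PRIMITIVES[kind]
    if kind == "array":
        # Recurse into the item schema; a missing "items" falls back to Any.
        item_type = schema_to_python_type(schema.get("items", {}))
        return List[item_type]
    if kind == "object":
        return Dict[str, Any]  # simplification for this sketch
    return Any


print(schema_to_python_type({"type": "integer"}))
print(schema_to_python_type({"type": "array", "items": {"type": "string"}}))
```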
Conversion Functions
json_schema_dict_to_pydantic
Converts a JSON Schema dictionary to a Pydantic model:
def json_schema_dict_to_pydantic(
    schema: dict,
    name: Optional[str] = None
) -> Type[BaseModel]:
    """Convert JSON Schema dict to Pydantic BaseModel.
    Args:
        schema: JSON Schema dictionary
        name: Optional name for the model
    Returns:
        Pydantic BaseModel class
    """
json_schema_dict_to_typeddict
Converts JSON Schema to a TypedDict:
def json_schema_dict_to_typeddict(
    schema: dict,
    name: Optional[str] = None
) -> _TypedDictMeta:
    """Convert JSON Schema dict to TypedDict class."""
The conversion process:
- Extracts required fields from schema
- Maps properties to typed annotations
- Optional fields are wrapped with Optional[]
- Recursively handles nested objects
Source: outlines/types/json_schema_utils.py:80-120
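The four steps of the conversion process can be sketched with the functional `TypedDict` syntax from the standard library. `typeddict_from_schema` and the naming scheme for nested classes are assumptions made for this illustration; the library's implementation may handle more schema features and name nested types differently.

```python
from typing import Any, Optional, TypedDict

_PRIMITIVES = {"string": str, "integer": int, "number": float, "boolean": bool}


def typeddict_from_schema(schema: dict, name: str = "Generated"):
    """Build a TypedDict class from an object schema."""
    required = set(schema.get("required", []))
    fields = {}
    for field_name, field_schema in schema.get("properties", {}).items():
        if field_schema.get("type") == "object":
            # Recurse for nested objects, deriving a class name from the field.
            field_type = typeddict_from_schema(field_schema, name + "_" + field_name)
        else:
            field_type = _PRIMITIVES.get(field_schema.get("type"), Any)
        if field_name not in required:
            field_type = Optional[field_type]
        fields[field_name] = field_type
    return TypedDict(name, fields)


User = typeddict_from_schema(
    {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
            "address": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
        "required": ["name", "address"],
    },
    name="User",
)

print(User.__annotations__)
```

Here `age` is absent from `required`, so its annotation becomes `Optional[int]`, while `address` is turned into a nested TypedDict of its own.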
json_schema_dict_to_dataclass
Converts JSON Schema to a dataclass:
def json_schema_dict_to_dataclass(
    schema: dict,
    name: Optional[str] = None
) -> type:
    """Convert JSON Schema dict to dataclass."""
schema_type_to_python
Recursively converts JSON Schema type definitions to Python types:
def schema_type_to_python(
    schema: dict,
    caller_target_type: str = "pydantic"
) -> Any:
    """Convert JSON Schema type to Python type.
    Args:
        schema: JSON Schema dict or nested schema
        caller_target_type: Target format ('pydantic', 'typeddict', 'dataclass')
    """
Source: outlines/types/json_schema_utils.py:40-75
Backend Integration
Different inference backends handle JSON Schema constraints through specialized logits processors.
Backend Selection
graph LR
A[Model Instance] --> B[_get_backend]
B --> C{Backend Name}
C -->|outlines_core| D[OutlinesCoreBackend]
C -->|xgrammar| E[XGrammarBackend]
C -->|llguidance| F[LLGuidanceBackend]
D --> G[get_json_schema_logits_processor]
E --> G
F --> G
The get_json_schema_logits_processor function creates the appropriate processor:
def get_json_schema_logits_processor(
    backend_name: str | None,
    model: SteerableModel,
    json_schema: str,
) -> LogitsProcessorType:
    """Create a logits processor from a JSON schema."""
    backend = _get_backend(
        backend_name or JSON_SCHEMA_DEFAULT_BACKEND,
        model,
    )
    return backend.get_json_schema_logits_processor(json_schema)
Source: outlines/backends/__init__.py:1-50
vLLM Offline Backend
The vLLM offline backend converts JsonSchema terms to vLLM's GuidedDecodingParams:
def _get_guided_decoding_params(self, output_type) -> dict:
    """Convert output type to guided decoding parameters."""
    if output_type is None:
        return {}
    term = python_types_to_terms(output_type)
    if isinstance(term, CFG):
        return {"grammar": term.definition}
    elif isinstance(term, JsonSchema):
        guided_decoding_params = {"json": json.loads(term.schema)}
        if term.whitespace_pattern:
            guided_decoding_params["whitespace_pattern"] = term.whitespace_pattern
        return guided_decoding_params
    else:
        return {"regex": to_regex(term)}
Source: outlines/models/vllm_offline.py:50-80
Usage Patterns
Basic Pydantic Model Usage
from pydantic import BaseModel
from enum import Enum
from outlines import from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
class Rating(Enum):
    poor = 1
    fair = 2
    good = 3
    excellent = 4
class ProductReview(BaseModel):
    rating: Rating
    pros: list[str]
    cons: list[str]
    summary: str
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
review = model("Amazing laptop! Great battery life, fast processor.", ProductReview)
Source: README.md:1-50
Using json_schema Function
import outlines
schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"},
"email": {"type": "string", "format": "email"}
},
"required": ["name", "email"]
}
result = model("Generate a user profile", outlines.json_schema(schema))
Union Types for Flexible Output
from typing import Literal, Union
from pydantic import BaseModel
class EventInfo(BaseModel):
    name: str
    date: str
    location: str
EventResponse = Union[EventInfo, Literal["I don't know"]]
result = model("What event is mentioned?", EventResponse)
Type System Functions
The DSL module provides utility functions for type checking and conversion:
Type Check Functions
| Function | Purpose |
|---|---|
| is_int(t) | Check if type is int |
| is_float(t) | Check if type is float |
| is_str(t) | Check if type is str |
| is_bool(t) | Check if type is bool |
| is_datetime(t) | Check if type is datetime |
| is_date(t) | Check if type is date |
| is_time(t) | Check if type is time |
| is_pydantic_model(t) | Check if type is Pydantic BaseModel |
| is_enum(t) | Check if type is Enum |
| is_literal(t) | Check if type is Literal |
| is_union(t) | Check if type is Union |
| is_typing_list(t) | Check if type is List |
| is_typed_dict(t) | Check if type is TypedDict |
Source: outlines/types/dsl.py:80-150
python_types_to_terms
The main conversion function that transforms Python types into Term instances:
def python_types_to_terms(
    output_type,
    whitespace_pattern: Optional[str] = None
) -> Term:
    """Convert Python types to Term instances for guided generation.
    Handles:
    - Primitive types (int, float, str, bool)
    - Collections (List, Dict, Tuple)
    - Pydantic models
    - Enums and Literals
    - Union types
    - JSON Schema strings
    - TypedDict and dataclasses
    """
Source: outlines/types/dsl.py:200-280
Error Handling
Schema Conversion Failures
When schema conversion fails, Outlines provides informative warnings:
except Exception as e:  # pragma: no cover
    warnings.warn(
        f"Cannot convert schema type {type(schema)} to {target_type}: {e}"
    )
    continue
If no valid conversion is found, a ValueError is raised:
raise ValueError(
f"Cannot convert schema type {type(schema)} to any of the target "
f"types {target_types}"
)
Source: outlines/types/dsl.py:75-82
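Put together, the two fragments describe a "try each target, warn on failure, raise if none works" loop. The sketch below shows that control flow in isolation. The `CONVERTERS` table and `convert_first` are hypothetical; only the warning and error messages are taken from the fragments above.

```python
import json
import warnings
from typing import Any, Callable, Dict, List

# Hypothetical converters keyed by target name; each may raise on bad input.
CONVERTERS: Dict[str, Callable[[Any], Any]] = {
    "dict": lambda schema: schema if isinstance(schema, dict) else json.loads(schema),
    "str": lambda schema: schema if isinstance(schema, str) else json.dumps(schema),
}


def convert_first(schema: Any, target_types: List[str]) -> Any:
    """Return the first successful conversion, warning about each failure."""
    for target_type in target_types:
        try:
            return CONVERTERS[target_type](schema)
        except Exception as e:
            warnings.warn(
                f"Cannot convert schema type {type(schema)} to {target_type}: {e}"
            )
            continue
    raise ValueError(
        f"Cannot convert schema type {type(schema)} to any of the target "
        f"types {target_types}"
    )


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # "pydantic" is not in this sketch's table, so it warns and falls through to "dict".
    result = convert_first('{"type": "string"}', ["pydantic", "dict"])

print(result)       # {'type': 'string'}
print(len(caught))  # 1
```

Warnings keep the partial failures visible without aborting, while the final ValueError makes sure a total failure cannot pass silently.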
Best Practices
- Use Pydantic for Complex Schemas - Pydantic models provide validation and IDE autocomplete
- Define Required Fields - JSON Schema required array ensures critical fields are generated
- Use Optional for Nullable Fields - Mark non-required fields with Optional[] or = None
- Leverage Union Types - Return fallback values when data is incomplete
- Cache Compiled FSMs - Outlines caches compiled state machines for reuse
Related Components
- CFG Support - Context-free grammar constraints for complex syntax
- Regex DSL - Direct regular expression specifications
- Template System - Jinja-based prompt templating
- Generator Class - Reusable generator objects with pre-compiled constraints
Source: https://github.com/dottxt-ai/outlines / Human Manual
Regex Patterns
Related topics: Output Types Overview
Regex patterns in Outlines provide a powerful Domain-Specific Language (DSL) for defining structured output constraints. The regex DSL allows developers to build complex pattern constraints by composing simple terms, which are then compiled into finite state machines (FSMs) that guide the language model's token generation process.
The core insight behind Outlines' regex support is that instead of generating text and hoping it matches a format, Outlines makes it impossible for the model to generate invalid outputs by masking invalid tokens during generation. Source: llm.txt:1-10
Overview
Outlines supports regex patterns at multiple levels of abstraction:
- Direct Regex Patterns: Use Regex class to define patterns that can be used as Pydantic field types
- Regex DSL: Compose complex patterns using term functions like either, optional, at_least, and integer/string helpers
- JSON Schema Integration: JsonSchema term accepts JSON schema strings and converts them to regex constraints
- Context-Free Grammars: CFG term provides grammar-based constraints for more complex languages
Source: outlines/release_note.md:1-20
Type Conversion Pipeline
The regex system follows a well-defined conversion pipeline:
Pydantic Model / Python Type → Term DSL → Regex → FSM → Token Masking
This pipeline ensures that high-level type specifications are progressively transformed into low-level token constraints that the generation process can enforce.
Source: llm.txt:1-15
The Term Classes
The Term class hierarchy forms the foundation of Outlines' regex DSL. All terms implement a common interface that supports composition operations and conversion to regex patterns.
Source: outlines/types/dsl.py:1-30
Core Term Classes
| Term Class | Description | Standalone Usage |
|---|---|---|
| Regex | Represents a raw regex pattern | Yes |
| String | Literal string matching | Yes |
| JsonSchema | JSON schema to regex conversion | Yes |
| CFG | Context-free grammar constraints | Yes |
| Sequence | Concatenation of multiple terms | No |
| Alternatives | Choice between multiple terms | No |
| KleeneStar | Zero or more repetitions | No |
| KleenePlus | One or more repetitions | No |
| Optional | Zero or one occurrence | No |
Source: outlines/types/dsl.py:30-80
Regex Class
The Regex class is a Pydantic-compatible type that represents a regular expression pattern. It can be used directly as a field type in Pydantic models.
from outlines.types import Regex
from pydantic import BaseModel
age_type = Regex("[0-9]+")
class User(BaseModel):
    name: str
    age: age_type
Source: outlines/types/dsl.py:85-95
The Regex class provides the following composition operators:
| Operator | Method | Description |
|---|---|---|
| + | __add__ | Concatenate patterns (sequence) |
| \| | __or__ | Create alternatives (choice) |
| r+ | __radd__ | Right-side concatenation |
| r\| | __ror__ | Right-side alternatives |
Source: outlines/types/dsl.py:97-115
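To show what these four operators buy you, here is a toy term class that supports the same expressions. `Pattern` is a hypothetical stand-in that builds plain regex strings; it is not the library's Regex class, and the real implementation returns Sequence and Alternatives terms rather than strings.

```python
import re


class Pattern:
    """Toy regex term supporting + (sequence) and | (alternatives)."""

    def __init__(self, regex: str):
        self.regex = regex

    @staticmethod
    def _coerce(value) -> "Pattern":
        # Plain strings are treated as literal text, so they are escaped.
        return value if isinstance(value, Pattern) else Pattern(re.escape(value))

    def __add__(self, other) -> "Pattern":
        return Pattern(f"(?:{self.regex})(?:{self._coerce(other).regex})")

    def __radd__(self, other) -> "Pattern":
        # Called for `"text" + pattern`, where the left operand is a plain str.
        return self._coerce(other) + self

    def __or__(self, other) -> "Pattern":
        return Pattern(f"(?:{self.regex})|(?:{self._coerce(other).regex})")

    def __ror__(self, other) -> "Pattern":
        # Called for `"text" | pattern`.
        return self._coerce(other) | self

    def matches(self, text: str) -> bool:
        return re.fullmatch(self.regex, text) is not None


digits = Pattern(r"\d+")
code = "ID-" + digits            # uses __radd__
answer = "yes" | Pattern("no")   # uses __ror__

print(code.matches("ID-42"))     # True
print(code.matches("42"))        # False
print(answer.matches("yes"))     # True
```

The reflected methods are what allow a plain string to appear on the left-hand side of an expression, which keeps composed patterns readable.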
JsonSchema Term
The JsonSchema term accepts a JSON schema string and converts it into regex constraints. This allows seamless integration with existing JSON schema definitions.
from outlines import from_transformers
from outlines.types import JsonSchema
from transformers import AutoModelForCausalLM, AutoTokenizer
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
json_schema = '{"type": "object", "properties": {"answer": {"type": "number"}}}'
result = model("What's 2 + 2? Respond in JSON", JsonSchema(json_schema))
Source: outlines/release_note.md:1-25
CFG Term
The CFG (Context-Free Grammar) term allows definition of constraints using context-free grammar notation. This is useful for complex languages where regex alone is insufficient.
Source: outlines/types/dsl.py:75-80
Regex DSL Functions
The regex DSL provides utility functions for building complex patterns by combining simpler terms.
Composition Functions
| Function | Description |
|---|---|
| either(*terms) | Create alternatives from multiple terms |
| optional(term) | Make a term optional (zero or one) |
| at_least(n, term) | Require at least n occurrences |
| integer() | Match integer patterns |
| float() | Match floating-point number patterns |
| boolean() | Match boolean patterns |
Source: outlines/release_note.md:1-30
Building Complex Patterns
The following example demonstrates building a complex regex pattern using the DSL:
from outlines import from_transformers
from outlines.types import at_least, either, integer, optional
from transformers import AutoModelForCausalLM, AutoTokenizer
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
# Build a pattern that matches email-like strings
pattern = either("[email protected]", "[email protected]")
Source: outlines/release_note.md:25-35
Architecture
FSM Compilation Flow
graph TD
A[User Pattern Definition] --> B[Pydantic Model / Regex DSL]
B --> C[JSON Schema Extraction]
C --> D[Regex Generation]
D --> E[FSM Compilation via interegular]
E --> F[Token-Level Constraints]
F --> G[Logits Masking during Generation]
G --> H[Valid Output Generation]
Source: llm.txt:15-25
Layer Stack
The regex system integrates with Outlines' layered architecture:
User API (outlines.models)
↓
Generator Classes (SteerableGenerator, BlackBoxGenerator)
↓
Type System (types/dsl.py: Pydantic → JsonSchema → Regex)
↓
FSM Compilation (outlines-core: regex → FSM via interegular)
↓
Guide System (processors/guide.py: FSM state management)
↓
Logits Processing (processors/structured.py: token masking)
↓
Model Providers (transformers, OpenAI, etc.)
Source: llm.txt:30-45
Usage Patterns
Simple Classification with Literals
While not strictly regex, Outlines uses the same constraint infrastructure for literal choices:
from typing import Literal
from outlines import from_transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = from_transformers(
AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct"),
AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")
)
result = model("Pizza or burger", Literal["pizza", "burger"])
Source: outlines/release_note.md:60-75
Regex as Pydantic Field Types
For more complex validation, use Regex as a Pydantic field type:
from outlines.types import Regex
from pydantic import BaseModel
class ProductCode(BaseModel):
    code: Regex(r"^[A-Z]{3}-[0-9]{4}$")
The Regex class implements Pydantic's schema generation and validation interfaces:
| Method | Purpose |
|---|---|
| __get_validators__ | Pydantic validator for input validation |
| __get_pydantic_core_schema__ | Core schema for Pydantic v2 integration |
| __get_pydantic_json_schema__ | JSON schema generation |
Source: outlines/types/dsl.py:117-130
Integration with Type System
Python Types to Terms Conversion
The python_types_to_terms function maps Python types to their corresponding Term representations:
| Python Type | Term Equivalent |
|---|---|
| int | integer() |
| float | float() |
| str | string |
| bool | boolean() |
| List[T] | Pattern for lists |
| Literal[...] | Alternatives |
Source: outlines/types/dsl.py:1-60
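A type-to-pattern dispatch of this kind can be sketched with a lookup table plus one recursive case for lists. Everything here is illustrative: `type_to_regex` is a hypothetical function, it produces regex strings rather than Term objects, and the patterns are assumptions that may differ from the ones the library uses.

```python
import re
from typing import List, get_args, get_origin

INTEGER = r"[+-]?(?:0|[1-9][0-9]*)"

# Illustrative patterns only; the library's own patterns may differ.
_SIMPLE_PATTERNS = {
    int: INTEGER,
    float: INTEGER + r"(?:\.[0-9]+)?",
    bool: r"(?:True|False)",
}


def type_to_regex(tp) -> str:
    """Return a regex string for a small subset of Python types."""
    if tp in _SIMPLE_PATTERNS:
        return _SIMPLE_PATTERNS[tp]
    if get_origin(tp) is list:
        (item_type,) = get_args(tp)
        item = type_to_regex(item_type)
        # Zero or more comma-separated items inside square brackets.
        return r"\[(?:" + item + r"(?:, " + item + r")*)?\]"
    raise TypeError(f"unsupported type in this sketch: {tp!r}")


print(bool(re.fullmatch(type_to_regex(int), "-12")))              # True
print(bool(re.fullmatch(type_to_regex(int), "012")))              # False
print(bool(re.fullmatch(type_to_regex(List[int]), "[1, 2, 3]")))  # True
```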
Schema Utilities
The type system includes utilities for converting between different schema formats:
- json_schema_dict_to_pydantic(): Convert JSON schema to Pydantic model
- json_schema_dict_to_typeddict(): Convert to TypedDict
- json_schema_dict_to_dataclass(): Convert to dataclass
Source: outlines/types/dsl.py:50-55
Key Design Decisions
Token-Level Control
Outlines' regex constraints operate at the token level, not character level. This means:
- FSMs are compiled from regex patterns using the interegular library
- State transitions map (state, token) → next_state
- For each state, invalid tokens are masked by setting their logits to negative infinity
Source: llm.txt:45-50
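The masking step can be illustrated with a hand-written state machine for the regex `[0-9]+` and a toy vocabulary. This is a sketch under simplifying assumptions: `next_state` and `mask_logits` are hypothetical names, logits are plain Python floats rather than tensors, and end-of-sequence handling is left out.

```python
import math
from typing import List, Optional

DIGITS = set("0123456789")


def next_state(state: int, token: str) -> Optional[int]:
    """Advance a DFA for [0-9]+ over a whole token; None means no valid transition.

    State 0 is the start state and state 1 is the accepting state.
    """
    if not token:
        return None
    for ch in token:
        if ch not in DIGITS:
            return None
        state = 1  # any digit leads to (or stays in) the accepting state
    return state


def mask_logits(state: int, vocab: List[str], logits: List[float]) -> List[float]:
    """Keep logits of tokens with a valid transition; set the rest to -inf."""
    return [
        logit if next_state(state, token) is not None else -math.inf
        for token, logit in zip(vocab, logits)
    ]


vocab = ["1", "23", "4a", "hello", " 7"]
logits = [0.1, 0.5, 2.0, 3.0, 1.0]
masked = mask_logits(0, vocab, logits)
best = vocab[masked.index(max(masked))]

print(masked)  # [0.1, 0.5, -inf, -inf, -inf]
print(best)    # 23
```

Although "hello" had the highest raw logit, it can never be chosen once masked, so the top remaining candidate is the multi-character token "23". This also shows why transitions are defined per token rather than per character.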
Lazy Compilation
FSMs are compiled on first use and cached persistently. This ensures:
- Initial overhead is minimal
- Repeated generation with the same schema is fast
- Memory is efficiently managed through caching
Source: llm.txt:50-55
API Reference
Regex Class
class Regex(Term):
    def __init__(self, pattern: str):
        """Initialize with a regex pattern string."""
    def __add__(self, other: Term) -> Sequence:
        """Concatenate patterns."""
    def __or__(self, other: Term) -> Alternatives:
        """Create alternatives."""
    def validate(self, value: Any) -> Any:
        """Validate a value against the pattern."""
DSL Functions
def either(*terms: Term) -> Alternatives:
    """Create alternatives from multiple terms."""
def optional(term: Term) -> Term:
    """Make a term optional (zero or one occurrence)."""
def at_least(n: int, term: Term) -> Term:
    """Require at least n occurrences."""
def integer() -> Term:
    """Match integer patterns."""
def boolean() -> Term:
    """Match boolean patterns."""
Source: outlines/types/dsl.py:30-75
Best Practices
- Pre-compile complex patterns: If using the same pattern multiple times, consider using the Generator class to cache the FSM compilation
- Use Pydantic models for complex structures: JSON schema conversion provides a cleaner API for nested objects
- Leverage composition operators: Build complex patterns from simple terms using + and | operators
- Test patterns separately: Validate regex patterns independently before using them in generation
Source: https://github.com/dottxt-ai/outlines / Human Manual
Doramagic Pitfall Log
Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.
First-time setup may fail or require extra isolation and rollback planning.
Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
Users cannot judge support quality until recent activity, releases, and issue response are checked.
The project may affect permissions, credentials, data exposure, or host boundaries.
Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.
1. Installation risk: [Feature] Streaming structured generation with partial validation
- Severity: high
- Finding: Installation risk is backed by a source signal: [Feature] Streaming structured generation with partial validation. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1856
2. Configuration risk: 📝 Integration Proposal: CAJAL — Structured Scientific Paper Generation
- Severity: high
- Finding: Configuration risk is backed by a source signal: 📝 Integration Proposal: CAJAL — Structured Scientific Paper Generation. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1859
3. Maintenance risk: Add more custom types
- Severity: high
- Finding: Maintenance risk is backed by a source signal: Add more custom types. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1303
4. Security or permission risk: Add function calling and MCP support
- Severity: high
- Finding: Security or permission risk is backed by a source signal: Add function calling and MCP support. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1626
5. Security or permission risk: [Feature Request] Add streaming support for structured generation
- Severity: high
- Finding: Security or permission risk is backed by a source signal: [Feature Request] Add streaming support for structured generation. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1842
6. Installation risk: Feature request: OWASP ASI06 memory poisoning defense for structured generation
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: Feature request: OWASP ASI06 memory poisoning defense for structured generation
- User impact: Developers may fail before the first successful local run: Feature request: OWASP ASI06 memory poisoning defense for structured generation
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Feature request: OWASP ASI06 memory poisoning defense for structured generation. Context: Observed when using Python.
- Evidence: failure_mode_cluster:github_issue | fmev_aafbb33fe2e219639553f4d4275e0223 | https://github.com/dottxt-ai/outlines/issues/1864 | Feature request: OWASP ASI06 memory poisoning defense for structured generation
7. Installation risk: Incompatibility with vllm==0.19 because of some api changes
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: Incompatibility with vllm==0.19 because of some api changes
- User impact: Developers may fail before the first successful local run: Incompatibility with vllm==0.19 because of some api changes
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Incompatibility with vllm==0.19 because of some api changes. Context: Observed when using Python and CUDA.
- Evidence: failure_mode_cluster:github_issue | fmev_9f23e49bc91e3f8af003ddcdedec3e72 | https://github.com/dottxt-ai/outlines/issues/1854 | Incompatibility with vllm==0.19 because of some api changes
8. Installation risk: Outlines v1.2.6
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: Outlines v1.2.6
- User impact: Upgrade or migration may change expected behavior: Outlines v1.2.6
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Outlines v1.2.6. Context: Observed during installation or first-run setup.
- Evidence: failure_mode_cluster:github_release | fmev_e917f6640a48bc54b76cbbbfcfd2b346 | https://github.com/dottxt-ai/outlines/releases/tag/1.2.6 | Outlines v1.2.6
9. Installation risk: Outlines v1.2.8
- Severity: medium
- Finding: Developers should check this installation risk before relying on the project: Outlines v1.2.8
- User impact: Upgrade or migration may change expected behavior: Outlines v1.2.8
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Outlines v1.2.8. Context: Observed when using Python.
- Evidence: failure_mode_cluster:github_release | fmev_802eb50b3a54cd87f585ac14e899b4bc | https://github.com/dottxt-ai/outlines/releases/tag/1.2.8 | Outlines v1.2.8
10. Installation risk: Feature request: OWASP ASI06 memory poisoning defense for structured generation
- Severity: medium
- Finding: Installation risk is backed by a source signal: Feature request: OWASP ASI06 memory poisoning defense for structured generation. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/dottxt-ai/outlines/issues/1864
11. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:615403340 | https://github.com/dottxt-ai/outlines | README/documentation is current enough for a first validation pass.
12. Maintenance risk: Outlines v1.2.10
- Severity: medium
- Finding: Developers should check this migration risk before relying on the project: Outlines v1.2.10
- User impact: Upgrade or migration may change expected behavior: Outlines v1.2.10
- Recommended check: Before packaging this project, run the relevant install/config/quickstart check for: Outlines v1.2.10. Context: Observed when using Python.
- Evidence: failure_mode_cluster:github_release | fmev_75fc0fce3c200ef68083c6815dfb1b11 | https://github.com/dottxt-ai/outlines/releases/tag/1.2.10 | Outlines v1.2.10
Source: Doramagic discovery, validation, and Project Pack records
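Several entries in the log above are tied to specific versions (the Outlines 1.2.x releases and vllm==0.19), so a first step in the recommended checks is knowing what is actually installed. The following is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the package names `outlines` and `vllm` are assumed to be the distribution names used in your environment.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string for a package, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages named in the risk log; extend the tuple for your own environment.
report = {name: installed_version(name) for name in ("outlines", "vllm")}

for name, found in report.items():
    print(f"{name}: {found or 'not installed'}")
```

The output only records the environment. Whether a given version is affected still has to be confirmed by opening the linked issue or release notes.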
Community Discussion Evidence
These external discussion links are review inputs, not standalone proof that the project is production-ready.
Open the linked issues or discussions before treating the pack as ready for your environment.
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using outlines with real data or production workflows.
- Feature request: OWASP ASI06 memory poisoning defense for structured gen - github / github_issue
- [[Feature Request] Add streaming support for structured generation](https://github.com/dottxt-ai/outlines/issues/1842) - github / github_issue
- Add more custom types - github / github_issue
- 📝 Integration Proposal: CAJAL — Structured Scientific Paper Generation - github / github_issue
- [[Feature] Streaming structured generation with partial validation](https://github.com/dottxt-ai/outlines/issues/1856) - github / github_issue
- Complex structure makes output empty - github / github_issue
- TransformerTokenizer reads attributes from raw backend that modern trans - github / github_issue
- Incompatibility with vllm==0.19 because of some api changes - github / github_issue
- Add function calling and MCP support - github / github_issue
- Outlines v1.3.0 - github / github_release
- Outlines v1.2.12 - github / github_release
- Outlines v1.2.10 - github / github_release
Source: Project Pack community evidence and pitfall evidence