# https://github.com/huggingface/peft Project Documentation

Generated: 2026-05-16 07:29:09 UTC

## Table of Contents

- [Introduction to PEFT](#page-introduction)
- [Installation Guide](#page-installation)
- [System Architecture](#page-architecture)
- [Core Components](#page-core-components)
- [LoRA and LoRA Variants](#page-lora-methods)
- [Other PEFT Methods](#page-other-methods)
- [Configuration System](#page-configuration)
- [Model Loading and Saving](#page-model-loading)
- [Quantization Integration](#page-quantization)
- [Advanced Features](#page-advanced-features)

<a id='page-introduction'></a>

## Introduction to PEFT

### Related Pages

Related topics: [Installation Guide](#page-installation), [System Architecture](#page-architecture), [LoRA and LoRA Variants](#page-lora-methods)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/peft_model.py](https://github.com/huggingface/peft/blob/main/src/peft/peft_model.py)
- [src/peft/helpers.py](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)
- [src/peft/tuners/lora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py)
- [src/peft/tuners/tuners_utils.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/tuners_utils.py)
- [src/peft/tuners/xlora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/xlora/model.py)
- [src/peft/tuners/hira/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/hira/model.py)
- [src/peft/tuners/adamss/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/adamss/model.py)
- [src/peft/tuners/gralora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/gralora/model.py)
- [src/peft/utils/hotswap.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/hotswap.py)
</details>

# Introduction to PEFT

## Overview

**PEFT** (Parameter-Efficient Fine-Tuning) is a Python library developed by Hugging Face that provides efficient methods for fine-tuning pre-trained models while keeping most model parameters frozen. This approach significantly reduces computational costs and memory requirements compared to full fine-tuning, making it feasible to work with large language models on limited hardware resources.

The library supports multiple fine-tuning techniques including LoRA, Prefix Tuning, Prompt Tuning, AdaLoRA, QLoRA, and many other parameter-efficient methods. PEFT is designed to integrate seamlessly with the Hugging Face Transformers ecosystem, allowing users to apply adapter-based fine-tuning with minimal code changes.

Source: [src/peft/tuners/lora/model.py:1-50]()

## Core Architecture

### Design Philosophy

PEFT follows an adapter-based architecture where lightweight trainable modules are added to pre-trained models. These adapters contain a small fraction of the total model parameters, typically ranging from 0.1% to 5% of the original model size, depending on the configuration.

The core principles of PEFT's architecture include:

- **Modularity**: Each fine-tuning method is implemented as a separate "tuner" with its own configuration class
- **Composability**: Multiple adapters can be loaded and used simultaneously
- **Compatibility**: Full integration with Hugging Face Transformers and Diffusers
- **Memory Efficiency**: Support for quantization and CPU offloading strategies

Source: [src/peft/tuners/tuners_utils.py:1-30]()
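
The parameter fraction is easy to inspect. Below is a minimal sketch (using `gpt2` as a stand-in base model; any causal LM works) that wraps a model with a LoRA adapter and prints the trainable share:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(task_type="CAUSAL_LM", r=8, target_modules=["c_attn"])
peft_model = get_peft_model(base_model, config)

# Prints something like:
# trainable params: 294,912 || all params: 124,734,720 || trainable%: 0.2364
peft_model.print_trainable_parameters()
```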

### Component Hierarchy

```mermaid
graph TD
    A[PeftModel] --> B[BaseTuner]
    B --> C[Model Specific Tuners]
    C --> D[LoraModel]
    C --> E[PrefixTuningModel]
    C --> F[PromptTuningModel]
    C --> G[AdaLoRAModel]
    C --> H[QLoRAModel]
    C --> I[XLoraModel]
    C --> J[HiraModel]
    C --> K[GraloraModel]
    C --> L[AdamssModel]
```

## Supported Fine-Tuning Methods

PEFT provides implementations for various parameter-efficient fine-tuning techniques. Each method has its own configuration class and model wrapper.

| Method | Configuration Class | Description |
|--------|---------------------|-------------|
| LoRA | `LoraConfig` | Low-Rank Adaptation using rank-decomposition matrices |
| Prefix Tuning | `PrefixTuningConfig` | Optimizes continuous prompts prepended to layer inputs |
| Prompt Tuning | `PromptTuningConfig` | Trains soft prompts embedded in the input layer |
| P-Tuning | `PromptEncoderConfig` | Uses trainable prompt embeddings with an optional LSTM/MLP encoder |
| AdaLoRA | `AdaLoraConfig` | Adaptive LoRA with dynamic rank allocation |
| QLoRA | `LoraConfig` (on a quantized base model) | LoRA applied on top of a 4-bit or 8-bit quantized base model |
| IA³ | `IA3Config` | Infused Adapter by Inhibiting and Amplifying Inner Activations |
| Multi Adapter | `MultiAdapterConfig` | Combines multiple adapters |
| LoHa | `LoHaConfig` | Low-Rank Hadamard Product adaptation |
| LoKr | `LoKrConfig` | Low-rank Kronecker product adaptation |
| AdaLoKr | `AdaLoKrConfig` | Adaptive LoKr with dynamic rank allocation |
| OFT | `OFTConfig` | Orthogonal Fine-Tuning |
| BOFT | `BOFTConfig` | Block-diagonal OFT |
| Vera | `VeraConfig` | Vector-based Random Matrix Adaptation |
| XLora | `XLoraConfig` | Mixture of LoRA experts with learned adapter scalings |
| Hira | `HiraConfig` | High-Rank Adaptation via Hadamard products with the base weights |
| Gralora | `GraloraConfig` | Granular Low-Rank Adaptation with block-wise decomposition |
| Adamss | `AdamssConfig` | Adaptive subspace-efficient fine-tuning |
| SHiRA | `ShiraConfig` | Sparse High Rank Adapters |
| LN Tuning | `LNTuningConfig` | Trains only the LayerNorm parameters |
| Loralite | `LoraliteConfig` | Lightweight LoRA variant |

Source: [src/peft/tuners/lora/model.py:1-80]()

## Task Types

PEFT supports various NLP task types through specialized model classes. Each task type is designed for specific downstream applications.

```mermaid
graph LR
    A[Base Model] --> B[PeftModel]
    B --> C{Task Type}
    C --> D[CAUSAL_LM]
    C --> E[SEQ_2_SEQ_LM]
    C --> F[FEATURE_EXTRACTION]
    C --> G[QUESTION_ANS]
    C --> H[SEQ_CLS]
    C --> I[TOKEN_CLS]
    C --> J[IMAGE_CLS]
```

### Task-Specific Models

| Task Type | Model Class | Use Case |
|-----------|-------------|----------|
| `CAUSAL_LM` | `PeftModelForCausalLM` | Autoregressive text generation |
| `SEQ_2_SEQ_LM` | `PeftModelForSeq2SeqLM` | Encoder-decoder tasks (translation, summarization) |
| `FEATURE_EXTRACTION` | `PeftModelForFeatureExtraction` | Embedding extraction |
| `QUESTION_ANS` | `PeftModelForQuestionAnswering` | Question answering tasks |
| `SEQ_CLS` | `PeftModelForSequenceClassification` | Text classification |
| `TOKEN_CLS` | `PeftModelForTokenClassification` | Named entity recognition, POS tagging |

Source: [src/peft/peft_model.py:1-100]()

## Core API

### PeftModel Class

The `PeftModel` is the base class for all PEFT models. It wraps a pre-trained model and manages adapter injection, loading, and merging.

#### Key Methods

| Method | Description |
|--------|-------------|
| `from_pretrained(model, model_id, adapter_name, ...)` | Load PEFT model from pretrained weights |
| `get_peft_config(adapter_name)` | Get configuration for a specific adapter |
| `print_trainable_parameters()` | Display trainable vs total parameter counts |
| `merge_and_unload(progressbar, safe_merge, adapter_names)` | Merge adapters into base model |
| `unload()` | Return base model without PEFT modules |
| `set_adapter(adapter_name)` | Activate a specific adapter |
| `add_weighted_adapter(adapters, weights, adapter_name, ...)` | Combine multiple adapters into a new one |

Source: [src/peft/peft_model.py:100-200]()
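
For illustration, a short sketch of adapter management with these methods, assuming a `peft_model` that already has two LoRA adapters loaded under the (hypothetical) names `"default"` and `"french"`:

```python
# Activate one adapter
peft_model.set_adapter("french")

# Combine both adapters into a new one
peft_model.add_weighted_adapter(
    adapters=["default", "french"],
    weights=[0.7, 0.3],
    adapter_name="combined",
    combination_type="linear",
)
peft_model.set_adapter("combined")
```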

### Loading Pre-trained Adapters

The `from_pretrained` class method loads PEFT adapters from the Hugging Face Hub or local storage:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel, PeftConfig

# Load configuration
config = PeftConfig.from_pretrained("user/peft-model")

# Load base model
base_model = AutoModelForCausalLM.from_pretrained("base-model")

# Create PEFT model with loaded adapter
peft_model = PeftModel.from_pretrained(
    base_model, 
    "user/peft-model",
    adapter_name="default",
    is_trainable=False,
    autocast_adapter_dtype=True
)
```

Source: [src/peft/peft_model.py:200-280]()

### Merging and Unloading

PEFT models support merging adapters back into the base model for inference:

```python
# Merge and unload to get a standalone model
merged_model = peft_model.merge_and_unload()

# Safe merge: checks the merged weights for NaNs before committing
merged_model = peft_model.merge_and_unload(safe_merge=True)

# Merge specific adapters only
merged_model = peft_model.merge_and_unload(adapter_names=["adapter1", "adapter2"])

# Unload without merging
base_model = peft_model.unload()
```

Source: [src/peft/tuners/tuners_utils.py:50-100]()

## Adapter Management

### Multi-Adapter Support

PEFT supports loading and managing multiple adapters simultaneously. This is useful for ensemble methods or when combining adapters trained on different tasks.

```python
# Load multiple adapters into a single X-LoRA model
adapters = {
    "adapter_1": "./path/to/adapter-1",
    "adapter_2": "./path/to/adapter-2",
}

xlora_config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=base_model.config.hidden_size,
    adapters=adapters,
)
model = get_peft_model(base_model, xlora_config)
```

Source: [src/peft/tuners/xlora/model.py:1-50]()

### Hotswap Adapter

The hotswap functionality allows replacing loaded adapters without reloading the entire model:

```python
from peft.utils.hotswap import hotswap_adapter

# Replace the default adapter with a new one
hotswap_adapter(
    model, 
    "path-to-new-adapter", 
    adapter_name="default",
    torch_device="cuda:0"
)
```

This operation validates the new adapter configuration and swaps the weights while maintaining the model structure.

Source: [src/peft/utils/hotswap.py:1-80]()
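
When the model is compiled with `torch.compile`, the old and new adapters must have matching ranks and scalings to avoid recompilation. A sketch, assuming a recent PEFT version that ships `prepare_model_for_compiled_hotswap`:

```python
import torch
from peft.utils.hotswap import prepare_model_for_compiled_hotswap

# Pad all LoRA ranks/scalings up-front so later hotswaps keep the same tensor shapes
prepare_model_for_compiled_hotswap(model, target_rank=64)
model = torch.compile(model)
```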

## Configuration Options

### Common Parameters

Most PEFT configuration classes share common parameters that control the fine-tuning behavior:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `r` | int | 8 | LoRA rank dimension |
| `lora_alpha` | int | 8 | LoRA scaling factor |
| `lora_dropout` | float | 0.0 | Dropout probability for LoRA layers |
| `target_modules` | List[str] | None | Names of modules to apply adaptation |
| `bias` | str | "none" | Bias handling: "none", "all", "lora_only" |
| `modules_to_save` | List[str] | None | Additional trainable modules |
| `fan_in_fan_out` | bool | False | Transpose weights for certain architectures |

### Method-Specific Parameters

#### LoRA Configuration

```python
from peft import LoraConfig

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj", "k_proj", "out_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)
```

#### Prefix Tuning Configuration

```python
from peft import PrefixTuningConfig

config = PrefixTuningConfig(
    num_virtual_tokens=20,
    token_dim=768,
    num_transformer_submodules=1,
    num_attention_heads=12,
    num_layers=12,
    encoder_hidden_size=768,
    prefix_projection=False
)
```

Source: [src/peft/tuners/lora/model.py:50-150]()

## Advanced Features

### Dynamic Rank Allocation

Some PEFT methods support adaptive rank allocation, where the importance of different layers is evaluated during training:

```python
# Adaptive LoRA with dynamic ranking
config = AdaLoraConfig(
    init_r=16,
    target_r=8,
    lora_alpha=32,
    tinit=200,
    tfinal=1000,
    deltaT=10,
    lora_dropout=0.1,
    total_step=10000,  # total number of training steps; required by recent PEFT releases
)
```

Source: [src/peft/tuners/adamss/model.py:1-60]()

### Hierarchical Adaptation

Methods like Hira and Gralora implement hierarchical rank adaptation for better parameter efficiency:

```python
from peft import HiraConfig

config = HiraConfig(
    r=32,
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj"],
    hira_dropout=0.01,
    task_type="SEQ_2_SEQ_LM"
)
```

Source: [src/peft/tuners/hira/model.py:1-60]()

### Quantization Support

PEFT integrates with BitsAndBytes for 8-bit and 4-bit quantization:

```python
from peft import prepare_model_for_kbit_training, get_peft_model, LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "model-name",
    quantization_config=quantization_config
)
model = prepare_model_for_kbit_training(model)

config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, config)
```

## Helper Functions

### Signature Updates

The `helpers` module provides utility functions for updating model signatures:

```python
from peft.helpers import update_forward_signature, update_generate_signature, update_signature

# Update forward signature only
update_forward_signature(peft_model)

# Update generate signature only
update_generate_signature(peft_model)

# Update both
update_signature(peft_model, method="all")
```

### Model Validation

```python
from peft.helpers import check_if_peft_model

# Check if a model ID corresponds to a PEFT model
is_peft = check_if_peft_model("user/peft-model")

# Works with both Hub and local paths
is_peft_local = check_if_peft_model("./local/peft-model")
```

### Adapter Scale Rescaling

```python
from peft.helpers import rescale_adapter_scale

with rescale_adapter_scale(model, multiplier=0.5):
    output = model(inputs)
```

## Memory Optimization

### Low CPU Memory Usage

Loading adapters can be optimized for memory-constrained environments:

```python
# Create adapter weights on meta device for faster loading
peft_model = PeftModel.from_pretrained(
    base_model,
    adapter_path,
    low_cpu_mem_usage=True
)
```

### Training with Quantized Models

PEFT supports full training workflows with quantized base models:

```python
from peft import get_peft_model, LoraConfig, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True)
)
model = prepare_model_for_kbit_training(model)

config = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, config)
```

## Integration Patterns

### With Diffusers

PEFT works with Stable Diffusion and other diffusion models:

```python
from diffusers import StableDiffusionPipeline
from peft import MissModel, MissConfig

config_unet = MissConfig(
    r=8,
    target_modules=["proj_in", "proj_out", "to_k", "to_q", "to_v"],
    init_weights=True
)

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipeline.unet = MissModel(pipeline.unet, config_unet, "default")
```

Source: [src/peft/tuners/miss/model.py:1-60]()

### Cross-Modal Applications

Some PEFT methods, such as X-LoRA, act as a mixture of LoRA experts, learning to route between several adapters at inference time:

```python
from peft import XLoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("model-name", trust_remote_code=True)

config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=model.config.hidden_size,
    adapters={
        "adapter_1": "./path/to/adapter-1",
        "adapter_2": "./path/to/adapter-2",
    },
)

xlora_model = get_peft_model(model, config)
```

## Workflow Diagram

```mermaid
graph TD
    A[Pre-trained Model] --> B[Choose Fine-tuning Method]
    B --> C[Create PEFT Config]
    C --> D[Initialize Adapter]
    D --> E[Train Adapter]
    E --> F{Save or Load?}
    F -->|Save| G[save_pretrained]
    F -->|Load| H[from_pretrained]
    G --> I[Hub or Local]
    H --> J[Merge or Inference]
    J --> K[merge_and_unload]
    J --> L[Direct Inference]
    K --> M[Final Model]
    L --> M
```

## Best Practices

1. **Start with Default Ranks**: Begin with `r=8` for LoRA and increase based on performance
2. **Target Specific Modules**: Prefer targeting attention projection layers (`q_proj`, `v_proj`) over all linear layers
3. **Use Quantization for Large Models**: Apply 4-bit quantization (QLoRA) for models larger than 7B parameters
4. **Save Checkpoints Regularly**: Use PEFT's built-in checkpoint saving to avoid losing training progress
5. **Evaluate Before Merging**: Always evaluate adapter quality before merging into the base model

## Conclusion

PEFT provides a comprehensive framework for parameter-efficient fine-tuning that enables training large models on limited hardware. Its modular architecture supports various adaptation methods while maintaining compatibility with the broader Hugging Face ecosystem. Whether working with language models, vision models, or multi-modal architectures, PEFT offers consistent APIs and significant memory savings compared to full fine-tuning approaches.

Source: [src/peft/tuners/lora/model.py:1-100]()
Source: [src/peft/tuners/tuners_utils.py:1-50]()

---

<a id='page-installation'></a>

## Installation Guide

### Related Pages

Related topics: [Introduction to PEFT](#page-introduction), [Quantization Integration](#page-quantization)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [docs/source/install.md](https://github.com/huggingface/peft/blob/main/docs/source/install.md)
- [pyproject.toml](https://github.com/huggingface/peft/blob/main/pyproject.toml)
- [setup.py](https://github.com/huggingface/peft/blob/main/setup.py)
- [requirements.txt](https://github.com/huggingface/peft/blob/main/requirements.txt)
- [src/peft/helpers.py](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)
- [src/peft/utils/hotswap.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/hotswap.py)
</details>

# Installation Guide

This guide covers all methods for installing the PEFT (Parameter-Efficient Fine-Tuning) library, including dependencies management, optional feature installations, and verification procedures.

## Overview

The PEFT library provides state-of-the-art parameter-efficient fine-tuning methods including LoRA, AdaLoRA, Prefix Tuning, Prompt Tuning, and many other advanced techniques. Proper installation ensures access to all functionality including GPU acceleration, quantization support, and integration with Hugging Face Transformers and Diffusers.

**Key Installation Features:**
- Core library installation via pip, conda, or from source
- Optional dependencies for specific tuners and features
- GPU/CUDA support for accelerated training
- BitsAndBytes integration for quantization
- Diffusers integration for image generation models

## System Requirements

### Hardware Requirements

| Component | Minimum | Recommended |
|-----------|---------|-------------|
| RAM | 8 GB | 16 GB+ |
| GPU VRAM | 4 GB | 8-24 GB (depending on model size) |
| Storage | 5 GB | 10 GB+ |
| CUDA | 11.6 | 11.8+ or CUDA 12.x |

### Software Requirements

| Requirement | Version |
|-------------|---------|
| Python | ≥ 3.8 |
| PyTorch | ≥ 1.11.0 |
| Transformers | ≥ 4.20.0 |
| Diffusers (optional) | ≥ 0.13.0 |
| Accelerate | ≥ 0.20.0 |

## Installation Methods

### Standard Installation via pip

The simplest method to install PEFT is using pip:

```bash
pip install peft
```

This installs the core library with all base dependencies.

### Installing Specific Versions

To install a specific version of PEFT:

```bash
pip install peft==0.13.0
```

To install the latest development version from GitHub:

```bash
pip install git+https://github.com/huggingface/peft.git
```

### Installation from Source

For developers contributing to PEFT or needing the latest features:

```bash
git clone https://github.com/huggingface/peft.git
cd peft
pip install -e .
```

The editable installation (`-e .`) allows modifications to the source code while keeping the package importable.

## Dependencies Structure

### Core Dependencies

The core dependencies are defined in `pyproject.toml` and `requirements.txt`:

```text
# Core runtime dependencies
torch>=1.11.0
transformers>=4.20.0
accelerate>=0.20.0
```

Source: [pyproject.toml](https://github.com/huggingface/peft/blob/main/pyproject.toml)

### Optional Dependencies by Feature

PEFT itself keeps its required dependency set small; optional features are generally enabled by installing companion packages alongside it:

| Feature | Installation Command | Purpose |
|---------|---------------------|---------|
| Quantization | `pip install peft bitsandbytes` | BitsAndBytes 4-bit/8-bit quantization |
| GPU Training | install a CUDA-enabled PyTorch build | CUDA-accelerated operations |
| Diffusers | `pip install peft diffusers` | Stable Diffusion model support |
| Dev Tools | `pip install -e ".[dev]"` (from a source checkout) | Testing and linting |

### Advanced Installation with Quantization

For models requiring quantized weights (e.g., using 4-bit or 8-bit precision):

```bash
pip install peft bitsandbytes scipy accelerate
```

This combination enables:
- 4-bit quantization via BitsAndBytes
- 8-bit quantization for extreme memory reduction
- Mixed-precision training optimization
- Efficient loading of large models on limited hardware

Source: [src/peft/tuners/lora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py)
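
As a quick sanity check after installing these packages, the following sketch loads a small model in 4-bit precision (`facebook/opt-350m` is only a placeholder; quantization pays off on much larger models):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumes a bf16-capable GPU
)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=bnb_config,
)
```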

## Environment Setup

### Using Virtual Environments

**Using venv:**

```bash
python -m venv peft-env
source peft-env/bin/activate  # Linux/macOS
peft-env\Scripts\activate     # Windows
pip install peft
```

**Using conda:**

```bash
conda create -n peft-env python=3.10
conda activate peft-env
pip install peft
```

### CUDA Configuration

For GPU acceleration, ensure CUDA is properly configured:

```python
import torch
print(torch.cuda.is_available())  # Should return True
print(torch.cuda.device_count())  # Number of available GPUs
```

The PEFT library automatically detects and utilizes available CUDA devices during training.

## Verification and Testing

### Basic Installation Verification

Verify your installation by importing PEFT and checking the version:

```python
import peft
print(peft.__version__)  # Should print the installed version
```

### Quick Functionality Test

Test basic LoRA functionality:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import get_peft_model, LoraConfig

# Load a small model for testing
model_name = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)

# Configure LoRA
lora_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=8,
    lora_alpha=16,
    target_modules=["c_attn", "c_proj"],
    lora_dropout=0.05
)

# Apply PEFT
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()
```

### Signature Update Utilities

After installation, you may want to update method signatures for better IDE support:

```python
from peft.helpers import update_forward_signature, update_generate_signature

# Update forward signature
update_forward_signature(peft_model)

# Update generate signature (for generative models)
update_generate_signature(peft_model)
```

Source: [src/peft/helpers.py:1-100](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)

## Tuner-Specific Installation Notes

### LoRA and QLoRA

Standard LoRA requires no additional dependencies beyond the core installation. QLoRA additionally requires BitsAndBytes (with `trl` commonly added for the training loop):

```bash
pip install peft "bitsandbytes>=0.40.0" trl
```

Source: [src/peft/tuners/lora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py)

### Prefix Tuning and Prompt Tuning

These methods require only core dependencies:

```bash
pip install peft
```

### Diffusion Model Support (LoRA for Images)

For Stable Diffusion and similar models:

```bash
pip install peft diffusers
```

Example configuration for Stable Diffusion:

```python
from diffusers import StableDiffusionPipeline
from peft import MissModel, MissConfig

config_unet = MissConfig(
    r=8,
    target_modules=["proj_in", "proj_out", "to_k", "to_q", "to_v", "to_out.0"],
    init_weights=True
)

pipeline = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipeline.unet = MissModel(pipeline.unet, config_unet, "default")
```

Source: [src/peft/tuners/miss/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/miss/model.py)

### X-LoRA Installation

X-LoRA requires specific dependencies for multi-adapter support:

```bash
pip install peft transformers accelerate bitsandbytes
```

Source: [src/peft/tuners/xlora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/xlora/model.py)

## Troubleshooting

### Common Installation Issues

| Issue | Solution |
|-------|----------|
| `ImportError: No module named peft` | Reinstall: `pip uninstall peft && pip install peft` |
| CUDA out of memory | Use quantization or smaller batch sizes |
| BitsAndBytes import failure | Install: `pip install bitsandbytes` |
| Old PyTorch version | Update: `pip install "torch>=1.11.0"` |

### Version Compatibility

Check compatibility matrix:

| PEFT Version | Min Python | Min PyTorch | Min Transformers |
|--------------|------------|-------------|------------------|
| 0.13.x | 3.8+ | 1.11.0 | 4.20.0 |
| 0.12.x | 3.8+ | 1.11.0 | 4.20.0 |
| 0.11.x | 3.7+ | 1.11.0 | 4.20.0 |

### Verifying Adapter Loading

Test adapter functionality after installation:

```python
from peft.helpers import check_if_peft_model

is_peft = check_if_peft_model("path/to/model")
print(f"Is PEFT model: {is_peft}")
```

Source: [src/peft/helpers.py:51-65](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)

## Adapter Hotswap Installation

For runtime adapter switching functionality:

```bash
pip install peft
```

The hotswap capability is built into PEFT's core functionality:

```python
from peft.utils.hotswap import hotswap_adapter

# Load and swap adapters at runtime
hotswap_adapter(model, "path-to-new-adapter", adapter_name="default")
```

Source: [src/peft/utils/hotswap.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/hotswap.py)

## Next Steps

After successful installation:

1. **Quick Start**: Follow the [Quickstart Guide](quickstart.md) for first-time users
2. **Tuner Selection**: Review [available tuners](tuners.md) to choose the right method
3. **Configuration**: Learn about [PeftConfig](configuration.md) options
4. **Examples**: Explore [example notebooks](https://github.com/huggingface/peft/tree/main/examples) for your use case

## Summary

The PEFT library offers flexible installation options to accommodate various use cases from basic fine-tuning to advanced quantized training. Core installation via pip provides immediate access to all major functionality, while optional dependencies enable specialized features like 4-bit quantization and diffusion model support.

---

<a id='page-architecture'></a>

## System Architecture

### Related Pages

Related topics: [Core Components](#page-core-components), [Introduction to PEFT](#page-introduction), [Configuration System](#page-configuration)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/peft_model.py](https://github.com/huggingface/peft/blob/main/src/peft/peft_model.py)
- [src/peft/tuners/tuners_utils.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/tuners_utils.py)
- [src/peft/tuners/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/__init__.py)
- [src/peft/mapping.py](https://github.com/huggingface/peft/blob/main/src/peft/mapping.py)
- [src/peft/auto.py](https://github.com/huggingface/peft/blob/main/src/peft/auto.py)
</details>

# System Architecture

## Overview

The PEFT (Parameter-Efficient Fine-Tuning) library implements a modular architecture designed to enable efficient model adaptation without modifying the entire parameter set of pre-trained models. The system architecture is built around three core pillars: the **PeftModel base class hierarchy**, **tuner abstractions**, and **configuration management**.

PEFT supports multiple fine-tuning techniques including LoRA, IA³, Adapters, Prefix Tuning, Prompt Learning, and various specialized methods like SHiRA, GraLoRA, X-LoRA, and others. Each technique is implemented as a separate "tuner" that follows a common interface defined in the base tuner utilities.

## High-Level Architecture Diagram

```mermaid
graph TD
    User[User Code] --> PeftAPI[PeftModel API]
    PeftAPI --> PeftModel[PeftModel Base Class]
    PeftModel --> BaseTuner[BaseTuner]
    BaseTuner --> TunerRegistry[Tuner Registry]
    
    subgraph Tuners
        LoRA[LoRA Tuner]
        IA3[IA³ Tuner]
        PrefixTuning[Prefix Tuning]
        PromptLearning[Prompt Learning]
        SHiRA[SHiRA Tuner]
        GraLoRA[GraLoRA Tuner]
        XLoRA[X-LoRA Tuner]
        Hira[Hira Tuner]
        DeLoRA[DeLoRA Tuner]
        Miss[MiSS Tuner]
        Adamss[Adamss Tuner]
    end
    
    BaseTuner --> LoRA
    BaseTuner --> IA3
    BaseTuner --> PrefixTuning
    BaseTuner --> PromptLearning
    BaseTuner --> SHiRA
    BaseTuner --> GraLoRA
    BaseTuner --> XLoRA
    BaseTuner --> Hira
    BaseTuner --> DeLoRA
    BaseTuner --> Miss
    BaseTuner --> Adamss
    
    PeftModel --> Config[PeftConfig]
    Config --> ConfigMapping[PEFT_TYPE_TO_CONFIG_MAPPING]
    
    TunerRegistry --> TargetMapping[TRANSFORMERS_MODELS_TO_*_TARGET_MODULES_MAPPING]
```

## Core Components

### 1. PeftModel Base Class

The `PeftModel` class serves as the central entry point for all PEFT operations. It wraps a base model and manages adapter lifecycle, injection, and merging.

**Location**: `src/peft/peft_model.py`

#### Class Hierarchy

```mermaid
graph TD
    PyTorchModule[torch.nn.Module] --> PeftModel
    PeftModel --> PeftModelForCausalLM[PeftModelForCausalLM]
    PeftModel --> PeftModelForSeq2SeqLM[PeftModelForSeq2SeqLM]
    PeftModel --> PeftModelForSequenceClassification[PeftModelForSequenceClassification]
    PeftModel --> PeftModelForQuestionAnswering[PeftModelForQuestionAnswering]
    PeftModel --> PeftModelForTokenClassification[PeftModelForTokenClassification]
    PeftModel --> PeftModelForFeatureExtraction[PeftModelForFeatureExtraction]
```

#### Key Responsibilities

| Responsibility | Description |
|---------------|-------------|
| Adapter Management | Loading, activating, and switching between multiple adapters |
| Module Injection | Replacing target modules with tuner layers |
| Forward Pass | Intercepting and modifying forward pass with adapter weights |
| Weight Merging | Combining adapter weights with base model weights |
| Model Saving/Loading | Serialization and deserialization of PEFT configurations |

#### Constructor Signature

```python
def __init__(self, model: torch.nn.Module, peft_config: PeftConfig, adapter_name: str = "default", **kwargs)
```

**Parameters**:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | `torch.nn.Module` | Required | The base model to be adapted |
| `peft_config` | `PeftConfig` | Required | Configuration for the PEFT method |
| `adapter_name` | `str` | `"default"` | Name identifier for the adapter |
| `**kwargs` | Any | - | Additional arguments passed to specific tuners |

Source: [src/peft/peft_model.py:1-100]()

### 2. BaseTuner Class

The `BaseTuner` class defines the abstract interface that all tuner implementations must follow. It handles the core logic for module injection and adapter management.

**Location**: `src/peft/tuners/tuners_utils.py`

#### Core Attributes

```python
prefix: str = ""                    # Prefix for PEFT module names
tuner_layer_cls = None              # The tuner layer class
target_module_mapping = {}          # Maps model types to target modules
```

#### Key Methods

| Method | Purpose |
|--------|---------|
| `inject_adapter()` | Creates adapter layers and replaces target modules |
| `_create_and_replace()` | Creates or updates adapter modules for specific targets |
| `_replace_module()` | Performs the actual module replacement |
| `_check_target_module_compatibility()` | Validates module compatibility (e.g., for Mamba) |
| `merge_and_unload()` | Merges adapter weights into base model |
| `_unload_and_optionally_merge()` | Core logic for weight merging |

#### Adapter Injection Flow

```mermaid
sequenceDiagram
    participant User
    participant PeftModel
    participant BaseTuner
    participant Model as Base Model
    
    User->>PeftModel: inject_adapter(model, adapter_name)
    PeftModel->>BaseTuner: inject_adapter(...)
    BaseTuner->>BaseTuner: _create_and_replace(...)
    BaseTuner->>Model: Walk modules recursively
    Model-->>BaseTuner: Find matching targets
    BaseTuner->>BaseTuner: Create adapter layer
    BaseTuner->>Model: _replace_module(parent, name, new_module)
    Note over Model: Target module replaced with adapter
```

Source: [src/peft/tuners/tuners_utils.py:1-200]()

### 3. Configuration System

The configuration system uses a factory pattern to map PEFT types to their corresponding configuration classes.

**Location**: `src/peft/mapping.py`

#### Configuration Mapping Table

| PEFT Type | Config Class | Tuner Layer Class |
|-----------|--------------|-------------------|
| `LORA` | `LoraConfig` | `LoraLayer` |
| `IA3` | `IA3Config` | `IA3Layer` |
| `ADALORA` | `AdaLoraConfig` | `AdaLoraLayer` |
| `ADAPTER` | `AdapterConfig` | `AdapterLayer` |
| `PREFIX_TUNING` | `PrefixTuningConfig` | `PrefixTuningLayer` |
| `P_TUNING` | `PromptEncoderConfig` | `PromptEncoder` |
| `LOHA` | `LoHaConfig` | `LoHaLayer` |
| `OFT` | `OFTConfig` | `OFTLayer` |
| `XLORA` | `XLoraConfig` | `XLoraLayer` |
| `HIRA` | `HiraConfig` | `HiraLayer` |
| `SHIRA` | `ShiraConfig` | `ShiraLayer` |
| `GRALORA` | `GraloraConfig` | `GraloraLayer` |
| `DELORA` | `DeloraConfig` | `DeloraLayer` |
| `MISS` | `MissConfig` | `MissLayer` |
| `ADAMSS` | `AdamssConfig` | `AdamssLayer` |
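
A minimal sketch of the factory lookup, assuming the mapping dictionary exported from `src/peft/mapping.py`:

```python
from peft import PeftType
from peft.mapping import PEFT_TYPE_TO_CONFIG_MAPPING

config_cls = PEFT_TYPE_TO_CONFIG_MAPPING[PeftType.LORA]  # -> LoraConfig
config = config_cls(r=8, target_modules=["q_proj", "v_proj"])
```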

#### Auto Configuration Loading

```python
def check_if_peft_model(model_name_or_path: str) -> bool:
    """Check if the model is a PEFT model."""
```

Source: [src/peft/mapping.py:1-100]()
Source: [src/peft/auto.py:1-50]()

## Task-Specific Model Classes

PEFT provides specialized model classes optimized for different transformer tasks.

### PeftModelForSeq2SeqLM

For sequence-to-sequence tasks (translation, summarization).

```python
class PeftModelForSeq2SeqLM(PeftModel):
    def __init__(self, model, peft_config, adapter_name="default", **kwargs):
        super().__init__(model, peft_config, adapter_name, **kwargs)
        self.base_model_prepare_inputs_for_generation = self.base_model.prepare_inputs_for_generation
        self.base_model_prepare_encoder_decoder_kwargs_for_generation = (
            self.base_model._prepare_encoder_decoder_kwargs_for_generation
        )
```

**Features**:
- Customizes `prepare_inputs_for_generation` for decoder input preparation
- Handles encoder-decoder kwargs for generation

Source: [src/peft/peft_model.py:200-400]()

### PeftModelForSequenceClassification

For text classification tasks.

```python
class PeftModelForSequenceClassification(PeftModel):
    def __init__(self, model, peft_config, adapter_name="default", **kwargs):
        super().__init__(model, peft_config, adapter_name, **kwargs)
        classifier_module_names = ["classifier", "score"]
```

**Target Modules**: `["classifier", "score"]` 资料来源：[src/peft/peft_model.py:100-200]()

### PeftModelForQuestionAnswering

For QA tasks.

```python
class PeftModelForQuestionAnswering(PeftModel):
    def __init__(self, model, peft_config, adapter_name="default", **kwargs):
        super().__init__(model, peft_config, adapter_name, **kwargs)
        qa_module_names = ["qa_outputs"]
```

**Target Modules**: `["qa_outputs"]` 资料来源：[src/peft/peft_model.py:250-350]()

### PeftModelForTokenClassification

For named entity recognition and token-level tasks.

```python
class PeftModelForTokenClassification(PeftModel):
    def __init__(self, model, peft_config=None, adapter_name="default", **kwargs):
        super().__init__(model, peft_config, adapter_name, **kwargs)
        classifier_module_names = ["classifier", "score"]
```

Source: [src/peft/peft_model.py:300-400]()

## Tuner Implementations

### Common Tuner Structure

All tuners follow a consistent pattern:

```python
class SomeTuner(BaseTuner):
    prefix: str = "tuner_"
    tuner_layer_cls = SomeLayerClass
    target_module_mapping = TRANSFORMERS_MODELS_TO_SOME_TARGET_MODULES_MAPPING
    
    def _create_and_replace(self, config, adapter_name, target, target_name, parent, current_key, **kwargs):
        # Implementation
```

### Target Module Mapping

Each tuner defines which modules can be targeted for adaptation based on the model architecture.

```python
TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING = {
    "t5": ["q", "v"],
    "llama": ["q_proj", "v_proj"],
    "bert": ["query", "value"],
    # ... more mappings
}
```

### Example: SHiRA Tuner

```python
class ShiraModel(BaseTuner):
    prefix: str = "shira_"
    tuner_layer_cls = ShiraLayer
    target_module_mapping = TRANSFORMERS_MODELS_TO_SHIRA_TARGET_MODULES_MAPPING
```

**Key Features**:
- Supports random mask generation with `mask_type == "random"` and configurable `random_seed`
- Wraps `Linear` layers with SHiRA adapter logic

Source: [src/peft/tuners/shira/model.py:1-80]()

### Example: GraLoRA Tuner

```python
class GraloraModel(BaseTuner):
    prefix: str = "gralora_"
    tuner_layer_cls = GraloraLayer
    target_module_mapping = TRANSFORMERS_MODELS_TO_GRALORA_TARGET_MODULES_MAPPING
```

Source: [src/peft/tuners/gralora/model.py:1-80]()

### Example: X-LoRA Tuner

X-LoRA supports multiple adapter loading with device placement:

```python
def __init__(
    self,
    model: nn.Module,
    config: Union[dict[str, XLoraConfig], XLoraConfig],
    adapter_name: str,
    torch_device: Optional[str] = None,
    ephemeral_gpu_offload: bool = False,
    autocast_adapter_dtype: bool = True,
    **kwargs,
)
```

Source: [src/peft/tuners/xlora/model.py:1-100]()

## Model Loading and Serialization

### From Pretrained

```python
@classmethod
def from_pretrained(
    cls,
    model: torch.nn.Module,
    model_id: str,
    adapter_name: str = "default",
    is_trainable: bool = False,
    config: Optional[PeftConfig] = None,
    autocast_adapter_dtype: bool = True,
    **kwargs
) -> PeftModel:
```

**Parameters**:
| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | `torch.nn.Module` | The base model to adapt |
| `model_id` | `str` | Path or HuggingFace Hub identifier |
| `adapter_name` | `str` | Adapter name (default: "default") |
| `is_trainable` | `bool` | Whether adapter is trainable |
| `config` | `PeftConfig` | Pre-loaded configuration |
| `autocast_adapter_dtype` | `bool` | Auto-cast adapter dtype |

Source: [src/peft/peft_model.py:400-600]()

### Hotswap Adapter

For runtime adapter replacement without full model reload:

```python
def hotswap_adapter(
    model,
    model_name_or_path,
    adapter_name="default",
    torch_device=None,
    **kwargs
):
```

Source: [src/peft/utils/hotswap.py:1-100]()

## Helper Utilities

### Signature Updates

For model compatibility, PEFT provides utilities to update method signatures:

```python
def update_forward_signature(model: PeftModel) -> None:
    """Updates forward signature to include parent's signature."""

def update_generate_signature(model: PeftModel) -> None:
    """Updates generate signature to include parent's signature."""

def update_signature(model: PeftModel, method: str = "all") -> None:
    """Updates forward and/or generate signature."""
```

**Logic**: The signature is updated only when the current one consists solely of `*args` and `**kwargs`:

```python
current_signature = inspect.signature(model.forward)
if (
    len(current_signature.parameters) == 2
    and "args" in current_signature.parameters
    and "kwargs" in current_signature.parameters
):
    # Update with parent's signature
```

Source: [src/peft/helpers.py:1-150]()

### Adapter Scale Rescaling

Context manager for temporary adapter scaling:

```python
@contextmanager
def rescale_adapter_scale(model, multiplier):
    """Context manager to temporarily rescale adapter scaling."""
```
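
Example usage (the multiplier value here is arbitrary):

```python
# Temporarily halve the effective LoRA scaling for one forward pass
with rescale_adapter_scale(model, multiplier=0.5):
    outputs = model(**inputs)
# Outside the context, the original scaling is restored
```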

## Data Flow Diagram

```mermaid
graph LR
    subgraph Input
        InputIDs[input_ids]
        Attention[attention_mask]
        Embeds[inputs_embeds]
    end
    
    subgraph Processing
        PEFTConfig[PeftConfig]
        BaseModel[Base Model]
        Adapters[Adapter Layers]
    end
    
    subgraph Output
        OutputLogits[Output Logits]
        HiddenStates[Hidden States]
        AttentionWeights[Attention Weights]
    end
    
    InputIDs --> BaseModel
    Attention --> BaseModel
    Embeds --> BaseModel
    PEFTConfig --> Adapters
    BaseModel <--> Adapters
    Adapters --> OutputLogits
    Adapters --> HiddenStates
    Adapters --> AttentionWeights
```

## Configuration Classes

Each tuner type has a corresponding configuration class:

| Tuner | Config Class | Key Parameters |
|-------|--------------|----------------|
| LoRA | `LoraConfig` | `r`, `lora_alpha`, `lora_dropout`, `target_modules` |
| IA³ | `IA3Config` | `target_modules`, `feedforward_modules` |
| Prefix Tuning | `PrefixTuningConfig` | `num_virtual_tokens`, `num_transformer_submodules` |
| Prompt Learning | `PromptEncoderConfig` | `num_virtual_tokens`, `encoder_hidden_size` |
| SHiRA | `ShiraConfig` | `r`, `mask_type`, `random_seed` |
| GraLoRA | `GraloraConfig` | `r` |
| X-LoRA | `XLoraConfig` | Multiple adapter configs |
| Hira | `HiraConfig` | `r`, `hira_dropout` |
| DeLoRA | `DeloraConfig` | `rank_pattern`, `lambda_pattern` |
| MiSS | `MissConfig` | `r`, `target_modules`, `init_weights` |
| Adamss | `AdamssConfig` | `r`, `num_subspaces`, `target_modules` |

## Multiple Adapter Support

PEFT supports loading and managing multiple adapters simultaneously:

```mermaid
graph TD
    BaseModel[Base Model] --> Adapter1[Adapter 1: default]
    BaseModel --> Adapter2[Adapter 2: adapter_v2]
    BaseModel --> AdapterN[Adapter N: custom_name]
    
    ActiveAdapter[Active Adapter] --> Selection[Selection]
    Selection --> Adapter1
    Selection --> Adapter2
    Selection --> AdapterN
```

**Key Operations** (see the sketch below):
- Add adapters via `inject_adapter()` with unique names
- Activate specific adapter via `set_adapter()`
- Merge single or multiple adapters via `merge_and_unload(adapter_names=[...])`
- Hotswap adapters at runtime via `hotswap_adapter()`
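
A short sketch of these operations in user code (the repo ID and adapter names are placeholders):

```python
# Load a second adapter next to the existing one, then switch between them
peft_model.load_adapter("user/other-lora-adapter", adapter_name="adapter_v2")
peft_model.set_adapter("adapter_v2")

# Merge only the chosen adapter back into the base weights
merged = peft_model.merge_and_unload(adapter_names=["adapter_v2"])
```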

## Class Inheritance Diagram

```mermaid
classDiagram
    class PeftModel {
        +model
        +peft_config
        +active_adapters
        +inject_adapter()
        +merge_and_unload()
        +unload()
        +get_prompt()
    }
    
    class PeftModelForCausalLM {
        +forward()
    }
    
    class PeftModelForSeq2SeqLM {
        +forward()
        +prepare_inputs_for_generation()
    }
    
    class PeftModelForSequenceClassification {
        +forward()
    }
    
    class PeftModelForQuestionAnswering {
        +forward()
    }
    
    class PeftModelForTokenClassification {
        +forward()
    }
    
    class PeftModelForFeatureExtraction {
        +forward()
    }
    
    PeftModel <|-- PeftModelForCausalLM
    PeftModel <|-- PeftModelForSeq2SeqLM
    PeftModel <|-- PeftModelForSequenceClassification
    PeftModel <|-- PeftModelForQuestionAnswering
    PeftModel <|-- PeftModelForTokenClassification
    PeftModel <|-- PeftModelForFeatureExtraction
```

## Summary

The PEFT system architecture provides a flexible, extensible framework for parameter-efficient fine-tuning through:

1. **Centralized Model Management**: `PeftModel` base class handles adapter lifecycle
2. **Modular Tuner System**: Each technique (LoRA, IA³, etc.) implements the `BaseTuner` interface
3. **Configuration-Driven Design**: Factory pattern maps PEFT types to configs
4. **Task-Specific Optimizations**: Specialized model classes for different downstream tasks
5. **Multi-Adapter Support**: Runtime switching and hotswapping of adapters
6. **Seamless Integration**: Auto-loading and signature updates for transformer compatibility

This architecture enables researchers and practitioners to easily extend PEFT with new fine-tuning methods while maintaining backward compatibility and performance optimizations.

---

<a id='page-core-components'></a>

## Core Components

### Related Pages

Related topics: [System Architecture](#page-architecture), [Configuration System](#page-configuration), [Model Loading and Saving](#page-model-loading)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/peft_model.py](https://github.com/huggingface/peft/blob/main/src/peft/peft_model.py)
- [src/peft/config.py](https://github.com/huggingface/peft/blob/main/src/peft/config.py)
- [src/peft/mapping.py](https://github.com/huggingface/peft/blob/main/src/peft/mapping.py)
- [src/peft/mapping_func.py](https://github.com/huggingface/peft/blob/main/src/peft/mapping_func.py)
- [src/peft/helpers.py](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)
- [src/peft/tuners/tuners_utils.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/tuners_utils.py)
</details>

# Core Components

## Overview

The PEFT (Parameter-Efficient Fine-Tuning) library provides a modular architecture for adapting pre-trained models with minimal computational overhead. The Core Components form the foundational layer that enables all PEFT methods—including LoRA, IA³, Prefix Tuning, and custom tuners—to inject trainable parameters into base models efficiently.

The core architecture consists of:

- **PeftModel**: The primary wrapper class that encapsulates base models with adapter layers
- **PeftConfig**: Configuration objects that define adapter-specific parameters
- **BaseTunerLayer**: Base class for all adapter layer implementations
- **inject_adapter**: Core mechanism for attaching adapters to target modules
- **Mapping System**: Registry connecting PEFT types to their implementations

Source: [src/peft/peft_model.py:1-50]()

## Architecture Overview

```mermaid
graph TD
    A[Pre-trained Model] --> B[PeftModel]
    B --> C{PEFT Type}
    C -->|LORA| D[LoRA Layers]
    C -->|IA3| E[IA³ Layers]
    C -->|PREFIX_TUNING| F[Prefix Layers]
    C -->|CUSTOM| G[Custom Tuners]
    
    H[PeftConfig] --> B
    I[Adapter Registry] --> B
    
    J[Target Modules] --> K[inject_adapter]
    K --> B
    
    L[from_pretrained] --> B
    M[get_peft_model] --> B
```

## PeftModel Base Class

The `PeftModel` class serves as the central abstraction for all PEFT-adapted models. It wraps a base model and manages one or more adapters, each containing trainable parameters.

### Key Responsibilities

| Responsibility | Description |
|----------------|-------------|
| Adapter Management | Load, activate, and switch between multiple adapters |
| Forward Pass | Intercept forward calls to route through active adapters |
| Parameter Tracking | Report trainable vs. total parameter counts |
| Serialization | Save and load adapter weights and configurations |

### Task-Specific Model Classes

PEFT provides specialized model classes for different transformer tasks:

| Model Class | Task Type | Use Case |
|-------------|-----------|----------|
| `PeftModel` | Generic | Base wrapper for any model |
| `PeftModelForSequenceClassification` | SEQ_CLS | Text classification |
| `PeftModelForTokenClassification` | TOKEN_CLS | Named entity recognition |
| `PeftModelForQuestionAnswering` | QUESTION_ANS | Extractive QA |
| `PeftModelForSeq2SeqLM` | SEQ_2_SEQ_LM | Translation, summarization |
| `PeftModelForCausalLM` | CAUSAL_LM | Text generation |
| `PeftModelForFeatureExtraction` | FEATURE_EXTRACTION | Embedding extraction |

Source: [src/peft/peft_model.py:50-150]()

### Key Methods

```python
def from_pretrained(
    model: torch.nn.Module,
    model_id: str | os.PathLike,
    adapter_name: str = "default",
    is_trainable: bool = False,
    config: PeftConfig = None,
    autocast_adapter_dtype: bool = True,
    **kwargs
) -> PeftModel
```

This factory method instantiates a PEFT model from a pretrained configuration and optionally loads adapter weights.

Source: [src/peft/peft_model.py:150-200]()

## PeftConfig System

The `PeftConfig` class hierarchy defines adapter-specific hyperparameters. Each PEFT method has its own configuration class that inherits from the base `PeftConfig`.

### Configuration Class Hierarchy

```mermaid
graph TD
    A[PeftConfig] --> B[LoraConfig]
    A --> C[PromptLearningConfig]
    C --> D[PrefixTuningConfig]
    C --> E[PromptEncoderConfig]
    A --> F[IA3Config]
    A --> G[LoHaConfig]
    A --> H[OFTConfig]
    A --> I[TinyLoRAConfig]
    A --> J[AdamssConfig]
```

### Common Configuration Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `peft_type` | `PeftType` | Required | The PEFT method being used |
| `task_type` | `TaskType` | Required | The downstream task type |
| `inference_mode` | `bool` | `False` | Whether model is in inference mode |
| `target_modules` | `List[str]` | `None` | Module names to apply adapters to |
| `r` | `int` | `8` | LoRA rank dimension |
| `lora_alpha` | `int` | `8` | LoRA scaling factor |
| `lora_dropout` | `float` | `0.0` | Dropout probability for LoRA layers |

Source: [src/peft/config.py](), [src/peft/mapping.py]()

## Tuner Layer Base Classes

### BaseTunerLayer

The `BaseTunerLayer` class provides the interface that all adapter layer implementations must follow. It defines methods for layer initialization, adapter updating, and merging.

```mermaid
classDiagram
    class BaseTunerLayer {
        +base_layer: nn.Module
        +active_adapters: List[str]
        +merged_adapters: List[str]
        +update_layer(adapter_name, ...)
        +merge()
        +unmerge()
    }
```

### Key Methods

| Method | Description |
|--------|-------------|
| `update_layer(adapter_name, **kwargs)` | Initialize or update adapter weights |
| `merge()` | Merge adapter weights into base layer |
| `unmerge()` | Restore original base layer weights |
| `scale_layer(scale)` | Apply scaling factor to adapter output |

Source: [src/peft/tuners/tuners_utils.py:100-150]()
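
In user code, these layer-level operations are usually driven through the model-level helpers, as in this sketch:

```python
# Fold adapter weights into the base weights in place (reversible)
peft_model.merge_adapter()
outputs = peft_model(**inputs)  # runs without the extra adapter matmuls
peft_model.unmerge_adapter()    # restore the original base weights
```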

### Method-Specific Tuner Layers

Each PEFT method implements its own tuner layer class:

| Tuner | Layer Class | Key Parameters |
|-------|-------------|----------------|
| LoRA | `LoraLayer` | `r`, `lora_alpha`, `lora_dropout`, `lora_A`, `lora_B` |
| IA³ | `IA3Layer` | `ia3_l`, `is_feedforward` |
| OFT | `OFTLayer` | `oft_r`, `oft_diag_blocks` |
| SHiRA | `ShiraLayer` | `mask_fn`, `random_seed` |
| Gralora | `GraloraLayer` | `r` (adapter rank) |

Source: [src/peft/tuners/ia3/model.py](), [src/peft/tuners/oft/model.py](), [src/peft/tuners/shira/model.py](), [src/peft/tuners/gralora/model.py]()

## Adapter Injection Mechanism

The `inject_adapter` method is the core mechanism that replaces target modules with adapter layers. This process traverses the model and substitutes compatible modules.

```mermaid
graph TD
    A[inject_adapter called] --> B{module.is_target_module?}
    B -->|Yes| C{Create New Module?}
    C -->|New adapter| D[_create_new_module]
    C -->|Existing adapter| E[update_layer]
    D --> F[_replace_module]
    E --> G[Set requires_grad False]
    F --> H[Module replaced]
    B -->|No| I[Skip module]
    G --> I
```

### Injection Flow

```python
def inject_adapter(
    model: nn.Module,
    adapter_name: str,
    autocast_adapter_dtype: bool = True,
    low_cpu_mem_usage: bool = False,
    state_dict: Optional[dict] = None,
) -> None
```

The method performs the following steps:

1. Identifies target modules based on `peft_config.target_modules`
2. For each target, either creates a new adapter module or updates an existing one
3. Replaces the original module in the parent model
4. Sets appropriate `requires_grad` flags based on `is_trainable`

Source: [src/peft/tuners/tuners_utils.py:150-250]()
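
The matching in step 1 accepts both an explicit list of module names and a regex string, as in this sketch (the module names assume a LLaMA-style architecture):

```python
from peft import LoraConfig

# Suffix matching against a list of module names
config_list = LoraConfig(target_modules=["q_proj", "v_proj"])

# Full regex matching against the dotted module path
config_regex = LoraConfig(target_modules=r".*\.(q_proj|v_proj)$")
```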

### _create_and_replace Pattern

Each tuner implements `_create_and_replace` to handle the specific module creation logic:

```python
def _create_and_replace(
    self,
    config,
    adapter_name,
    target,
    target_name,
    parent,
    current_key,
    **optional_kwargs,
) -> None
```

Source: [src/peft/tuners/shira/model.py:40-80](), [src/peft/tuners/gralora/model.py:40-70](), [src/peft/tuners/miss/model.py:30-70]()

## Mixed Model Support

The `PeftMixedModel` class extends `PeftModel` to support heterogeneous adapters—models with different PEFT methods simultaneously.

```mermaid
graph LR
    A[Base Model] --> B[PeftMixedModel]
    B --> C[LoRA Adapter]
    B --> D[IA³ Adapter]
    B --> E[Prefix Adapter]
```

### Loading Mixed Models

```python
@classmethod
def from_pretrained(
    cls,
    model: nn.Module,
    model_id: str | os.PathLike,
    adapter_name: str = "default",
    is_trainable: bool = False,
    config: PeftConfig = None,
    low_cpu_mem_usage: bool = False,
    **kwargs,
) -> PeftMixedModel
```

Source: [src/peft/mixed_model.py:50-100]()

## Helper Functions

The `helpers.py` module provides utility functions for working with PEFT models.

### Signature Update Functions

These functions update the forward and generate signatures of PEFT models to expose parameters from the underlying base model.

| Function | Purpose |
|----------|---------|
| `update_forward_signature(model)` | Update `model.forward` signature to include base model parameters |
| `update_generate_signature(model)` | Update `model.generate` signature to include base model parameters |
| `update_signature(model, method)` | Update both signatures or specify `'forward'`/`'generate'`/`'all'` |

```python
def update_forward_signature(model: PeftModel) -> None:
    """Update the forward signature to include base model parameters."""
    current_signature = inspect.signature(model.forward)
    if (
        len(current_signature.parameters) == 2
        and "args" in current_signature.parameters
        and "kwargs" in current_signature.parameters
    ):
        # Copy signature from base model
        ...
```

Source: [src/peft/helpers.py:50-100]()

### Model Validation

```python
def check_if_peft_model(model_name_or_path: str) -> bool:
    """
    Check if the model is a PEFT model.
    
    Returns:
        bool: True if the model is a PEFT model, False otherwise.
    """
```

This function attempts to load a `PeftConfig` from the given path and returns `True` if successful.

Source: [src/peft/helpers.py:100-130]()

### Adapter Rescaling Context Manager

```python
@contextmanager
def rescale_adapter_scale(model, multiplier):
    """Temporarily rescale the scaling of the LoRA adapter."""
```

This context manager temporarily rescales adapter weights during inference, useful for ablation studies.

Source: [src/peft/helpers.py:130-160]()

## Hotswap Adapter

The `hotswap_adapter` function enables runtime replacement of loaded adapters without reloading the entire model.

```mermaid
graph TD
    A[hotswap_adapter called] --> B[Load new config]
    B --> C[Validate PEFT type]
    C --> D[Load state dict]
    D --> E[Transfer to device]
    E --> F[Replace adapter weights]
    F --> G[Success]
```

```python
def hotswap_adapter(
    model: PeftModel,
    model_name_or_path: str,
    adapter_name: str = "default",
    torch_device: str = None,
    **kwargs,
) -> None
```

Source: [src/peft/utils/hotswap.py:30-80]()

## Unload and Merge Operations

Base tuners provide methods to unload or merge adapter weights.

### merge_and_unload

```python
def merge_and_unload(progressbar: bool = False, safe_merge: bool = False, adapter_names=None) -> nn.Module
```

Merges adapter weights into the base model and returns the resulting model with adapter modules removed.

### unload

```python
def unload() -> nn.Module
```

Returns the base model by removing all PEFT modules without merging weights. This is useful when you need the original model but want to preserve the option to reload adapters later.

### _unload_and_optionally_merge

```python
def _unload_and_optionally_merge(
    progressbar: bool = False,
    safe_merge: bool = False,
    adapter_names = None,
    merge: bool = True,
) -> nn.Module
```

Source: [src/peft/tuners/tuners_utils.py:80-120]()

## Target Module Mapping

Each tuner defines a `target_module_mapping` that specifies which modules should be replaced for different model architectures.

```python
# Example: SHiRA target module mapping
target_module_mapping = TRANSFORMERS_MODELS_TO_SHIRA_TARGET_MODULES_MAPPING

# Example: GraLoRA target module mapping
target_module_mapping = TRANSFORMERS_MODELS_TO_GRALORA_TARGET_MODULES_MAPPING
```

These mappings allow PEFT methods to automatically identify compatible layers (e.g., `q_proj`, `v_proj`, `k_proj`) across different transformer architectures.
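
Because of this mapping, `target_modules` can often be omitted entirely. A sketch, assuming a base model whose architecture (e.g. `llama`) appears in the mapping:

```python
from peft import LoraConfig, get_peft_model

# No target_modules given: PEFT falls back to the built-in mapping
# for the detected architecture (e.g. ["q_proj", "v_proj"] for llama)
config = LoraConfig(task_type="CAUSAL_LM")
peft_model = get_peft_model(base_model, config)
```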

## BitsAndBytes Integration

PEFT supports quantized models through BitsAndBytes integration. The tuners detect quantized base layers and wrap them appropriately:

```python
if loaded_in_8bit and isinstance(target_base_layer, bnb.nn.Linear8bitLt):
    eightbit_kwargs = kwargs.copy()
    eightbit_kwargs.update({
        "has_fp16_weights": target_base_layer.state.has_fp16_weights,
        "threshold": target_base_layer.state.threshold,
        "index": target_base_layer.index,
    })
    new_module = Linear8bitLt(...)
```

Source: [src/peft/tuners/ia3/model.py:40-70]()

## Summary

The Core Components of PEFT provide a flexible, extensible architecture for parameter-efficient fine-tuning:

1. **PeftModel** wraps base models and manages adapters with a unified interface
2. **PeftConfig** classes define method-specific hyperparameters
3. **BaseTunerLayer** establishes the contract for all adapter implementations
4. **inject_adapter** replaces target modules with adapter layers
5. **Helper functions** provide utilities for signature updates, validation, and runtime operations
6. **Hotswap support** enables dynamic adapter replacement

This architecture allows developers to implement new PEFT methods by subclassing existing base classes while reusing the core model management infrastructure.

---

<a id='page-lora-methods'></a>

## LoRA and LoRA Variants

### Related Pages

Related topics: [Other PEFT Methods](#page-other-methods), [Quantization Integration](#page-quantization), [Configuration System](#page-configuration)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/tuners/lora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py)
- [src/peft/tuners/lora/config.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/config.py)
- [src/peft/tuners/lora/layer.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/layer.py)
- [src/peft/tuners/lora/dora.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/dora.py)
- [src/peft/tuners/adalora/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/adalora/__init__.py)
- [src/peft/tuners/lokr/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lokr/__init__.py)
- [src/peft/tuners/loha/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/loha/__init__.py)
- [docs/source/conceptual_guides/adapter.md](https://github.com/huggingface/peft/blob/main/docs/source/conceptual_guides/adapter.md)
- [docs/source/package_reference/lora.md](https://github.com/huggingface/peft/blob/main/docs/source/package_reference/lora.md)
</details>

# LoRA and LoRA Variants

## Overview

LoRA (Low-Rank Adaptation) is a parameter-efficient fine-tuning technique that reduces trainable parameters by representing weight updates as low-rank decompositions. The PEFT library implements LoRA and numerous variants that extend this foundational approach with different architectural innovations, training strategies, and optimization techniques.

The LoRA system in PEFT serves as both a standalone fine-tuning method and a framework upon which variants like DoRA, AdaLoRA, LoHa, LoKr, and others are built. These variants share a common plugin architecture but differ in how they decompose and apply trainable adapters to base model layers.

## Architecture

### Core LoRA Architecture

LoRA modifies pre-trained neural network layers by adding trainable low-rank decomposition matrices alongside frozen original weights. For a linear layer with weight matrix $W \in \mathbb{R}^{d \times k}$, LoRA represents the update as:

$$\Delta W = BA$$

where $B \in \mathbb{R}^{d \times r}$ and $A \in \mathbb{R}^{r \times k}$ with rank $r \ll \min(d, k)$.
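
A worked example of the resulting parameter savings for a single 4096×4096 projection at rank $r = 8$ (a standalone sketch, not PEFT source code):

```python
import torch

d, k, r = 4096, 4096, 8
B = torch.zeros(d, r)         # B starts at zero, so the initial update BA is zero
A = torch.randn(r, k) * 0.01  # A gets a small random init
delta_W = B @ A               # shape (d, k), but rank at most r

full_params = d * k           # 16,777,216 trainable params in full fine-tuning
lora_params = d * r + r * k   # 65,536 trainable params, ~0.4% of the above
```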

```mermaid
graph TD
    A[Base Model Layer: Weight W] --> B[Original Forward Pass<br/>y = Wx]
    C[LoRA Adapter: BA Decomposition] --> D[Modified Forward Pass<br/>y = Wx + BAx]
    B --> D
    A --> C
    E[Input x] --> A
    E --> B
    E --> C
```

### LoRA Module Hierarchy

```mermaid
graph TD
    A[PeftModel] --> B[BaseModel Class]
    A --> C[LoraModel / VariantModel]
    C --> D[TunerLayerCls]
    C --> E[target_module_mapping]
    C --> F[prefix attribute]
    D --> G[LoraLayer / Conv2d / Conv1d]
    G --> H[Linear wrapper]
    H --> I[Forward with BA decomposition]
```

Sources: [src/peft/tuners/lora/model.py:1-100]()

## LoRA Implementation

### Model Class

The `LoraModel` class serves as the base implementation for LoRA adapters. It extends the generic tuner base class and implements the core adapter creation logic.

```python
class LoraModel(BaseTuner):
    prefix: str = "lora_"
    tuner_layer_cls = LoraLayer
    target_module_mapping = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING
```

Sources: [src/peft/tuners/lora/model.py:90-95]()

### Layer Replacement Mechanism

The `_create_and_replace` method handles the injection of LoRA adapters into target modules:

```python
def _create_and_replace(
    self,
    lora_config,
    adapter_name,
    target,
    target_name,
    parent,
    current_key,
    *,
    parameter_name: Optional[str] = None,
) -> None:
```

Sources: [src/peft/tuners/lora/model.py:105-120]()

### Forward Pass Computation

The LoRA forward pass combines the frozen base weights with trainable low-rank matrices:

```python
def forward(self, x: torch.Tensor) -> torch.Tensor:
    # Output of the frozen base layer.
    result = self.base_layer(x)

    # Add each active adapter's low-rank update:
    # scaling * B(A(dropout(x))), with scaling = lora_alpha / r.
    for adapter in self.active_adapters:
        if adapter not in self.lora_A:
            continue
        lora_A = self.lora_A[adapter]
        lora_B = self.lora_B[adapter]
        dropout = self.lora_dropout[adapter]
        scaling = self.scaling[adapter]
        result = result + lora_B(lora_A(dropout(x))) * scaling

    return result
```

(Simplified from the actual implementation in [src/peft/tuners/lora/layer.py](), which additionally handles dtype casting, merged weights, and variant hooks.)

## LoRA Configuration

### LoraConfig Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `r` | int | 8 | Rank of decomposition |
| `lora_alpha` | int | 8 | Scaling factor (often set to 2×r) |
| `lora_dropout` | float | 0.0 | Dropout probability for LoRA layers |
| `target_modules` | Optional[List[str]] | None | Module names to apply LoRA |
| `bias` | str | "none" | Bias training mode: "none", "all", "lora_only" |
| `fan_in_fan_out` | bool | False | Transpose weights for certain architectures |
| `init_weights` | bool | True | Initialize LoRA weights on creation |

### Advanced Configuration Options

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `target_modules_bd_a` | Optional[List[str]] | None | Modules for block-diagonal LoRA-A |
| `target_modules_bd_b` | Optional[List[str]] | None | Modules for block-diagonal LoRA-B |
| `nblocks` | int | 1 | Number of blocks in block-diagonal matrices |
| `match_strict` | bool | True | Require strict matching for all target modules |

Sources: [src/peft/tuners/lora/config.py:1-200]()

## LoRA Variants

### DoRA (Weight-Decomposed LoRA)

DoRA extends standard LoRA by decomposing weights into magnitude and direction components. This variant often achieves better performance with comparable parameter counts.

```python
# DoRA configuration example
lora_config = LoraConfig(
    use_dora=True,
    r=32,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"]
)
```

Sources: [examples/dora_finetuning/README.md]()

### AdaLoRA (Adaptive LoRA)

AdaLoRA dynamically adjusts the rank of LoRA blocks during training, allocating more parameters to important layers. This adaptive approach optimizes the parameter budget.

A configuration sketch (field names follow `AdaLoraConfig`; exact defaults can differ across versions):

```python
from peft import AdaLoraConfig, get_peft_model

config = AdaLoraConfig(
    init_r=12,         # initial rank before pruning begins
    target_r=8,        # average rank targeted by the budget allocator
    tinit=200,         # warmup steps before rank pruning starts
    tfinal=1000,       # steps over which the rank budget is annealed
    total_step=10000,  # total number of training steps
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(model, config)
```

Sources: [src/peft/tuners/adalora/config.py]()

### LoHa (Low-Rank Hadamard Product)

LoHa replaces the standard AB decomposition with a Hadamard product of low-rank matrices, potentially capturing more expressive updates.

```python
config_te = LoHaConfig(
    r=8,
    lora_alpha=32,
    target_modules=["k_proj", "q_proj", "v_proj", "out_proj", "fc1", "fc2"],
    rank_dropout=0.0,
    module_dropout=0.0,
)
```

Sources: [src/peft/tuners/loha/__init__.py]()

### LoKr (Low-Rank Kronecker Product)

LoKr applies Kronecker product decomposition to weight matrices, offering different trade-offs between rank and expressiveness.

```python
config_unet = LoKrConfig(
    r=8,
    lora_alpha=32,
    target_modules=["proj_in", "proj_out", "to_k", "to_q", "to_v"],
    rank_dropout=0.0,
    module_dropout=0.0,
    use_effective_conv2d=True,
)
```

Sources: [src/peft/tuners/lokr/__init__.py]()

### Block-Diagonal LoRA

Block-diagonal LoRA constrains the LoRA matrices to be block-diagonal, enabling efficient multi-adapter serving with different sharding degrees.

```python
config = LoraConfig(
    r=16,
    target_modules_bd_a=["q_proj", "v_proj"],  # Block-diagonal A
    target_modules_bd_b=["out_proj"],            # Block-diagonal B
    nblocks=4,                                    # Sharding degree
)
```

## Variant Comparison

| Variant | Key Innovation | Target Use Case | Complexity |
|---------|---------------|-----------------|------------|
| LoRA | Low-rank decomposition | General fine-tuning | Low |
| DoRA | Magnitude + direction decomposition | High-quality adaptation | Low |
| AdaLoRA | Adaptive rank allocation | Resource-constrained tuning | Medium |
| LoHa | Hadamard product decomposition | Image generation | Medium |
| LoKr | Kronecker product decomposition | Diffusion models | Medium |
| Block-Diagonal | Constrained structure | Multi-adapter serving | Medium |

## Usage Patterns

### Basic LoRA Setup

```python
from transformers import AutoModelForCausalLM
from peft import get_peft_model, LoraConfig

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b")

peft_config = LoraConfig(
    task_type="CAUSAL_LM",
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
)

peft_model = get_peft_model(model, peft_config)
peft_model.print_trainable_parameters()
```

### Multi-Adapter Configuration

```python
from peft import PeftModel

# Load the first adapter, then attach a second one by name
peft_model = PeftModel.from_pretrained(base_model, "./path/to/adapter_1", adapter_name="adapter_1")
peft_model.load_adapter("./path/to/adapter_2", adapter_name="adapter_2")

# Switch the active adapter at runtime
peft_model.set_adapter("adapter_2")
```

### Quantization with LoRA

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import get_peft_model, LoraConfig, prepare_model_for_kbit_training

quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b",
    quantization_config=quantization_config,
)

model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)
peft_model = get_peft_model(model, lora_config)
```

## Integration with PeftModel

All LoRA variants integrate with the base `PeftModel` architecture through the tuner pattern:

```mermaid
graph LR
    A[Base Transformers Model] --> B[PeftModel]
    B --> C[BaseModel Class]
    C --> D[LoraModel / VariantModel]
    D --> E[Adapter Injection]
    E --> F[Modified Forward]
```

The `PeftModel` class provides unified interfaces for:
- Forward pass handling
- Adapter switching
- Save/load operations
- Parameter printing

Sources: [src/peft/peft_model.py:1-100]()

## Design Patterns

### Tuner Layer Class Structure

Each LoRA variant implements a `tuner_layer_cls` attribute that defines the layer wrapper class:

```python
class LoraModel(BaseTuner):
    tuner_layer_cls = LoraLayer
    
class LoHaModel(BaseTuner):
    prefix: str = "hada_"
    tuner_layer_cls = LoHaLayer
    layers_mapping: dict[type[torch.nn.Module], type[LoHaLayer]] = {
        torch.nn.Conv2d: Conv2d,
        torch.nn.Conv1d: Conv1d,
        torch.nn.Linear: Linear,
    }
```

### Target Module Mapping

Variants define target module mappings for automatic module detection:

```python
class LoraModel(BaseTuner):
    target_module_mapping = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING

class ShiraModel(BaseTuner):
    prefix: str = "shira_"
    tuner_layer_cls = ShiraLayer
    target_module_mapping = TRANSFORMERS_MODELS_TO_SHIRA_TARGET_MODULES_MAPPING
```

Sources: [src/peft/tuners/shira/model.py:40-45]()

## Conclusion

LoRA and its variants in the PEFT library provide a comprehensive suite of parameter-efficient fine-tuning techniques. The shared plugin architecture enables consistent APIs across variants while allowing each method to implement its unique adaptation strategy. From basic low-rank decomposition to advanced block-diagonal structures, PEFT supports a wide range of fine-tuning scenarios with minimal computational overhead.

---

<a id='page-other-methods'></a>

## Other PEFT Methods

### Related Pages

Related topics: [LoRA and LoRA Variants](#page-lora-methods), [Configuration System](#page-configuration)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/tuners/prompt_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/prompt_tuning/__init__.py)
- [src/peft/tuners/prefix_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/prefix_tuning/__init__.py)
- [src/peft/tuners/p_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/p_tuning/__init__.py)
- [src/peft/tuners/ia3/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/ia3/__init__.py)
- [src/peft/tuners/oft/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/oft/__init__.py)
- [src/peft/tuners/fourierft/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/fourierft/__init__.py)
- [src/peft/tuners/multitask_prompt_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/multitask_prompt_tuning/__init__.py)
- [docs/source/conceptual_guides/prompting.md](https://github.com/huggingface/peft/blob/main/docs/source/conceptual_guides/prompting.md)
- [docs/source/conceptual_guides/ia3.md](https://github.com/huggingface/peft/blob/main/docs/source/conceptual_guides/ia3.md)
</details>

# Other PEFT Methods

PEFT (Parameter-Efficient Fine-Tuning) encompasses a diverse collection of techniques beyond LoRA and QLoRA. These methods offer alternative approaches to adapting pre-trained models with minimal parameter updates, each with distinct mechanisms, trade-offs, and optimal use cases. This page provides a comprehensive overview of the "Other PEFT Methods" available in the Hugging Face PEFT library.

## Overview of PEFT Method Categories

The PEFT library organizes fine-tuning methods into several categories based on their core adaptation mechanism. Understanding these categories helps practitioners select the appropriate method for their specific requirements.

```mermaid
graph TD
    A[PEFT Methods] --> B[Prompt-Based Methods]
    A --> C[Additive Methods]
    A --> D[Reparameterization Methods]
    A --> E[Multiplicative Methods]
    A --> F[Subspace Methods]
    
    B --> B1[Prompt Tuning]
    B --> B2[Prefix Tuning]
    B --> B3[P-Tuning]
    B --> B4[MultiTask Prompt Tuning]
    
    C --> C1[IA³]
    
    D --> D1[LoRA Variants<br/>AdaLoRA, GraLoRA, HiRA]
    
    E --> E1[OFT]
    
    F --> F1[FourierFT]
```

## Prompt-Based Methods

Prompt-based methods modify the model's input or activation space without changing the underlying model weights. These methods add trainable parameters as virtual tokens or prefix embeddings that guide the model's behavior.

### Prompt Tuning

Prompt Tuning introduces trainable "soft prompts" (embedding vectors) that are prepended to the input tokens. Unlike discrete text prompts, these are continuous vectors learned through backpropagation during fine-tuning.

**Key Characteristics:**
- Only the prompt embeddings are trainable
- No architectural changes to the base model
- Requires relatively few parameters compared to full fine-tuning
- Works well with larger models

**Configuration Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `num_virtual_tokens` | int | 20 | Number of virtual tokens in the prompt |
| `prompt_tuning_init` | str | "TEXT" | Initialization method for prompts |
| `prompt_tuning_init_text` | str | None | Text for TEXT initialization |
| `token_dim` | int | Model hidden dim | Dimension of model embeddings |
| `num_transformer_submodules` | int | 1 | Number of transformer layers with prompts |
| `num_attention_heads` | int | Model heads | Number of attention heads |
| `num_layers` | int | Model layers | Number of transformer layers |
| `encoder_hidden_size` | int | Same as token_dim | Hidden size for encoder |

Sources: [src/peft/tuners/prompt_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/prompt_tuning/__init__.py)

### Prefix Tuning

Prefix Tuning adds trainable parameters to the attention mechanism by prepending learnable prefix vectors to the keys and values in every attention layer. Unlike Prompt Tuning, this affects all transformer layers directly.

**Architecture:**

```mermaid
graph LR
    A[Input Tokens] --> B[Embedding Layer]
    B --> C[Prefix P<sub>k</sub>, P<sub>v</sub>]
    B --> D[Standard K, V]
    C --> E[Multi-Head Attention]
    D --> E
    E --> F[Output]
```

**Key Differences from Prompt Tuning:**
- Affects hidden states at every transformer layer, not just the input embeddings
- Uses more parameters than Prompt Tuning, but can steer deeper model behavior
- Offers an optional `prefix_projection` MLP for reparameterizing the prefix

Sources: [src/peft/tuners/prefix_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/prefix_tuning/__init__.py)
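
A minimal configuration sketch using the public `PrefixTuningConfig` (the base model choice is arbitrary):

```python
from transformers import AutoModelForCausalLM
from peft import PrefixTuningConfig, get_peft_model

config = PrefixTuningConfig(
    task_type="CAUSAL_LM",
    num_virtual_tokens=30,
    prefix_projection=False,  # set True to reparameterize the prefix with an MLP
)
model = AutoModelForCausalLM.from_pretrained("gpt2")
peft_model = get_peft_model(model, config)
```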

### P-Tuning

P-Tuning uses trainable continuous embeddings combined with a prompt encoder (typically an LSTM or MLP) to generate prompts. The encoder processes anchor tokens and produces virtual token embeddings.

**Unique Features:**
- Uses a small LSTM/MLP encoder to generate prompt embeddings
- Supports "anchor" tokens that provide natural language hints
- More flexible than pure continuous prompts

Sources: [src/peft/tuners/p_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/p_tuning/__init__.py)
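
A configuration sketch using the public `PromptEncoderConfig`:

```python
from peft import PromptEncoderConfig

config = PromptEncoderConfig(
    task_type="SEQ_CLS",
    num_virtual_tokens=20,
    encoder_reparameterization_type="MLP",  # or "LSTM"
    encoder_hidden_size=128,
)
```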

### MultiTask Prompt Tuning (MPT)

MultiTask Prompt Tuning extends standard prompt tuning by learning a shared prompt across multiple related tasks. This enables knowledge transfer and typically improves generalization.

**Use Cases:**
- Multi-task learning scenarios
- Domain adaptation with related tasks
- Few-shot learning with task similarity

Sources: [src/peft/tuners/multitask_prompt_tuning/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/multitask_prompt_tuning/__init__.py)

## (IA)³ - Infused Adapter by Inhibiting and Amplifying Inner Activations

(IA)³ is a multiplicative adapter method that scales activations by learned vectors. It introduces trainable vectors that multiply with hidden states at specific positions in the transformer architecture.

### Mechanism

```mermaid
graph TD
    A[Hidden Activation h] --> B[Learned Vector l<sub>i</sub>]
    B --> C[Element-wise Multiplication]
    A --> C
    C --> D[h<sub>modified</sub> = l<sub>i</sub> ⊙ h]
    D --> E[Feed-Forward<br/>or Attention]
```

### Configuration Options

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `target_modules` | list | None | Modules to apply (IA)³ scaling vectors to |
| `feedforward_modules` | list | None | Subset of `target_modules` treated as feed-forward (scaled on the input side) |
| `fan_in_fan_out` | bool | False | Transpose weights for Conv1D-style layers |
| `init_ia3_weights` | bool | True | Initialize scaling vectors to ones (adapter starts as a no-op) |

### Supported Target Modules

The IA³ method typically targets attention-related and feed-forward layers:

- `q_proj`, `k_proj`, `v_proj`, `o_proj` (attention projections)
- `fc1`, `fc2` (feed-forward layers)
- `gate_proj`, `up_proj`, `down_proj` (for modern architectures like Llama)

Sources: [src/peft/tuners/ia3/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/ia3/__init__.py), [docs/source/conceptual_guides/ia3.md](https://github.com/huggingface/peft/blob/main/docs/source/conceptual_guides/ia3.md)
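
A configuration sketch; note that `feedforward_modules` must be a subset of `target_modules`:

```python
from peft import IA3Config, get_peft_model

config = IA3Config(
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "v_proj", "down_proj"],
    feedforward_modules=["down_proj"],  # scaling applied to the layer input
)
peft_model = get_peft_model(model, config)
```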

## OFT - Orthogonal Fine-Tuning

OFT constrains the fine-tuning updates to an orthogonal subspace, ensuring that the learned adapters do not interfere with each other. This method is particularly useful for multi-adapter scenarios.

### Key Principle

OFT learns an orthogonal rotation matrix $R$ that is applied multiplicatively to the pretrained weights:

$$W_{\text{new}} = R \cdot W_{\text{orig}}$$

Because $R$ is constrained to be orthogonal, the update preserves the pairwise angles between neuron weight vectors, which helps retain pretrained knowledge while adapting the model.

### Configuration Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `r` | int | 4 | Rank of the OFT transformation |
| `target_modules` | list | ["q_proj", "v_proj"] | Layers to adapt |
| `module_dropout` | float | 0.0 | Dropout probability for modules |
| `init_weights` | bool | True | Initialize the rotation as identity (adapter starts as a no-op) |

### Use Cases

- Stable diffusion model adaptation (text encoder, UNet)
- Multi-task learning with non-interfering adapters
- Computer vision models requiring structured updates

Sources: [src/peft/tuners/oft/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/oft/__init__.py)
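
A minimal configuration sketch using the public `OFTConfig` (available fields and defaults can vary across PEFT versions):

```python
from peft import OFTConfig

config = OFTConfig(
    r=8,                                  # number of OFT blocks per layer
    target_modules=["q_proj", "v_proj"],
    module_dropout=0.0,
)
```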

## FourierFT - Fourier Transform-Based Fine-Tuning

FourierFT operates in the frequency domain, learning adapters in Fourier space rather than the original weight space. This approach can capture different aspects of the model's behavior compared to spatial-domain methods.

### Advantages

- May capture global patterns more efficiently
- Different inductive bias compared to spatial methods
- Potential for more compact representations

Sources: [src/peft/tuners/fourierft/__init__.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/fourierft/__init__.py)

## Advanced LoRA Variants

### AdaLoRA - Adaptive LoRA

AdaLoRA dynamically adjusts the rank of LoRA adaptations based on the importance of different weight matrices. It uses a budget allocation mechanism to invest more parameters in important layers.

**Key Method: `update_and_allocate`**

```python
# Called during training loop
model.base_model.update_and_allocate(global_step)
```

This method updates importance scores and reallocates the rank budget based on the current training step.

Sources: [src/peft/tuners/adalora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/adalora/model.py)
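
A training-loop sketch showing where the call fits; `dataloader` and `optimizer` are assumed to exist, and the pattern follows the AdaLoRA examples:

```python
for step, batch in enumerate(dataloader):
    loss = peft_model(**batch).loss
    loss.backward()
    optimizer.step()
    # Re-estimate importance scores and reallocate the rank budget.
    peft_model.base_model.update_and_allocate(step)
    optimizer.zero_grad()
```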

### HiRA - Hadamard High-Rank Adaptation

HiRA applies the low-rank update through an element-wise (Hadamard) product with the frozen weights, $\Delta W = W_0 \odot (BA)$, which yields a high effective rank while keeping LoRA-level parameter counts.

Sources: [src/peft/tuners/hira/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/hira/model.py)

### GraLoRA - Granular Low-Rank Adaptation

GraLoRA partitions each weight matrix into blocks and assigns every block its own low-rank adapter pair, increasing expressivity and effective rank for a given parameter budget.

Sources: [src/peft/tuners/gralora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/gralora/model.py)

## Special-Purpose Methods

### SHiRA - Sparse High Rank Adapters

SHiRA directly trains a small sparse subset (on the order of 1-2%) of the base weights, producing a high-rank update with very low adapter overhead and fast adapter switching.

Sources: [src/peft/tuners/shira/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/shira/model.py)

### MiSS - Matrix Shard Sharing

MiSS decomposes weight updates into shared matrix shards, balancing adapter expressivity against parameter and memory efficiency.

Sources: [src/peft/tuners/miss/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/miss/model.py)

### Adamss - Adaptive Subspace Selection

Adamss uses adaptive subspace selection for fine-tuning, choosing the most relevant subspaces based on the task at hand.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `r` | int | 500 | Rank dimension |
| `num_subspaces` | int | 5 | Number of subspaces |
| `target_modules` | list | ["q_proj", "v_proj"] | Target layers |

Sources: [src/peft/tuners/adamss/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/adamss/model.py)

### X-LoRA

X-LoRA supports multiple LoRA adapters with dynamic routing, allowing for sophisticated multi-adapter architectures.

Sources: [src/peft/tuners/xlora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/xlora/model.py)

## Comparison of Methods

| Method | Category | Trainable Parameters | Best For | Supports Multi-Adapter |
|--------|----------|---------------------|----------|------------------------|
| Prompt Tuning | Prompt-Based | Very Low | Large models, text tasks | Yes |
| Prefix Tuning | Prompt-Based | Low | Text generation | Yes |
| P-Tuning | Prompt-Based | Low-Medium | NLU tasks | Yes |
| MPT | Prompt-Based | Medium | Multi-task learning | Yes |
| (IA)³ | Multiplicative | Low | Efficient scaling | Yes |
| OFT | Multiplicative | Low-Medium | Stable diffusion, CV | Yes |
| FourierFT | Frequency-Domain | Low | Global patterns | Yes |
| AdaLoRA | Reparameterization | Variable | Dynamic budgets | Yes |
| X-LoRA | Reparameterization | Medium-High | Complex routing | Yes |

## Unified API Usage

All PEFT methods follow a consistent API pattern through `get_peft_model`:

```python
from transformers import AutoModelForSequenceClassification
from peft import get_peft_model, PromptTuningConfig

config = PromptTuningConfig(
    task_type="SEQ_CLS",
    num_virtual_tokens=20,
    prompt_tuning_init="TEXT",
    prompt_tuning_init_text="Classify the sentiment:",
    tokenizer_name_or_path="bert-base-cased",  # required for TEXT initialization
)

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```

Sources: [docs/source/conceptual_guides/prompting.md](https://github.com/huggingface/peft/blob/main/docs/source/conceptual_guides/prompting.md)

## Best Practices

### Method Selection Guidelines

1. **For Large Language Models (>7B parameters):** Prompt Tuning, Prefix Tuning, or LoRA variants
2. **For Image Models:** OFT, (IA)³
3. **For Multi-Task Scenarios:** MultiTask Prompt Tuning, X-LoRA
4. **For Limited Compute:** (IA)³, standard Prompt Tuning
5. **For Maximum Flexibility:** AdaLoRA (dynamic rank allocation)

### Common Configuration Patterns

```python
from peft import LoraConfig, PromptTuningConfig

# Efficient configuration for most cases
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM"
)

# For prompt-based methods
config = PromptTuningConfig(
    num_virtual_tokens=50,
    task_type="SEQ_CLS"
)
```

## Summary

The PEFT library provides a comprehensive suite of fine-tuning methods beyond LoRA and QLoRA. These methods offer diverse trade-offs in terms of parameter efficiency, task performance, and computational requirements. By understanding the mechanisms and use cases of each method, practitioners can select the most appropriate technique for their specific model adaptation needs.

Key takeaways:
- **Prompt-based methods** modify input representations without changing model weights
- **Multiplicative methods** such as (IA)³ and OFT scale or rotate existing weights
- **Advanced LoRA variants** add dynamic rank allocation and routing capabilities
- **Most methods** support multi-adapter workflows through the unified PEFT API

---

<a id='page-configuration'></a>

## Configuration System

### Related Pages

Related topics: [Core Components](#page-core-components), [Model Loading and Saving](#page-model-loading), [LoRA and LoRA Variants](#page-lora-methods)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/config.py](https://github.com/huggingface/peft/blob/main/src/peft/config.py)
- [src/peft/utils/peft_types.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/peft_types.py)
- [src/peft/peft_model.py](https://github.com/huggingface/peft/blob/main/src/peft/peft_model.py)
- [src/peft/tuners/lora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py)
- [src/peft/tuners/tuners_utils.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/tuners_utils.py)
- [src/peft/helpers.py](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)
</details>

# Configuration System

## Overview

The PEFT (Parameter-Efficient Fine-Tuning) library implements a comprehensive configuration system that enables flexible and modular adapter integration across various transformer architectures. This system decouples adapter-specific parameters from model architecture, allowing users to define fine-tuning strategies through declarative configuration objects.

The configuration system serves as the foundational layer for all PEFT adapters, providing:
- Unified configuration interface across different fine-tuning methods
- Automatic model patching based on target module specifications
- Serialization and deserialization support for model saving/loading
- Multi-adapter management capabilities

```mermaid
graph TD
    A[User Configuration] --> B[PeftConfig Subclass]
    B --> C{Adapter Type}
    C -->|LoRA| D[LoraConfig]
    C -->|Prefix| E[PrefixTuningConfig]
    C -->|Prompt| F[PromptEncoderConfig]
    C -->|IA³| G[Ia3Config]
    C -->|Others| H[Tuner-Specific Config]
    
    D --> I[get_peft_model]
    E --> I
    F --> I
    G --> I
    H --> I
    
    I --> J[PeftModel Base]
    J --> K[BaseTuner.inject_adapter]
    K --> L[Model Patching]
```

## Core Components

### PeftConfig Base Class

The `PeftConfig` class is the foundational configuration object in PEFT. It is a dataclass built on `PeftConfigMixin` (which supplies Hub-compatible save/load behavior) and provides the base interface for all adapter configurations.

**Key Attributes:**

| Attribute | Type | Description |
|-----------|------|-------------|
| `peft_type` | `PeftType` | Enum specifying the adapter method |
| `task_type` | `TaskType` | Enum specifying the ML task type |
| `inference_mode` | `bool` | Whether model is in inference mode |
| `auto_mapping` | `Optional[dict]` | Custom auto-mapping for loading |
| `base_model_name_or_path` | `str` | Path/identifier of base model |
| `revision` | `str` | Model revision for Hub models |
| `pad_token_id` | `Optional[int]` | Padding token ID |

**Source:** `src/peft/config.py`

### PeftType Enumeration

The `PeftType` enum defines all supported parameter-efficient fine-tuning methods:

| Value | Description |
|-------|-------------|
| `LORA` | Low-Rank Adaptation |
| `PROMPT_TUNING` | Soft prompt tuning |
| `PREFIX_TUNING` | Prefix tuning |
| `P_TUNING` | P-tuning (prompt encoder) |
| `IA3` | Infused Adapter by Inhibiting and Amplifying Inner Activations |
| `ADALORA` | Adaptive LoRA (dynamic rank allocation) |
| `ADAPTION_PROMPT` | Adaption Prompt (LLaMA-Adapter) |
| `POLY` | Polytropon (multi-task adapter routing) |
| `LN_TUNING` | LayerNorm tuning |
| `HRA` | Householder Reflection Adaptation |
| `GRALORA` | Granular Low-Rank Adaptation |
| `SHIRA` | Sparse High Rank Adapters |
| `XLORA` | X-LoRA (mixture of LoRA experts) |
| `MISS` | Matrix Shard Sharing |
| `HIRA` | Hadamard High-Rank Adaptation |
| `ADAMSS` | Adaptive Subspace Selection |

**Source:** `src/peft/utils/peft_types.py:1-50`

### TaskType Enumeration

The `TaskType` enum specifies the machine learning task type:

| Value | Description |
|-------|-------------|
| `SEQ_CLS` | Sequence Classification |
| `SEQ_2_SEQ_LM` | Sequence-to-Sequence Language Modeling |
| `CAUSAL_LM` | Causal Language Modeling |
| `TOKEN_CLS` | Token Classification |
| `QUESTION_ANS` | Question Answering |
| `FEATURE_EXTRACTION` | Feature Extraction / Embeddings |

**Source:** `src/peft/utils/peft_types.py:50-80`
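
The enum members can be used in place of the raw strings shown elsewhere on this page:

```python
from peft import LoraConfig, TaskType

# Equivalent to task_type="SEQ_CLS"
config = LoraConfig(task_type=TaskType.SEQ_CLS, r=8, lora_alpha=16)
```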

## Tuner-Specific Configurations

### LoraConfig

The `LoraConfig` class configures LoRA (Low-Rank Adaptation) adapters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `r` | `int` | 8 | LoRA attention dimension (rank) |
| `target_modules` | `Optional[Union[List[str], str]]` | `None` | Modules to apply LoRA to |
| `lora_alpha` | `int` | 8 | LoRA alpha scaling parameter |
| `lora_dropout` | `float` | 0.0 | Dropout probability for LoRA layers |
| `fan_in_fan_out` | `bool` | `False` | Set to transpose weight (for conv layers) |
| `bias` | `str` | `"none"` | Bias type: `"none"`, `"all"`, `"lora_only"` |
| `modules_to_save` | `Optional[List[str]]` | `None` | Modules to make trainable |
| `init_lora_weights` | `Union[bool, str]` | `True` | Initialization strategy |

**Example Configuration:**
```python
from peft import get_peft_config

config = {
    "peft_type": "LORA",
    "task_type": "CAUSAL_LM",
    "r": 16,
    "target_modules": ["q_proj", "v_proj"],
    "lora_alpha": 32,
    "lora_dropout": 0.05,
}
peft_config = get_peft_config(config)
```

**Source:** `src/peft/tuners/lora/config.py`

### PrefixTuningConfig

Configuration for prefix-based prompt learning:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `num_virtual_tokens` | `int` | None | Number of virtual tokens |
| `token_dim` | `int` | None | Dimensionality of token embeddings |
| `num_transformer_submodules` | `int` | 1 | Number of transformer modules |
| `num_attention_heads` | `int` | 12 | Number of attention heads |
| `num_layers` | `int` | 12 | Number of layers |
| `encoder_hidden_size` | `int` | None | Encoder hidden size |
| `prefix_projection` | `bool` | `False` | Whether to project prefix |

**Source:** `src/peft/tuners/prefix_tuning/config.py`

## Configuration Loading and Saving

### Loading Configurations

The configuration system supports loading from both local paths and Hugging Face Hub:

```python
# From Hub
peft_config = PeftConfig.from_pretrained("user/peft-model")

# From dictionary
peft_config = get_peft_config(config_dict)

# Via mapping
config = PeftConfig.from_pretrained(
    model_name_or_path,
    **hf_kwargs
)
```

The `from_pretrained` method handles:
- Subfolder paths via `subfolder` parameter
- Model revisions via `revision` parameter  
- Authentication tokens via `token` or `use_auth_token` parameters

**Source:** `src/peft/config.py`, `src/peft/mixed_model.py`

### Saving Configurations

Configurations can be serialized using the standard Hugging Face `save_pretrained` method:

```python
peft_config.save_pretrained("output-directory")
```

### Auto-Mapping

The `auto_mapping` parameter enables custom configuration-to-model mappings, particularly useful for custom adapters or third-party integrations:

```python
peft_config = PeftConfig.from_pretrained(
    "model-id",
    auto_mapping={"custom_key": CustomAdapterClass}
)
```

## Adapter Injection Workflow

```mermaid
sequenceDiagram
    participant User
    participant PeftModel
    participant BaseTuner
    participant Config
    participant TargetModule
    
    User->>PeftModel: __init__(model, peft_config)
    PeftModel->>BaseTuner: inject_adapter(model, adapter_name)
    BaseTuner->>Config: Validate peft_config
    Config->>Config: Check target_module_compatibility
    
    loop For each target module
        BaseTuner->>TargetModule: Identify target layer
        BaseTuner->>BaseTuner: _create_and_replace(...)
        BaseTuner->>TargetModule: Replace with adapter layer
    end
    
    PeftModel-->>User: Ready model
```

The injection process:
1. Validates configuration compatibility with target modules
2. Identifies modules matching `target_modules` patterns
3. Creates adapter layers via `_create_and_replace` method
4. Replaces original modules with adapter wrappers
5. Marks appropriate parameters as trainable

**Source:** `src/peft/tuners/tuners_utils.py`

## Multi-Adapter Configuration

PEFT supports multiple adapters through the adapter naming system:

```python
# Load multiple adapters
peft_model = PeftModel.from_pretrained(
    base_model, 
    "adapter-1-path",
    adapter_name="adapter_1"
)
peft_model.load_adapter("adapter-2-path", adapter_name="adapter_2")

# Set active adapter
peft_model.set_adapter("adapter_1")
```

Each adapter maintains its own configuration accessible via:
```python
peft_model.peft_config["adapter_name"]
```

**Source:** `src/peft/tuners/tuners_utils.py`, `src/peft/helpers.py`

## Integration with Model Types

### Model-Specific Configurations

Different model architectures require specific configuration handling:

| Model Type | PeftModel Class | Special Config Parameters |
|------------|-----------------|---------------------------|
| Causal LM | `PeftModelForCausalLM` | Standard LoRA/Prefix |
| Seq2Seq | `PeftModelForSeq2SeqLM` | `prepare_inputs_for_generation` |
| Seq Classification | `PeftModelForSequenceClassification` | `classifier_module_names` |
| Token Classification | `PeftModelForTokenClassification` | `classifier_module_names` |
| Question Answering | `PeftModelForQuestionAnswering` | `qa_module_names` |
| Feature Extraction | `PeftModelForFeatureExtraction` | Standard config |

**Source:** `src/peft/peft_model.py`

### Target Module Mapping

Each tuner type defines a `target_module_mapping` that specifies compatible layers for different model architectures:

```python
# Example structure in tuners
target_module_mapping = TRANSFORMERS_MODELS_TO_LORA_TARGET_MODULES_MAPPING
```

This mapping ensures adapters are only applied to compatible modules (e.g., preventing LoRA application to incompatible modules in Mamba architectures).

**Source:** `src/peft/tuners/lora/model.py`, `src/peft/tuners/tuners_utils.py`

## Advanced Configuration Features

### Mixed Model Configuration

For models requiring multiple adapter types:

```python
from peft import PeftMixedModel

# Load a first adapter, then stack adapters of different PEFT types
mixed_model = PeftMixedModel.from_pretrained(model, "path/to/lora-adapter", adapter_name="lora")
mixed_model.load_adapter("path/to/loha-adapter", adapter_name="loha")
mixed_model.set_adapter(["lora", "loha"])
```

### Hotswap Adapters

The hotswap functionality allows runtime adapter replacement:

```python
from peft.utils.hotswap import hotswap_adapter

hotswap_adapter(
    model, 
    "path-to-new-adapter", 
    adapter_name="default",
    torch_device="cuda:0"
)
```

**Source:** `src/peft/utils/hotswap.py`

### Context Manager for Adapter Scaling

Temporarily rescale adapter scaling:

```python
from peft.helpers import rescale_adapter_scale

with rescale_adapter_scale(model, multiplier=0.5):
    output = model(inputs)
```

**Source:** `src/peft/helpers.py`

## Configuration Validation

### Target Module Compatibility

The configuration system validates target modules against model architecture:

```python
def _check_target_module_compatiblity(self, peft_config, model, target_name):
    _check_lora_target_modules_mamba(peft_config, model, target_name)
```

This prevents applying adapters to incompatible modules in specific architectures.

**Source:** `src/peft/tuners/tuners_utils.py`

### PEFT Type Detection

Automatic PEFT type detection from model paths:

```python
peft_type = PeftConfig._get_peft_type(model_name_or_path, **hf_kwargs)
config_cls = PEFT_TYPE_TO_CONFIG_MAPPING[peft_type]
```

## Best Practices

1. **Always specify `task_type`**: Helps PEFT apply correct model wrapper
2. **Use `target_modules` wisely**: Restricting to key layers reduces memory
3. **Set `inference_mode=False` for training**: Required for gradient computation
4. **Save adapter config alongside weights**: Ensures reproducibility
5. **Use `modules_to_save` sparingly**: Only for task-specific heads

## See Also

- [LoRA Configuration](https://github.com/huggingface/peft/blob/main/docs/source/package_reference/config.md)
- [PEFT Types Reference](https://github.com/huggingface/peft/blob/main/docs/source/package_reference/peft_types.md)
- [PEFT Model Configuration Tutorial](https://github.com/huggingface/peft/blob/main/docs/source/tutorial/peft_model_config.md)

---

<a id='page-model-loading'></a>

## Model Loading and Saving

### Related Pages

Related topics: [Core Components](#page-core-components), [Configuration System](#page-configuration), [Quantization Integration](#page-quantization)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/peft_model.py](https://github.com/huggingface/peft/blob/main/src/peft/peft_model.py)
- [src/peft/tuners/tuners_utils.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/tuners_utils.py)
- [src/peft/helpers.py](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)
- [src/peft/tuners/lora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py)
- [src/peft/utils/save_and_load.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/save_and_load.py)
</details>

# Model Loading and Saving

## Overview

The PEFT (Parameter-Efficient Fine-Tuning) library provides a comprehensive system for loading, saving, and managing adapter-based model configurations. This system enables users to efficiently fine-tune large language models by training only a small subset of parameters while maintaining the ability to save, load, and merge adapters with the base model.

The loading and saving architecture in PEFT is designed to be:

- **Interoperable**: Adapters can be shared via Hugging Face Hub
- **Flexible**: Multiple adapters can coexist and be switched dynamically
- **Memory-efficient**: Supports low CPU memory usage during loading
- **Non-destructive**: Original base models remain unmodified

Sources: [src/peft/tuners/tuners_utils.py:1-50]()

## Architecture

```mermaid
graph TD
    A[Base Model] --> B[PeftModel]
    B --> C[Adapter 1]
    B --> D[Adapter 2]
    B --> E[Adapter N]

    F[save_pretrained] --> G[adapter_config.json]
    F --> H[adapter_model.safetensors]

    I[from_pretrained] --> J[Load Base Model]
    I --> K[Load Adapter Config]
    I --> L[Inject Adapters]

    M[merge_and_unload] --> N[Merged Base Model, no adapters]

    O[unload] --> P[Original Base Model, adapters removed]
```

## Loading PEFT Models

### Loading from Pretrained

The `PeftModel.from_pretrained()` class method loads a PEFT model configuration and applies it to a base model:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel, PeftConfig

# Load the PEFT configuration (it records the base model it was trained on)
peft_config = PeftConfig.from_pretrained("path/to/peft_model")

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(peft_config.base_model_name_or_path)

# Create the PEFT model with the loaded adapter
peft_model = PeftModel.from_pretrained(base_model, "path/to/peft_model")
```

### Using get_peft_model

For creating new PEFT models from scratch:

```python
from transformers import AutoModelForCausalLM
from peft import get_peft_model, LoraConfig, TaskType

config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
)

model = AutoModelForCausalLM.from_pretrained("base_model")
peft_model = get_peft_model(model, config)
```

Sources: [src/peft/peft_model.py:1-100]()

### Loading Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model` | `torch.nn.Module` | Required | The base model to apply PEFT to |
| `model_id` | `str` | Required | Path or HF Hub identifier for PEFT checkpoint |
| `adapter_name` | `str` | "default" | Name for the loaded adapter |
| `is_trainable` | `bool` | `False` | Whether adapter should be trainable |
| `low_cpu_mem_usage` | `bool` | `False` | Create weights on meta device for faster loading |
| `torch_dtype` | `torch.dtype` | None | Data type for loaded weights |
| `device_map` | `str/dict` | None | Device placement strategy |

Sources: [src/peft/peft_model.py:100-200]()

## Saving PEFT Models

### Saving to Disk

The `save_pretrained()` method saves the PEFT adapter weights and configuration:

```python
peft_model.save_pretrained("output/path")
```

This creates:
- `adapter_config.json` - Adapter configuration
- `adapter_model.safetensors` - Adapter weights
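
For sharing, the standard Hub mixin is also available on `PeftModel` (the repository id below is hypothetical):

```python
# Save adapter weights and config locally
peft_model.save_pretrained("output/path")

# Push the adapter (not the base model) to the Hugging Face Hub
peft_model.push_to_hub("your-username/llama-lora-adapter")
```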

### Save Configuration Options

| Parameter | Type | Description |
|-----------|------|-------------|
| `selected_adapters` | `Optional[List[str]]` | Specific adapters to save (default: all) |
| `safe_serialization` | `bool` | Use safetensors format (default: `True`) |
| `save_embedding_layers` | `Union[str, bool]` | Whether to also save embedding layers (default: `"auto"`) |

## Merging and Unloading

### Merge and Unload

The `merge_and_unload()` method merges all adapter weights into the base model and returns the combined model:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("base_model")
peft_model = PeftModel.from_pretrained(base_model, "path/to/peft_model")

# Merge adapters into base model
merged_model = peft_model.merge_and_unload()
```

This operation:
- Combines adapter weights with base model weights
- Removes PEFT wrapper layers
- Returns a standard HuggingFace model

Sources: [src/peft/tuners/tuners_utils.py:1-100]()

### Safe Merge

For secure merging with validation:

```python
merged_model = peft_model.merge_and_unload(safe_merge=True)
```

Safe merge checks tensor shapes and dtypes before merging to prevent corruption.

### Unload

The `unload()` method removes all PEFT adapters and returns the original base model:

```python
base_model = peft_model.unload()
```

Unlike `merge_and_unload()`, this operation:
- Does not modify model weights
- Simply removes PEFT wrapper layers
- Returns the original base model unchanged

```mermaid
graph LR
    A[PeftModel] -->|merge_and_unload| B[Merged Base Model]
    A -->|unload| C[Original Base Model]
    
    B --> D[Combined Weights]
    C --> E[Original Weights Intact]
```

Sources: [src/peft/tuners/tuners_utils.py:100-200]()

### Merge Utilities

The `merge_utils.py` module provides low-level merging functions:

| Function | Description |
|----------|-------------|
| `merge_linear_weights` | Merges LoRA weights into linear layers |
| `merge_qkv_weights` | Merges QKV attention weights |
| `merge叠加` | Generic merge operation |

## Multi-Adapter Management

### Adding Multiple Adapters

PEFT supports loading multiple adapters onto a single base model:

```python
peft_model.load_adapter("path/to/adapter1", adapter_name="adapter1")
peft_model.load_adapter("path/to/adapter2", adapter_name="adapter2")
```

### Switching Active Adapters

```python
# Set active adapter
peft_model.set_adapter("adapter1")

# Optionally merge the active adapter into the base weights for faster inference...
peft_model.merge_adapter()
# ...and unmerge it again to keep training or switch adapters
peft_model.unmerge_adapter()
```

### Merging Specific Adapters

```python
# Merge only specific adapters
merged_model = peft_model.merge_and_unload(adapter_names=["adapter1"])
```

## Signature Updates

When using PEFT models with adapters, the model signatures may differ from the base model. PEFT provides utility functions to update signatures:

### Update Forward Signature

```python
from peft.helpers import update_forward_signature

update_forward_signature(peft_model)
```

This allows `help(peft_model.forward)` to show the full signature including parameters from parent classes.

### Update Generate Signature

```python
from peft.helpers import update_generate_signature

update_generate_signature(peft_model)
```

Enables `help(peft_model.generate)` to display the complete generation parameters.

Sources: [src/peft/helpers.py:1-100]()

## Checking PEFT Models

Use `check_if_peft_model()` to verify if a model path contains a PEFT configuration:

```python
from peft import check_if_peft_model

is_peft = check_if_peft_model("path/to/model")
```

This function:
- Attempts to load an `adapter_config.json` from the given path
- Returns `True` if a valid PEFT config is found
- Returns `False` otherwise

Sources: [src/peft/helpers.py:100-200]()

## Loading with Quantization

PEFT models can be loaded with quantized base models using BitsAndBytes:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

quantization_config = BitsAndBytesConfig(load_in_8bit=True)
base_model = AutoModelForCausalLM.from_pretrained(
    "model_name",
    quantization_config=quantization_config,
)

base_model = prepare_model_for_kbit_training(base_model)
lora_config = LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32)
peft_model = get_peft_model(base_model, lora_config)
```

Sources: [src/peft/tuners/lora/model.py:1-100]()

## Rescaling Adapter Scale

The `rescale_adapter_scale()` context manager temporarily adjusts adapter scaling:

```python
from peft.helpers import rescale_adapter_scale

with rescale_adapter_scale(model, multiplier=0.5):
    output = model(inputs)  # Scaled by 0.5
```

Sources: [src/peft/helpers.py:200-300]()

## Workflow Diagram

```mermaid
graph TD
    A[Start] --> B{Load Base Model}
    B --> C[Load PEFT Config]
    C --> D{Existing Adapter?}
    
    D -->|Yes| E[from_pretrained]
    D -->|No| F[get_peft_model]
    
    E --> G[PeftModel with Adapters]
    F --> H[PeftModel with New Config]
    
    G --> I{Training}
    H --> I
    
    I --> J[Train Adapters]
    J --> K[save_pretrained]
    
    K --> L[Share via Hub]
    
    I --> M{Inference}
    M --> N{Use Merged?}
    
    N -->|Yes| O[merge_and_unload]
    N -->|No| P[Use with Adapters]
    
    O --> Q[Merged Model]
    P --> R[Forward with Adapters]
```

## Best Practices

1. **Memory Optimization**: Use `low_cpu_mem_usage=True` when loading large adapters to speed up the process
2. **Safe Serialization**: Always use `save_pretrained()` with `safe_serialization=True` (default) for secure model sharing
3. **Multiple Adapters**: Load adapters with distinct names and switch between them using `set_adapter()`
4. **Signature Updates**: Call `update_forward_signature()` and `update_generate_signature()` for better IDE support
5. **Quantization**: Prepare quantized models with `prepare_model_for_kbit_training()` before applying PEFT

---

<a id='page-quantization'></a>

## Quantization Integration

### Related Pages

Related topics: [LoRA and LoRA Variants](#page-lora-methods), [Model Loading and Saving](#page-model-loading), [Advanced Features](#page-advanced-features)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/tuners/lora/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lora/model.py)
- [src/peft/tuners/ia3/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/ia3/model.py)
- [src/peft/tuners/lokr/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/lokr/model.py)
- [src/peft/tuners/loha/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/loha/model.py)
- [src/peft/helpers.py](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)
- [src/peft/peft_model.py](https://github.com/huggingface/peft/blob/main/src/peft/peft_model.py)
</details>

# Quantization Integration

PEFT (Parameter-Efficient Fine-Tuning) provides comprehensive support for integrating quantized base models with various parameter-efficient fine-tuning methods. This integration enables training large models that would otherwise require prohibitive amounts of memory by combining quantization techniques with PEFT adapters.

## Overview

Quantization integration in PEFT allows users to:

- Load base models in quantized form (8-bit, 4-bit, or other formats) to reduce memory footprint
- Apply PEFT adapters (LoRA, IA³, LoHa, LoKr, etc.) on top of quantized layers
- Fine-tune the adapters while keeping the quantized base model frozen
- Maintain model quality while significantly reducing GPU memory requirements

Sources: [src/peft/tuners/lora/model.py]()

## Supported Quantization Methods

PEFT supports multiple quantization backends through integration with popular quantization libraries.

| Quantization Method | Backend Library | Precision Options | Status |
|--------------------|-----------------|-------------------|--------|
| BitsAndBytes | `bitsandbytes` | 8-bit, 4-bit | Fully Supported |
| GPTQ | `auto-gptq` | 4-bit | Fully Supported |
| AWQ | `autoawq` | 4-bit | Fully Supported |
| AQLM | `aqlm` | Mixed-bit | Fully Supported |
| EETQ | `eetq` | 8-bit | Fully Supported |
| HQQ | `hqq` | Configurable | Fully Supported |

## Architecture

### Quantization Integration Flow

```mermaid
graph TD
    A[Base Model Loading] --> B{Quantization Backend}
    B -->|bitsandbytes| C[BitsAndBytes 8-bit/4-bit]
    B -->|GPTQ| D[GPTQ 4-bit]
    B -->|AWQ| E[AWQ 4-bit]
    B -->|AQLM| F[AQLM]
    B -->|EETQ| G[EETQ 8-bit]
    B -->|HQQ| H[HQQ]
    
    C --> I[PEFT Adapter Injection]
    D --> I
    E --> I
    F --> I
    G --> I
    H --> I
    
    I --> J[LoRA / IA³ / LoHa / LoKr Layers]
    J --> K[Fine-tuning with Frozen Quantized Base]
```

### Module Replacement Strategy

When applying PEFT adapters to quantized models, the system replaces specific linear layers with quantized-aware versions that preserve quantization state.

```mermaid
graph LR
    A[Original Linear / Quantized Linear] --> B{Is Quantized?}
    B -->|Yes - 8-bit| C[Linear8bitLt + Adapter]
    B -->|Yes - 4-bit| D[Linear4bit + Adapter]
    B -->|No| E[Linear + Adapter]
    
    C --> F[Forward with Quantization]
    D --> F
    E --> F
```

## BitsAndBytes Integration

The BitsAndBytes integration provides 8-bit and 4-bit quantization support through the `bitsandbytes` library.

### Configuration

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import get_peft_model, LoraConfig

quantization_config = BitsAndBytesConfig(
    load_in_8bit=True  # or load_in_4bit=True
)

model = AutoModelForCausalLM.from_pretrained(
    "model_name",
    quantization_config=quantization_config,
)

peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
)

peft_model = get_peft_model(model, peft_config)
```

### 8-bit Layer Implementation

When loading an 8-bit model, PEFT replaces standard linear layers with `Linear8bitLt` that inherits quantization state from the base layer:

```python
# From src/peft/tuners/ia3/model.py
if loaded_in_8bit and isinstance(target_base_layer, bnb.nn.Linear8bitLt):
    eightbit_kwargs = kwargs.copy()
    eightbit_kwargs.update(
        {
            "has_fp16_weights": target_base_layer.state.has_fp16_weights,
            "threshold": target_base_layer.state.threshold,
            "index": target_base_layer.index,
        }
    )
```

Sources: [src/peft/tuners/ia3/model.py:40-49]()

### 4-bit Layer Implementation

Similarly, 4-bit quantized layers are handled with `Linear4bit`:

```python
if loaded_in_4bit and isinstance(target_base_layer, bnb.nn.Linear4bit):
    fourbit_kwargs = kwargs.copy()
    fourbit_kwargs.update(
        {
            "compute_dtype": target_base_layer.compute_dtype,
            "compress_statistics": target_base_layer.weight.compress_statistics,
            "quant_type": target_base_layer.weight.quant_type,
        }
    )
```

Sources: [src/peft/tuners/ia3/model.py:50-56]()

## Preparing Quantized Models for Training

PEFT provides the `prepare_model_for_kbit_training` utility function to prepare quantized models for training with PEFT adapters.

### Function Signature

```python
def prepare_model_for_kbit_training(
    model,
    use_gradient_checkpointing: bool = True,
    gradient_checkpointing_kwargs: Optional[dict] = None,
):
```

Sources: [src/peft/utils/other.py]()

### Key Operations

1. **Parameter Freezing**: Freezes all base model parameters
2. **Dtype Upcasting**: Casts layer norms (and the LM head) to `float32` for training stability
3. **Gradient Checkpointing**: Optionally enables gradient checkpointing and input gradients to save memory during backpropagation

### Usage Example

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

int8_config = BitsAndBytesConfig(load_in_8bit=True)

# After loading the quantized model
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=int8_config,
    device_map="cuda:0",
)

# Prepare for k-bit training
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing=True)
```

## Supported Tuners with Quantization

All major PEFT tuners support integration with quantized base models:

| Tuner | 8-bit Support | 4-bit Support | File Location |
|-------|---------------|--------------|---------------|
| LoRA | ✅ | ✅ | `src/peft/tuners/lora/` |
| IA³ | ✅ | ✅ | `src/peft/tuners/ia3/` |
| LoHa | ✅ | ✅ | `src/peft/tuners/loha/` |
| LoKr | ✅ | ✅ | `src/peft/tuners/lokr/` |
| AdaLoRA | ✅ | ✅ | `src/peft/tuners/adalora/` |

## Layer Class Mappings

Each tuner defines specific layer mappings for different layer types:

```python
# From src/peft/tuners/lokr/model.py
layers_mapping: dict[type[torch.nn.Module], type[LoKrLayer]] = {
    torch.nn.Conv2d: Conv2d,
    torch.nn.Conv1d: Conv1d,
    torch.nn.Linear: Linear,
}

# From src/peft/tuners/loha/model.py  
layers_mapping: dict[type[torch.nn.Module], type[LoHaLayer]] = {
    torch.nn.Conv2d: Conv2d,
    torch.nn.Conv1d: Conv1d,
    torch.nn.Linear: Linear,
}
```

Sources: [src/peft/tuners/lokr/model.py:87-90](), [src/peft/tuners/loha/model.py:79-82]()

## Base Tuner Layer Properties

All quantized-aware tuner layers inherit from `BaseTunerLayer` which provides key functionality:

### Key Methods

| Method | Purpose |
|--------|---------|
| `get_base_layer()` | Retrieves the underlying base layer (quantized or not) |
| `update_layer()` | Updates adapter weights for existing layers |
| `merge()` | Merges adapter weights into base layer |
| `unmerge()` | Separates merged adapter weights |

```python
if isinstance(target, BaseTunerLayer):
    target_base_layer = target.get_base_layer()
else:
    target_base_layer = target
```

Sources: [src/peft/tuners/ia3/model.py:34-37]()
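
A small inspection sketch using the public `BaseTunerLayer` type to list wrapped layers and their (possibly quantized) base classes:

```python
from peft.tuners.tuners_utils import BaseTunerLayer

for name, module in peft_model.named_modules():
    if isinstance(module, BaseTunerLayer):
        base = module.get_base_layer()  # e.g. bnb.nn.Linear8bitLt or nn.Linear
        print(name, type(base).__name__)
```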

## Adapter Management with Quantization

### Creating New Modules

When creating new adapter modules for quantized layers:

1. Detect the quantization state from the base layer
2. Preserve quantization parameters (thresholds, compute dtype, etc.)
3. Create appropriate quantized-aware adapter layer

```mermaid
sequenceDiagram
    participant Base as Base Model (Quantized)
    participant PEFT as PEFT System
    participant Adapter as Adapter Layer
    
    Base->>PEFT: Target Linear Layer
    PEFT->>PEFT: Detect 8-bit / 4-bit quantization
    PEFT->>Adapter: Create with quantization state
    Adapter->>Base: Store reference + quantization params
```

### Multiple Adapters

PEFT supports multiple adapters on quantized models through the `active_adapters` mechanism:

```python
# Adding additional adapters to quantized model
if adapter_name not in self.active_adapters:
    # adding an additional adapter: it is not automatically trainable
    new_module.requires_grad_(False)
```

Source: [src/peft/tuners/loha/model.py:1]()
Source: [src/peft/tuners/lokr/model.py:1]()
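
A short sketch of loading and switching between two adapters on one base model; the adapter paths are placeholders:

```python
from peft import PeftModel

# `base_model` is any loaded transformers model
# Attach a first adapter, then load a second one alongside it
model = PeftModel.from_pretrained(base_model, "path/to/adapter-a", adapter_name="a")
model.load_adapter("path/to/adapter-b", adapter_name="b")

model.set_adapter("b")  # "b" is now active; "a" stays loaded but inactive
```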

## Memory Efficiency Considerations

### Memory Breakdown

Approximate figures for a ~35B-parameter base model in fp16. The gradient and optimizer rows assume full fine-tuning and are shown for contrast; with PEFT adapters, both scale with the small set of trainable adapter parameters (typically megabytes, not gigabytes):

| Component | Full Precision | 8-bit | 4-bit |
|-----------|---------------|-------|-------|
| Base Model Weights | ~70GB | ~35GB | ~18GB |
| Gradients (full fine-tuning) | ~70GB | ~70GB | ~70GB |
| Activations | Variable | Variable | Variable |
| Optimizer States (full fine-tuning) | ~280GB | ~280GB | ~280GB |

### Best Practices

1. **Use Gradient Checkpointing**: Reduces activation memory at cost of extra compute
2. **Target Specific Modules**: Only apply adapters to key layers (q_proj, v_proj)
3. **Batch Size**: Start with small batch sizes and scale based on available memory
4. **Mixed Precision**: Use bfloat16 for gradients when possible
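
These practices translate directly into training arguments; a minimal sketch with illustrative values:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,   # start small, then scale up
    gradient_accumulation_steps=16,  # recover an effective batch size of 16
    gradient_checkpointing=True,     # trade extra compute for activation memory
    bf16=True,                       # mixed precision where supported
)
```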

## Context Manager for Adapter Scaling

PEFT provides `rescale_adapter_scale` for temporarily adjusting adapter scaling:

```python
@contextmanager
def rescale_adapter_scale(model, multiplier):
    """
    Context manager to temporarily rescale the scaling of the LoRA adapter.
    
    The original scaling values are restored when the context manager exits.
    """
```

Source: [src/peft/helpers.py:80-90]()

## Error Handling

### Common Issues

| Error | Cause | Solution |
|-------|-------|----------|
| TypeError on forward | Quantization state not preserved | Ensure proper layer replacement |
| OOM during forward | Batch size too large | Reduce batch size, use gradient checkpointing |
| Mismatched dtypes | Mixed precision issues | Cast to consistent dtype before training |

### Verification Steps

1. Verify quantization config is properly set
2. Confirm adapter layers are correctly injected
3. Check that gradient checkpointing is enabled for large models

## Configuration Reference

### BitsAndBytesConfig Options

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `load_in_8bit` | bool | False | Load model in 8-bit |
| `load_in_4bit` | bool | False | Load model in 4-bit |
| `llm_int8_threshold` | float | 6.0 | Outlier threshold for 8-bit |
| `llm_int8_skip_modules` | List | None | Modules to skip 8-bit conversion |
| `llm_int8_enable_fp32_cpu_offload` | bool | False | Enable CPU offload for 32-bit tensors |
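
A minimal sketch of constructing a configuration from the options above; the skipped module name is illustrative:

```python
from transformers import BitsAndBytesConfig

int8_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_threshold=6.0,             # outlier threshold for mixed int8/fp16 matmul
    llm_int8_skip_modules=["lm_head"],  # keep the head in higher precision
)
```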

## See Also

- [LoRA Configuration](../package_reference/lora.md)
- [Developer Guides - Quantization](../developer_guides/quantization.md)
- [bitsandbytes Integration](https://github.com/TimDettmers/bitsandbytes)
- [GPTQ Quantization](https://github.com/AutoGPTQ/AutoGPTQ)

---

<a id='page-advanced-features'></a>

## Advanced Features

### Related Pages

Related topics: [Quantization Integration](#page-quantization)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/peft/mixed_model.py](https://github.com/huggingface/peft/blob/main/src/peft/mixed_model.py)
- [src/peft/tuners/mixed/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/mixed/model.py)
- [src/peft/utils/hotswap.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/hotswap.py)
- [src/peft/utils/incremental_pca.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/incremental_pca.py)
- [docs/source/developer_guides/checkpoint.md](https://github.com/huggingface/peft/blob/main/docs/source/developer_guides/checkpoint.md)
- [docs/source/developer_guides/mixed_models.md](https://github.com/huggingface/peft/blob/main/docs/source/developer_guides/mixed_models.md)
- [docs/source/accelerate/deepspeed.md](https://github.com/huggingface/peft/blob/main/docs/source/accelerate/deepspeed.md)
- [docs/source/accelerate/fsdp.md](https://github.com/huggingface/peft/blob/main/docs/source/accelerate/fsdp.md)
</details>

# Advanced Features

PEFT (Parameter-Efficient Fine-Tuning) provides a comprehensive suite of advanced features that extend beyond basic adapter-based fine-tuning. These features enable sophisticated model adaptation strategies, including mixed adapter configurations, runtime adapter switching, distributed training support, and advanced optimization techniques.

## Mixed Adapter Models

Mixed adapter models allow multiple adapter types to coexist within a single base model. This powerful feature enables combining different fine-tuning techniques to leverage their respective strengths.

### Overview

The mixed model architecture in PEFT allows a base model to have multiple adapters of different types applied simultaneously. This is particularly useful when different adapters excel at different aspects of a task, or when you want to experiment with combining adapter strengths.

The mixed model functionality is implemented across two primary modules:

| Module | File Path | Purpose |
|--------|-----------|---------|
| `PeftMixedModel` | `src/peft/mixed_model.py` | Base mixed model class |
| `MixedModel` | `src/peft/tuners/mixed/model.py` | Tuner-specific mixed model implementation |
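
The developer guide describes creating a mixed model by passing `mixed=True` to `get_peft_model`. A minimal sketch combining a LoRA and a LoHa adapter (configs and target modules are illustrative):

```python
from peft import LoHaConfig, LoraConfig, get_peft_model

# `base_model` is any loaded transformers model
lora_cfg = LoraConfig(r=8, target_modules=["q_proj", "v_proj"])
loha_cfg = LoHaConfig(r=8, target_modules=["q_proj", "v_proj"])

# Wrap the base model as a mixed PEFT model, then add the second adapter
model = get_peft_model(base_model, lora_cfg, adapter_name="lora", mixed=True)
model.add_adapter("loha", loha_cfg)
model.set_adapter(["lora", "loha"])  # activate both adapters together
```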

### Architecture

```mermaid
graph TD
    A[Base Model] --> B[Mixed Adapter Layer]
    B --> C[LoRA Adapter]
    B --> D[IA³ Adapter]
    B --> E[AdaLoRA Adapter]
    B --> N[Additional Adapters]
    
    F[Adapter Config 1] --> C
    G[Adapter Config 2] --> D
    H[Adapter Config 3] --> E
    
    I[Active Adapter Selection] --> B
    J[Multi-Adapter Inference] --> B
```

### Supported Adapter Combinations

PEFT implements many tuner types; a subset of these can be combined in a single mixed configuration (the authoritative list is the `COMPATIBLE_TUNER_TYPES` constant in `src/peft/tuners/mixed/model.py`):

| Tuner Type | Prefix | Description |
|------------|--------|-------------|
| LoRA | `lora_` | Low-Rank Adaptation |
| AdaLoRA | `adalora_` | Adaptive LoRA with budget allocation |
| IA³ | `ia3_` | Learned scaling of inner activations |
| OFT | `oft_` | Orthogonal Fine-Tuning |
| HRA | `hra_` | Householder Reflection Adaptation |
| HiRA | `hira_` | Hadamard High-Rank Adaptation |
| SHiRA | `shira_` | Sparse High Rank Adapters |
| GraLoRA | `gralora_` | Granular Low-Rank Adaptation |
| MiSS | `miss_` | Matrix Shard Sharing |
| AdaMSS | `adamss_` | Adaptive Multi-subspace Schur Complement |
| X-LoRA | `xlora_` | Mixture of LoRA experts with dynamic gating |
| Poly | `poly_` | Polytropon multi-task adapter routing |

### Key Implementation Details

Each tuner in PEFT defines specific attributes that enable mixed adapter support:

```python
# Common tuner model attributes
prefix: str  # Unique prefix for the tuner (e.g., "lora_", "ia3_")
tuner_layer_cls = SpecificLayerClass  # The layer class for this tuner
target_module_mapping = {...}  # Mapping of model types to target modules
```

The mixed model implementation handles adapter creation through the `_create_and_replace` method, which validates the current key and delegates to appropriate adapter-specific logic.

Source: [src/peft/tuners/shira/model.py:1-50](https://github.com/huggingface/peft/blob/main/src/peft/tuners/shira/model.py)
Source: [src/peft/tuners/mixed/model.py](https://github.com/huggingface/peft/blob/main/src/peft/tuners/mixed/model.py)

## Adapter Hotswap

The hotswap feature enables runtime replacement of adapters without requiring full model reload. This is essential for production environments where model availability must be maintained during adapter updates.

### Purpose

Adapter hotswapping allows you to:

- Replace a deployed adapter with an updated version
- Switch between different fine-tuned adapters for different tasks
- Update model capabilities without downtime
- A/B test different adapter versions in production

### Implementation

The hotswap functionality is implemented in `src/peft/utils/hotswap.py` and provides the `hotswap_adapter` function for runtime adapter replacement.

```python
def hotswap_adapter(
    model: "PeftModel",
    model_name_or_path: str,
    adapter_name: str,
    torch_device: Optional[str] = None,
    **kwargs,
) -> None:
```

### Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | `PeftModel` | The PEFT model with the loaded adapter |
| `model_name_or_path` | `str` | Path or identifier for the new adapter |
| `adapter_name` | `str` | Name of the adapter to replace (e.g., `"default"`) |
| `torch_device` | `str`, optional | Target device for adapter weights |
| `**kwargs` | | Additional arguments for config/weight loading |

### Workflow

```mermaid
graph TD
    A[Load New Adapter Config] --> B[Validate Adapter Type]
    B --> C[Load Adapter Weights to Device]
    C --> D[Validate Weight Compatibility]
    D --> E[Replace Adapter Weights in Model]
    E --> F[Update Model State]
    F --> G[Model Ready for Inference]
    
    H[Inference with New Adapter] -.-> G
```

### Usage Example

```python
import torch
from peft.utils.hotswap import hotswap_adapter

# Replace the "default" LoRA adapter with a new one
hotswap_adapter(model, "path-to-new-adapter", adapter_name="default", torch_device="cuda:0")

# Use the updated model
with torch.inference_mode():
    output = model(inputs).logits
```

### Configuration Validation

During hotswap, the system performs several validations:

1. **Config Loading**: Loads the new adapter configuration using `config_cls.from_pretrained()`
2. **Type Matching**: Ensures the new adapter type is compatible with existing adapters
3. **Weight Loading**: Loads weights onto the specified device with appropriate quantization settings
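
When the model is compiled with `torch.compile`, a hotswap that changes LoRA ranks would alter tensor shapes and trigger recompilation. The same module ships `prepare_model_for_compiled_hotswap` to pad ranks ahead of time; a hedged sketch (the `target_rank` value is illustrative):

```python
import torch
from peft.utils.hotswap import prepare_model_for_compiled_hotswap

# Pad all LoRA ranks up to a shared maximum before compiling, so later
# hotswaps keep tensor shapes stable and avoid recompilation.
prepare_model_for_compiled_hotswap(model, target_rank=32)
model = torch.compile(model)
```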

Source: [src/peft/utils/hotswap.py:1-80](https://github.com/huggingface/peft/blob/main/src/peft/utils/hotswap.py)
Source: [docs/source/developer_guides/checkpoint.md](https://github.com/huggingface/peft/blob/main/docs/source/developer_guides/checkpoint.md)

## Incremental PCA Utilities

PEFT includes incremental PCA utilities for advanced analysis and optimization of adapter matrices. Incremental PCA is particularly useful for:

- Analyzing the rank structure of trained adapters
- Identifying redundant parameters in low-rank adaptations
- Computing principal components in a memory-efficient manner

### Implementation

The incremental PCA implementation is located in `src/peft/utils/incremental_pca.py`. This utility supports processing large matrices in batches to avoid memory constraints.

### Key Features

| Feature | Description |
|---------|-------------|
| Batch Processing | Process large matrices incrementally |
| Memory Efficiency | Avoid loading entire matrices into memory |
| Rank Analysis | Determine effective rank of adapter matrices |
| Component Extraction | Extract principal components for analysis |

### Use Cases

1. **Adapter Analysis**: Understand the dimensionality requirements of trained adapters
2. **Compression**: Identify opportunities for matrix rank reduction
3. **Quality Assessment**: Verify that low-rank approximations maintain sufficient information

Source: [src/peft/utils/incremental_pca.py](https://github.com/huggingface/peft/blob/main/src/peft/utils/incremental_pca.py)
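
A minimal sketch of batch-wise fitting, assuming the module exposes a torch-based `IncrementalPCA` with an sklearn-style `partial_fit`; shapes and component count are illustrative:

```python
import torch
from peft.utils.incremental_pca import IncrementalPCA

# Fit the top principal components of a large matrix batch by batch
ipca = IncrementalPCA(n_components=16)
for batch in torch.randn(10, 256, 1024):  # ten batches of shape (256, 1024)
    ipca.partial_fit(batch)

components = ipca.components_  # (16, 1024) principal directions
```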

## Distributed Training Support

PEFT provides comprehensive support for distributed training frameworks, enabling efficient fine-tuning of large models across multiple devices and nodes.

### DeepSpeed Integration

PEFT integrates with DeepSpeed ZeRO optimizations for memory-efficient distributed training.

#### Features

| Feature | Description |
|---------|-------------|
| ZeRO Stage 2/3 | Partition optimizer states across devices |
| CPU Offload | Offload parameters/optimizer states to CPU |
| Activation Checkpointing | Reduce memory for activations |
| Mixed Precision | FP16/BF16 training support |

#### Configuration

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
)

peft_model = get_peft_model(model, peft_config)
# Train with DeepSpeed ZeRO-3 config
```

#### Key Considerations

- Only non-trainable weights should remain on the original device when using PEFT with DeepSpeed
- Trainable adapter weights are managed by DeepSpeed's optimizer partitioning
- Offloading should be configured at the DeepSpeed level, not within PEFT configs

Source: [docs/source/accelerate/deepspeed.md](https://github.com/huggingface/peft/blob/main/docs/source/accelerate/deepspeed.md)

### FSDP Integration

Fully Sharded Data Parallel (FSDP) support enables sharding model parameters, gradients, and optimizer states across GPUs.

#### Features

| Feature | Description |
|---------|-------------|
| Parameter Sharding | Distribute model parameters across GPUs |
| Gradient Sharding | Partition gradients during backward pass |
| Optimizer Sharding | Distribute optimizer states |
| Mixed Precision | Automatic FP16/BF16 handling |

#### Configuration with Accelerate

```yaml
# accelerate config.yaml
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
fsdp_config:
  fsdp_sharding_strategy: FULL_SHARD
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP
  fsdp_backward_prefetch: BACKWARD_PRE
  fsdp_state_dict_type: FULL_STATE_DICT
```

#### Compatibility Notes

- FSDP support requires `transformers>=4.36.0`
- Auto-wrap policies should wrap transformer layers containing PEFT adapters
- State dict type should be `FULL_STATE_DICT` for checkpoint saving

Source: [docs/source/accelerate/fsdp.md](https://github.com/huggingface/peft/blob/main/docs/source/accelerate/fsdp.md)

## Advanced Tuner Configurations

### AdaLoRA - Adaptive Budget Allocation

AdaLoRA implements an intelligent budget allocation strategy that dynamically adjusts the rank of different adapter matrices during training.

#### Training Workflow

```mermaid
graph TD
    A[Initialize with Uniform Rank] --> B[Forward Pass]
    B --> C[Calculate Importance Scores]
    C --> D{Global Step < Total - T_final?}
    D -->|Yes| E[Update Rank Pattern]
    E --> B
    D -->|No| F[Mask Unimportant Weights]
    F --> G[Finalize Adapter]
```

#### Key Parameters

| Parameter | Description |
|-----------|-------------|
| `init_r` | Initial rank for each adapter matrix |
| `target_r` | Target average rank after budget pruning |
| `total_step` | Total training steps |
| `tinit` | Warmup steps before rank pruning begins |
| `tfinal` | Final steps during which the rank pattern is frozen |
| `deltaT` | Interval between rank adjustments |

Source: [src/peft/tuners/adalora/model.py:1-100](https://github.com/huggingface/peft/blob/main/src/peft/tuners/adalora/model.py)
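
A hedged configuration sketch built from the parameters above; the schedule values are illustrative, and the rank budget must be advanced manually during the training loop:

```python
from peft import AdaLoraConfig, get_peft_model

# `base_model` is any loaded transformers model
config = AdaLoraConfig(
    init_r=12,   # starting rank per matrix
    target_r=4,  # average rank after pruning
    tinit=200,
    tfinal=500,
    deltaT=10,
    total_step=2000,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, config)

# Inside the training loop, after each optimizer step:
# model.base_model.update_and_allocate(global_step)
```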

### X-LoRA - Extended LoRA with Quantization

X-LoRA supports advanced configurations including quantization-aware training and multi-adapter loading.

#### Features

| Feature | Description |
|---------|-------------|
| 8-bit Quantization | Load base models in int8 format |
| 4-bit Quantization | Load base models in int4 format |
| Flash Attention | Integration with flash_attention_2 |
| Ephemeral GPU Offload | Temporary GPU memory management |
| Multiple Adapter Loading | Load multiple adapters simultaneously |

#### Configuration

```python
from peft import XLoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quantization_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=quantization_config,
    device_map="cuda:0",
)
# The adapter paths below are placeholders for trained LoRA adapters
config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=model.config.hidden_size,
    adapters={"adapter_1": "path/to/adapter_1", "adapter_2": "path/to/adapter_2"},
)
xlora_model = get_peft_model(model, config)
```

Source: [src/peft/tuners/xlora/model.py:1-80](https://github.com/huggingface/peft/blob/main/src/peft/tuners/xlora/model.py)

### IA³ - Infused Adapter by Inhibiting and Amplifying Inner Activations

The (IA)³ method applies learnable scaling vectors to key components of transformer models.

#### Target Modules

| Model Type | Target Modules |
|------------|----------------|
| Encoder-only | `q_proj`, `v_proj`, `k_proj`, `output_proj` |
| Decoder-only | `q_proj`, `v_proj`, `k_proj`, `output_proj`, `fc1` |
| Seq2Seq | `q_proj`, `v_proj`, `k_proj`, `output_proj`, `fc1`, `fc2` |

#### Implementation Details

The IA³ implementation creates scaling vectors that are multiplied with the hidden states at specific positions in the forward pass. The scaling vectors are initialized to ones (neutral) and learned during training.

Source: [src/peft/tuners/ia3/model.py:1-80](https://github.com/huggingface/peft/blob/main/src/peft/tuners/ia3/model.py)
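
A minimal configuration sketch; the module names assume a Llama-style decoder and should be adapted to the base architecture:

```python
from peft import IA3Config, get_peft_model

# `base_model` is any loaded transformers model
config = IA3Config(
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "v_proj", "down_proj"],
    feedforward_modules=["down_proj"],  # must be a subset of target_modules
)
model = get_peft_model(base_model, config)
```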

## Helper Functions

PEFT provides utility functions for common operations that enhance the developer experience.

### Signature Management

#### `update_forward_signature`

Updates the forward signature of a PeftModel to include the base model's signature, enabling proper IDE autocompletion and documentation.

```python
from peft.helpers import update_forward_signature

update_forward_signature(peft_model)
help(peft_model.forward)  # Now shows complete signature
```

#### `update_generate_signature`

Similar to forward signature update but for the `generate` method, essential for seq2seq models.

```python
from peft.helpers import update_generate_signature

update_generate_signature(peft_model)
help(peft_model.generate)  # Now shows complete signature
```

### Model Validation

#### `check_if_peft_model`

Validates whether a model path or identifier corresponds to a PEFT model by attempting to load its configuration.

```python
from peft.helpers import check_if_peft_model

is_peft = check_if_peft_model("meta-llama/Llama-2-7b-adapter")
# Returns: True or False
```

### Adapter Scale Context Manager

The `rescale_adapter_scale` context manager temporarily adjusts adapter scaling factors, useful for controlled inference experiments.

```python
from peft.helpers import rescale_adapter_scale

with rescale_adapter_scale(model, multiplier=0.5):
    output = model(inputs)  # Scaled by 0.5
# Original scaling restored after context exit
```

Source: [src/peft/helpers.py:1-150](https://github.com/huggingface/peft/blob/main/src/peft/helpers.py)

## Task-Specific Models

PEFT provides specialized model classes optimized for different task types.

| Task Type | Model Class | Use Case |
|-----------|-------------|----------|
| Causal LM | `PeftModelForCausalLM` | Text generation |
| Feature Extraction | `PeftModelForFeatureExtraction` | Extracting embeddings |
| Question Answering | `PeftModelForQuestionAnswering` | QA tasks |
| Sequence Classification | `PeftModelForSequenceClassification` | Text classification |
| Token Classification | `PeftModelForTokenClassification` | NER, POS tagging |
| Seq2Seq LM | `PeftModelForSeq2SeqLM` | Translation, summarization |
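
Passing the matching `task_type` in the adapter config selects the corresponding class automatically; a minimal sketch for sequence classification (model name and target modules are illustrative):

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
config = LoraConfig(task_type=TaskType.SEQ_CLS, target_modules=["query", "value"])

model = get_peft_model(base, config)  # returns a PeftModelForSequenceClassification
```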

### Common Initialization Pattern

All task-specific models follow a consistent initialization pattern:

```python
def __init__(
    self,
    model: torch.nn.Module,
    peft_config: PeftConfig,
    adapter_name: str = "default",
    **kwargs,
) -> None:
    super().__init__(model, peft_config, adapter_name, **kwargs)
```

Each model class may add task-specific module name patterns for modules to save (e.g., classifier layers in sequence classification models).

Source: [src/peft/peft_model.py:1-200](https://github.com/huggingface/peft/blob/main/src/peft/peft_model.py)

## Summary

PEFT's advanced features provide a comprehensive toolkit for parameter-efficient model adaptation:

| Category | Features |
|----------|----------|
| **Mixed Adapters** | Multiple adapter types per model |
| **Runtime Switching** | Adapter hotswap without reload |
| **Analysis Tools** | Incremental PCA for matrix analysis |
| **Distributed Training** | DeepSpeed ZeRO, FSDP support |
| **Advanced Tuners** | AdaLoRA, X-LoRA, IA³, OFT, and more |
| **Developer Utilities** | Signature management, validation helpers |

These features enable both research experimentation and production deployment of efficient fine-tuning solutions across a wide range of model architectures and training configurations.

---

## Doramagic Pitfall Log

Project: huggingface/peft

Summary: 18 potential pitfalls found, 2 of them high/blocking; top priority: configuration pitfall - source evidence: [BUG] peft 0.19 target_modules (str) use `set`.

## 1. Configuration Pitfall · Source Evidence: [BUG] peft 0.19 target_modules (str) use `set`

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: [BUG] peft 0.19 target_modules (str) use `set`
- Impact on users: may raise the cost of first-time trials and production adoption.
- Suggested check: the source issue is still open; the Pack Agent must re-verify whether it still affects the current version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_bd098228d56f4251949a351ac90335fc | https://github.com/huggingface/peft/issues/3229 | the source discussion mentions Python-related conditions; re-verify before installing/trialing.

## 2. Security/Permissions Pitfall · Source Evidence: Comparison of Different Fine-Tuning Techniques for Conversational AI

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: Comparison of Different Fine-Tuning Techniques for Conversational AI
- Impact on users: may affect authorization, key configuration, or security boundaries.
- Suggested check: the source issue is still open; the Pack Agent must re-verify whether it still affects the current version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_408252d26b4a4d87b9ca9362c3b4b37b | https://github.com/huggingface/peft/issues/2310 | unverified usage conditions surfaced by a source of type github_issue.

## 3. Installation Pitfall · Source Evidence: Feature Request: Improve offline support for custom architectures in get_peft_model_state_dict

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified installation-related issue in this project: Feature Request: Improve offline support for custom architectures in get_peft_model_state_dict
- Impact on users: may raise the cost of first-time trials and production adoption.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_32e0990aa35b430bac525df543e75cac | https://github.com/huggingface/peft/issues/3211 | the source discussion mentions Python-related conditions; re-verify before installing/trialing.

## 4. Configuration Pitfall · Source Evidence: 0.17.0: SHiRA, MiSS, LoRA for MoE, and more

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: 0.17.0: SHiRA, MiSS, LoRA for MoE, and more
- Impact on users: may affect upgrades, migration, or version selection.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_a7ec4779d09a4fcebe0901d73f869bf0 | https://github.com/huggingface/peft/releases/tag/v0.17.0 | the source discussion mentions Python-related conditions; re-verify before installing/trialing.

## 5. Configuration Pitfall · Source Evidence: Applying Dora to o_proj of Meta-Llama-3.1-8B results in NaN

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified configuration-related issue in this project: Applying Dora to o_proj of Meta-Llama-3.1-8B results in NaN
- Impact on users: may raise the cost of first-time trials and production adoption.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_ce144c340d9f40929a6551e9dbca770d | https://github.com/huggingface/peft/issues/2049 | the source discussion mentions Python-related conditions; re-verify before installing/trialing.

## 6. Capability Pitfall · Capability Assessment Relies on Assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- Impact on users: if the assumption does not hold, users will not get the promised capability.
- Suggested check: turn the assumptions into a downstream verification checklist.
- Guardrail: assumptions must become verification items; they must not be stated as fact before verification results exist.
- Evidence: capability.assumptions | github_repo:570384908 | https://github.com/huggingface/peft | README/documentation is current enough for a first validation pass.

## 7. Runtime Pitfall · Source Evidence: 0.17.1

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime-related issue in this project: 0.17.1
- Impact on users: may raise the cost of first-time trials and production adoption.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_cd675dc497c44319af556a2e7059dd95 | https://github.com/huggingface/peft/releases/tag/v0.17.1 | unverified usage conditions surfaced by a source of type github_release.

## 8. Runtime Pitfall · Source Evidence: v0.15.1

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime-related issue in this project: v0.15.1
- Impact on users: may raise the cost of first-time trials and production adoption.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_66bfe8be731a44de971b991569f61e57 | https://github.com/huggingface/peft/releases/tag/v0.15.1 | unverified usage conditions surfaced by a source of type github_release.

## 9. Runtime Pitfall · Source Evidence: v0.15.2

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified runtime-related issue in this project: v0.15.2
- Impact on users: may raise the cost of first-time trials and production adoption.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_3d5933ee300d4f68bfab2f0440fae679 | https://github.com/huggingface/peft/releases/tag/v0.15.2 | unverified usage conditions surfaced by a source of type github_release.

## 10. Maintenance Pitfall · Source Evidence: 0.16.0: LoRA-FA, RandLoRA, C³A, and much more

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified maintenance/version-related issue in this project: 0.16.0: LoRA-FA, RandLoRA, C³A, and much more
- Impact on users: may raise the cost of first-time trials and production adoption.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_5ef66863f7c64b3e9e3ba6a72eaab639 | https://github.com/huggingface/peft/releases/tag/v0.16.0 | unverified usage conditions surfaced by a source of type github_release.

## 11. Maintenance Pitfall · Maintenance Activity Unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed is not recorded.
- Impact on users: new, stalled, and active projects get mixed together, lowering trust in recommendations.
- Suggested check: backfill recent GitHub commit, release, and issue/PR response signals.
- Guardrail: while maintenance activity is unknown, recommendation strength must not be marked as high trust.
- Evidence: evidence.maintainer_signals | github_repo:570384908 | https://github.com/huggingface/peft | last_activity_observed missing

## 12. Security/Permissions Pitfall · Downstream Validation Flagged a Risk Item

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: downstream has already requested a review; this must not be downplayed on this page.
- Suggested check: enter the security/permissions governance review queue.
- Guardrail: while downstream risks exist, the review/recommendation downgrade must be kept in place.
- Evidence: downstream_validation.risk_items | github_repo:570384908 | https://github.com/huggingface/peft | no_demo; severity=medium

## 13. Security/Permissions Pitfall · Scoring Risk Present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: the risk affects whether the project is suitable for ordinary users to install.
- Suggested check: write the risk into the boundary card and confirm whether manual review is needed.
- Guardrail: scoring risks must go into the boundary card; they cannot remain internal scores only.
- Evidence: risks.scoring_risks | github_repo:570384908 | https://github.com/huggingface/peft | no_demo; severity=medium

## 14. Security/Permissions Pitfall · Source Evidence: 0.18.0: RoAd, ALoRA, Arrow, WaveFT, DeLoRA, OSF, and more

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: 0.18.0: RoAd, ALoRA, Arrow, WaveFT, DeLoRA, OSF, and more
- Impact on users: may affect authorization, key configuration, or security boundaries.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_b28315fbb2d44b748ca46f87fafd3d33 | https://github.com/huggingface/peft/releases/tag/v0.18.0 | the source discussion mentions Python-related conditions; re-verify before installing/trialing.

## 15. Security/Permissions Pitfall · Source Evidence: v0.15.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: v0.15.0
- Impact on users: may affect upgrades, migration, or version selection.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_1a3ce413d14349658dc005c25754bb1f | https://github.com/huggingface/peft/releases/tag/v0.15.0 | unverified usage conditions surfaced by a source of type github_release.

## 16. Security/Permissions Pitfall · Source Evidence: v0.19.0

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence points to an unverified security/permissions-related issue in this project: v0.19.0
- Impact on users: may affect authorization, key configuration, or security boundaries.
- Suggested check: the source suggests a fix, workaround, or version change may already exist; the manual must note the applicable version.
- Guardrail: do not inflate this into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_abcf15a2812744f0a37ad5c5d75643cf | https://github.com/huggingface/peft/releases/tag/v0.19.0 | unverified usage conditions surfaced by a source of type github_release.

## 17. Maintenance Pitfall · Issue/PR Response Quality Unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- Impact on users: users cannot tell whether anyone will respond when they hit problems.
- Suggested check: sample recent issues/PRs to judge whether they sit unhandled for long stretches.
- Guardrail: when issue/PR responsiveness is unknown, the maintenance risk must be flagged.
- Evidence: evidence.maintainer_signals | github_repo:570384908 | https://github.com/huggingface/peft | issue_or_pr_quality=unknown

## 18. Maintenance Pitfall · Release Cadence Unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- Impact on users: install commands and docs may lag behind the code, raising the odds that users hit pitfalls.
- Suggested check: confirm the latest release/tag matches the README install commands.
- Guardrail: when the release cadence is unknown or stale, install instructions must note possible drift.
- Evidence: evidence.maintainer_signals | github_repo:570384908 | https://github.com/huggingface/peft | release_recency=unknown

<!-- canonical_name: huggingface/peft; human_manual_source: deepwiki_human_wiki -->
