# HippoRAG Project Documentation (https://github.com/OSU-NLP-Group/HippoRAG)

Generated: 2026-05-16 23:41:44 UTC

## Table of Contents

- [Installation and Setup](#page-1)
- [Quick Start Guide](#page-2)
- [Configuration System](#page-3)
- [HippoRAG Core Class](#page-4)
- [Knowledge Graph and Retrieval](#page-5)
- [Embedding Store and Management](#page-6)
- [LLM Integrations](#page-7)
- [Embedding Models](#page-8)
- [Open Information Extraction (OpenIE)](#page-9)
- [Deployment Options](#page-10)

<a id='page-1'></a>

## Installation and Setup

### Related Pages

Related topics: [Configuration System](#page-3), [Deployment Options](#page-10)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [setup.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py)
- [requirements.txt](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/requirements.txt)
- [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
- [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)
- [test_transformers.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/test_transformers.py)
- [CONTRIBUTING.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/CONTRIBUTING.md)
</details>

# Installation and Setup

## Overview

HippoRAG is a graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge bases for improved retrieval and question answering. The installation process configures the necessary dependencies, environment variables, and model configurations to run HippoRAG in either cloud (OpenAI) or local (vLLM) deployment modes.

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## System Requirements

### Python Version

| Requirement | Version |
|-------------|---------|
| Python | >= 3.10 |

The package explicitly requires Python 3.10 or higher as specified in the `setup.py` configuration.

Sources: [setup.py:16](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py#L16)

### Hardware Requirements

| Component | Requirement |
|-----------|-------------|
| GPU | CUDA-compatible GPU(s) recommended |
| GPU Memory | Varies based on model size (see deployment sections) |

For local deployment with vLLM, the framework supports tensor parallelism across multiple GPUs. The README recommends reserving enough memory for embedding models when deploying LLM servers.

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Installation Methods

### Method 1: pip Installation (Recommended)

```sh
conda create -n hipporag python=3.10
conda activate hipporag
pip install hipporag
```

This method installs HippoRAG version 2.0.0-alpha.4 along with all core dependencies from PyPI.

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### Method 2: Source Installation (For Development)

```sh
git clone https://github.com/OSU-NLP-Group/HippoRAG.git
cd HippoRAG
pip install -e .
```

Clone the repository and install in editable mode to work with the latest source code.

Sources: [CONTRIBUTING.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/CONTRIBUTING.md)

## Environment Variables

Proper configuration of environment variables is essential for HippoRAG to function correctly. These variables control GPU allocation, model caching, and API access.

### Required Environment Variables

| Variable | Description | Example |
|----------|-------------|---------|
| `CUDA_VISIBLE_DEVICES` | Comma-separated list of GPU device IDs | `0,1,2,3` |
| `HF_HOME` | Path to Hugging Face cache directory | `/path/to/huggingface/home` |
| `OPENAI_API_KEY` | API key for OpenAI models (cloud mode only) | `sk-...` |

### Setting Environment Variables

```sh
# Set CUDA visible devices
export CUDA_VISIBLE_DEVICES=0,1,2,3

# Set Hugging Face cache location
export HF_HOME=<path to Huggingface home directory>

# Set OpenAI API key (required for cloud deployment)
export OPENAI_API_KEY=<your openai api key>
```

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Core Dependencies

HippoRAG depends on a comprehensive set of libraries for LLM inference, embedding models, graph processing, and data handling.

### Dependency Overview

| Package | Version | Purpose |
|---------|---------|---------|
| torch | 2.5.1 | PyTorch deep learning framework |
| transformers | 4.45.2 | Model architectures and tokenizers |
| vllm | 0.6.6.post1 | High-throughput LLM inference |
| openai | 1.91.1 | OpenAI API client |
| litellm | 1.73.1 | Unified LLM interface |
| gritlm | 1.0.2 | Embedding model |
| networkx | 3.4.2 | Graph data structures |
| python_igraph | 0.11.8 | Graph algorithms |
| tiktoken | 0.7.0 | Tokenization |
| pydantic | 2.10.4 | Data validation |
| tenacity | 8.5.0 | Retry logic |
| einops | (latest) | Tensor operations |
| tqdm | (latest) | Progress bars |
| boto3 | (latest) | AWS S3 integration |

Sources: [setup.py:17-32](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py#L17-L32), [requirements.txt](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/requirements.txt)

### Additional Dependencies

The `requirements.txt` file includes additional packages not pinned to specific versions:

| Package | Purpose |
|---------|---------|
| nest_asyncio | Asynchronous operations |
| numpy | Numerical computing |
| scipy | Scientific computing |

Sources: [requirements.txt](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/requirements.txt)

## Configuration

HippoRAG uses a Pydantic-based configuration system defined in `BaseConfig` within `config_utils.py`. This configuration controls all aspects of indexing, retrieval, and QA.

### Configuration Parameters

#### Embedding Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| embedding_model_name | nvidia/NV-Embed-v2 | Name of the embedding model |
| embedding_batch_size | 16 | Batch size for embedding encoding |
| embedding_return_as_normalized | True | Whether to normalize embeddings |
| embedding_max_seq_len | 2048 | Maximum sequence length for embeddings |
| embedding_model_dtype | auto | Data type for local embedding models |

#### Retrieval Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| retrieval_top_k | 200 | Number of documents to retrieve |
| linking_top_k | 5 | Number of linked nodes per retrieval step |
| damping | 0.5 | Damping factor for PPR algorithm |
| passage_node_weight | 0.05 | Weight modifier for passage nodes in PPR |

#### QA Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| max_qa_steps | 1 | Maximum steps for interleaved retrieval and reasoning |
| qa_top_k | 5 | Top k documents fed to QA model |

#### Graph Construction Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| synonymy_edge_topk | 2047 | K for KNN retrieval in synonymy edge building |
| synonymy_edge_sim_threshold | 0.8 | Similarity threshold for synonymy nodes |
| is_directed_graph | False | Whether the graph is directed |
| graph_type | facts_and_sim_passage_node_unidirectional | Type of graph structure |

#### Information Extraction Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| information_extraction_model_name | openie_openai_gpt | OpenIE model class name |
| openie_mode | online | Mode: "online" or "offline" |

#### Preprocessing Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| text_preprocessor_class_name | TextPreprocessor | Preprocessor class name |
| preprocess_encoder_name | gpt-4o | Encoder for preprocessing |
| preprocess_chunk_overlap_token_size | 128 | Overlap tokens between chunks |
| preprocess_chunk_max_token_size | None | Max tokens per chunk (None = whole doc) |
| preprocess_chunk_func | by_token | Chunking function type |

Sources: [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
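
The chunking behavior implied by the `preprocess_*` fields can be illustrated with a token-window splitter. This is a hedged sketch using `tiktoken` (a pinned dependency), not the actual `TextPreprocessor` implementation:

```python
import tiktoken

def chunk_by_token(text: str, max_tokens: int = 1024, overlap: int = 128,
                   encoder_name: str = 'gpt-4o') -> list[str]:
    """Token-window chunking with overlap, mirroring the preprocess_chunk_*
    settings above. Illustrative only; the real TextPreprocessor may differ."""
    enc = tiktoken.encoding_for_model(encoder_name)
    ids = enc.encode(text)
    chunks, start = [], 0
    while start < len(ids):
        chunks.append(enc.decode(ids[start:start + max_tokens]))
        if start + max_tokens >= len(ids):
            break
        # Slide the window forward, keeping `overlap` tokens of context.
        start += max_tokens - overlap
    return chunks
```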

## Deployment Modes

HippoRAG supports two primary deployment modes for LLM inference.

```mermaid
graph TD
    A[HippoRAG Deployment] --> B[Cloud Mode]
    A --> C[Local Mode]
    
    B --> B1[OpenAI API]
    B --> B2[OpenAI Compatible API]
    
    C --> C1[vLLM Server]
    C --> C1b[Local Embedding Model]
    
    B1 --> D[Requires OPENAI_API_KEY]
    B2 --> E[Custom LLM Base URL]
    C1 --> F[Multi-GPU Support]
```

### Cloud Mode (OpenAI)

Cloud mode uses OpenAI's API for both LLM and embedding inference.

```python
from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)
```

#### OpenAI Compatible Embeddings

For OpenAI-compatible embedding endpoints:

```python
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='Your LLM Model name',
    llm_base_url='Your LLM Model url',
    embedding_model_name='Your Embedding model name',
    embedding_base_url='Your Embedding model url'
)
```

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### Local Mode (vLLM)

Local mode deploys LLM servers using vLLM for offline inference with GPU acceleration.

#### Step 1: Start vLLM Server

```sh
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

vllm serve meta-llama/Llama-3.3-70B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578
```

#### Step 2: Run HippoRAG with Different GPUs

```sh
export CUDA_VISIBLE_DEVICES=2,3
export HF_HOME=<path to Huggingface home directory>
python main.py --dataset sample --llm_base_url http://localhost:6578/v1
```

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Quick Start Workflow

```mermaid
graph LR
    A[Install HippoRAG] --> B[Configure Environment]
    B --> C[Set Environment Variables]
    C --> D[Initialize HippoRAG]
    D --> E[index Documents]
    E --> F[RAG QA Queries]
```

### Complete Example

```python
from hipporag import HippoRAG

# Define documents
docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Cinderella attended the royal ball.",
    "The prince used the lost glass slipper to search the kingdom.",
    "Erik Hort's birthplace is Montebello.",
    "Montebello is a part of Rockland County."
]

# Initialize HippoRAG
hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Index documents
hipporag.index(docs)

# Define queries and gold standard answers
queries = [
    "What is George Rankin's occupation?",
    "How did Cinderella reach her happy ending?",
    "What county is Erik Hort's birthplace a part of?"
]

gold_docs = [
    ["George Rankin is a politician."],
    ["Cinderella attended the royal ball.",
     "The prince used the lost glass slipper to search the kingdom."],
    ["Montebello is a part of Rockland County."]
]

answers = [
    ["Politician"],
    ["By going to the ball."],
    ["Rockland County"]
]

# Run RAG QA
results = hipporag.rag_qa(
    queries=queries,
    gold_docs=gold_docs,
    gold_answers=answers
)
```

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Testing Your Installation

### OpenAI Test

Run this test to verify cloud mode functionality:

```sh
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
python tests_openai.py
```

### Local Test

Run this test to verify local vLLM mode:

```sh
export CUDA_VISIBLE_DEVICES=0
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 1 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

# Run local test
CUDA_VISIBLE_DEVICES=1 python tests_local.py
```

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Troubleshooting

### Out of Memory (OOM) Errors

If you encounter OOM errors during local deployment:

1. Reduce `gpu-memory-utilization` parameter in vLLM
2. Reduce `max_model_len` in vLLM server
3. Adjust `CUDA_VISIBLE_DEVICES` to use more GPUs
4. Reduce `embedding_batch_size` in configuration (see the sketch below)
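
For item 4, a minimal sketch that lowers the embedding batch size through `BaseConfig` (field names follow the Configuration section above; depending on your version, `BaseConfig` may require additional fields such as `corpus_len`):

```python
from hipporag import HippoRAG
from hipporag.utils.config_utils import BaseConfig

# Lower the embedding batch size (default 16) to reduce peak GPU memory.
# If your BaseConfig version requires corpus_len, pass it here as well.
config = BaseConfig(embedding_batch_size=4)

hipporag = HippoRAG(
    global_config=config,
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
)
```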

### Environment Variable Issues

Ensure all required environment variables are set before running HippoRAG:

```sh
# Verify environment variables are set
echo $CUDA_VISIBLE_DEVICES
echo $HF_HOME
echo $OPENAI_API_KEY
```

### Conda Environment

Always activate the correct conda environment before running commands:

```sh
conda activate hipporag
```

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Reproducing Experiments

To reproduce the paper's experiments:

1. Clone the repository and install dependencies
2. Download datasets from [HuggingFace](https://huggingface.co/datasets/osunlp/HippoRAG_v2) or use provided samples in `reproduce/dataset`
3. Set required environment variables
4. Run the main script with appropriate parameters:

```sh
# OpenAI model
python main.py \
    --dataset sample \
    --llm_base_url https://api.openai.com/v1 \
    --llm_name gpt-4o-mini \
    --embedding_name nvidia/NV-Embed-v2

# Local vLLM model
python main.py \
    --dataset sample \
    --llm_base_url http://localhost:6578/v1 \
    --llm_name meta-llama/Llama-3.3-70B-Instruct \
    --embedding_name nvidia/NV-Embed-v2
```

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md), [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)

---

<a id='page-2'></a>

## Quick Start Guide

### Related Pages

Related topics: [Installation and Setup](#page-1), [HippoRAG Core Class](#page-4)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
- [demo_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/demo_azure.py)
- [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)
- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
- [src/hipporag/HippoRAG.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/HippoRAG.py)
- [requirements.txt](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/requirements.txt)
</details>

# Quick Start Guide

This guide provides a comprehensive walkthrough for setting up and running HippoRAG, enabling you to quickly leverage neurobiologically inspired long-term memory capabilities for Large Language Models.

## Prerequisites

Before beginning, ensure your environment meets the following requirements:

| Requirement | Specification |
|------------|---------------|
| Python | >= 3.10 |
| CUDA GPUs | Required for local embedding model inference |
| HuggingFace Home | Configured via `HF_HOME` environment variable |
| API Keys | OpenAI API key (if using OpenAI models) |

### Environment Setup

```sh
# Create conda environment
conda create -n hipporag python=3.10
conda activate hipporag

# Install HippoRAG
pip install hipporag

# Configure environment variables
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
export OPENAI_API_KEY=<your openai api key>
```

Sources: README.md:150-165

## Core Usage Patterns

HippoRAG supports three primary deployment configurations. The initialization workflow follows this pattern:

```mermaid
graph TD
    A[Initialize HippoRAG] --> B{Select LLM Backend}
    B -->|OpenAI| C[Set llm_model_name + llm_base_url]
    B -->|vLLM| D[Set llm_model_name + llm_base_url]
    B -->|Azure| E[Set azure_endpoint]
    A --> F{Select Embedding Backend}
    F -->|HuggingFace| G[Set embedding_model_name]
    F -->|Custom| H[Set embedding_base_url]
```

Sources: demo_azure.py:1-30

### Pattern 1: OpenAI Models

The simplest configuration uses OpenAI for both LLM inference and embeddings:

```python
from hipporag import HippoRAG

# Configuration
save_dir = 'outputs'
llm_model_name = 'gpt-4o-mini'
embedding_model_name = 'nvidia/NV-Embed-v2'

# Initialize HippoRAG instance
hipporag = HippoRAG(
    save_dir=save_dir, 
    llm_model_name=llm_model_name,
    embedding_model_name=embedding_model_name
)
```

Sources: README.md:175-195

### Pattern 2: OpenAI Compatible Embeddings

For custom LLM endpoints that follow OpenAI's API format:

```python
hipporag = HippoRAG(
    save_dir=save_dir, 
    llm_model_name='Your LLM Model name',
    llm_base_url='Your LLM Model url',
    embedding_model_name='Your Embedding model name',  
    embedding_base_url='Your Embedding model url'
)
```

Sources: README.md:210-220

### Pattern 3: Azure OpenAI Integration

For Azure-hosted models:

```python
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name=llm_model_name,
    embedding_model_name=embedding_model_name,
    azure_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
    azure_embedding_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)
```

Sources: demo_azure.py:10-15

## Indexing Documents

The indexing process converts raw documents into HippoRAG's knowledge graph structure:

```mermaid
graph LR
    A[Raw Documents] --> B[Chunking]
    B --> C[OpenIE Extraction]
    C --> D[Embedding Generation]
    D --> E[Graph Construction]
    E --> F[Knowledge Graph Index]
```

### Input Data Format

Documents should be provided as a list of strings:

```python
docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Cinderella attended the royal ball.",
    "The prince used the lost glass slipper to search the kingdom.",
]
```

### Execute Indexing

```python
hipporag.index(docs=docs)
```

Sources: demo_azure.py:18-45

## Retrieval and Question Answering

The `rag_qa` method performs retrieval-augmented question answering:

```mermaid
graph TD
    A[Query Input] --> B[Retrieval]
    B --> C[Personalized PageRank]
    C --> D[Document Selection]
    D --> E[QA Generation]
    E --> F[Final Answer]
    
    C -.->|links documents| G[Knowledge Graph]
    G -.->|context| D
```

### Complete QA Example

```python
# Prepare queries and evaluation data
queries = [
    "What is George Rankin's occupation?",
    "How did Cinderella reach her happy ending?"
]

answers = [
    ["Politician"],
    ["By going to the ball."]
]

gold_docs = [
    ["George Rankin is a politician."],
    ["Cinderella attended the royal ball.",
     "The prince used the lost glass slipper to search the kingdom.",
     "When the slipper fit perfectly, Cinderella was reunited with the prince."]
]

# Execute RAG QA
results = hipporag.rag_qa(
    queries=queries, 
    gold_docs=gold_docs,
    gold_answers=answers
)

print(results)
```

Sources: README.md:195-215

## Local Deployment with vLLM

For running LLMs locally, HippoRAG supports vLLM server integration:

### Step 1: Start vLLM Server

```sh
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

conda activate hipporag

# Adjust gpu-memory-utilization and max_model_len based on your GPU memory
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578
```

Sources: README.md:225-240

### Step 2: Initialize HippoRAG with vLLM

```python
hipporag = HippoRAG(
    save_dir=save_dir, 
    llm_model_name='meta-llama/Llama-3.1-8B-Instruct',
    llm_base_url='http://localhost:6578/v1',
    embedding_model_name='nvidia/NV-Embed-v2'
)
```

## Reproducing Experiments

For reproducing published experiments, follow the structured workflow:

### Dataset Structure

| File Type | Naming Convention | Purpose |
|-----------|-------------------|---------|
| Corpus | `{dataset}_corpus.json` | Document collection |
| Queries | `{dataset}.json` | Questions with answers |
| Output | `outputs/{dataset}/` | Index and results |

### Corpus JSON Format

```json
[
  {
    "title": "FIRST PASSAGE TITLE",
    "text": "FIRST PASSAGE TEXT",
    "idx": 0
  },
  {
    "title": "SECOND PASSAGE TITLE",
    "text": "SECOND PASSAGE TEXT",
    "idx": 1
  }
]
```

Sources: README.md:100-125

### Running Experiments

```sh
# Set environment variables
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag

# Run with OpenAI model
dataset=sample
python main.py --dataset $dataset \
    --llm_base_url https://api.openai.com/v1 \
    --llm_name gpt-4o-mini \
    --embedding_name nvidia/NV-Embed-v2
```

Sources: main.py:1-35

## Testing Your Installation

### OpenAI Test

Verify installation with minimal OpenAI API cost:

```sh
export OPENAI_API_KEY=<your openai api key> 
conda activate hipporag
python tests_openai.py
```

### Local Test with vLLM

Test with a locally deployed model:

```sh
export CUDA_VISIBLE_DEVICES=0
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

conda activate hipporag

# Start vLLM server with smaller model
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 1 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

# Run test
CUDA_VISIBLE_DEVICES=1 python tests_local.py
```

Sources: README.md:250-280

## Configuration Parameters

### Core Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `save_dir` | `outputs` | Directory for saving all related information |
| `llm_model_name` | - | LLM model identifier |
| `llm_base_url` | - | Base URL for LLM API endpoint |
| `embedding_model_name` | `nvidia/NV-Embed-v2` | Embedding model identifier |
| `embedding_batch_size` | `16` | Batch size for embedding model |

Sources: src/hipporag/utils/config_utils.py:50-80

### Retrieval Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `retrieval_top_k` | `200` | Number of documents to retrieve initially |
| `linking_top_k` | `5` | Number of linked nodes at each retrieval step |
| `qa_top_k` | `5` | Number of documents fed to QA model |
| `max_qa_steps` | `1` | Maximum interleaved retrieval-reasoning steps |
| `damping` | `0.5` | Damping factor for Personalized PageRank |

Sources: src/hipporag/utils/config_utils.py:30-50

### Graph Construction Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `synonymy_edge_topk` | `2047` | K for KNN retrieval in synonymy edge building |
| `synonymy_edge_sim_threshold` | `0.8` | Similarity threshold for synonymy nodes |
| `graph_type` | `facts_and_sim_passage_node_unidirectional` | Type of graph structure to construct |
| `is_directed_graph` | `False` | Whether to build a directed graph |

Sources: src/hipporag/utils/config_utils.py:80-110

## Troubleshooting

### Common Issues

| Issue | Solution |
|-------|----------|
| CUDA OOM errors | Reduce `gpu-memory-utilization` or `max_model_len` in vLLM; reduce `embedding_batch_size` |
| Connection errors | Verify API endpoint URLs and network connectivity |
| Index loading failures | Check that `save_dir` contains valid index files |

### Environment Validation

Always verify your setup before running experiments:

```sh
# Verify CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# Verify package installation
pip list | grep hipporag
```

## Next Steps

- Explore the [Code Structure documentation](README.md#code-structure) for a deep dive into the modules
- Review experiment reproducibility guidelines in `main.py`
- Access pre-processed datasets from the [HuggingFace dataset page](https://huggingface.co/datasets/osunlp/HippoRAG_v2)

---

<a id='page-3'></a>

## Configuration System

### Related Pages

Related topics: [Installation and Setup](#page-1), [HippoRAG Core Class](#page-4)

<details>
<summary>Related Source Files</summary>

The following source files were used to generate this documentation page:

- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
- [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)
- [setup.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py)
- [tests_openai.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/tests_openai.py)
- [test_transformers.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/test_transformers.py)
</details>

# Configuration System

HippoRAG provides a comprehensive configuration system built on Pydantic's data validation framework. The configuration system enables fine-grained control over all aspects of the indexing, retrieval, and QA pipeline while maintaining type safety and default values for common use cases.

## Architecture Overview

The configuration system is centered around the `BaseConfig` class defined in `config_utils.py`. This class uses Pydantic's `BaseModel` with `Field` definitions to provide structured configuration with metadata and validation.

```mermaid
graph TD
    A[BaseConfig] --> B[OpenIE Configuration]
    A --> C[Embedding Configuration]
    A --> D[Graph Construction Configuration]
    A --> E[Retrieval Configuration]
    A --> F[QA Configuration]
    A --> G[Save/Directory Configuration]
    A --> H[Dataset Configuration]
    
    I[main.py] --> A
    J[HippoRAG class] --> A
    K[StandardRAG class] --> A
```

Sources: src/hipporag/utils/config_utils.py:1-100

## Core Configuration Class

### BaseConfig

The `BaseConfig` class serves as the single source of truth for all pipeline parameters. It inherits from Pydantic's `BaseModel` and provides automatic validation, serialization, and documentation through field metadata.

```python
from hipporag.utils.config_utils import BaseConfig

global_config = BaseConfig(
    openie_mode='openai_gpt',
    information_extraction_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
    retrieval_top_k=200,
    linking_top_k=5,
    max_qa_steps=3,
    qa_top_k=5,
    graph_type="facts_and_sim_passage_node_unidirectional",
    embedding_batch_size=8
)
```

Sources: main.py:20-35

## Configuration Categories

### OpenIE (Open Information Extraction) Configuration

Controls the information extraction module that identifies facts and entities from passages.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `openie_mode` | `Literal["openai_gpt", "vllm_offline", "Transformers-offline"]` | `"openai_gpt"` | The mode for information extraction model |
| `information_extraction_model_name` | `str` | `"gpt-4o-mini"` | Model name for information extraction |

The `openie_mode` parameter supports three execution modes:
- **`openai_gpt`**: Uses OpenAI's GPT models for extraction via API
- **`vllm_offline`**: Uses locally deployed LLMs through vLLM server
- **`Transformers-offline`**: Uses HuggingFace Transformers models directly

Sources: src/hipporag/utils/config_utils.py

### Embedding Model Configuration

Manages embedding generation for passages and queries.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding_model_name` | `str` | `"nvidia/NV-Embed-v2"` | Name of the embedding model |
| `embedding_batch_size` | `int` | `16` | Batch size for embedding generation |
| `embedding_return_as_normalized` | `bool` | `True` | Whether to normalize embeddings |
| `embedding_max_seq_len` | `int` | `2048` | Maximum sequence length for embedding model |
| `embedding_model_dtype` | `Literal["float16", "float32", "bfloat16", "auto"]` | `"auto"` | Data type for local embedding model |
| `embedding_base_url` | `Optional[str]` | `None` | Base URL for OpenAI-compatible embedding endpoints |

Sources: src/hipporag/utils/config_utils.py

### Graph Construction Configuration

Controls the knowledge graph construction process that forms the backbone of HippoRAG's memory system.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `synonymy_edge_topk` | `int` | `2047` | K value for KNN retrieval in building synonymy edges |
| `synonymy_edge_query_batch_size` | `int` | `1000` | Batch size for query embeddings during KNN retrieval |
| `synonymy_edge_key_batch_size` | `int` | `10000` | Batch size for key embeddings during KNN retrieval |
| `synonymy_edge_sim_threshold` | `float` | `0.8` | Similarity threshold for including candidate synonymy nodes |
| `is_directed_graph` | `bool` | `False` | Whether the constructed graph is directed or undirected |
| `graph_type` | `str` | `"facts_and_sim_passage_node_unidirectional"` | Type of graph structure to build |

Supported `graph_type` values include:
- `facts_and_sim_passage_node_unidirectional` - Passages connected via facts with similarity edges
- `facts_and_sim_passage_node_bidirectional` - Bidirectional passage connections
- `facts_only` - Only fact-based connections
- `sim_passage_node` - Only passage similarity connections

Sources: src/hipporag/utils/config_utils.py

### Retrieval Configuration

Parameters governing the retrieval and linking process using Personalized PageRank (PPR).

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `linking_top_k` | `int` | `5` | Number of linked nodes at each retrieval step |
| `retrieval_top_k` | `int` | `200` | Number of documents to retrieve at each step |
| `damping` | `float` | `0.5` | Damping factor for PPR algorithm |

The `damping` parameter controls the probability of continuing the random walk along graph edges, versus teleporting back to the query's seed nodes, in PPR. Higher values (closer to 1.0) let relevance propagate further through the graph, while lower values keep scores concentrated near the seed nodes.

Sources: src/hipporag/utils/config_utils.py, main.py:28
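
To make the damping factor concrete, here is a schematic power-iteration version of PPR. HippoRAG depends on `python_igraph` for its graph algorithms, so this standalone NumPy sketch is purely illustrative:

```python
import numpy as np

def personalized_pagerank(A: np.ndarray, seeds: np.ndarray,
                          damping: float = 0.5, iters: int = 50) -> np.ndarray:
    """Schematic PPR power iteration. A[i, j] = 1 if there is an edge
    j -> i. Dangling nodes simply leak mass; good enough for a sketch."""
    col_sums = A.sum(axis=0, keepdims=True)
    P = A / np.where(col_sums == 0, 1.0, col_sums)   # column-stochastic
    s = seeds / seeds.sum()                          # reset distribution
    r = s.copy()
    for _ in range(iters):
        r = damping * (P @ r) + (1 - damping) * s
    return r

# Toy chain 0 -> 1 -> 2 seeded on node 0: scores decay along the chain.
A = np.array([[0., 0., 0.],
              [1., 0., 0.],
              [0., 1., 0.]])
print(personalized_pagerank(A, seeds=np.array([1., 0., 0.])))
```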

### QA (Question Answering) Configuration

Controls the iterative QA process that interleaves retrieval with reasoning.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `max_qa_steps` | `int` | `1` | Maximum steps for interleaved retrieval and reasoning |
| `qa_top_k` | `int` | `5` | Number of top documents fed to the QA model |

The `max_qa_steps` parameter enables multi-step reasoning where the system can retrieve additional documents based on intermediate reasoning results before producing the final answer.

Sources: src/hipporag/utils/config_utils.py, main.py:27
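
A schematic of the interleaved loop follows; `retrieve` and `answer_with_llm` are hypothetical stand-ins for HippoRAG's internal retrieval and reader calls, not public APIs:

```python
def iterative_qa(query, retrieve, answer_with_llm,
                 max_qa_steps=3, qa_top_k=5):
    """Sketch of interleaved retrieval and reasoning bounded by
    max_qa_steps; both callables are hypothetical stand-ins."""
    context = []
    answer = None
    for _ in range(max_qa_steps):
        # Retrieve more evidence, conditioned on what has been read so far.
        context += retrieve(query, top_k=qa_top_k, exclude=context)
        answer, is_final = answer_with_llm(query, context)
        if is_final:  # the reader signals it has enough evidence
            break
    return answer
```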

### LLM Configuration

Manages the language model used for QA and information extraction.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `llm_model_name` | `str` | `"gpt-4o-mini"` | Name of the LLM |
| `llm_base_url` | `Optional[str]` | `None` | Base URL for OpenAI-compatible LLM endpoints |
| `max_new_tokens` | `Optional[int]` | `None` | Maximum new tokens for generation |

Sources: src/hipporag/utils/config_utils.py

### Save and Directory Configuration

Controls output persistence and directory structure.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `save_dir` | `str` | `"outputs"` | Top-level directory for saving all related information |
| `corpus_len` | `int` | Required | Length of the corpus being processed |

The `save_dir` parameter specifies where HippoRAG objects, intermediate results, and evaluation outputs are stored. When a specific dataset is used, outputs are written to a dataset-specific subdirectory under `save_dir`.

Sources: src/hipporag/utils/config_utils.py, main.py:32

## Configuration Workflow

```mermaid
graph LR
    A[Define BaseConfig] --> B[Initialize HippoRAG]
    B --> C[Index Documents]
    C --> D[Run RAG QA]
    D --> E[Results Saved to save_dir]
    
    F[Modify Config] -->|Update| B
    G[New Documents] -->|Index| C
```

### Initialization Example

```python
from hipporag.utils.config_utils import BaseConfig
from hipporag import HippoRAG

config = BaseConfig(
    openie_mode='openai_gpt',
    information_extraction_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
    retrieval_top_k=200,
    linking_top_k=5,
    max_qa_steps=3,
    qa_top_k=5,
    graph_type="facts_and_sim_passage_node_unidirectional",
    embedding_batch_size=8,
    max_new_tokens=None,
    corpus_len=len(corpus),
)

hipporag = HippoRAG(global_config=config)
```

Sources: main.py:19-38

## Configuration for Different Execution Modes

### OpenAI API Mode

```python
config = BaseConfig(
    openie_mode='openai_gpt',
    information_extraction_model_name='gpt-4o-mini',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
)
```

Sources: main.py:20-26

### Local vLLM Deployment Mode

```python
config = BaseConfig(
    openie_mode='vllm_offline',
    information_extraction_model_name='meta-llama/Llama-3.1-8B-Instruct',
    llm_model_name='meta-llama/Llama-3.3-70B-Instruct',
    llm_base_url='http://localhost:8000/v1',
    embedding_model_name='nvidia/NV-Embed-v2',
)
```

Sources: README.md

### Transformers Offline Mode

```python
config = BaseConfig(
    openie_mode='Transformers-offline',
    information_extraction_model_name='Transformers/Qwen/Qwen2.5-7B-Instruct',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
)
```

Sources: test_transformers.py:16-20

## Testing with Configuration

The test suite demonstrates configuration usage across different scenarios:

```python
# tests_openai.py - Basic indexing and QA
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# tests_openai.py - Document deletion
hipporag.delete(docs_to_delete)

# test_transformers.py - Transformers offline mode
hipporag = HippoRAG(
    global_config=global_config,
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
)
```

Sources: tests_openai.py, test_transformers.py:16-25

## Package Dependencies

The configuration system depends on the following packages specified in `setup.py`:

| Package | Version | Purpose |
|---------|---------|---------|
| `torch` | `2.5.1` | PyTorch backend for models |
| `transformers` | `4.45.2` | HuggingFace Transformers |
| `pydantic` | `2.10.4` | Data validation and settings |
| `vllm` | `0.6.6.post1` | LLM inference server |
| `openai` | `1.91.1` | OpenAI API client |
| `litellm` | `1.73.1` | Unified LLM interface |
| `gritlm` | `1.0.2` | GritLM embedding model |
| `networkx` | `3.4.2` | Graph operations |
| `python_igraph` | `0.11.8` | Graph algorithms |
| `tiktoken` | `0.7.0` | Tokenization |
| `tenacity` | `8.5.0` | Retry logic |

Sources: setup.py:14-27

## Best Practices

1. **Use environment variables** for sensitive configuration like API keys:
   ```bash
   export OPENAI_API_KEY=<your_openai_api_key>
   export HF_HOME=<path_to_huggingface_home>
   ```

2. **Set GPU devices** before initialization:
   ```bash
   export CUDA_VISIBLE_DEVICES=0,1,2,3
   ```

3. **Adjust batch sizes** based on available GPU memory when using local models

4. **Configure damping factor** carefully for retrieval - higher values (0.7-0.85) work better for complex multi-hop questions

5. **Set corpus_len** correctly to enable proper progress tracking and memory management

---

<a id='page-4'></a>

## HippoRAG Core Class

### Related Pages

Related topics: [Knowledge Graph and Retrieval](#page-5), [Embedding Models](#page-8)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
- [setup.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py)
- [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)
- [main_dpr.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main_dpr.py)
- [tests_openai.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/tests_openai.py)
</details>

# HippoRAG Core Class

## Overview

HippoRAG is a neurobiologically inspired graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge for improved retrieval and question answering. The project implements two primary RAG classes: `HippoRAG` (the neurobiologically inspired, knowledge-graph-based approach) and `StandardRAG` (a traditional DPR-based baseline).

Sources: setup.py:8-9

## Architecture Overview

```mermaid
graph TB
    subgraph "Input Layer"
        Docs[Documents/Passages]
        Queries[User Queries]
    end
    
    subgraph "HippoRAG Core"
        Index[Indexing Pipeline]
        Retrieve[Retrieval Pipeline]
        QA[Question Answering]
    end
    
    subgraph "Knowledge Graph Construction"
        OpenIE[OpenIE Information Extraction]
        Embed[Embedding Model]
        GraphBuild[Graph Building]
    end
    
    subgraph "Backend Services"
        LLM[LLM Inference]
        EmbedModel[Embedding Service]
    end
    
    Docs --> Index
    Index --> OpenIE
    Index --> Embed
    OpenIE --> GraphBuild
    Embed --> GraphBuild
    GraphBuild --> KG[Knowledge Graph]
    
    Queries --> Retrieve
    Retrieve --> KG
    KG --> QA
    QA --> LLM
    Retrieve --> EmbedModel
```

## Core Classes

### HippoRAG Class

The `HippoRAG` class is the main entry point for the neurobiologically inspired RAG system. It extends a base RAG implementation with knowledge graph construction and Personalized PageRank (PPR) retrieval.

**Initialization Parameters**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `save_dir` | `str` | Required | Directory to save all related information |
| `llm_model_name` | `str` | Required | LLM model identifier (e.g., `gpt-4o-mini`) |
| `embedding_model_name` | `str` | Required | Embedding model name (e.g., `nvidia/NV-Embed-v2`) |
| `global_config` | `BaseConfig` | `None` | Full configuration object |
| `llm_base_url` | `str` | `None` | Custom LLM API endpoint for OpenAI-compatible models |
| `embedding_base_url` | `str` | `None` | Custom embedding API endpoint |
| `azure_endpoint` | `str` | `None` | Azure OpenAI endpoint for LLM |
| `azure_embedding_endpoint` | `str` | `None` | Azure OpenAI endpoint for embeddings |

Sources: main.py:19-28

**Basic Usage Pattern**

```python
from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Index documents
hipporag.index(docs=documents_list)

# Retrieve and answer queries
results = hipporag.rag_qa(
    queries=query_list,
    gold_docs=expected_documents,
    gold_answers=expected_answers
)
```

### StandardRAG Class

The `StandardRAG` class provides traditional Dense Passage Retrieval (DPR) based RAG without the knowledge graph components. This is useful for baseline comparisons.

Sources: main_dpr.py:19

## Configuration System

### BaseConfig Parameters

The `BaseConfig` class (defined in `src/hipporag/utils/config_utils.py`) provides comprehensive configuration options:

**OpenIE Configuration**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `openie_mode` | `str` | `openai_gpt` | OpenIE mode: `openai_gpt`, `vllm_offline`, or `Transformers-offline` |
| `information_extraction_model_name` | `str` | `None` | Model for offline OpenIE (e.g., `Qwen/Qwen2.5-7B-Instruct`) |

**Embedding Configuration**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding_batch_size` | `int` | `16` | Batch size for embedding model inference |
| `embedding_return_as_normalized` | `bool` | `True` | Whether to normalize embeddings |
| `embedding_max_seq_len` | `int` | `2048` | Maximum sequence length for embedding |
| `embedding_model_dtype` | `str` | `"auto"` | Data type: `float16`, `float32`, `bfloat16`, or `auto` |

**Graph Construction Configuration**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `synonymy_edge_topk` | `int` | `2047` | K value for KNN retrieval in synonymy edge construction |
| `synonymy_edge_query_batch_size` | `int` | `1000` | Batch size for query embeddings |
| `synonymy_edge_key_batch_size` | `int` | `10000` | Batch size for key embeddings |
| `synonymy_edge_sim_threshold` | `float` | `0.8` | Similarity threshold for synonymy edges |
| `is_directed_graph` | `bool` | `False` | Whether the graph is directed |

**Retrieval Configuration**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `retrieval_top_k` | `int` | `200` | Number of documents to retrieve initially |
| `linking_top_k` | `int` | `5` | Number of linked nodes at each retrieval step |
| `damping` | `float` | `0.5` | Damping factor for Personalized PageRank |

**QA Configuration**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `max_qa_steps` | `int` | `1` | Maximum interleaved retrieval and reasoning steps |
| `qa_top_k` | `int` | `5` | Top k documents fed to QA model |

Sources: src/hipporag/utils/config_utils.py:1-80

## Core Methods

### Indexing Pipeline

```mermaid
graph LR
    A[Documents] --> B[Passage Embedding]
    B --> C[OpenIE Extraction]
    C --> D[Fact Node Creation]
    D --> E[Similarity Edge Building]
    E --> F[Knowledge Graph]
```

**Method Signature**

```python
def index(self, docs: List[str], **kwargs) -> None
```

The indexing process:
1. Embeds passages using the configured embedding model
2. Runs OpenIE to extract factual triples from each passage
3. Constructs fact nodes and passage nodes in the knowledge graph
4. Builds synonymy edges based on embedding similarity
5. Persists the graph structure to `save_dir`

### RAG QA Pipeline

```mermaid
graph TD
    Q[Query] --> EP[Embedding]
    EP --> PPR[Personalized PageRank]
    PPR --> LN[Linked Nodes]
    LN --> LLM[LLM Reasoning]
    LLM -->|Iteration| Check{More Steps?}
    Check -->|Yes| EP
    Check -->|No| Final[Final Answer]
```

**Method Signature**

```python
def rag_qa(
    self,
    queries: List[str],
    gold_docs: Optional[List[List[str]]] = None,
    gold_answers: Optional[List[List[str]]] = None,
    **kwargs
) -> Dict
```

**Parameters**

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `queries` | `List[str]` | Yes | List of questions to answer |
| `gold_docs` | `List[List[str]]` | No | Ground truth documents for evaluation |
| `gold_answers` | `List[List[str]]` | No | Ground truth answers for evaluation |

**Returns**

A dictionary containing evaluation metrics and retrieved results.

### Document Deletion

```python
def delete(self, docs_to_delete: List[str]) -> None
```

Removes specified documents from the knowledge graph and updates persistence.
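
For example, reusing a document indexed earlier on this page:

```python
# Strings must exactly match previously indexed documents.
hipporag.delete(["Oliver Badman is a politician."])
```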

## Supported Backend Models

### LLM Backends

| Backend | Configuration | Example Model |
|---------|---------------|---------------|
| OpenAI | `llm_model_name` | `gpt-4o-mini` |
| Azure OpenAI | `azure_endpoint` | Azure deployment URL |
| vLLM (Local) | `llm_base_url` + vLLM server | `meta-llama/Llama-3.1-8B-Instruct` |
| OpenAI-Compatible | `llm_model_name` + `llm_base_url` | Custom endpoint |

Sources: README.md:80-95

### Embedding Models

| Model Type | Configuration | Notes |
|------------|---------------|-------|
| NV-Embed-v2 | `embedding_model_name='nvidia/NV-Embed-v2'` | Recommended |
| GritLM | `embedding_model_name='GritLM'` | Supported |
| Contriever | `embedding_model_name='Contriever'` | Supported |
| Azure Embeddings | `azure_embedding_endpoint` | Via Azure OpenAI |
| Custom OpenAI-Compatible | `embedding_base_url` | Any compatible endpoint |

## OpenIE Modes

HippoRAG supports three OpenIE (Open Information Extraction) modes:

| Mode | Description | Use Case |
|------|-------------|----------|
| `openai_gpt` | Uses OpenAI GPT models for extraction | Cloud-based, high quality |
| `vllm_offline` | Uses locally deployed vLLM models | GPU-equipped servers |
| `Transformers-offline` | Uses HuggingFace Transformers | CPU or limited GPU |

Sources: test_transformers.py:20-22

## Workflow Example

```python
from hipporag import HippoRAG

# Initialize
hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Prepare data
docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Cinderella attended the royal ball."
]

# Index
hipporag.index(docs=docs)

# Query
queries = ["What is George Rankin's occupation?"]
answers = [["Politician"]]
gold_docs = [["George Rankin is a politician."]]

# Retrieve and evaluate
results = hipporag.rag_qa(
    queries=queries,
    gold_docs=gold_docs,
    gold_answers=answers
)
```

## Graph Types

The framework supports configurable graph structures:

| Graph Type | Description |
|------------|-------------|
| `facts_and_sim_passage_node_unidirectional` | Facts with similarity-based passage connections (default) |

Graph edges include:
- **Fact-to-Fact edges**: Created from OpenIE extractions
- **Synonymy edges**: Based on embedding similarity above threshold
- **Passage edges**: Connect passages to their extracted facts

## Dependencies

Key package dependencies managed in `setup.py`:

| Package | Version | Purpose |
|---------|---------|---------|
| `torch` | `2.5.1` | Deep learning framework |
| `transformers` | `4.45.2` | Model architectures |
| `vllm` | `0.6.6.post1` | LLM inference |
| `openai` | `1.91.1` | OpenAI API client |
| `gritlm` | `1.0.2` | GritLM embedding model |
| `networkx` | `3.4.2` | Graph operations |
| `python_igraph` | `0.11.8` | Graph algorithms |
| `pydantic` | `2.10.4` | Configuration validation |
| `tiktoken` | `0.7.0` | Tokenization |

Sources: setup.py:15-30

## Error Handling

The framework uses `tenacity` for retry mechanisms with configurable backoff strategies when interacting with external APIs (OpenAI, Azure, vLLM).
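
A typical `tenacity` pattern looks like the following; this illustrates the approach, not HippoRAG's exact decorators or settings:

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(5),
       wait=wait_exponential(multiplier=1, max=30))
def call_llm(client, **request_kwargs):
    """Retry transient API failures with exponential backoff."""
    return client.chat.completions.create(**request_kwargs)
```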

## Persistence

All indexed data is persisted to the `save_dir` directory with the following structure:

```
save_dir/
└── {llm_model_name}_{embedding_model_name}/
    ├── knowledge_graph.pkl       # Serialized graph
    ├── passages.pkl              # Passage embeddings
    ├── fact_nodes.pkl            # Extracted facts
    └── config.json                # Configuration snapshot
```

---

<a id='page-5'></a>

## Knowledge Graph and Retrieval

### Related Pages

Related topics: [Embedding Store and Management](#page-6), [LLM Integrations](#page-7)

<details>
<summary>Related source files</summary>

The following source files were used to generate this page:

- [src/hipporag/utils/embed_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/embed_utils.py)
- [src/hipporag/utils/qa_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/qa_utils.py)
- [src/hipporag/rerank.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/rerank.py)
- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
- [src/hipporag/llm/__init__.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/llm/__init__.py)
- [src/hipporag/evaluation/__init__.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/evaluation/__init__.py)
- [src/hipporag/retrieval/__init__.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/retrieval/__init__.py)
</details>

# Knowledge Graph and Retrieval

## Overview

HippoRAG implements a neurobiologically inspired retrieval system that combines knowledge graph construction with advanced retrieval algorithms. The system is designed to enable LLMs to identify and leverage connections within new knowledge for improved retrieval performance. Sources: setup.py:8

The Knowledge Graph and Retrieval module forms the core of HippoRAG's architecture, providing mechanisms to:

- Extract factual knowledge from text passages using Open Information Extraction (OpenIE)
- Construct heterogeneous graphs with multiple node and edge types
- Perform personalized PageRank (PPR) based retrieval over the constructed graphs
- Support incremental updates and document deletion operations

Sources: src/hipporag/utils/config_utils.py:48-72

## Architecture

### High-Level System Design

HippoRAG's retrieval system integrates several key components working in concert to provide accurate and efficient knowledge retrieval:

```mermaid
graph TD
    A[Input Documents] --> B[OpenIE Processing]
    B --> C[Knowledge Graph Construction]
    C --> D[Embedding Generation]
    D --> E[Synonymy Edge Building]
    C --> F[Hybrid Graph]
    
    G[Query Input] --> H[Query Embedding]
    H --> I[Personalized PageRank]
    I --> F
    F --> J[Retrieval Results]
    J --> K[Reranking]
    K --> L[Final QA Output]
    
    M[LLM Inference] --> L
```

### Graph Construction Pipeline

The graph construction process transforms raw text into a structured knowledge representation:

```mermaid
graph LR
    A[Passages] --> B[OpenIE Extractor]
    B --> C[Triplets/Entities]
    C --> D[Fact Nodes]
    
    E[Passages] --> F[Embedding Model]
    F --> G[Passage Embeddings]
    G --> H[Passage Nodes]
    
    D --> I[Passage-Fact Edges]
    H --> I
    
    G --> J[Synonymy Edges]
    J --> K[knn Retrieval]
    K --> L[Similarity Threshold Filter]
    L --> M[Synonymy Edge Network]
```

## Knowledge Graph Components

### Node Types

| Node Type | Description | Attributes |
|-----------|-------------|------------|
| Passage Nodes | Represent original text passages | idx, title, text, embedding |
| Fact Nodes | Extracted facts/triplets from OpenIE | subject, predicate, object, embedding |

### Edge Types

| Edge Type | Source | Target | Purpose |
|-----------|--------|--------|---------|
| Passage-to-Fact | Passage Node | Fact Node | Links passages to their extracted facts |
| Fact-to-Fact | Fact Node | Fact Node | Connects semantically related facts |
| Synonymy | Passage Node | Passage Node | Links passages with high semantic similarity |
| Bidirectional | Any node | Any node | Edge added in both directions (used by bidirectional graph types) |

Sources: src/hipporag/utils/config_utils.py:70-85
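
A toy illustration of these node and edge types using `networkx` (a project dependency); the attribute names (`kind`, `triple`, `weight`) are assumptions for illustration, not HippoRAG's actual graph schema:

```python
import networkx as nx

G = nx.Graph()
G.add_node("passage_0", kind="passage",
           text="Montebello is a part of Rockland County.")
G.add_node("passage_1", kind="passage",
           text="Erik Hort's birthplace is Montebello.")
G.add_node("fact_0", kind="fact",
           triple=("Montebello", "part of", "Rockland County"))
G.add_edge("passage_0", "fact_0", kind="passage_to_fact")
G.add_edge("passage_0", "passage_1", kind="synonymy", weight=0.83)

print(G.nodes(data=True))
```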

### Graph Types Configuration

The system supports multiple graph configurations via the `graph_type` parameter:

| Graph Type | Description |
|------------|-------------|
| `facts_and_sim_passage_node_unidirectional` | Facts + similar passage nodes, unidirectional edges |
| `facts_and_sim_passage_node_bidirectional` | Facts + similar passage nodes, bidirectional edges |
| Custom types | Extensible graph construction patterns |

Sources: main.py:18

## Retrieval Process

### Personalized PageRank (PPR) Algorithm

HippoRAG uses Personalized PageRank for graph-based retrieval, which allows queries to propagate through the knowledge graph to identify relevant nodes.

```mermaid
graph TD
    A[Query] --> B[Query Embedding]
    B --> C[Initial PPR Scores]
    C --> D[Graph Propagation]
    D --> E{Iteration}
    E -->|Continue| F[Score Aggregation]
    F --> D
    E -->|Converge| G[Top-K Selection]
    G --> H[Linked Nodes]
    
    I[damping factor: 0.5] --> D
    J[linking_top_k: 5] --> G
```

### Retrieval Configuration Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `retrieval_top_k` | 200 | Number of documents retrieved at each step |
| `linking_top_k` | 5 | Number of linked nodes at each retrieval step |
| `damping` | 0.5 | Damping factor for PPR algorithm |
| `qa_top_k` | 5 | Top-k documents fed to QA model |

Sources: src/hipporag/utils/config_utils.py:60-72

### Synonymy Edge Construction

Synonymy edges connect passages with high semantic similarity, enabling cross-document retrieval:

```mermaid
graph TD
    A[All Passage Embeddings] --> B[KNN Retrieval]
    B --> C[Top-K Candidates]
    C --> D{Similarity > Threshold?}
    D -->|Yes| E[Create Synonymy Edge]
    D -->|No| F[Discard]
    E --> G[Synonymy Edge Network]
```

#### Synonymy Edge Parameters

| Parameter | Default | Description |
|-----------|---------|-------------|
| `synonymy_edge_topk` | 2047 | k for knn retrieval in building synonymy edges |
| `synonymy_edge_query_batch_size` | 1000 | Batch size for query embeddings |
| `synonymy_edge_key_batch_size` | 10000 | Batch size for key embeddings |
| `synonymy_edge_sim_threshold` | 0.8 | Similarity threshold for candidate synonymy nodes |

Sources: src/hipporag/utils/config_utils.py:73-85
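
A minimal NumPy sketch of this KNN-plus-threshold procedure, assuming pre-normalized embeddings; the actual implementation processes queries and keys in batches (`synonymy_edge_query_batch_size`, `synonymy_edge_key_batch_size`) to bound memory:

```python
import numpy as np

def build_synonymy_edges(emb: np.ndarray, topk: int = 2047,
                         threshold: float = 0.8) -> list[tuple[int, int]]:
    """Connect each node to its top-k most similar neighbors whose
    similarity clears the threshold. Rows of emb must be L2-normalized."""
    sims = emb @ emb.T                        # cosine similarities
    np.fill_diagonal(sims, -1.0)              # exclude self-matches
    edges = []
    for i in range(emb.shape[0]):
        neighbors = np.argsort(-sims[i])[:topk]
        for j in neighbors:
            if sims[i, j] >= threshold:
                edges.append((i, int(j)))
    return edges
```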

## Embedding Integration

### Embedding Model Configuration

| Parameter | Default | Description |
|-----------|---------|-------------|
| `embedding_model_name` | - | Name of the embedding model |
| `embedding_batch_size` | 16 | Batch size for embedding calls |
| `embedding_return_as_normalized` | True | Whether to normalize embeddings |
| `embedding_max_seq_len` | 2048 | Maximum sequence length |
| `embedding_model_dtype` | auto | Data type for local models (float16/float32/bfloat16/auto) |

Sources: src/hipporag/utils/config_utils.py:40-54

### Supported Embedding Models

The system integrates with multiple embedding model providers:

- **NV-Embed-v2**: NVIDIA's embedding model
- **GritLM**: GritLM embedding model
- **Contriever**: Facebook's dense retriever
- **OpenAI Compatible**: Any OpenAI-compatible embedding endpoint
- **Azure OpenAI**: Azure-hosted embedding models

## Reranking Module

After initial retrieval, HippoRAG applies reranking to improve result quality. The reranking module reorders retrieved candidates using additional scoring mechanisms.

```mermaid
graph LR
    A[Retrieved Candidates] --> B[Reranker Model]
    B --> C[Relevance Scores]
    C --> D[Ranked Results]
    D --> E[Top Results]
```

Sources: src/hipporag/rerank.py

## QA Integration

### Multi-Step Retrieval and Reasoning

HippoRAG supports interleaved retrieval and reasoning with configurable steps:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_qa_steps` | 1 | Maximum steps for interleaved retrieval and reasoning |
| `qa_top_k` | 5 | Number of documents for QA model to process |

Sources: src/hipporag/utils/config_utils.py:68-72

### QA Pipeline Flow

```mermaid
graph TD
    A[Query] --> B[QA Step 1]
    B --> C[Retrieval]
    C --> D[Read Documents]
    D --> E{More Steps Needed?}
    E -->|Yes| F[Update Context]
    F --> B
    E -->|No| G[Final Answer]
    
    H[gold_docs] --> I[Evaluation]
    I --> J[Metrics]
    J --> K[Recall, EM, F1]
```

## Data Formats

### Corpus JSON Structure

```json
[
  {
    "title": "PASSAGE TITLE",
    "text": "PASSAGE TEXT",
    "idx": 0
  }
]
```

### Query JSON Structure

```json
[
  {
    "id": "question_id",
    "question": "QUESTION TEXT",
    "answer": ["ANSWER"],
    "answerable": true,
    "paragraphs": [
      {
        "title": "SUPPORTING TITLE",
        "text": "SUPPORTING TEXT",
        "is_supporting": true,
        "idx": 0
      }
    ]
  }
]
```
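
A hypothetical loading snippet for the corpus format above; the `{dataset}_corpus.json` path follows the naming convention from the Quick Start page, and joining title and text with a newline is an assumption about how passages are passed to `index()`:

```python
import json

with open('reproduce/dataset/sample_corpus.json') as f:
    corpus = json.load(f)

# One string per passage; the title/text join is an assumption.
docs = [f"{p['title']}\n{p['text']}" for p in corpus]
```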

## Usage Examples

### Basic Retrieval with HippoRAG

```python
from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Index documents
docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Erik Hort's birthplace is Montebello.",
    "Montebello is a part of Rockland County."
]

hipporag.index(docs)

# Query with evaluation
queries = ["What is George Rankin's occupation?"]
gold_docs = [["George Rankin is a politician."]]
answers = [["Politician"]]

results = hipporag.rag_qa(
    queries=queries,
    gold_docs=gold_docs,
    gold_answers=answers
)
```

Sources: README.md (Quick Start), tests_openai.py:22-60

### Incremental Updates

```python
# Add new documents
new_docs = [
    "Tom Hort's birthplace is Montebello.",
    "Sam Hort's birthplace is Montebello."
]
hipporag.index(docs=new_docs)

# Delete documents
docs_to_delete = [
    "Tom Hort's birthplace is Montebello.",
    "Sam Hort's birthplace is Montebello."
]
hipporag.delete(docs_to_delete)
```

Sources: tests_openai.py:61-82

## Evaluation Metrics

The retrieval system is evaluated using standard information retrieval metrics:

| Metric | Description |
|--------|-------------|
| Recall@k | Fraction of relevant documents in top-k |
| EM | Exact Match accuracy |
| F1 | Harmonic mean of precision and recall |
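
A hedged sketch of how these metrics are conventionally computed for QA; HippoRAG's own implementations live under `src/hipporag/evaluation/` and may normalize answers differently:

```python
import re
from typing import List

def _norm(s: str) -> str:
    """Lowercase, drop articles and punctuation (common QA normalization)."""
    s = re.sub(r'\b(a|an|the)\b', ' ', s.lower())
    return ' '.join(re.sub(r'[^a-z0-9 ]', ' ', s).split())

def exact_match(pred: str, golds: List[str]) -> float:
    return float(any(_norm(pred) == _norm(g) for g in golds))

def f1(pred: str, gold: str) -> float:
    p, g = _norm(pred).split(), _norm(gold).split()
    common = sum(min(p.count(t), g.count(t)) for t in set(p))
    if not common:
        return 0.0
    prec, rec = common / len(p), common / len(g)
    return 2 * prec * rec / (prec + rec)

def recall_at_k(retrieved: List[str], relevant: List[str], k: int) -> float:
    # String equality stands in for passage identity in this sketch.
    return sum(1 for d in relevant if d in retrieved[:k]) / len(relevant)
```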

## Summary

The Knowledge Graph and Retrieval module in HippoRAG provides a sophisticated pipeline for:

1. **Knowledge Extraction**: Using OpenIE to extract factual triplets from text
2. **Graph Construction**: Building heterogeneous graphs with passage nodes, fact nodes, and multiple edge types
3. **Synonymy Discovery**: Creating semantic links between similar passages via embedding similarity
4. **PPR-based Retrieval**: Performing personalized PageRank for graph-aware document retrieval
5. **Reranking**: Refining retrieval results for improved accuracy
6. **Incremental Updates**: Supporting document additions and deletions

This architecture enables HippoRAG to perform complex associative and multi-hop reasoning tasks that traditional vector-similarity retrieval cannot accomplish effectively.

---

<a id='page-6'></a>

## Embedding Store and Management

### Related Pages

Related topics: [LLM Integrations](#page-7), [Embedding Models](#page-8)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/hipporag/embedding_store.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_store.py)
- [src/hipporag/utils/embed_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/embed_utils.py)
- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
- [src/hipporag/embedding_model/__init__.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_model/__init__.py)
- [src/hipporag/embedding_model/base.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_model/base.py)
</details>

# Embedding Store and Management

## Overview

The Embedding Store and Management system in HippoRAG provides a unified interface for encoding text passages into vector embeddings, managing these embeddings throughout the indexing and retrieval lifecycle, and supporting multiple embedding model backends including NVIDIA NV-Embed-v2, GritLM, and Contriever. The system is designed to handle batch processing of documents with configurable parameters for sequence length, data type precision, and normalization behavior.

HippoRAG's embedding management is tightly integrated with the knowledge graph construction process, where embeddings serve dual purposes: enabling semantic similarity search for passage linking and powering the retrieval phase through Personalized PageRank (PPR) algorithms. The embedding store abstracts away the underlying model implementation details, allowing the framework to switch between different embedding providers without changing the core indexing and retrieval logic.

Source: [src/hipporag/utils/config_utils.py:1-50]()

## Architecture

### High-Level Components

The embedding system consists of three primary layers that work together to provide embedding services throughout the HippoRAG pipeline.

The **Model Layer** contains implementations for specific embedding models, each inheriting from a common base class that enforces a consistent interface. Currently supported models include NV-Embed-v2, GritLM, and Contriever, with the architecture supporting easy extension to additional models. Each model implementation handles the specific requirements of its underlying transformer architecture, including tokenizer configuration, padding strategies, and model-specific inference optimizations.

The **Utility Layer** provides helper functions for common embedding operations such as batch processing, embedding normalization, and similarity computation. These utilities ensure consistent handling of embeddings across different contexts and help optimize memory usage during large-scale indexing operations.

The **Configuration Layer** defines the parameters that control embedding behavior, including batch sizes, sequence length limits, and model-specific settings. This layer connects the embedding system to HippoRAG's global configuration management, allowing users to customize embedding behavior without modifying code.

```mermaid
graph TD
    A[Documents] --> B[Embedding Store]
    B --> C[Model Layer<br/>NV-Embed-v2<br/>GritLM<br/>Contriever]
    B --> D[Utility Layer<br/>Batch Processing<br/>Normalization]
    C --> E[Vector Storage]
    D --> E
    E --> F[Graph Construction]
    E --> G[Retrieval Phase]
```

Source: [src/hipporag/embedding_store.py:1-30]()

### Data Flow

During the **indexing phase**, documents are first processed by the embedding store to generate passage vectors. These vectors are stored alongside the passage metadata and serve as the foundation for graph construction. The embedding store processes passages in configurable batch sizes to balance memory usage and throughput, with the default batch size set to 16 documents per batch.

During the **retrieval phase**, incoming queries are encoded using the same embedding model to produce a query vector. This query vector is then used for similarity computation against the indexed passage vectors, enabling semantic matching between the query intent and stored knowledge. The retrieval system can perform k-nearest neighbor (kNN) searches over the embedding space to identify candidate passages for further processing.

```mermaid
graph LR
    A[Indexing Flow] --> B[Input Documents]
    B --> C[Batch Processing<br/>batch_size=16]
    C --> D[Embedding Encoding]
    D --> E[Normalized Vectors]
    E --> F[Vector Storage]
    
    G[Retrieval Flow] --> H[Query Text]
    H --> I[Query Encoding]
    I --> J[Similarity Search]
    J --> K[kNN Retrieval<br/>top-k candidates]
    K --> L[Ranked Passages]
```

Source: [src/hipporag/utils/embed_utils.py:1-25]()

## Configuration Parameters

The embedding system is controlled through several configuration parameters defined in the global configuration structure. These parameters allow fine-tuning of embedding behavior for different hardware configurations and use cases.

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding_batch_size` | int | 16 | Number of documents processed in each embedding batch |
| `embedding_return_as_normalized` | bool | true | Whether to L2-normalize output embeddings |
| `embedding_max_seq_len` | int | 2048 | Maximum sequence length in tokens for the embedding model |
| `embedding_model_dtype` | Literal | "auto" | Data type for local embedding models: float16, float32, bfloat16, or auto |
| `embedding_model_name` | str | varies | Identifier for the embedding model (e.g., "nvidia/NV-Embed-v2") |
| `embedding_base_url` | str | None | Base URL for OpenAI-compatible embedding endpoints |
| `synonymy_edge_topk` | int | 2047 | k value for kNN retrieval when building synonymy edges |
| `synonymy_edge_sim_threshold` | float | 0.8 | Minimum similarity threshold for synonymy edge candidates |

Source: [src/hipporag/utils/config_utils.py:15-40]()

## Embedding Model Interface

### Base Class Contract

All embedding models must inherit from `BaseEmbeddingModel`, which defines the core interface that HippoRAG expects. The base class enforces implementation of the `__call__` method that accepts text inputs and returns embeddings, ensuring polymorphism across different model implementations.

The base class also defines the `EmbeddingConfig` dataclass that encapsulates model-specific settings. This configuration includes the model name, batch size, maximum sequence length, and data type settings. The configuration object is passed to the embedding model during initialization and can be modified to adjust model behavior without recreating the model instance.

### Supported Models

**NV-Embed-v2** is the primary embedding model recommended for production use, developed by NVIDIA. It provides high-quality sentence embeddings optimized for retrieval tasks. The model is accessed through HuggingFace and supports automatic device placement based on available GPU resources.

**GritLM** provides an alternative embedding approach that combines retrieval and generation capabilities. It can serve both as an embedding model and as a decoder for generation tasks, offering flexibility in deployment configurations.

**Contriever** is an open-source bi-encoder model for dense retrieval, useful for scenarios requiring a completely open-source embedding solution without proprietary dependencies.

Source: [src/hipporag/embedding_model/__init__.py:1-20]()

## Embedding Store API

### Initialization

The embedding store is typically instantiated through the main HippoRAG class rather than directly. When creating a HippoRAG instance, the embedding model name and optional endpoint configuration are passed as parameters:

```python
hipporag = HippoRAG(
    save_dir="outputs",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2"
)
```

For OpenAI-compatible embedding endpoints, the base URL can be specified:

```python
hipporag = HippoRAG(
    save_dir="outputs",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="text-embedding-3-small",
    embedding_base_url="https://api.openai.com/v1"
)
```

Source: [README.md:1-50]()

### Encoding Operations

The embedding store provides batch encoding capabilities for processing multiple documents efficiently. The encoding operation returns normalized embeddings by default, which is required for proper similarity computation during retrieval. The normalization is L2 normalization, ensuring that all embedding vectors have unit length.

For Azure OpenAI deployments, specialized endpoint parameters are supported:

```python
hipporag = HippoRAG(
    save_dir="save_dir",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="text-embedding-3-small",
    azure_endpoint="https://[ENDPOINT].openai.azure.com/...",
    azure_embedding_endpoint="https://[ENDPOINT].openai.azure.com/..."
)
```

Source: [demo_azure.py:1-30]()

## Integration with Knowledge Graph

The embedding system plays a critical role in HippoRAG's knowledge graph construction phase. After passages are indexed and encoded, the embeddings are used for two key graph-related operations.

**Synonymy Edge Construction** uses embeddings to identify semantically similar passage pairs that should be connected in the knowledge graph. The system performs k-nearest neighbor searches over the passage embedding space, where the `synonymy_edge_topk` parameter controls how many candidates are considered for each passage. The `synonymy_edge_sim_threshold` parameter filters these candidates, with only pairs exceeding the similarity threshold being connected as synonymy edges.

**Retrieval-Graph Linking** during the PPR retrieval process uses passage embeddings to establish the connection between the query and the knowledge graph. The query embedding enables the system to identify the most relevant starting nodes in the graph for the random walk algorithm.

Source: [src/hipporag/utils/config_utils.py:30-45]()
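
A hedged sketch of this kNN-plus-threshold filtering over unit-normalized embeddings follows; the real implementation batches the search (see the batch-size parameters in the next section), so this brute-force version is illustrative only:

```python
import numpy as np

def synonymy_edges(emb: np.ndarray, topk: int = 2047, threshold: float = 0.8):
    """emb: (n, d) matrix of L2-normalized embeddings. Returns (i, j, sim) edges."""
    sims = emb @ emb.T                      # cosine similarity for unit vectors
    np.fill_diagonal(sims, -1.0)            # exclude self-matches
    edges = []
    for i in range(sims.shape[0]):
        neighbors = np.argsort(sims[i])[::-1][:topk]   # top-k most similar nodes
        for j in neighbors:
            if sims[i, j] >= threshold:                # keep only confident pairs
                edges.append((i, int(j), float(sims[i, j])))
    return edges
```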

## Memory Management and Optimization

### Batch Processing Strategy

The embedding store implements batch processing to optimize GPU memory utilization and throughput. The batch size is configurable via `embedding_batch_size` with a default of 16, meaning 16 documents are processed simultaneously during encoding. For systems with larger GPU memory, increasing this value can significantly improve indexing performance.

The system also supports separate batch sizes for the synonymy edge construction phase. The `synonymy_edge_query_batch_size` (default 1000) controls how many passage embeddings are queried at once during kNN search, while `synonymy_edge_key_batch_size` (default 10000) controls the key batch size for the search index.
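
As an illustration of the strategy (not the store's actual code), batched encoding slices the input into fixed-size chunks and stacks the results:

```python
import numpy as np

def encode_in_batches(texts, encode_fn, batch_size=16):
    # Encode in fixed-size batches to bound peak GPU memory, then stack
    # the per-batch outputs into a single (n, d) embedding matrix.
    chunks = [
        encode_fn(texts[i:i + batch_size])
        for i in range(0, len(texts), batch_size)
    ]
    return np.concatenate(chunks, axis=0)
```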

### Data Type Selection

The `embedding_model_dtype` parameter allows selection of the precision for local embedding models. The "auto" setting allows the system to select an appropriate default based on the hardware and model. Available options include float16 for memory-constrained environments, float32 for maximum precision, and bfloat16 which offers a good balance of range and memory efficiency on newer GPUs.

Source: [src/hipporag/utils/config_utils.py:25-35]()
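
A minimal sketch of how such a dtype string might map onto a torch dtype is shown below; the fallback behavior for "auto" here is an assumption, not the repository's exact rule:

```python
import torch

def resolve_dtype(name: str) -> torch.dtype:
    table = {
        "float16": torch.float16,
        "float32": torch.float32,
        "bfloat16": torch.bfloat16,
    }
    if name == "auto":
        # Assumed policy: prefer bfloat16 on GPUs that support it.
        use_bf16 = torch.cuda.is_available() and torch.cuda.is_bf16_supported()
        return torch.bfloat16 if use_bf16 else torch.float32
    return table[name]
```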

## Error Handling and Resilience

The embedding system is designed with error handling patterns compatible with HippoRAG's overall resilience strategy. Batch processing allows partial failures to be identified and retried without losing all progress. The configuration system supports specifying fallback models or endpoints for production deployments requiring high availability.

Tenacity is used for retry logic in the embedding utilities, ensuring transient network failures or temporary service unavailability do not cause complete pipeline failures. This is particularly important when using remote embedding endpoints that may experience temporary connectivity issues.

Source: [setup.py:1-30]()

## Performance Considerations

When optimizing HippoRAG for production deployment, the embedding configuration should be tuned based on the available hardware and expected workload characteristics. The primary tuning parameters include batch size for indexing throughput, sequence length limits for handling long documents, and data type selection for memory-constrained environments.

For maximum retrieval quality, the default normalization behavior should be maintained as it ensures consistent similarity computation across the retrieval pipeline. Disabling normalization may lead to suboptimal retrieval results as the similarity metrics assume unit-normalized vectors.

Source: [src/hipporag/utils/config_utils.py:18-22]()

## Related Components

The embedding system interacts closely with several other HippoRAG components. The Information Extraction module uses embeddings for processing extracted facts, the retrieval module depends on embeddings for kNN search and PPR initialization, and the evaluation module uses embeddings for computing retrieval metrics such as recall and MRR.

The embedding model implementations in `src/hipporag/embedding_model/` follow a consistent interface defined in `base.py`, allowing the embedding store to work with any model that adheres to this contract.

---

<a id='page-7'></a>

## LLM Integrations

### Related Pages

Related topics: [Embedding Models](#page-8), [Deployment Options](#page-10)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/hipporag/llm/base.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/llm/base.py)
- [src/hipporag/llm/openai_gpt.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/llm/openai_gpt.py)
- [src/hipporag/llm/vllm_offline.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/llm/vllm_offline.py)
- [src/hipporag/llm/bedrock_llm.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/llm/bedrock_llm.py)
</details>

# LLM Integrations

HippoRAG provides a flexible, pluggable architecture for integrating various Large Language Model (LLM) providers. This modular design enables the framework to support multiple inference backends including OpenAI, vLLM for local deployment, and AWS Bedrock, allowing researchers and developers to choose the most appropriate LLM backend for their specific use case and infrastructure requirements.

## Architecture Overview

The LLM integration system follows a strategy pattern where a base abstract class defines the interface contract, and concrete implementations handle provider-specific details. This design ensures that the core HippoRAG logic remains independent of any particular LLM vendor while maintaining the ability to leverage specialized features offered by different providers.

```mermaid
graph TD
    A[HippoRAG Core] --> B[LLM Base Class]
    B --> C[OpenAIGPT]
    B --> D[VLLMOffline]
    B --> E[BedrockLLM]
    B --> F[Custom LLM Adapter]
    
    C --> G[OpenAI API]
    D --> H[Local vLLM Server]
    E --> I[AWS Bedrock]
```

The `BaseLLM` abstract class in `src/hipporag/llm/base.py` defines the common interface that all LLM adapters must implement, ensuring consistent behavior across different providers.

## Supported LLM Providers

### OpenAI Models

HippoRAG supports all OpenAI chat completion models through the `OpenAIGPT` class. This integration allows users to leverage the GPT family of models for both information extraction and question answering tasks.

**Configuration Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model_name` | string | required | OpenAI model identifier (e.g., `gpt-4o-mini`, `gpt-4o`) |
| `api_key` | string | env `OPENAI_API_KEY` | OpenAI API authentication key |
| `base_url` | string | `https://api.openai.com/v1` | API endpoint base URL |
| `max_tokens` | int | `None` | Maximum tokens in generated response |
| `temperature` | float | `0.0` | Sampling temperature for generation |

**Usage Example:**

```python
from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)
```

Source: [README.md:67-72](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### vLLM Local Deployment

For scenarios requiring local inference, HippoRAG supports vLLM-deployed models through the `VLLMOffline` class. This approach is particularly useful for privacy-sensitive applications, cost reduction at scale, or when working with custom fine-tuned models.

**Server Setup:**

```bash
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578
```

Source: [README.md:93-101](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

**Configuration Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model_name` | string | required | Model identifier for vLLM server |
| `base_url` | string | required | vLLM server endpoint URL |
| `openie_mode` | string | `"online"` | Mode for OpenIE processing (`online` or `offline`) |
| `max_tokens` | int | `None` | Maximum tokens in generated response |
| `temperature` | float | `0.0` | Sampling temperature for generation |

**Offline Mode for OpenIE:**

The vLLM integration supports an offline mode where OpenIE extraction runs separately from the main pipeline. This is useful for debugging or when OpenIE results can be cached and reused.

```bash
python main.py \
    --dataset sample \
    --llm_name meta-llama/Llama-3.3-70B-Instruct \
    --openie_mode offline \
    --skip_graph
```

Source: [README.md:130-135](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### AWS Bedrock

HippoRAG integrates with AWS Bedrock through the `BedrockLLM` class, enabling access to various foundation models hosted on AWS infrastructure. This integration is designed for enterprise deployments requiring scalable, managed LLM services.

**Configuration Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `model_name` | string | required | Bedrock model identifier |
| `aws_region` | string | `"us-east-1"` | AWS region for Bedrock endpoint |
| `max_tokens` | int | `None` | Maximum tokens in generated response |
| `temperature` | float | `0.0` | Sampling temperature for generation |

### Azure OpenAI

For enterprise users with Azure OpenAI deployments, HippoRAG provides direct integration with Azure endpoints.

**Configuration Example:**

```python
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    embedding_model_name='embedding-model-name',
    azure_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
    azure_embedding_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)
```

Source: [demo_azure.py:16-21](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/demo_azure.py)

## Base LLM Interface

All LLM adapters inherit from the `BaseLLM` abstract class, which defines the core contract for LLM interactions.

```mermaid
classDiagram
    class BaseLLM {
        <<abstract>>
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
        +get_model_name() str
    }
    
    class OpenAIGPT {
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
    }
    
    class VLLMOffline {
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
    }
    
    class BedrockLLM {
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
    }
    
    BaseLLM <|-- OpenAIGPT
    BaseLLM <|-- VLLMOffline
    BaseLLM <|-- BedrockLLM
```

**Core Methods:**

| Method | Parameters | Return Type | Description |
|--------|------------|-------------|-------------|
| `generate` | `prompt: str` | `str` | Generate a single response from a prompt |
| `batch_generate` | `prompts: List[str]` | `List[str]` | Generate responses for multiple prompts in batch |
| `get_model_name` | None | `str` | Return the configured model identifier |
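
Because every adapter honors this contract, calling code can stay provider-agnostic. A hedged sketch (the prompt template is illustrative, not the framework's actual OpenIE prompt):

```python
from typing import List

def extract_with_any_llm(llm, passages: List[str]) -> List[str]:
    # Works with any BaseLLM subclass: OpenAIGPT, VLLMOffline, BedrockLLM, ...
    prompts = [f"Extract (subject, predicate, object) triples:\n{p}" for p in passages]
    return llm.batch_generate(prompts)
```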

## OpenIE Integration

Open Information Extraction (OpenIE) is a critical component of HippoRAG's knowledge graph construction pipeline. The LLM integration system supports multiple OpenIE modes to accommodate different deployment scenarios.

```mermaid
graph LR
    A[Documents] --> B{HippoRAG}
    B --> C{OpenIE Mode}
    
    C -->|online| D[Real-time OpenIE]
    C -->|offline| E[Cached OpenIE Results]
    
    D --> F[OpenIE with LLM]
    E --> G[Load from JSON]
    
    F --> H[Knowledge Graph]
    G --> H
```

**OpenIE Implementation Classes:**

| Class | Provider | Use Case |
|-------|----------|----------|
| `OpenAI_GPT` | OpenAI API | Cloud-based OpenIE extraction |
| `VLLM_Offline` | Local vLLM | Private/onsite OpenIE extraction |

Source: [README.md:47-48](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Configuration Schema

The LLM integration configuration is defined through the `HippoRAGConfig` class, which validates and manages all LLM-related settings.

**Configuration Fields:**

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `llm_name` | string | required | LLM model identifier |
| `llm_base_url` | string | `None` | Base URL for LLM API endpoint |
| `llm_max_tokens` | int | `None` | Maximum tokens per generation |
| `llm_temperature` | float | `0.0` | Sampling temperature |
| `openie_mode` | string | `"online"` | OpenIE processing mode |
| `skip_graph` | bool | `False` | Skip graph construction step |

Source: [main.py:18-26](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)

## Workflow Integration

The following diagram illustrates how LLM integrations fit into the HippoRAG indexing and retrieval pipeline:

```mermaid
graph TD
    subgraph Indexing
        A1[Input Documents] --> A2[Chunking]
        A2 --> A3[Embedding Generation]
        A3 --> A4[OpenIE with LLM]
        A4 --> A5[Knowledge Graph Construction]
        A5 --> A6[Graph Indexing]
    end
    
    subgraph Retrieval & QA
        B1[User Query] --> B2[Query Embedding]
        B2 --> B3[Graph Traversal]
        B3 --> B4[LLM for Answer Synthesis]
        B4 --> B5[Final Answer]
    end
    
    A4 -.->|Uses| LLM1[LLM Adapter]
    B4 -.->|Uses| LLM1
```

## Environment Variables

Proper configuration of environment variables is essential for LLM integrations to function correctly.

| Variable | Required | Description |
|----------|----------|-------------|
| `OPENAI_API_KEY` | For OpenAI | OpenAI API authentication key |
| `HF_HOME` | For vLLM | Hugging Face cache directory |
| `CUDA_VISIBLE_DEVICES` | For GPU | Comma-separated GPU device IDs |
| `AWS_ACCESS_KEY_ID` | For Bedrock | AWS access credentials |
| `AWS_SECRET_ACCESS_KEY` | For Bedrock | AWS secret credentials |

Source: [README.md:58-66](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Testing LLM Integrations

HippoRAG provides dedicated test scripts to verify LLM integration functionality.

### OpenAI Test

```bash
export OPENAI_API_KEY=<your-api-key>
conda activate hipporag
python tests_openai.py
```

### Local vLLM Test

```bash
# Terminal 1: Start vLLM server
export CUDA_VISIBLE_DEVICES=0
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 6578

# Terminal 2: Run test
CUDA_VISIBLE_DEVICES=1 python tests_local.py
```

Source: [README.md:137-148](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Error Handling and Retries

The LLM integrations leverage the `tenacity` library for automatic retry behavior with exponential backoff. This ensures robust operation when dealing with network issues or rate limiting from LLM providers.

Configuration options for retry behavior:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `max_attempts` | 3 | Maximum number of retry attempts |
| `wait_exponential_multiplier` | 1000 | Initial wait time in milliseconds |
| `wait_exponential_max` | 10000 | Maximum wait time in milliseconds |
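
A minimal sketch of this policy using `tenacity` (the decorated function is a placeholder; the table's millisecond values become seconds in `wait_exponential`):

```python
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),                   # max_attempts
    wait=wait_exponential(multiplier=1, max=10),  # 1 s initial backoff, 10 s cap
)
def call_provider(prompt: str) -> str:
    ...  # provider-specific request goes here; transient failures are retried
```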

## Extending LLM Support

To add support for a new LLM provider, implement a new class that inherits from `BaseLLM` and implements the required abstract methods:

```python
from typing import List

from hipporag.llm.base import BaseLLM

class CustomLLM(BaseLLM):
    def __init__(self, model_name: str, **kwargs):
        self.model_name = model_name
        # Initialize provider-specific client
        
    def generate(self, prompt: str) -> str:
        # Implement generation logic
        pass
        
    def batch_generate(self, prompts: List[str]) -> List[str]:
        # Implement batch generation
        pass
        
    def get_model_name(self) -> str:
        return self.model_name
```

## Performance Considerations

When selecting and configuring LLM integrations, consider the following factors:

1. **Latency**: OpenAI APIs typically offer lower latency for small workloads, while vLLM provides better performance for high-throughput scenarios
2. **Cost**: Local vLLM deployment eliminates API costs but requires GPU infrastructure
3. **Privacy**: For sensitive data, local deployment via vLLM or Bedrock private endpoints is recommended
4. **Model Size**: Larger models (e.g., Llama-3.3-70B) require more GPU memory but often provide better extraction quality

---

<a id='page-8'></a>

## Embedding Models

### Related Pages

Related topics: [Embedding Store and Management](#page-6), [LLM Integrations](#page-7)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [src/hipporag/embedding_model/base.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_model/base.py)
- [src/hipporag/embedding_model/NVEmbedV2.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_model/NVEmbedV2.py)
- [src/hipporag/embedding_model/GritLM.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_model/GritLM.py)
- [src/hipporag/embedding_model/Transformers.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_model/Transformers.py)
- [src/hipporag/embedding_model/VLLM.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/embedding_model/VLLM.py)
- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
</details>

# Embedding Models

HippoRAG provides a flexible, modular embedding model system that supports multiple embedding backends including NVIDIA's NV-Embed-v2, GritLM, HuggingFace Transformers, and vLLM endpoints. This modular architecture enables the system to generate high-quality text embeddings for both passage encoding and query understanding in the retrieval pipeline.

## Architecture Overview

The embedding model subsystem follows a base class pattern with specialized implementations. All embedding models inherit from `BaseEmbeddingModel` which defines the common interface and configuration schema.

```mermaid
graph TD
    A[HippoRAG Core] --> B[Embedding Model Factory]
    B --> C[BaseEmbeddingModel]
    C --> D[NVEmbedV2]
    C --> E[GritLM]
    C --> F[TransformersEmbeddingModel]
    C --> G[VLLMEmbeddingModel]
```

The factory pattern in `__init__.py` dynamically instantiates the appropriate embedding model based on the model name prefix:

| Prefix | Model Class | Backend |
|--------|-------------|---------|
| `nvidia/NV-Embed-v2` | `NVEmbedV2` | HuggingFace |
| `GritLM` | `GritLM` | GritLM library |
| `Transformers/` | `TransformersEmbeddingModel` | SentenceTransformers |
| `VLLM/` | `VLLMEmbeddingModel` | vLLM endpoints |

Source: [src/hipporag/embedding_model/__init__.py](src/hipporag/embedding_model/__init__.py)
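
A hedged sketch of the prefix dispatch described in the table; the import path and exact wiring are assumptions about `__init__.py`, not its verbatim contents:

```python
from hipporag.embedding_model import (  # import path assumed
    NVEmbedV2, GritLM, TransformersEmbeddingModel, VLLMEmbeddingModel,
)

def make_embedding_model(global_config, name: str):
    # Dispatch on the model-name prefix, mirroring the table above.
    if name == "nvidia/NV-Embed-v2":
        return NVEmbedV2(global_config=global_config, embedding_model_name=name)
    if name.startswith("GritLM"):
        return GritLM(global_config=global_config, embedding_model_name=name)
    if name.startswith("Transformers/"):
        return TransformersEmbeddingModel(global_config=global_config,
                                          embedding_model_name=name)
    if name.startswith("VLLM/"):
        return VLLMEmbeddingModel(global_config=global_config,
                                  embedding_model_name=name)
    raise ValueError(f"Unrecognized embedding model name: {name}")
```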

## Base Configuration

The `BaseEmbeddingModel` and `EmbeddingConfig` classes define the configuration schema used across all embedding implementations. Configuration parameters include:

| Parameter | Default | Description |
|-----------|---------|-------------|
| `embedding_batch_size` | 16 | Batch size for encoding operations |
| `embedding_return_as_normalized` | True | Whether to normalize output embeddings |
| `embedding_max_seq_len` | 2048 | Maximum sequence length for tokenization |
| `embedding_model_dtype` | "auto" | Data type: float16, float32, bfloat16, or auto |

Source: [src/hipporag/utils/config_utils.py:16-35](src/hipporag/utils/config_utils.py)

## Available Embedding Models

### NV-Embed-v2

The `NVEmbedV2` class provides integration with NVIDIA's NV-Embed-v2 embedding model, a high-performance encoder optimized for retrieval tasks.

```python
class NVEmbedV2(BaseEmbeddingModel):
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        # Model initialization with HuggingFace transformers
```

Source: [src/hipporag/embedding_model/NVEmbedV2.py](src/hipporag/embedding_model/NVEmbedV2.py)

### GritLM

The `GritLM` class wraps the GritLM library for generating embeddings with built-in instruction-following capabilities.

```python
class GritLM(BaseEmbeddingModel):
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        # GritLM-specific initialization
```

Source: [src/hipporag/embedding_model/GritLM.py](src/hipporag/embedding_model/GritLM.py)

### Transformers (SentenceTransformers)

The `TransformersEmbeddingModel` class enables using any model from the HuggingFace ecosystem via the SentenceTransformers library. Select this implementation by passing an `embedding_model_name` that starts with `"Transformers/"`.

```python
class TransformersEmbeddingModel(BaseEmbeddingModel):
    """
    To select this implementation you can initialise HippoRAG with:
        embedding_model_name starts with "Transformers/"
    """
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        self.model_id = embedding_model_name[len("Transformers/"):]
        self.batch_size = 64
        self.model = SentenceTransformer(
            self.model_id, 
            device="cuda" if torch.cuda.is_available() else "cpu"
        )
```

Key characteristics:
- Automatically detects CUDA availability for GPU acceleration
- Uses batch size of 64 for efficient processing
- Extracts model ID by removing the `"Transformers/"` prefix

Source: [src/hipporag/embedding_model/Transformers.py:1-40](src/hipporag/embedding_model/Transformers.py)

### VLLM (Endpoint-based)

The `VLLMEmbeddingModel` class provides integration with OpenAI-compatible vLLM embedding endpoints. Select this implementation by passing an `embedding_model_name` that starts with `"VLLM/"`.

```python
class VLLMEmbeddingModel(BaseEmbeddingModel):
    """
    To select this implementation you can initialise HippoRAG with:
        embedding_model_name starts with "VLLM/"
    The embedding base url should contain the v1/embeddings.
    """
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        self.model_id = embedding_model_name[len("VLLM/"):]
        self.batch_size = 32
        self.url = global_config.embedding_base_url
```

The model communicates with the endpoint using the OpenAI embeddings API format:

```python
payload = {
    "model": self.model_id,
    "input": input_text,
}
response = requests.post(self.url, headers=headers, json=payload)
```

Source: [src/hipporag/embedding_model/VLLM.py:1-50](src/hipporag/embedding_model/VLLM.py)

## Query Instructions

Embedding models support query instruction templates for improving retrieval relevance. The system uses instructions for mapping queries to facts and passages:

```python
self.search_query_instr = set([
    get_query_instruction('query_to_fact'),
    get_query_instruction('query_to_passage')
])
```

Source: [src/hipporag/embedding_model/Transformers.py:23-27](src/hipporag/embedding_model/Transformers.py)

## Usage Patterns

### Quick Start with OpenAI-style Models

```python
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    llm_base_url='https://api.openai.com/v1',
    embedding_model_name='nvidia/NV-Embed-v2',
    embedding_base_url='https://api.openai.com/v1'
)
```

### Using Custom Endpoints

```python
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='Your LLM Model name',
    llm_base_url='Your LLM Model url',
    embedding_model_name='Your Embedding model name',
    embedding_base_url='Your Embedding model url'
)
```

### Using vLLM Local Deployment

```python
# First start the vLLM embedding server from a shell:
#   vllm serve meta-llama/Llama-3.1-8B-Instruct --tensor-parallel-size 2

# Then configure HippoRAG with the VLLM/ prefix
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='...',
    embedding_model_name='VLLM/your-model-name',
    embedding_base_url='http://localhost:8000/v1/embeddings'
)
```

## Dependencies

The embedding model system depends on the following packages:

| Package | Version | Purpose |
|---------|---------|---------|
| `transformers` | 4.45.2 | Core model loading |
| `sentence-transformers` | (via Transformers) | Sentence encoding |
| `gritlm` | 1.0.2 | GritLM embeddings |
| `torch` | 2.5.1 | GPU acceleration |
| `einops` | (latest) | Tensor operations |

Source: [setup.py:19-32](setup.py)

## Configuration Parameters Summary

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `embedding_batch_size` | int | 16 | Batch size for embedding inference |
| `embedding_return_as_normalized` | bool | True | L2 normalize embeddings |
| `embedding_max_seq_len` | int | 2048 | Maximum token sequence length |
| `embedding_model_dtype` | str | "auto" | Model precision (float16/float32/bfloat16/auto) |

Source: [src/hipporag/utils/config_utils.py:16-29](src/hipporag/utils/config_utils.py)

---

<a id='page-9'></a>

## Open Information Extraction (OpenIE)

### Related Pages

Related topics: [Knowledge Graph and Retrieval](#page-5), [LLM Integrations](#page-7)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [setup.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py)
- [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
- [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)
- [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)
- [src/hipporag/prompts/templates/triple_extraction.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/prompts/templates/triple_extraction.py)
- [src/hipporag/prompts/templates/ner.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/prompts/templates/ner.py)

</details>

# Open Information Extraction (OpenIE)

## Overview

Open Information Extraction (OpenIE) is a critical component in the HippoRAG pipeline that enables the extraction of structured knowledge triples from unstructured text. The system extracts **entities**, **relations**, and **triples** from passages to construct a knowledge graph that mimics hippocampal memory formation in biological systems.

In HippoRAG, OpenIE serves as the foundation for building the associative memory graph. Extracted triples form **fact nodes** in the knowledge graph, enabling Personalized PageRank (PPR) retrieval that connects related information across documents.

Source: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Architecture

The OpenIE system in HippoRAG supports multiple deployment modes and LLM backends:

```mermaid
graph TD
    A[Unstructured Text] --> B[Information Extraction Module]
    B --> C{openie_mode}
    C -->|online| D[OpenAI GPT]
    C -->|offline| E[vLLM Offline]
    D --> F[Triple Extraction]
    E --> F
    F --> G[NER Processing]
    G --> H[Knowledge Triples]
    H --> I[Knowledge Graph Construction]
```

### Module Structure

| Module | File | Purpose |
|--------|------|---------|
| Base Interface | `information_extraction/__init__.py` | Exports model classes |
| OpenAI Integration | `openie_openai_gpt.py` | Online OpenIE via OpenAI API |
| vLLM Offline | `openie_vllm_offline.py` | Offline batch processing with vLLM |
| Triple Extraction Prompt | `prompts/templates/triple_extraction.py` | LLM prompt for triple extraction |
| NER Prompt | `prompts/templates/ner.py` | LLM prompt for named entity recognition |

Source: [README.md - Code Structure](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Configuration

### ConfigUtils Class Parameters

The `InformationExtractionConfig` dataclass provides the following configuration options:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `information_extraction_model_name` | `Literal["openie_openai_gpt"]` | `"openie_openai_gpt"` | Class name indicating which information extraction model to use |
| `openie_mode` | `Literal["offline", "online"]` | `"online"` | Mode of the OpenIE model: `online` uses OpenAI API, `offline` uses vLLM batch processing |
| `skip_graph` | `bool` | `False` | Whether to skip graph construction. Set to `True` when running vLLM offline indexing for the first time |

Source: [src/hipporag/utils/config_utils.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/src/hipporag/utils/config_utils.py)

### Main Entry Point Configuration

In the `main.py` script, OpenIE parameters are passed via command-line arguments:

```python
config = BaseConfig(
    retrieval_top_k=200,
    linking_top_k=5,
    max_qa_steps=3,
    qa_top_k=5,
    graph_type="facts_and_sim_passage_node_unidirectional",
    embedding_batch_size=8,
    max_new_tokens=None,
    corpus_len=len(corpus),
    openie_mode=args.openie_mode  # 'online' or 'offline'
)
```

**Command-line arguments:**
- `--openie_mode`: Choose between `online` (OpenAI API) or `offline` (vLLM)
- `--force_openie_from_scratch`: If `False`, reuse existing OpenIE results if available

Source: [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)

## Extraction Workflow

### Triple Extraction Process

The triple extraction workflow follows these steps:

```mermaid
sequenceDiagram
    participant Text as Raw Text Input
    participant Triple as Triple Extraction Prompt
    participant LLM as Language Model
    participant NER as NER Prompt
    participant Output as Knowledge Triples
    
    Text->>Triple: Passage text
    Triple->>LLM: Structured prompt
    LLM->>Output: Subject-Predicate-Object triples
    Output->>NER: Named Entity Recognition
    NER->>LLM: Entity labels
    LLM->>Output: Typed entities
```

### Supported Deployment Modes

| Mode | Backend | Use Case | API Key Required |
|------|---------|----------|------------------|
| `online` | OpenAI GPT | Quick testing, small corpora | Yes (`OPENAI_API_KEY`) |
| `offline` | vLLM | Large-scale indexing, cost efficiency | No (local deployment) |

## Knowledge Graph Integration

OpenIE extracted triples are converted into graph structures:

```mermaid
graph LR
    A[Passage Text] -->|OpenIE| B[Triple: Entity1 → Relation → Entity2]
    B --> C[Fact Node]
    C --> D[Knowledge Graph]
    D --> E[Personalized PageRank]
    E --> F[Associative Retrieval]
```

The extracted triples serve dual purposes:

1. **Fact Nodes**: Create direct connections between related entities
2. **Association Links**: Enable multi-hop reasoning through the graph

This design mirrors the dentate gyrus pattern separation mechanism in the hippocampus, where similar memories are differentiated to reduce interference.

Source: [README.md - Methodology](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Usage Examples

### Online Mode (OpenAI)

```python
from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# OpenIE runs automatically during indexing
hipporag.index(docs=["Passage containing facts to extract."])
```

### Offline Mode (vLLM)

```bash
# 1. Start vLLM server
vllm serve meta-llama/Llama-3.3-70B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95

# 2. Run indexing with offline OpenIE
python main.py --dataset sample --openie_mode offline
```

Source: [README.md - Quick Start](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Dependencies

The OpenIE system requires the following core dependencies:

| Package | Version | Purpose |
|---------|---------|---------|
| `torch` | 2.5.1 | PyTorch backend |
| `transformers` | 4.45.2 | Model architecture |
| `openai` | 1.91.1 | Online OpenAI API |
| `vllm` | 0.6.6.post1 | Offline inference |
| `litellm` | 1.73.1 | Unified LLM interface |
| `tqdm` | - | Progress bars |

Source: [setup.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py)

## Extracted Data Format

OpenIE produces structured triples in the following format:

| Field | Type | Description |
|-------|------|-------------|
| `subject` | str | First entity |
| `predicate` | str | Relation verb/phrase |
| `object` | str | Second entity |
| `context` | str | Source passage text |

These triples are then processed into graph nodes and edges for the knowledge graph construction phase.
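
For example, a single extracted record in this schema might look like the following (values taken from this document's earlier sample passages):

```python
triple = {
    "subject": "Erik Hort",
    "predicate": "was born in",
    "object": "Montebello",
    "context": "Erik Hort's birthplace is Montebello.",
}
```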

---

<a id='page-10'></a>

## Deployment Options

### Related Pages

Related topics: [Installation and Setup](#page-1), [LLM Integrations](#page-7)

<details>
<summary>Relevant Source Files</summary>

The following source files were used to generate this page:

- [main_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main_azure.py)
- [main_dpr.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main_dpr.py)
- [demo_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/demo_azure.py)
- [demo_local.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/demo_local.py)
- [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)
- [setup.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py)
- [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
</details>

# Deployment Options

HippoRAG supports multiple deployment configurations to accommodate different infrastructure requirements and use cases. This page documents the available deployment options, configuration parameters, and setup procedures for running HippoRAG in various environments.

## Overview

HippoRAG provides three primary deployment models:

| Deployment Type | LLM Backend | Embedding Backend | Typical Use Case |
|-----------------|-------------|-------------------|------------------|
| **OpenAI API** | OpenAI hosted models | OpenAI/NVIDIA hosted | Quickstart, development |
| **vLLM (Local)** | Self-hosted LLMs via vLLM | Local embedding models | Production, cost-sensitive |
| **Azure OpenAI** | Azure-hosted models | Azure-hosted embeddings | Enterprise compliance |

Source: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Environment Setup

Regardless of deployment type, certain environment variables must be configured:

```bash
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
```

For OpenAI and Azure deployments, additional API credentials are required:

```bash
export OPENAI_API_KEY=<your openai api key>
```

Source: [README.md:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## OpenAI API Deployment

The simplest deployment option uses OpenAI's hosted API endpoints for both LLM inference and embeddings.

### Configuration Parameters

| Parameter | Description | Example Value |
|-----------|-------------|---------------|
| `--llm_base_url` | OpenAI API endpoint | `https://api.openai.com/v1` |
| `--llm_name` | OpenAI model identifier | `gpt-4o-mini` |
| `--embedding_name` | Embedding model name | `nvidia/NV-Embed-v2` |

### Running with OpenAI Models

```bash
dataset=sample

python main.py --dataset $dataset \
    --llm_base_url https://api.openai.com/v1 \
    --llm_name gpt-4o-mini \
    --embedding_name nvidia/NV-Embed-v2
```

Source: [README.md:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### Programmatic Usage

```python
from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)
```

Source: [README.md:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Local vLLM Deployment

For production environments or cost-sensitive deployments, HippoRAG supports self-hosted LLMs using vLLM.

### Architecture

```mermaid
graph TD
    A[HippoRAG Main Process] --> B[vLLM Server]
    A --> C[Local Embedding Model]
    B --> D[GPU 0-1]
    C --> D
    E[Indexing Pipeline] --> A
    F[QA Pipeline] --> A
```

### Starting vLLM Server

Launch the vLLM server with tensor parallelism for multi-GPU setups:

```bash
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

vllm serve meta-llama/Llama-3.3-70B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578
```

Source: [README.md:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### Configuration Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| `--llm_base_url` | vLLM server endpoint | `http://localhost:6578/v1` |
| `--llm_name` | Model name (must match deployed model) | `meta-llama/Llama-3.1-8B-Instruct` |
| `--embedding_name` | Local embedding model identifier | `nvidia/NV-Embed-v2` |

### Running Main Process

With vLLM server running on GPUs 0-1, run the main process on separate GPUs:

```bash
export CUDA_VISIBLE_DEVICES=2,3
export HF_HOME=<path to Huggingface home directory>

python main.py --dataset $dataset \
    --llm_base_url http://localhost:6578/v1 \
    --llm_name meta-llama/Llama-3.3-70B-Instruct \
    --embedding_name nvidia/NV-Embed-v2
```

Source: [README.md:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

## Azure OpenAI Deployment

Enterprise deployments requiring Azure infrastructure can use Azure OpenAI endpoints.

### Configuration Parameters

| Parameter | CLI Argument | Description |
|-----------|--------------|-------------|
| `azure_endpoint` | `--azure_endpoint` | Azure OpenAI chat completions endpoint |
| `azure_embedding_endpoint` | `--azure_embedding_endpoint` | Azure OpenAI embeddings endpoint |

### Endpoint Format

```python
azure_endpoint = (
    "https://[ENDPOINT_NAME].openai.azure.com/"
    "openai/deployments/gpt-4o-mini/chat/completions"
    "?api-version=2025-01-01-preview"
)

azure_embedding_endpoint = (
    "https://[ENDPOINT_NAME].openai.azure.com/"
    "openai/deployments/text-embedding-3-small/embeddings"
    "?api-version=2023-05-15"
)
```

Source: [demo_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/demo_azure.py)

### Programmatic Usage

```python
from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
    azure_endpoint="https://[ENDPOINT_NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
    azure_embedding_endpoint="https://[ENDPOINT_NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)

hipporag.index(docs=docs)
```

Source: [demo_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/demo_azure.py)

### CLI Usage

```bash
python main_azure.py \
    --dataset sample \
    --azure_endpoint "https://[ENDPOINT].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview" \
    --azure_embedding_endpoint "https://[ENDPOINT].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15" \
    --save_dir outputs
```

Source: [main_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main_azure.py)

## Indexing Options

### OpenIE Modes

HippoRAG supports two Open Information Extraction (OpenIE) modes:

| Mode | Description | Resource Usage |
|------|-------------|----------------|
| `online` | Uses OpenAI GPT for real-time extraction | API costs |
| `offline` | Uses local vLLM batch processing | GPU compute |

```bash
python main.py --dataset $dataset --openie_mode offline
```

Source: [main.py:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py)

### Force Rebuild Options

| Parameter | Description |
|-----------|-------------|
| `--force_index_from_scratch` | Ignores existing storage and rebuilds from scratch |
| `--force_openie_from_scratch` | Ignores cached OpenIE results and recomputes |

```bash
python main_azure.py \
    --force_index_from_scratch true \
    --force_openie_from_scratch true
```

Source: [main_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main_azure.py)

## StandardRAG vs HippoRAG

The codebase provides two RAG implementations selectable via configuration:

```python
# Standard HippoRAG (default)
hipporag = HippoRAG(global_config=config)

# Alternative DPR-style implementation
hipporag = StandardRAG(global_config=config)
```

Sources: [main.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main.py) and [main_dpr.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/main_dpr.py)

## Installation Requirements

All deployment options require the HippoRAG package and its dependencies:

```bash
conda create -n hipporag python=3.10
conda activate hipporag
pip install hipporag
```

Or install from source:

```bash
pip install -e .
```

Core dependencies include:

| Package | Version | Purpose |
|---------|---------|---------|
| `torch` | 2.5.1 | Deep learning framework |
| `transformers` | 4.45.2 | Model loading |
| `vllm` | 0.6.6.post1 | Local inference |
| `openai` | 1.91.1 | API client |
| `litellm` | 1.73.1 | Unified LLM interface |
| `gritlm` | 1.0.2 | Embedding models |
| `networkx` | 3.4.2 | Graph operations |
| `pydantic` | 2.10.4 | Configuration validation |

Source: [setup.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/setup.py)

## Testing Deployments

### OpenAI Test

```bash
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
python tests_openai.py
```

Source: [README.md:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### Local vLLM Test

```bash
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

# Run tests
CUDA_VISIBLE_DEVICES=2 python tests_local.py
```

Source: [README.md:1](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

### Azure Test

```bash
python tests_azure.py
```

Source: [tests_azure.py](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/tests_azure.py)

## Deployment Decision Matrix

| Criteria | OpenAI API | vLLM Local | Azure |
|----------|------------|------------|-------|
| Setup complexity | Low | High | Medium |
| Cost | Pay-per-use | GPU infrastructure | Azure subscription |
| Data privacy | Data leaves your environment | All data stays local | Configurable |
| Latency | Network dependent | Local, optimized | Network dependent |
| Model flexibility | Limited to API models | Any HuggingFace model | Limited to deployed models |
| Recommended for | Development, prototyping | Production, research | Enterprise compliance |

---

## Doramagic Pitfall Log

Project: OSU-NLP-Group/HippoRAG

Summary: 18 potential pitfalls identified, 3 of them high/blocking. Highest priority: installation pitfall, source evidence: "add_fact_edges function adds the same edge twice?".

## 1. Installation Pitfall · Source evidence: add_fact_edges function adds the same edge twice?

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified installation-related issue in this project: add_fact_edges function adds the same edge twice?
- Impact on users: May raise the cost of first-time trials and production adoption.
- Suggested check: The source issue is still open; the Pack Agent must re-verify whether it still affects the current version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_6c7ca8232561460290f1ad50663233af | https://github.com/OSU-NLP-Group/HippoRAG/issues/174 | The source discussion mentions python-related conditions; re-verify before installing/trying.

## 2. Installation Pitfall · Source evidence: pypi hipporag libraries

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified installation-related issue in this project: pypi hipporag libraries
- Impact on users: May block installation or the first run.
- Suggested check: The source issue is still open; the Pack Agent must re-verify whether it still affects the current version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_0da5afa434114138a3c745efba4c9ded | https://github.com/OSU-NLP-Group/HippoRAG/issues/168 | Unverified usage conditions surfaced by a github_issue source.

## 3. Security/Permissions Pitfall · Source evidence: Take the "musique" dataset as an example. The process of constructing an index based on individual paragraphs takes an…

- Severity: high
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: Take the "musique" dataset as an example. The process of constructing an index based on individual paragraphs takes an extremely long time. Is this normal?
- Impact on users: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source issue is still open; the Pack Agent must re-verify whether it still affects the current version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_90b68b1be49048efba510bfd10623d41 | https://github.com/OSU-NLP-Group/HippoRAG/issues/173 | The source discussion mentions node-related conditions; re-verify before installing/trying.

## 4. Installation Pitfall · Source evidence: OpenAI version incompatibility in latest 2.0.0a4 version

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified installation-related issue in this project: OpenAI version incompatibility in latest 2.0.0a4 version
- Impact on users: May raise the cost of first-time trials and production adoption.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_f6679eb5cf884eb9a2d003b39da93c8d | https://github.com/OSU-NLP-Group/HippoRAG/issues/140 | The source discussion mentions linux-related conditions; re-verify before installing/trying.

## 5. Installation Pitfall · Source evidence: Windows Compatibility Issues with vLLM dependency

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified installation-related issue in this project: Windows Compatibility Issues with vLLM dependency
- Impact on users: May raise the cost of first-time trials and production adoption.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_57d57b9365f342db9a5e8ed48727e99e | https://github.com/OSU-NLP-Group/HippoRAG/issues/117 | The source discussion mentions python-related conditions; re-verify before installing/trying.

## 6. Configuration Pitfall · Source evidence: How to use local embedding_model_

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified configuration-related issue in this project: How to use local embedding_model_
- Impact on users: May raise the cost of first-time trials and production adoption.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_dd0e2350e55240b3ab754359ca93cb11 | https://github.com/OSU-NLP-Group/HippoRAG/issues/127 | Unverified usage conditions surfaced by a github_issue source.

## 7. Capability Pitfall · Capability judgment relies on assumptions

- Severity: medium
- Evidence strength: source_linked
- Finding: README/documentation is current enough for a first validation pass.
- Impact on users: If the assumption does not hold, users will not get the promised capability.
- Suggested check: Convert the assumption into a downstream verification checklist.
- Guard action: Assumptions must be turned into verification items; they must not be written as fact before verification results exist.
- Evidence: capability.assumptions | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | README/documentation is current enough for a first validation pass.

## 8. Runtime Pitfall · Source evidence: Inquiry Regarding OpenIE Extraction Results for HippoRAG 2

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified runtime-related issue in this project: Inquiry Regarding OpenIE Extraction Results for HippoRAG 2
- Impact on users: May raise the cost of first-time trials and production adoption.
- Suggested check: The source issue is still open; the Pack Agent must re-verify whether it still affects the current version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_b735fa4a09f942db8f1825092ef8e368 | https://github.com/OSU-NLP-Group/HippoRAG/issues/177 | Unverified usage conditions surfaced by a github_issue source.

## 9. Maintenance Pitfall · Maintenance activity unknown

- Severity: medium
- Evidence strength: source_linked
- Finding: last_activity_observed was not recorded.
- Impact on users: New, abandoned, and active projects get mixed together, lowering trust in recommendations.
- Suggested check: Add recent GitHub commit, release, and issue/PR response signals.
- Guard action: While maintenance activity is unknown, recommendation strength must not be marked as high trust.
- Evidence: evidence.maintainer_signals | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | last_activity_observed missing

## 10. Security/Permissions Pitfall · Downstream validation found a risk item

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: Downstream review has already been requested; the risk must not be downplayed on this page.
- Suggested check: Enter the security/permissions governance review queue.
- Guard action: While downstream risks exist, the review/recommendation downgrade must be kept in place.
- Evidence: downstream_validation.risk_items | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | no_demo; severity=medium

## 11. Security/Permissions Pitfall · Scoring risk present

- Severity: medium
- Evidence strength: source_linked
- Finding: no_demo
- Impact on users: The risk affects whether the project is suitable for ordinary users to install.
- Suggested check: Write the risk into the boundary card and confirm whether manual review is needed.
- Guard action: Scoring risks must go into the boundary card, not remain an internal score only.
- Evidence: risks.scoring_risks | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | no_demo; severity=medium

## 12. Security/Permissions Pitfall · Source evidence: How to distinguish Hipporag1 from Hipporag2

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: How to distinguish Hipporag1 from Hipporag2
- Impact on users: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_7dc27422dd8b4cb8a1384848ddbfa750 | https://github.com/OSU-NLP-Group/HippoRAG/issues/167 | Unverified usage conditions surfaced by a github_issue source.

## 13. Security/Permissions Pitfall · Source evidence: Inquiry on Sample Selection for HippoRAG Experiments

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: Inquiry on Sample Selection for HippoRAG Experiments
- Impact on users: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_6a0069bfedfc4cf28e0cc18e51171a42 | https://github.com/OSU-NLP-Group/HippoRAG/issues/125 | Unverified usage conditions surfaced by a github_issue source.

## 14. Security/Permissions Pitfall · Source evidence: Quadratic runtime during indexing

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: Quadratic runtime during indexing
- Impact on users: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_2681acee71064f72b24098fba0e05227 | https://github.com/OSU-NLP-Group/HippoRAG/issues/170 | The source discussion mentions node-related conditions; re-verify before installing/trying.

## 15. Security/Permissions Pitfall · Source evidence: [Discussion] Ablation: multi-component scoring layer over HippoRAG's KG?

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: [Discussion] Ablation: multi-component scoring layer over HippoRAG's KG?
- Impact on users: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source issue is still open; the Pack Agent must re-verify whether it still affects the current version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_b65aca3d12234444b97a67bb7baac278 | https://github.com/OSU-NLP-Group/HippoRAG/issues/178 | The source discussion mentions python-related conditions; re-verify before installing/trying.

## 16. Security/Permissions Pitfall · Source evidence: division by zero

- Severity: medium
- Evidence strength: source_linked
- Finding: GitHub community evidence shows an unverified security/permissions-related issue in this project: division by zero
- Impact on users: May affect authorization, key configuration, or security boundaries.
- Suggested check: The source suggests a fix, workaround, or version change may already exist; the manual must state the applicable version.
- Guard action: Must not be amplified into a definitive conclusion detached from the source link; the applicable version and review status must be noted.
- Evidence: community_evidence:github | cevd_49e401ba15b74b5d943336fa0a2dceda | https://github.com/OSU-NLP-Group/HippoRAG/issues/93 | The source discussion mentions python-related conditions; re-verify before installing/trying.

## 17. Maintenance Pitfall · Issue/PR responsiveness unknown

- Severity: low
- Evidence strength: source_linked
- Finding: issue_or_pr_quality=unknown.
- Impact on users: Users cannot tell whether anyone will maintain the project once they hit problems.
- Suggested check: Sample recent issues/PRs to judge whether they go unhandled for long periods.
- Guard action: While issue/PR responsiveness is unknown, the maintenance risk must be flagged.
- Evidence: evidence.maintainer_signals | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | issue_or_pr_quality=unknown

## 18. Maintenance Pitfall · Release cadence unclear

- Severity: low
- Evidence strength: source_linked
- Finding: release_recency=unknown.
- Impact on users: Install commands and documentation may lag behind the code, raising the chance that users hit pitfalls.
- Suggested check: Confirm that the latest release/tag is consistent with the README install commands.
- Guard action: While release cadence is unknown or stale, installation instructions must note possible drift.
- Evidence: evidence.maintainer_signals | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | release_recency=unknown

