Doramagic Project Pack · Human Manual

HippoRAG

HippoRAG is a graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge bases for improved retrieval and question answering.

Installation and Setup

Related topics: Configuration System, Deployment Options


Overview

HippoRAG is a graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge bases for improved retrieval and question answering. The installation process configures the necessary dependencies, environment variables, and model configurations to run HippoRAG in either cloud (OpenAI) or local (vLLM) deployment modes.

Sources: README.md

System Requirements

Python Version

| Requirement | Version |
|---|---|
| Python | >= 3.10 |

The package explicitly requires Python 3.10 or higher as specified in the setup.py configuration.

Sources: setup.py:16
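
The relevant setup.py entry looks roughly like the sketch below; only the python_requires constraint reflects the documented requirement, the other fields are illustrative placeholders.

from setuptools import setup, find_packages

# Illustrative packaging sketch; only python_requires reflects the documented
# constraint, the remaining fields are placeholders.
setup(
    name="hipporag",
    python_requires=">=3.10",
    packages=find_packages(),
)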

Hardware Requirements

| Component | Requirement |
|---|---|
| GPU | CUDA-compatible GPU(s) recommended |
| GPU Memory | Varies based on model size (see deployment sections) |

For local deployment with vLLM, the framework supports tensor parallelism across multiple GPUs. The README recommends reserving enough memory for embedding models when deploying LLM servers.

Sources: README.md

Installation Methods

Method 1: pip Installation (Recommended)

conda create -n hipporag python=3.10
conda activate hipporag
pip install hipporag

This method installs HippoRAG version 2.0.0-alpha.4 along with all core dependencies from PyPI.

Sources: README.md

Method 2: Source Installation (For Development)

git clone https://github.com/OSU-NLP-Group/HippoRAG.git
cd HippoRAG
pip install -e .

Clone the repository and install in editable mode to work with the latest source code.

Sources: CONTRIBUTING.md

Environment Variables

Proper configuration of environment variables is essential for HippoRAG to function correctly. These variables control GPU allocation, model caching, and API access.

Required Environment Variables

| Variable | Description | Example |
|---|---|---|
| CUDA_VISIBLE_DEVICES | Comma-separated list of GPU device IDs | 0,1,2,3 |
| HF_HOME | Path to Hugging Face cache directory | /path/to/huggingface/home |
| OPENAI_API_KEY | API key for OpenAI models (cloud mode only) | sk-... |

Setting Environment Variables

# Set CUDA visible devices
export CUDA_VISIBLE_DEVICES=0,1,2,3

# Set Hugging Face cache location
export HF_HOME=<path to Huggingface home directory>

# Set OpenAI API key (required for cloud deployment)
export OPENAI_API_KEY=<your openai api key>

Sources: README.md

Core Dependencies

HippoRAG depends on a comprehensive set of libraries for LLM inference, embedding models, graph processing, and data handling.

Dependency Overview

| Package | Version | Purpose |
|---|---|---|
| torch | 2.5.1 | PyTorch deep learning framework |
| transformers | 4.45.2 | Model architectures and tokenizers |
| vllm | 0.6.6.post1 | High-throughput LLM inference |
| openai | 1.91.1 | OpenAI API client |
| litellm | 1.73.1 | Unified LLM interface |
| gritlm | 1.0.2 | Embedding model |
| networkx | 3.4.2 | Graph data structures |
| python_igraph | 0.11.8 | Graph algorithms |
| tiktoken | 0.7.0 | Tokenization |
| pydantic | 2.10.4 | Data validation |
| tenacity | 8.5.0 | Retry logic |
| einops | (latest) | Tensor operations |
| tqdm | (latest) | Progress bars |
| boto3 | (latest) | AWS S3 integration |
Sources: setup.py:17-32, requirements.txt

Additional Dependencies

The requirements.txt file includes additional packages not pinned to specific versions:

| Package | Purpose |
|---|---|
| nest_asyncio | Asynchronous operations |
| numpy | Numerical computing |
| scipy | Scientific computing |

Sources: requirements.txt

Configuration

HippoRAG uses a Pydantic-based configuration system defined in BaseConfig within config_utils.py. This configuration controls all aspects of indexing, retrieval, and QA.

Configuration Parameters

#### Embedding Configuration

| Parameter | Default | Description |
|---|---|---|
| embedding_model_name | nvidia/NV-Embed-v2 | Name of the embedding model |
| embedding_batch_size | 16 | Batch size for embedding encoding |
| embedding_return_as_normalized | True | Whether to normalize embeddings |
| embedding_max_seq_len | 2048 | Maximum sequence length for embeddings |
| embedding_model_dtype | auto | Data type for local embedding models |

#### Retrieval Configuration

| Parameter | Default | Description |
|---|---|---|
| retrieval_top_k | 200 | Number of documents to retrieve |
| linking_top_k | 5 | Number of linked nodes per retrieval step |
| damping | 0.5 | Damping factor for the PPR algorithm |
| passage_node_weight | 0.05 | Weight modifier for passage nodes in PPR |

#### QA Configuration

| Parameter | Default | Description |
|---|---|---|
| max_qa_steps | 1 | Maximum steps for interleaved retrieval and reasoning |
| qa_top_k | 5 | Top-k documents fed to the QA model |

#### Graph Construction Configuration

| Parameter | Default | Description |
|---|---|---|
| synonymy_edge_topk | 2047 | K for KNN retrieval in synonymy edge building |
| synonymy_edge_sim_threshold | 0.8 | Similarity threshold for synonymy nodes |
| is_directed_graph | False | Whether the graph is directed |
| graph_type | facts_and_sim_passage_node_unidirectional | Type of graph structure |

#### Information Extraction Configuration

| Parameter | Default | Description |
|---|---|---|
| information_extraction_model_name | openie_openai_gpt | OpenIE model class name |
| openie_mode | online | Mode: "online" or "offline" |

#### Preprocessing Configuration

| Parameter | Default | Description |
|---|---|---|
| text_preprocessor_class_name | TextPreprocessor | Preprocessor class name |
| preprocess_encoder_name | gpt-4o | Encoder for preprocessing |
| preprocess_chunk_overlap_token_size | 128 | Overlap tokens between chunks |
| preprocess_chunk_max_token_size | None | Max tokens per chunk (None = whole doc) |
| preprocess_chunk_func | by_token | Chunking function type |

Sources: src/hipporag/utils/config_utils.py

Deployment Modes

HippoRAG supports two primary deployment modes for LLM inference.

graph TD
    A[HippoRAG Deployment] --> B[Cloud Mode]
    A --> C[Local Mode]
    
    B --> B1[OpenAI API]
    B --> B2[OpenAI Compatible API]
    
    C --> C1[vLLM Server]
    C --> C1b[Local Embedding Model]
    
    B1 --> D[Requires OPENAI_API_KEY]
    B2 --> E[Custom LLM Base URL]
    C1 --> F[Multi-GPU Support]

Cloud Mode (OpenAI)

Cloud mode uses OpenAI's API for both LLM and embedding inference.

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

#### OpenAI Compatible Embeddings

For OpenAI-compatible embedding endpoints:

hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='Your LLM Model name',
    llm_base_url='Your LLM Model url',
    embedding_model_name='Your Embedding model name',
    embedding_base_url='Your Embedding model url'
)

Sources: README.md

Local Mode (vLLM)

Local mode deploys LLM servers using vLLM for offline inference with GPU acceleration.

#### Step 1: Start vLLM Server

export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

vllm serve meta-llama/Llama-3.3-70B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

#### Step 2: Run HippoRAG with Different GPUs

export CUDA_VISIBLE_DEVICES=2,3
export HF_HOME=<path to Huggingface home directory>
python main.py --dataset sample --llm_base_url http://localhost:6578/v1

Sources: README.md

Quick Start Workflow

graph LR
    A[Install HippoRAG] --> B[Configure Environment]
    B --> C[Set Environment Variables]
    C --> D[Initialize HippoRAG]
    D --> E[index Documents]
    E --> F[RAG QA Queries]

Complete Example

from hipporag import HippoRAG

# Define documents
docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Cinderella attended the royal ball.",
    "The prince used the lost glass slipper to search the kingdom.",
    "Erik Hort's birthplace is Montebello.",
    "Montebello is a part of Rockland County."
]

# Initialize HippoRAG
hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Index documents
hipporag.index(docs)

# Define queries and gold standard answers
queries = [
    "What is George Rankin's occupation?",
    "How did Cinderella reach her happy ending?",
    "What county is Erik Hort's birthplace a part of?"
]

gold_docs = [
    ["George Rankin is a politician."],
    ["Cinderella attended the royal ball.",
     "The prince used the lost glass slipper to search the kingdom."],
    ["Montebello is a part of Rockland County."]
]

answers = [
    ["Politician"],
    ["By going to the ball."],
    ["Rockland County"]
]

# Run RAG QA
results = hipporag.rag_qa(
    queries=queries,
    gold_docs=gold_docs,
    gold_answers=answers
)

Sources: README.md

Testing Your Installation

OpenAI Test

Run this test to verify cloud mode functionality:

export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
python tests_openai.py

Local Test

Run this test to verify local vLLM mode:

export CUDA_VISIBLE_DEVICES=0
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

# Run local test
CUDA_VISIBLE_DEVICES=1 python tests_local.py

Sources: README.md

Troubleshooting

Out of Memory (OOM) Errors

If you encounter OOM errors during local deployment:

  1. Reduce gpu-memory-utilization parameter in vLLM
  2. Reduce max_model_len in vLLM server
  3. Adjust CUDA_VISIBLE_DEVICES to use more GPUs
  4. Reduce embedding_batch_size in configuration

Environment Variable Issues

Ensure all required environment variables are set before running HippoRAG:

# Verify environment variables are set
echo $CUDA_VISIBLE_DEVICES
echo $HF_HOME
echo $OPENAI_API_KEY

Conda Environment

Always activate the correct conda environment before running commands:

conda activate hipporag

Sources: README.md

Reproducing Experiments

To reproduce the paper's experiments:

  1. Clone the repository and install dependencies
  2. Download datasets from HuggingFace or use provided samples in reproduce/dataset
  3. Set required environment variables
  4. Run the main script with appropriate parameters:
# OpenAI model
python main.py \
    --dataset sample \
    --llm_base_url https://api.openai.com/v1 \
    --llm_name gpt-4o-mini \
    --embedding_name nvidia/NV-Embed-v2

# Local vLLM model
python main.py \
    --dataset sample \
    --llm_base_url http://localhost:6578/v1 \
    --llm_name meta-llama/Llama-3.3-70B-Instruct \
    --embedding_name nvidia/NV-Embed-v2

Sources: README.md, main.py

Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)

Quick Start Guide

Related topics: Installation and Setup, HippoRAG Core Class


This guide provides a comprehensive walkthrough for setting up and running HippoRAG, enabling you to quickly leverage neurobiologically inspired long-term memory capabilities for Large Language Models.

Prerequisites

Before beginning, ensure your environment meets the following requirements:

| Requirement | Specification |
|---|---|
| Python | >= 3.10 |
| CUDA GPUs | Required for local embedding model inference |
| HuggingFace Home | Configured via HF_HOME environment variable |
| API Keys | OpenAI API key (if using OpenAI models) |

Environment Setup

# Create conda environment
conda create -n hipporag python=3.10
conda activate hipporag

# Install HippoRAG
pip install hipporag

# Configure environment variables
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
export OPENAI_API_KEY=<your openai api key>

Sources: README.md:150-165

Core Usage Patterns

HippoRAG supports three primary deployment configurations. The initialization workflow follows this pattern:

graph TD
    A[Initialize HippoRAG] --> B{Select LLM Backend}
    B -->|OpenAI| C[Set llm_model_name + llm_base_url]
    B -->|vLLM| D[Set llm_model_name + llm_base_url]
    B -->|Azure| E[Set azure_endpoint]
    A --> F{Select Embedding Backend}
    F -->|HuggingFace| G[Set embedding_model_name]
    F -->|Custom| H[Set embedding_base_url]

Sources: demo_azure.py:1-30

Pattern 1: OpenAI Models

The simplest configuration uses OpenAI for both LLM inference and embeddings:

from hipporag import HippoRAG

# Configuration
save_dir = 'outputs'
llm_model_name = 'gpt-4o-mini'
embedding_model_name = 'nvidia/NV-Embed-v2'

# Initialize HippoRAG instance
hipporag = HippoRAG(
    save_dir=save_dir, 
    llm_model_name=llm_model_name,
    embedding_model_name=embedding_model_name
)

Sources: README.md:175-195

Pattern 2: OpenAI Compatible Embeddings

For custom LLM endpoints that follow OpenAI's API format:

hipporag = HippoRAG(
    save_dir=save_dir, 
    llm_model_name='Your LLM Model name',
    llm_base_url='Your LLM Model url',
    embedding_model_name='Your Embedding model name',  
    embedding_base_url='Your Embedding model url'
)

Sources: README.md:210-220

Pattern 3: Azure OpenAI Integration

For Azure-hosted models:

hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name=llm_model_name,
    embedding_model_name=embedding_model_name,
    azure_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
    azure_embedding_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)

Sources: demo_azure.py:10-15

Indexing Documents

The indexing process converts raw documents into HippoRAG's knowledge graph structure:

graph LR
    A[Raw Documents] --> B[Chunking]
    B --> C[OpenIE Extraction]
    C --> D[Embedding Generation]
    D --> E[Graph Construction]
    E --> F[Knowledge Graph Index]

Input Data Format

Documents should be provided as a list of strings:

docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Cinderella attended the royal ball.",
    "The prince used the lost glass slipper to search the kingdom.",
]

Execute Indexing

hipporag.index(docs=docs)

Sources: demo_azure.py:18-45

Retrieval and Question Answering

The rag_qa method performs retrieval-augmented question answering:

graph TD
    A[Query Input] --> B[Retrieval]
    B --> C[Personalized PageRank]
    C --> D[Document Selection]
    D --> E[QA Generation]
    E --> F[Final Answer]
    
    C -.->|links documents| G[Knowledge Graph]
    G -.->|context| D

Complete QA Example

# Prepare queries and evaluation data
queries = [
    "What is George Rankin's occupation?",
    "How did Cinderella reach her happy ending?"
]

answers = [
    ["Politician"],
    ["By going to the ball."]
]

gold_docs = [
    ["George Rankin is a politician."],
    ["Cinderella attended the royal ball.",
     "The prince used the lost glass slipper to search the kingdom.",
     "When the slipper fit perfectly, Cinderella was reunited with the prince."]
]

# Execute RAG QA
results = hipporag.rag_qa(
    queries=queries, 
    gold_docs=gold_docs,
    gold_answers=answers
)

print(results)

Sources: README.md:195-215

Local Deployment with vLLM

For running LLMs locally, HippoRAG supports vLLM server integration:

Step 1: Start vLLM Server

export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

conda activate hipporag

# Adjust gpu-memory-utilization and max_model_len based on your GPU memory
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

Sources: README.md:225-240

Step 2: Initialize HippoRAG with vLLM

hipporag = HippoRAG(
    save_dir=save_dir, 
    llm_model_name='meta-llama/Llama-3.1-8B-Instruct',
    llm_base_url='http://localhost:6578/v1',
    embedding_model_name='nvidia/NV-Embed-v2'
)

Reproducing Experiments

For reproducing published experiments, follow the structured workflow:

Dataset Structure

| File Type | Naming Convention | Purpose |
|---|---|---|
| Corpus | {dataset}_corpus.json | Document collection |
| Queries | {dataset}.json | Questions with answers |
| Output | outputs/{dataset}/ | Index and results |

Corpus JSON Format

[
  {
    "title": "FIRST PASSAGE TITLE",
    "text": "FIRST PASSAGE TEXT",
    "idx": 0
  },
  {
    "title": "SECOND PASSAGE TITLE",
    "text": "SECOND PASSAGE TEXT",
    "idx": 1
  }
]

Sources: README.md:100-125
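
If you prefer to index such a corpus file through the Python API instead of main.py, a minimal sketch could look like the following; the file path and the title/text concatenation are assumptions rather than the exact logic of main.py.

import json

from hipporag import HippoRAG

# Load a corpus file in the format shown above (path is illustrative).
with open("reproduce/dataset/sample_corpus.json") as f:
    corpus = json.load(f)

# Combine title and text for each passage before indexing.
docs = [f"{passage['title']}\n{passage['text']}" for passage in corpus]

hipporag = HippoRAG(
    save_dir="outputs/sample",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2",
)
hipporag.index(docs=docs)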

Running Experiments

# Set environment variables
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag

# Run with OpenAI model
dataset=sample
python main.py --dataset $dataset \
    --llm_base_url https://api.openai.com/v1 \
    --llm_name gpt-4o-mini \
    --embedding_name nvidia/NV-Embed-v2

Sources: main.py:1-35

Testing Your Installation

OpenAI Test

Verify installation with minimal OpenAI API cost:

export OPENAI_API_KEY=<your openai api key> 
conda activate hipporag
python tests_openai.py

Local Test with vLLM

Test with a locally deployed model:

export CUDA_VISIBLE_DEVICES=0
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

conda activate hipporag

# Start vLLM server with smaller model
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

# Run test
CUDA_VISIBLE_DEVICES=1 python tests_local.py

Sources: README.md:250-280

Configuration Parameters

Core Parameters

| Parameter | Default | Description |
|---|---|---|
| save_dir | outputs | Directory for saving all related information |
| llm_model_name | - | LLM model identifier |
| llm_base_url | - | Base URL for LLM API endpoint |
| embedding_model_name | nvidia/NV-Embed-v2 | Embedding model identifier |
| embedding_batch_size | 16 | Batch size for embedding model |

Sources: src/hipporag/utils/config_utils.py:50-80

Retrieval Parameters

| Parameter | Default | Description |
|---|---|---|
| retrieval_top_k | 200 | Number of documents to retrieve initially |
| linking_top_k | 5 | Number of linked nodes at each retrieval step |
| qa_top_k | 5 | Number of documents fed to the QA model |
| max_qa_steps | 1 | Maximum interleaved retrieval-reasoning steps |
| damping | 0.5 | Damping factor for Personalized PageRank |

Sources: src/hipporag/utils/config_utils.py:30-50

Graph Construction Parameters

| Parameter | Default | Description |
|---|---|---|
| synonymy_edge_topk | 2047 | K for KNN retrieval in synonymy edge building |
| synonymy_edge_sim_threshold | 0.8 | Similarity threshold for synonymy nodes |
| graph_type | facts_and_sim_passage_node_unidirectional | Type of graph structure to construct |
| is_directed_graph | False | Whether to build a directed graph |

Sources: src/hipporag/utils/config_utils.py:80-110

Troubleshooting

Common Issues

| Issue | Solution |
|---|---|
| CUDA OOM errors | Reduce gpu-memory-utilization or max_model_len in vLLM; reduce embedding_batch_size |
| Connection errors | Verify API endpoint URLs and network connectivity |
| Index loading failures | Check that save_dir contains valid index files |

Environment Validation

Always verify your setup before running experiments:

# Verify CUDA availability
python -c "import torch; print(torch.cuda.is_available())"

# Verify package installation
pip list | grep hipporag


Sources: README.md:150-165

Configuration System

Related topics: Installation and Setup, HippoRAG Core Class


HippoRAG provides a comprehensive configuration system built on Pydantic's data validation framework. The configuration system enables fine-grained control over all aspects of the indexing, retrieval, and QA pipeline while maintaining type safety and default values for common use cases.

Architecture Overview

The configuration system is centered around the BaseConfig class defined in config_utils.py. This class uses Pydantic's BaseModel with Field definitions to provide structured configuration with metadata and validation.

graph TD
    A[BaseConfig] --> B[OpenIE Configuration]
    A --> C[Embedding Configuration]
    A --> D[Graph Construction Configuration]
    A --> E[Retrieval Configuration]
    A --> F[QA Configuration]
    A --> G[Save/Directory Configuration]
    A --> H[Dataset Configuration]
    
    I[main.py] --> A
    J[HippoRAG class] --> A
    K[StandardRAG class] --> A

Source: src/hipporag/utils/config_utils.py:1-100

Core Configuration Class

BaseConfig

The BaseConfig class serves as the single source of truth for all pipeline parameters. It inherits from Pydantic's BaseModel and provides automatic validation, serialization, and documentation through field metadata.

from hipporag.utils.config_utils import BaseConfig

global_config = BaseConfig(
    openie_mode='openai_gpt',
    information_extraction_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
    retrieval_top_k=200,
    linking_top_k=5,
    max_qa_steps=3,
    qa_top_k=5,
    graph_type="facts_and_sim_passage_node_unidirectional",
    embedding_batch_size=8
)

Source: main.py:20-35

Configuration Categories

OpenIE (Open Information Extraction) Configuration

Controls the information extraction module that identifies facts and entities from passages.

| Parameter | Type | Default | Description |
|---|---|---|---|
| openie_mode | Literal["openai_gpt", "vllm_offline", "Transformers-offline"] | "openai_gpt" | Mode for the information extraction model |
| information_extraction_model_name | str | "gpt-4o-mini" | Model name for information extraction |

The openie_mode parameter supports three execution modes:

  • openai_gpt: Uses OpenAI's GPT models for extraction via API
  • vllm_offline: Uses locally deployed LLMs through vLLM server
  • Transformers-offline: Uses HuggingFace Transformers models directly

Source: src/hipporag/utils/config_utils.py:config_fields

Embedding Model Configuration

Manages embedding generation for passages and queries.

| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding_model_name | str | "nvidia/NV-Embed-v2" | Name of the embedding model |
| embedding_batch_size | int | 16 | Batch size for embedding generation |
| embedding_return_as_normalized | bool | True | Whether to normalize embeddings |
| embedding_max_seq_len | int | 2048 | Maximum sequence length for the embedding model |
| embedding_model_dtype | Literal["float16", "float32", "bfloat16", "auto"] | "auto" | Data type for the local embedding model |
| embedding_base_url | Optional[str] | None | Base URL for OpenAI-compatible embedding endpoints |

Source: src/hipporag/utils/config_utils.py:embedding_batch_size-def

Graph Construction Configuration

Controls the knowledge graph construction process that forms the backbone of HippoRAG's memory system.

| Parameter | Type | Default | Description |
|---|---|---|---|
| synonymy_edge_topk | int | 2047 | K value for KNN retrieval when building synonymy edges |
| synonymy_edge_query_batch_size | int | 1000 | Batch size for query embeddings during KNN retrieval |
| synonymy_edge_key_batch_size | int | 10000 | Batch size for key embeddings during KNN retrieval |
| synonymy_edge_sim_threshold | float | 0.8 | Similarity threshold for including candidate synonymy nodes |
| is_directed_graph | bool | False | Whether the constructed graph is directed or undirected |
| graph_type | str | "facts_and_sim_passage_node_unidirectional" | Type of graph structure to build |

Supported graph_type values include:

  • facts_and_sim_passage_node_unidirectional - Passages connected via facts with similarity edges
  • facts_and_sim_passage_node_bidirectional - Bidirectional passage connections
  • facts_only - Only fact-based connections
  • sim_passage_node - Only passage similarity connections

Source: src/hipporag/utils/config_utils.py:synonymy_edge_topk-def

Retrieval Configuration

Parameters governing the retrieval and linking process using Personalized PageRank (PPR).

| Parameter | Type | Default | Description |
|---|---|---|---|
| linking_top_k | int | 5 | Number of linked nodes at each retrieval step |
| retrieval_top_k | int | 200 | Number of documents to retrieve at each step |
| damping | float | 0.5 | Damping factor for the PPR algorithm |

The damping parameter controls the probability of following graph edges (rather than restarting at the query's seed nodes) during the PPR random walk. Higher values (closer to 1.0) let the walk explore farther from the seed nodes, while lower values keep probability mass concentrated near them.

Source: src/hipporag/utils/config_utils.py:linking_top_k-def, main.py:28
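
To build intuition for what damping does, the sketch below runs personalized PageRank on a toy graph with networkx; this is an illustration of the algorithm only, not HippoRAG's actual igraph-based implementation.

import networkx as nx

# Toy graph standing in for the knowledge graph.
G = nx.Graph()
G.add_edges_from([
    ("fact_a", "passage_1"),
    ("fact_a", "fact_b"),
    ("fact_b", "passage_2"),
])

# Restart (personalization) mass is placed on the node(s) linked to the query.
seeds = {"fact_a": 1.0}

# alpha plays the role of the damping parameter (HippoRAG's default is 0.5).
scores = nx.pagerank(G, alpha=0.5, personalization=seeds)
print(sorted(scores.items(), key=lambda kv: -kv[1]))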

QA (Question Answering) Configuration

Controls the iterative QA process that interleaves retrieval with reasoning.

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_qa_steps | int | 1 | Maximum steps for interleaved retrieval and reasoning |
| qa_top_k | int | 5 | Number of top documents fed to the QA model |

The max_qa_steps parameter enables multi-step reasoning where the system can retrieve additional documents based on intermediate reasoning results before producing the final answer.

Source: src/hipporag/utils/config_utils.py:max_qa_steps-def, main.py:27
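
Conceptually, the interleaved loop behaves like the sketch below; the retrieve and answer callables are placeholders for HippoRAG's internal retrieval and LLM calls, so treat this purely as an illustration of the control flow.

def iterative_qa(question, retrieve, answer, max_qa_steps=1, qa_top_k=5):
    """Conceptual sketch of interleaved retrieval and reasoning."""
    context = []
    final_answer = None
    for step in range(max_qa_steps):
        # Retrieve more documents conditioned on the question and reasoning so far.
        context.extend(retrieve(question, context)[:qa_top_k])
        final_answer, is_final = answer(question, context)
        if is_final:
            break
    return final_answer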

LLM Configuration

Manages the language model used for QA and information extraction.

| Parameter | Type | Default | Description |
|---|---|---|---|
| llm_model_name | str | "gpt-4o-mini" | Name of the LLM |
| llm_base_url | Optional[str] | None | Base URL for OpenAI-compatible LLM endpoints |
| max_new_tokens | Optional[int] | None | Maximum new tokens for generation |

Source: src/hipporag/utils/config_utils.py:llm_model_name-def

Save and Directory Configuration

Controls output persistence and directory structure.

| Parameter | Type | Default | Description |
|---|---|---|---|
| save_dir | str | "outputs" | Top-level directory for saving all related information |
| corpus_len | int | Required | Length of the corpus being processed |

The save_dir parameter specifies where HippoRAG objects, intermediate results, and evaluation outputs are stored. When running on a specific dataset, outputs are saved by default to a dataset-specific subdirectory under save_dir.

Source: src/hipporag/utils/config_utils.py:save_dir-def, main.py:32

Configuration Workflow

graph LR
    A[Define BaseConfig] --> B[Initialize HippoRAG]
    B --> C[Index Documents]
    C --> D[Run RAG QA]
    D --> E[Results Saved to save_dir]
    
    F[Modify Config] -->|Update| B
    G[New Documents] -->|Index| C

Initialization Example

from hipporag.utils.config_utils import BaseConfig
from hipporag import HippoRAG

config = BaseConfig(
    openie_mode='openai_gpt',
    information_extraction_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
    retrieval_top_k=200,
    linking_top_k=5,
    max_qa_steps=3,
    qa_top_k=5,
    graph_type="facts_and_sim_passage_node_unidirectional",
    embedding_batch_size=8,
    max_new_tokens=None,
    corpus_len=len(corpus),
)

hipporag = HippoRAG(global_config=config)

Source: main.py:19-38

Configuration for Different Execution Modes

OpenAI API Mode

config = BaseConfig(
    openie_mode='openai_gpt',
    information_extraction_model_name='gpt-4o-mini',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
)

Source: main.py:20-26

Local vLLM Deployment Mode

config = BaseConfig(
    openie_mode='vllm_offline',
    information_extraction_model_name='meta-llama/Llama-3.1-8B-Instruct',
    llm_model_name='meta-llama/Llama-3.3-70B-Instruct',
    llm_base_url='http://localhost:8000/v1',
    embedding_model_name='nvidia/NV-Embed-v2',
)

Source: README.md:vllm_example

Transformers Offline Mode

config = BaseConfig(
    openie_mode='Transformers-offline',
    information_extraction_model_name='Transformers/Qwen/Qwen2.5-7B-Instruct',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
)

Source: test_transformers.py:16-20

Testing with Configuration

The test suite demonstrates configuration usage across different scenarios:

# tests_openai.py - Basic indexing and QA
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# tests_openai.py - Document deletion
hipporag.delete(docs_to_delete)

# test_transformers.py - Transformers offline mode
hipporag = HippoRAG(
    global_config=global_config,
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
)

Source: tests_openai.py:test_structure, test_transformers.py:16-25

Package Dependencies

The configuration system depends on the following packages specified in setup.py:

| Package | Version | Purpose |
|---|---|---|
| torch | 2.5.1 | PyTorch backend for models |
| transformers | 4.45.2 | HuggingFace Transformers |
| pydantic | 2.10.4 | Data validation and settings |
| vllm | 0.6.6.post1 | LLM inference server |
| openai | 1.91.1 | OpenAI API client |
| litellm | 1.73.1 | Unified LLM interface |
| gritlm | 1.0.2 | GritLM embedding model |
| networkx | 3.4.2 | Graph operations |
| python_igraph | 0.11.8 | Graph algorithms |
| tiktoken | 0.7.0 | Tokenization |
| tenacity | 8.5.0 | Retry logic |

Source: setup.py:14-27

Best Practices

  1. Use environment variables for sensitive configuration such as API keys:

     export OPENAI_API_KEY=<your_openai_api_key>
     export HF_HOME=<path_to_huggingface_home>

  2. Set GPU devices before initialization:

     export CUDA_VISIBLE_DEVICES=0,1,2,3

  3. Adjust batch sizes based on available GPU memory when using local models.
  4. Configure the damping factor carefully for retrieval - higher values (0.7-0.85) work better for complex multi-hop questions.
  5. Set corpus_len correctly to enable proper progress tracking and memory management.

Source: https://github.com/OSU-NLP-Group/HippoRAG / Human Manual

HippoRAG Core Class

Related topics: Knowledge Graph and Retrieval, Embedding Models


Overview

HippoRAG is a neurobiologically inspired graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge for improved retrieval and question answering. The project implements two primary RAG classes: HippoRAG (neurobiologically inspired with Personal Knowledge Graph) and StandardRAG (traditional DPR-based approach).

Sources: setup.py:8-9

Architecture Overview

graph TB
    subgraph "Input Layer"
        Docs[Documents/Passages]
        Queries[User Queries]
    end
    
    subgraph "HippoRAG Core"
        Index[Indexing Pipeline]
        Retrieve[Retrieval Pipeline]
        QA[Question Answering]
    end
    
    subgraph "Knowledge Graph Construction"
        OpenIE[OpenIE Information Extraction]
        Embed[Embedding Model]
        GraphBuild[Graph Building]
    end
    
    subgraph "Backend Services"
        LLM[LLM Inference]
        EmbedModel[Embedding Service]
    end
    
    Docs --> Index
    Index --> OpenIE
    Index --> Embed
    OpenIE --> GraphBuild
    Embed --> GraphBuild
    GraphBuild --> KG[Knowledge Graph]
    
    Queries --> Retrieve
    Retrieve --> KG
    KG --> QA
    QA --> LLM
    Retrieve --> EmbedModel

Core Classes

HippoRAG Class

The HippoRAG class is the main entry point for the neurobiologically inspired RAG system. It extends a base RAG implementation with Personal Knowledge Graph (PKG) capabilities.

Initialization Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| save_dir | str | Required | Directory to save all related information |
| llm_model_name | str | Required | LLM model identifier (e.g., gpt-4o-mini) |
| embedding_model_name | str | Required | Embedding model name (e.g., nvidia/NV-Embed-v2) |
| global_config | BaseConfig | None | Full configuration object |
| llm_base_url | str | None | Custom LLM API endpoint for OpenAI-compatible models |
| embedding_base_url | str | None | Custom embedding API endpoint |
| azure_endpoint | str | None | Azure OpenAI endpoint for the LLM |
| azure_embedding_endpoint | str | None | Azure OpenAI endpoint for embeddings |

Sources: main.py:19-28

Basic Usage Pattern

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Index documents
hipporag.index(docs=documents_list)

# Retrieve and answer queries
results = hipporag.rag_qa(
    queries=query_list,
    gold_docs=expected_documents,
    gold_answers=expected_answers
)

StandardRAG Class

The StandardRAG class provides traditional Dense Passage Retrieval (DPR) based RAG without the Personal Knowledge Graph components. This is useful for baseline comparisons.

Sources: main_dpr.py:19

Configuration System

BaseConfig Parameters

The BaseConfig class (defined in src/hipporag/utils/config_utils.py) provides comprehensive configuration options:

OpenIE Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| openie_mode | str | Required | OpenIE mode: OpenAI, vllm-offline, or Transformers-offline |
| information_extraction_model_name | str | None | Model for offline OpenIE (e.g., Qwen/Qwen2.5-7B-Instruct) |

Embedding Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding_batch_size | int | 16 | Batch size for embedding model inference |
| embedding_return_as_normalized | bool | True | Whether to normalize embeddings |
| embedding_max_seq_len | int | 2048 | Maximum sequence length for embedding |
| embedding_model_dtype | str | "auto" | Data type: float16, float32, bfloat16, or auto |

Graph Construction Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| synonymy_edge_topk | int | 2047 | K value for KNN retrieval in synonymy edge construction |
| synonymy_edge_query_batch_size | int | 1000 | Batch size for query embeddings |
| synonymy_edge_key_batch_size | int | 10000 | Batch size for key embeddings |
| synonymy_edge_sim_threshold | float | 0.8 | Similarity threshold for synonymy edges |
| is_directed_graph | bool | False | Whether the graph is directed |

Retrieval Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| retrieval_top_k | int | 200 | Number of documents to retrieve initially |
| linking_top_k | int | 5 | Number of linked nodes at each retrieval step |
| damping | float | 0.5 | Damping factor for Personalized PageRank |

QA Configuration

| Parameter | Type | Default | Description |
|---|---|---|---|
| max_qa_steps | int | 1 | Maximum interleaved retrieval and reasoning steps |
| qa_top_k | int | 5 | Top-k documents fed to the QA model |

Sources: src/hipporag/utils/config_utils.py:1-80

Core Methods

Indexing Pipeline

graph LR
    A[Documents] --> B[Passage Embedding]
    B --> C[OpenIE Extraction]
    C --> D[Fact Node Creation]
    D --> E[Similarity Edge Building]
    E --> F[Knowledge Graph]

Method Signature

def index(self, docs: List[str], **kwargs) -> None

The indexing process:

  1. Embeds passages using the configured embedding model
  2. Runs OpenIE to extract factual triples from each passage
  3. Constructs fact nodes and passage nodes in the knowledge graph
  4. Builds synonymy edges based on embedding similarity
  5. Persists the graph structure to save_dir

RAG QA Pipeline

graph TD
    Q[Query] --> EP[Embedding]
    EP --> PPR[Personalized PageRank]
    PPR --> LN[Linked Nodes]
    LN --> LLM[LLM Reasoning]
    LLM -->|Iteration| Check{More Steps?}
    Check -->|Yes| EP
    Check -->|No| Final[Final Answer]

Method Signature

def rag_qa(
    self,
    queries: List[str],
    gold_docs: Optional[List[List[str]]] = None,
    gold_answers: Optional[List[List[str]]] = None,
    **kwargs
) -> Dict

Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| queries | List[str] | Yes | List of questions to answer |
| gold_docs | List[List[str]] | No | Ground-truth documents for evaluation |
| gold_answers | List[List[str]] | No | Ground-truth answers for evaluation |

Returns

A dictionary containing evaluation metrics and retrieved results.

Document Deletion

def delete(self, docs_to_delete: List[str]) -> None

Removes specified documents from the knowledge graph and updates persistence.
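
For example, mirroring the deletion test in tests_openai.py:

# Remove previously indexed passages; the graph and stored embeddings are updated.
docs_to_delete = [
    "Tom Hort's birthplace is Montebello.",
    "Sam Hort's birthplace is Montebello."
]
hipporag.delete(docs_to_delete)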

Supported Backend Models

LLM Backends

| Backend | Configuration | Example Model |
|---|---|---|
| OpenAI | llm_model_name | gpt-4o-mini |
| Azure OpenAI | azure_endpoint | Azure deployment URL |
| vLLM (Local) | llm_base_url + vLLM server | meta-llama/Llama-3.1-8B-Instruct |
| OpenAI-Compatible | llm_model_name + llm_base_url | Custom endpoint |

Sources: README.md:80-95

Embedding Models

| Model Type | Configuration | Notes |
|---|---|---|
| NV-Embed-v2 | embedding_model_name='nvidia/NV-Embed-v2' | Recommended |
| GritLM | embedding_model_name='GritLM' | Supported |
| Contriever | embedding_model_name='Contriever' | Supported |
| Azure Embeddings | azure_embedding_endpoint | Via Azure OpenAI |
| Custom OpenAI-Compatible | embedding_base_url | Any compatible endpoint |

OpenIE Modes

HippoRAG supports three OpenIE (Open Information Extraction) modes:

| Mode | Description | Use Case |
|---|---|---|
| OpenAI | Uses OpenAI GPT models for extraction | Cloud-based, high quality |
| vllm-offline | Uses locally deployed vLLM models | GPU-equipped servers |
| Transformers-offline | Uses HuggingFace Transformers | CPU or limited GPU |

Sources: test_transformers.py:20-22

Workflow Example

from hipporag import HippoRAG

# Initialize
hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Prepare data
docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Cinderella attended the royal ball."
]

# Index
hipporag.index(docs=docs)

# Query
queries = ["What is George Rankin's occupation?"]
answers = [["Politician"]]
gold_docs = [["George Rankin is a politician."]]

# Retrieve and evaluate
results = hipporag.rag_qa(
    queries=queries,
    gold_docs=gold_docs,
    gold_answers=answers
)

Graph Types

The framework supports configurable graph structures:

| Graph Type | Description |
|---|---|
| facts_and_sim_passage_node_unidirectional | Facts with similarity-based passage connections (default) |

Graph edges include:

  • Fact-to-Fact edges: Created from OpenIE extractions
  • Synonymy edges: Based on embedding similarity above threshold
  • Passage edges: Connect passages to their extracted facts

Dependencies

Key package dependencies managed in setup.py:

| Package | Version | Purpose |
|---|---|---|
| torch | 2.5.1 | Deep learning framework |
| transformers | 4.45.2 | Model architectures |
| vllm | 0.6.6.post1 | LLM inference |
| openai | 1.91.1 | OpenAI API client |
| gritlm | 1.0.2 | GritLM embedding model |
| networkx | 3.4.2 | Graph operations |
| python_igraph | 0.11.8 | Graph algorithms |
| pydantic | 2.10.4 | Configuration validation |
| tiktoken | 0.7.0 | Tokenization |

Sources: setup.py:15-30

Error Handling

The framework uses tenacity for retry mechanisms with configurable backoff strategies when interacting with external APIs (OpenAI, Azure, vLLM).

Persistence

All indexed data is persisted to the save_dir directory with the following structure:

save_dir/
└── {llm_model_name}_{embedding_model_name}/
    ├── knowledge_graph.pkl       # Serialized graph
    ├── passages.pkl              # Passage embeddings
    ├── fact_nodes.pkl            # Extracted facts
    └── config.json                # Configuration snapshot

Sources: setup.py:8-9

Knowledge Graph and Retrieval

Related topics: Embedding Store and Management, LLM Integrations


Overview

HippoRAG implements a neurobiologically inspired retrieval system that combines knowledge graph construction with advanced retrieval algorithms. The system is designed to enable LLMs to identify and leverage connections within new knowledge for improved retrieval performance. Sources: setup.py:8

The Knowledge Graph and Retrieval module forms the core of HippoRAG's architecture, providing mechanisms to:

  • Extract factual knowledge from text passages using Open Information Extraction (OpenIE)
  • Construct heterogeneous graphs with multiple node and edge types
  • Perform personalized PageRank (PPR) based retrieval over the constructed graphs
  • Support incremental updates and document deletion operations

Sources: src/hipporag/utils/config_utils.py:48-72

Architecture

High-Level System Design

HippoRAG's retrieval system integrates several key components working in concert to provide accurate and efficient knowledge retrieval:

graph TD
    A[Input Documents] --> B[OpenIE Processing]
    B --> C[Knowledge Graph Construction]
    C --> D[Embedding Generation]
    D --> E[Synonymy Edge Building]
    C --> F[Hybrid Graph]
    
    G[Query Input] --> H[Query Embedding]
    H --> I[Personalized PageRank]
    I --> F
    F --> J[Retrieval Results]
    J --> K[Reranking]
    K --> L[Final QA Output]
    
    M[LLM Inference] --> L

Graph Construction Pipeline

The graph construction process transforms raw text into a structured knowledge representation:

graph LR
    A[Passages] --> B[OpenIE Extractor]
    B --> C[Triplets/Entities]
    C --> D[Fact Nodes]
    
    E[Passages] --> F[Embedding Model]
    F --> G[Passage Embeddings]
    G --> H[Passage Nodes]
    
    D --> I[Passage-Fact Edges]
    H --> I
    
    G --> J[Synonymy Edges]
    J --> K[knn Retrieval]
    K --> L[Similarity Threshold Filter]
    L --> M[Synonymy Edge Network]

Knowledge Graph Components

Node Types

| Node Type | Description | Attributes |
|---|---|---|
| Passage Nodes | Represent original text passages | idx, title, text, embedding |
| Fact Nodes | Extracted facts/triplets from OpenIE | subject, predicate, object, embedding |

Edge Types

| Edge Type | Source | Target | Purpose |
|---|---|---|---|
| Passage-to-Fact | Passage Node | Fact Node | Links passages to their extracted facts |
| Fact-to-Fact | Fact Node | Fact Node | Connects semantically related facts |
| Synonymy | Passage Node | Passage Node | Links passages with high semantic similarity |
| Bidirectional | Both | Both | Full edge in both directions |

Sources: src/hipporag/utils/config_utils.py:70-85

Graph Types Configuration

The system supports multiple graph configurations via the graph_type parameter:

| Graph Type | Description |
|---|---|
| facts_and_sim_passage_node_unidirectional | Facts + similar passage nodes, unidirectional edges |
| facts_and_sim_passage_node_bidirectional | Facts + similar passage nodes, bidirectional edges |
| Custom types | Extensible graph construction patterns |

Sources: main.py:18

Retrieval Process

Personalized PageRank (PPR) Algorithm

HippoRAG uses Personalized PageRank for graph-based retrieval, which allows queries to propagate through the knowledge graph to identify relevant nodes.

graph TD
    A[Query] --> B[Query Embedding]
    B --> C[Initial PPR Scores]
    C --> D[Graph Propagation]
    D --> E{Iteration}
    E -->|Continue| F[Score Aggregation]
    F --> D
    E -->|Converge| G[Top-K Selection]
    G --> H[Linked Nodes]
    
    I[damping factor: 0.5] --> D
    J[linking_top_k: 5] --> G

Retrieval Configuration Parameters

| Parameter | Default | Description |
|---|---|---|
| retrieval_top_k | 200 | Number of documents retrieved at each step |
| linking_top_k | 5 | Number of linked nodes at each retrieval step |
| damping | 0.5 | Damping factor for the PPR algorithm |
| qa_top_k | 5 | Top-k documents fed to the QA model |

Sources: src/hipporag/utils/config_utils.py:60-72

Synonymy Edge Construction

Synonymy edges connect passages with high semantic similarity, enabling cross-document retrieval:

graph TD
    A[All Passage Embeddings] --> B[KNN Retrieval]
    B --> C[Top-K Candidates]
    C --> D{Similarity > Threshold?}
    D -->|Yes| E[Create Synonymy Edge]
    D -->|No| F[Discard]
    E --> G[Synonymy Edge Network]

#### Synonymy Edge Parameters

| Parameter | Default | Description |
|---|---|---|
| synonymy_edge_topk | 2047 | K for KNN retrieval when building synonymy edges |
| synonymy_edge_query_batch_size | 1000 | Batch size for query embeddings |
| synonymy_edge_key_batch_size | 10000 | Batch size for key embeddings |
| synonymy_edge_sim_threshold | 0.8 | Similarity threshold for candidate synonymy nodes |

Sources: src/hipporag/utils/config_utils.py:73-85
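
The sketch below shows the thresholded kNN step in its simplest form, assuming the passage embeddings are already L2-normalized; batching and the actual graph edge insertion are omitted, so this is an illustration rather than the project's implementation.

import numpy as np

def synonymy_candidates(embeddings: np.ndarray, topk: int = 2047, threshold: float = 0.8):
    """Return (i, j, score) pairs whose cosine similarity exceeds the threshold."""
    sims = embeddings @ embeddings.T          # dot product == cosine for unit vectors
    np.fill_diagonal(sims, -1.0)              # ignore self-similarity
    edges = []
    for i, row in enumerate(sims):
        for j in np.argsort(-row)[:topk]:     # kNN candidates for node i
            if row[j] >= threshold:
                edges.append((i, int(j), float(row[j])))
    return edges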

Embedding Integration

Embedding Model Configuration

| Parameter | Default | Description |
|---|---|---|
| embedding_model_name | - | Name of the embedding model |
| embedding_batch_size | 16 | Batch size for embedding calls |
| embedding_return_as_normalized | True | Whether to normalize embeddings |
| embedding_max_seq_len | 2048 | Maximum sequence length |
| embedding_model_dtype | auto | Data type for local models (float16/float32/bfloat16/auto) |

Sources: src/hipporag/utils/config_utils.py:40-54

Supported Embedding Models

The system integrates with multiple embedding model providers:

  • NV-Embed-v2: NVIDIA's embedding model
  • GritLM: GritLM embedding model
  • Contriever: Facebook's dense retriever
  • OpenAI Compatible: Any OpenAI-compatible embedding endpoint
  • Azure OpenAI: Azure-hosted embedding models

Reranking Module

After initial retrieval, HippoRAG applies reranking to improve result quality. The reranking module reorders retrieved candidates using additional scoring mechanisms.

graph LR
    A[Retrieved Candidates] --> B[Reranker Model]
    B --> C[Relevance Scores]
    C --> D[Ranked Results]
    D --> E[Top Results]

Sources: src/hipporag/rerank.py

QA Integration

Multi-Step Retrieval and Reasoning

HippoRAG supports interleaved retrieval and reasoning with configurable steps:

| Parameter | Default | Description |
|---|---|---|
| max_qa_steps | 1 | Maximum steps for interleaved retrieval and reasoning |
| qa_top_k | 5 | Number of documents for the QA model to process |

Sources: src/hipporag/utils/config_utils.py:68-72

QA Pipeline Flow

graph TD
    A[Query] --> B[QA Step 1]
    B --> C[Retrieval]
    C --> D[Read Documents]
    D --> E{More Steps Needed?}
    E -->|Yes| F[Update Context]
    F --> B
    E -->|No| G[Final Answer]
    
    H[gold_docs] --> I[Evaluation]
    I --> J[Metrics]
    J --> K[Recall, EM, F1]

Data Formats

Corpus JSON Structure

[
  {
    "title": "PASSAGE TITLE",
    "text": "PASSAGE TEXT",
    "idx": 0
  }
]

Query JSON Structure

[
  {
    "id": "question_id",
    "question": "QUESTION TEXT",
    "answer": ["ANSWER"],
    "answerable": true,
    "paragraphs": [
      {
        "title": "SUPPORTING TITLE",
        "text": "SUPPORTING TEXT",
        "is_supporting": true,
        "idx": 0
      }
    ]
  }
]
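
The fields in this format map onto rag_qa's arguments roughly as in the sketch below; the file path and the title/text join are assumptions for illustration.

import json

with open("reproduce/dataset/sample.json") as f:
    samples = json.load(f)

queries = [s["question"] for s in samples]
gold_answers = [s["answer"] for s in samples]
gold_docs = [
    [f"{p['title']}\n{p['text']}" for p in s["paragraphs"] if p.get("is_supporting")]
    for s in samples
]

results = hipporag.rag_qa(queries=queries, gold_docs=gold_docs, gold_answers=gold_answers)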

Usage Examples

Basic Retrieval with HippoRAG

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# Index documents
docs = [
    "Oliver Badman is a politician.",
    "George Rankin is a politician.",
    "Erik Hort's birthplace is Montebello.",
    "Montebello is a part of Rockland County."
]

hipporag.index(docs)

# Query with evaluation
queries = ["What is George Rankin's occupation?"]
gold_docs = [["George Rankin is a politician."]]
answers = [["Politician"]]

results = hipporag.rag_qa(
    queries=queries,
    gold_docs=gold_docs,
    gold_answers=answers
)

Sources: README.md:Quick_Start, tests_openai.py:22-60

Incremental Updates

# Add new documents
new_docs = [
    "Tom Hort's birthplace is Montebello.",
    "Sam Hort's birthplace is Montebello."
]
hipporag.index(docs=new_docs)

# Delete documents
docs_to_delete = [
    "Tom Hort's birthplace is Montebello.",
    "Sam Hort's birthplace is Montebello."
]
hipporag.delete(docs_to_delete)

Sources: tests_openai.py:61-82

Evaluation Metrics

The retrieval system is evaluated using standard information retrieval metrics:

| Metric | Description |
|---|---|
| Recall@k | Fraction of relevant documents found in the top-k results |
| EM | Exact Match accuracy |
| F1 | Harmonic mean of precision and recall |
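
For reference, these metrics are conventionally computed along the lines of the generic sketch below; this is not HippoRAG's evaluation code, which may normalize answers differently.

def exact_match(prediction: str, gold: str) -> int:
    return int(prediction.strip().lower() == gold.strip().lower())

def f1(prediction: str, gold: str) -> float:
    pred_tokens, gold_tokens = prediction.lower().split(), gold.lower().split()
    common = set(pred_tokens) & set(gold_tokens)
    if not common:
        return 0.0
    precision = len(common) / len(pred_tokens)
    recall = len(common) / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

def recall_at_k(retrieved: list, relevant: list, k: int) -> float:
    return len(set(retrieved[:k]) & set(relevant)) / max(len(relevant), 1)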

Summary

The Knowledge Graph and Retrieval module in HippoRAG provides a sophisticated pipeline for:

  1. Knowledge Extraction: Using OpenIE to extract factual triplets from text
  2. Graph Construction: Building heterogeneous graphs with passage nodes, fact nodes, and multiple edge types
  3. Synonymy Discovery: Creating semantic links between similar passages via embedding similarity
  4. PPR-based Retrieval: Performing personalized PageRank for graph-aware document retrieval
  5. Reranking: Refining retrieval results for improved accuracy
  6. Incremental Updates: Supporting document additions and deletions

This architecture enables HippoRAG to perform complex associativity and multi-hop reasoning tasks that traditional vector similarity retrieval cannot accomplish effectively.

Sources: src/hipporag/utils/config_utils.py:48-72

Embedding Store and Management

Related topics: LLM Integrations, Embedding Models


Overview

The Embedding Store and Management system in HippoRAG provides a unified interface for encoding text passages into vector embeddings, managing these embeddings throughout the indexing and retrieval lifecycle, and supporting multiple embedding model backends including NVIDIA NV-Embed-v2, GritLM, and Contriever. The system is designed to handle batch processing of documents with configurable parameters for sequence length, data type precision, and normalization behavior.

HippoRAG's embedding management is tightly integrated with the knowledge graph construction process, where embeddings serve dual purposes: enabling semantic similarity search for passage linking and powering the retrieval phase through Personalized PageRank (PPR) algorithms. The embedding store abstracts away the underlying model implementation details, allowing the framework to switch between different embedding providers without changing the core indexing and retrieval logic.

Sources: src/hipporag/utils/config_utils.py:1-50

Architecture

High-Level Components

The embedding system consists of three primary layers that work together to provide embedding services throughout the HippoRAG pipeline.

The Model Layer contains implementations for specific embedding models, each inheriting from a common base class that enforces a consistent interface. Currently supported models include NV-Embed-v2, GritLM, and Contriever, with the architecture supporting easy extension to additional models. Each model implementation handles the specific requirements of its underlying transformer architecture, including tokenizer configuration, padding strategies, and model-specific inference optimizations.

The Utility Layer provides helper functions for common embedding operations such as batch processing, embedding normalization, and similarity computation. These utilities ensure consistent handling of embeddings across different contexts and help optimize memory usage during large-scale indexing operations.

The Configuration Layer defines the parameters that control embedding behavior, including batch sizes, sequence length limits, and model-specific settings. This layer connects the embedding system to HippoRAG's global configuration management, allowing users to customize embedding behavior without modifying code.

graph TD
    A[Documents] --> B[Embedding Store]
    B --> C[Model Layer<br/>NV-Embed-v2<br/>GritLM<br/>Contriever]
    B --> D[Utility Layer<br/>Batch Processing<br/>Normalization]
    C --> E[Vector Storage]
    D --> E
    E --> F[Graph Construction]
    E --> G[Retrieval Phase]

Sources: src/hipporag/embedding_store.py:1-30

Data Flow

During the indexing phase, documents are first processed by the embedding store to generate passage vectors. These vectors are stored alongside the passage metadata and serve as the foundation for graph construction. The embedding store processes passages in configurable batch sizes to balance memory usage and throughput, with the default batch size set to 16 documents per batch.

During the retrieval phase, incoming queries are encoded using the same embedding model to produce a query vector. This query vector is then used for similarity computation against the indexed passage vectors, enabling semantic matching between the query intent and stored knowledge. The retrieval system can perform k-nearest neighbor (kNN) searches over the embedding space to identify candidate passages for further processing.

graph LR
    A[Indexing Flow] --> B[Input Documents]
    B --> C[Batch Processing<br/>batch_size=16]
    C --> D[Embedding Encoding]
    D --> E[Normalized Vectors]
    E --> F[Vector Storage]
    
    G[Retrieval Flow] --> H[Query Text]
    H --> I[Query Encoding]
    I --> J[Similarity Search]
    J --> K[kNN Retrieval<br/>top-k candidates]
    K --> L[Ranked Passages]

Sources: src/hipporag/utils/embed_utils.py:1-25
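
A schematic version of that batched encode-and-normalize loop is sketched below; encode_batch stands in for whichever embedding model backend is configured, so the helper name is an assumption.

import numpy as np

def encode_corpus(texts, encode_batch, batch_size=16, normalize=True):
    """encode_batch: any callable mapping a list of strings to an (n, d) array."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        emb = np.asarray(encode_batch(texts[start:start + batch_size]))
        if normalize:
            emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # L2-normalize rows
        vectors.append(emb)
    return np.vstack(vectors)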

Configuration Parameters

The embedding system is controlled through several configuration parameters defined in the global configuration structure. These parameters allow fine-tuning of embedding behavior for different hardware configurations and use cases.

| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding_batch_size | int | 16 | Number of documents processed in each embedding batch |
| embedding_return_as_normalized | bool | true | Whether to L2-normalize output embeddings |
| embedding_max_seq_len | int | 2048 | Maximum sequence length in tokens for the embedding model |
| embedding_model_dtype | Literal | "auto" | Data type for local embedding models: float16, float32, bfloat16, or auto |
| embedding_model_name | str | varies | Identifier for the embedding model (e.g., "nvidia/NV-Embed-v2") |
| embedding_base_url | str | None | Base URL for OpenAI-compatible embedding endpoints |
| synonymy_edge_topk | int | 2047 | k value for kNN retrieval when building synonymy edges |
| synonymy_edge_sim_threshold | float | 0.8 | Minimum similarity threshold for synonymy edge candidates |

Sources: src/hipporag/utils/config_utils.py:15-40

Embedding Model Interface

Base Class Contract

All embedding models must inherit from BaseEmbeddingModel, which defines the core interface that HippoRAG expects. The base class enforces implementation of the __call__ method that accepts text inputs and returns embeddings, ensuring polymorphism across different model implementations.

The base class also defines the EmbeddingConfig dataclass that encapsulates model-specific settings. This configuration includes the model name, batch size, maximum sequence length, and data type settings. The configuration object is passed to the embedding model during initialization and can be modified to adjust model behavior without recreating the model instance.
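
Pictured as code, that contract looks roughly like the sketch below; the field names inside EmbeddingConfig are illustrative rather than the exact attribute names used in the source.

from abc import ABC, abstractmethod
from dataclasses import dataclass

import numpy as np

@dataclass
class EmbeddingConfig:
    # Illustrative fields mirroring the documented configuration options.
    model_name: str = "nvidia/NV-Embed-v2"
    batch_size: int = 16
    max_seq_len: int = 2048
    dtype: str = "auto"

class BaseEmbeddingModel(ABC):
    def __init__(self, config: EmbeddingConfig):
        self.config = config

    @abstractmethod
    def __call__(self, texts: list[str]) -> np.ndarray:
        """Encode a list of texts into a (len(texts), dim) embedding matrix."""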

Supported Models

NV-Embed-v2 is the primary embedding model recommended for production use, developed by NVIDIA. It provides high-quality sentence embeddings optimized for retrieval tasks. The model is accessed through HuggingFace and supports automatic device placement based on available GPU resources.

GritLM provides an alternative embedding approach that combines retrieval and generation capabilities. It can serve both as an embedding model and as a decoder for generation tasks, offering flexibility in deployment configurations.

Contriever is an open-source bi-encoder model for dense retrieval, useful for scenarios requiring a completely open-source embedding solution without proprietary dependencies.

Sources: src/hipporag/embedding_model/__init__.py:1-20

Embedding Store API

Initialization

The embedding store is typically instantiated through the main HippoRAG class rather than directly. When creating a HippoRAG instance, the embedding model name and optional endpoint configuration are passed as parameters:

hipporag = HippoRAG(
    save_dir="outputs",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="nvidia/NV-Embed-v2"
)

For OpenAI-compatible embedding endpoints, the base URL can be specified:

hipporag = HippoRAG(
    save_dir="outputs",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="text-embedding-3-small",
    embedding_base_url="https://api.openai.com/v1"
)

Sources: README.md:1-50

Encoding Operations

The embedding store provides batch encoding capabilities for processing multiple documents efficiently. The encoding operation returns normalized embeddings by default, which is required for proper similarity computation during retrieval. The normalization is L2 normalization, ensuring that all embedding vectors have unit length.
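
The reason unit length matters is that the dot product of two L2-normalized vectors equals their cosine similarity, as this small check illustrates:

import numpy as np

a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

a_hat = a / np.linalg.norm(a)   # unit length
b_hat = b / np.linalg.norm(b)

cosine = a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(a_hat @ b_hat, cosine)   # dot product of unit vectors == cosine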

For Azure OpenAI deployments, specialized endpoint parameters are supported:

hipporag = HippoRAG(
    save_dir="save_dir",
    llm_model_name="gpt-4o-mini",
    embedding_model_name="text-embedding-3-small",
    azure_endpoint="https://[ENDPOINT].openai.azure.com/...",
    azure_embedding_endpoint="https://[ENDPOINT].openai.azure.com/..."
)

Sources: demo_azure.py:1-30

Integration with Knowledge Graph

The embedding system plays a critical role in HippoRAG's knowledge graph construction phase. After passages are indexed and encoded, the embeddings are used for two key graph-related operations.

Synonymy Edge Construction uses embeddings to identify semantically similar passage pairs that should be connected in the knowledge graph. The system performs k-nearest neighbor searches over the passage embedding space, where the synonymy_edge_topk parameter controls how many candidates are considered for each passage. The synonymy_edge_sim_threshold parameter filters these candidates, with only pairs exceeding the similarity threshold being connected as synonymy edges.

Retrieval-Graph Linking during the PPR retrieval process uses passage embeddings to establish the connection between the query and the knowledge graph. The query embedding enables the system to identify the most relevant starting nodes in the graph for the random walk algorithm.
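
The following sketch shows the general shape of the synonymy-edge search under these two parameters; it is a brute-force NumPy illustration, not the batched implementation in the embedding utilities:

import numpy as np

def synonymy_edge_candidates(embeddings: np.ndarray, topk: int = 2047, threshold: float = 0.8):
    # embeddings are assumed L2-normalized, so a matrix product gives cosine similarities.
    sims = embeddings @ embeddings.T
    np.fill_diagonal(sims, -1.0)  # ignore self-matches
    edges = []
    for i, row in enumerate(sims):
        for j in np.argsort(row)[::-1][:topk]:   # synonymy_edge_topk nearest neighbors
            if row[j] >= threshold:              # synonymy_edge_sim_threshold filter
                edges.append((i, int(j), float(row[j])))
    return edges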

Sources: src/hipporag/utils/config_utils.py:30-45

Memory Management and Optimization

Batch Processing Strategy

The embedding store implements batch processing to optimize GPU memory utilization and throughput. The batch size is configurable via embedding_batch_size with a default of 16, meaning 16 documents are processed simultaneously during encoding. For systems with larger GPU memory, increasing this value can significantly improve indexing performance.
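
As a hedged example, a larger batch size could be passed through the global configuration object; the BaseConfig import path below is assumed, and all other fields are left at their defaults:

from hipporag import HippoRAG
from hipporag.utils.config_utils import BaseConfig  # import path assumed

config = BaseConfig(
    embedding_batch_size=64,  # larger batches for GPUs with more free memory
)
hipporag = HippoRAG(global_config=config)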

The system also supports separate batch sizes for the synonymy edge construction phase. The synonymy_edge_query_batch_size (default 1000) controls how many passage embeddings are queried at once during kNN search, while synonymy_edge_key_batch_size (default 10000) controls the key batch size for the search index.

Data Type Selection

The embedding_model_dtype parameter allows selection of the precision for local embedding models. The "auto" setting allows the system to select an appropriate default based on the hardware and model. Available options include float16 for memory-constrained environments, float32 for maximum precision, and bfloat16 which offers a good balance of range and memory efficiency on newer GPUs.
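
How "auto" resolves is implementation-specific; the sketch below is only one plausible policy for mapping these strings to torch dtypes, not the loader's actual logic:

import torch

def resolve_embedding_dtype(name: str) -> torch.dtype:
    # Illustrative policy: prefer bfloat16 where supported, otherwise fall back.
    mapping = {"float16": torch.float16, "float32": torch.float32, "bfloat16": torch.bfloat16}
    if name != "auto":
        return mapping[name]
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float16 if torch.cuda.is_available() else torch.float32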

Sources: src/hipporag/utils/config_utils.py:25-35

Error Handling and Resilience

The embedding system is designed with error handling patterns compatible with HippoRAG's overall resilience strategy. Batch processing allows partial failures to be identified and retried without losing all progress. The configuration system supports specifying fallback models or endpoints for production deployments requiring high availability.

Tenacity is used for retry logic in the embedding utilities, ensuring transient network failures or temporary service unavailability do not cause complete pipeline failures. This is particularly important when using remote embedding endpoints that may experience temporary connectivity issues.

Sources: setup.py:1-30

Performance Considerations

When optimizing HippoRAG for production deployment, the embedding configuration should be tuned based on the available hardware and expected workload characteristics. The primary tuning parameters include batch size for indexing throughput, sequence length limits for handling long documents, and data type selection for memory-constrained environments.

For maximum retrieval quality, the default normalization behavior should be maintained as it ensures consistent similarity computation across the retrieval pipeline. Disabling normalization may lead to suboptimal retrieval results as the similarity metrics assume unit-normalized vectors.

Sources: src/hipporag/utils/config_utils.py:18-22

The embedding system interacts closely with several other HippoRAG components. The Information Extraction module uses embeddings for processing extracted facts, the retrieval module depends on embeddings for kNN search and PPR initialization, and the evaluation module uses embeddings for computing retrieval metrics such as recall and MRR.

The embedding model implementations in src/hipporag/embedding_model/ follow a consistent interface defined in base.py, allowing the embedding store to work with any model that adheres to this contract.

Sources: src/hipporag/utils/config_utils.py:1-50

LLM Integrations

Related topics: Embedding Models, Deployment Options

Section Related Pages

Continue reading this section for the full explanation and source context.

Section OpenAI Models

Continue reading this section for the full explanation and source context.

Section vLLM Local Deployment

Continue reading this section for the full explanation and source context.

Section AWS Bedrock

Continue reading this section for the full explanation and source context.

Related topics: Embedding Models, Deployment Options

LLM Integrations

HippoRAG provides a flexible, pluggable architecture for integrating various Large Language Model (LLM) providers. This modular design enables the framework to support multiple inference backends including OpenAI, vLLM for local deployment, and AWS Bedrock, allowing researchers and developers to choose the most appropriate LLM backend for their specific use case and infrastructure requirements.

Architecture Overview

The LLM integration system follows a strategy pattern where a base abstract class defines the interface contract, and concrete implementations handle provider-specific details. This design ensures that the core HippoRAG logic remains independent of any particular LLM vendor while maintaining the ability to leverage specialized features offered by different providers.

graph TD
    A[HippoRAG Core] --> B[LLM Base Class]
    B --> C[OpenAIGPT]
    B --> D[VLLMOffline]
    B --> E[BedrockLLM]
    B --> F[Custom LLM Adapter]
    
    C --> G[OpenAI API]
    D --> H[Local vLLM Server]
    E --> I[AWS Bedrock]

The BaseLLM abstract class in src/hipporag/llm/base.py defines the common interface that all LLM adapters must implement, ensuring consistent behavior across different providers.

Supported LLM Providers

OpenAI Models

HippoRAG supports all OpenAI chat completion models through the OpenAIGPT class. This integration allows users to leverage the GPT family of models for both information extraction and question answering tasks.

Configuration Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model_name | string | required | OpenAI model identifier (e.g., gpt-4o-mini, gpt-4o) |
| api_key | string | env OPENAI_API_KEY | OpenAI API authentication key |
| base_url | string | https://api.openai.com/v1 | API endpoint base URL |
| max_tokens | int | None | Maximum tokens in generated response |
| temperature | float | 0.0 | Sampling temperature for generation |

Usage Example:

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

Sources: README.md:67-72

vLLM Local Deployment

For scenarios requiring local inference, HippoRAG supports vLLM-deployed models through the VLLMOffline class. This approach is particularly useful for privacy-sensitive applications, cost reduction at scale, or when working with custom fine-tuned models.

Server Setup:

export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

Sources: README.md:93-101

Configuration Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model_name | string | required | Model identifier for vLLM server |
| base_url | string | required | vLLM server endpoint URL |
| openie_mode | string | "online" | Mode for OpenIE processing (online or offline) |
| max_tokens | int | None | Maximum tokens in generated response |
| temperature | float | 0.0 | Sampling temperature for generation |

Offline Mode for OpenIE:

The vLLM integration supports an offline mode where OpenIE extraction runs separately from the main pipeline. This is useful for debugging or when OpenIE results can be cached and reused.

python main.py \
    --dataset sample \
    --llm_name meta-llama/Llama-3.3-70B-Instruct \
    --openie_mode offline \
    --skip_graph

Sources: README.md:130-135

AWS Bedrock

HippoRAG integrates with AWS Bedrock through the BedrockLLM class, enabling access to various foundation models hosted on AWS infrastructure. This integration is designed for enterprise deployments requiring scalable, managed LLM services.

Configuration Parameters:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| model_name | string | required | Bedrock model identifier |
| aws_region | string | "us-east-1" | AWS region for Bedrock endpoint |
| max_tokens | int | None | Maximum tokens in generated response |
| temperature | float | 0.0 | Sampling temperature for generation |

Azure OpenAI

For enterprise users with Azure OpenAI deployments, HippoRAG provides direct integration with Azure endpoints.

Configuration Example:

hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    embedding_model_name='embedding-model-name',
    azure_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
    azure_embedding_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)

Sources: demo_azure.py:16-21

Base LLM Interface

All LLM adapters inherit from the BaseLLM abstract class, which defines the core contract for LLM interactions.

classDiagram
    class BaseLLM {
        <<abstract>>
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
        +get_model_name() str
    }
    
    class OpenAIGPT {
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
    }
    
    class VLLMOffline {
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
    }
    
    class BedrockLLM {
        +generate(prompt: str) str
        +batch_generate(prompts: List[str]) List[str]
    }
    
    BaseLLM <|-- OpenAIGPT
    BaseLLM <|-- VLLMOffline
    BaseLLM <|-- BedrockLLM

Core Methods:

| Method | Parameters | Return Type | Description |
|--------|------------|-------------|-------------|
| generate | prompt: str | str | Generate a single response from a prompt |
| batch_generate | prompts: List[str] | List[str] | Generate responses for multiple prompts in batch |
| get_model_name | None | str | Return the configured model identifier |
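
Because every adapter honors this contract, downstream code can stay provider-agnostic. A minimal sketch follows; the helper function is hypothetical, and only the BaseLLM methods are taken from the table above:

from typing import List

from hipporag.llm.base import BaseLLM

def summarize_passages(llm: BaseLLM, passages: List[str]) -> List[str]:
    # Works with any adapter because it relies only on the BaseLLM contract.
    prompts = [f"Summarize the following passage in one sentence:\n{p}" for p in passages]
    return llm.batch_generate(prompts)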

OpenIE Integration

Open Information Extraction (OpenIE) is a critical component of HippoRAG's knowledge graph construction pipeline. The LLM integration system supports multiple OpenIE modes to accommodate different deployment scenarios.

graph LR
    A[Documents] --> B{HippoRAG}
    B --> C{OpenIE Mode}
    
    C -->|online| D[Real-time OpenIE]
    C -->|offline| E[Cached OpenIE Results]
    
    D --> F[OpenIE with LLM]
    E --> G[Load from JSON]
    
    F --> H[Knowledge Graph]
    G --> H

OpenIE Implementation Classes:

| Class | Provider | Use Case |
|-------|----------|----------|
| OpenAI_GPT | OpenAI API | Cloud-based OpenIE extraction |
| VLLM_Offline | Local vLLM | Private/on-site OpenIE extraction |

Sources: README.md:47-48

Configuration Schema

The LLM integration configuration is defined through the HippoRAGConfig class, which validates and manages all LLM-related settings.

Configuration Fields:

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| llm_name | string | required | LLM model identifier |
| llm_base_url | string | None | Base URL for LLM API endpoint |
| llm_max_tokens | int | None | Maximum tokens per generation |
| llm_temperature | float | 0.0 | Sampling temperature |
| openie_mode | string | "online" | OpenIE processing mode |
| skip_graph | bool | False | Skip graph construction step |

Sources: main.py:18-26
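
A hedged sketch of overriding these fields programmatically, assuming the BaseConfig import path below and that unspecified fields keep their defaults:

from hipporag.utils.config_utils import BaseConfig  # import path assumed

config = BaseConfig(
    llm_name="gpt-4o-mini",
    llm_base_url="https://api.openai.com/v1",
    llm_temperature=0.0,
    openie_mode="online",
    skip_graph=False,
)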

Workflow Integration

The following diagram illustrates how LLM integrations fit into the HippoRAG indexing and retrieval pipeline:

graph TD
    subgraph Indexing
        A1[Input Documents] --> A2[Chunking]
        A2 --> A3[Embedding Generation]
        A3 --> A4[OpenIE with LLM]
        A4 --> A5[Knowledge Graph Construction]
        A5 --> A6[Graph Indexing]
    end
    
    subgraph Retrieval & QA
        B1[User Query] --> B2[Query Embedding]
        B2 --> B3[Graph Traversal]
        B3 --> B4[LLM for Answer Synthesis]
        B4 --> B5[Final Answer]
    end
    
    A4 -.->|Uses| LLM1[LLM Adapter]
    B4 -.->|Uses| LLM1

Environment Variables

Proper configuration of environment variables is essential for LLM integrations to function correctly.

| Variable | Required | Description |
|----------|----------|-------------|
| OPENAI_API_KEY | For OpenAI | OpenAI API authentication key |
| HF_HOME | For vLLM | Hugging Face cache directory |
| CUDA_VISIBLE_DEVICES | For GPU | Comma-separated GPU device IDs |
| AWS_ACCESS_KEY_ID | For Bedrock | AWS access credentials |
| AWS_SECRET_ACCESS_KEY | For Bedrock | AWS secret credentials |

Sources: README.md:58-66

Testing LLM Integrations

HippoRAG provides dedicated test scripts to verify LLM integration functionality.

OpenAI Test

export OPENAI_API_KEY=<your-api-key>
conda activate hipporag
python tests_openai.py

Local vLLM Test

# Terminal 1: Start vLLM server
export CUDA_VISIBLE_DEVICES=0
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 6578

# Terminal 2: Run test
CUDA_VISIBLE_DEVICES=1 python tests_local.py

Sources: README.md:137-148

Error Handling and Retries

The LLM integrations leverage the tenacity library for automatic retry behavior with exponential backoff. This ensures robust operation when dealing with network issues or rate limiting from LLM providers.

Configuration options for retry behavior:

| Parameter | Default | Description |
|-----------|---------|-------------|
| max_attempts | 3 | Maximum number of retry attempts |
| wait_exponential_multiplier | 1000 | Initial wait time in milliseconds |
| wait_exponential_max | 10000 | Maximum wait time in milliseconds |
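
A minimal sketch of this pattern with the tenacity library, using the defaults above converted to seconds (the wrapped call is a stand-in, not a HippoRAG function):

import random

from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(3),                   # max_attempts
       wait=wait_exponential(multiplier=1, max=10))  # 1000 ms initial wait, 10000 ms cap
def call_llm(prompt: str) -> str:
    # Stand-in for a provider call; a transient failure here triggers a retried attempt.
    if random.random() < 0.3:
        raise TimeoutError("transient provider error")
    return f"response to: {prompt}"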

Extending LLM Support

To add support for a new LLM provider, implement a new class that inherits from BaseLLM and implements the required abstract methods:

from typing import List

from hipporag.llm.base import BaseLLM

class CustomLLM(BaseLLM):
    def __init__(self, model_name: str, **kwargs):
        self.model_name = model_name
        # Initialize provider-specific client
        
    def generate(self, prompt: str) -> str:
        # Implement generation logic
        pass
        
    def batch_generate(self, prompts: List[str]) -> List[str]:
        # Implement batch generation
        pass
        
    def get_model_name(self) -> str:
        return self.model_name

Performance Considerations

When selecting and configuring LLM integrations, consider the following factors:

  1. Latency: OpenAI APIs typically offer lower latency for small workloads, while vLLM provides better performance for high-throughput scenarios
  2. Cost: Local vLLM deployment eliminates API costs but requires GPU infrastructure
  3. Privacy: For sensitive data, local deployment via vLLM or Bedrock private endpoints is recommended
  4. Model Size: Larger models (e.g., Llama-3.3-70B) require more GPU memory but often provide better extraction quality

Sources: README.md:67-72

Embedding Models

Related topics: Embedding Store and Management, LLM Integrations

Section Related Pages

Continue reading this section for the full explanation and source context.

Section NV-Embed-v2

Continue reading this section for the full explanation and source context.

Section GritLM

Continue reading this section for the full explanation and source context.

Section Transformers (SentenceTransformers)

Continue reading this section for the full explanation and source context.

Related topics: Embedding Store and Management, LLM Integrations

Embedding Models

HippoRAG provides a flexible, modular embedding model system that supports multiple embedding backends including NVIDIA's NV-Embed-v2, GritLM, HuggingFace Transformers, and vLLM endpoints. This modular architecture enables the system to generate high-quality text embeddings for both passage encoding and query understanding in the retrieval pipeline.

Architecture Overview

The embedding model subsystem follows a base class pattern with specialized implementations. All embedding models inherit from BaseEmbeddingModel which defines the common interface and configuration schema.

graph TD
    A[HippoRAG Core] --> B[Embedding Model Factory]
    B --> C[BaseEmbeddingModel]
    C --> D[NVEmbedV2]
    C --> E[GritLM]
    C --> F[TransformersEmbeddingModel]
    C --> G[VLLMEmbeddingModel]

The factory pattern in __init__.py dynamically instantiates the appropriate embedding model based on the model name prefix:

| Prefix | Model Class | Backend |
|--------|-------------|---------|
| nvidia/NV-Embed-v2 | NVEmbedV2 | HuggingFace |
| GritLM | GritLM | GritLM library |
| Transformers/ | TransformersEmbeddingModel | SentenceTransformers |
| VLLM/ | VLLMEmbeddingModel | vLLM endpoints |

Sources: src/hipporag/embedding_model/__init__.py
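
The dispatch itself amounts to a prefix check; the sketch below mirrors the table rather than the actual factory code, and returns class names as strings to avoid asserting exact imports:

def resolve_embedding_class(embedding_model_name: str) -> str:
    # Illustrative prefix dispatch based on the table above.
    if embedding_model_name.startswith("VLLM/"):
        return "VLLMEmbeddingModel"
    if embedding_model_name.startswith("Transformers/"):
        return "TransformersEmbeddingModel"
    if embedding_model_name.startswith("GritLM"):
        return "GritLM"
    if embedding_model_name == "nvidia/NV-Embed-v2":
        return "NVEmbedV2"
    raise ValueError(f"No embedding model registered for {embedding_model_name}")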

Base Configuration

The BaseEmbeddingModel and EmbeddingConfig classes define the configuration schema used across all embedding implementations. Configuration parameters include:

| Parameter | Default | Description |
|-----------|---------|-------------|
| embedding_batch_size | 16 | Batch size for encoding operations |
| embedding_return_as_normalized | True | Whether to normalize output embeddings |
| embedding_max_seq_len | 2048 | Maximum sequence length for tokenization |
| embedding_model_dtype | "auto" | Data type: float16, float32, bfloat16, or auto |

Sources: src/hipporag/utils/config_utils.py:16-35

Available Embedding Models

NV-Embed-v2

The NVEmbedV2 class provides integration with NVIDIA's NV-Embed-v2 embedding model, a high-performance encoder optimized for retrieval tasks.

class NVEmbedV2(BaseEmbeddingModel):
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        # Model initialization with HuggingFace transformers

Sources: src/hipporag/embedding_model/NVEmbedV2.py

GritLM

The GritLM class wraps the GritLM library for generating embeddings with built-in instruction-following capabilities.

class GritLM(BaseEmbeddingModel):
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        # GritLM-specific initialization

Sources: src/hipporag/embedding_model/GritLM.py

Transformers (SentenceTransformers)

The TransformersEmbeddingModel class enables using any model from the HuggingFace ecosystem via the SentenceTransformers library. Select this implementation by passing an embedding_model_name that starts with "Transformers/".

import torch
from sentence_transformers import SentenceTransformer

class TransformersEmbeddingModel(BaseEmbeddingModel):
    """
    To select this implementation you can initialise HippoRAG with:
        embedding_model_name starts with "Transformers/"
    """
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        self.model_id = embedding_model_name[len("Transformers/"):]
        self.batch_size = 64
        self.model = SentenceTransformer(
            self.model_id, 
            device="cuda" if torch.cuda.is_available() else "cpu"
        )

Key characteristics:

  • Automatically detects CUDA availability for GPU acceleration
  • Uses batch size of 64 for efficient processing
  • Extracts model ID by removing the "Transformers/" prefix

Sources: src/hipporag/embedding_model/Transformers.py:1-40
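
A hedged usage example follows; the specific SentenceTransformers model id after the prefix is only an illustrative choice:

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    # Any SentenceTransformers-compatible model id can follow the "Transformers/" prefix.
    embedding_model_name='Transformers/sentence-transformers/all-MiniLM-L6-v2'
)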

VLLM (Endpoint-based)

The VLLMEmbeddingModel class provides integration with OpenAI-compatible vLLM embedding endpoints. Select this implementation by passing an embedding_model_name that starts with "VLLM/".

class VLLMEmbeddingModel(BaseEmbeddingModel):
    """
    To select this implementation you can initialise HippoRAG with:
        embedding_model_name starts with "VLLM/"
    The embedding base url should contain the v1/embeddings.
    """
    def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
        super().__init__(global_config=global_config)
        self.model_id = embedding_model_name[len("VLLM/"):]
        self.batch_size = 32
        self.url = global_config.embedding_base_url

The model communicates with the endpoint using the OpenAI embeddings API format:

payload = {
    "model": self.model_id,
    "input": input_text,
}
response = requests.post(self.base_url, headers=headers, json=payload)

Sources: src/hipporag/embedding_model/VLLM.py:1-50

Query Instructions

Embedding models support query instruction templates for improving retrieval relevance. The system uses instructions for mapping queries to facts and passages:

self.search_query_instr = set([
    get_query_instruction('query_to_fact'),
    get_query_instruction('query_to_passage')
])

Sources: src/hipporag/embedding_model/Transformers.py:23-27

Usage Patterns

Quick Start with OpenAI-style Models

hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='gpt-4o-mini',
    llm_base_url='https://api.openai.com/v1',
    embedding_model_name='nvidia/NV-Embed-v2',
    embedding_base_url='https://api.openai.com/v1'
)

Using Custom Endpoints

hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='Your LLM Model name',
    llm_base_url='Your LLM Model url',
    embedding_model_name='Your Embedding model name',
    embedding_base_url='Your Embedding model url'
)

Using vLLM Local Deployment

# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct --tensor-parallel-size 2

# Configure with VLLM prefix
hipporag = HippoRAG(
    save_dir=save_dir,
    llm_model_name='...',
    embedding_model_name='VLLM/your-model-name',
    embedding_base_url='http://localhost:8000/v1/embeddings'
)

Dependencies

The embedding model system depends on the following packages:

| Package | Version | Purpose |
|---------|---------|---------|
| transformers | 4.45.2 | Core model loading |
| sentence-transformers | (via Transformers) | Sentence encoding |
| gritlm | 1.0.2 | GritLM embeddings |
| torch | 2.5.1 | GPU acceleration |
| einops | (latest) | Tensor operations |

Sources: setup.py:19-32

Configuration Parameters Summary

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| embedding_batch_size | int | 16 | Batch size for embedding inference |
| embedding_return_as_normalized | bool | True | L2 normalize embeddings |
| embedding_max_seq_len | int | 2048 | Maximum token sequence length |
| embedding_model_dtype | str | "auto" | Model precision (float16/float32/bfloat16/auto) |

Sources: src/hipporag/utils/config_utils.py:16-29

Sources: src/hipporag/embedding_model/__init__.py

Open Information Extraction (OpenIE)

Related topics: Knowledge Graph and Retrieval, LLM Integrations

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Module Structure

Continue reading this section for the full explanation and source context.

Section ConfigUtils Class Parameters

Continue reading this section for the full explanation and source context.

Section Main Entry Point Configuration

Continue reading this section for the full explanation and source context.

Related topics: Knowledge Graph and Retrieval, LLM Integrations

Open Information Extraction (OpenIE)

Overview

Open Information Extraction (OpenIE) is a critical component in the HippoRAG pipeline that enables the extraction of structured knowledge triples from unstructured text. The system extracts entities, relations, and triples from passages to construct a knowledge graph that mimics hippocampal memory formation in biological systems.

In HippoRAG, OpenIE serves as the foundation for building the associative memory graph. Extracted triples form fact nodes in the knowledge graph, enabling Personalized PageRank (PPR) retrieval that connects related information across documents.

Sources: README.md

Architecture

The OpenIE system in HippoRAG supports multiple deployment modes and LLM backends:

graph TD
    A[Unstructured Text] --> B[Information Extraction Module]
    B --> C{openie_mode}
    C -->|online| D[OpenAI GPT]
    C -->|offline| E[vLLM Offline]
    D --> F[Triple Extraction]
    E --> F
    F --> G[NER Processing]
    G --> H[Knowledge Triples]
    H --> I[Knowledge Graph Construction]

Module Structure

| Module | File | Purpose |
|--------|------|---------|
| Base Interface | information_extraction/__init__.py | Exports model classes |
| OpenAI Integration | openie_openai_gpt.py | Online OpenIE via OpenAI API |
| vLLM Offline | openie_vllm_offline.py | Offline batch processing with vLLM |
| Triple Extraction Prompt | prompts/templates/triple_extraction.py | LLM prompt for triple extraction |
| NER Prompt | prompts/templates/ner.py | LLM prompt for named entity recognition |

Sources: README.md - Code Structure

Configuration

ConfigUtils Class Parameters

The InformationExtractionConfig dataclass provides the following configuration options:

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| information_extraction_model_name | Literal["openie_openai_gpt"] | "openie_openai_gpt" | Class name indicating which information extraction model to use |
| openie_mode | Literal["offline", "online"] | "online" | Mode of the OpenIE model: online uses OpenAI API, offline uses vLLM batch processing |
| skip_graph | bool | False | Whether to skip graph construction. Set to True when running vLLM offline indexing for the first time |

Sources: src/hipporag/utils/config_utils.py

Main Entry Point Configuration

In the main.py script, OpenIE parameters are passed via command-line arguments:

config = BaseConfig(
    retrieval_top_k=200,
    linking_top_k=5,
    max_qa_steps=3,
    qa_top_k=5,
    graph_type="facts_and_sim_passage_node_unidirectional",
    embedding_batch_size=8,
    max_new_tokens=None,
    corpus_len=len(corpus),
    openie_mode=args.openie_mode  # 'online' or 'offline'
)

Command-line arguments:

  • --openie_mode: Choose between online (OpenAI API) or offline (vLLM)
  • --force_openie_from_scratch: If False, reuse existing OpenIE results if available

Sources: main.py

Extraction Workflow

Triple Extraction Process

The triple extraction workflow follows these steps:

sequenceDiagram
    participant Text as Raw Text Input
    participant Triple as Triple Extraction Prompt
    participant LLM as Language Model
    participant NER as NER Prompt
    participant Output as Knowledge Triples
    
    Text->>Triple: Passage text
    Triple->>LLM: Structured prompt
    LLM->>Output: Subject-Predicate-Object triples
    Output->>NER: Named Entity Recognition
    NER->>LLM: Entity labels
    LLM->>Output: Typed entities

Supported Deployment Modes

| Mode | Backend | Use Case | API Key Required |
|------|---------|----------|------------------|
| online | OpenAI GPT | Quick testing, small corpora | Yes (OPENAI_API_KEY) |
| offline | vLLM | Large-scale indexing, cost efficiency | No (local deployment) |

Knowledge Graph Integration

OpenIE extracted triples are converted into graph structures:

graph LR
    A[Passage Text] -->|OpenIE| B[Triple: Entity1 → Relation → Entity2]
    B --> C[Fact Node]
    C --> D[Knowledge Graph]
    D --> E[Personalized PageRank]
    E --> F[Associative Retrieval]

The extracted triples serve dual purposes:

  1. Fact Nodes: Create direct connections between related entities
  2. Association Links: Enable multi-hop reasoning through the graph

This design mirrors the dentate gyrus pattern separation mechanism in the hippocampus, where similar memories are differentiated to reduce interference.

Sources: README.md - Methodology

Usage Examples

Online Mode (OpenAI)

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

# OpenIE runs automatically during indexing
hipporag.index(docs=["Passage containing facts to extract."])

Offline Mode (vLLM)

# 1. Start vLLM server
vllm serve meta-llama/Llama-3.3-70B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95

# 2. Run indexing with offline OpenIE
python main.py --dataset sample --openie_mode offline

Sources: README.md - Quick Start

Dependencies

The OpenIE system requires the following core dependencies:

| Package | Version | Purpose |
|---------|---------|---------|
| torch | 2.5.1 | PyTorch backend |
| transformers | 4.45.2 | Model architecture |
| openai | 1.91.1 | Online OpenAI API |
| vllm | 0.6.6.post1 | Offline inference |
| litellm | 1.73.1 | Unified LLM interface |
| tqdm | - | Progress bars |

Sources: setup.py

Extracted Data Format

OpenIE produces structured triples in the following format:

| Field | Type | Description |
|-------|------|-------------|
| subject | str | First entity |
| predicate | str | Relation verb/phrase |
| object | str | Second entity |
| context | str | Source passage text |

These triples are then processed into graph nodes and edges for the knowledge graph construction phase.
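
A single extracted record, with purely illustrative values, would therefore look like:

triple = {
    "subject": "Stanford University",
    "predicate": "is located in",
    "object": "California",
    "context": "Stanford University is a private research university in California.",
}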

Sources: README.md

Deployment Options

Related topics: Installation and Setup, LLM Integrations

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Configuration Parameters

Continue reading this section for the full explanation and source context.

Section Running with OpenAI Models

Continue reading this section for the full explanation and source context.

Section Programmatic Usage

Continue reading this section for the full explanation and source context.

Related topics: Installation and Setup, LLM Integrations

Deployment Options

HippoRAG supports multiple deployment configurations to accommodate different infrastructure requirements and use cases. This page documents the available deployment options, configuration parameters, and setup procedures for running HippoRAG in various environments.

Overview

HippoRAG provides three primary deployment models:

| Deployment Type | LLM Backend | Embedding Backend | Typical Use Case |
|-----------------|-------------|-------------------|------------------|
| OpenAI API | OpenAI hosted models | OpenAI/NVIDIA hosted | Quickstart, development |
| vLLM (Local) | Self-hosted LLMs via vLLM | Local embedding models | Production, cost-sensitive |
| Azure OpenAI | Azure-hosted models | Azure-hosted embeddings | Enterprise compliance |

Sources: README.md

Environment Setup

Regardless of deployment type, certain environment variables must be configured:

export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>

For OpenAI and Azure deployments, additional API credentials are required:

export OPENAI_API_KEY=<your openai api key>

Sources: README.md:1

OpenAI API Deployment

The simplest deployment option uses OpenAI's hosted API endpoints for both LLM inference and embeddings.

Configuration Parameters

| Parameter | Description | Example Value |
|-----------|-------------|---------------|
| --llm_base_url | OpenAI API endpoint | https://api.openai.com/v1 |
| --llm_name | OpenAI model identifier | gpt-4o-mini |
| --embedding_name | Embedding model name | nvidia/NV-Embed-v2 |

Running with OpenAI Models

dataset=sample

python main.py --dataset $dataset \
    --llm_base_url https://api.openai.com/v1 \
    --llm_name gpt-4o-mini \
    --embedding_name nvidia/NV-Embed-v2

Sources: README.md:1

Programmatic Usage

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2'
)

Sources: README.md:1

Local vLLM Deployment

For production environments or cost-sensitive deployments, HippoRAG supports self-hosted LLMs using vLLM.

Architecture

graph TD
    A[HippoRAG Main Process] --> B[vLLM Server]
    A --> C[Local Embedding Model]
    B --> D[GPU 0-1]
    C --> D
    E[Indexing Pipeline] --> A
    F[QA Pipeline] --> A

Starting vLLM Server

Launch the vLLM server with tensor parallelism for multi-GPU setups:

export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

vllm serve meta-llama/Llama-3.3-70B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

Sources: README.md:1

Configuration Parameters

| Parameter | Description | Default |
|-----------|-------------|---------|
| --llm_base_url | vLLM server endpoint | http://localhost:6578/v1 |
| --llm_name | Model name (must match deployed model) | meta-llama/Llama-3.1-8B-Instruct |
| --embedding_name | Local embedding model identifier | nvidia/NV-Embed-v2 |

Running Main Process

With vLLM server running on GPUs 0-1, run the main process on separate GPUs:

export CUDA_VISIBLE_DEVICES=2,3
export HF_HOME=<path to Huggingface home directory>

python main.py --dataset $dataset \
    --llm_base_url http://localhost:6578/v1 \
    --llm_name meta-llama/Llama-3.3-70B-Instruct \
    --embedding_name nvidia/NV-Embed-v2

Sources: README.md:1

Azure OpenAI Deployment

Enterprise deployments requiring Azure infrastructure can use Azure OpenAI endpoints.

Configuration Parameters

| Parameter | CLI Argument | Description |
|-----------|--------------|-------------|
| azure_endpoint | --azure_endpoint | Azure OpenAI chat completions endpoint |
| azure_embedding_endpoint | --azure_embedding_endpoint | Azure OpenAI embeddings endpoint |

Endpoint Format

azure_endpoint = (
    "https://[ENDPOINT_NAME].openai.azure.com/"
    "openai/deployments/gpt-4o-mini/chat/completions"
    "?api-version=2025-01-01-preview"
)

azure_embedding_endpoint = (
    "https://[ENDPOINT_NAME].openai.azure.com/"
    "openai/deployments/text-embedding-3-small/embeddings"
    "?api-version=2023-05-15"
)

Sources: demo_azure.py

Programmatic Usage

from hipporag import HippoRAG

hipporag = HippoRAG(
    save_dir='outputs',
    llm_model_name='gpt-4o-mini',
    embedding_model_name='nvidia/NV-Embed-v2',
    azure_endpoint="https://[ENDPOINT_NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
    azure_embedding_endpoint="https://[ENDPOINT_NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)

hipporag.index(docs=docs)

Sources: demo_azure.py

CLI Usage

python main_azure.py \
    --dataset sample \
    --azure_endpoint "https://[ENDPOINT].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview" \
    --azure_embedding_endpoint "https://[ENDPOINT].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15" \
    --save_dir outputs

Sources: main_azure.py

Indexing Options

OpenIE Modes

HippoRAG supports two Open Information Extraction (OpenIE) modes:

| Mode | Description | Resource Usage |
|------|-------------|----------------|
| online | Uses OpenAI GPT for real-time extraction | API costs |
| offline | Uses local vLLM batch processing | GPU compute |

python main.py --dataset $dataset --openie_mode offline

Sources: main.py:1

Force Rebuild Options

| Parameter | Description |
|-----------|-------------|
| --force_index_from_scratch | Ignores existing storage and rebuilds from scratch |
| --force_openie_from_scratch | Ignores cached OpenIE results and recomputes |

python main_azure.py \
    --force_index_from_scratch true \
    --force_openie_from_scratch true

Sources: main_azure.py

StandardRAG vs HippoRAG

The codebase provides two RAG implementations selectable via configuration:

# Standard HippoRAG (default)
hipporag = HippoRAG(global_config=config)

# Alternative DPR-style implementation
standard_rag = StandardRAG(global_config=config)

Sources: main.py and main_dpr.py

Installation Requirements

All deployment options require the HippoRAG package and its dependencies:

conda create -n hipporag python=3.10
conda activate hipporag
pip install hipporag

Or install from source:

pip install -e .

Core dependencies include:

| Package | Version | Purpose |
|---------|---------|---------|
| torch | 2.5.1 | Deep learning framework |
| transformers | 4.45.2 | Model loading |
| vllm | 0.6.6.post1 | Local inference |
| openai | 1.91.1 | API client |
| litellm | 1.73.1 | Unified LLM interface |
| gritlm | 1.0.2 | Embedding models |
| networkx | 3.4.2 | Graph operations |
| pydantic | 2.10.4 | Configuration validation |

Sources: setup.py

Testing Deployments

OpenAI Test

export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
python tests_openai.py

Sources: README.md:1

Local vLLM Test

export CUDA_VISIBLE_DEVICES=0
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>

# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct \
    --tensor-parallel-size 2 \
    --max_model_len 4096 \
    --gpu-memory-utilization 0.95 \
    --port 6578

# Run tests
CUDA_VISIBLE_DEVICES=1 python tests_local.py

Sources: README.md:1

Azure Test

python tests_azure.py

Sources: tests_azure.py

Deployment Decision Matrix

| Criteria | OpenAI API | vLLM Local | Azure |
|----------|------------|------------|-------|
| Setup complexity | Low | High | Medium |
| Cost | Pay-per-use | GPU infrastructure | Azure subscription |
| Data privacy | Data leaves your environment | All data stays local | Configurable |
| Latency | Network dependent | Local, optimized | Network dependent |
| Model flexibility | Limited to API models | Any HuggingFace model | Limited to deployed models |
| Recommended for | Development, prototyping | Production, research | Enterprise compliance |

Sources: README.md

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.

1. Installation risk: add_fact_edges function adds the same edge twice?

  • Severity: high
  • Finding: Installation risk is backed by a source signal: add_fact_edges function adds the same edge twice?. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/174

2. Installation risk: pypi hipporag libraries

  • Severity: high
  • Finding: Installation risk is backed by a source signal: pypi hipporag libraries. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/168

3. Security or permission risk: Take the "musique" dataset as an example. The process of constructing an index based on individual paragraphs takes an…

  • Severity: high
  • Finding: Security or permission risk is backed by a source signal: Take the "musique" dataset as an example. The process of constructing an index based on individual paragraphs takes an…. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/173

4. Installation risk: OpenAI version incompatibility in latest 2.0.0a4 version

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: OpenAI version incompatibility in latest 2.0.0a4 version. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/140

5. Installation risk: Windows Compatibility Issues with vLLM dependency

  • Severity: medium
  • Finding: Installation risk is backed by a source signal: Windows Compatibility Issues with vLLM dependency. Treat it as a review item until the current version is checked.
  • User impact: First-time setup may fail or require extra isolation and rollback planning.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/117

6. Configuration risk: How to use local embedding_model_

  • Severity: medium
  • Finding: Configuration risk is backed by a source signal: How to use local embedding_model_. Treat it as a review item until the current version is checked.
  • User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/127

7. Capability assumption: README/documentation is current enough for a first validation pass.

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: capability.assumptions | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | README/documentation is current enough for a first validation pass.

8. Project risk: Inquiry Regarding OpenIE Extraction Results for HippoRAG 2

  • Severity: medium
  • Finding: Project risk is backed by a source signal: Inquiry Regarding OpenIE Extraction Results for HippoRAG 2. Treat it as a review item until the current version is checked.
  • User impact: The project should not be treated as fully validated until this signal is reviewed.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/177

9. Maintenance risk: Maintainer activity is unknown

  • Severity: medium
  • Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
  • User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: evidence.maintainer_signals | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | last_activity_observed missing

10. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: downstream_validation.risk_items | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | no_demo; severity=medium

11. Security or permission risk: no_demo

  • Severity: medium
  • Finding: no_demo
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: risks.scoring_risks | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | no_demo; severity=medium

12. Security or permission risk: How to distinguish Hipporag1 from Hipporag2

  • Severity: medium
  • Finding: Security or permission risk is backed by a source signal: How to distinguish Hipporag1 from Hipporag2. Treat it as a review item until the current version is checked.
  • User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
  • Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
  • Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/167

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

This manual page exposes 12 project-level external discussion links. Open the linked issues or discussions before treating the pack as ready for your environment.

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using HippoRAG with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence