Doramagic Project Pack · Human Manual
HippoRAG
HippoRAG is a graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge bases for improved retrieval and question answering.
Installation and Setup
Related topics: Configuration System, Deployment Options
Overview
HippoRAG is a graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge bases for improved retrieval and question answering. The installation process configures the necessary dependencies, environment variables, and model configurations to run HippoRAG in either cloud (OpenAI) or local (vLLM) deployment modes.
Sources: README.md
System Requirements
Python Version
| Requirement | Version |
|---|---|
| Python | >= 3.10 |
The package explicitly requires Python 3.10 or higher as specified in the setup.py configuration.
Sources: setup.py:16
Hardware Requirements
| Component | Requirement |
|---|---|
| GPU | CUDA-compatible GPU(s) recommended |
| GPU Memory | Varies based on model size (see deployment sections) |
For local deployment with vLLM, the framework supports tensor parallelism across multiple GPUs. The README recommends reserving enough memory for embedding models when deploying LLM servers.
Sources: README.md
Installation Methods
Method 1: pip Installation (Recommended)
conda create -n hipporag python=3.10
conda activate hipporag
pip install hipporag
This method installs HippoRAG version 2.0.0-alpha.4 along with all core dependencies from PyPI.
Sources: README.md
Method 2: Source Installation (For Development)
git clone https://github.com/OSU-NLP-Group/HippoRAG.git
cd HippoRAG
pip install -e .
Clone the repository and install in editable mode to work with the latest source code.
Sources: CONTRIBUTING.md
Environment Variables
Proper configuration of environment variables is essential for HippoRAG to function correctly. These variables control GPU allocation, model caching, and API access.
Required Environment Variables
| Variable | Description | Example |
|---|---|---|
| CUDA_VISIBLE_DEVICES | Comma-separated list of GPU device IDs | 0,1,2,3 |
| HF_HOME | Path to Hugging Face cache directory | /path/to/huggingface/home |
| OPENAI_API_KEY | API key for OpenAI models (cloud mode only) | sk-... |
Setting Environment Variables
# Set CUDA visible devices
export CUDA_VISIBLE_DEVICES=0,1,2,3
# Set Hugging Face cache location
export HF_HOME=<path to Huggingface home directory>
# Set OpenAI API key (required for cloud deployment)
export OPENAI_API_KEY=<your openai api key>
Sources: README.md
Core Dependencies
HippoRAG depends on a comprehensive set of libraries for LLM inference, embedding models, graph processing, and data handling.
Dependency Overview
| Package | Version | Purpose |
|---|---|---|
| torch | 2.5.1 | PyTorch deep learning framework |
| transformers | 4.45.2 | Model architectures and tokenizers |
| vllm | 0.6.6.post1 | High-throughput LLM inference |
| openai | 1.91.1 | OpenAI API client |
| litellm | 1.73.1 | Unified LLM interface |
| gritlm | 1.0.2 | Embedding model |
| networkx | 3.4.2 | Graph data structures |
| python_igraph | 0.11.8 | Graph algorithms |
| tiktoken | 0.7.0 | Tokenization |
| pydantic | 2.10.4 | Data validation |
| tenacity | 8.5.0 | Retry logic |
| einops | (latest) | Tensor operations |
| tqdm | (latest) | Progress bars |
| boto3 | (latest) | AWS S3 integration |
Sources: setup.py:17-32, requirements.txt
Additional Dependencies
The requirements.txt file includes additional packages not pinned to specific versions:
| Package | Purpose |
|---|---|
| nest_asyncio | Asynchronous operations |
| numpy | Numerical computing |
| scipy | Scientific computing |
Sources: requirements.txt
Configuration
HippoRAG uses a Pydantic-based configuration system defined in BaseConfig within config_utils.py. This configuration controls all aspects of indexing, retrieval, and QA.
Configuration Parameters
#### Embedding Configuration
| Parameter | Default | Description |
|---|---|---|
| embedding_model_name | nvidia/NV-Embed-v2 | Name of the embedding model |
| embedding_batch_size | 16 | Batch size for embedding encoding |
| embedding_return_as_normalized | True | Whether to normalize embeddings |
| embedding_max_seq_len | 2048 | Maximum sequence length for embeddings |
| embedding_model_dtype | auto | Data type for local embedding models |
#### Retrieval Configuration
| Parameter | Default | Description |
|---|---|---|
| retrieval_top_k | 200 | Number of documents to retrieve |
| linking_top_k | 5 | Number of linked nodes per retrieval step |
| damping | 0.5 | Damping factor for PPR algorithm |
| passage_node_weight | 0.05 | Weight modifier for passage nodes in PPR |
#### QA Configuration
| Parameter | Default | Description |
|---|---|---|
| max_qa_steps | 1 | Maximum steps for interleaved retrieval and reasoning |
| qa_top_k | 5 | Top k documents fed to QA model |
#### Graph Construction Configuration
| Parameter | Default | Description |
|---|---|---|
| synonymy_edge_topk | 2047 | K for KNN retrieval in synonymy edge building |
| synonymy_edge_sim_threshold | 0.8 | Similarity threshold for synonymy nodes |
| is_directed_graph | False | Whether the graph is directed |
| graph_type | facts_and_sim_passage_node_unidirectional | Type of graph structure |
#### Information Extraction Configuration
| Parameter | Default | Description |
|---|---|---|
| information_extraction_model_name | openie_openai_gpt | OpenIE model class name |
| openie_mode | online | Mode: "online" or "offline" |
#### Preprocessing Configuration
| Parameter | Default | Description |
|---|---|---|
| text_preprocessor_class_name | TextPreprocessor | Preprocessor class name |
| preprocess_encoder_name | gpt-4o | Encoder for preprocessing |
| preprocess_chunk_overlap_token_size | 128 | Overlap tokens between chunks |
| preprocess_chunk_max_token_size | None | Max tokens per chunk (None = whole doc) |
| preprocess_chunk_func | by_token | Chunking function type |
Sources: src/hipporag/utils/config_utils.py
Deployment Modes
HippoRAG supports two primary deployment modes for LLM inference.
graph TD
A[HippoRAG Deployment] --> B[Cloud Mode]
A --> C[Local Mode]
B --> B1[OpenAI API]
B --> B2[OpenAI Compatible API]
C --> C1[vLLM Server]
C --> C1b[Local Embedding Model]
B1 --> D[Requires OPENAI_API_KEY]
B2 --> E[Custom LLM Base URL]
    C1 --> F[Multi-GPU Support]
Cloud Mode (OpenAI)
Cloud mode uses OpenAI's API for both LLM and embedding inference.
from hipporag import HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
#### OpenAI Compatible Embeddings
For OpenAI-compatible embedding endpoints:
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='Your LLM Model name',
llm_base_url='Your LLM Model url',
embedding_model_name='Your Embedding model name',
embedding_base_url='Your Embedding model url'
)
Sources: README.md
Local Mode (vLLM)
Local mode deploys LLM servers using vLLM for offline inference with GPU acceleration.
#### Step 1: Start vLLM Server
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>
vllm serve meta-llama/Llama-3.3-70B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95 \
--port 6578
#### Step 2: Run HippoRAG with Different GPUs
export CUDA_VISIBLE_DEVICES=2,3
export HF_HOME=<path to Huggingface home directory>
python main.py --dataset sample --llm_base_url http://localhost:6578/v1
Sources: README.md
Quick Start Workflow
graph LR
A[Install HippoRAG] --> B[Configure Environment]
B --> C[Set Environment Variables]
C --> D[Initialize HippoRAG]
D --> E[index Documents]
    E --> F[RAG QA Queries]
Complete Example
from hipporag import HippoRAG
# Define documents
docs = [
"Oliver Badman is a politician.",
"George Rankin is a politician.",
"Cinderella attended the royal ball.",
"The prince used the lost glass slipper to search the kingdom.",
"Erik Hort's birthplace is Montebello.",
"Montebello is a part of Rockland County."
]
# Initialize HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
# Index documents
hipporag.index(docs)
# Define queries and gold standard answers
queries = [
"What is George Rankin's occupation?",
"How did Cinderella reach her happy ending?",
"What county is Erik Hort's birthplace a part of?"
]
gold_docs = [
["George Rankin is a politician."],
["Cinderella attended the royal ball.",
"The prince used the lost glass slipper to search the kingdom."],
["Montebello is a part of Rockland County."]
]
answers = [
["Politician"],
["By going to the ball."],
["Rockland County"]
]
# Run RAG QA
results = hipporag.rag_qa(
queries=queries,
gold_docs=gold_docs,
gold_answers=answers
)
Sources: README.md
Testing Your Installation
OpenAI Test
Run this test to verify cloud mode functionality:
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
python tests_openai.py
Local Test
Run this test to verify local vLLM mode:
export CUDA_VISIBLE_DEVICES=0
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>
# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95 \
--port 6578
# Run local test
CUDA_VISIBLE_DEVICES=1 python tests_local.py
Sources: README.md
Troubleshooting
Out of Memory (OOM) Errors
If you encounter OOM errors during local deployment:
- Reduce the gpu-memory-utilization parameter in vLLM
- Reduce max_model_len in the vLLM server
- Adjust CUDA_VISIBLE_DEVICES to use more GPUs
- Reduce embedding_batch_size in the configuration
Environment Variable Issues
Ensure all required environment variables are set before running HippoRAG:
# Verify environment variables are set
echo $CUDA_VISIBLE_DEVICES
echo $HF_HOME
echo $OPENAI_API_KEY
Conda Environment
Always activate the correct conda environment before running commands:
conda activate hipporag
Sources: README.md
Reproducing Experiments
To reproduce the paper's experiments:
- Clone the repository and install dependencies
- Download datasets from HuggingFace or use the provided samples in reproduce/dataset
- Set the required environment variables
- Run the main script with appropriate parameters:
# OpenAI model
python main.py \
--dataset sample \
--llm_base_url https://api.openai.com/v1 \
--llm_name gpt-4o-mini \
--embedding_name nvidia/NV-Embed-v2
# Local vLLM model
python main.py \
--dataset sample \
--llm_base_url http://localhost:6578/v1 \
--llm_name meta-llama/Llama-3.3-70B-Instruct \
--embedding_name nvidia/NV-Embed-v2
Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
Quick Start Guide
Related topics: Installation and Setup, HippoRAG Core Class
This guide provides a comprehensive walkthrough for setting up and running HippoRAG, enabling you to quickly leverage neurobiologically inspired long-term memory capabilities for Large Language Models.
Prerequisites
Before beginning, ensure your environment meets the following requirements:
| Requirement | Specification |
|---|---|
| Python | >= 3.10 |
| CUDA GPUs | Required for local embedding model inference |
| HuggingFace Home | Configured via HF_HOME environment variable |
| API Keys | OpenAI API key (if using OpenAI models) |
Environment Setup
# Create conda environment
conda create -n hipporag python=3.10
conda activate hipporag
# Install HippoRAG
pip install hipporag
# Configure environment variables
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
export OPENAI_API_KEY=<your openai api key>
Sources: README.md:150-165
Core Usage Patterns
HippoRAG supports three primary deployment configurations. The initialization workflow follows this pattern:
graph TD
A[Initialize HippoRAG] --> B{Select LLM Backend}
B -->|OpenAI| C[Set llm_model_name + llm_base_url]
B -->|vLLM| D[Set llm_model_name + llm_base_url]
B -->|Azure| E[Set azure_endpoint]
A --> F{Select Embedding Backend}
F -->|HuggingFace| G[Set embedding_model_name]
    F -->|Custom| H[Set embedding_base_url]
Sources: demo_azure.py:1-30
Pattern 1: OpenAI Models
The simplest configuration uses OpenAI for both LLM inference and embeddings:
from hipporag import HippoRAG
# Configuration
save_dir = 'outputs'
llm_model_name = 'gpt-4o-mini'
embedding_model_name = 'nvidia/NV-Embed-v2'
# Initialize HippoRAG instance
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name=llm_model_name,
embedding_model_name=embedding_model_name
)
Sources: README.md:175-195
Pattern 2: OpenAI Compatible Embeddings
For custom LLM endpoints that follow OpenAI's API format:
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='Your LLM Model name',
llm_base_url='Your LLM Model url',
embedding_model_name='Your Embedding model name',
embedding_base_url='Your Embedding model url'
)
Sources: README.md:210-220
Pattern 3: Azure OpenAI Integration
For Azure-hosted models:
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name=llm_model_name,
embedding_model_name=embedding_model_name,
azure_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
azure_embedding_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)
Sources: demo_azure.py:10-15
Indexing Documents
The indexing process converts raw documents into HippoRAG's knowledge graph structure:
graph LR
A[Raw Documents] --> B[Chunking]
B --> C[OpenIE Extraction]
C --> D[Embedding Generation]
D --> E[Graph Construction]
    E --> F[Knowledge Graph Index]
Input Data Format
Documents should be provided as a list of strings:
docs = [
"Oliver Badman is a politician.",
"George Rankin is a politician.",
"Cinderella attended the royal ball.",
"The prince used the lost glass slipper to search the kingdom.",
]
Execute Indexing
hipporag.index(docs=docs)
Sources: demo_azure.py:18-45
Retrieval and Question Answering
The rag_qa method performs retrieval-augmented question answering:
graph TD
A[Query Input] --> B[Retrieval]
B --> C[Personalized PageRank]
C --> D[Document Selection]
D --> E[QA Generation]
E --> F[Final Answer]
C -.->|links documents| G[Knowledge Graph]
    G -.->|context| D
Complete QA Example
# Prepare queries and evaluation data
queries = [
"What is George Rankin's occupation?",
"How did Cinderella reach her happy ending?"
]
answers = [
["Politician"],
["By going to the ball."]
]
gold_docs = [
["George Rankin is a politician."],
["Cinderella attended the royal ball.",
"The prince used the lost glass slipper to search the kingdom.",
"When the slipper fit perfectly, Cinderella was reunited with the prince."]
]
# Execute RAG QA
results = hipporag.rag_qa(
queries=queries,
gold_docs=gold_docs,
gold_answers=answers
)
print(results)
Sources: README.md:195-215
Local Deployment with vLLM
For running LLMs locally, HippoRAG supports vLLM server integration:
Step 1: Start vLLM Server
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>
conda activate hipporag
# Adjust gpu-memory-utilization and max_model_len based on your GPU memory
vllm serve meta-llama/Llama-3.1-8B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95 \
--port 6578
Sources: README.md:225-240
Step 2: Initialize HippoRAG with vLLM
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='meta-llama/Llama-3.1-8B-Instruct',
llm_base_url='http://localhost:6578/v1',
embedding_model_name='nvidia/NV-Embed-v2'
)
Reproducing Experiments
For reproducing published experiments, follow the structured workflow:
Dataset Structure
| File Type | Naming Convention | Purpose |
|---|---|---|
| Corpus | {dataset}_corpus.json | Document collection |
| Queries | {dataset}.json | Questions with answers |
| Output | outputs/{dataset}/ | Index and results |
Corpus JSON Format
[
{
"title": "FIRST PASSAGE TITLE",
"text": "FIRST PASSAGE TEXT",
"idx": 0
},
{
"title": "SECOND PASSAGE TITLE",
"text": "SECOND PASSAGE TEXT",
"idx": 1
}
]
Sources: README.md:100-125
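As a minimal sketch of consuming this format (assuming the sample layout under reproduce/dataset described above; joining title and text is an illustrative convention, not necessarily HippoRAG's exact preprocessing):

```python
import json

dataset = "sample"

# Load a corpus file that follows the {dataset}_corpus.json naming convention.
with open(f"reproduce/dataset/{dataset}_corpus.json") as f:
    corpus = json.load(f)

# HippoRAG.index() takes a list of strings, so flatten each passage record.
docs = [f"{passage['title']}\n{passage['text']}" for passage in corpus]
```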
Running Experiments
# Set environment variables
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
# Run with OpenAI model
dataset=sample
python main.py --dataset $dataset \
--llm_base_url https://api.openai.com/v1 \
--llm_name gpt-4o-mini \
--embedding_name nvidia/NV-Embed-v2
Sources: main.py:1-35
Testing Your Installation
OpenAI Test
Verify installation with minimal OpenAI API cost:
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
python tests_openai.py
Local Test with vLLM
Test with a locally deployed model:
export CUDA_VISIBLE_DEVICES=0
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>
conda activate hipporag
# Start vLLM server with smaller model
vllm serve meta-llama/Llama-3.1-8B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95 \
--port 6578
# Run test
CUDA_VISIBLE_DEVICES=1 python tests_local.py
Sources: README.md:250-280
Configuration Parameters
Core Parameters
| Parameter | Default | Description |
|---|---|---|
| save_dir | outputs | Directory for saving all related information |
| llm_model_name | - | LLM model identifier |
| llm_base_url | - | Base URL for LLM API endpoint |
| embedding_model_name | nvidia/NV-Embed-v2 | Embedding model identifier |
| embedding_batch_size | 16 | Batch size for embedding model |
Sources: src/hipporag/utils/config_utils.py:50-80
Retrieval Parameters
| Parameter | Default | Description |
|---|---|---|
| retrieval_top_k | 200 | Number of documents to retrieve initially |
| linking_top_k | 5 | Number of linked nodes at each retrieval step |
| qa_top_k | 5 | Number of documents fed to QA model |
| max_qa_steps | 1 | Maximum interleaved retrieval-reasoning steps |
| damping | 0.5 | Damping factor for Personalized PageRank |
Sources: src/hipporag/utils/config_utils.py:30-50
Graph Construction Parameters
| Parameter | Default | Description |
|---|---|---|
| synonymy_edge_topk | 2047 | K for KNN retrieval in synonymy edge building |
| synonymy_edge_sim_threshold | 0.8 | Similarity threshold for synonymy nodes |
| graph_type | facts_and_sim_passage_node_unidirectional | Type of graph structure to construct |
| is_directed_graph | False | Whether to build a directed graph |
Sources: src/hipporag/utils/config_utils.py:80-110
Troubleshooting
Common Issues
| Issue | Solution |
|---|---|
| CUDA OOM errors | Reduce gpu-memory-utilization or max_model_len in vLLM; reduce embedding_batch_size |
| Connection errors | Verify API endpoint URLs and network connectivity |
| Index loading failures | Check that save_dir contains valid index files |
Environment Validation
Always verify your setup before running experiments:
# Verify CUDA availability
python -c "import torch; print(torch.cuda.is_available())"
# Verify package installation
pip list | grep hipporag
Next Steps
- Explore the Code Structure documentation for deep-dive into modules
- Review the experiment reproducibility guidelines in main.py
- Access pre-processed datasets from the HuggingFace dataset page
Sources: README.md:150-165
Configuration System
Related topics: Installation and Setup, HippoRAG Core Class
HippoRAG provides a comprehensive configuration system built on Pydantic's data validation framework. The configuration system enables fine-grained control over all aspects of the indexing, retrieval, and QA pipeline while maintaining type safety and default values for common use cases.
Architecture Overview
The configuration system is centered around the BaseConfig class defined in config_utils.py. This class uses Pydantic's BaseModel with Field definitions to provide structured configuration with metadata and validation.
graph TD
A[BaseConfig] --> B[OpenIE Configuration]
A --> C[Embedding Configuration]
A --> D[Graph Construction Configuration]
A --> E[Retrieval Configuration]
A --> F[QA Configuration]
A --> G[Save/Directory Configuration]
A --> H[Dataset Configuration]
I[main.py] --> A
J[HippoRAG class] --> A
    K[StandardRAG class] --> A
Source: src/hipporag/utils/config_utils.py:1-100
Core Configuration Class
BaseConfig
The BaseConfig class serves as the single source of truth for all pipeline parameters. It inherits from Pydantic's BaseModel and provides automatic validation, serialization, and documentation through field metadata.
from hipporag.utils.config_utils import BaseConfig
global_config = BaseConfig(
openie_mode='openai_gpt',
information_extraction_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2',
retrieval_top_k=200,
linking_top_k=5,
max_qa_steps=3,
qa_top_k=5,
graph_type="facts_and_sim_passage_node_unidirectional",
embedding_batch_size=8
)
Source: main.py:20-35
Configuration Categories
OpenIE (Open Information Extraction) Configuration
Controls the information extraction module that identifies facts and entities from passages.
| Parameter | Type | Default | Description |
|---|---|---|---|
| openie_mode | Literal["openai_gpt", "vllm_offline", "Transformers-offline"] | "openai_gpt" | The mode for the information extraction model |
| information_extraction_model_name | str | "gpt-4o-mini" | Model name for information extraction |
The openie_mode parameter supports three execution modes:
- openai_gpt: Uses OpenAI's GPT models for extraction via API
- vllm_offline: Uses locally deployed LLMs through a vLLM server
- Transformers-offline: Uses HuggingFace Transformers models directly
Source: src/hipporag/utils/config_utils.py:config_fields
Embedding Model Configuration
Manages embedding generation for passages and queries.
| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding_model_name | str | "nvidia/NV-Embed-v2" | Name of the embedding model |
| embedding_batch_size | int | 16 | Batch size for embedding generation |
| embedding_return_as_normalized | bool | True | Whether to normalize embeddings |
| embedding_max_seq_len | int | 2048 | Maximum sequence length for embedding model |
| embedding_model_dtype | Literal["float16", "float32", "bfloat16", "auto"] | "auto" | Data type for local embedding model |
| embedding_base_url | Optional[str] | None | Base URL for OpenAI-compatible embedding endpoints |
Source: src/hipporag/utils/config_utils.py:embedding_batch_size-def
Graph Construction Configuration
Controls the knowledge graph construction process that forms the backbone of HippoRAG's memory system.
| Parameter | Type | Default | Description |
|---|---|---|---|
| synonymy_edge_topk | int | 2047 | K value for KNN retrieval in building synonymy edges |
| synonymy_edge_query_batch_size | int | 1000 | Batch size for query embeddings during KNN retrieval |
| synonymy_edge_key_batch_size | int | 10000 | Batch size for key embeddings during KNN retrieval |
| synonymy_edge_sim_threshold | float | 0.8 | Similarity threshold for including candidate synonymy nodes |
| is_directed_graph | bool | False | Whether the constructed graph is directed or undirected |
| graph_type | str | "facts_and_sim_passage_node_unidirectional" | Type of graph structure to build |
Supported graph_type values include:
- facts_and_sim_passage_node_unidirectional: Passages connected via facts with similarity edges
- facts_and_sim_passage_node_bidirectional: Bidirectional passage connections
- facts_only: Only fact-based connections
- sim_passage_node: Only passage similarity connections
Source: src/hipporag/utils/config_utils.py:synonymy_edge_topk-def
Retrieval Configuration
Parameters governing the retrieval and linking process using Personalized PageRank (PPR).
| Parameter | Type | Default | Description |
|---|---|---|---|
| linking_top_k | int | 5 | Number of linked nodes at each retrieval step |
| retrieval_top_k | int | 200 | Number of documents to retrieve at each step |
| damping | float | 0.5 | Damping factor for PPR algorithm |
The damping parameter controls the probability of following graph edges during the random walk in PPR. A higher value (closer to 1.0) results in more exploration, while lower values favor exploitation of high-probability paths.
Source: src/hipporag/utils/config_utils.py:linking_top_k-def, main.py:28
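To make the PPR mechanics concrete, here is a hedged sketch using networkx (a listed dependency). The toy graph and seed node are purely illustrative; HippoRAG's actual graph construction and scoring live in its own retrieval code:

```python
import networkx as nx

# Toy stand-in for the knowledge graph: passage and fact nodes joined by edges.
G = nx.Graph()
G.add_edges_from([
    ("fact_a", "passage_1"),
    ("fact_a", "fact_b"),
    ("fact_b", "passage_2"),
])

# Personalization: reset probability mass concentrated on query-matched nodes.
personalization = {"fact_a": 1.0}

# alpha plays the role of the damping parameter (default 0.5 in BaseConfig):
# with probability alpha the walk follows an edge; otherwise it teleports
# back to the personalization distribution.
scores = nx.pagerank(G, alpha=0.5, personalization=personalization)

# Keep the highest-scoring nodes, analogous to linking_top_k.
top_nodes = sorted(scores, key=scores.get, reverse=True)[:5]
print(top_nodes)
```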
QA (Question Answering) Configuration
Controls the iterative QA process that interleaves retrieval with reasoning.
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_qa_steps | int | 1 | Maximum steps for interleaved retrieval and reasoning |
| qa_top_k | int | 5 | Number of top documents fed to the QA model |
The max_qa_steps parameter enables multi-step reasoning where the system can retrieve additional documents based on intermediate reasoning results before producing the final answer.
Source: src/hipporag/utils/config_utils.py:max_qa_steps-def, main.py:27
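The loop below is a hedged sketch of that interleaving; retrieve and reason are hypothetical placeholders standing in for HippoRAG's internal graph retrieval and LLM calls:

```python
from typing import List, Tuple

def retrieve(query: str, context: List[str], top_k: int) -> List[str]:
    """Hypothetical stand-in for HippoRAG's graph-based retrieval step."""
    return [f"passage retrieved for {query!r}"][:top_k]

def reason(query: str, context: List[str]) -> Tuple[str, bool]:
    """Hypothetical stand-in for the LLM reasoning step; returns (answer, needs_more)."""
    return "answer", False

def interleaved_qa(query: str, max_qa_steps: int = 3, qa_top_k: int = 5) -> str:
    answer, context = "", []
    for _ in range(max_qa_steps):                        # bounded by max_qa_steps
        context += retrieve(query, context, top_k=qa_top_k)
        answer, needs_more = reason(query, context)
        if not needs_more:                               # stop once the answer is grounded
            break
    return answer
```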
LLM Configuration
Manages the language model used for QA and information extraction.
| Parameter | Type | Default | Description |
|---|---|---|---|
| llm_model_name | str | "gpt-4o-mini" | Name of the LLM |
| llm_base_url | Optional[str] | None | Base URL for OpenAI-compatible LLM endpoints |
| max_new_tokens | Optional[int] | None | Maximum new tokens for generation |
Source: src/hipporag/utils/config_utils.py:llm_model_name-def
Save and Directory Configuration
Controls output persistence and directory structure.
| Parameter | Type | Default | Description |
|---|---|---|---|
| save_dir | str | "outputs" | Top-level directory for saving all related information |
| corpus_len | int | Required | Length of the corpus being processed |
The save_dir parameter specifies where HippoRAG objects, intermediate results, and evaluation outputs are stored. When running with specific datasets, the default saves to a dataset-customized output directory under save_dir.
Source: src/hipporag/utils/config_utils.py:save_dir-def, main.py:32
Configuration Workflow
graph LR
A[Define BaseConfig] --> B[Initialize HippoRAG]
B --> C[Index Documents]
C --> D[Run RAG QA]
D --> E[Results Saved to save_dir]
F[Modify Config] -->|Update| B
    G[New Documents] -->|Index| C
Initialization Example
from hipporag.utils.config_utils import BaseConfig
from hipporag import HippoRAG
config = BaseConfig(
openie_mode='openai_gpt',
information_extraction_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2',
retrieval_top_k=200,
linking_top_k=5,
max_qa_steps=3,
qa_top_k=5,
graph_type="facts_and_sim_passage_node_unidirectional",
embedding_batch_size=8,
max_new_tokens=None,
corpus_len=len(corpus),
)
hipporag = HippoRAG(global_config=config)
Source: main.py:19-38
Configuration for Different Execution Modes
OpenAI API Mode
config = BaseConfig(
openie_mode='openai_gpt',
information_extraction_model_name='gpt-4o-mini',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2',
)
Source: main.py:20-26
Local vLLM Deployment Mode
config = BaseConfig(
openie_mode='vllm_offline',
information_extraction_model_name='meta-llama/Llama-3.1-8B-Instruct',
llm_model_name='meta-llama/Llama-3.3-70B-Instruct',
llm_base_url='http://localhost:8000/v1',
embedding_model_name='nvidia/NV-Embed-v2',
)
Source: README.md:vllm_example
Transformers Offline Mode
config = BaseConfig(
openie_mode='Transformers-offline',
information_extraction_model_name='Transformers/Qwen/Qwen2.5-7B-Instruct',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2',
)
Source: test_transformers.py:16-20
Testing with Configuration
The test suite demonstrates configuration usage across different scenarios:
# tests_openai.py - Basic indexing and QA
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
# tests_openai.py - Document deletion
hipporag.delete(docs_to_delete)
# test_transformers.py - Transformers offline mode
hipporag = HippoRAG(
global_config=global_config,
save_dir=save_dir,
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2',
)
Source: tests_openai.py:test_structure, test_transformers.py:16-25
Package Dependencies
The configuration system depends on the following packages specified in setup.py:
| Package | Version | Purpose |
|---|---|---|
| torch | 2.5.1 | PyTorch backend for models |
| transformers | 4.45.2 | HuggingFace Transformers |
| pydantic | 2.10.4 | Data validation and settings |
| vllm | 0.6.6.post1 | LLM inference server |
| openai | 1.91.1 | OpenAI API client |
| litellm | 1.73.1 | Unified LLM interface |
| gritlm | 1.0.2 | GritLM embedding model |
| networkx | 3.4.2 | Graph operations |
| python_igraph | 0.11.8 | Graph algorithms |
| tiktoken | 0.7.0 | Tokenization |
| tenacity | 8.5.0 | Retry logic |
Source: setup.py:14-27
Best Practices
- Use environment variables for sensitive configuration like API keys:

  ```bash
  export OPENAI_API_KEY=<your_openai_api_key>
  export HF_HOME=<path_to_huggingface_home>
  ```

- Set GPU devices before initialization:

  ```bash
  export CUDA_VISIBLE_DEVICES=0,1,2,3
  ```

- Adjust batch sizes based on available GPU memory when using local models
- Configure the damping factor carefully for retrieval: higher values (0.7-0.85) work better for complex multi-hop questions
- Set corpus_len correctly to enable proper progress tracking and memory management
Source: https://github.com/OSU-NLP-Group/HippoRAG
HippoRAG Core Class
Related topics: Knowledge Graph and Retrieval, Embedding Models
Overview
HippoRAG is a neurobiologically inspired graph-based Retrieval-Augmented Generation (RAG) framework designed to enable Large Language Models (LLMs) to identify and leverage connections within knowledge for improved retrieval and question answering. The project implements two primary RAG classes: HippoRAG (neurobiologically inspired with Personal Knowledge Graph) and StandardRAG (traditional DPR-based approach).
Sources: setup.py:8-9
Architecture Overview
graph TB
subgraph "Input Layer"
Docs[Documents/Passages]
Queries[User Queries]
end
subgraph "HippoRAG Core"
Index[Indexing Pipeline]
Retrieve[Retrieval Pipeline]
QA[Question Answering]
end
subgraph "Knowledge Graph Construction"
OpenIE[OpenIE Information Extraction]
Embed[Embedding Model]
GraphBuild[Graph Building]
end
subgraph "Backend Services"
LLM[LLM Inference]
EmbedModel[Embedding Service]
end
Docs --> Index
Index --> OpenIE
Index --> Embed
OpenIE --> GraphBuild
Embed --> GraphBuild
GraphBuild --> KG[Knowledge Graph]
Queries --> Retrieve
Retrieve --> KG
KG --> QA
QA --> LLM
    Retrieve --> EmbedModel
Core Classes
HippoRAG Class
The HippoRAG class is the main entry point for the neurobiologically inspired RAG system. It extends a base RAG implementation with Personal Knowledge Graph (PKG) capabilities.
Initialization Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| save_dir | str | Required | Directory to save all related information |
| llm_model_name | str | Required | LLM model identifier (e.g., gpt-4o-mini) |
| embedding_model_name | str | Required | Embedding model name (e.g., nvidia/NV-Embed-v2) |
| global_config | BaseConfig | None | Full configuration object |
| llm_base_url | str | None | Custom LLM API endpoint for OpenAI-compatible models |
| embedding_base_url | str | None | Custom embedding API endpoint |
| azure_endpoint | str | None | Azure OpenAI endpoint for LLM |
| azure_embedding_endpoint | str | None | Azure OpenAI endpoint for embeddings |
Sources: main.py:19-28
Basic Usage Pattern
from hipporag import HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
# Index documents
hipporag.index(docs=documents_list)
# Retrieve and answer queries
results = hipporag.rag_qa(
queries=query_list,
gold_docs=expected_documents,
gold_answers=expected_answers
)
StandardRAG Class
The StandardRAG class provides traditional Dense Passage Retrieval (DPR) based RAG without the Personal Knowledge Graph components. This is useful for baseline comparisons.
Sources: main_dpr.py:19
Configuration System
BaseConfig Parameters
The BaseConfig class (defined in src/hipporag/utils/config_utils.py) provides comprehensive configuration options:
OpenIE Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| openie_mode | str | Required | OpenIE mode: OpenAI, vllm-offline, or Transformers-offline |
| information_extraction_model_name | str | None | Model for offline OpenIE (e.g., Qwen/Qwen2.5-7B-Instruct) |
Embedding Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding_batch_size | int | 16 | Batch size for embedding model inference |
| embedding_return_as_normalized | bool | True | Whether to normalize embeddings |
| embedding_max_seq_len | int | 2048 | Maximum sequence length for embedding |
| embedding_model_dtype | str | "auto" | Data type: float16, float32, bfloat16, or auto |
Graph Construction Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| synonymy_edge_topk | int | 2047 | K value for KNN retrieval in synonymy edge construction |
| synonymy_edge_query_batch_size | int | 1000 | Batch size for query embeddings |
| synonymy_edge_key_batch_size | int | 10000 | Batch size for key embeddings |
| synonymy_edge_sim_threshold | float | 0.8 | Similarity threshold for synonymy edges |
| is_directed_graph | bool | False | Whether the graph is directed |
Retrieval Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| retrieval_top_k | int | 200 | Number of documents to retrieve initially |
| linking_top_k | int | 5 | Number of linked nodes at each retrieval step |
| damping | float | 0.5 | Damping factor for Personalized PageRank |
QA Configuration
| Parameter | Type | Default | Description |
|---|---|---|---|
| max_qa_steps | int | 1 | Maximum interleaved retrieval and reasoning steps |
| qa_top_k | int | 5 | Top k documents fed to QA model |
Sources: src/hipporag/utils/config_utils.py:1-80
Core Methods
Indexing Pipeline
graph LR
A[Documents] --> B[Passage Embedding]
B --> C[OpenIE Extraction]
C --> D[Fact Node Creation]
D --> E[Similarity Edge Building]
    E --> F[Knowledge Graph]
Method Signature
def index(self, docs: List[str], **kwargs) -> None
The indexing process:
- Embeds passages using the configured embedding model
- Runs OpenIE to extract factual triples from each passage
- Constructs fact nodes and passage nodes in the knowledge graph
- Builds synonymy edges based on embedding similarity
- Persists the graph structure to save_dir
RAG QA Pipeline
graph TD
Q[Query] --> EP[Embedding]
EP --> PPR[Personalized PageRank]
PPR --> LN[Linked Nodes]
LN --> LLM[LLM Reasoning]
LLM -->|Iteration| Check{More Steps?}
Check -->|Yes| EP
    Check -->|No| Final[Final Answer]
Method Signature
def rag_qa(
self,
queries: List[str],
gold_docs: Optional[List[List[str]]] = None,
gold_answers: Optional[List[List[str]]] = None,
**kwargs
) -> Dict
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| queries | List[str] | Yes | List of questions to answer |
| gold_docs | List[List[str]] | No | Ground truth documents for evaluation |
| gold_answers | List[List[str]] | No | Ground truth answers for evaluation |
Returns
A dictionary containing evaluation metrics and retrieved results.
Document Deletion
def delete(self, docs_to_delete: List[str]) -> None
Removes specified documents from the knowledge graph and updates persistence.
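Usage, continuing from an initialized instance (this mirrors the deletion test shown later in the manual):

```python
# Strings must match the originally indexed documents exactly.
hipporag.delete(["Tom Hort's birthplace is Montebello."])
```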
Supported Backend Models
LLM Backends
| Backend | Configuration | Example Model |
|---|---|---|
| OpenAI | llm_model_name | gpt-4o-mini |
| Azure OpenAI | azure_endpoint | Azure deployment URL |
| vLLM (Local) | llm_base_url + vLLM server | meta-llama/Llama-3.1-8B-Instruct |
| OpenAI-Compatible | llm_model_name + llm_base_url | Custom endpoint |
Sources: README.md:80-95
Embedding Models
| Model Type | Configuration | Notes |
|---|---|---|
| NV-Embed-v2 | embedding_model_name='nvidia/NV-Embed-v2' | Recommended |
| GritLM | embedding_model_name='GritLM' | Supported |
| Contriever | embedding_model_name='Contriever' | Supported |
| Azure Embeddings | azure_embedding_endpoint | Via Azure OpenAI |
| Custom OpenAI-Compatible | embedding_base_url | Any compatible endpoint |
OpenIE Modes
HippoRAG supports three OpenIE (Open Information Extraction) modes:
| Mode | Description | Use Case |
|---|---|---|
| OpenAI | Uses OpenAI GPT models for extraction | Cloud-based, high quality |
| vllm-offline | Uses locally deployed vLLM models | GPU-equipped servers |
| Transformers-offline | Uses HuggingFace Transformers | CPU or limited GPU |
Sources: test_transformers.py:20-22
Workflow Example
from hipporag import HippoRAG
# Initialize
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
# Prepare data
docs = [
"Oliver Badman is a politician.",
"George Rankin is a politician.",
"Cinderella attended the royal ball."
]
# Index
hipporag.index(docs=docs)
# Query
queries = ["What is George Rankin's occupation?"]
answers = [["Politician"]]
gold_docs = [["George Rankin is a politician."]]
# Retrieve and evaluate
results = hipporag.rag_qa(
queries=queries,
gold_docs=gold_docs,
gold_answers=answers
)
Graph Types
The framework supports configurable graph structures:
| Graph Type | Description |
|---|---|
| facts_and_sim_passage_node_unidirectional | Facts with similarity-based passage connections (default) |
Graph edges include:
- Fact-to-Fact edges: Created from OpenIE extractions
- Synonymy edges: Based on embedding similarity above threshold
- Passage edges: Connect passages to their extracted facts
Dependencies
Key package dependencies managed in setup.py:
| Package | Version | Purpose |
|---|---|---|
torch | 2.5.1 | Deep learning framework |
transformers | 4.45.2 | Model architectures |
vllm | 0.6.6.post1 | LLM inference |
openai | 1.91.1 | OpenAI API client |
gritlm | 1.0.2 | GritLM embedding model |
networkx | 3.4.2 | Graph operations |
python_igraph | 0.11.8 | Graph algorithms |
pydantic | 2.10.4 | Configuration validation |
tiktoken | 0.7.0 | Tokenization |
Sources: setup.py:15-30
Error Handling
The framework uses tenacity for retry mechanisms with configurable backoff strategies when interacting with external APIs (OpenAI, Azure, vLLM).
Persistence
All indexed data is persisted to the save_dir directory with the following structure:
save_dir/
└── {llm_model_name}_{embedding_model_name}/
    ├── knowledge_graph.pkl   # Serialized graph
    ├── passages.pkl          # Passage embeddings
    ├── fact_nodes.pkl        # Extracted facts
    └── config.json           # Configuration snapshot
Sources: setup.py:8-9
Knowledge Graph and Retrieval
Related topics: Embedding Store and Management, LLM Integrations
Overview
HippoRAG implements a neurobiologically inspired retrieval system that combines knowledge graph construction with advanced retrieval algorithms. The system is designed to enable LLMs to identify and leverage connections within new knowledge for improved retrieval performance. Sources: setup.py:8
The Knowledge Graph and Retrieval module forms the core of HippoRAG's architecture, providing mechanisms to:
- Extract factual knowledge from text passages using Open Information Extraction (OpenIE)
- Construct heterogeneous graphs with multiple node and edge types
- Perform personalized PageRank (PPR) based retrieval over the constructed graphs
- Support incremental updates and document deletion operations
Sources: src/hipporag/utils/config_utils.py:48-72
Architecture
High-Level System Design
HippoRAG's retrieval system integrates several key components working in concert to provide accurate and efficient knowledge retrieval:
graph TD
A[Input Documents] --> B[OpenIE Processing]
B --> C[Knowledge Graph Construction]
C --> D[Embedding Generation]
D --> E[Synonymy Edge Building]
C --> F[Hybrid Graph]
G[Query Input] --> H[Query Embedding]
H --> I[Personalized PageRank]
I --> F
F --> J[Retrieval Results]
J --> K[Reranking]
K --> L[Final QA Output]
    M[LLM Inference] --> L
Graph Construction Pipeline
The graph construction process transforms raw text into a structured knowledge representation:
graph LR
A[Passages] --> B[OpenIE Extractor]
B --> C[Triplets/Entities]
C --> D[Fact Nodes]
E[Passages] --> F[Embedding Model]
F --> G[Passage Embeddings]
G --> H[Passage Nodes]
D --> I[Passage-Fact Edges]
H --> I
G --> J[Synonymy Edges]
J --> K[knn Retrieval]
K --> L[Similarity Threshold Filter]
    L --> M[Synonymy Edge Network]
Knowledge Graph Components
Node Types
| Node Type | Description | Attributes |
|---|---|---|
| Passage Nodes | Represent original text passages | idx, title, text, embedding |
| Fact Nodes | Extracted facts/triplets from OpenIE | subject, predicate, object, embedding |
Edge Types
| Edge Type | Source | Target | Purpose |
|---|---|---|---|
| Passage-to-Fact | Passage Node | Fact Node | Links passages to their extracted facts |
| Fact-to-Fact | Fact Node | Fact Node | Connects semantically related facts |
| Synonymy | Passage Node | Passage Node | Links passages with high semantic similarity |
| Bidirectional | Both | Both | Full edge in both directions |
Sources: src/hipporag/utils/config_utils.py:70-85
Graph Types Configuration
The system supports multiple graph configurations via the graph_type parameter:
| Graph Type | Description |
|---|---|
| facts_and_sim_passage_node_unidirectional | Facts + similar passage nodes, unidirectional edges |
| facts_and_sim_passage_node_bidirectional | Facts + similar passage nodes, bidirectional edges |
| Custom types | Extensible graph construction patterns |
Sources: main.py:18
Retrieval Process
Personalized PageRank (PPR) Algorithm
HippoRAG uses Personalized PageRank for graph-based retrieval, which allows queries to propagate through the knowledge graph to identify relevant nodes.
graph TD
A[Query] --> B[Query Embedding]
B --> C[Initial PPR Scores]
C --> D[Graph Propagation]
D --> E{Iteration}
E -->|Continue| F[Score Aggregation]
F --> D
E -->|Converge| G[Top-K Selection]
G --> H[Linked Nodes]
I[damping factor: 0.5] --> D
    J[linking_top_k: 5] --> G
Retrieval Configuration Parameters
| Parameter | Default | Description |
|---|---|---|
| retrieval_top_k | 200 | Number of documents retrieved at each step |
| linking_top_k | 5 | Number of linked nodes at each retrieval step |
| damping | 0.5 | Damping factor for PPR algorithm |
| qa_top_k | 5 | Top-k documents fed to QA model |
Sources: src/hipporag/utils/config_utils.py:60-72
Synonymy Edge Construction
Synonymy edges connect passages with high semantic similarity, enabling cross-document retrieval:
graph TD
A[All Passage Embeddings] --> B[KNN Retrieval]
B --> C[Top-K Candidates]
C --> D{Similarity > Threshold?}
D -->|Yes| E[Create Synonymy Edge]
D -->|No| F[Discard]
    E --> G[Synonymy Edge Network]
#### Synonymy Edge Parameters
| Parameter | Default | Description |
|---|---|---|
| synonymy_edge_topk | 2047 | k for kNN retrieval in building synonymy edges |
| synonymy_edge_query_batch_size | 1000 | Batch size for query embeddings |
| synonymy_edge_key_batch_size | 10000 | Batch size for key embeddings |
| synonymy_edge_sim_threshold | 0.8 | Similarity threshold for candidate synonymy nodes |
Sources: src/hipporag/utils/config_utils.py:73-85
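A hedged numpy sketch of this candidate selection (toy embeddings; HippoRAG additionally batches the kNN search using the query/key batch-size parameters above):

```python
import numpy as np

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 64)).astype(np.float32)
emb /= np.linalg.norm(emb, axis=1, keepdims=True)     # unit-normalize rows

topk, sim_threshold = 5, 0.8                          # cf. synonymy_edge_topk / _sim_threshold
sims = emb @ emb.T                                    # cosine similarity for normalized vectors
np.fill_diagonal(sims, -1.0)                          # exclude self-matches

edges = []
for i in range(sims.shape[0]):
    for j in np.argsort(sims[i])[::-1][:topk]:        # kNN candidates for node i
        if sims[i, j] >= sim_threshold:               # threshold filter
            edges.append((i, int(j), float(sims[i, j])))
```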
Embedding Integration
Embedding Model Configuration
| Parameter | Default | Description |
|---|---|---|
| embedding_model_name | - | Name of the embedding model |
| embedding_batch_size | 16 | Batch size for embedding calls |
| embedding_return_as_normalized | True | Whether to normalize embeddings |
| embedding_max_seq_len | 2048 | Maximum sequence length |
| embedding_model_dtype | auto | Data type for local models (float16/float32/bfloat16/auto) |
Sources: src/hipporag/utils/config_utils.py:40-54
Supported Embedding Models
The system integrates with multiple embedding model providers:
- NV-Embed-v2: NVIDIA's embedding model
- GritLM: GritLM embedding model
- Contriever: Facebook's dense retriever
- OpenAI Compatible: Any OpenAI-compatible embedding endpoint
- Azure OpenAI: Azure-hosted embedding models
Reranking Module
After initial retrieval, HippoRAG applies reranking to improve result quality. The reranking module reorders retrieved candidates using additional scoring mechanisms.
graph LR
A[Retrieved Candidates] --> B[Reranker Model]
B --> C[Relevance Scores]
C --> D[Ranked Results]
    D --> E[Top Results]
Sources: src/hipporag/rerank.py
QA Integration
Multi-Step Retrieval and Reasoning
HippoRAG supports interleaved retrieval and reasoning with configurable steps:
| Parameter | Default | Description |
|---|---|---|
| max_qa_steps | 1 | Maximum steps for interleaved retrieval and reasoning |
| qa_top_k | 5 | Number of documents for QA model to process |
Sources: src/hipporag/utils/config_utils.py:68-72
QA Pipeline Flow
graph TD
A[Query] --> B[QA Step 1]
B --> C[Retrieval]
C --> D[Read Documents]
D --> E{More Steps Needed?}
E -->|Yes| F[Update Context]
F --> B
E -->|No| G[Final Answer]
H[gold_docs] --> I[Evaluation]
I --> J[Metrics]
    J --> K[Recall, EM, F1]
Data Formats
Corpus JSON Structure
[
{
"title": "PASSAGE TITLE",
"text": "PASSAGE TEXT",
"idx": 0
}
]
Query JSON Structure
[
{
"id": "question_id",
"question": "QUESTION TEXT",
"answer": ["ANSWER"],
"answerable": true,
"paragraphs": [
{
"title": "SUPPORTING TITLE",
"text": "SUPPORTING TEXT",
"is_supporting": true,
"idx": 0
}
]
}
]
Usage Examples
Basic Retrieval with HippoRAG
from hipporag import HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
# Index documents
docs = [
"Oliver Badman is a politician.",
"George Rankin is a politician.",
"Erik Hort's birthplace is Montebello.",
"Montebello is a part of Rockland County."
]
hipporag.index(docs)
# Query with evaluation
queries = ["What is George Rankin's occupation?"]
gold_docs = [["George Rankin is a politician."]]
answers = [["Politician"]]
results = hipporag.rag_qa(
queries=queries,
gold_docs=gold_docs,
gold_answers=answers
)
Sources: README.md:Quick_Start, tests_openai.py:22-60
Incremental Updates
# Add new documents
new_docs = [
"Tom Hort's birthplace is Montebello.",
"Sam Hort's birthplace is Montebello."
]
hipporag.index(docs=new_docs)
# Delete documents
docs_to_delete = [
"Tom Hort's birthplace is Montebello.",
"Sam Hort's birthplace is Montebello."
]
hipporag.delete(docs_to_delete)
Sources: tests_openai.py:61-82
Evaluation Metrics
The retrieval system is evaluated using standard information retrieval metrics:
| Metric | Description |
|---|---|
| Recall@k | Fraction of relevant documents in top-k |
| EM | Exact Match accuracy |
| F1 | Harmonic mean of precision and recall |
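As an illustration only (HippoRAG's evaluator may normalize answers differently), these metrics can be computed along the following lines:

```python
from typing import List

def recall_at_k(retrieved: List[str], gold: List[str], k: int) -> float:
    """Fraction of gold documents that appear in the top-k retrieved list."""
    if not gold:
        return 0.0
    return sum(1 for g in gold if g in retrieved[:k]) / len(gold)

def exact_match(prediction: str, gold_answers: List[str]) -> bool:
    """Case-insensitive exact match against any gold answer."""
    norm = prediction.strip().lower()
    return any(norm == g.strip().lower() for g in gold_answers)

print(recall_at_k(["d1", "d2", "d3"], ["d2"], k=2))  # 1.0
print(exact_match("Politician", ["Politician"]))      # True
```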
Summary
The Knowledge Graph and Retrieval module in HippoRAG provides a sophisticated pipeline for:
- Knowledge Extraction: Using OpenIE to extract factual triplets from text
- Graph Construction: Building heterogeneous graphs with passage nodes, fact nodes, and multiple edge types
- Synonymy Discovery: Creating semantic links between similar passages via embedding similarity
- PPR-based Retrieval: Performing personalized PageRank for graph-aware document retrieval
- Reranking: Refining retrieval results for improved accuracy
- Incremental Updates: Supporting document additions and deletions
This architecture enables HippoRAG to perform complex associativity and multi-hop reasoning tasks that traditional vector similarity retrieval cannot accomplish effectively.
Sources: src/hipporag/utils/config_utils.py:48-72
Embedding Store and Management
Related topics: LLM Integrations, Embedding Models
Overview
The Embedding Store and Management system in HippoRAG provides a unified interface for encoding text passages into vector embeddings, managing these embeddings throughout the indexing and retrieval lifecycle, and supporting multiple embedding model backends including NVIDIA NV-Embed-v2, GritLM, and Contriever. The system is designed to handle batch processing of documents with configurable parameters for sequence length, data type precision, and normalization behavior.
HippoRAG's embedding management is tightly integrated with the knowledge graph construction process, where embeddings serve dual purposes: enabling semantic similarity search for passage linking and powering the retrieval phase through Personalized PageRank (PPR) algorithms. The embedding store abstracts away the underlying model implementation details, allowing the framework to switch between different embedding providers without changing the core indexing and retrieval logic.
Sources: src/hipporag/utils/config_utils.py:1-50
Architecture
High-Level Components
The embedding system consists of three primary layers that work together to provide embedding services throughout the HippoRAG pipeline.
The Model Layer contains implementations for specific embedding models, each inheriting from a common base class that enforces a consistent interface. Currently supported models include NV-Embed-v2, GritLM, and Contriever, with the architecture supporting easy extension to additional models. Each model implementation handles the specific requirements of its underlying transformer architecture, including tokenizer configuration, padding strategies, and model-specific inference optimizations.
The Utility Layer provides helper functions for common embedding operations such as batch processing, embedding normalization, and similarity computation. These utilities ensure consistent handling of embeddings across different contexts and help optimize memory usage during large-scale indexing operations.
The Configuration Layer defines the parameters that control embedding behavior, including batch sizes, sequence length limits, and model-specific settings. This layer connects the embedding system to HippoRAG's global configuration management, allowing users to customize embedding behavior without modifying code.
graph TD
A[Documents] --> B[Embedding Store]
B --> C[Model Layer<br/>NV-Embed-v2<br/>GritLM<br/>Contriever]
B --> D[Utility Layer<br/>Batch Processing<br/>Normalization]
C --> E[Vector Storage]
D --> E
E --> F[Graph Construction]
    E --> G[Retrieval Phase]
Sources: src/hipporag/embedding_store.py:1-30
Data Flow
During the indexing phase, documents are first processed by the embedding store to generate passage vectors. These vectors are stored alongside the passage metadata and serve as the foundation for graph construction. The embedding store processes passages in configurable batch sizes to balance memory usage and throughput, with the default batch size set to 16 documents per batch.
During the retrieval phase, incoming queries are encoded using the same embedding model to produce a query vector. This query vector is then used for similarity computation against the indexed passage vectors, enabling semantic matching between the query intent and stored knowledge. The retrieval system can perform k-nearest neighbor (kNN) searches over the embedding space to identify candidate passages for further processing.
graph LR
A[Indexing Flow] --> B[Input Documents]
B --> C[Batch Processing<br/>batch_size=16]
C --> D[Embedding Encoding]
D --> E[Normalized Vectors]
E --> F[Vector Storage]
G[Retrieval Flow] --> H[Query Text]
H --> I[Query Encoding]
I --> J[Similarity Search]
J --> K[kNN Retrieval<br/>top-k candidates]
    K --> L[Ranked Passages]
Sources: src/hipporag/utils/embed_utils.py:1-25
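A hedged sketch of the query-side similarity search (toy vectors; in HippoRAG the kNN scores seed the graph-based PPR stage rather than serving as the final ranking):

```python
import numpy as np

rng = np.random.default_rng(1)
passage_vecs = rng.normal(size=(1000, 64)).astype(np.float32)
passage_vecs /= np.linalg.norm(passage_vecs, axis=1, keepdims=True)

query_vec = passage_vecs[0] + 0.1                     # stand-in for an encoded query
query_vec /= np.linalg.norm(query_vec)

scores = passage_vecs @ query_vec                     # cosine similarity
top_k = np.argsort(scores)[::-1][:200]                # cf. retrieval_top_k=200
```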
Configuration Parameters
The embedding system is controlled through several configuration parameters defined in the global configuration structure. These parameters allow fine-tuning of embedding behavior for different hardware configurations and use cases.
| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding_batch_size | int | 16 | Number of documents processed in each embedding batch |
| embedding_return_as_normalized | bool | true | Whether to L2-normalize output embeddings |
| embedding_max_seq_len | int | 2048 | Maximum sequence length in tokens for the embedding model |
| embedding_model_dtype | Literal | "auto" | Data type for local embedding models: float16, float32, bfloat16, or auto |
| embedding_model_name | str | varies | Identifier for the embedding model (e.g., "nvidia/NV-Embed-v2") |
| embedding_base_url | str | None | Base URL for OpenAI-compatible embedding endpoints |
| synonymy_edge_topk | int | 2047 | k value for kNN retrieval when building synonymy edges |
| synonymy_edge_sim_threshold | float | 0.8 | Minimum similarity threshold for synonymy edge candidates |
Sources: src/hipporag/utils/config_utils.py:15-40
Embedding Model Interface
Base Class Contract
All embedding models must inherit from BaseEmbeddingModel, which defines the core interface that HippoRAG expects. The base class enforces implementation of the __call__ method that accepts text inputs and returns embeddings, ensuring polymorphism across different model implementations.
The base class also defines the EmbeddingConfig dataclass that encapsulates model-specific settings. This configuration includes the model name, batch size, maximum sequence length, and data type settings. The configuration object is passed to the embedding model during initialization and can be modified to adjust model behavior without recreating the model instance.
Supported Models
NV-Embed-v2 is the primary embedding model recommended for production use, developed by NVIDIA. It provides high-quality sentence embeddings optimized for retrieval tasks. The model is accessed through HuggingFace and supports automatic device placement based on available GPU resources.
GritLM provides an alternative embedding approach that combines retrieval and generation capabilities. It can serve both as an embedding model and as a decoder for generation tasks, offering flexibility in deployment configurations.
Contriever is an open-source bi-encoder model for dense retrieval, useful for scenarios requiring a completely open-source embedding solution without proprietary dependencies.
Sources: src/hipporag/embedding_model/__init__.py:1-20
Embedding Store API
Initialization
The embedding store is typically instantiated through the main HippoRAG class rather than directly. When creating a HippoRAG instance, the embedding model name and optional endpoint configuration are passed as parameters:
hipporag = HippoRAG(
save_dir="outputs",
llm_model_name="gpt-4o-mini",
embedding_model_name="nvidia/NV-Embed-v2"
)
For OpenAI-compatible embedding endpoints, the base URL can be specified:
hipporag = HippoRAG(
save_dir="outputs",
llm_model_name="gpt-4o-mini",
embedding_model_name="text-embedding-3-small",
embedding_base_url="https://api.openai.com/v1"
)
Sources: README.md:1-50
Encoding Operations
The embedding store provides batch encoding capabilities for processing multiple documents efficiently. The encoding operation returns normalized embeddings by default, which is required for proper similarity computation during retrieval. The normalization is L2 normalization, ensuring that all embedding vectors have unit length.
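As a sketch of what embedding_return_as_normalized=True implies:

```python
import numpy as np

def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """L2-normalize each row so that dot products equal cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)
```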
For Azure OpenAI deployments, specialized endpoint parameters are supported:
hipporag = HippoRAG(
save_dir="save_dir",
llm_model_name="gpt-4o-mini",
embedding_model_name="text-embedding-3-small",
azure_endpoint="https://[ENDPOINT].openai.azure.com/...",
azure_embedding_endpoint="https://[ENDPOINT].openai.azure.com/..."
)
Sources: demo_azure.py:1-30
Integration with Knowledge Graph
The embedding system plays a critical role in HippoRAG's knowledge graph construction phase. After passages are indexed and encoded, the embeddings are used for two key graph-related operations.
Synonymy Edge Construction uses embeddings to identify semantically similar passage pairs that should be connected in the knowledge graph. The system performs k-nearest neighbor searches over the passage embedding space, where the synonymy_edge_topk parameter controls how many candidates are considered for each passage. The synonymy_edge_sim_threshold parameter filters these candidates, with only pairs exceeding the similarity threshold being connected as synonymy edges.
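A hypothetical sketch of this kNN-plus-threshold filtering, assuming the embeddings are already L2-normalized so the dot product equals cosine similarity (the real implementation batches this search; see the batch parameters below):
import numpy as np
def candidate_synonymy_edges(embeddings: np.ndarray, topk: int = 2047,
                             threshold: float = 0.8) -> list[tuple[int, int, float]]:
    sims = embeddings @ embeddings.T   # pairwise cosine similarities
    np.fill_diagonal(sims, -1.0)       # never link a node to itself
    edges = []
    for i in range(sims.shape[0]):
        # Keep the top-k neighbors, then filter by the similarity threshold.
        neighbors = np.argsort(sims[i])[::-1][:topk]
        edges += [(i, int(j), float(sims[i, j]))
                  for j in neighbors if sims[i, j] >= threshold]
    return edges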
Retrieval-Graph Linking during the PPR retrieval process uses passage embeddings to establish the connection between the query and the knowledge graph. The query embedding enables the system to identify the most relevant starting nodes in the graph for the random walk algorithm.
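A hedged sketch of the seeding step using networkx (a core dependency); the seed weighting shown here is illustrative, not HippoRAG's exact logic:
import networkx as nx
def ppr_scores(graph: nx.Graph, seed_weights: dict) -> dict:
    # Personalization mass sits on the nodes most similar to the query;
    # Personalized PageRank then spreads relevance through the graph.
    return nx.pagerank(graph, alpha=0.85, personalization=seed_weights)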
Sources: src/hipporag/utils/config_utils.py:30-45
Memory Management and Optimization
Batch Processing Strategy
The embedding store implements batch processing to optimize GPU memory utilization and throughput. The batch size is configurable via embedding_batch_size with a default of 16, meaning 16 documents are processed simultaneously during encoding. For systems with larger GPU memory, increasing this value can significantly improve indexing performance.
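A minimal sketch of such a batching loop (the helper name is hypothetical, not HippoRAG's API):
import numpy as np
def encode_in_batches(texts: list[str], encode_fn, batch_size: int = 16) -> np.ndarray:
    # Encode fixed-size chunks so peak GPU memory stays bounded,
    # then stack the per-batch results into a single matrix.
    chunks = [encode_fn(texts[i:i + batch_size])
              for i in range(0, len(texts), batch_size)]
    return np.concatenate(chunks, axis=0)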
The system also supports separate batch sizes for the synonymy edge construction phase. The synonymy_edge_query_batch_size (default 1000) controls how many passage embeddings are queried at once during kNN search, while synonymy_edge_key_batch_size (default 10000) controls the key batch size for the search index.
Data Type Selection
The embedding_model_dtype parameter allows selection of the precision for local embedding models. The "auto" setting allows the system to select an appropriate default based on the hardware and model. Available options include float16 for memory-constrained environments, float32 for maximum precision, and bfloat16 which offers a good balance of range and memory efficiency on newer GPUs.
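A small sketch of how such a setting typically maps onto torch dtypes (the helper name is hypothetical):
import torch
_DTYPES = {"float16": torch.float16, "float32": torch.float32,
           "bfloat16": torch.bfloat16}
def resolve_dtype(setting: str):
    # "auto" defers to the backend's own default, returned as None here.
    return None if setting == "auto" else _DTYPES[setting]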
Sources: src/hipporag/utils/config_utils.py:25-35
Error Handling and Resilience
The embedding system is designed with error handling patterns compatible with HippoRAG's overall resilience strategy. Batch processing allows partial failures to be identified and retried without losing all progress. The configuration system supports specifying fallback models or endpoints for production deployments requiring high availability.
Tenacity is used for retry logic in the embedding utilities, ensuring transient network failures or temporary service unavailability do not cause complete pipeline failures. This is particularly important when using remote embedding endpoints that may experience temporary connectivity issues.
Sources: setup.py:1-30
Performance Considerations
When optimizing HippoRAG for production deployment, the embedding configuration should be tuned based on the available hardware and expected workload characteristics. The primary tuning parameters include batch size for indexing throughput, sequence length limits for handling long documents, and data type selection for memory-constrained environments.
For maximum retrieval quality, the default normalization behavior should be maintained as it ensures consistent similarity computation across the retrieval pipeline. Disabling normalization may lead to suboptimal retrieval results as the similarity metrics assume unit-normalized vectors.
Sources: src/hipporag/utils/config_utils.py:18-22
Related Components
The embedding system interacts closely with several other HippoRAG components. The Information Extraction module uses embeddings for processing extracted facts, the retrieval module depends on embeddings for kNN search and PPR initialization, and the evaluation module uses embeddings for computing retrieval metrics such as recall and MRR.
The embedding model implementations in src/hipporag/embedding_model/ follow a consistent interface defined in base.py, allowing the embedding store to work with any model that adheres to this contract.
Sources: src/hipporag/utils/config_utils.py:1-50
LLM Integrations
Related topics: Embedding Models, Deployment Options
HippoRAG provides a flexible, pluggable architecture for integrating various Large Language Model (LLM) providers. This modular design enables the framework to support multiple inference backends including OpenAI, vLLM for local deployment, and AWS Bedrock, allowing researchers and developers to choose the most appropriate LLM backend for their specific use case and infrastructure requirements.
Architecture Overview
The LLM integration system follows a strategy pattern where a base abstract class defines the interface contract, and concrete implementations handle provider-specific details. This design ensures that the core HippoRAG logic remains independent of any particular LLM vendor while maintaining the ability to leverage specialized features offered by different providers.
graph TD
A[HippoRAG Core] --> B[LLM Base Class]
B --> C[OpenAIGPT]
B --> D[VLLMOffline]
B --> E[BedrockLLM]
B --> F[Custom LLM Adapter]
C --> G[OpenAI API]
D --> H[Local vLLM Server]
E --> I[AWS Bedrock]
The BaseLLM abstract class in src/hipporag/llm/base.py defines the common interface that all LLM adapters must implement, ensuring consistent behavior across different providers.
Supported LLM Providers
OpenAI Models
HippoRAG supports all OpenAI chat completion models through the OpenAIGPT class. This integration allows users to leverage the GPT family of models for both information extraction and question answering tasks.
Configuration Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_name | string | required | OpenAI model identifier (e.g., gpt-4o-mini, gpt-4o) |
| api_key | string | env OPENAI_API_KEY | OpenAI API authentication key |
| base_url | string | https://api.openai.com/v1 | API endpoint base URL |
| max_tokens | int | None | Maximum tokens in generated response |
| temperature | float | 0.0 | Sampling temperature for generation |
Usage Example:
from hipporag import HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
Sources: README.md:67-72
vLLM Local Deployment
For scenarios requiring local inference, HippoRAG supports vLLM-deployed models through the VLLMOffline class. This approach is particularly useful for privacy-sensitive applications, cost reduction at scale, or when working with custom fine-tuned models.
Server Setup:
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>
vllm serve meta-llama/Llama-3.1-8B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95 \
--port 6578
Sources: README.md:93-101
Configuration Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_name | string | required | Model identifier for vLLM server |
| base_url | string | required | vLLM server endpoint URL |
| openie_mode | string | "online" | Mode for OpenIE processing (online or offline) |
| max_tokens | int | None | Maximum tokens in generated response |
| temperature | float | 0.0 | Sampling temperature for generation |
Offline Mode for OpenIE:
The vLLM integration supports an offline mode where OpenIE extraction runs separately from the main pipeline. This is useful for debugging or when OpenIE results can be cached and reused.
python main.py \
--dataset sample \
--llm_name meta-llama/Llama-3.3-70B-Instruct \
--openie_mode offline \
--skip_graph
Sources: README.md:130-135
AWS Bedrock
HippoRAG integrates with AWS Bedrock through the BedrockLLM class, enabling access to various foundation models hosted on AWS infrastructure. This integration is designed for enterprise deployments requiring scalable, managed LLM services.
Configuration Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| model_name | string | required | Bedrock model identifier |
| aws_region | string | "us-east-1" | AWS region for Bedrock endpoint |
| max_tokens | int | None | Maximum tokens in generated response |
| temperature | float | 0.0 | Sampling temperature for generation |
Azure OpenAI
For enterprise users with Azure OpenAI deployments, HippoRAG provides direct integration with Azure endpoints.
Configuration Example:
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='gpt-4o-mini',
embedding_model_name='embedding-model-name',
azure_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
azure_embedding_endpoint="https://[ENDPOINT NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)
Sources: demo_azure.py:16-21
Base LLM Interface
All LLM adapters inherit from the BaseLLM abstract class, which defines the core contract for LLM interactions.
classDiagram
class BaseLLM {
<<abstract>>
+generate(prompt: str) str
+batch_generate(prompts: List[str]) List[str]
+get_model_name() str
}
class OpenAIGPT {
+generate(prompt: str) str
+batch_generate(prompts: List[str]) List[str]
}
class VLLMOffline {
+generate(prompt: str) str
+batch_generate(prompts: List[str]) List[str]
}
class BedrockLLM {
+generate(prompt: str) str
+batch_generate(prompts: List[str]) List[str]
}
BaseLLM <|-- OpenAIGPT
BaseLLM <|-- VLLMOffline
BaseLLM <|-- BedrockLLM
Core Methods:
| Method | Parameters | Return Type | Description |
|---|---|---|---|
| generate | prompt: str | str | Generate a single response from a prompt |
| batch_generate | prompts: List[str] | List[str] | Generate responses for multiple prompts in batch |
| get_model_name | None | str | Return the configured model identifier |
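A minimal sketch of this contract; the real BaseLLM in src/hipporag/llm/base.py may differ in details:
from abc import ABC, abstractmethod
from typing import List
class BaseLLM(ABC):
    @abstractmethod
    def generate(self, prompt: str) -> str:
        # Return a single completion for one prompt.
        ...
    @abstractmethod
    def batch_generate(self, prompts: List[str]) -> List[str]:
        # Return one completion per prompt, preserving order.
        ...
    @abstractmethod
    def get_model_name(self) -> str:
        # Return the configured model identifier.
        ...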
OpenIE Integration
Open Information Extraction (OpenIE) is a critical component of HippoRAG's knowledge graph construction pipeline. The LLM integration system supports multiple OpenIE modes to accommodate different deployment scenarios.
graph LR
A[Documents] --> B{HippoRAG}
B --> C{OpenIE Mode}
C -->|online| D[Real-time OpenIE]
C -->|offline| E[Cached OpenIE Results]
D --> F[OpenIE with LLM]
E --> G[Load from JSON]
F --> H[Knowledge Graph]
G --> H
OpenIE Implementation Classes:
| Class | Provider | Use Case |
|---|---|---|
| OpenAI_GPT | OpenAI API | Cloud-based OpenIE extraction |
| VLLM_Offline | Local vLLM | Private/onsite OpenIE extraction |
Sources: README.md:47-48
Configuration Schema
The LLM integration configuration is defined through the HippoRAGConfig class, which validates and manages all LLM-related settings.
Configuration Fields:
| Field | Type | Default | Description |
|---|---|---|---|
| llm_name | string | required | LLM model identifier |
| llm_base_url | string | None | Base URL for LLM API endpoint |
| llm_max_tokens | int | None | Maximum tokens per generation |
| llm_temperature | float | 0.0 | Sampling temperature |
| openie_mode | string | "online" | OpenIE processing mode |
| skip_graph | bool | False | Skip graph construction step |
Sources: main.py:18-26
Workflow Integration
The following diagram illustrates how LLM integrations fit into the HippoRAG indexing and retrieval pipeline:
graph TD
subgraph Indexing
A1[Input Documents] --> A2[Chunking]
A2 --> A3[Embedding Generation]
A3 --> A4[OpenIE with LLM]
A4 --> A5[Knowledge Graph Construction]
A5 --> A6[Graph Indexing]
end
subgraph Retrieval & QA
B1[User Query] --> B2[Query Embedding]
B2 --> B3[Graph Traversal]
B3 --> B4[LLM for Answer Synthesis]
B4 --> B5[Final Answer]
end
A4 -.->|Uses| LLM1[LLM Adapter]
B4 -.->|Uses| LLM1
Environment Variables
Proper configuration of environment variables is essential for LLM integrations to function correctly.
| Variable | Required | Description |
|---|---|---|
| OPENAI_API_KEY | For OpenAI | OpenAI API authentication key |
| HF_HOME | For vLLM | Hugging Face cache directory |
| CUDA_VISIBLE_DEVICES | For GPU | Comma-separated GPU device IDs |
| AWS_ACCESS_KEY_ID | For Bedrock | AWS access credentials |
| AWS_SECRET_ACCESS_KEY | For Bedrock | AWS secret credentials |
Sources: README.md:58-66
Testing LLM Integrations
HippoRAG provides dedicated test scripts to verify LLM integration functionality.
OpenAI Test
export OPENAI_API_KEY=<your-api-key>
conda activate hipporag
python tests_openai.py
Local vLLM Test
# Terminal 1: Start vLLM server
export CUDA_VISIBLE_DEVICES=0
vllm serve meta-llama/Llama-3.1-8B-Instruct --port 6578
# Terminal 2: Run test
CUDA_VISIBLE_DEVICES=1 python tests_local.py
Sources: README.md:137-148
Error Handling and Retries
The LLM integrations leverage the tenacity library for automatic retry behavior with exponential backoff. This ensures robust operation when dealing with network issues or rate limiting from LLM providers.
Configuration options for retry behavior:
| Parameter | Default | Description |
|---|---|---|
| max_attempts | 3 | Maximum number of retry attempts |
| wait_exponential_multiplier | 1000 | Initial wait time in milliseconds |
| wait_exponential_max | 10000 | Maximum wait time in milliseconds |
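A hedged sketch of how these defaults map onto tenacity's decorator; note that tenacity measures waits in seconds, so 1000 ms becomes multiplier=1, and wiring the decorator to these exact parameter names is an assumption about HippoRAG's internals:
from tenacity import retry, stop_after_attempt, wait_exponential
@retry(stop=stop_after_attempt(3),
       wait=wait_exponential(multiplier=1, max=10))
def call_llm(prompt: str) -> str:
    # Placeholder for the provider call; any exception raised here
    # triggers the exponential-backoff retry configured above.
    raise NotImplementedError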
Extending LLM Support
To add support for a new LLM provider, implement a new class that inherits from BaseLLM and implements the required abstract methods:
from typing import List

from hipporag.llm.base import BaseLLM
class CustomLLM(BaseLLM):
def __init__(self, model_name: str, **kwargs):
self.model_name = model_name
# Initialize provider-specific client
def generate(self, prompt: str) -> str:
# Implement generation logic
pass
def batch_generate(self, prompts: List[str]) -> List[str]:
# Implement batch generation
pass
def get_model_name(self) -> str:
return self.model_name
Performance Considerations
When selecting and configuring LLM integrations, consider the following factors:
- Latency: OpenAI APIs typically offer lower latency for small workloads, while vLLM provides better performance for high-throughput scenarios
- Cost: Local vLLM deployment eliminates API costs but requires GPU infrastructure
- Privacy: For sensitive data, local deployment via vLLM or Bedrock private endpoints is recommended
- Model Size: Larger models (e.g., Llama-3.3-70B) require more GPU memory but often provide better extraction quality
Sources: [README.md:67-72](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
Embedding Models
Related topics: Embedding Store and Management, LLM Integrations
HippoRAG provides a flexible, modular embedding model system that supports multiple embedding backends including NVIDIA's NV-Embed-v2, GritLM, HuggingFace Transformers, and vLLM endpoints. This modular architecture enables the system to generate high-quality text embeddings for both passage encoding and query understanding in the retrieval pipeline.
Architecture Overview
The embedding model subsystem follows a base class pattern with specialized implementations. All embedding models inherit from BaseEmbeddingModel which defines the common interface and configuration schema.
graph TD
A[HippoRAG Core] --> B[Embedding Model Factory]
B --> C[BaseEmbeddingModel]
C --> D[NVEmbedV2]
C --> E[GritLM]
C --> F[TransformersEmbeddingModel]
C --> G[VLLMEmbeddingModel]
The factory pattern in __init__.py dynamically instantiates the appropriate embedding model based on the model name prefix (a minimal sketch of this dispatch appears after the table):
| Prefix | Model Class | Backend |
|---|---|---|
| nvidia/NV-Embed-v2 | NVEmbedV2 | HuggingFace |
| GritLM | GritLM | GritLM library |
| Transformers/ | TransformersEmbeddingModel | SentenceTransformers |
| VLLM/ | VLLMEmbeddingModel | vLLM endpoints |
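A hypothetical sketch of that dispatch, using the class names from the table; the actual factory in src/hipporag/embedding_model/__init__.py may differ:
def make_embedding_model(global_config, embedding_model_name: str):
    # Route on the model name prefix, as described above.
    if embedding_model_name == "nvidia/NV-Embed-v2":
        return NVEmbedV2(global_config, embedding_model_name)
    if embedding_model_name.startswith("GritLM"):
        return GritLM(global_config, embedding_model_name)
    if embedding_model_name.startswith("Transformers/"):
        return TransformersEmbeddingModel(global_config, embedding_model_name)
    if embedding_model_name.startswith("VLLM/"):
        return VLLMEmbeddingModel(global_config, embedding_model_name)
    raise ValueError(f"Unsupported embedding model: {embedding_model_name}")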
Sources: src/hipporag/embedding_model/__init__.py
Base Configuration
The BaseEmbeddingModel and EmbeddingConfig classes define the configuration schema used across all embedding implementations. Configuration parameters include:
| Parameter | Default | Description |
|---|---|---|
| embedding_batch_size | 16 | Batch size for encoding operations |
| embedding_return_as_normalized | True | Whether to normalize output embeddings |
| embedding_max_seq_len | 2048 | Maximum sequence length for tokenization |
| embedding_model_dtype | "auto" | Data type: float16, float32, bfloat16, or auto |
Sources: src/hipporag/utils/config_utils.py:16-35
Available Embedding Models
NV-Embed-v2
The NVEmbedV2 class provides integration with NVIDIA's NV-Embed-v2 embedding model, a high-performance encoder optimized for retrieval tasks.
class NVEmbedV2(BaseEmbeddingModel):
def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
super().__init__(global_config=global_config)
# Model initialization with HuggingFace transformers
Sources: src/hipporag/embedding_model/NVEmbedV2.py
GritLM
The GritLM class wraps the GritLM library for generating embeddings with built-in instruction-following capabilities.
class GritLM(BaseEmbeddingModel):
def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
super().__init__(global_config=global_config)
# GritLM-specific initialization
Sources: src/hipporag/embedding_model/GritLM.py
Transformers (SentenceTransformers)
The TransformersEmbeddingModel class enables using any model from the HuggingFace ecosystem via the SentenceTransformers library. Select this implementation by passing an embedding_model_name that starts with "Transformers/".
class TransformersEmbeddingModel(BaseEmbeddingModel):
"""
To select this implementation you can initialise HippoRAG with:
embedding_model_name starts with "Transformers/"
"""
def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
super().__init__(global_config=global_config)
self.model_id = embedding_model_name[len("Transformers/"):]
self.batch_size = 64
self.model = SentenceTransformer(
self.model_id,
device="cuda" if torch.cuda.is_available() else "cpu"
)
Key characteristics:
- Automatically detects CUDA availability for GPU acceleration
- Uses batch size of 64 for efficient processing
- Extracts the model ID by removing the "Transformers/" prefix
Sources: src/hipporag/embedding_model/Transformers.py:1-40
VLLM (Endpoint-based)
The VLLMEmbeddingModel class provides integration with OpenAI-compatible vLLM embedding endpoints. Select this implementation by passing an embedding_model_name that starts with "VLLM/".
class VLLMEmbeddingModel(BaseEmbeddingModel):
"""
To select this implementation you can initialise HippoRAG with:
embedding_model_name starts with "VLLM/"
The embedding base url should contain the v1/embeddings.
"""
def __init__(self, global_config: BaseConfig, embedding_model_name: str) -> None:
super().__init__(global_config=global_config)
self.model_id = embedding_model_name[len("VLLM/"):]
self.batch_size = 32
self.url = global_config.embedding_base_url
The model communicates with the endpoint using the OpenAI embeddings API format:
payload = {
"model": self.model_id,
"input": input_text,
}
response = requests.post(self.base_url, headers=headers, json=payload)
Sources: src/hipporag/embedding_model/VLLM.py:1-50
Query Instructions
Embedding models support query instruction templates for improving retrieval relevance. The system uses instructions for mapping queries to facts and passages:
self.search_query_instr = set([
get_query_instruction('query_to_fact'),
get_query_instruction('query_to_passage')
])
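As a rough usage sketch, a query could be prefixed with one of these instructions before encoding; the concatenation format and encode call are assumptions about the helper, not HippoRAG's confirmed API:
instruction = get_query_instruction('query_to_passage')
query_embedding = model.encode([instruction + query])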
Sources: src/hipporag/embedding_model/Transformers.py:23-27
Usage Patterns
Quick Start with OpenAI-style Models
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='gpt-4o-mini',
llm_base_url='https://api.openai.com/v1',
embedding_model_name='nvidia/NV-Embed-v2',
embedding_base_url='https://api.openai.com/v1'
)
Using Custom Endpoints
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='Your LLM Model name',
llm_base_url='Your LLM Model url',
embedding_model_name='Your Embedding model name',
embedding_base_url='Your Embedding model url'
)
Using vLLM Local Deployment
# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct --tensor-parallel-size 2
# Configure with VLLM prefix
hipporag = HippoRAG(
save_dir=save_dir,
llm_model_name='...',
embedding_model_name='VLLM/your-model-name',
embedding_base_url='http://localhost:8000/v1/embeddings'
)
Dependencies
The embedding model system depends on the following packages:
| Package | Version | Purpose |
|---|---|---|
| transformers | 4.45.2 | Core model loading |
| sentence-transformers | (via Transformers) | Sentence encoding |
| gritlm | 1.0.2 | GritLM embeddings |
| torch | 2.5.1 | GPU acceleration |
| einops | (latest) | Tensor operations |
Sources: setup.py:19-32
Configuration Parameters Summary
| Parameter | Type | Default | Description |
|---|---|---|---|
| embedding_batch_size | int | 16 | Batch size for embedding inference |
| embedding_return_as_normalized | bool | True | L2 normalize embeddings |
| embedding_max_seq_len | int | 2048 | Maximum token sequence length |
| embedding_model_dtype | str | "auto" | Model precision (float16/float32/bfloat16/auto) |
Sources: src/hipporag/embedding_model/__init__.py
Open Information Extraction (OpenIE)
Related topics: Knowledge Graph and Retrieval, LLM Integrations
Overview
Open Information Extraction (OpenIE) is a critical component in the HippoRAG pipeline that enables the extraction of structured knowledge triples from unstructured text. The system extracts entities, relations, and triples from passages to construct a knowledge graph that mimics hippocampal memory formation in biological systems.
In HippoRAG, OpenIE serves as the foundation for building the associative memory graph. Extracted triples form fact nodes in the knowledge graph, enabling Personalized PageRank (PPR) retrieval that connects related information across documents.
Sources: README.md
Architecture
The OpenIE system in HippoRAG supports multiple deployment modes and LLM backends:
graph TD
A[Unstructured Text] --> B[Information Extraction Module]
B --> C{openie_mode}
C -->|online| D[OpenAI GPT]
C -->|offline| E[vLLM Offline]
D --> F[Triple Extraction]
E --> F
F --> G[NER Processing]
G --> H[Knowledge Triples]
H --> I[Knowledge Graph Construction]
Module Structure
| Module | File | Purpose |
|---|---|---|
| Base Interface | information_extraction/__init__.py | Exports model classes |
| OpenAI Integration | openie_openai_gpt.py | Online OpenIE via OpenAI API |
| vLLM Offline | openie_vllm_offline.py | Offline batch processing with vLLM |
| Triple Extraction Prompt | prompts/templates/triple_extraction.py | LLM prompt for triple extraction |
| NER Prompt | prompts/templates/ner.py | LLM prompt for named entity recognition |
Sources: README.md - Code Structure
Configuration
ConfigUtils Class Parameters
The InformationExtractionConfig dataclass provides the following configuration options:
| Parameter | Type | Default | Description |
|---|---|---|---|
| information_extraction_model_name | Literal["openie_openai_gpt"] | "openie_openai_gpt" | Class name indicating which information extraction model to use |
| openie_mode | Literal["offline", "online"] | "online" | Mode of the OpenIE model: online uses OpenAI API, offline uses vLLM batch processing |
| skip_graph | bool | False | Whether to skip graph construction. Set to True when running vLLM offline indexing for the first time |
Sources: src/hipporag/utils/config_utils.py
Main Entry Point Configuration
In the main.py script, OpenIE parameters are passed via command-line arguments:
config = BaseConfig(
retrieval_top_k=200,
linking_top_k=5,
max_qa_steps=3,
qa_top_k=5,
graph_type="facts_and_sim_passage_node_unidirectional",
embedding_batch_size=8,
max_new_tokens=None,
corpus_len=len(corpus),
openie_mode=args.openie_mode # 'online' or 'offline'
)
Command-line arguments:
- --openie_mode: choose between online (OpenAI API) and offline (vLLM)
- --force_openie_from_scratch: if False, reuse existing OpenIE results when available
Sources: main.py
Extraction Workflow
Triple Extraction Process
The triple extraction workflow follows these steps:
sequenceDiagram
participant Text as Raw Text Input
participant Triple as Triple Extraction Prompt
participant LLM as Language Model
participant NER as NER Prompt
participant Output as Knowledge Triples
Text->>Triple: Passage text
Triple->>LLM: Structured prompt
LLM->>Output: Subject-Predicate-Object triples
Output->>NER: Named Entity Recognition
NER->>LLM: Entity labels
LLM->>Output: Typed entities
Supported Deployment Modes
| Mode | Backend | Use Case | API Key Required |
|---|---|---|---|
| online | OpenAI GPT | Quick testing, small corpora | Yes (OPENAI_API_KEY) |
| offline | vLLM | Large-scale indexing, cost efficiency | No (local deployment) |
Knowledge Graph Integration
OpenIE extracted triples are converted into graph structures:
graph LR
A[Passage Text] -->|OpenIE| B[Triple: Entity1 → Relation → Entity2]
B --> C[Fact Node]
C --> D[Knowledge Graph]
D --> E[Personalized PageRank]
E --> F[Associative Retrieval]
The extracted triples serve dual purposes:
- Fact Nodes: Create direct connections between related entities
- Association Links: Enable multi-hop reasoning through the graph
This design mirrors the dentate gyrus pattern separation mechanism in the hippocampus, where similar memories are differentiated to reduce interference.
Sources: README.md - Methodology
Usage Examples
Online Mode (OpenAI)
from hipporag import HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
# OpenIE runs automatically during indexing
hipporag.index(docs=["Passage containing facts to extract."])
Offline Mode (vLLM)
# 1. Start vLLM server
vllm serve meta-llama/Llama-3.3-70B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95
# 2. Run indexing with offline OpenIE
python main.py --dataset sample --openie_mode offline
Sources: README.md - Quick Start
Dependencies
The OpenIE system requires the following core dependencies:
| Package | Version | Purpose |
|---|---|---|
| torch | 2.5.1 | PyTorch backend |
| transformers | 4.45.2 | Model architecture |
| openai | 1.91.1 | Online OpenAI API |
| vllm | 0.6.6.post1 | Offline inference |
| litellm | 1.73.1 | Unified LLM interface |
| tqdm | - | Progress bars |
Sources: setup.py
Extracted Data Format
OpenIE produces structured triples in the following format:
| Field | Type | Description |
|---|---|---|
| subject | str | First entity |
| predicate | str | Relation verb/phrase |
| object | str | Second entity |
| context | str | Source passage text |
These triples are then processed into graph nodes and edges for the knowledge graph construction phase.
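A hedged sketch of that conversion using networkx (a core dependency): each triple becomes a directed edge, while the real pipeline also adds synonymy and passage edges:
import networkx as nx
def triples_to_graph(triples: list) -> nx.DiGraph:
    graph = nx.DiGraph()
    for t in triples:
        # Entities become nodes; the predicate and source passage are
        # stored as edge attributes, following the field table above.
        graph.add_edge(t["subject"], t["object"],
                       predicate=t["predicate"], context=t["context"])
    return graph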
Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
Deployment Options
Related topics: Installation and Setup, LLM Integrations
Deployment Options
HippoRAG supports multiple deployment configurations to accommodate different infrastructure requirements and use cases. This page documents the available deployment options, configuration parameters, and setup procedures for running HippoRAG in various environments.
Overview
HippoRAG provides three primary deployment models:
| Deployment Type | LLM Backend | Embedding Backend | Typical Use Case |
|---|---|---|---|
| OpenAI API | OpenAI hosted models | OpenAI/NVIDIA hosted | Quickstart, development |
| vLLM (Local) | Self-hosted LLMs via vLLM | Local embedding models | Production, cost-sensitive |
| Azure OpenAI | Azure-hosted models | Azure-hosted embeddings | Enterprise compliance |
Sources: README.md
Environment Setup
Regardless of deployment type, certain environment variables must be configured:
export CUDA_VISIBLE_DEVICES=0,1,2,3
export HF_HOME=<path to Huggingface home directory>
For OpenAI and Azure deployments, additional API credentials are required:
export OPENAI_API_KEY=<your openai api key>
Sources: README.md:1
OpenAI API Deployment
The simplest deployment option uses OpenAI's hosted API endpoints for both LLM inference and embeddings.
Configuration Parameters
| Parameter | Description | Example Value |
|---|---|---|
| --llm_base_url | OpenAI API endpoint | https://api.openai.com/v1 |
| --llm_name | OpenAI model identifier | gpt-4o-mini |
| --embedding_name | Embedding model name | nvidia/NV-Embed-v2 |
Running with OpenAI Models
dataset=sample
python main.py --dataset $dataset \
--llm_base_url https://api.openai.com/v1 \
--llm_name gpt-4o-mini \
--embedding_name nvidia/NV-Embed-v2
Sources: README.md:1
Programmatic Usage
from hipporag import HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2'
)
Sources: README.md:1
Local vLLM Deployment
For production environments or cost-sensitive deployments, HippoRAG supports self-hosted LLMs using vLLM.
Architecture
graph TD
A[HippoRAG Main Process] --> B[vLLM Server]
A --> C[Local Embedding Model]
B --> D[GPU 0-1]
C --> D
E[Indexing Pipeline] --> A
F[QA Pipeline] --> A
Starting vLLM Server
Launch the vLLM server with tensor parallelism for multi-GPU setups:
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>
vllm serve meta-llama/Llama-3.3-70B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95 \
--port 6578
Sources: README.md:1
Configuration Parameters
| Parameter | Description | Default |
|---|---|---|
| --llm_base_url | vLLM server endpoint | http://localhost:6578/v1 |
| --llm_name | Model name (must match deployed model) | meta-llama/Llama-3.1-8B-Instruct |
| --embedding_name | Local embedding model identifier | nvidia/NV-Embed-v2 |
Running Main Process
With vLLM server running on GPUs 0-1, run the main process on separate GPUs:
export CUDA_VISIBLE_DEVICES=2,3
export HF_HOME=<path to Huggingface home directory>
python main.py --dataset $dataset \
--llm_base_url http://localhost:6578/v1 \
--llm_name meta-llama/Llama-3.3-70B-Instruct \
--embedding_name nvidia/NV-Embed-v2
Sources: README.md:1
Azure OpenAI Deployment
Enterprise deployments requiring Azure infrastructure can use Azure OpenAI endpoints.
Configuration Parameters
| Parameter | CLI Argument | Description |
|---|---|---|
| azure_endpoint | --azure_endpoint | Azure OpenAI chat completions endpoint |
| azure_embedding_endpoint | --azure_embedding_endpoint | Azure OpenAI embeddings endpoint |
Endpoint Format
azure_endpoint = (
"https://[ENDPOINT_NAME].openai.azure.com/"
"openai/deployments/gpt-4o-mini/chat/completions"
"?api-version=2025-01-01-preview"
)
azure_embedding_endpoint = (
"https://[ENDPOINT_NAME].openai.azure.com/"
"openai/deployments/text-embedding-3-small/embeddings"
"?api-version=2023-05-15"
)
Sources: demo_azure.py
Programmatic Usage
from hipporag import HippoRAG
hipporag = HippoRAG(
save_dir='outputs',
llm_model_name='gpt-4o-mini',
embedding_model_name='nvidia/NV-Embed-v2',
azure_endpoint="https://[ENDPOINT_NAME].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview",
azure_embedding_endpoint="https://[ENDPOINT_NAME].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15"
)
hipporag.index(docs=docs)
Sources: demo_azure.py
CLI Usage
python main_azure.py \
--dataset sample \
--azure_endpoint "https://[ENDPOINT].openai.azure.com/openai/deployments/gpt-4o-mini/chat/completions?api-version=2025-01-01-preview" \
--azure_embedding_endpoint "https://[ENDPOINT].openai.azure.com/openai/deployments/text-embedding-3-small/embeddings?api-version=2023-05-15" \
--save_dir outputs
Sources: main_azure.py
Indexing Options
OpenIE Modes
HippoRAG supports two Open Information Extraction (OpenIE) modes:
| Mode | Description | Resource Usage |
|---|---|---|
| online | Uses OpenAI GPT for real-time extraction | API costs |
| offline | Uses local vLLM batch processing | GPU compute |
python main.py --dataset $dataset --openie_mode offline
Sources: main.py:1
Force Rebuild Options
| Parameter | Description |
|---|---|
| --force_index_from_scratch | Ignores existing storage and rebuilds from scratch |
| --force_openie_from_scratch | Ignores cached OpenIE results and recomputes |
python main_azure.py \
--force_index_from_scratch true \
--force_openie_from_scratch true
Sources: main_azure.py
StandardRAG vs HippoRAG
The codebase provides two RAG implementations selectable via configuration:
# Standard HippoRAG (default)
hipporag = HippoRAG(global_config=config)
# Alternative DPR-style implementation
hipporag = StandardRAG(global_config=config)
Sources: main.py and main_dpr.py
Installation Requirements
All deployment options require the HippoRAG package and its dependencies:
conda create -n hipporag python=3.10
conda activate hipporag
pip install hipporag
Or install from source:
pip install -e .
Core dependencies include:
| Package | Version | Purpose |
|---|---|---|
| torch | 2.5.1 | Deep learning framework |
| transformers | 4.45.2 | Model loading |
| vllm | 0.6.6.post1 | Local inference |
| openai | 1.91.1 | API client |
| litellm | 1.73.1 | Unified LLM interface |
| gritlm | 1.0.2 | Embedding models |
| networkx | 3.4.2 | Graph operations |
| pydantic | 2.10.4 | Configuration validation |
Sources: setup.py
Testing Deployments
OpenAI Test
export OPENAI_API_KEY=<your openai api key>
conda activate hipporag
python tests_openai.py
Sources: README.md:1
Local vLLM Test
export CUDA_VISIBLE_DEVICES=0,1
export VLLM_WORKER_MULTIPROC_METHOD=spawn
export HF_HOME=<path to Huggingface home directory>
# Start vLLM server
vllm serve meta-llama/Llama-3.1-8B-Instruct \
--tensor-parallel-size 2 \
--max_model_len 4096 \
--gpu-memory-utilization 0.95 \
--port 6578
# Run tests
CUDA_VISIBLE_DEVICES=2 python tests_local.py
Sources: README.md:1
Azure Test
python tests_azure.py
Sources: tests_azure.py
Deployment Decision Matrix
| Criteria | OpenAI API | vLLM Local | Azure |
|---|---|---|---|
| Setup complexity | Low | High | Medium |
| Cost | Pay-per-use | GPU infrastructure | Azure subscription |
| Data privacy | Data leaves your environment | All data stays local | Configurable |
| Latency | Network dependent | Local, optimized | Network dependent |
| Model flexibility | Limited to API models | Any HuggingFace model | Limited to deployed models |
| Recommended for | Development, prototyping | Production, research | Enterprise compliance |
Sources: [README.md](https://github.com/OSU-NLP-Group/HippoRAG/blob/main/README.md)
Doramagic Pitfall Log
Doramagic extracted 16 source-linked risk signals. Review them before installing or handing real data to the project.
1. Installation risk: add_fact_edges function adds the same edge twice?
- Severity: high
- Finding: Installation risk is backed by a source signal: add_fact_edges function adds the same edge twice?. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/174
2. Installation risk: pypi hipporag libraries
- Severity: high
- Finding: Installation risk is backed by a source signal: pypi hipporag libraries. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/168
3. Security or permission risk: Take the "musique" dataset as an example. The process of constructing an index based on individual paragraphs takes an…
- Severity: high
- Finding: Security or permission risk is backed by a source signal: Take the "musique" dataset as an example. The process of constructing an index based on individual paragraphs takes an…. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/173
4. Installation risk: OpenAI version incompatibility in latest 2.0.0a4 version
- Severity: medium
- Finding: Installation risk is backed by a source signal: OpenAI version incompatibility in latest 2.0.0a4 version. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/140
5. Installation risk: Windows Compatibility Issues with vLLM dependency
- Severity: medium
- Finding: Installation risk is backed by a source signal: Windows Compatibility Issues with vLLM dependency. Treat it as a review item until the current version is checked.
- User impact: First-time setup may fail or require extra isolation and rollback planning.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/117
6. Configuration risk: How to use local embedding_model_
- Severity: medium
- Finding: Configuration risk is backed by a source signal: How to use local embedding_model_. Treat it as a review item until the current version is checked.
- User impact: Users may get misleading failures or incomplete behavior unless configuration is checked carefully.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/127
7. Capability assumption: README/documentation is current enough for a first validation pass.
- Severity: medium
- Finding: README/documentation is current enough for a first validation pass.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: capability.assumptions | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | README/documentation is current enough for a first validation pass.
8. Project risk: Inquiry Regarding OpenIE Extraction Results for HippoRAG 2
- Severity: medium
- Finding: Project risk is backed by a source signal: Inquiry Regarding OpenIE Extraction Results for HippoRAG 2. Treat it as a review item until the current version is checked.
- User impact: The project should not be treated as fully validated until this signal is reviewed.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/177
9. Maintenance risk: Maintainer activity is unknown
- Severity: medium
- Finding: Maintenance risk is backed by a source signal: Maintainer activity is unknown. Treat it as a review item until the current version is checked.
- User impact: Users cannot judge support quality until recent activity, releases, and issue response are checked.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: evidence.maintainer_signals | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | last_activity_observed missing
10. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: downstream_validation.risk_items | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | no_demo; severity=medium
11. Security or permission risk: no_demo
- Severity: medium
- Finding: no_demo
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: risks.scoring_risks | github_repo:805115184 | https://github.com/OSU-NLP-Group/HippoRAG | no_demo; severity=medium
12. Security or permission risk: How to distinguish Hipporag1 from Hipporag2
- Severity: medium
- Finding: Security or permission risk is backed by a source signal: How to distinguish Hipporag1 from Hipporag2. Treat it as a review item until the current version is checked.
- User impact: The project may affect permissions, credentials, data exposure, or host boundaries.
- Recommended check: Open the linked source, confirm whether it still applies to the current version, and keep the first run isolated.
- Evidence: Source-linked evidence: https://github.com/OSU-NLP-Group/HippoRAG/issues/167
Source: Doramagic discovery, validation, and Project Pack records
Community Discussion Evidence
Doramagic exposes project-level community discussion separately from official documentation. Review these links before using HippoRAG with real data or production workflows.
- [[Discussion] Ablation: multi-component scoring layer over HippoRAG's KG?](https://github.com/OSU-NLP-Group/HippoRAG/issues/178) - github / github_issue
- Inquiry Regarding OpenIE Extraction Results for HippoRAG 2 - github / github_issue
- How to use local embedding_model_ - github / github_issue
- add_fact_edges function adds the same edge twice? - github / github_issue
- Quadratic runtime during indexing - github / github_issue
- Take the "musique" dataset as an example. The process of constructing an - github / github_issue
- Windows Compatibility Issues with vLLM dependency - github / github_issue
- OpenAI version incompatibility in latest 2.0.0a4 version - github / github_issue
- division by zero - github / github_issue
- Inquiry on Sample Selection for HippoRAG Experiments - github / github_issue
- How to distinguish Hipporag1 from Hipporag2 - github / github_issue
- pypi hipporag libraries - github / github_issue
Source: Project Pack community evidence and pitfall evidence