Doramagic Project Pack · Human Manual

milvus

The Quick Start Guide provides a streamlined path for new users to begin using Milvus, an open-source vector database optimized for AI applications. This guide covers essential operations ...

Milvus Introduction

Related topics: System Architecture, Quick Start Guide

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Core Capabilities

Continue reading this section for the full explanation and source context.

Section Key Components

Continue reading this section for the full explanation and source context.

Section Supported MQ Types

Continue reading this section for the full explanation and source context.

Related topics: System Architecture, Quick Start Guide

Milvus Introduction

Milvus is an open-source vector database designed for production-level AI applications. It enables efficient storage, indexing, and searching of dense vector embeddings at scale, supporting billions of vectors with millisecond query latency.

What is Milvus?

Milvus is a cloud-native, highly scalable vector database purpose-built for similarity search and AI-powered applications. The system handles vector embeddings—numerical representations of data such as images, text, audio, and video—and enables fast approximate nearest neighbor (ANN) queries across massive datasets.

Core Capabilities

CapabilityDescription
Vector Similarity SearchFind semantically similar vectors using distance metrics (IP, L2, Hamming, Jaccard)
Hybrid SearchCombine vector search with scalar filtering on metadata fields
Multi-tenancySupport thousands of collections with resource isolation
Time TravelQuery historical data states for audit and reproducibility
Horizontal ScalabilityScale data planes independently from coordination planes

Source: tools/README.md:1-8

System Architecture

Milvus follows a distributed architecture designed for horizontal scalability and high availability. The system separates concerns between data management and coordination services.

graph TD
    subgraph Client Layer
        Python[Python SDK]
        Go[Go SDK]
        Java[Java SDK]
        Node[Node.js SDK]
    end
    
    subgraph Coordination Layer
        RootCoord[Root Coordinator]
        Proxy[Proxy Service]
        IndexCoord[Index Coordinator]
        QueryCoord[Query Coordinator]
        DataCoord[Data Coordinator]
    end
    
    subgraph Data Layer
        QueryNode[Query Nodes]
        DataNode[Data Nodes]
        IndexNode[Index Nodes]
    end
    
    subgraph Storage Layer
        RocksMQ[RocksMQ / Pulsar / Kafka]
        MetaStore[(Metadata Store)]
        ObjectStorage[(Object Storage)]
    end
    
    Python & Go & Java & Node --> Proxy
    Proxy --> RootCoord
    Proxy --> QueryCoord
    Proxy --> DataCoord
    RootCoord --> MetaStore
    QueryCoord --> QueryNode
    DataCoord --> DataNode
    IndexCoord --> IndexNode
    QueryNode & DataNode --> RocksMQ
    QueryNode & DataNode --> ObjectStorage

Key Components

Proxy Service The entry point for all client requests. It validates, transforms, and routes requests to appropriate coordinator services.

Root Coordinator Manages collection and partition metadata, handles DDL operations, and coordinates timestamps for distributed transactions.

Query Coordinator Manages query node resources and load balancing. Tracks which segments are loaded into memory for fast querying.

Data Coordinator Manages data node resources, handles segment compaction, and coordinates data persistence to object storage.

Index Coordinator Manages index building tasks across index nodes. Supports multiple index types including IVF, HNSW, DiskANN, and ANNOY.

Source: internal/streamingcoord/server/service/assignment.go:1-25

Message Queue Integration

Milvus supports multiple message queue backends for storing write-ahead logs and enabling reliable message delivery.

Supported MQ Types

MQ TypeDescriptionUse Case
RocksMQBuilt-in RockDB-based message queueStandalone deployments
PulsarApache Pulsar for distributed messagingProduction clusters
KafkaApache Kafka for event streamingIntegration with Kafka ecosystems

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:1-30

RocksMQ Implementation

RocksMQ provides an embedded message queue solution using RocksDB for persistent storage. It implements a consumer group model with the following characteristics:

graph LR
    Producer[Producer] --> Topic[Topic]
    Topic --> Page1[Page 1]
    Topic --> Page2[Page 2]
    Topic --> PageN[Page N]
    
    subgraph Consumer Groups
        CG1[Consumer Group A]
        CG2[Consumer Group B]
    end
    
    Page1 & Page2 & PageN --> CG1
    Page1 & Page2 & PageN --> CG2

Consumer Management

  • Each consumer group tracks its own consumption position
  • Message acknowledgment is tracked per consumer group
  • Retention policies clean up acknowledged messages based on size or time thresholds

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_retention.go:1-60

Retention Configuration

RocksMQ supports configurable retention policies:

ParameterDescriptionDefault
RetentionSizeInMBMinimum acknowledged size before cleanup-1 (disabled)
RetentionTimeInMinutesMinimum time before acknowledged messages are eligible-1 (disabled)

When both are set to -1, retention is disabled and messages are only removed when explicitly acknowledged.

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:45-50

SDK Support

Milvus provides official SDKs for multiple programming languages, all maintained under the same version cadence as the core server.

Available SDKs

SDKLatest VersionStatus
Python2.6.14Production
Go2.6.5Production
Java2.6.20Production
Node.js2.6.14Production

Go SDK (client/v2)

The Go SDK provides a high-performance client library for Milvus:

import "github.com/milvus-io/milvus/client/v2/milvusclient"

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

cli, err := milvusclient.New(ctx, &milvusclient.ClientConfig{
    Address: milvusAddr,
})

Prerequisites: Go 1.24.12 or higher

Key Features:

  • Struct-array support with vector sub-fields
  • Nullable vector columns (dense, binary, sparse, int8)
  • Array field partial updates (APPEND, REMOVE)
  • Custom gRPC DialOptions with preserved default settings

Source: client/README.md:1-35

Recent SDK Features (v2.6.x)

VersionFeature
v2.6.5Nullable vector columns, Array field helpers
v2.6.4Struct-array support, EmbeddingList/MAX_SIM search
v2.6.3TruncateCollection API, GetReplicateConfiguration API

Deployment Modes

Milvus supports three deployment configurations based on scale and requirements:

Standalone Mode

Single-node deployment with all components running in one process. Uses RocksMQ by default.

┌─────────────────────────────────────┐
│         Milvus Standalone           │
│  ┌─────────────────────────────┐   │
│  │  Proxy + All Coordinators   │   │
│  └─────────────────────────────┘   │
│  ┌─────────────────────────────┐   │
│  │  Query + Data + Index Nodes │   │
│  └─────────────────────────────┘   │
│  ┌─────────────────────────────┐   │
│  │         RocksMQ             │   │
│  └─────────────────────────────┘   │
└─────────────────────────────────────┘

Cluster Mode

Distributed deployment with separate coordinator services and worker nodes.

Distributed Mode

Multi-node cluster with separate storage backends (Pulsar/Kafka, S3/MinIO, etcd).

Source: tools/README.md:1-15

Logging Architecture

Milvus uses a context-aware logging library (mlog) built on zap for distributed system observability.

Design Principles

  1. Mandatory Context Passing - All logging operations require a context, ensuring request traceability
  2. Zero-Overhead Abstraction - Uses type aliases for performance comparable to direct zap usage
  3. Automatic Field Accumulation - Context fields automatically propagate through call chains
  4. Lazy Encoding - Deferred field encoding avoids overhead when log level is disabled

Key Log Fields

Field FunctionTypePurpose
FieldCollectionID(val)int64Collection identifier
FieldSegmentID(val)int64Segment identifier
FieldPChannel(val)stringPhysical channel name
FieldMessageID(val)ObjectMarshalerMessage identifier

Source: pkg/mlog/README.md:1-30

Streaming Coordination

The streaming coordination service manages physical channel assignments across streaming nodes.

PChannel States

stateDiagram-v2
    [*] --> UNASSIGNED
    UNASSIGNED --> ASSIGNING
    ASSIGNING --> ASSIGNED
    ASSIGNED --> ASSIGNING : Rebalancing
    ASSIGNING --> UNASSIGNED : Failure
    ASSIGNED --> [*]

State Definitions:

  • UNASSIGNED: Channel not assigned to any node
  • ASSIGNING: Transition state during assignment
  • ASSIGNED: Active on a streaming node

Assignment Discovery

The AssignmentService watches and broadcasts channel state changes:

func (s *assignmentServiceImpl) AssignmentDiscover(
    server streamingpb.StreamingCoordAssignmentService_AssignmentDiscoverServer
) error {
    s.listenerTotal.Inc()
    defer s.listenerTotal.Dec()
    
    balancer, err := balance.GetWithContext(server.Context())
    if err != nil {
        return err
    }
    // ... assignment streaming
}

Source: internal/streamingcoord/server/service/assignment.go:25-45

Channel History Tracking

Each physical channel maintains an assignment history for debugging and load balancing decisions:

func (c *PChannelMeta) GetHistories() []types.PChannelInfoAssigned {
    history := make([]types.PChannelInfoAssigned, 0, len(c.inner.Histories))
    for _, h := range c.inner.Histories {
        history = append(history, types.PChannelInfoAssigned{
            Channel: types.PChannelInfo{
                Name:       c.inner.GetChannel().GetName(),
                Term:       h.Term,
                AccessMode: types.AccessMode(h.AccessMode),
            },
            Node: types.NewStreamingNodeInfoFromProto(h.Node),
        })
    }
    return history
}

Source: internal/streamingcoord/server/balancer/channel/pchannel.go:1-25

Supported Index Types

Milvus supports multiple indexing algorithms optimized for different use cases:

Index TypeDescriptionBest For
FLATBrute-force searchSmall datasets, accuracy-critical
IVF_FLATInverted file indexBalanced speed/accuracy
IVF_SQ8Scalar quantized IVFMemory-constrained
HNSWHierarchical NSWFast queries, higher memory
DiskANNDisk-based ANNBillion-scale datasets
ANNOYApproximate nearest neighborsMemory-efficient

Recent Releases

Latest Stable Release (v2.6.17)

  • Release date: May 22, 2026
  • Python SDK: 2.6.14, Go SDK: 2.6.4, Java SDK: 2.6.20, Node.js SDK: 2.6.14

Security Release (v2.5.27)

  • Release date: February 27, 2026
  • Critical security patches (CVE fixes)

Feature Highlights by Version

VersionKey Features
v2.6.17Performance improvements, bug fixes
v2.6.15Gemini embedding model support
v2.6.14Enhanced index building
v2.6.13Streaming improvements
v2.6.12Query optimization

Community Considerations

Based on community feedback and engagement, users have expressed interest in:

  1. String Field Support - Extending field types beyond numerical values
  2. Flexible Schema Modification - Adding fields to existing non-empty collections
  3. Backup and Restore - Native backup mechanisms for production deployments
  4. String IDs - Primary key support beyond integer types
  5. ScaNN Index - Google's ScaNN algorithm for improved ANN performance

These feature requests reflect common production requirements that influence how developers design their AI applications around Milvus.

Next Steps

Source: https://github.com/milvus-io/milvus / Human Manual

Quick Start Guide

Related topics: Milvus Introduction, Go SDK (client/v2)

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Standalone Mode (Single Node)

Continue reading this section for the full explanation and source context.

Section Cluster Mode (Distributed)

Continue reading this section for the full explanation and source context.

Section SDK Installation

Continue reading this section for the full explanation and source context.

Related topics: Milvus Introduction, Go SDK (client/v2)

Quick Start Guide

Overview

The Quick Start Guide provides a streamlined path for new users to begin using Milvus, an open-source vector database optimized for AI applications. This guide covers essential operations including installation, connecting to Milvus, creating collections with schemas, inserting vectors, and performing similarity searches.

The guide is designed to help developers get from zero to their first vector search in under 15 minutes using the official SDKs.

Prerequisites

RequirementMinimum VersionRecommended
Python3.8+3.10+
JavaJDK 8+JDK 11+
Go1.18+1.21+
Node.js14+18+
Docker20.10+Latest
Milvus Server2.3+2.6.x

Installation

Standalone Mode (Single Node)

For local development and testing, Milvus standalone can be started using Docker Compose:

# Download configuration file
wget https://github.com/milvus-io/milvus/releases/download/v2.6.17/milvus-standalone-docker-compose.yml
mv milvus-standalone-docker-compose.yml docker-compose.yml

# Start Milvus
docker-compose up -d

The standalone instance will be available at localhost:19530.

Cluster Mode (Distributed)

For production deployments requiring high availability and scalability, deploy Milvus in cluster mode using Kubernetes or docker-compose with multiple nodes.

SDK Installation

Python SDK (pymilvus):

pip install pymilvus>=2.6.0

Go SDK (client/v2):

go get github.com/milvus-io/milvus-go-sdk/v2@latest

Core Concepts

Collections

A collection is the primary data organization unit in Milvus, containing structured data with vector fields and scalar fields. Collections can be created with predefined schemas or dynamically with flexible fields.

Schemas

Each collection requires a schema that defines:

  • Primary Key Field: Unique identifier for each entity (Int64 or VarChar)
  • Vector Field: Dense or sparse vector data for similarity search
  • Scalar Fields: Optional metadata fields (Int, Float, Boolean, String, etc.)

Index Types

Milvus supports multiple index types optimized for different search requirements:

Index TypeUse CaseMemory Efficiency
IVF_FLATBalanced accuracy/speedMedium
IVF_SQ8Reduced memoryHigh
HNSWFastest searchHigh
DISKANNBillion-scale datasetsVery High

Connection and Authentication

Basic Connection

from pymilvus import MilvusClient

# Connect to local Milvus
client = MilvusClient(uri="http://localhost:19530")

Authenticated Connection

from pymilvus import MilvusClient

# Connect with authentication
client = MilvusClient(
    uri="http://localhost:19530",
    token="user:password"
)

Connection with TLS

from pymilvus import MilvusClient

client = MilvusClient(
    uri="https://localhost:19530",
    secure=True,
    tls_ca_cert="path/to/ca.crt"
)

Collection Operations

Creating a Collection

from pymilvus import MilvusClient, DataType

client = MilvusClient()

# Define schema
schema = MilvusClient.create_schema(
    auto_id=False,
    enable_dynamic_field=True,
)

# Add primary key field
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)

# Add vector field
schema.add_field(field_name="vector", datatype=DataType.FLOAT_VECTOR, dim=128)

# Add scalar fields
schema.add_field(field_name="color", datatype=DataType.VARCHAR, max_length=100)

# Create collection with index parameters
index_params = client.prepare_index_params()
index_params.add_index(
    field_name="vector",
    index_type="AUTOINDEX",
    metric_type="COSINE"
)

client.create_collection(
    collection_name="quick_start",
    schema=schema,
    index_params=index_params
)

Inserting Data

import random

# Generate sample data
data = [
    {"id": i, "vector": [random.random() for _ in range(128)], "color": "red"}
    for i in range(1000)
]

# Insert data
result = client.insert(
    collection_name="quick_start",
    data=data
)

print(f"Inserted {result['insert_count']} entities")

Searching Data

# Prepare query vector
query_vector = [[random.random() for _ in range(128)]]

# Search for similar vectors
results = client.search(
    collection_name="quick_start",
    data=query_vector,
    limit=10,
    output_fields=["id", "color"]
)

# Process results
for result in results:
    print(f"ID: {result['id']}, Distance: {result['distance']}")

Data Flow Architecture

graph TD
    A[Client Application] -->|SDK API| B[Milvus Server]
    B --> C[Proxy Layer]
    C --> D[Query Node]
    C --> E[Data Node]
    D --> F[Object Storage]
    E --> F
    G[Index files] --> F
    H[Segment files] --> F

Go SDK Quick Start

package main

import (
    "context"
    "fmt"
    
    milvusclient "github.com/milvus-io/milvus-go-sdk/v2/client"
)

func main() {
    ctx := context.Background()
    
    // Connect to Milvus
    cli, err := milvusclient.NewGrpcClient(ctx, "localhost:19530")
    if err != nil {
        panic(err)
    }
    defer cli.Close()
    
    // Create collection option
    opt := milvusclient.NewCreateCollectionOption("my_collection", 128)
    
    // Create collection
    err = cli.CreateCollection(ctx, opt)
    if err != nil {
        panic(err)
    }
    
    fmt.Println("Collection created successfully")
}

Common Patterns

Batch Operations

For high-throughput scenarios, batch inserts in groups of 1000-5000 entities for optimal performance:

# Batch insert configuration
BATCH_SIZE = 1000

for batch_start in range(0, total_count, BATCH_SIZE):
    batch_end = min(batch_start + BATCH_SIZE, total_count)
    batch_data = generate_vectors(batch_start, batch_end)
    
    client.insert(
        collection_name="quick_start",
        data=batch_data
    )

Search with Filters

Apply metadata filters during similarity searches:

results = client.search(
    collection_name="quick_start",
    data=[query_vector],
    filter="color == 'red'",
    limit=10
)

Troubleshooting

IssueCauseSolution
Connection refusedMilvus not runningVerify docker container is running
Authentication failedInvalid credentialsCheck username/password configuration
Index not foundSearch before index creationWait for index build to complete
Out of memoryInsufficient memoryReduce batch size or increase system RAM

Next Steps

  • Schema Design: Review collection schema best practices for your use case
  • Index Tuning: Experiment with different index types for optimal performance
  • Partitioning: Use partitions to improve query performance on large datasets
  • Backup/Restore: Implement backup strategies for production deployments

See Also

Source: https://github.com/milvus-io/milvus / Human Manual

System Architecture

Related topics: Coordinator Services, Data Storage Layer, Data Ingestion and Flow

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Message Queue (RocksMQ)

Continue reading this section for the full explanation and source context.

Section Retention System

Continue reading this section for the full explanation and source context.

Section Streaming Coordination

Continue reading this section for the full explanation and source context.

Related topics: Coordinator Services, Data Storage Layer, Data Ingestion and Flow

System Architecture

Overview

Milvus is a cloud-native, highly scalable vector database designed for vector similarity search and AI applications. The system architecture follows a distributed design pattern that separates concerns between message queuing, storage coordination, data processing, and query execution.

The architecture is built around several core subsystems:

  • Message Queue Layer: RocksMQ, Pulsar, or Kafka for event streaming
  • Coordination Layer: Root Coordinator and Proxy components for cluster management
  • Storage Layer: Distributed storage with retention policies
  • Query Layer: Query nodes for vector search execution
  • Logging Infrastructure: Context-aware structured logging

Core Components

Message Queue (RocksMQ)

RocksMQ is Milvus's default message queue implementation built on RocksDB. It provides persistent message storage with consumer group management and retention capabilities.

#### Topic Management

Topics in RocksMQ are identified by unique keys stored in RocksDB. The topic identifier follows a naming convention where topic names cannot contain forward slashes (/).

// Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:1-30
topicIDKey := TopicIDTitle + topicName

Topic creation validation:

  • Checks for existing topics to prevent duplicates
  • Validates topic name does not contain reserved characters
  • Uses mutex locks to ensure thread-safe creation
  • Stores topic metadata with initialization markers

#### Consumer Group Management

Consumer groups are managed through the consumerList structure, which maintains a map of consumers by group name:

// Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:40-60
type consumerList struct {
    consumers map[string]*Consumer
    mu        sync.RWMutex
}

Key operations include:

OperationDescription
Add(consumer)Register a new consumer in the group
Remove(groupName)Unregister a consumer
Get(groupName)Retrieve consumer by group name
Notify(groupName)Signal consumer for message processing

#### Page-Based Message Storage

Messages are stored in fixed-size pages to enable efficient retrieval and cleanup:

// Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:80-120
func (rmq *rocksmq) updateTopicPageInfo(topicName string, msgIDs []UniqueID, msgSizes map[UniqueID]int64) error

The page management system:

  • Tracks page boundaries using message IDs
  • Records message timestamps for retention decisions
  • Uses a mutation buffer for atomic page updates
  • Supports configurable page size (default: 64MB)

Retention System

The retention system automatically cleans up old messages based on size and time policies:

// Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:35-38
func checkRetention() bool {
    params := paramtable.Get()
    return params.RocksmqCfg.RetentionSizeInMB.GetAsInt64() != -1 || 
           params.RocksmqCfg.RetentionTimeInMinutes.GetAsInt64() != -1
}

Retention policies are evaluated by:

  1. Scanning acknowledged message pages
  2. Checking message age against RetentionTimeInMinutes
  3. Checking cumulative size against RetentionSizeInMB
  4. Deleting expired pages using range deletes

#### Page Deletion Implementation

// Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_retention.go:1-30
func DeleteMessages(db *gorocksdb.DB, topic string, startID, endID UniqueID) error {
    startKey := path.Join(topic, strconv.FormatInt(startID, 10))
    endKey := path.Join(topic, strconv.FormatInt(endID+1, 10))
    writeBatch := gorocksdb.NewWriteBatch()
    defer writeBatch.Destroy()
    writeBatch.DeleteRange([]byte(startKey), []byte(endKey))
    // ...
}

Streaming Coordination

The streaming coordination system manages persistent channels (PChannels) that distribute write operations across the cluster.

#### PChannel Meta Management

PChannels represent logical channels for data streaming, with state tracking and assignment history:

// Source: internal/streamingcoord/server/balancer/channel/pchannel.go:1-50
type PChannelMeta struct {
    inner *streamingpb.PChannelMeta
}

Key PChannel states:

StateDescription
PCHANNEL_META_STATE_ASSIGNINGChannel is being assigned to a node
PCHANNEL_META_STATE_ASSIGNEDChannel is actively assigned
Other statesReserved for future use

#### Channel Assignment History

The system maintains assignment history for debugging and load balancing:

// Source: internal/streamingcoord/server/balancer/channel/pchannel.go:60-80
func (c *PChannelMeta) GetChannelHistory() []types.PChannelInfoAssigned {
    history := make([]types.PChannelInfoAssigned, 0, len(c.inner.Histories))
    for _, h := range c.inner.Histories {
        history = append(history, types.PChannelInfoAssigned{
            Channel: types.PChannelInfo{
                Name:       c.inner.GetChannel().GetName(),
                Term:       h.Term,
                AccessMode: types.AccessMode(h.AccessMode),
            },
            Node: types.NewStreamingNodeInfoFromProto(h.Node),
        })
    }
    return history
}

#### State Query Methods

// Source: internal/streamingcoord/server/balancer/channel/pchannel.go:80-100
func (c *PChannelMeta) IsAssigned() bool {
    return c.inner.State == streamingpb.PChannelMetaState_PCHANNEL_META_STATE_ASSIGNED
}

func (c *PChannelMeta) IsAssignedOrAssigning() bool {
    return c.inner.State == streamingpb.PChannelMetaState_PCHANNEL_META_STATE_ASSIGNED || 
           c.inner.State == streamingpb.PChannelMetaState_PCHANNEL_META_STATE_ASSIGNING
}

Load Balancing Policy

The vChannel fair policy distributes channels across streaming nodes while maintaining balance:

// Source: internal/streamingcoord/server/balancer/policy/vchannelfair/expected_layout.go:1-30
type expectedLayoutForVChannelFairPolicy struct {
    Config                 policyConfig
    CurrentLayout          balancer.CurrentLayout
    AveragePChannelPerNode float64
    AverageVChannelPerNode float64
    Assignments            map[types.ChannelID]types.PChannelInfoAssigned
    Nodes                  nodes
}

#### Assignment Snapshot

// Source: internal/streamingcoord/server/balancer/policy/vchannelfair/expected_layout.go:50-70
type assignmentSnapshot struct {
    Assignments           map[types.ChannelID]types.PChannelInfoAssigned
    GlobalUnbalancedScore float64
}

func (s *assignmentSnapshot) Clone() assignmentSnapshot {
    assignments := make(map[types.ChannelID]types.PChannelInfoAssigned, len(s.Assignments))
    for channelID, assignment := range s.Assignments {
        assignments[channelID] = assignment
    }
    return assignmentSnapshot{
        Assignments:           assignments,
        GlobalUnbalancedScore: s.GlobalUnbalancedScore,
    }
}

Logging Infrastructure

Milvus uses a context-aware logging library (mlog) built on zap for structured logging:

// Source: pkg/mlog/README.md

#### Design Principles

PrincipleImplementation
Mandatory ContextAll logging operations require context parameter
Zero OverheadType aliases avoid wrapper overhead
Field AccumulationContext fields automatically propagate
Cross-ServicegRPC metadata for distributed tracing
Lazy EncodingWithLazy defers field encoding

#### Logger Performance Optimization

When logging, the system compares pre-encoded field counts between context and component loggers:

Component Logger: 2 fields (module, nodeID) - pre-encoded
ctx logger:       5 fields (traceID, spanID, ...) - pre-encoded

→ Select ctx logger as base, only need to encode component's 2 fields
→ Faster than using component logger and encoding 5 ctx fields

#### Field Management

// Add fields to context (fields accumulate)
ctx = mlog.WithFields(ctx, mlog.String("request_id", "abc123"))
ctx = mlog.WithFields(ctx, mlog.Int64("user_id", 42))

// Subsequent logs automatically include these fields
mlog.Info(ctx, "processing request")

Architecture Flow

Message Production and Consumption

graph TD
    A[Client Request] --> B[Proxy Layer]
    B --> C[Topic Selection]
    C --> D{RocksMQ Topic}
    D --> E[Page Buffer]
    E --> F[RocksDB Write]
    F --> G[Consumer Groups]
    G --> H[Message Retrieval]
    H --> I[Acknowledgment]
    I --> J[Retention Cleanup]

PChannel Assignment Flow

graph TD
    A[New PChannel Request] --> B[Check Current State]
    B --> C{Node Available?}
    C -->|Yes| D[Assign to Node]
    C -->|No| E[Queue Assignment]
    D --> F[Update Assignment History]
    F --> G[Notify Streaming Node]
    E --> B
    G --> H[Begin Streaming]

Retention Decision Flow

graph TD
    A[Retention Timer] --> B[Scan Acknowledged Pages]
    B --> C[Check Time Policy]
    C --> D{Time Expired?}
    D -->|Yes| E[Mark for Deletion]
    D -->|No| F[Check Size Policy]
    F --> G{Size Exceeded?}
    G -->|Yes| E
    G -->|No| H[Continue Monitoring]
    E --> I[Delete Page Range]
    I --> J[Update Metadata]

Development Tools

Milvus provides development utilities for consistent commit workflows:

# Commit with AI-generated messages
python3 tools/mgit.py

# Create PR with automatic issue linking
python3 tools/mgit.py --pr

# Complete workflow (commit + PR)
python3 tools/mgit.py --all

Commit Message Format

type: Brief description

Options:
- fix: Bug fixes
- feat: New features
- enhance: Improvements
- refactor: Code refactoring
- test: Test modifications
- docs: Documentation updates
- chore: Build/tool changes
  • Backup and Restore: Community feature request #9685 highlights the need for production-grade backup capabilities
  • String Field Support: Feature request #4430 discusses schema flexibility requirements
  • Schema Modification: Issue #20405 addresses collection schema evolution needs

Source: https://github.com/milvus-io/milvus / Human Manual

Coordinator Services

Related topics: System Architecture, Data Storage Layer

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Root Coordinator (RootCoord)

Continue reading this section for the full explanation and source context.

Section Data Coordinator (DataCoord)

Continue reading this section for the full explanation and source context.

Section Query Coordinator (QueryCoord)

Continue reading this section for the full explanation and source context.

Related topics: System Architecture, Data Storage Layer

Coordinator Services

Coordinator Services are centralized management components in Milvus that orchestrate distributed operations across the cluster. These services handle metadata management, data coordination, query distribution, and streaming coordination, acting as the brain of the Milvus distributed system.

Architecture Overview

Milvus employs a coordinator-based architecture where specialized coordinator services manage different aspects of the system:

graph TD
    subgraph "Coordinator Layer"
        RC[Root Coord]
        DC[Data Coord]
        QC[Query Coord]
        SC[Streaming Coord]
    end
    
    subgraph "Worker Nodes"
        DN[Data Node]
        QN[Query Node]
        PN[Proxy Node]
        SN[Streaming Node]
    end
    
    subgraph "Storage Layer"
        ETCD[etcd]
        MSG[Message Queue]
        OBJ[Object Storage]
    end
    
    RC --> ETCD
    DC --> ETCD
    QC --> ETCD
    SC --> MSG
    
    DC --> DN
    QC --> QN
    SC --> SN
    
    DN --> OBJ
    QN --> OBJ

Core Coordinator Components

Root Coordinator (RootCoord)

The Root Coordinator is the central authority for metadata management and cluster-level operations.

Key Responsibilities:

  • Collection and partition management
  • Schema enforcement and validation
  • Timestamp allocation for distributed transactions
  • Database and user management

Location: internal/rootcoord/root_coord.go

Data Coordinator (DataCoord)

The Data Coordinator manages data node coordination and segment lifecycle.

Key Responsibilities:

  • Segment assignment and load balancing across data nodes
  • Segment compaction coordination
  • Index building task distribution
  • Data node health monitoring

Location: internal/datacoord/server.go

Services Interface: internal/datacoord/services.go

Query Coordinator (QueryCoord)

The Query Coordinator orchestrates query operations across query nodes.

Key Responsibilities:

  • Query collection shard loading/unloading
  • Query load balancing
  • Search and query request routing
  • Query node cluster management

Location: internal/querycoordv2/server.go

Services Interface: internal/querycoordv2/services.go

Streaming Coordinator (StreamingCoord)

The Streaming Coordinator manages streaming channels and real-time data ingestion.

Location: internal/streamingcoord/server/

PChannel Metadata Management

Physical channels (PChannels) represent the fundamental unit of streaming coordination. The PChannel metadata structure tracks channel state and assignment history.

// Source: internal/streamingcoord/server/balancer/channel/pchannel.go

type PChannelMeta struct {
    inner *streamingpb.PChannelMeta
}

PChannel States:

StateDescription
PCHANNEL_META_STATE_UNASSIGNEDChannel not assigned to any node
PCHANNEL_META_STATE_ASSIGNINGChannel assignment in progress
PCHANNEL_META_STATE_ASSIGNEDChannel successfully assigned to a node

Key Methods:

MethodPurpose
IsAssigned()Returns true if channel is assigned to a server
IsAssignedOrAssigning()Returns true if channel is in active state
LastAssignTimestamp()Returns the last assigned timestamp
State()Returns the current channel state
CopyForWrite()Returns mutable copy for modification

History Tracking: PChannel metadata maintains assignment history for debugging and load balancing decisions:

func (c *PChannelMeta) GetHistories() []types.PChannelInfoAssigned {
    history := make([]types.PChannelInfoAssigned, 0, len(c.inner.Histories))
    for _, h := range c.inner.Histories {
        history = append(history, types.PChannelInfoAssigned{
            Channel: types.PChannelInfo{
                Name:       c.inner.GetChannel().GetName(),
                Term:       h.Term,
                AccessMode: types.AccessMode(h.AccessMode),
            },
            Node: types.NewStreamingNodeInfoFromProto(h.Node),
        })
    }
    return history
}

Broadcast Manager

The Broadcast Manager handles coordinated broadcasting of operations across the cluster, ensuring consistent state updates.

Location: internal/streamingcoord/server/broadcaster/broadcast_manager.go

Resource Key-Based Locking

Broadcast operations use resource keys to coordinate access to shared resources:

type broadcastTaskManager struct {
    resourceKeyLocker ResourceKeyLocker
    ackScheduler     *ackCallbackScheduler
}

API Methods:

MethodPurpose
WithResourceKeys(ctx, ...ResourceKey)Acquires resource keys for broadcast task
WithSecondaryClusterResourceKey(ctx)Acquires exclusive cluster-level key for secondary operations

Resource Key Acquisition Flow

sequenceDiagram
    participant Client
    participant BM as BroadcastManager
    participant Locker as ResourceKeyLocker
    participant IDAlloc as IDAllocator
    
    Client->>BM: WithResourceKeys(ctx, keys...)
    BM->>BM: startLockInstant = time.Now()
    BM->>BM: Append shared cluster resource keys
    BM->>Locker: Lock(resourceKeys...)
    Locker-->>BM: Return guards
    BM->>IDAlloc: Allocate(ctx)
    IDAlloc-->>BM: Return broadcastID
    BM->>BM: Check cluster role
    BM-->>Client: Return broadcasterWithRK

Lock Acquisition Process:

func (bm *broadcastTaskManager) WithResourceKeys(ctx context.Context, resourceKeys ...message.ResourceKey) (BroadcastAPI, error) {
    startLockInstant := time.Now()
    resourceKeys = bm.appendSharedClusterRK(resourceKeys...)
    guards := bm.resourceKeyLocker.Lock(resourceKeys...)
    
    id, err := resource.Resource().IDAllocator().Allocate(ctx)
    if err != nil {
        guards.Unlock()
        return nil, errors.Wrapf(err, "allocate new id failed")
    }
    
    if err := bm.checkClusterRole(ctx); err != nil {
        guards.Unlock()
        return nil, err
    }
    bm.metrics.ObserveAcquireLockDuration(startLockInstant, guards.ResourceKeys())
    
    return &broadcasterWithRK{
        broadcaster: bm,
        broadcastID: id,
        guards:      guards,
    }, nil
}

Secondary Cluster Operations

Force promote operations require secondary cluster verification:

func (bm *broadcastTaskManager) WithSecondaryClusterResourceKey(ctx context.Context) (BroadcastAPI, error) {
    id, err := resource.Resource().IDAllocator().Allocate(ctx)
    if err != nil {
        return nil, errors.Wrapf(err, "allocate new id failed")
    }
    
    startLockInstant := time.Now()
    // Acquire an exclusive cluster resource key to block
    // ...
}

Message Queue Integration

Coordinator services integrate with message queues for asynchronous communication and event-driven coordination.

RocksMQ Implementation

RocksMQ is a built-in message queue option for Milvus standalone deployments.

Location: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go

Key Data Structures:

type consumerList struct {
    consumers map[string]*Consumer  // GroupName -> *Consumer
    mu        sync.RWMutex
}

Topic Management: Topics are identified by name with restrictions on character content:

func (rmq *rocksmq) CreateTopic(topicName string) error {
    // Check if topicName contains "/"
    if strings.Contains(topicName, "/") {
        return retry.Unrecoverable(fmt.Errorf("topic name = %s contains \"/\"", topicName))
    }
    // ...
}

Message Retention

RocksMQ implements retention policies to manage storage.

Location: pkg/mq/mqimpl/rocksmq/server/rocksmq_retention.go

Retention Conditions:

ConditionConfigurationDefault
Size-basedRocksmqCfg.RetentionSizeInMB-1 (disabled)
Time-basedRocksmqCfg.RetentionTimeInMinutes-1 (disabled)

Retention Check Function:

func checkRetention() bool {
    params := paramtable.Get()
    return params.RocksmqCfg.RetentionSizeInMB.GetAsInt64() != -1 || 
           params.RocksmqCfg.RetentionTimeInMinutes.GetAsInt64() != -1
}

Retention Process:

  1. Iterate through acked pages
  2. Check timestamp expiration per page
  3. Calculate total deleted size
  4. Clean expired data when conditions are met
func (ri *retentionInfo) cleanData(topic string, pageEndID int64) error {
    // ...
    ll, ok := topicMu.Load(topic)
    if !ok {
        return fmt.Errorf("topic name = %s not exist", topic)
    }
    lock, ok := ll.(*sync.Mutex)
    if !ok {
        return fmt.Errorf("get mutex failed, topic name = %s", topic)
    }
    lock.Lock()
    defer lock.Unlock()
    
    err := DeleteMessages(ri.db, topic, 0, pageEndID)
    // ...
}

Context-Aware Logging

Coordinator services use the mlog package for context-aware logging.

Location: pkg/mlog/README.md

Design Principles

  1. Mandatory Context Passing - All logging requires a context for request traceability
  2. Zero-Overhead Abstraction - Type aliases avoid wrapper overhead
  3. Automatic Field Accumulation - Context fields propagate through call chains
  4. Lazy Encoding - Deferred field encoding when log level is disabled

Logging API

MethodPurpose
mlog.Info(ctx, msg, fields...)Log info-level message
mlog.Debug(ctx, msg, fields...)Log debug-level message
mlog.Error(ctx, msg, fields...)Log error-level message
mlog.WithFields(ctx, fields...)Add fields to context

Field Accumulation Example

// Add fields to context (fields accumulate)
ctx = mlog.WithFields(ctx, mlog.String("request_id", "abc123"))
ctx = mlog.WithFields(ctx, mlog.Int64("user_id", 42))

// Subsequent logs automatically include these fields
mlog.Info(ctx, "processing request")
// Output: {"msg":"processing request", "request_id":"abc123", "user_id":42, ...}

Coordinator Communication Patterns

graph LR
    subgraph "Client Requests"
        S[Search/Query]
        I[Insert/Delete]
        M[Management]
    end
    
    subgraph "Coordinators"
        PN[Proxy Node]
        RC[Root Coord]
        DC[Data Coord]
        QC[Query Coord]
    end
    
    subgraph "Workers"
        DN[Data Nodes]
        QN[Query Nodes]
    end
    
    S --> PN
    I --> PN
    M --> PN
    
    PN --> RC
    PN --> DC
    PN --> QC
    
    RC -.-> DC
    RC -.-> QC
    DC --> DN
    QC --> QN

Configuration and Health Monitoring

Consumer Group Management

Coordinator services track consumer groups for each topic:

func (rmq *rocksmq) ExistConsumerGroup(topicName, groupName string) (bool, *Consumer, error) {
    key := constructCurrentID(topicName, groupName)
    _, ok := rmq.consumersID.Load(key)
    if ok {
        if val, ok := rmq.consumers.Load(topicName); ok {
            c := val.(*consumerList).Get(groupName)
            return c != nil, c, nil
        }
    }
    return false, nil, nil
}

Topic Cleanup

Topics are cleaned up atomically with multi-key removal:

func (rmq *rocksmq) DestroyTopic(topicName string) error {
    // Remove page size info
    pageMsgSizeKey := constructKey(PageMsgSizeTitle, topicName)
    err = rmq.kv.RemoveWithPrefix(context.TODO(), pageMsgSizeKey)
    
    // Remove page ts info
    pageMsgTsKey := constructKey(PageTsTitle, topicName)
    err = rmq.kv.RemoveWithPrefix(context.TODO(), pageMsgTsKey)
    
    // Remove acked ts info
    ackedTsKey := constructKey(AckedTsTitle, topicName)
    err = rmq.kv.RemoveWithPrefix(context.TODO(), ackedTsKey)
    
    // Batch remove, atomic operation
    err = rmq.kv.MultiRemove(context.TODO(), removedKeys)
}

Best Practices

Topic Naming

  • Avoid special characters like / in topic names
  • Use descriptive naming conventions for easier debugging

Resource Locking

  • Always release locks in defer statements
  • Check cluster role before critical operations
  • Handle lock acquisition failures gracefully

Retention Configuration

  • Enable retention to prevent unbounded storage growth
  • Balance retention time/size based on workload patterns
  • Monitor RocksMQ disk usage in production deployments

Source: https://github.com/milvus-io/milvus / Human Manual

Query Execution Engine

Related topics: Vector Index Types, System Architecture

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Query Plan Parser (Go)

Continue reading this section for the full explanation and source context.

Section ExecPlanNodeVisitor (C++)

Continue reading this section for the full explanation and source context.

Section Segment Interface

Continue reading this section for the full explanation and source context.

Related topics: Vector Index Types, System Architecture

Query Execution Engine

Overview

The Query Execution Engine is the core component responsible for processing search and query requests in Milvus. It translates user queries into executable operations against stored data segments, leveraging vector similarity algorithms and efficient filtering mechanisms to retrieve relevant results.

The execution engine operates at the query node level, coordinating between segment management, index structures, and expression evaluation to deliver high-performance vector search capabilities.

Architecture Overview

graph TD
    A[User Query Request] --> B[Query Plan Parser]
    B --> C[Execution Plan Node Visitor]
    C --> D[Segment Executor]
    D --> E[Index-Based Search]
    D --> F[Vector Comparison]
    D --> G[Expression Filter]
    E --> H[Results Merge & Rank]
    F --> H
    G --> H
    H --> I[Final Results]

Key Components

Query Plan Parser (Go)

Located in internal/parser/planparserv2/plan_parser_v2.go, the plan parser translates user query requests into an executable query plan. It handles:

  • Expression Parsing: Converts filter expressions into abstract syntax trees
  • Vector Query Construction: Builds search request structures with query vectors
  • Plan Validation: Ensures query parameters are valid before execution
// Plan parser entry point structure
type PlanParserV2 struct {
    // Parses query expressions and vector search parameters
    // into executable plan nodes
}

ExecPlanNodeVisitor (C++)

The ExecPlanNodeVisitor.cpp implements the visitor pattern for traversing query plan nodes:

MethodPurpose
VisitPlanNodeEntry point for plan traversal
VisitQueryProcesses query-specific nodes
VisitSearchHandles vector similarity search
VisitFilterEvaluates filter expressions
VisitRetrieveFetches complete record data

Source: internal/core/src/query/ExecPlanNodeVisitor.cpp

Segment Interface

The SegmentInterface.h defines the contract for segment execution:

// Core segment interface for query execution
class SegmentInterface {
    virtual SearchResult Search(const QueryPlan& plan) = 0;
    virtual RetrieveResult Retrieve(const QueryPlan& plan) = 0;
    virtual int64_t GetRowCount() const = 0;
};

Source: internal/core/src/segcore/SegmentInterface.h

Expression System

Located in internal/core/src/exec/expression/Expr.h, the expression system provides:

  • Vectorized Evaluation: Processes batches of data efficiently
  • Type Safety: Supports Int, Float, String, and Array field types
  • Optimized Predicates: Short-circuit evaluation for filter conditions

Source: internal/core/src/exec/expression/Expr.h

Execution Flow

Search Request Pipeline

sequenceDiagram
    participant Client
    participant Proxy
    participant QueryNode
    participant Segment
    participant Index

    Client->>Proxy: Search Request
    Proxy->>QueryNode: Forward Query Plan
    QueryNode->>Segment: Initialize Search
    Segment->>Index: Query Index Structure
    Index-->>Segment: Candidate IDs
    Segment->>Segment: Apply Filter Expressions
    Segment->>Index: Score Calculation
    Index-->>Segment: Scored Results
    Segment-->>QueryNode: Top-K Results
    QueryNode-->>Proxy: Merged Results
    Proxy-->>Client: Final Results

Query Node Search Implementation

The search.go file implements the core search execution logic:

func (s *SearchTask) Execute() error {
    // 1. Load segments for the collection
    segments, err := s.segmentManager.GetSegments()
    
    // 2. Execute search on each segment
    results := make([]*SearchResult, 0)
    for _, seg := range segments {
        result, err := seg.Search(s.plan)
        results = append(results, result)
    }
    
    // 3. Merge results from all segments
    return s.mergeResults(results)
}

Source: internal/querynodev2/segments/search.go

Expression Evaluation

Supported Expression Types

Expression TypeOperatorExample
Comparison=, !=, >, <, >=, <=age >= 25
LogicalAND, OR, NOTage >= 25 AND gender = "M"
InINstatus IN ["active", "pending"]
RangeBETWEENscore BETWEEN 0.5 AND 0.9
String MatchLIKEname LIKE "John%"

Expression Evaluation Order

graph LR
    A[Raw Expression] --> B[Parse to AST]
    B --> C[Optimize Order]
    C --> D[Execute Filters]
    D --> E[Short-Circuit Evaluation]
    E --> F[Bitmap Generation]
    F --> G[Vector Search Filtering]

The expression evaluator generates a bitmap of matching entity IDs, which is then used to filter vector search results before returning to the user.

Segment Execution Strategies

When an index exists for the vector field:

  1. Query index structure for candidate IDs
  2. Apply pre-filter expressions using index metadata
  3. Compute similarity scores on candidates only
  4. Return top-K results

Raw Scan

When no index exists:

  1. Scan all records in the segment
  2. Apply filter expressions during scan
  3. Compute similarity scores for all filtered entities
  4. Return top-K results

Configuration Parameters

ParameterDefaultDescription
queryNode.enableMaterializedViewtrueEnable materialized view execution
queryNode.chunkCacheEnabledtrueEnable chunk-level caching
queryNode.nprobe16Number of probe clusters for IVF indexes
queryNode.growingSizeLimit-1Max results from growing segments

Performance Considerations

Optimization Strategies

  1. Early Filtering: Apply filter expressions before vector search to reduce candidates
  2. Batch Processing: Process multiple queries concurrently using goroutines
  3. Index Selection: Automatically select optimal index type based on data characteristics
  4. Cache Utilization: Leverage chunk cache for frequently accessed segments

Monitoring Metrics

MetricDescription
query_execution_time_msTime taken for query execution
segment_search_countNumber of segments searched
index_query_countNumber of index queries performed
expression_eval_countNumber of filter evaluations

Data Flow Integration

graph TD
    A[Data Ingestion] --> B[Segment Builder]
    B --> C[Growing Segment]
    C --> D[Sealed Segment]
    D --> E[Query Node]
    E --> F[Query Execution Engine]
    F --> G[Search Results]
    
    H[Index Builder] --> D
    I[Retention Manager] --> D

Component Dependencies

  • Segment Manager: Manages lifecycle of queryable segments
  • Index Executor: Handles index-specific query operations
  • Result Merger: Combines results from multiple segments
  • Cache Manager: Manages in-memory segment data

Common Use Cases

Vector Similarity Search with Filters

Search Request:
- Collection: "products"
- Vector Field: "embedding"
- Query Vector: [0.1, 0.2, ...]
- Filter: "category == 'electronics' AND price < 1000"
- Limit: 100
- Offset: 0

The execution engine supports hybrid searches combining:

  • Vector similarity (ANN search)
  • Scalar filtering (expression evaluation)
  • Full-text matching (BM25)
  • Array field operations

Troubleshooting

Slow Query Performance

  1. Check if appropriate index is built on vector field
  2. Verify filter expressions are selective
  3. Monitor segment sizes and memory usage
  4. Review cache hit rates

Memory Issues

  • Adjust queryNode.chunkCacheSize parameter
  • Consider segment size limits for growing segments
  • Monitor memory usage per query node

References

Source: https://github.com/milvus-io/milvus / Human Manual

Vector Index Types

Related topics: Schema Design and Field Types, Query Execution Engine

Section Related Pages

Continue reading this section for the full explanation and source context.

Section 1. HNSW (Hierarchical Navigable Small World)

Continue reading this section for the full explanation and source context.

Section 2. IVF (Inverted File Index)

Continue reading this section for the full explanation and source context.

Section 3. SCANN (Scalable Nearest Neighbors)

Continue reading this section for the full explanation and source context.

Related topics: Schema Design and Field Types, Query Execution Engine

Vector Index Types

Milvus is a cloud-native vector database designed for similarity search and AI applications. At its core, Milvus stores and indexes high-dimensional vector embeddings to enable fast approximate nearest neighbor (ANN) searches across billions of vectors.

Overview

Vector indexes are data structures that organize vector data to accelerate similarity queries. Without indexes, searching requires comparing a query vector against every stored vector—a linear scan operation with O(n) complexity. Vector indexes reduce this to logarithmic or sub-linear complexity, making billion-scale similarity search practical.

Milvus supports multiple vector index types, each optimized for different use cases involving trade-offs between:

  • Search speed — how quickly results are returned
  • Memory footprint — RAM or disk usage
  • Build time — index construction duration
  • Accuracy — precision of ANN results vs. exact kNN

Index Type Architecture

graph TD
    A[Vector Data] --> B{Index Type Selection}
    B --> C[In-Memory Indexes]
    B --> D[Disk-Based Indexes]
    
    C --> C1[HNSW]
    C --> C2[IVF Family]
    C --> C3[SCANN]
    C --> C4[Sparse Index]
    
    D --> D1[DiskANN]
    
    C1 --> E[Search Query]
    C2 --> E
    C3 --> E
    C4 --> E
    D1 --> E
    E --> F[Top-k Results]

Supported Index Types

1. HNSW (Hierarchical Navigable Small World)

HNSW is a graph-based indexing algorithm that constructs a multi-layer navigation structure. It offers excellent search performance with minimal tuning.

ParameterTypeDefaultDescription
Mint16Maximum connections per node
efConstructionint200Search width during build
efint100Search width during query

Use cases: General-purpose ANN search with high accuracy requirements.

Source: client/index/hnsw.go

2. IVF (Inverted File Index)

IVF partitions vectors into clusters using k-means and maintains an inverted index mapping clusters to vectors.

ParameterTypeDefaultDescription
nlistint128Number of cluster centers
nprobeint16Number of clusters to search

Variants:

  • IVF_FLAT — stores exact vectors in clusters
  • IVF_SQ8 — scalar quantization applied
  • IVF_PQ — product quantization applied

Source: client/index/ivf.go

3. SCANN (Scalable Nearest Neighbors)

SCANN is Google's high-performance ANN algorithm that combines inverted indexing with asymmetric hashing.

ParameterTypeDefaultDescription
祠堂int-Number of subquantizers
reorder_topkint-Top-k results to reorder

Use cases: High-throughput scenarios requiring high accuracy.

Source: client/index/scann.go

Community note: Feature request #2771 highlights community interest in ScaNN support, noting its potential for significantly better performance compared to other algorithms.

4. DiskANN

DiskANN is a disk-based index that enables searching billion-scale datasets with limited RAM. It uses a hybrid in-memory/disk architecture.

ParameterTypeDefaultDescription
max_degreeint64Maximum out-degree in graph
search_list_sizeint100Search candidate list size
pq_code_budget_gbfloat-PQ code memory budget
cache_dataset_on_diskboolfalseCache raw vectors on disk

Use cases: Large-scale datasets exceeding available RAM.

Source: client/index/disk_ann.go, internal/core/src/index/VectorDiskIndex.cpp

5. Sparse Index

Sparse index is designed for sparse vectors, commonly used with embedding models that produce high-dimensional sparse representations.

ParameterTypeDefaultDescription
mutable_ratiofloat1.1Ratio for index mutability

Use cases: Sparse embeddings, BM25-style retrieval.

Source: client/index/sparse.go

Vector Types and Index Compatibility

Milvus supports multiple vector types, and each index type has specific compatibility:

Vector TypeHNSWIVFSCANNDiskANNSparse
Float16
BFloat16
Float32
Binary
Sparse Float

Index Build Process

graph LR
    A[Collection Data] --> B[Create Index Request]
    B --> C{Index Type Valid?}
    C -->|No| D[Return Error]
    C -->|Yes| E[Select Index Builder]
    E --> F[Parameter Validation]
    F --> G[Build Index Structure]
    G --> H[Serialize to Storage]
    H --> I[Index Ready for Search]

Index Parameter Validation

Milvus includes a parameter validation layer that checks index parameters before building:

Source: internal/util/indexparamcheck/index_type.go

The validation system ensures:

  • Parameters are within acceptable ranges
  • Required parameters are present
  • Parameter types match expectations

Search Workflow with Indexes

graph TD
    A[Search Request] --> B[Load Index into Memory]
    B --> C[Execute ANN Search]
    C --> D[Rank Results by Distance]
    D --> E[Apply Metric Type]
    E --> F[Return Top-k Results]
    
    B -.->|DiskANN| G[Memory-Mapped Access]
    C -.->|HNSW| H[Graph Traversal]
    C -.->|IVF| I[Cluster Pruning]

Metric Types

Vector similarity is measured using one of these metrics:

Metric TypeDescriptionCompatible Indexes
L2 (Euclidean)Euclidean distance between vectorsAll
IP (Inner Product)Dot product of vectorsAll
COSINECosine similarityAll

Schema Definition

Vectors are defined in collection schemas with the following attributes:

fields = [
    FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
    FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, 
                dim=128, index_params={"index_type": "HNSW", "M": 16, "efConstruction": 200})
]

Common Issues and Best Practices

Index Not Built

  • Symptom: Slow search performance
  • Solution: Ensure index is built before loading collection

Memory Constraints

  • Symptom: Out of memory errors
  • Solution: Consider DiskANN or reduce index parameters

Accuracy Tuning

  • Symptom: Low recall rates
  • Solution: Increase HNSW ef or IVF nprobe parameters

Community Resources

For additional support:

Source: https://github.com/milvus-io/milvus / Human Manual

Schema Design and Field Types

Related topics: Vector Index Types, Data Storage Layer

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Core Schema Components

Continue reading this section for the full explanation and source context.

Section Field Type Hierarchy

Continue reading this section for the full explanation and source context.

Section Field Schema Structure

Continue reading this section for the full explanation and source context.

Related topics: Vector Index Types, Data Storage Layer

Schema Design and Field Types

Overview

Schema Design and Field Types form the foundation of how Milvus organizes and stores vector data. A schema defines the structure of a collection, specifying the fields that can be used to describe entities, including both metadata attributes and vector embeddings.

In Milvus, every collection requires a schema that specifies:

  • Primary Key Field: Uniquely identifies each entity in the collection
  • Vector Field(s): Stores the embedding vectors used for similarity search
  • Metadata Fields: Optional fields to store descriptive attributes

Source: client/entity/schema.go:1-50

Schema Architecture

Core Schema Components

The schema system consists of several interconnected components that work together to provide a robust data modeling layer:

graph TD
    A[Collection Schema] --> B[Field Schemas]
    B --> C[Primary Key Field]
    B --> D[Vector Fields]
    B --> E[Metadata Fields]
    C --> F[Int64 or String]
    D --> G[FloatVector, BinaryVector, SparseVector]
    E --> H[String, Int, Float, Bool, Array]

Field Type Hierarchy

Milvus supports a comprehensive set of field types organized into distinct categories:

CategoryData TypesDescription
IntegerInt8, Int16, Int32, Int64Signed integer values
Floating PointFloat, DoubleDecimal numbers
BooleanBoolTrue/False values
StringString, VarCharText data with optional length constraints
VectorFloatVector, BinaryVector, SparseVector, Int8VectorEmbedding representations
ArrayArrayCollection of elements of the same type

Source: client/entity/field.go:1-100

Field Definitions

Field Schema Structure

Each field in a Milvus schema is defined by the FieldSchema structure:

type FieldSchema struct {
    Name        string      // Field name identifier
    DataType    DataType    // Data type enumeration
    TypeParams  map[string]string  // Type-specific parameters
    ElementType DataType    // For array fields, the element data type
    IsPrimaryKey bool       // Marks the primary key field
    IsPartitionKey bool     // Enables partition by key
    IsDynamic bool         // Allows dynamic field population
    Description string     // Human-readable description
    AutoID      bool       // Auto-generates IDs for this field
}

Source: client/entity/field.go:50-80

Field Type Parameters

Field behavior is customized through type-specific parameters stored as key-value pairs:

ParameterApplicable TypesDescriptionExample
dimFloatVector, BinaryVector, Int8VectorVector dimension"128", "768"
max_lengthVarChar, StringMaximum string length"256"
max_capacityArrayMaximum array elements"100"
element_typeArrayElement data type"Int64", "Float"

Source: internal/core/src/common/FieldMeta.h:1-100

Primary Key Fields

Supported Primary Key Types

The primary key field uniquely identifies each entity in a collection. Milvus supports two primary key data types:

Data TypeUse CaseID Generation
Int64Sequential numeric IDsAuto-generated or user-specified
VarCharString-based identifiersUser-specified

Note: Issue #1924 in the community highlights that string ID support was a highly requested feature, now addressed through VarChar primary keys.

Source: internal/rootcoord/create_collection_task.go:100-150

Auto-ID Configuration

Primary key fields can be configured for automatic ID generation:

// Enable auto-generated IDs
FieldSchema{
    Name: "id",
    DataType: Int64,
    IsPrimaryKey: true,
    AutoID: true,
}

// User-specified IDs
FieldSchema{
    Name: "id",
    DataType: Int64,
    IsPrimaryKey: true,
    AutoID: false,
}

Source: client/entity/field.go:80-120

Vector Fields

Vector fields store embedding representations used for similarity search operations.

Dense Vector Types

#### FloatVector

The most common vector type for high-dimensional embeddings from models like BERT, CLIP, or custom neural networks.

FieldSchema{
    Name: "embedding",
    DataType: FloatVector,
    TypeParams: map[string]string{
        "dim": "768",  // Embedding dimension
    },
}

#### BinaryVector

Compact representation for binary embeddings, useful for models like CLIP's binary variant.

FieldSchema{
    Name: "binary_embedding",
    DataType: BinaryVector,
    TypeParams: map[string]string{
        "dim": "256",  // Binary vector dimension (bits)
    },
}

#### Int8Vector

Quantized vector type for int8 embeddings, providing memory efficiency.

FieldSchema{
    Name: "int8_embedding",
    DataType: Int8Vector,
    TypeParams: map[string]string{
        "dim": "128",
    },
}

Source: internal/core/src/common/FieldMeta.h:100-200

Sparse Vector Support

Sparse vectors represent embeddings where most dimensions are zero, common in models like BM25 or SPLADE.

FieldSchema{
    Name: "sparse_embedding",
    DataType: SparseVector,
}

The recent v2.6.5 release added nullable vector columns across all vector types including sparse vectors.

Source: client/v2.6.5 Release Notes

Metadata Fields

Scalar Data Types

#### Integer Types

TypeRangeStorage
Int8-128 to 1271 byte
Int16-32,768 to 32,7672 bytes
Int32-2,147,483,648 to 2,147,483,6474 bytes
Int64-9,223,372,036,854,775,808 to 9,223,372,036,854,775,8078 bytes

#### Floating Point Types

TypePrecisionStorage
FloatSingle precision (7 digits)4 bytes
DoubleDouble precision (15 digits)8 bytes

#### String Types

TypeDescriptionUse Case
StringVariable-length stringGeneral text storage
VarCharVariable-length with max length constraintEfficient storage with length limits
FieldSchema{
    Name: "category",
    DataType: VarChar,
    TypeParams: map[string]string{
        "max_length": "256",
    },
}

Source: client/entity/field.go:100-150

Array Fields

Array fields allow storing collections of elements of the same data type, enabling more complex data modeling.

Supported Array Element Types

  • Int8, Int16, Int32, Int64
  • Float, Double
  • Bool
  • VarChar

Array Field Definition

FieldSchema{
    Name: "tags",
    DataType: Array,
    ElementType: VarChar,
    TypeParams: map[string]string{
        "max_capacity": "100",    // Maximum number of elements
        "max_length": "128",       // Max length of each element
    },
}

Recent releases (v2.6.3+) added Array field partial update helpers for ARRAY_APPEND and ARRAY_REMOVE operations in upsert requests.

Source: client/v2.6.3 Release Notes

Schema Validation

Collection Creation Validation

When creating a collection, the schema undergoes rigorous validation:

graph TD
    A[CreateCollection Request] --> B[Schema Validation]
    B --> C{Valid Schema?}
    C -->|No| D[Return Error]
    C -->|Yes| E[Check Field Constraints]
    E --> F{Primary Key Exists?}
    F -->|No| G[Return Error: Primary Key Required]
    F -->|Yes| H{Vector Field Valid?}
    H -->|No| I[Return Error: Invalid Vector]
    H -->|Yes| J[Validate Field Types]
    J --> K[Schema Accepted]

Validation Rules

RuleDescriptionError Code
Primary Key RequiredCollection must have exactly one primary keyErrNoPrimaryKey
Vector DimensionVector fields must have valid dimension > 0ErrInvalidDimension
Primary Key TypeOnly Int64 or VarChar allowed as primary keyErrInvalidPrimaryKeyType
AutoID CompatibilityAutoID only valid for Int64 primary keysErrInvalidAutoID
Field Name UniquenessNo duplicate field names allowedErrDuplicateFieldName

Source: internal/rootcoord/create_collection_task.go:150-300

Schema Modification

Alter Collection Operations

Milvus allows modifying collection schemas under specific conditions:

OperationEmpty CollectionNon-Empty Collection
Add Field✅ Supported✅ Supported (nullable fields only)
Drop Field✅ Supported❌ Not Supported
Modify Field✅ Supported❌ Not Supported

Note: Issue #20405 in the community requests the ability to modify schemas on non-empty collections, which is partially addressed through nullable field additions.

Source: internal/rootcoord/alter_collection_task.go:1-100

Adding Fields to Existing Collections

// Add a nullable field to existing collection
AlterCollectionRequest{
    CollectionName: "my_collection",
    Type: AddField,
    FieldName: "new_metadata",
    FieldType: VarChar,
    TypeParams: map[string]string{
        "max_length": "512",
    },
    Nullable: true,  // Required for non-empty collections
}

The v2.6.5 release added validation to ensure vector fields added to existing collections are nullable before sending AddCollectionField requests.

Source: client/v2.6.5 Release Notes

Dynamic Fields

Dynamic fields enable flexible schema designs where entities can have attributes not defined in the schema. These fields are stored as key-value pairs in a special $meta or $dynamic field.

// Collection with dynamic fields enabled
CollectionSchema{
    Fields: [...],
    EnableDynamicField: true,
}

// Insert with dynamic attributes
InsertRequest{
    CollectionName: "products",
    Fields: [...],
    FieldsData: [...],
    // Dynamic fields
    DynamicFields: map[string]interface{}{
        "brand": "ExampleCorp",
        "rating": 4.5,
        "tags": []string{"electronics", "sale"},
    },
}

Best Practices

Schema Design Guidelines

  1. Choose Appropriate Primary Key Type
  • Use Int64 for sequential or auto-generated IDs
  • Use VarChar for string-based identifiers with known prefix patterns
  1. Vector Dimension Planning
  • Match dimension to your embedding model output
  • Document the embedding source for reproducibility
  • Consider memory footprint: FloatVector uses 4 bytes × dimension per vector
  1. String Field Sizing
  • Set max_length conservatively to reduce storage overhead
  • For VarChar primary keys, consider prefix patterns for efficient queries
  1. Partition Key Strategy
  • Use high-cardinality fields (e.g., user_id, timestamp) as partition keys
  • Avoid low-cardinality fields that create imbalanced partitions

Memory Considerations

Field TypeBytes per Value
Int648
Float4
Double8
Bool1
VarChar(n)n + overhead
FloatVector(dim)4 × dim
BinaryVector(dim)dim / 8

API Reference

Python SDK

from pymilvus import CollectionSchema, FieldSchema, DataType

schema = CollectionSchema(
    fields=[
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True),
        FieldSchema(name="embedding", dtype=DataType.FLOAT_VECTOR, dim=768),
        FieldSchema(name="category", dtype=DataType.VARCHAR, max_length=256),
    ],
    description="Example collection schema"
)

Go SDK

import "github.com/milvus-io/milvus/client/v2/entity"

schema := &entity.Schema{
    Fields: []*entity.Field{
        {
            Name:       "id",
            DataType:   entity.FieldTypeInt64,
            PrimaryKey: true,
        },
        {
            Name:     "embedding",
            DataType: entity.FieldTypeFloatVector,
            TypeParams: map[string]string{
                "dim": "768",
            },
        },
    },
}

Summary

Schema Design and Field Types in Milvus provide a flexible yet structured approach to organizing vector data:

  • Flexible Field Types: Support for scalar, vector, and array types
  • Primary Key Options: Int64 or VarChar with optional auto-generation
  • Vector Diversity: Dense, sparse, binary, and quantized vectors
  • Schema Evolution: Add nullable fields to existing collections
  • Dynamic Fields: Handle unstructured attributes without schema changes

Understanding these concepts is essential for building efficient vector search applications with Milvus.

Source: https://github.com/milvus-io/milvus / Human Manual

Data Storage Layer

Related topics: Data Ingestion and Flow, Coordinator Services

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Key Components

Continue reading this section for the full explanation and source context.

Related topics: Data Ingestion and Flow, Coordinator Services

Data Storage Layer

Overview

The Data Storage Layer in Milvus is a multi-tiered system responsible for persisting vector data, metadata, and operational logs. It encompasses the message queue layer (RocksMQ), blob storage integration (MinIO/S3), binlog management, and coordination services that ensure data durability and consistency across the cluster.

Milvus employs a cloud-native storage architecture designed for high scalability and horizontal partitioning of data. The storage layer is decoupled from compute resources, allowing independent scaling of storage and query capabilities.

Key Components

ComponentPurposeStorage Type
RocksMQMessage queue and log persistenceLocal RocksDB
ChunkManagerBlob storage abstractionMinIO/S3
DataNodeData写入 persistent storageChunkManager
DataCoordSegment allocation and metadataetcd
BinlogOperational change logsChunkManager

Source: internal/datanode/README.md

Source: https://github.com/milvus-io/milvus / Human Manual

Data Ingestion and Flow

Related topics: Data Storage Layer, Coordinator Services

Section Related Pages

Continue reading this section for the full explanation and source context.

Section RocksMQ (Default)

Continue reading this section for the full explanation and source context.

Section Consumer Groups

Continue reading this section for the full explanation and source context.

Section Producer Message Flow

Continue reading this section for the full explanation and source context.

Related topics: Data Storage Layer, Coordinator Services

Data Ingestion and Flow

Overview

Data ingestion in Milvus refers to the process of receiving, buffering, persisting, and making vector data searchable. Milvus employs a multi-tiered architecture that separates write and read paths, enabling high-throughput insertions while maintaining query performance. The system decouples data ingestion from search operations through an asynchronous flow that leverages message queues and write buffers.

The data ingestion pipeline handles the journey from user-inserted vectors to searchable segments in object storage, incorporating automatic flushing, compaction, and indexing mechanisms.

Architecture Overview

graph TD
    subgraph Client["Client Layer"]
        SDK[SDK Insert/Upsert]
    end
    
    subgraph Proxy["Proxy Layer"]
        TaskInsert[Insert Task]
        TaskUpsert[Upsert Task]
        ChannelsMgr[Channels Manager]
    end
    
    subgraph MessageQueue["Message Queue"]
        RocksMQ[RocksMQ]
        Pulsar[Pulsar]
        Kafka[Kafka]
    end
    
    subgraph DataNode["Data Node"]
        WriteBuffer[Write Buffer]
        Flush[Flush Service]
        Compactor[Compactor]
    end
    
    subgraph Storage["Storage Layer"]
        Segment[Segment Files]
        Index[Index Files]
    end
    
    SDK --> TaskInsert
    SDK --> TaskUpsert
    TaskInsert --> ChannelsMgr
    TaskUpsert --> ChannelsMgr
    ChannelsMgr --> RocksMQ
    ChannelsMgr --> Pulsar
    ChannelsMgr --> Kafka
    RocksMQ --> WriteBuffer
    Pulsar --> WriteBuffer
    Kafka --> WriteBuffer
    WriteBuffer --> Flush
    Flush --> Segment
    Compactor --> Segment
    Segment --> Index

Message Queue Integration

Milvus supports multiple message queue implementations for data ingestion buffering. The message queue layer abstracts the underlying implementation, allowing users to choose between RocksMQ, Pulsar, or Kafka based on their deployment requirements.

RocksMQ (Default)

RocksMQ is the default embedded message queue implementation in Milvus, suitable for standalone deployments. It stores messages in RocksDB and provides reliable message delivery with acknowledgment tracking.

#### Topic Management

Topics in RocksMQ represent logical channels for message streams. Each topic maintains its own message pages and consumer tracking information.

// CreateTopic initializes a new topic in RocksMQ
func (rmq *rocksmq) CreateTopic(topicName string) error {
    // Check if topicName contains "/"
    if strings.Contains(topicName, "/") {
        return retry.Unrecoverable(fmt.Errorf("topic name = %s contains \"/\"", topicName))
    }
    
    // topicIDKey is the only identifier of a topic
    topicIDKey := TopicIDTitle + topicName
    val, err := rmq.kv.Load(context.TODO(), topicIDKey)
    if val != "" {
        log.Warn("rocksmq topic already exists ", zap.String("topic", topicName))
        return nil
    }
    // ...
}

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:280-295

#### Topic Name Restrictions

Topic names must not contain the "/" character, as this separator is used internally for key construction. Attempting to create a topic with "/" results in an unrecoverable error.

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:284

Consumer Groups

Consumer groups manage subscriptions to topics, tracking consumption progress for each consumer instance.

type consumerList struct {
    consumers map[string]*Consumer // GroupName -> *Consumer
    mu        sync.RWMutex
}

func (l *consumerList) Add(consumer *Consumer) {
    l.mu.Lock()
    defer l.mu.Unlock()
    
    if _, ok := l.consumers[consumer.GroupName]; ok {
        return
    }
    l.consumers[consumer.GroupName] = consumer
}

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:35-49

Producer Message Flow

Messages are produced to topics in batches, with each message assigned a unique ID upon successful storage.

func (rmq *rocksmq) updatePageMsgSize(topicName string, msgIDs []UniqueID, msgSizes map[UniqueID]int64) error {
    // ...
    mutateBuffer := make(map[string]string)
    for _, id := range msgIDs {
        msgSize := msgSizes[id]
        if curMsgSize+msgSize > params.RocksmqCfg.PageSize.GetAsInt64() {
            // Current page is full
            newPageSize := curMsgSize + msgSize
            pageEndID := id
            // Update page message size for current page. key is page end ID
            pageMsgSizeKey := fixedPageSizeKey + "/" + strconv.FormatInt(pageEndID, 10)
            mutateBuffer[pageMsgSizeKey] = strconv.FormatInt(newPageSize, 10)
            // ...
        }
    }
    // ...
}

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:75-92

Write Buffer Management

Write buffers provide an in-memory staging area for incoming data before flushing to persistent storage. This component reduces I/O overhead by batching writes and allows for memory-efficient data handling.

graph LR
    subgraph InMemory["Write Buffer"]
        BufferA[Buffer A]
        BufferB[Buffer B]
        BufferN[Buffer N]
    end
    
    subgraph Flush["Flush Trigger"]
        Trigger[Size/Time Trigger]
    end
    
    subgraph Persistent["Persistent Storage"]
        SegA[Segment A]
        SegB[Segment B]
    end
    
    Trigger -->|Flush| BufferA
    Trigger -->|Flush| BufferB
    BufferA --> SegA
    BufferB --> SegB

Buffer Lifecycle

StateDescription
ActiveBuffer accepting new data writes
FullBuffer reached size threshold, awaiting flush
FlushingData being written to persistent storage
FlushedData committed, buffer can be cleared

PChannel (Persistent Channel) Metadata

PChannels manage the assignment and state tracking of streaming channels across the Milvus cluster.

type PChannelMeta struct {
    inner *streamingpb.PChannelMeta
}

func (c *PChannelMeta) IsAssigned() bool {
    return c.inner.State == streamingpb.PChannelMetaState_PCHANNEL_META_STATE_ASSIGNED
}

func (c *PChannelMeta) IsAssignedOrAssigning() bool {
    return c.inner.State == streamingpb.PChannelMetaState_PCHANNEL_META_STATE_ASSIGNED || 
           c.inner.State == streamingpb.PChannelMetaState_PCHANNEL_META_STATE_ASSIGNING
}

Source: internal/streamingcoord/server/balancer/channel/pchannel.go:45-55

Channel Assignment States

StateDescription
PCHANNEL_META_STATE_UNSPECIFIEDInitial state
PCHANNEL_META_STATE_ASSIGNINGChannel being assigned to a node
PCHANNEL_META_STATE_ASSIGNEDChannel successfully assigned
PCHANNEL_META_STATE_DROPPINGChannel being removed
PCHANNEL_META_STATE_DROPPEDChannel removed

Data Flow: Insert Operation

The following sequence describes how an insert request flows through the system:

sequenceDiagram
    participant Client
    participant Proxy
    participant ChannelMgr
    participant MQ
    participant DataNode
    participant Storage
    
    Client->>Proxy: Insert request
    Proxy->>Proxy: Validate schema
    Proxy->>ChannelMgr: Get channel for collection
    ChannelMgr->>Proxy: Channel assignment
    Proxy->>MQ: Produce messages
    MQ-->>Proxy: Message IDs
    Proxy-->>Client: Insert result with IDs
    Note over MQ,DataNode: Asynchronous processing
    DataNode->>MQ: Consume messages
    DataNode->>DataNode: Buffer in memory
    DataNode->>Storage: Flush to segment
    DataNode->>DataNode: Build index

Retention and Cleanup

RocksMQ implements retention policies to manage disk space by removing acknowledged messages that have exceeded their retention period.

Retention Configuration

ParameterDescriptionDefault
RetentionSizeInMBMaximum retained message size-1 (disabled)
RetentionTimeInMinutesMaximum retention duration-1 (disabled)

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:28-30

Page-Based Cleanup

Retention operates on a page-based mechanism where messages are grouped into pages. Each page tracks its end ID, timestamp, and acknowledgment status.

func (ri *rocksmqRetention) expiredClean(topic string) error {
    // ...
    pageIter := rocksdbkv.NewRocksIteratorWithUpperBound(ri.kv.DB, 
        typeutil.AddOne(pageMsgPrefix), pageReadOpts)
    defer pageIter.Close()
    
    for ; pageIter.Valid(); pageIter.Next() {
        pKey := pageIter.Key()
        pageID, err := parsePageID(string(pKey.Data()))
        // ...
        
        ackedTsKey := fixedAckedTsKey + "/" + strconv.FormatInt(pageID, 10)
        ackedTsVal, err := ri.kv.Load(context.TODO(), ackedTsKey)
        
        if msgTimeExpiredCheck(ackedTs) {
            pageEndID = pageID
            deletedAckedSize += size
            pageCleaned++
        }
    }
    // ...
}

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_retention.go:78-105

Message Deletion by Range

Messages are deleted using RocksDB's range delete functionality for efficient batch removal:

func DeleteMessages(db *gorocksdb.DB, topic string, startID, endID UniqueID) error {
    startKey := path.Join(topic, strconv.FormatInt(startID, 10))
    endKey := path.Join(topic, strconv.FormatInt(endID+1, 10))
    
    writeBatch := gorocksdb.NewWriteBatch()
    defer writeBatch.Destroy()
    writeBatch.DeleteRange([]byte(startKey), []byte(endKey))
    
    opts := gorocksdb.NewDefaultWriteOptions()
    defer opts.Destroy()
    return db.Write(opts, writeBatch)
}

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_retention.go:45-56

Channels Manager

The Channels Manager in the Proxy layer routes insert and upsert operations to the appropriate data node channels based on collection and partition information.

func (mgr *channelsMgr) getChannels(collectionID UniqueID) ([]pChan, error) {
    // Routes requests to appropriate proxy channels
    // based on collection metadata
}

Source: internal/proxy/channels_mgr.go:1-50

Channel Assignment Strategy

StrategyDescription
ShardedMessages distributed round-robin across channels
UniformEqual distribution based on hash
LookupAll messages to single channel

Compaction

Compaction merges small segments into larger ones, improving query performance and reclaiming storage space.

graph TD
    subgraph Before["Before Compaction"]
        Seg1[Segment 1 - 1K rows]
        Seg2[Segment 2 - 1K rows]
        Seg3[Segment 3 - 1K rows]
        Seg4[Segment 4 - 1K rows]
    end
    
    subgraph After["After Compaction"]
        Merged[Segment M - 4K rows]
    end
    
    Seg1 --> Merged
    Seg2 --> Merged
    Seg3 --> Merged
    Seg4 --> Merged

Compactor Component

The compactor evaluates segments based on:

  • Segment size thresholds
  • Number of deleted records
  • Segment age
  • Index type and state

Source: internal/datanode/compactor/compactor.go

Error Handling and Recovery

Topic Destruction

When destroying a topic, the system performs a multi-step cleanup process:

func (rmq *rocksmq) DestroyTopic(topicName string) error {
    // Acquire topic lock
    ll, ok := topicMu.Load(topicName)
    lock, ok := ll.(*sync.Mutex)
    lock.Lock()
    defer lock.Unlock()
    
    // Remove consumer associations
    rmq.consumers.Delete(topicName)
    rmq.topicName2LatestMsgID.Delete(topicName)
    
    // Clean topic data
    fixTopicName := topicName + "/"
    err := rmq.kv.RemoveWithPrefix(context.TODO(), fixTopicName)
    
    // Clean page size and timestamp info
    pageMsgSizeKey := constructKey(PageMsgSizeTitle, topicName)
    rmq.kv.RemoveWithPrefix(context.TODO(), pageMsgSizeKey)
    
    // Clean acknowledgment tracking
    ackedTsKey := constructKey(AckedTsTitle, topicName)
    rmq.kv.RemoveWithPrefix(context.TODO(), ackedTsKey)
    
    // Remove topic metadata
    rmq.kv.MultiRemove(context.TODO(), removedKeys)
    
    // Clean retention info
    topicMu.Delete(topicName)
    rmq.retentionInfo.topicRetetionTime.GetAndRemove(topicName)
}

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_impl.go:340-380

Feature Requests and Limitations

The following community issues relate to data ingestion and flow:

IssueDescriptionStatus
#9685Backup and restore functionalityFeature request
#20405Modify collection schema after creationFeature request
#28583Milvus standalone crash with RocksMQBug report

String ID Support

Support for string-type primary keys has been a frequently requested feature (#4430, #1924), which would affect the data ingestion pipeline's ID handling mechanisms.

Configuration Reference

RocksMQ Configuration Parameters

ParameterTypeDefaultDescription
rocksmq.compressionTypesstring"0,0,0,0,0"Compression types per level
rocksmq.retentionSizeInMBint64-1Retention size limit (-1 disabled)
rocksmq.retentionTimeInMinutesint64-1Retention time limit (-1 disabled)
rocksmq.pageSizeint6410MBPage size for message batching

Message Queue Selection

MQ TypeUse CaseDeployment
RocksMQDevelopment, StandaloneEmbedded
PulsarProduction, Multi-tenantExternal
KafkaHigh-throughput, StreamingExternal

Best Practices

  1. Batch Insert Operations: Group multiple vectors into single insert requests to reduce network overhead and improve throughput.
  1. Monitor Retention Settings: In production environments, configure retention parameters to prevent unbounded disk usage growth.
  1. Channel Planning: For high-throughput workloads, distribute collections across multiple channels to parallelize data ingestion.
  1. Consumer Group Management: Properly manage consumer groups to ensure reliable message consumption and acknowledgment.
  1. Index Building Strategy: Plan index types and build triggers based on query patterns to balance insert performance with search latency.

Logging and Debugging

Milvus uses structured logging through the mlog package for data ingestion operations:

log.Info("Expired check by retention time",
    zap.String("topic", topic),
    zap.Int64("pageEndID", pageEndID),
    zap.Int64("deletedAckedSize", deletedAckedSize),
    zap.Int64("lastAck", lastAck),
    zap.Int64("pageCleaned", pageCleaned),
    zap.Int64("time taken", time.Since(start).Milliseconds()))

Source: pkg/mq/mqimpl/rocksmq/server/rocksmq_retention.go:120-127

Key Log Patterns

PatternComponentDescription
Rocksmq create topicRocksMQTopic initialization
Rocksmq destroy topicRocksMQTopic cleanup completion
Expired check by retentionRetentionPage cleanup operations
PChannel state changeStreamingCoordChannel assignment updates

Source: https://github.com/milvus-io/milvus / Human Manual

Go SDK (client/v2)

Related topics: Quick Start Guide, Schema Design and Field Types

Section Related Pages

Continue reading this section for the full explanation and source context.

Section Package Structure

Continue reading this section for the full explanation and source context.

Section Creating a Client

Continue reading this section for the full explanation and source context.

Section ClientConfig Parameters

Continue reading this section for the full explanation and source context.

Related topics: Quick Start Guide, Schema Design and Field Types

Go SDK (client/v2)

The Go SDK (client/v2) is the official Go client library for Milvus, providing a high-level interface for interacting with Milvus vector database. It enables Go applications to perform vector operations including insertion, querying, similarity search, and collection management.

Overview

The Go SDK is designed for Go 1.24.12 or higher and provides idiomatic Go patterns for working with Milvus. The SDK wraps the Milvus gRPC API with a clean, type-safe interface that handles connection management, request serialization, and response parsing.

Source: client/README.md

Architecture

graph TD
    A[Application Code] --> B[milvusclient Package]
    B --> C[Entity Layer<br/>schema, collection definitions]
    B --> D[gRPC Client<br/>Connection, Keepalive]
    D --> E[Milvus Server]
    
    F[Collection Operations] --> B
    G[Write Operations] --> B
    H[Read Operations] --> B
    I[Index Operations] --> B
    J[RBAC Operations] --> B

Package Structure

PackagePurpose
client/milvusclientMain client interface with New(), collection, write, read, index, and RBAC operations
client/entityData models including schema definitions, field types, and collection schemas

Source: client/milvusclient/client.go

Client Configuration

Creating a Client

import "github.com/milvus-io/milvus/client/v2/milvusclient"

ctx, cancel := context.WithCancel(context.Background())
defer cancel()

cli, err := milvusclient.New(ctx, &milvusclient.ClientConfig{
    Address: "YOUR_MILVUS_ENDPOINT",
})
if err != nil {
    // handle error
}
defer cli.Close(ctx)

ClientConfig Parameters

ParameterTypeDescription
AddressstringMilvus server address (host:port)
DialOptions[]grpc.DialOptionCustom gRPC dial options

Important: Custom DialOptions preserve the SDK's default settings for keepalive, backoff, and max receive message size. Previous versions could inadvertently drop these defaults. Source: client/v2.6.4 release notes

Collection Management

Creating a Collection

Collections are created using the CreateCollection method with a schema definition:

err = cli.CreateCollection(ctx, &milvusclient.CreateCollectionOption{
    CollectionName: "my_collection",
    Schema: collectionSchema,
})

Collection Schema

Schemas define the structure of vector data stored in Milvus:

collectionSchema := &entity.Schema{
    CollectionName: "my_collection",
    Description:    "Example collection",
    AutoID:         false,
    Fields: []entity.Field{
        {
            Name:       "id",
            DataType:   entity.FieldTypeInt64,
            PrimaryKey: true,
        },
        {
            Name:     "vector",
            DataType: entity.FieldTypeFloatVector,
            TypeParams: map[string]string{
                entity.TypeParamDim: "128",
            },
        },
    },
}

Schema Field Types

Field TypeDescription
FieldTypeInt6464-bit integer primary key
FieldTypeFloatVectorFloat vector for dense embeddings
FieldTypeBinaryVectorBinary vector
FieldTypeSparseVectorSparse vector for BM25-style search
FieldTypeJSONJSON document field
FieldTypeArrayArray field with element type

Source: client/entity/schema.go

TruncateCollection

Removes all data from a collection without dropping and recreating it:

err = cli.TruncateCollection(ctx, NewTruncateCollectionOption("my_collection"))

This is useful for quickly clearing collection data while maintaining the schema and index definitions. Source: client/v2.6.3 release notes

Write Operations

Insert Data

Insert vectors with optional metadata:

ids, err := cli.Insert(ctx, &milvusclient.InsertOption{
    CollectionName: "my_collection",
    PartitionName:  "optional_partition",
    FieldsData: []entity.FieldData{
        entity.NewInt64Field("id", []int64{1, 2, 3}),
        entity.NewFloatVectorField("vector", [][]float32{
            {0.1, 0.2, ...},
            {0.3, 0.4, ...},
        }),
    },
})

Upsert Data

Upsert combines insert and update operations:

err = cli.Upsert(ctx, &milvusclient.UpsertOption{
    CollectionName: "my_collection",
    FieldsData: []entity.FieldData{...},
})

Array Field Partial Updates

The SDK supports partial updates for array fields with ARRAY_APPEND and ARRAY_REMOVE operations:

// Append elements to array field
err = cli.Upsert(ctx, &milvusclient.UpsertOption{
    CollectionName: "my_collection",
    FieldsData:    ...,
    // Uses ARRAY_APPEND for array fields
})

Source: client/milvusclient/write.go, client/v2.6.5 release notes

Read Operations

Perform similarity search with various vector types:

results, err := cli.Search(ctx, &milvusclient.SearchOption{
    CollectionName: "my_collection",
    AnnsField:     "vector",
    SearchParams: map[string]any{
        "metric_type": "L2",
        "params":      map[string]any{"nprobe": 10},
    },
    Vector: [][]float32{
        {0.1, 0.2, 0.3, ...},
    },
    Limit:     10,
    Offset:    0,
    OutputFields: []string{"id"},
})

Vector Sub-field Columns

The SDK supports struct-array vector sub-field columns for advanced embedding scenarios:

// Search with embedding list
results, err := cli.Search(ctx, &milvusclient.SearchOption{
    CollectionName: "my_collection",
    AnnsField:     "embeddings", // Array field with vector sub-fields
    // ... additional search parameters
})

Source: client/milvusclient/read.go, client/v2.6.4 release notes

Maximum similarity search is supported for compatible index types:

results, err := cli.Search(ctx, &milvusclient.SearchOption{
    CollectionName: "my_collection",
    AnnsField:     "sparse_vector",
    SearchParams: map[string]any{
        "metric_type": "MAX_SIM",
    },
    Vector: sparseVector,
    Limit:  10,
})

Query with Filters

Query entities with boolean expressions:

results, err := cli.Query(ctx, &milvusclient.QueryOption{
    CollectionName: "my_collection",
    Expr:          "id > 100",
    OutputFields:  []string{"id", "vector"},
    Limit:         100,
})

Nullable Vector Columns

The SDK supports nullable vector columns across all vector types:

Vector TypeNullable Support
Dense (FloatVector)Yes
Binary (BinaryVector)Yes
Sparse (SparseVector)Yes
Int8 (Int8Vector)Yes

Source: client/milvusclient/read.go, client/v2.6.5 release notes

Index Management

Create Index

Build indexes for efficient search:

err = cli.CreateIndex(ctx, &milvusclient.CreateIndexOption{
    CollectionName: "my_collection",
    FieldName:      "vector",
    IndexName:      "vector_idx",
    IndexParams: map[string]string{
        "index_type":  "IVF_FLAT",
        "metric_type": "L2",
        "params":      "{\"nlist\":128}",
    },
})

Index Types

Index TypeDescription
FLATBrute-force search
IVF_FLATInverted file index
IVF_SQ8IVF with scalar quantization
HNSWHierarchical navigable small world
DISKANNDisk-based ANN

Source: client/milvusclient/index.go

Role-Based Access Control (RBAC)

User Management

// Create user
err = cli.CreateUser(ctx, &milvusclient.CreateUserOption{
    Username: "user1",
    Password: "password123",
})

// Delete user
err = cli.DropUser(ctx, &milvusclient.DropUserOption{
    Username: "user1",
})

Role Management

// Create role
err = cli.CreateRole(ctx, &milvusclient.CreateRoleOption{
    RoleName: "viewer",
})

// Grant privilege
err = cli.Grant(ctx, &milvusclient.GrantOption{
    Role:       "viewer",
    Object:     entity.ObjectTypeCollection,
    ObjectName: "my_collection",
    Privilege:  entity.PrivilegeSearch,
})

Get Replication Configuration

Retrieve the current replication settings:

config, err := cli.GetReplicateConfiguration(ctx, &milvusclient.GetReplicateConfigurationOption{})

Source: client/milvusclient/rbac.go, client/v2.6.3 release notes

SDK Version Compatibility

Milvus VersionGo SDK Version
2.6.172.6.4
2.6.152.6.3
2.6.142.6.1
2.6.132.6.1
2.6.122.6.1

The Go SDK maintains semantic versioning aligned with the Milvus server releases. Source: Milvus release notes

Installation

go get -u github.com/milvus-io/milvus/client/v2

Source: client/README.md

Best Practices

  1. Connection Management: Always defer client.Close(ctx) to ensure proper cleanup
  2. Context Handling: Use appropriate context timeouts for long-running operations
  3. Error Handling: Check all SDK method returns for errors before proceeding
  4. Batch Operations: Use batch insert for better throughput when loading large datasets
  5. Index Building: Create indexes after data insertion for optimal performance

Source: https://github.com/milvus-io/milvus / Human Manual

Doramagic Pitfall Log

Source-linked risks stay visible on the manual page so the preview does not read like a recommendation.

medium Installation risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Capability evidence risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Runtime risk requires verification

May increase setup, validation, or first-run risk for the user.

medium Maintenance risk requires verification

May increase setup, validation, or first-run risk for the user.

Doramagic Pitfall Log

Found 8 structured pitfall item(s), including 0 high/blocking item(s). Top priority: Installation risk - Installation risk requires verification.

1. Installation risk: Installation risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a installation risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: community_evidence:github | cevd_f0bbd70e35ef4c65a374c802b7614b46 | https://github.com/milvus-io/milvus/issues/28583

2. Capability evidence risk: Capability evidence risk requires verification

  • Severity: medium
  • Finding: README/documentation is current enough for a first validation pass.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: capability.assumptions | github_repo:208728772 | https://github.com/milvus-io/milvus

3. Runtime risk: Runtime risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a runtime risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: packet_text.keyword_scan | github_repo:208728772 | https://github.com/milvus-io/milvus

4. Maintenance risk: Maintenance risk requires verification

  • Severity: medium
  • Finding: Project evidence flags a maintenance risk. Review the linked source before relying on this workflow.
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:208728772 | https://github.com/milvus-io/milvus

5. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: downstream_validation.risk_items | github_repo:208728772 | https://github.com/milvus-io/milvus

6. Security or permission risk: Security or permission risk requires verification

  • Severity: medium
  • Finding: no_demo
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: risks.scoring_risks | github_repo:208728772 | https://github.com/milvus-io/milvus

7. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: issue_or_pr_quality=unknown。
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:208728772 | https://github.com/milvus-io/milvus

8. Maintenance risk: Maintenance risk requires verification

  • Severity: low
  • Finding: release_recency=unknown。
  • User impact: May increase setup, validation, or first-run risk for the user.
  • Recommended check: Reproduce the official install and quickstart path in an isolated environment.
  • Evidence: evidence.maintainer_signals | github_repo:208728772 | https://github.com/milvus-io/milvus

Source: Doramagic discovery, validation, and Project Pack records

Community Discussion Evidence

These external discussion links are review inputs, not standalone proof that the project is production-ready.

Sources 7

Count of project-level external discussion links exposed on this manual page.

Use Review before install

Open the linked issues or discussions before treating the pack as ready for your environment.

Community Discussion Evidence

Doramagic exposes project-level community discussion separately from official documentation. Review these links before using milvus with real data or production workflows.

Source: Project Pack community evidence and pitfall evidence