Key Features
- Embedding Storage & Dimensions: Dense float32, int8, int1, and quantized vectors stored as regular document fields. Max 8192 dimensions as of June 2025.
- Dual Index Types: ANN via tunable HNSW (efConstruction, efSearch, M) and ENN (exact) for ≤10k vectors or highly-filtered subsets.
- Advanced Query API: $vectorSearch aggregation stage with pre-filtering in standard MQL. ANN requires MongoDB 6.0.11 or 7.0.2 and later; ENN requires 6.0.16, 7.0.10, or 7.3.2 and later.
- Vector Quantization: Scalar (int8) cuts RAM ~75%; Binary (int1) up to ~97% with automatic rescoring for binary vectors.
- Search Nodes & Scalability: Search Nodes (GA) run the vector workload on dedicated hardware, isolated from the database nodes, and reduce query latency by up to 60%. Horizontal scaling via sharding.
- Security & Encryption: Field-Level Encryption & Queryable Encryption compatible with vector indexes for regulated environments.
- Multi-Provider Support: Works with embeddings from any provider (OpenAI, Cohere, Bedrock, Voyage, Jina, Nomic, etc.) as long as the vectors have at most 8192 dimensions.
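The quantization savings quoted above follow from simple arithmetic on bits per vector component. The sketch below is only a back-of-the-envelope estimate: the helper names are hypothetical, and it counts raw vector bytes only, ignoring HNSW graph overhead and any full-fidelity copies kept for rescoring.

```javascript
// Bits used per vector component for each storage type.
const BITS_PER_COMPONENT = { float32: 32, int8: 8, int1: 1 };

// Raw bytes needed for one vector (hypothetical helper, vector data only).
function vectorBytes(numDimensions, type) {
  const bits = BITS_PER_COMPONENT[type];
  if (bits === undefined) throw new Error(`unknown vector type: ${type}`);
  return Math.ceil((numDimensions * bits) / 8);
}

// Fraction of memory saved relative to float32 storage.
function savingsVsFloat32(numDimensions, type) {
  const full = vectorBytes(numDimensions, "float32");
  return 1 - vectorBytes(numDimensions, type) / full;
}

const dims = 1536;
for (const type of ["float32", "int8", "int1"]) {
  const bytes = vectorBytes(dims, type);
  const pct = (savingsVsFloat32(dims, type) * 100).toFixed(1);
  console.log(`${type}: ${bytes} bytes per vector, ${pct}% smaller than float32`);
}
```

For a 1536-dimension embedding this gives 6144 bytes as float32, 1536 bytes as int8 (75% smaller), and 192 bytes as int1 (about 96.9% smaller), which is consistent with the ~75% and ~97% figures in the list.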
Code Examples
Vector Search Index Creation
json
{
  "fields": [
    {
      "type": "vector",
      "path": "embedding",
      "numDimensions": 1536,
      "similarity": "cosine"
    },
    {
      "type": "filter",
      "path": "tenantId"
    }
  ]
}
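When index definitions are generated in code rather than written by hand, a small builder can catch mistakes before the definition reaches the cluster. The following is a minimal sketch, not an official helper: `buildVectorIndexDefinition` is a hypothetical name, the 8192 limit is the one quoted in this document, and the `quantization` field with values "none", "scalar", and "binary" is assumed to match the Atlas index definition syntax.

```javascript
const MAX_DIMENSIONS = 8192; // limit quoted in this document
const SIMILARITIES = ["cosine", "dotProduct", "euclidean"];
const QUANTIZATIONS = ["none", "scalar", "binary"]; // assumed allowed values

// Hypothetical helper: builds a vector index definition with basic validation.
function buildVectorIndexDefinition({
  path,
  numDimensions,
  similarity = "cosine",
  quantization = "none",
  filterPaths = [],
}) {
  if (!Number.isInteger(numDimensions) || numDimensions < 1 || numDimensions > MAX_DIMENSIONS) {
    throw new Error(`numDimensions must be an integer in 1..${MAX_DIMENSIONS}`);
  }
  if (!SIMILARITIES.includes(similarity)) {
    throw new Error(`unsupported similarity: ${similarity}`);
  }
  if (!QUANTIZATIONS.includes(quantization)) {
    throw new Error(`unsupported quantization: ${quantization}`);
  }
  const vectorField = { type: "vector", path, numDimensions, similarity };
  if (quantization !== "none") vectorField.quantization = quantization;
  const filterFields = filterPaths.map((p) => ({ type: "filter", path: p }));
  return { fields: [vectorField, ...filterFields] };
}

const definition = buildVectorIndexDefinition({
  path: "embedding",
  numDimensions: 1536,
  quantization: "scalar",
  filterPaths: ["tenantId"],
});
console.log(JSON.stringify(definition, null, 2));
// In mongosh, a definition like this could be passed to:
// db.movies.createSearchIndex("vector_index", "vectorSearch", definition)
```

Leaving `quantization` out of the output when it is "none" keeps the generated definition identical to the hand-written one above.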
ANN with Pre-Filter Query
javascript
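db.movies.aggregate([
  {
    $vectorSearch: {
      index: "vector_index",
      path: "embedding",
      queryVector: [0.12, -0.34 /* , ... remaining values, 1536 in total */],
      numCandidates: 150, // candidates considered before the top `limit` are returned
      limit: 10,
      filter: { tenantId: { $eq: "acme" } } // tenantId must be indexed as a "filter" field
    }
  }
])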
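In application code the pipeline is usually assembled from parameters. The sketch below shows one way to do that and to return the similarity score with each hit. `buildAnnPipeline` is a hypothetical helper and `title` is an assumed field on the `movies` documents. The default of 20 candidates per returned document and the cap of 10000 are assumptions based on commonly cited guidance, so check them against the current Atlas documentation.

```javascript
const MAX_NUM_CANDIDATES = 10000; // assumed upper bound, verify against the docs

// Hypothetical helper: builds an ANN $vectorSearch pipeline with a score projection.
function buildAnnPipeline({ index, path, queryVector, limit, numCandidates, filter }) {
  if (!Array.isArray(queryVector) || queryVector.length === 0) {
    throw new Error("queryVector must be a non-empty array of numbers");
  }
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  // Assumed rule of thumb: roughly 20 candidates per returned document.
  const candidates = numCandidates ?? Math.min(limit * 20, MAX_NUM_CANDIDATES);
  if (candidates < limit || candidates > MAX_NUM_CANDIDATES) {
    throw new Error(`numCandidates must be between limit and ${MAX_NUM_CANDIDATES}`);
  }
  const stage = { index, path, queryVector, numCandidates: candidates, limit };
  if (filter) stage.filter = filter;
  return [
    { $vectorSearch: stage },
    // vectorSearchScore exposes the similarity score computed by $vectorSearch.
    { $project: { _id: 1, title: 1, score: { $meta: "vectorSearchScore" } } },
  ];
}

const pipeline = buildAnnPipeline({
  index: "vector_index",
  path: "embedding",
  queryVector: [0.12, -0.34], // shortened for illustration; real vectors have 1536 values
  limit: 10,
  filter: { tenantId: { $eq: "acme" } },
});
console.log(JSON.stringify(pipeline, null, 2));
// In mongosh: db.movies.aggregate(pipeline)
```

Raising `numCandidates` generally improves recall at the cost of latency, so it is worth tuning per workload rather than relying on the default.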
Exact Nearest Neighbor (ENN)
javascript
{
  $vectorSearch: {
    index: "vector_index",
    path: "embedding",
    queryVector: [/* ... same dimensionality as the indexed field ... */],
    exact: true, // ENN mode; numCandidates is omitted
    limit: 50
  }
}
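Conceptually, exact search scores every vector that passes the filter and keeps the best `limit` results. The in-memory sketch below illustrates that idea and why the cost grows with the number of vectors scanned. It is not how Atlas implements ENN, the function names and sample documents are hypothetical, and the score here is raw cosine similarity, which may differ from the normalized score Atlas reports.

```javascript
// Raw cosine similarity between two equal-length, non-zero vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force exact search: one similarity computation per document that
// passes the predicate, so work is linear in the size of the filtered set.
function exactNearestNeighbors(docs, queryVector, limit, predicate = () => true) {
  return docs
    .filter(predicate)
    .map((doc) => ({ _id: doc._id, score: cosineSimilarity(doc.embedding, queryVector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

// Tiny hypothetical data set with 2-dimensional vectors.
const docs = [
  { _id: "a", tenantId: "acme", embedding: [1, 0] },
  { _id: "b", tenantId: "acme", embedding: [0.6, 0.8] },
  { _id: "c", tenantId: "other", embedding: [1, 0.1] },
  { _id: "d", tenantId: "acme", embedding: [0, 1] },
];
const top = exactNearestNeighbors(docs, [1, 0], 2, (d) => d.tenantId === "acme");
console.log(top.map((t) => t._id)); // ids of the two closest "acme" documents
```

Because the filter runs first, a selective pre-filter (for example a single tenant) keeps the scanned set small, which is the situation where exact search is most practical.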
Spring Data MongoDB 4.5.0
java
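import org.springframework.data.mongodb.core.index.VectorIndex;
import static org.springframework.data.mongodb.core.index.VectorIndex.SimilarityFunction.COSINE;
// Index "idx" on vector field "e" (1536 dims, cosine), with tenantId as a filter field
VectorIndex index = new VectorIndex("idx")
    .addVector("e", v -> v.dimensions(1536).similarity(COSINE))
    .addFilter("tenantId");
// Create the search index on the collection mapped to Movie
mongoTemplate.searchIndexOps(Movie.class).createIndex(index);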
Use Cases
- RAG systems - Hybrid retrieval ($vectorSearch + $search + filters)
- Semantic recommendations - Movies, e-commerce products with vector similarity
- Multi-tenant SaaS - ENN ensures exact retrieval per tenant after metadata filters
- Real-time AI agents - Kafka → Atlas → Bedrock/LangGraph pipelines
- Regulated search - Encrypted vectors + redaction filters for compliance
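Hybrid retrieval needs a way to merge the ranked results of a vector query and a full-text query. One common option is reciprocal rank fusion (RRF), sketched below in plain JavaScript. This is an illustration of the general technique, not a description of what Atlas does internally. The function name, the document ids, and the constant `k = 60` (a value often used in the RRF literature) are assumptions.

```javascript
// Reciprocal rank fusion: each list contributes 1 / (k + rank) per document,
// where rank is the 1-based position of the document in that list.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result ids, best first, from a $vectorSearch and a $search query.
const vectorHits = ["m1", "m2", "m3"];
const textHits = ["m3", "m1", "m4"];

const fused = reciprocalRankFusion([vectorHits, textHits]);
console.log(fused.map((r) => r.id));
```

Documents that appear in both lists (here `m1` and `m3`) accumulate two contributions and rise above documents found by only one retriever, which is the behavior hybrid retrieval is after.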
Pros & Cons
Advantages
- Single datastore - Operational and vector data live together, so there is no pipeline to keep a separate vector store in sync
- ANN + ENN + hybrid - All in one query language
- Search Nodes - Workload isolation & cost control
- Quantization + encryption - Secure, cost-efficient scale
- Unified platform - Store, process, and search all data types
Disadvantages
- Not in Community Edition - Vector features planned late 2025
- No native distributed HNSW - Vector indexes are confined to a single cluster per region
- ENN latency scaling - Latency grows linearly with the number of vectors scanned, so careful sizing is needed
- Maturing ecosystem - Tooling vs. dedicated vector DBs still developing
Future Outlook & Integrations
- Community Edition Support [Late 2025]: Vector search features planned for Community Edition
- Voyage AI Integration [H2 2025]: Next-gen embeddings & retrieval metrics integration
- Agentic RAG Blueprints [H2 2025]: LangGraph & CrewAI blueprints for agentic RAG workflows
- Quantization Improvements [Ongoing]: Continuous improvements with auto-tuning and 4-bit quantization
- Multi-Region Active-Active [Under Exploration]: Active-active vector search across multiple regions