stable-worldmodel-a-high-performance-platform-for-reproducible-world-model-research
Ayush Chaurasia
Quentin Lhoest
Lucas Maes
Quentin Le Lidec
reproducible-data-curation-in-the-multimodal-lakehouse
Prashanth Rao
newsletter-may-2026
ChanChan Mao
newsletter-april-2026
ChanChan Mao
how-lancedb-accelerates-vector-search-at-10-billion-scale
Yang Cen
opensearch-vs-lancedb-for-vector-search-query-cost-and-infrastructure
Justin Miller
volcano-engine-autonomous-driving-data-lake-solution
Kejian Ju
unifying-the-av-ml-stack-lancedb
Ayush Chaurasia
lance-json-support-why-you-might-not-really-need-variant
Jack Ye
building-a-storage-format-for-the-next-era-of-biology
Pavan Ramkumar
newsletter-march-2026
ChanChan Mao
smart-parsing-meets-sharp-retrieval-combining-liteparse-and-lancedb
Clelia Astra Bertelli
Prashanth Rao
lance-format-v2-2-benchmarks-half-the-storage-none-of-the-slowdown
Xuanwo
make-your-sql-workflows-multimodal-with-lancedb-x-duckdb
Prashanth Rao
agentic-coding-as-community-stewardship
Xuanwo
what-we-mean-by-multimodal
Prashanth Rao
ai-native-development-local-continue-lancedb
Ty Dunn
lance-file-format-2-2-taming-complex-data
Xuanwo
lance-blob-v2
Xuanwo
Jack Ye
openclaw-lancedb-memory-layer
Xuanwo
Prashanth Rao
openclaw-lancedb-seed2
LanceDB
openclaw-memory-from-zero-to-lancedb-pro
Prashanth Rao
upload-lance-datasets-to-hf-hub
Prashanth Rao
zero-shot-image-classification-with-vector-search
Vipul Maheshwari
werides-data-platform-transformation-how-lancedb-fuels-model-development-velocity
Qian Zhu
Fei Chen
training-a-variational-autoencoder-from-scratch-with-the-lance-file-format
LanceDB
track-ai-trends-crewai-agents-rag
LanceDB
tokens-per-second-is-not-all-you-need
Mingran Wang
Tan Li
the-future-of-open-source-table-formats-iceberg-and-lance
Jack Ye
the-case-for-random-access-i-o
LanceDB
series-a-funding
Chang She
semanticdotart
Ayush Chaurasia
second-dinners-secret-weapon-lancedb-powered-rag-for-faster-smarter-game-development
Qian Zhu
search-within-an-image-331b54e4285e
Kaushal Choudhary
scalable-computer-vision-with-lancedb-voxel51-d8b65066d5f6
LanceDB
rethinking-table-file-paths-lance-multi-base-layout
Jack Ye
rag-isnt-one-size-fits-all
Leonard Marcq
python-package-to-convert-image-datasets-to-lance-type
Vipul Maheshwari
one-million-iops
Weston Pace
november-feature-roundup
Will Jones
newsletter-september-2025
Jasmine Wang
newsletter-october-2025
Jasmine Wang
newsletter-november-2025
ChanChan Mao
newsletter-june-2025
David Myriel
newsletter-july-2025
Jasmine Wang
newsletter-january-2026
ChanChan Mao
newsletter-february-2026
ChanChan Mao
newsletter-december-2025
ChanChan Mao
newsletter-august-2025
Jasmine Wang
my-summer-internship-experience-at-lancedb-2
Raunak Sinha
my-simd-is-faster-than-yours-fb2989bf25e7
LanceDB
multimodal-myntra-fashion-search-engine-using-lancedb
LanceDB
multimodal-lakehouse
David Myriel
multi-document-agentic-rag-a-walkthrough
Vipul Maheshwari
modified-rag-parent-document-bigger-chunk-retriever-62b3d1e79bc6
Mahesh Deshwal
memgpt-os-inspired-llms-that-manage-their-own-memory-793d6eed417e
Ayush Chaurasia
late-interaction-efficient-multi-modal-retrievers-need-more-than-just-a-vector-index
Ayush Chaurasia
lancedb-x-continue
LanceDB
lance-x-huggingface-a-new-era-of-sharing-multimodal-data
Prashanth Rao
Quentin Lhoest
Xuanwo
Ayush Chaurasia
lance-x-duckdb-sql-retrieval-on-the-multimodal-lakehouse-format
Xuanwo
lance-windows-windows-lance
Chang She
lance-v2
Weston Pace
lance-namespace-lancedb-and-ray
Jack Ye
lance-file-2-1-stable
Weston Pace
lance-file-2-1-smaller-and-simpler
Weston Pace
lance-data-viewer
Gordon Murray
lance-community-governance
Jack Ye
introducing-lance-namespace-spark-integration
Jack Ye
implementing-corrective-rag-in-the-easiest-way-2
LanceDB
hybrid-search-rag-for-real-life-production-grade-applications-e1e727b3965a
Mahesh Deshwal
hybrid-search-combining-bm25-and-semantic-search-for-better-results-with-lan-1358038fe7e6
LanceDB
hybrid-search-and-custom-reranking-with-lancedb-4c10a6a3447e
LanceDB
how-to-reduce-hallucinations-from-llm-powered-agents-using-long-term-memory-72f262c3cc1f
Tevin Wang
guide-to-use-contextual-retrieval-and-prompt-caching-with-lancedb
LanceDB
grpo-understanding-and-fine-tuning-the-next-gen-reasoning-model-2
Mahesh Deshwal
graphrag-hierarchical-approach-to-retrieval-augmented-generation
Akash Desai
gpu-accelerated-indexing-in-lancedb-27558fa7eee5
LanceDB
geo-support
Jack Ye
geneva-twelvelabs
David Myriel
geneva-feature-engineering
Jonathan Hsieh
from-bi-to-ai-lance-and-iceberg
Jack Ye
Prashanth Rao
fluss-integration
Wayne Wang
file-readers-in-depth-parallelism-without-row-groups
Weston Pace
feature-rabitq-quantization
David Myriel
Yang Cen
feature-full-text-search
David Myriel
enhance-rag-integrate-contextual-compression-and-filtering-for-precision-a29d4a810301
Kaushal Choudhary
effortlessly-loading-and-processing-images-with-lance-a-code-walkthrough
LanceDB
designing-a-table-format-for-ml-workloads
Weston Pace
custom-dataset-for-llm-training-using-lance
LanceDB
creating-a-fintech-agent
Vipul Maheshwari
convert-any-image-dataset-to-lance
LanceDB
columnar-file-readers-in-depth-structural-encoding
Weston Pace
columnar-file-readers-in-depth-repetition-definition-levels
Weston Pace
columnar-file-readers-in-depth-compression-transparency
Weston Pace
columnar-file-readers-in-depth-column-shredding
Weston Pace
columnar-file-readers-in-depth-backpressure
Weston Pace
columnar-file-readers-in-depth-apis-and-fusion
Weston Pace
chunking-techniques-with-langchain-and-llamaindex
Prashant Kumar
chunking-analysis-which-is-the-right-chunking-approach-for-your-language
Shresth Shukla
chat-with-csv-excel-using-lancedb
LanceDB
case-study-netflix
David Myriel
case-study-dosu
Qian Zhu
Michael Ludden
case-study-cognee
David Myriel
Vasilije Markovic
case-study-coderabbit
Qian Zhu
building-rag-on-codebases-part-2
Sankalp Shubham
building-rag-on-codebases-part-1
Sankalp Shubham
branching-and-shallow-clone
Jack Ye
better-rag-with-active-retrieval-augmented-generation-flare-3b66646e2a9f
LanceDB
benchmarking-random-access-in-lance
Chang She
benchmarking-lancedb-92b01032874a-2
LanceDB
benchmarking-cohere-reranker-with-lancedb
LanceDB
anythingllms-competitive-edge-lancedb-for-seamless-rag-and-agent-workflows
Ayush Chaurasia
announcing-lance-sdk
Weston Pace
agentic-rag-using-langgraph-building-a-simple-customer-support-autonomous-agent
LanceDB
advanced-rag-precise-zero-shot-dense-retrieval-with-hyde-0946c54dfdcb
LanceDB
accelerate-vector-search-applications-using-openvino-lancedb
LanceDB
a-primer-on-text-chunking-and-its-types-a420efc96a13
Prashant Kumar
a-practical-guide-to-training-custom-rerankers
Ayush Chaurasia
a-practical-guide-to-fine-tuning-embedding-models
Ayush Chaurasia
keep-your-data-fresh-with-cocoindex-and-lancedb
Prashanth Rao
Linghua Jin

LanceDB's RaBitQ Quantization for Blazing Fast Vector Search

September 17, 2025
Engineering

Most embedding models give you float32 vectors. That is precise, but wasteful. You pay in storage and query time. Scalar quantization can shrink the size, but not enough. Binary quantization can shrink it further, but at the cost of quality.

Until now LanceDB has relied on IVF_PQ (Inverted File with Product Quantization) as its default compression and search strategy. IVF_PQ works well in many cases, but comes with trade-offs: building PQ codebooks is expensive, re-training is required when data distribution shifts, and distance estimation can be slow. Learn more about index types in Indexing.

Today we are adding RaBitQ quantization to LanceDB. You now have a choice. IVF_PQ remains the default, but RaBitQ is available when you need higher compression, faster indexing, and better recall in high dimensions. See how this fits into vector search and hybrid search.

A 1024-dimensional float32 vector needs 4 KB. With RaBitQ and corrective factors the same vector takes about 136 bytes.
IVF_PQ typically reduces this by 8–16x.

Why this matters for you

High-dimensional vectors behave in ways that seem strange if you are used to 2D or 3D. In high dimensions almost every coordinate of a unit vector is close to zero. This is the concentration of measure phenomenon often discussed in retrieval systems.

RaBitQ takes advantage of this. It stores only the binary "sign pattern" of each normalized vector and relies on the fact that the error introduced is bounded. The RaBitQ paper proves the estimation error is O(1/√D) (decreases proportionally to the inverse square root of dimensions), which is asymptotically optimal. In practice, the more dimensions your embeddings have, the more RaBitQ gives you for free.

💡 RaBitQ is most effective with embeddings of 512, 768, or 1024 dimensions. IVF_PQ often struggles more as dimensionality grows, making RaBitQ a strong complement.

What happens under the hood

When you index embeddings in LanceDB with RaBitQ your vectors are shifted around a centroid and normalized. They are snapped to the nearest vertex of a randomly rotated hypercube on the unit sphere. Each quantized vector is stored as 1 bit per dimension. Index creation is managed the same way as other vector indexes.

Two small corrective factors are also stored:

  • the distance from the vector to the centroid
  • the dot product between the vector and its quantized form

At query time your input—whether text, image, audio, or video—is transformed in the same way. Comparisons become fast binary dot products, and the corrective factors restore fidelity.

Formally, RaBitQ maps a vector v to a binary vector (vqb) and a hypercube vertex (vh) on the unit sphere:

$$ v_h[i] = \begin{cases} \frac{1}{\sqrt{D}} & \text{if } v_{qb}[i] = 1 \\ -\frac{1}{\sqrt{D}} & \text{if } v_{qb}[i] = 0 \end{cases} $$

where D is the number of dimensions. Each component is either a positive or negative value scaled by the square root of dimensions.

To remove bias, RaBitQ samples a random orthogonal matrix P and sets the final quantized vector to:

$$ v_q = P \times v_h $$

During indexing, for each data vector (v), we compute the residual around a centroid c, normalize it, map it to bits via the inverse rotation P-1, and store two corrective scalars alongside the bits:

  • Centroid distance: 𝑑𝑐 =||⁢𝑣 −𝑐⁢||, which is the Euclidean distance from vector to centroid
  • Inner product: 𝑑𝑠 =⟨𝑣𝑞,𝑣𝑛⟩, which is the dot product between quantized and normalized vectors

At search time, the query is processed with the same rotation to enable extremely fast binary dot products plus lightweight corrections for accurate distance estimates.

RabitQ 2D mapping
Figure 1a: 2D illustration of mapping a vector to the nearest unit‑sphere hypercube vertex (RabitQ overview)
figure1b
Figure 1b: Geometry of query and data vectors in RaBitQ

This figure shows the geometric relationship between a data vector o, its quantized form ¯𝑜, and a query vector q. The auxiliary vector e1 lies in the same plane. RaBitQ exploits the fact that in high-dimensional spaces, the projection of ¯𝑜 onto 𝑒1 is highly concentrated around zero (red region on the right). This allows us to treat it as negligible, enabling accurate distance estimation with minimal computation.

Difference with IVF_PQ: PQ requires training a codebook and splitting vectors into subvectors. RaBitQ avoids that entirely. Indexing is faster, maintenance is simpler, and results are more stable when you update your dataset.

Search in practice

With IVF_PQ you first partition vectors into inverted lists, then run PQ-based comparisons inside each list. With RaBitQ the comparison step is even cheaper: a binary dot product plus small corrections. Both approaches benefit from reranking with full-precision vectors.

Both approaches use a candidate set and re-ranking to ensure quality. But because RaBitQ's approximations are tight, you often need fewer candidates to hit the same recall.

For you this means recall above 95 percent with lower latency, even on large multimodal datasets on our test hardware.

💡 Always re-rank candidates with full precision vectors. IVF_PQ and RaBitQ both rely on approximation for candidate generation.

Research context: extended-RaBitQ

The RaBitQ paper also introduces extended-RaBitQ, a generalization to multi-bit quantization. Instead of limiting each coordinate to 1 bit, extended-RaBitQ allows for 2–4 bits per dimension. This reduces error further and can outperform product quantization under the same compression budget. For background on token-based search that pairs well with these methods, see Full‑Text Search.

⚠️ extended-RaBitQ is not yet available in LanceDB. We mention it here only as research context. It shows the trajectory of this work, and the same principles that make RaBitQ efficient at 1 bit can be extended to higher precision when needed.

Benchmarks

benchmarks

We tested IVF_PQ and RaBitQ side by side on two public datasets:

The tests were run on a consumer-grade PC:

  • CPU: Intel 12400F (6 cores, 12 threads)
  • Disk: 1TB SSD
  • No GPU

DBpedia (text embeddings, 768d)

Metric IVF_PQ RaBitQ
recall@10 ~92% 96%+
throughput ~350 QPS 495 QPS
index build time ~85s ~75s
DBpedia Benchmark
Figure 2: DBpedia benchmark results (recall@10, throughput, build time)

GIST1M (image embeddings, 960d)

Metric IVF_PQ RaBitQ
recall@10 ~90% 94%
throughput ~420 QPS 540–765 QPS
index build time ~130s ~21s
GIST1M Benchmark
Figure 3: GIST1M benchmark results (recall@10, throughput, build time)
💡 On CPUs, RaBitQ already outperforms IVF_PQ on our test hardware. On GPUs the gap should widen further because RaBitQ operations are easier to parallelize.

IVF_PQ vs RaBitQ

When you use LanceDB today, the default quantization method is IVF_PQ. It works well for many workloads and gives you solid compression and recall. With RaBitQ now available, you have another option. The two approaches make very different trade-offs, especially when your vectors are high-dimensional or multimodal.

Feature IVF_PQ (default) RaBitQ (new)
Compression ratio 8–16x typical Up to 32x
Index build time Slower, requires codebook training Faster, no training needed
Recall in high dimensions Good but drops with >512d Stays high, error shrinks with more dimensions
Query speed Good, but PQ distance estimation is slower Faster, binary dot products dominate
Update handling Requires re-training if distribution shifts No retraining needed, robust to updates
Best for General workloads, balanced compression Large multimodal datasets, high-dim embeddings

What this means in practice is that you don't have to pick one forever. IVF_PQ remains a great default for balanced workloads. But if you are working with very large datasets, embeddings above 512 dimensions, or multimodal collections where precision matters, RaBitQ can give you higher recall, faster queries, and far smaller storage footprints. Because RaBitQ does not depend on heavy codebook training, you can also update your data more easily without re-training the index.

What you can do today in LanceDB

With IVF_PQ you already had scalable compression and fast ANN search. Now with RaBitQ you get an additional tool:

  • Use IVF_PQ for general-purpose workloads where codebooks make sense
  • Use RaBitQ when you want maximum compression, faster indexing, or higher recall in high dimensions
  • Mix RaBitQ-S with other modalities for adaptive precision control
  • Keep more embeddings online for RAG , multimodal search, and analytics

Proven and ready

RaBitQ is not theoretical. Elastic integrated it into their BBQ quantization feature. Tensorchord built a cost-efficient vector search service with it. Both show strong improvements in recall and speed compared to classical product quantization.

Now RaBitQ is available in LanceDB alongside IVF_PQ. You can choose the method that best fits your workload, and even switch as your dataset grows. If you're exploring LanceDB for the first time, check out our quickstart, the WikiSearch demo, and how this ties into the Multimodal Lakehouse.

outro
David Myriel
Writer, Software Engineer
Yang Cen
Software Engineer @ LanceDB

Stable-Worldmodel: A High Performance Platform for Reproducible World Model Research

Ayush Chaurasia
Quentin Lhoest
Lucas Maes
Quentin Le Lidec
June 2, 2026
stable-worldmodel-a-high-performance-platform-for-reproducible-world-model-research

🌍 Lance-Backed World Model Platform, 🦆 Multimodal SQL with Lance DuckDB Extension, 💰 LanceDB vs OpenSearch Cost Breakdown

ChanChan Mao
May 28, 2026
newsletter-may-2026

Reproducible Data Curation In The Multimodal Lakehouse

Prashanth Rao
May 29, 2026
reproducible-data-curation-in-the-multimodal-lakehouse