Vector search on object storage: Performance at scale without the infrastructure tax

Most vector databases require three systems: raw data in a lake, metadata in a warehouse, embeddings in a vector index. Three copies of state. Three things to keep in sync.

LanceDB stores everything in one table on object storage. Compute nodes are stateless. No RAM constraints at scale.


Why teams switch

Compute-storage separation

Data lives on object storage at $0.02/GB/month. Compute scales with query load, not data size. 10 TB of data, one small query node during off-peak. No paying for idle capacity.
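A quick worked example of the storage side of that claim, using the $0.02/GB/month rate quoted above (the helper function is illustrative, not part of any API):

```python
# Back-of-the-envelope storage cost at the object-storage rate quoted
# above ($0.02/GB/month). The helper name is illustrative.
STORAGE_RATE_PER_GB_MONTH = 0.02  # USD

def monthly_storage_cost(data_tb: float) -> float:
    """Monthly object-storage cost in USD for `data_tb` terabytes."""
    return data_tb * 1024 * STORAGE_RATE_PER_GB_MONTH

# 10 TB of data costs roughly $205/month to keep on object storage,
# independent of how much compute is running against it.
print(round(monthly_storage_cost(10), 2))
```

Compute is the variable you tune separately: the storage bill above stays flat whether one small node or a fleet is querying it.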

One table, holding the actual data

Embeddings, metadata, and raw blobs in the same table. Not links to S3. Blobs. Vector search, full-text search, and SQL filtering compose into a single query. No round trips.
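Conceptually, keeping embeddings, metadata, and blobs side by side means the engine can apply the filter, rank by distance, and return the raw bytes in a single pass. A minimal pure-Python sketch of that idea (row layout and names are illustrative, not LanceDB's API):

```python
import math

# Illustrative rows: embedding, metadata, and raw blob in one table.
rows = [
    {"vec": [0.1, 0.9], "category": "doc", "blob": b"raw bytes 1"},
    {"vec": [0.8, 0.2], "category": "img", "blob": b"raw bytes 2"},
    {"vec": [0.2, 0.8], "category": "doc", "blob": b"raw bytes 3"},
]

def search(query, predicate, k=2):
    """Filter + nearest-neighbour rank in one pass. The blob comes back
    with the hit, so there is no second fetch from object storage."""
    hits = [r for r in rows if predicate(r)]
    return sorted(hits, key=lambda r: math.dist(r["vec"], query))[:k]

top = search([0.0, 1.0], lambda r: r["category"] == "doc", k=1)
print(top[0]["blob"])  # the raw blob arrives with the search result
```

A real engine pushes the filter into the index scan rather than brute-forcing, but the shape of the query is the same: one call, no follow-up round trip for the original data.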

Write a new column without rewriting the table

Adding a column doesn't rewrite existing data. Zero copy. New embedding model? Access control column? Test two models side by side? Column-level operations, not table-level rewrites.
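The idea behind "zero copy" here is columnar layout: each column lives in its own storage, so a new column is new data on the side, not a rewrite of old rows. A toy sketch of that property (a dict of lists standing in for column files; this is conceptual, not Lance's on-disk format):

```python
# Toy columnar table: each column is its own array, standing in for a
# separate file on object storage.
table = {
    "id":  [1, 2, 3],
    "vec": [[0.1], [0.2], [0.3]],
}

def add_column(tbl, name, values):
    """'Zero copy' means existing columns keep their buffers untouched."""
    before = {k: id(v) for k, v in tbl.items()}  # identity of old buffers
    tbl[name] = values                           # only the new column is written
    assert all(id(tbl[k]) == before[k] for k in before)
    return tbl

# e.g. embeddings from a second model, tested side by side with the first
add_column(table, "vec_v2", [[0.9], [0.8], [0.7]])
```

The assertion is the point: after the operation, every pre-existing column is byte-for-byte the same object it was before.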

IVF-based indexing

Inserts go to the appropriate partition without touching others. Deletes handled by lightweight bitmaps. MVCC for concurrent reads and writes. Index rebuilds happen asynchronously.
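The partition and bitmap mechanics can be sketched in a few lines. This is a conceptual miniature of IVF bookkeeping, not Lance's implementation: vectors are bucketed by nearest centroid, so an insert writes to exactly one partition and a delete flips a flag instead of rewriting files.

```python
import math

# Minimal IVF bookkeeping sketch (conceptual, not Lance's implementation).
centroids = [[0.0, 0.0], [1.0, 1.0]]
partitions = [[] for _ in centroids]  # vectors per partition
deleted = [[] for _ in centroids]     # per-partition delete flags ("bitmap")

def nearest(vec):
    return min(range(len(centroids)), key=lambda i: math.dist(vec, centroids[i]))

def insert(vec):
    p = nearest(vec)          # only this partition is touched
    partitions[p].append(vec)
    deleted[p].append(False)
    return p

def delete(p, idx):
    deleted[p][idx] = True    # lightweight tombstone, no file rewrite

insert([0.1, 0.2])  # lands in partition 0
insert([0.9, 0.8])  # lands in partition 1; partition 0 untouched
delete(0, 0)        # row hidden by the bitmap until the next async rebuild
```

Queries simply skip rows whose flag is set, which is why deletes are cheap and index rebuilds can be deferred to a background job.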

Comparison

Traditional vector database vs. LanceDB

Cost
  • Traditional: RAM-bound. Replication for availability means 2-3x the storage. Thousands of dollars per month at 100M vectors.
  • LanceDB: Object storage at $0.02/GB/month. Only the hot index in memory. Stateless compute.

Scale
  • Traditional: Index structures, caches, metadata, and allocator overhead all have to stay hot in RAM.
  • LanceDB: Large index persisted to disk. Lance format optimized for random access. Small hot set in memory.

Search
  • Traditional: Embeddings and metadata in the vector DB, raw docs and images in S3. A second call to fetch originals.
  • LanceDB: Vector, full-text, and SQL in one query against one table. Raw blobs stored inline.

Data model
  • Traditional: Three systems: a lake for raw data, a warehouse for metadata, a vector DB for embeddings.
  • LanceDB: One table. The embedding is a column. Metadata in other columns. Raw binary in another.

Schema changes
  • Traditional: A new column means rewriting every row. A full rewrite tax every time your app evolves.
  • LanceDB: Column-level operations. Existing columns untouched. Zero copy.

Best for
  • Traditional: Sub-millisecond p99 at any cost. Small static datasets. Zero infrastructure decisions.
  • LanceDB: Best cost-to-performance at scale. Teams that want to understand and control their infrastructure.

The Power of the Lance Format

Vector Search
  • Fast scans and random access from the same table — no tradeoff
  • Zero-copy access for high throughput without serialization overhead
Multi-Modal
  • Raw data, embeddings, and metadata in one table — not pointers to blob storage
  • No separate metadata store to keep in sync

Enterprise-Grade Requirements

Security

Granular RBAC, SSO integration, and VPC deployment options.

Governance

Data versioning and time-travel capabilities for auditability.

Support

Dedicated technical account management and guaranteed SLAs.

Talk to Engineering

Or try LanceDB OSS — same code, scales to Cloud.

Schedule a Technical Consultation