Choosing a vector database usually comes down to a tradeoff between a full search service and an in-process library. OpenSearch and LanceDB sit on opposite ends of that spectrum: one runs as a distributed cluster, the other as an embedded file format you query directly from your application. This post benchmarks both on the same workload (287,360 COCO 2017 images embedded with SigLIP 2) measuring ingestion throughput, query latency, storage layout, and AWS cost.
Setup
Both systems index the same data: 287,360 images from the COCO 2017 dataset, embedded with Google's SigLIP 2 (SoViT-400M, 384px) into 1152-dimensional vectors, L2-normalized. The embeddings parquet is 46.6 GB, most of it inline JPEG image bytes stored alongside the vectors and metadata.
OpenSearch stores file paths pointing at images in S3. LanceDB stores the JPEG bytes inline alongside the vectors and metadata. Inline storage means a query returns everything needed to render results in one round trip (no second fetch to object storage, no signed URLs, no IAM policy to maintain). The tradeoff is a larger index and slower ingestion, since every byte of every image is written into the columnar store. Whether that's worth it depends on how often you're reading versus writing, and whether your images already live somewhere the application can reach cheaply.
Ingestion Performance
OpenSearch: 1,724s
OpenSearch indexes documents through its bulk API over HTTP. Each document contains the vector, metadata fields, and a string reference to the image file on disk:
actions.append({
    "_index": INDEX_NAME,
    "_source": {
        "embedding": row["vector"],
        "image_id": row["image_id"],
        "file_name": row["file_name"],
        "image_path": f"data/coco_images/{row['file_name']}",
        "coco_url": row["coco_url"],
        "caption": row["caption"],
    },
})

The entire parquet is loaded into memory, serialized to JSON, and sent over the network to the OpenSearch container. At 287K documents with 1152-dim vectors, this takes nearly 29 minutes.
Loaded 287360 rows from parquet (vector dim=1152)
Created index 'coco-clip-embeddings' (dim=1152)
Indexing into OpenSearch...
Indexed 287360 documents in 1724.4s

LanceDB: 263s (6.5x faster)
LanceDB writes directly to Lance columnar files on local disk. The script streams the parquet in 4,096-row batches to keep memory bounded:
for batch in pf.iter_batches(batch_size=BATCH_SIZE):
    batch_df = batch.to_pandas()
    if table is None:
        table = db.create_table(TABLE_NAME, data=batch_df)
    else:
        table.add(batch_df)

Insertion takes 195 seconds. The IVF_HNSW_SQ index build adds another 68 seconds, for a total of 263.5 seconds. That's 8.8x faster than OpenSearch for the data load alone, and still 6.5x faster including index creation.
Inserting 287,360 records into LanceDB in batches of 4,096...
Loading: 100%|██████████| 287360/287360 [03:14<00:00, 1475.05row/s]
Inserted 287,360 records in 194.8s
--- Creating IVF_HNSW_SQ index ---
rows=287,360 dim=1152 metric=cosine
num_partitions=1 m=32 ef_construction=300
Index created in 67.7s
Total time: 263.5s

Why the Difference?
OpenSearch ingestion goes through multiple layers: HTTP serialization, REST API parsing, Lucene segment writes, and JVM garbage collection. LanceDB writes directly to columnar files on disk with no network hop, no JVM, and no serialization overhead.
Query Results
Both systems return the same top result for a query using the first image embedding (a man on a moped):
(Top-10 result tables for OpenSearch and LanceDB omitted.)
OpenSearch reports cosine similarity scores (higher is better), LanceDB reports cosine distance (lower is better). Both retrieved the exact match at rank 1. The remaining results differ because OpenSearch uses HNSW with default Lucene parameters while LanceDB uses IVF_HNSW_SQ with scalar quantization; different index structures produce different approximate nearest neighbors beyond the exact match.
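This divergence can be quantified with a simple overlap@k between the two ranked ID lists. A minimal sketch (the IDs below are made up for illustration, not the benchmark's actual results):

```python
def overlap_at_k(ids_a, ids_b, k=10):
    """Fraction of the top-k IDs both systems agree on (order-insensitive)."""
    return len(set(ids_a[:k]) & set(ids_b[:k])) / k

# Hypothetical top-5 image_ids from each system; rank 1 matches, the tail differs.
opensearch_ids = [391895, 522418, 184613, 318219, 554625]
lancedb_ids    = [391895, 522418, 60623, 554625, 309022]
print(overlap_at_k(opensearch_ids, lancedb_ids, k=5))  # 0.6
```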
Look at the rightmost column: LanceDB returns image bytes directly in the search results. OpenSearch returns a file path string. To actually display an OpenSearch result, you need a second system (S3, a CDN, or a local filesystem mount).
Storage Architecture
This is where the two systems diverge most, and it drives the cost and complexity differences below.
OpenSearch: Vectors + References
OpenSearch Container (Docker/JVM)
├── HNSW index in JVM heap (~2.7 GB)
├── Lucene segments on EBS (~1.8 GB)
└── image_path: "data/coco_images/000000391895.jpg" <-- just a string
S3 / CDN / Filesystem (separate)
└── 287,360 JPEG files (~55 GB)
OpenSearch stores vectors in JVM heap for kNN search. The HNSW graph must fit entirely in memory. Images live somewhere else entirely. Your application needs to resolve the path, fetch the file, and serve it. That's additional infrastructure to deploy, secure, and pay for.
LanceDB: Everything Inline
Lance files on disk/S3
└── coco_clip_embeddings.lance/
├── vectors (1152-dim float32)
├── metadata (image_id, caption, etc.)
└── image_bytes (raw JPEG) <-- stored inline, ~46 GB total
LanceDB stores vectors, metadata, and image bytes together in columnar Lance files. A search query returns everything, including the image, in a single read.
import io
from PIL import Image

results = table.search(query_vec).limit(10).to_pandas()
img = Image.open(io.BytesIO(results.iloc[0]["image_bytes"]))

The Lance format is columnar with data stored in fragments, so reading vectors for search doesn't touch the image bytes column. Only when you access image_bytes does it read those pages. Memory-mapping lets the OS handle caching; LanceDB doesn't load everything into RAM.
AWS Cost Comparison
Using the cost estimator from the project at three scales:
287K Documents (This Benchmark)
OpenSearch needs a memory-optimized instance because the HNSW graph lives in JVM heap. LanceDB memory-maps from disk, so a 2 GB compute-optimized instance is sufficient.
1M Documents
At 1M vectors, OpenSearch needs to double its instance size. LanceDB stays on the same instance. The working set (memory-mapped pages actually accessed during queries) is still well under 1 GB.
10M Documents
At 10M vectors with 1152 dimensions, OpenSearch needs 94 GB of JVM heap for the HNSW graph. That requires an r6g.8xlarge, a 256 GB machine at $2.67/hr just for compute. LanceDB's working set is ~2 GB, served by a $0.14/hr instance.
Why the Gap Widens
OpenSearch cost scales with RAM because vectors must fit in JVM heap. Memory-optimized instances are expensive. LanceDB cost scales with storage (S3 at $0.023/GB/month) because it memory-maps columnar files and only loads the pages needed per query. Storage is cheap. As document counts grow, OpenSearch jumps to larger (and disproportionately expensive) instance tiers, while LanceDB's compute stays roughly flat.
Cost scaling (approximate):
OpenSearch: O(num_docs × dims × instance_price_per_GB_RAM)
LanceDB: O(num_docs × dims × s3_price_per_GB) + fixed_small_compute
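Plugging this benchmark's numbers into those scaling formulas gives a rough sanity check. The 2x HNSW graph-overhead factor and the ~160 KB average inline payload per document are assumptions for illustration, not the project's actual estimator:

```python
def opensearch_ram_gb(num_docs, dims, graph_overhead=2.0, bytes_per_float=4):
    """Approximate JVM heap: raw float32 vectors times an assumed HNSW graph overhead."""
    return num_docs * dims * bytes_per_float * graph_overhead / 1e9

def lancedb_storage_gb(num_docs, dims, payload_bytes_per_doc=160_000, bytes_per_float=4):
    """Approximate S3 footprint: vectors plus inline JPEG payload (~160 KB/doc assumed)."""
    return num_docs * (dims * bytes_per_float + payload_bytes_per_doc) / 1e9

print(f"{opensearch_ram_gb(10_000_000, 1152):.0f} GB heap")   # ~92 GB, near the 94 GB above
print(f"{lancedb_storage_gb(287_360, 1152):.0f} GB on S3")    # ~47 GB, near the 46.6 GB parquet
```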
Index Configuration
OpenSearch
OpenSearch uses HNSW with Lucene's defaults. The kNN index is configured at index creation:
"settings": {
"index": {
"knn": True,
"knn.algo_param.ef_search": 100,
}
},
"mappings": {
"properties": {
"embedding": {
"type": "knn_vector",
"dimension": dim,
"method": {
"name": "hnsw",
"space_type": "cosinesimil",
"engine": "lucene",
},
},
}
}
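Assembled into a complete request body, the fragment above would be handed to the client's index-creation call. A sketch assuming the opensearch-py client:

```python
def build_index_body(dim, ef_search=100):
    """Full request body combining the settings and mappings fragments above."""
    return {
        "settings": {"index": {"knn": True, "knn.algo_param.ef_search": ef_search}},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": dim,
                    "method": {
                        "name": "hnsw",
                        "space_type": "cosinesimil",
                        "engine": "lucene",
                    },
                },
            }
        },
    }

# Against a live cluster this would be roughly:
#   client = opensearchpy.OpenSearch("http://localhost:9200")
#   client.indices.create(index="coco-clip-embeddings", body=build_index_body(1152))
```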
LanceDB
The IVF_HNSW_SQ index parameters are derived from table statistics:
# Single HNSW graph for tables under 1M rows
num_partitions = 1 if num_rows < 1_000_000 else int(math.sqrt(num_rows))
# More graph connectivity for larger tables
m = 32 if num_rows > 100_000 else 20
ef_construction = 400 if num_rows > 500_000 else 300
table.create_index(
metric="cosine",
vector_column_name="vector",
index_type="IVF_HNSW_SQ",
num_partitions=num_partitions,
m=m,
ef_construction=ef_construction,
)
Scalar quantization (SQ) compresses each float32 to 8 bits during search, reducing memory bandwidth with minimal recall loss. The index builds in 68 seconds for 287K vectors.
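The quantization step can be sketched in NumPy: per-dimension min/max scaling maps each float32 to an unsigned byte, and dequantization recovers the value to within half a quantization step. A toy illustration of scalar quantization, not Lance's implementation:

```python
import numpy as np

def sq_encode(vectors):
    """Quantize float32 vectors to uint8 with per-dimension min/max scaling."""
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def sq_decode(codes, lo, scale):
    """Map uint8 codes back to approximate float32 values."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
vecs = rng.standard_normal((1000, 1152)).astype(np.float32)
codes, lo, scale = sq_encode(vecs)
approx = sq_decode(codes, lo, scale)

# 4x smaller, and every value is recovered within half a step (plus float noise).
assert codes.nbytes == vecs.nbytes // 4
assert np.max(np.abs(approx - vecs)) <= scale.max() / 2 + 1e-4
```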
Migration Path
The project includes a live migration script that reads from an OpenSearch index via scroll API and writes to LanceDB, pulling image bytes inline:
# Scroll through OpenSearch documents
for doc in scroll_opensearch(client, INDEX_NAME):
    image_path = IMAGES_DIR / doc["file_name"]
    image_bytes = image_path.read_bytes() if image_path.exists() else b""
    records.append({
        "image_id": doc["image_id"],
        "vector": doc["embedding"],
        "image_bytes": image_bytes,  # inline the image
        # ... metadata fields
    })
You can migrate incrementally without regenerating embeddings. The vectors come from OpenSearch, the images from disk, and everything lands in a single LanceDB table.
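The scroll_opensearch helper isn't shown in the snippet above; a plausible implementation over the scroll API, assuming opensearch-py's client interface, looks like this. Taking the client as a plain parameter keeps it testable against a stub:

```python
def scroll_opensearch(client, index, page_size=1000, keep_alive="2m"):
    """Yield every _source document from `index` via the scroll API."""
    resp = client.search(index=index, scroll=keep_alive,
                         body={"query": {"match_all": {}}, "size": page_size})
    sid = resp["_scroll_id"]
    hits = resp["hits"]["hits"]
    while hits:
        for hit in hits:
            yield hit["_source"]
        resp = client.scroll(scroll_id=sid, scroll=keep_alive)
        sid = resp["_scroll_id"]
        hits = resp["hits"]["hits"]
    client.clear_scroll(scroll_id=sid)  # release the server-side scroll context
```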
When to Use Which
Choose OpenSearch when:
- You already run an Elasticsearch/OpenSearch cluster for full-text search and want to add vector search alongside it
- You need multi-tenancy with fine-grained access control
- Your team has existing operational expertise with the Elastic ecosystem
- You need real-time index updates with immediate consistency
Choose LanceDB when:
- Vector search is the primary use case, not an add-on
- You want to store images (or other binary data) inline with vectors
- Cost matters, especially at scale, where the gap between RAM and storage pricing widens
- You want to eliminate external storage infrastructure
- Your workload is bursty and benefits from scale-to-zero
Summary
- Ingestion: LanceDB is 8.8x faster at bulk loading (1,475 rows/s vs 167 rows/s) and 6.5x faster end to end including the index build, primarily because it writes directly to disk without HTTP/JVM overhead
- Cost: OpenSearch is 4.7x more expensive at 287K docs and 13.8x more expensive at 10M docs, driven by JVM heap requirements forcing memory-optimized instances
- Storage model: LanceDB's inline image storage eliminates the need for a separate S3/CDN layer, reducing both cost and architectural complexity
- Memory: OpenSearch loads the entire HNSW graph into JVM heap; LanceDB memory-maps columnar files and reads only the pages needed per query
- Scaling: The cost gap widens with scale because OpenSearch scales with expensive RAM while LanceDB scales with cheap storage
- Migration: You can migrate from OpenSearch to LanceDB incrementally without regenerating embeddings
The numbers above reflect a specific workload (image embeddings with large inline payloads); pure vector-only workloads without image storage would narrow the gap. But for applications where data self-containment matters (search results that include the actual content, not just references to it), LanceDB's embedded approach is compelling.
All code, benchmarks, and the cost estimator are available at opensearch-lancedb-migration.
The dataset is available on Hugging Face here: jrmiller/coco-2017-siglip2-embeddings