Choosing a vector database usually comes down to a tradeoff between a full search service and an in-process library. OpenSearch and LanceDB sit on opposite ends of that spectrum: one runs as a distributed cluster, the other as an embedded file format you query directly from your application. This post benchmarks both on the same workload (287,360 COCO 2017 images embedded with SigLIP 2) measuring ingestion throughput, query latency, storage layout, and AWS cost.
Setup
Both systems index the same data: 287,360 images from the COCO 2017 dataset, embedded with Google's SigLIP 2 (SoViT-400M, 384px) into 1152-dimensional vectors, L2-normalized. The embeddings parquet is 46.6 GB, most of it inline JPEG image bytes stored alongside the vectors and metadata.
OpenSearch stores file paths pointing at images in S3. LanceDB stores the JPEG bytes inline alongside the vectors and metadata. Inline storage means a query returns everything needed to render results in one round trip (no second fetch to object storage, no signed URLs, no IAM policy to maintain). The tradeoff is a larger index and slower ingestion, since every byte of every image is written into the columnar store. Whether that's worth it depends on how often you're reading versus writing, and whether your images already live somewhere the application can reach cheaply.
Ingestion Performance
OpenSearch: 1,724s
OpenSearch indexes documents through its bulk API over HTTP. Each document contains the vector, metadata fields, and a string reference to the image file on disk:
actions.append({
    "_index": INDEX_NAME,
    "_source": {
        "embedding": row["vector"],
        "image_id": row["image_id"],
        "file_name": row["file_name"],
        "image_path": f"data/coco_images/{row['file_name']}",
        "coco_url": row["coco_url"],
        "caption": row["caption"],
    },
})

The entire parquet is loaded into memory, serialized to JSON, and sent over the network to the OpenSearch container. At 287K documents with 1152-dim vectors, this takes nearly 29 minutes.
Loaded 287360 rows from parquet (vector dim=1152)
Created index 'coco-clip-embeddings' (dim=1152)
Indexing into OpenSearch...
Indexed 287360 documents in 1724.4s

LanceDB: 263s (6.5x faster)
LanceDB writes directly to Lance columnar files on local disk. The script streams the parquet in 4,096-row batches to keep memory bounded:
for batch in pf.iter_batches(batch_size=BATCH_SIZE):
    batch_df = batch.to_pandas()
    if table is None:
        table = db.create_table(TABLE_NAME, data=batch_df)
    else:
        table.add(batch_df)

Insertion takes 195 seconds. The IVF_HNSW_SQ index build adds another 68 seconds, for a total of 263.5 seconds. That's 8.8x faster than OpenSearch for the data load alone, and still 6.5x faster including index creation.
Inserting 287,360 records into LanceDB in batches of 4,096...
Loading: 100%|██████████| 287360/287360 [03:14<00:00, 1475.05row/s]
Inserted 287,360 records in 194.8s
--- Creating IVF_HNSW_SQ index ---
rows=287,360 dim=1152 metric=cosine
num_partitions=1 m=32 ef_construction=300
Index created in 67.7s
Total time: 263.5s

Why the Difference?
OpenSearch ingestion goes through multiple layers: HTTP serialization, REST API parsing, Lucene segment writes, and JVM garbage collection. LanceDB writes directly to columnar files on disk with no network hop, no JVM, and no serialization overhead.
Query Results
Both systems return the same top result for a query using the first image embedding (a man on a moped):
(Top-10 result tables for OpenSearch and LanceDB omitted.)
OpenSearch reports cosine similarity scores (higher is better), LanceDB reports cosine distance (lower is better). Both retrieved the exact match at rank 1. The remaining results differ because OpenSearch uses HNSW with default Lucene parameters while LanceDB uses IVF_HNSW_SQ with scalar quantization; different index structures produce different approximate nearest neighbors beyond the exact match.
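This divergence can be quantified with a simple overlap@k between the two ranked ID lists. A minimal sketch (the IDs below are made up for illustration, not the benchmark's actual results):

```python
def overlap_at_k(ids_a, ids_b, k=10):
    """Fraction of the top-k IDs both systems agree on (order-insensitive)."""
    return len(set(ids_a[:k]) & set(ids_b[:k])) / k

# Hypothetical top-5 image_ids from each system; rank 1 matches, the tail differs.
opensearch_ids = [391895, 522418, 184613, 318219, 554625]
lancedb_ids    = [391895, 522418, 60623, 554625, 309022]
print(overlap_at_k(opensearch_ids, lancedb_ids, k=5))  # 0.6
```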
Look at the rightmost column: LanceDB returns image bytes directly in the search results. OpenSearch returns a file path string. To actually display an OpenSearch result, you need a second system (S3, a CDN, or a local filesystem mount).
Storage Architecture
This is where the two systems diverge most, and it drives the cost and complexity differences below.
OpenSearch: Vectors + References
OpenSearch Container (Docker/JVM)
├── HNSW index in JVM heap (~2.7 GB)
├── Lucene segments on EBS (~1.8 GB)
└── image_path: "data/coco_images/000000391895.jpg" <-- just a string
S3 / CDN / Filesystem (separate)
└── 287,360 JPEG files (~55 GB)
OpenSearch stores vectors in JVM heap for kNN search. The HNSW graph must fit entirely in memory. Images live somewhere else entirely. Your application needs to resolve the path, fetch the file, and serve it. That's additional infrastructure to deploy, secure, and pay for.
LanceDB: Everything Inline
Lance files on disk/S3
└── coco_clip_embeddings.lance/
├── vectors (1152-dim float32)
├── metadata (image_id, caption, etc.)
└── image_bytes (raw JPEG) <-- stored inline, ~46 GB total
LanceDB stores vectors, metadata, and image bytes together in columnar Lance files. A search query returns everything, including the image, in a single read.
import io
from PIL import Image

results = table.search(query_vec).limit(10).to_pandas()
img = Image.open(io.BytesIO(results.iloc[0]["image_bytes"]))

The Lance format is columnar with data stored in fragments, so reading vectors for search doesn't touch the image bytes column. Only when you access image_bytes does it read those pages. Memory-mapping lets the OS handle caching; LanceDB doesn't load everything into RAM.
AWS Cost Comparison
Using the cost estimator from the project at three scales:
287K Documents (This Benchmark)
OpenSearch needs a memory-optimized instance because the HNSW graph lives in JVM heap. LanceDB memory-maps from disk, so a 2 GB compute-optimized instance is sufficient.
1M Documents
At 1M vectors, OpenSearch needs to double its instance size. LanceDB stays on the same instance. The working set (memory-mapped pages actually accessed during queries) is still well under 1 GB.
10M Documents
At 10M vectors with 1152 dimensions, OpenSearch needs 94 GB of JVM heap for the HNSW graph. That requires an r6g.8xlarge, a 256 GB machine at $2.67/hr just for compute. LanceDB's working set is ~2 GB, served by a $0.14/hr instance.
Why the Gap Widens
OpenSearch cost scales with RAM because vectors must fit in JVM heap. Memory-optimized instances are expensive. LanceDB cost scales with storage (S3 at $0.023/GB/month) because it memory-maps columnar files and only loads the pages needed per query. Storage is cheap. As document counts grow, OpenSearch jumps to larger (and disproportionately expensive) instance tiers, while LanceDB's compute stays roughly flat.
Cost scaling (approximate):
OpenSearch: O(num_docs × dims × instance_price_per_GB_RAM)
LanceDB: O(num_docs × dims × s3_price_per_GB) + fixed_small_compute
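Plugging this benchmark's numbers into those scaling formulas gives a rough sanity check. The 2x HNSW graph-overhead factor and the ~160 KB average inline payload per document are assumptions for illustration, not the project's actual estimator:

```python
def opensearch_ram_gb(num_docs, dims, graph_overhead=2.0, bytes_per_float=4):
    """Approximate JVM heap: raw float32 vectors times an assumed HNSW graph overhead."""
    return num_docs * dims * bytes_per_float * graph_overhead / 1e9

def lancedb_storage_gb(num_docs, dims, payload_bytes_per_doc=160_000, bytes_per_float=4):
    """Approximate S3 footprint: vectors plus inline JPEG payload (~160 KB/doc assumed)."""
    return num_docs * (dims * bytes_per_float + payload_bytes_per_doc) / 1e9

print(f"{opensearch_ram_gb(10_000_000, 1152):.0f} GB heap")   # ~92 GB, near the 94 GB above
print(f"{lancedb_storage_gb(287_360, 1152):.0f} GB on S3")    # ~47 GB, near the 46.6 GB parquet
```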
Index Configuration
OpenSearch
OpenSearch uses HNSW with Lucene's defaults. The kNN index is configured at index creation:
"settings": {
"index": {
"knn": True,
"knn.algo_param.ef_search": 100,
}
},
"mappings": {
"properties": {
"embedding": {
"type": "knn_vector",
"dimension": dim,
"method": {
"name": "hnsw",
"space_type": "cosinesimil",
"engine": "lucene",
},
},
}
}
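Assembled into a complete request body, the fragment above would be handed to the client's index-creation call. A sketch assuming the opensearch-py client:

```python
def build_index_body(dim, ef_search=100):
    """Full request body combining the settings and mappings fragments above."""
    return {
        "settings": {"index": {"knn": True, "knn.algo_param.ef_search": ef_search}},
        "mappings": {
            "properties": {
                "embedding": {
                    "type": "knn_vector",
                    "dimension": dim,
                    "method": {
                        "name": "hnsw",
                        "space_type": "cosinesimil",
                        "engine": "lucene",
                    },
                },
            }
        },
    }

# Against a live cluster this would be roughly:
#   client = opensearchpy.OpenSearch("http://localhost:9200")
#   client.indices.create(index="coco-clip-embeddings", body=build_index_body(1152))
```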
LanceDB
The IVF_HNSW_SQ index parameters are derived from table statistics:
# Single HNSW graph for tables under 1M rows
num_partitions = 1 if num_rows < 1_000_000 else int(math.sqrt(num_rows))
# More graph connectivity for larger tables
m = 32 if num_rows > 100_000 else 20
ef_construction = 400 if num_rows > 500_000 else 300
table.create_index(
metric="cosine",
vector_column_name="vector",
index_type="IVF_HNSW_SQ",
num_partitions=num_partitions,
m=m,
ef_construction=ef_construction,
)
Scalar quantization (SQ) compresses each float32 to 8 bits during search, reducing memory bandwidth with minimal recall loss. The index builds in 68 seconds for 287K vectors.
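The quantization step can be sketched in NumPy: per-dimension min/max scaling maps each float32 to an unsigned byte, and dequantization recovers the value to within half a quantization step. A toy illustration of scalar quantization, not Lance's implementation:

```python
import numpy as np

def sq_encode(vectors):
    """Quantize float32 vectors to uint8 with per-dimension min/max scaling."""
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = np.where(hi > lo, (hi - lo) / 255.0, 1.0)
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def sq_decode(codes, lo, scale):
    """Map uint8 codes back to approximate float32 values."""
    return codes.astype(np.float32) * scale + lo

rng = np.random.default_rng(0)
vecs = rng.standard_normal((1000, 1152)).astype(np.float32)
codes, lo, scale = sq_encode(vecs)
approx = sq_decode(codes, lo, scale)

# 4x smaller, and every value is recovered within half a step (plus float noise).
assert codes.nbytes == vecs.nbytes // 4
assert np.max(np.abs(approx - vecs)) <= scale.max() / 2 + 1e-4
```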
Migration Path
The project includes a live migration script that reads from an OpenSearch index via scroll API and writes to LanceDB, pulling image bytes inline:
# Scroll through OpenSearch documents
for doc in scroll_opensearch(client, INDEX_NAME):
    image_path = IMAGES_DIR / doc["file_name"]
    image_bytes = image_path.read_bytes() if image_path.exists() else b""
    records.append({
        "image_id": doc["image_id"],
        "vector": doc["embedding"],
        "image_bytes": image_bytes,  # inline the image
        # ... metadata fields
    })
You can migrate incrementally without regenerating embeddings. The vectors come from OpenSearch, the images from disk, and everything lands in a single LanceDB table.
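The scroll_opensearch helper isn't shown in the snippet above; a plausible implementation over the scroll API, assuming opensearch-py's client interface, looks like this. Taking the client as a plain parameter keeps it testable against a stub:

```python
def scroll_opensearch(client, index, page_size=1000, keep_alive="2m"):
    """Yield every _source document from `index` via the scroll API."""
    resp = client.search(index=index, scroll=keep_alive,
                         body={"query": {"match_all": {}}, "size": page_size})
    sid = resp["_scroll_id"]
    hits = resp["hits"]["hits"]
    while hits:
        for hit in hits:
            yield hit["_source"]
        resp = client.scroll(scroll_id=sid, scroll=keep_alive)
        sid = resp["_scroll_id"]
        hits = resp["hits"]["hits"]
    client.clear_scroll(scroll_id=sid)  # release the server-side scroll context
```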
When to Use Which
Choose OpenSearch when:
- You already run an Elasticsearch/OpenSearch cluster for full-text search and want to add vector search alongside it
- You need multi-tenancy with fine-grained access control
- Your team has existing operational expertise with the Elastic ecosystem
- You need real-time index updates with immediate consistency
Choose LanceDB when:
- Vector search is the primary use case, not an add-on
- You want to store images (or other binary data) inline with vectors
- Cost matters, especially at scale, where the gap between RAM and storage pricing widens
- You want to eliminate external storage infrastructure
- Your workload is bursty and benefits from scale-to-zero
Summary
- Ingestion: LanceDB is 8.8x faster at bulk loading (1,475 rows/s vs 167 rows/s) and 6.5x faster end to end including the index build, primarily because it writes directly to disk without HTTP/JVM overhead
- Cost: OpenSearch is 4.7x more expensive at 287K docs and 13.8x more expensive at 10M docs, driven by JVM heap requirements forcing memory-optimized instances
- Storage model: LanceDB's inline image storage eliminates the need for a separate S3/CDN layer, reducing both cost and architectural complexity
- Memory: OpenSearch loads the entire HNSW graph into JVM heap; LanceDB memory-maps columnar files and reads only the pages needed per query
- Scaling: The cost gap widens with scale because OpenSearch scales with expensive RAM while LanceDB scales with cheap storage
- Migration: You can migrate from OpenSearch to LanceDB incrementally without regenerating embeddings
The numbers above reflect a specific workload (image embeddings with large inline payloads); pure vector-only workloads without image storage would narrow the gap. But for applications where data self-containment matters (search results that include the actual content, not just references to it), LanceDB's embedded approach is compelling.
All code, benchmarks, and the cost estimator are available at opensearch-lancedb-migration.
The dataset is available on Hugging Face here: jrmiller/coco-2017-siglip2-embeddings