📄 Lance Blob V2: Making Multimodal Data a First-Class Citizen in the Lakehouse

Lance Blob V2 introduces four storage semantics (inline, packed, dedicated, and external) so that Lance can automatically choose how each blob is stored based on its size and access pattern. You no longer have to trade efficient small reads against the cost of rewriting large objects.
By separating storage layout decisions at the format layer, datasets can efficiently handle everything from KB-scale thumbnails to multi-GB videos. The result is fewer unnecessary rewrites, better locality for small data, and scalable access patterns for large blobs.
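As a rough illustration of size-based layout selection, the sketch below maps a blob to one of the four semantics named above. It is not Lance's implementation: the function `choose_layout`, the `stored_elsewhere` flag, both byte thresholds, and the reading of each semantic given in the comments are assumptions made for this example.

```python
from enum import Enum


class BlobLayout(Enum):
    INLINE = "inline"        # assumed: kept alongside the row data
    PACKED = "packed"        # assumed: grouped with other blobs in a shared file
    DEDICATED = "dedicated"  # assumed: written to a file of its own
    EXTERNAL = "external"    # assumed: referenced in place, not copied


# Hypothetical cutoffs for illustration only; not Lance's real thresholds.
INLINE_MAX_BYTES = 64 * 1024
PACKED_MAX_BYTES = 4 * 1024 * 1024


def choose_layout(size_bytes: int, stored_elsewhere: bool = False) -> BlobLayout:
    """Pick a layout from the blob size and whether the bytes live outside the dataset."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if stored_elsewhere:
        # The data already lives somewhere else, so only a reference is kept.
        return BlobLayout.EXTERNAL
    if size_bytes <= INLINE_MAX_BYTES:
        # Small values benefit from locality with the rest of the row.
        return BlobLayout.INLINE
    if size_bytes <= PACKED_MAX_BYTES:
        # Mid-sized values share a file to avoid creating many tiny objects.
        return BlobLayout.PACKED
    # Large values get their own file so they are never rewritten with other data.
    return BlobLayout.DEDICATED
```

Under these assumed thresholds, a 10 KB thumbnail would be stored inline and a 2 GB video would get a dedicated file, which matches the KB-to-GB range described above.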
🤗 A Guide to Uploading Lance Datasets on the Hugging Face Hub

You can now upload a Lance dataset—including data, indexes, and versions—directly to Hugging Face and query it via hf:// without downloading. Vector search, full-text search, SQL, and nested filtering are all supported out of the box.
Updates are incremental: new columns like embeddings or labels can be added without rewriting existing data. This makes it practical to evolve datasets over time while preserving existing blobs and indexes.
🦞 Why LanceDB Is the Most Natural Memory Layer for OpenClaw

OpenClaw agents persist memory across sessions, and LanceDB is emerging as the default storage layer for that memory. It runs embedded (no service required) and stores embeddings, metadata, and indexes together in a single table.
This enables unified querying—vector, full-text, and structured filtering—over agent memory. Combined with append-friendly storage, it matches how agents accumulate and retrieve knowledge over time.
📖 Also Published This Month
- Lance File Format 2.2: Taming Complex Data
- Memory for OpenClaw: From Zero to LanceDB Pro
- OpenClaw + LanceDB + Seed 2.0: Turn Visual Ideas into Reality, Fast!
📅 Upcoming Events

Data Engineering Open Forum – April 16 in SF
Jack Ye and Pablo Delgado, ML Engineer at Netflix, will present on multimodal feature engineering at scale at Netflix, covering how LanceDB supports large-scale storage, retrieval, and dataset workflows.
🗓️ Session details: https://www.dataengineeringopenforum.com/?session=powering-netflixs-multimodal-feature-engineering-at-scale#agenda
🔗 Registration link: https://luma.com/deof2026?utm_source=li-speaker

TokioConf 2026 – April 20 in Portland
Weston Pace & Lu Qiu will share a deep dive into optimizing a Rust-native search database, focusing on I/O scheduling, async profiling, and achieving storage-level performance.
🗓️ Conference schedule: https://www.tokioconf.com/schedule
🏗️ LanceDB Enterprise Updates
🌟 Open Source Releases
🫶 Community Contributions
Thank you to contributors from Netflix, Uber, ByteDance, Huawei, Baidu, and LinkedIn for improvements across storage, indexing, query execution, distributed processing, and ecosystem integrations in LanceDB, Lance, lance-spark, and other projects.
Notable contributions this month:
- @beinan — Enabled vector-first ANN integration in lance-graph, bringing hybrid graph + vector reranking into Cypher workflows
- @Mesut-Doner — Introduced type-safe expression APIs in Rust, improving composability and safety of query construction
- @pratik0316 — Added type-safe expression builder API in Python, aligning query ergonomics across SDKs
- @nyl3532016 — Extended vector search capabilities with prefiltering support across Spark and core query paths
- @burlacio — Expanded cloud storage support with Azure ADLS Gen2 (abfss://) integration across the ecosystem
- @XuQianJin-Stars — Added atomic multi-table transaction support, enabling more reliable multi-dataset workflows
- @yingjianwu98 — Improved storage efficiency with encoding and compression enhancements for complex data layouts
- @HemantSudarshan — Added Levenshtein-based schema suggestions, improving developer experience in query debugging
- @LuciferYang — Improved Spark execution reliability and performance with fixes across scan planning and Arrow integration
- @mrncstt — Enabled structured updates via dict→SQL conversion, improving usability of update workflows
A heartfelt thank you to everyone in the community who contributed to Lance and LanceDB this past month:
@VedantMadane • @pratik0316 • @lennylxx • @majiayu000 • @myandpr • @marca116 • @dask-58 • @Mesut-Doner • @mrncstt • @omair445 • @veeceey • @Abhisheklearn12 • @ChinmayGowda71 • @Zelys-DFKH • @BillionClaw • @octo-patch • @sinianlouye • @ddupg • @yingjianwu98 • @xloya • @zhangyue19921010 • @HemantSudarshan • @nyl3532016 • @fangbo • @ndpvt-web • @XuQianJin-Stars • @burlacio • @wojiaodoubao • @fenfeng9 • @dardourimohamed • @cheungxi • @cijiugechu • @erandagan • @acking-you • @Gallardot • @wombatu-kun • @FarmerChillax • @shepmaster • @majin1102 • @ztorchan • @yanghua • @touch-of-grey • @bryanck • @fecet • @apoc • @rahil-c • @AndreaBozzo • @durch • @LuciferYang • @dik654 • @chyyran • @beinan • @ChunxuTang • @aheev • @leiyuou • @jja725 • @jiaoew1991 • @a-sane • @ivscheianu • @jtuglu1 • @mikewhb
🤝 Lance Community Sync Recap
This month’s community syncs focused on the Lance 3.0 and upcoming 4.0 releases, including adoption of the 2.2 file format and ongoing improvements to indexing and query performance. Ecosystem momentum continues to build with Lance as a core DuckDB extension, a new PrestoDB connector, and early discussion of distributed vector indexing with significant build speed improvements.
The next Lance Community Sync will take place on Thursday, April 9.



