Blog category:

Engineering

Engineering

Lance File 2.1 is Now Stable

Weston Pace / October 3, 2025

The 2.1 file version is now stable. Learn what that means for you and what's coming next.

Engineering

Introducing Lance Data Viewer: A Simple Way to Explore Lance Tables

Gordon Murray / September 24, 2025

A lightweight open source web UI for exploring Lance datasets, viewing schemas, and browsing table data with vector visualization support.

Engineering

LanceDB's RaBitQ Quantization for Blazing Fast Vector Search

David Myriel, Yang Cen / September 17, 2025

Introducing RaBitQ quantization in LanceDB for higher compression, faster indexing, and better recall on high‑dimensional embeddings.

Engineering

Building Semantic Video Recommendations with TwelveLabs and LanceDB

David Myriel / September 16, 2025

Build semantic video recommendations using TwelveLabs embeddings, LanceDB storage, and Geneva pipelines with Ray.

Engineering

Setup Real-Time Multimodal AI Analytics with Apache Fluss (incubating) and Lance

Wayne Wang / September 8, 2025

Learn how to build real-time multimodal AI analytics by integrating Apache Fluss streaming storage with Lance's AI-optimized lakehouse. This guide demonstrates streaming multimodal data processing for RAG systems and ML workflows.

Engineering

Productionalize AI Workloads with Lance Namespace, LanceDB, and Ray

Jack Ye / September 4, 2025

Learn how to productionalize AI workloads with Lance Namespace's enterprise stack integration and the scalability of LanceDB and Ray for end-to-end ML pipelines.

Engineering

LanceDB's Geneva: Scalable Feature Engineering

Jonathan Hsieh / August 21, 2025

Learn how to build scalable feature engineering pipelines with Geneva and LanceDB. This demo transforms image data into rich features including captions, embeddings, and metadata using distributed Ray clusters.

Engineering

LanceDB WikiSearch: Native Full-Text Search on 41M Wikipedia Docs

Ayush Chaurasia / August 11, 2025

No more Tantivy! We stress-tested native full-text search in our latest massive-scale search demo. Let's break down how it works and what we did to scale it.

Engineering

Manage Lance Tables in Any Catalog using Lance Namespace and Spark

Jack Ye / August 8, 2025

Access and manage your Lance tables in Hive, Glue, Unity Catalog, or any catalog service using Lance Namespace with the latest Lance Spark connector.

Engineering

Columnar File Readers in Depth: Structural Encoding

Weston Pace / August 7, 2025

A deep dive into LanceDB's dual structural encoding approach: mini-block for small data types and full-zip for large multimodal data. Learn how this optimizes compression and random access performance compared to Parquet.

Engineering

S3 Vectors vs LanceDB: Cost, Latency, and the Hidden Trade-offs

David Myriel / August 1, 2025

Is it worth the hype? Comparing Amazon S3 Vectors and LanceDB for RAG and agentic systems.

Engineering

What is the LanceDB Multimodal Lakehouse?

David Myriel / June 23, 2025

Introducing the Multimodal Lakehouse - a unified platform for managing AI data from raw files to production-ready features, now part of LanceDB Enterprise.

Engineering

Columnar File Readers in Depth: Repetition & Definition Levels

Weston Pace / June 2, 2025

Explore repetition and definition levels in this installment of Columnar File Readers in Depth, with practical insights and expert guidance from the LanceDB team.

Engineering

Columnar File Readers in Depth: Column Shredding

Weston Pace / May 15, 2025

Explore column shredding in this installment of Columnar File Readers in Depth, with practical insights and expert guidance from the LanceDB team.

Engineering

Columnar File Readers in Depth: Compression Transparency

Weston Pace / April 29, 2025

Explore compression transparency in this installment of Columnar File Readers in Depth, with practical insights and expert guidance from the LanceDB team.

Engineering

A Practical Guide to Training Custom Rerankers

Ayush Chaurasia / April 10, 2025

A practical guide to training custom rerankers, with insights and expert guidance from the LanceDB team.

Engineering

The Future of Open Source Table Formats: Apache Iceberg and Lance

Jack Ye / April 8, 2025

Explore the future of open source table formats, Apache Iceberg and Lance, with practical insights and expert guidance from the LanceDB team.

Engineering

Lance File 2.1: Smaller and Simpler

Weston Pace / March 27, 2025

Explore Lance File 2.1, smaller and simpler, with practical insights and expert guidance from the LanceDB team.

Engineering

RAG with GRPO Fine-Tuned Reasoning Model

Mahesh Deshwal / March 24, 2025

Explore RAG with a GRPO fine-tuned reasoning model, with practical insights and expert guidance from the LanceDB team.

Engineering

Creating a FinTech AI Agent From Scratch

Vipul Maheshwari / February 27, 2025

Explore creating a FinTech AI agent from scratch, with practical insights and expert guidance from the LanceDB team.

Engineering

Chunking Analysis: Which is the right chunking approach for your language?

Shresth Shukla / January 27, 2025

Explore chunking analysis and which chunking approach is right for your language, with practical insights and expert guidance from the LanceDB team.

Engineering

Agentic RAG Using LangGraph: Build an Autonomous Customer Support Agent

LanceDB / January 26, 2025

Build an autonomous customer support agent using LangGraph and LanceDB that automatically fetches, classifies, drafts, and responds to emails with RAG-powered policy retrieval.

Engineering

Python Package to convert image datasets to lance type

Vipul Maheshwari / December 9, 2024

Explore a Python package that converts image datasets to Lance type, with practical insights and expert guidance from the LanceDB team.

Engineering

Late Interaction & Efficient Multi-modal Retrievers Need More Than a Vector Index

Ayush Chaurasia / September 18, 2024

Explore why late interaction and efficient multi-modal retrievers need more than a vector index, with practical insights and expert guidance from the LanceDB team.

Engineering

The Case for Random Access I/O

LanceDB / August 20, 2024

One of the reasons we started the Lance file format and have been investigating new encodings is that we wanted a format with better support for random access.

Engineering

My Summer Internship Experience at LanceDB

Raunak Sinha / August 15, 2024

I'm Raunak, a master's student at the University of Illinois, Urbana-Champaign. This summer, I had the opportunity to intern as a Software Engineer at LanceDB, an early-stage startup based in San Francisco.

Engineering

Columnar File Readers in Depth: APIs and Fusion

Weston Pace / June 18, 2024

The API used to read files has evolved over time, from simple full table reads to batch reads and eventually to iterative record batch readers. Lance takes this a step further to return a stream of read tasks.

Engineering

Developers, Ditch the Black Box: Welcome to Continue

LanceDB / May 23, 2024

Remember flipping through coding manuals? Those quickly became relics with the rise of Google and Stack Overflow, a one-stop shop for developer queries.

Engineering

Columnar File Readers in Depth: Parallelism without Row Groups

Weston Pace / May 14, 2024

Explore parallelism without row groups in this installment of Columnar File Readers in Depth, with practical insights and expert guidance from the LanceDB team.

Engineering

Benchmarking Cohere Rerankers with LanceDB

LanceDB / May 7, 2024

Improve retrieval quality by reranking LanceDB results with Cohere and ColBERT. You’ll plug rerankers into vector, FTS, and hybrid search and compare accuracy on real datasets.

Engineering

Lance v2: A New Columnar Container Format

Weston Pace / April 13, 2024

Explore Lance v2, a new columnar container format, with practical insights and expert guidance from the LanceDB team.

Engineering

Effortlessly Loading and Processing Images with Lance: a Code Walkthrough

LanceDB / March 29, 2024

Working with large image datasets in machine learning can be challenging, often requiring significant computational resources and efficient data-handling techniques.

Engineering

A Practical Guide to Fine-Tuning Embedding Models

Ayush Chaurasia / March 25, 2024

A practical guide to fine-tuning embedding models, with insights and expert guidance from the LanceDB team.

Engineering

Columnar File Readers in Depth: Backpressure

Weston Pace / March 25, 2024

Streaming data applications can be tricky. When you can read data faster than you can process it, bad things tend to happen. The various solutions to this problem are largely classified as backpressure.

Engineering

Designing a Table Format for ML Workloads

Weston Pace / March 25, 2024

Explore designing a table format for ML workloads, with practical insights and expert guidance from the LanceDB team.

Engineering

GraphRAG: Hierarchical Approach to Retrieval-Augmented Generation

Akash Desai / March 25, 2024

Explore GraphRAG, a hierarchical approach to retrieval-augmented generation, with practical insights and expert guidance from the LanceDB team.

Engineering

Track AI Trends: CrewAI Agents & RAG

LanceDB / March 25, 2024

This article shows how to build an AI Trends Searcher using CrewAI Agents and their Tasks. Before diving in, let's first understand what CrewAI is and how we can use it for these applications.

Engineering

Multimodal Myntra Fashion Search Engine Using LanceDB

LanceDB / March 20, 2024

Build a multimodal fashion search engine with LanceDB and CLIP embeddings. Follow a step‑by‑step workflow to register embeddings, create the table, query by text or image, and ship a Streamlit UI.

Engineering

Custom Datasets for Efficient LLM Training Using Lance

LanceDB / March 8, 2024

Learn about custom datasets for efficient LLM training using Lance. Get practical steps, examples, and best practices you can use now.

Engineering

Implementing Corrective RAG in the Easiest Way

LanceDB / March 4, 2024

Even though text-generation models are good at generating content, they sometimes fall short at returning facts. This happens because of the way they are trained.

Engineering

Hybrid Search and Custom Reranking with LanceDB

LanceDB / February 19, 2024

Combine keyword and vector search for higher‑quality results with LanceDB. This post shows how to run hybrid search and compare rerankers (linear combination, Cohere, ColBERT) with code and benchmarks.

Engineering

Hybrid Search: RAG for Real-Life Production-Grade Applications

Mahesh Deshwal / February 18, 2024

Learn about hybrid search and RAG for real-life, production-grade applications. Get practical steps, examples, and best practices you can use now.

Engineering

Efficient RAG with Compression and Filtering

Kaushal Choudhary / January 9, 2024

Learn about efficient RAG with compression and filtering. Get practical steps, examples, and best practices you can use now.

Engineering

Inverted File Product Quantization (IVF_PQ): Accelerate Vector Search by Creating Indices

LanceDB / December 17, 2023

Compress vectors with PQ and accelerate retrieval with IVF_PQ in LanceDB. The tutorial explains the concepts, memory savings, and a minimal implementation with search tuning knobs.

Engineering

Modified RAG: Parent Document & Bigger Chunk Retriever

Mahesh Deshwal / December 15, 2023

Learn about modified RAG with parent document and bigger chunk retrievers. Get practical steps, examples, and best practices you can use now.

Engineering

Search Within an Image with Segment Anything

Kaushal Choudhary / December 12, 2023

Learn how to search within an image with Segment Anything. Get practical steps, examples, and best practices you can use now.

Engineering

MemGPT: OS Inspired LLMs That Manage Their Own Memory

Ayush Chaurasia / December 11, 2023

Explore MemGPT: OS-inspired LLMs that manage their own memory. Get practical steps, examples, and best practices you can use now.

Engineering

Hybrid Search: Combining BM25 and Semantic Search for Better Results with Langchain

LanceDB / December 9, 2023

Have you ever thought about how search engines find exactly what you're looking for? They usually use a mix of looking for specific words and understanding the meaning behind them.

Engineering

Accelerate Vector Search Applications Using OpenVINO & LanceDB

LanceDB / December 6, 2023

In this article, we use CLIP from OpenAI for text-to-image and image-to-image search, and we compare the PyTorch model, the FP16 OpenVINO format, and the INT8 OpenVINO format in terms of speedup.

Engineering

Advanced RAG: Precise Zero-Shot Dense Retrieval with HyDE

LanceDB / November 27, 2023

In the world of search engines, the quest to find the most relevant information is a constant challenge. Researchers are always on the lookout for innovative ways to improve the effectiveness of search results.

Engineering

Better RAG with Active Retrieval Augmented Generation FLARE

LanceDB / November 17, 2023

By Akash A. Get practical steps and examples from 'Better RAG with Active Retrieval Augmented Generation FLARE'.

Engineering

GPU-Accelerated Indexing in LanceDB

LanceDB / November 2, 2023

Speed up vector index training in LanceDB with CUDA or Apple Silicon (MPS). See how GPU‑accelerated IVF/PQ training compares to CPU and how to enable it in code.

Engineering

Reduce Hallucinations from LLM-Powered Agents Using Long-Term Memory

LanceDB / July 19, 2023

Learn how to reduce hallucinations from LLM-powered agents using long-term memory. Get practical steps, examples, and best practices you can use now.

Engineering

Scalable Computer Vision with LanceDB & Voxel51

LanceDB / July 13, 2023

Explore scalable computer vision with LanceDB and Voxel51. Get practical steps, examples, and best practices you can use now.

Engineering

Lance, Windows. Windows, Lance

Chang She / May 31, 2023

It was the spring of 2012. After being an avid user for 2+ years, I finally decided to join Wes McKinney and work on pandas full time.

Engineering

My SIMD Is Faster than Yours

LanceDB / April 24, 2023

An untold story about how we make LanceDB vector search fast. Get practical steps and examples from 'My SIMD is faster than Yours'.

Engineering

Benchmarking Random Access in Lance

LanceDB / March 14, 2023

In this short blog post, we’ll take you through some simple benchmarks to show the random access performance of the Lance format. Get practical steps and examples from 'Benchmarking random access in Lance'.