Vector Search with LanceDB

A vector search finds the approximate or exact nearest neighbors to a given query vector.

In a recommendation system or search engine, you can use it to find records similar to the one you searched for.
In LLM and other AI applications, each data point can be represented by embeddings generated from existing models, and the search then returns the most relevant features.

Distance metrics

Distance metrics are a measure of the similarity between a pair of vectors. Currently, LanceDB supports the following metrics:

    Metric     Description
    l2         Euclidean / L2 distance
    cosine     Cosine similarity
    dot        Dot product
    hamming    Hamming distance

Note: The hamming metric is only available for binary vectors.

Exhaustive search (kNN)

If you do not create a vector index, LanceDB exhaustively scans the entire vector space and computes the distance to every vector in order to find the exact nearest neighbors. This is effectively a kNN search.

Python (sync API):

    import lancedb
    import numpy as np

    uri = "data/sample-lancedb"
    db = lancedb.connect(uri)
    # 10,000 random 1536-dimensional float32 vectors, one per item
    data = [
        {"vector": row, "item": f"item {i}"}
        for i, row in enumerate(np.random.random((10_000, 1536)).astype("float32"))
    ]
    tbl = db.create_table("vector_search", data=data)
    # No index has been created, so this scans every vector
    tbl.search(np.random.random((1536))).limit(10).to_list()

Python (async API):

    import lancedb
    import numpy as np

    uri = "data/sample-lancedb"
    async_db = await lancedb.connect_async(uri)
    data = [
        {"vector": row, "item": f"item {i}"}
        for i, row in enumerate(np.random.random((10_000, 1536)).astype("float32"))
    ]
    async_tbl = await async_db.create_table("vector_search_async", data=data)
    await (await async_tbl.search(np.random.random((1536)))).limit(10).to_list()

TypeScript (@lancedb/lancedb):

    import * as lancedb from "@lancedb/lancedb";

    const db = await lancedb.connect(databaseDir);
    const tbl = await db.openTable("my_vectors");

    const results1 = await tbl.search(Array(128).fill(1.2)).limit(10).toArray();

TypeScript (vectordb, deprecated):

    import * as lancedb from "vectordb";

    const db = await lancedb.connect("data/sample-lancedb");
    const tbl = await db.openTable("my_vectors");

    const results_1 = await tbl.search(Array(1536).fill(1.2)).limit(10).execute();

By default, l2 is used as the metric type. You can specify cosine or dot instead if required.

Python (sync API):

    tbl.search(np.random.random((1536))).distance_type("cosine").limit(10).to_list()

Python (async API):

    (
        await (await async_tbl.search(np.random.random((1536))))
        .distance_type("cosine")
        .limit(10)
        .to_list()
    )

TypeScript (@lancedb/lancedb):

    const results2 = await (
      tbl.search(Array(128).fill(1.2)) as lancedb.VectorQuery
    )
      .distanceType("cosine")
      .limit(10)
      .toArray();

TypeScript (vectordb, deprecated):

    const results_2 = await tbl
      .search(Array(1536).fill(1.2))
      .metricType(lancedb.MetricType.Cosine)
      .limit(10)
      .execute();

Approximate nearest neighbor (ANN) search

To perform scalable vector retrieval with acceptable latencies, it's common to build a vector index.
An exhaustive search always achieves 100% recall, whereas an ANN search is approximate, so using an index usually involves a trade-off between recall and latency.

See the IVF_PQ index for a deeper description of how IVF_PQ indexes work in LanceDB.

Binary vector

LanceDB supports binary vectors as a data type and can search them using Hamming distance. Binary vectors are stored as uint8 arrays, with every 8 bits packed into one byte.

Note: The dimension of a binary vector must be a multiple of 8. A vector of dimension 128 is stored as a uint8 array of size 16.
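To make the storage rule above concrete, here is a minimal pure-Python sketch of bit packing and Hamming distance. The helpers `pack_bits` and `hamming_distance` are hypothetical names written for illustration and are not part of LanceDB. The packing assumes most-significant-bit-first order, which is the default order of `np.packbits`.

```python
def pack_bits(bits):
    """Pack a list of 0/1 values into byte values, most significant bit first."""
    if len(bits) % 8 != 0:
        raise ValueError("dimension must be a multiple of 8")
    packed = []
    for start in range(0, len(bits), 8):
        byte = 0
        for bit in bits[start:start + 8]:
            byte = (byte << 1) | (bit & 1)
        packed.append(byte)
    return packed


def hamming_distance(a, b):
    """Count the differing bits between two packed vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same packed length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# A 128-dimensional binary vector packs into 128 / 8 = 16 bytes.
vec_a = [1, 0] * 64   # alternating bits
vec_b = [1, 1] * 64   # all ones
packed_a = pack_bits(vec_a)
packed_b = pack_bits(vec_b)

print(len(packed_a))                         # 16
print(hamming_distance(packed_a, packed_b))  # 64
```

The two vectors differ in every second position, so 64 of the 128 bits differ, and the distance is computed directly on the packed bytes without unpacking them.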
Python (sync API):

    import lancedb
    import numpy as np
    import pyarrow as pa

    db = lancedb.connect("data/binary_lancedb")
    schema = pa.schema(
        [
            pa.field("id", pa.int64()),
            # for dim=256, lance stores every 8 bits in a byte
            # so the vector field should be a list of 256 / 8 = 32 bytes
            pa.field("vector", pa.list_(pa.uint8(), 32)),
        ]
    )
    tbl = db.create_table("my_binary_vectors", schema=schema)

    data = []
    for i in range(1024):
        vector = np.random.randint(0, 2, size=256)
        # pack the binary vector into bytes to save space
        packed_vector = np.packbits(vector)
        data.append(
            {
                "id": i,
                "vector": packed_vector,
            }
        )
    tbl.add(data)

    query = np.random.randint(0, 2, size=256)
    packed_query = np.packbits(query)
    tbl.search(packed_query).distance_type("hamming").to_arrow()

Python (async API):

    import lancedb
    import numpy as np
    import pyarrow as pa

    db = await lancedb.connect_async("data/binary_lancedb")
    schema = pa.schema(
        [
            pa.field("id", pa.int64()),
            # for dim=256, lance stores every 8 bits in a byte
            # so the vector field should be a list of 256 / 8 = 32 bytes
            pa.field("vector", pa.list_(pa.uint8(), 32)),
        ]
    )
    tbl = await db.create_table("my_binary_vectors", schema=schema)

    data = []
    for i in range(1024):
        vector = np.random.randint(0, 2, size=256)
        # pack the binary vector into bytes to save space
        packed_vector = np.packbits(vector)
        data.append(
            {
                "id": i,
                "vector": packed_vector,
            }
        )
    await tbl.add(data)

    query = np.random.randint(0, 2, size=256)
    packed_query = np.packbits(query)
    await (await tbl.search(packed_query)).distance_type("hamming").to_arrow()

TypeScript:

    import * as lancedb from "@lancedb/lancedb";
    import { Field, FixedSizeList, Int32, Schema, Uint8 } from "apache-arrow";

    // Assumption: the same database path as the Python examples
    const db = await lancedb.connect("data/binary_lancedb");

    const schema = new Schema([
      new Field("id", new Int32(), true),
      new Field("vec", new FixedSizeList(32, new Field("item", new Uint8()))),
    ]);
    const data = lancedb.makeArrowTable(
      Array(1_000)
        .fill(0)
        .map((_, i) => ({
          // the 256 bits are stored in 32 bytes;
          // if your data is already in this format, you can skip the packBits step
          id: i,
          vec: lancedb.packBits(Array(256).fill(i % 2)),
        })),
      { schema: schema },
    );

    const tbl = await db.createTable("binary_table", data);
    await tbl.createIndex("vec", {
      config: lancedb.Index.ivfFlat({
        numPartitions: 10,
        distanceType: "hamming",
      }),
    });

    // the query is already packed: 32 random byte values
    const query = Array(32)
      .fill(1)
      .map(() => Math.floor(Math.random() * 255));
    const results = await tbl.query().nearestTo(query).limit(10).toArrow();

Search with distance range

You can also search for vectors within a specific distance range from the query vector. This is useful when you want every vector whose distance falls inside given bounds rather than only the nearest neighbors. Use the distance_range method for this.

Python (sync API):

    import lancedb
    import numpy as np

    db = lancedb.connect("data/distance_range_demo")
    data = [
        {
            "id": i,
            "vector": np.random.random(256),
        }
        for i in range(1024)
    ]
    tbl = db.create_table("my_table", data=data)
    query = np.random.random(256)

    # Search for the vectors within the range of [0.1, 0.5)
    tbl.search(query).distance_range(0.1, 0.5).to_arrow()

    # Search for the vectors with the distance less than 0.5
    tbl.search(query).distance_range(upper_bound=0.5).to_arrow()

    # Search for the vectors with the distance greater or equal to 0.1
    tbl.search(query).distance_range(lower_bound=0.1).to_arrow()

Python (async API):

    import lancedb
    import numpy as np

    db = await lancedb.connect_async("data/distance_range_demo")
    data = [
        {
            "id": i,
            "vector": np.random.random(256),
        }
        for i in range(1024)
    ]
    tbl = await db.create_table("my_table", data=data)
    query = np.random.random(256)

    # Search for the vectors within the range of [0.1, 0.5)
    await (await tbl.search(query)).distance_range(0.1, 0.5).to_arrow()

    # Search for the vectors with the distance less than 0.5
    await (await tbl.search(query)).distance_range(upper_bound=0.5).to_arrow()

    # Search for the vectors with the distance greater or equal to 0.1
    await (await tbl.search(query)).distance_range(lower_bound=0.1).to_arrow()

TypeScript (@lancedb/lancedb):

    import * as lancedb from "@lancedb/lancedb";

    const results3 = await (
      tbl.search(Array(128).fill(1.2)) as lancedb.VectorQuery
    )
      .distanceType("cosine")
      .distanceRange(0.1, 0.2)
      .limit(10)
      .toArray();

Output search results

LanceDB can return vector search results in several formats commonly used in Python. Let's create a LanceDB table with a nested schema:

Python (sync API):

    from datetime import datetime

    import lancedb
    import numpy as np
    from lancedb.pydantic import Vector, LanceModel
    from pydantic import BaseModel

    class Metadata(BaseModel):
        source: str
        timestamp: datetime

    class Document(BaseModel):
        content: str
        meta: Metadata

    class LanceSchema(LanceModel):
        id: str
        vector: Vector(1536)
        payload: Document

    # Let's add 100 sample rows to our dataset
    data = [
        LanceSchema(
            id=f"id{i}",
            vector=np.random.randn(1536),
            payload=Document(
                content=f"document{i}",
                meta=Metadata(source=f"source{i % 10}", timestamp=datetime.now()),
            ),
        )
        for i in range(100)
    ]

    # Synchronous client
    tbl = db.create_table("documents", data=data)

Python (async API), using the same imports, models, and data as the sync example:

    async_tbl = await async_db.create_table("documents_async", data=data)

As a PyArrow table

Using to_arrow() we can get the results back
as a pyarrow Table. This result table has the same columns as the LanceDB table, with the addition of a _distance column for vector search or a score column for full text search.

Sync API:

    tbl.search(np.random.randn(1536)).to_arrow()

Async API:

    await (await async_tbl.search(np.random.randn(1536))).to_arrow()

As a Pandas DataFrame

You can also get the results as a pandas DataFrame.

Sync API:

    tbl.search(np.random.randn(1536)).to_pandas()

Async API:

    await (await async_tbl.search(np.random.randn(1536))).to_pandas()

Formats such as Arrow, Pydantic, and Python dicts handle nested schemas naturally, but pandas can only store nested data as a column of Python dicts, which makes nested fields awkward to reference. For convenience, you can tell LanceDB to flatten a nested schema when creating the pandas DataFrame.

Sync API:

    tbl.search(np.random.randn(1536)).to_pandas(flatten=True)

If your table has a deeply nested struct, you can control how many levels of nesting to flatten by passing in a positive integer.

Sync API:

    tbl.search(np.random.randn(1536)).to_pandas(flatten=1)

Note: flatten is not yet supported with our asynchronous client.

As a list of Python dicts

You can also return results as a list of Python dicts.

Sync API:

    tbl.search(np.random.randn(1536)).to_list()

Async API:

    await (await async_tbl.search(np.random.randn(1536))).to_list()

As a list of Pydantic models

Just as we can add data using Pydantic models, we can retrieve results as Pydantic models.

Sync API:

    tbl.search(np.random.randn(1536)).to_pydantic(LanceSchema)

Note: to_pydantic() is not yet supported with our asynchronous client.

Note that in this case the extra _distance field is discarded since it's not part of the LanceSchema.
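The effect of the flatten option described in the pandas section can be pictured with a small pure-Python sketch. The helper `flatten_row` is hypothetical and is not how LanceDB implements flattening. The dot-separated column names are an assumption modelled on how PyArrow names the children of a flattened struct.

```python
def flatten_row(row, levels=None):
    """Flatten nested dicts in `row`; `levels=None` flattens every level."""
    current = dict(row)
    depth = 0
    while levels is None or depth < levels:
        if not any(isinstance(value, dict) for value in current.values()):
            break  # nothing left to flatten
        flattened = {}
        for key, value in current.items():
            if isinstance(value, dict):
                # Promote each child to a top-level "parent.child" entry
                for child_key, child_value in value.items():
                    flattened[f"{key}.{child_key}"] = child_value
            else:
                flattened[key] = value
        current = flattened
        depth += 1
    return current


# One search result row shaped like the nested schema above (values are made up)
row = {
    "id": "id0",
    "payload": {"content": "document0", "meta": {"source": "source0"}},
    "_distance": 0.42,
}

print(flatten_row(row, levels=1))
# {'id': 'id0', 'payload.content': 'document0', 'payload.meta': {'source': 'source0'}, '_distance': 0.42}
print(flatten_row(row))
# {'id': 'id0', 'payload.content': 'document0', 'payload.meta.source': 'source0', '_distance': 0.42}
```

With one level, `payload.meta` is still a dict column. Flattening every level leaves only scalar columns, which are easier to filter and sort in pandas.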
Vector search with metadata prefiltering

Python:

    import lancedb
    from datasets import load_dataset

    # Connect to LanceDB
    db = lancedb.connect(
        uri="db://your-project-slug",
        api_key="your-api-key",
        region="us-east-1"
    )

    # Load query vector from dataset
    query_dataset = load_dataset("sunhaozhepy/ag_news_sbert_keywords_embeddings", split="test[5000:5001]")
    print(f"Query keywords: {query_dataset[0]['keywords']}")
    query_embed = query_dataset["keywords_embeddings"][0]

    # Open table and perform search
    table_name = "lancedb-cloud-quickstart"
    table = db.open_table(table_name)

    # Vector search with filters (pre-filtering is the default)
    search_results = (
        table.search(query_embed)
        .where("label > 2")
        .select(["text", "keywords", "label"])
        .limit(5)
        .to_pandas()
    )

    print("Search results (with pre-filtering):")
    print(search_results)

TypeScript:

    import * as lancedb from "@lancedb/lancedb";

    // Connect to LanceDB
    const db = await lancedb.connect({
      uri: "db://your-project-slug",
      apiKey: "your-api-key",
      region: "us-east-1"
    });

    // Generate a sample 768-dimension embedding vector (typical for BERT-based models)
    // In real applications, you would get this from an embedding model
    const dimensions = 768;
    const queryEmbed = Array.from({ length: dimensions }, () => Math.random() * 2 - 1);

    // Open table and perform search
    const tableName = "lancedb-cloud-quickstart";
    const table = await db.openTable(tableName);

    // Vector search with filters (pre-filtering is the default)
    const vectorResults = await table.search(queryEmbed)
      .where("label > 2")
      .select(["text", "keywords", "label"])
      .limit(5)
      .toArray();

    console.log("Search results (with pre-filtering):");
    console.log(vectorResults);

Vector search with metadata postfiltering

By default, pre-filtering is performed: the filter is applied before the vector search. This can narrow down the search space of a very large dataset and reduce query latency. Post-filtering is also available; it applies the filter to the results returned by the vector search.
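The difference between the two modes can be seen in a toy pure-Python sketch. This is an illustration of the general idea over an in-memory list, not LanceDB's implementation. The rows, the `label > 2` predicate, and the helper names are all made up for the example.

```python
import math

# Toy table: each row has a label and a 2-dimensional vector
rows = [
    {"id": 0, "label": 1, "vector": [0.0, 0.1]},
    {"id": 1, "label": 3, "vector": [0.9, 0.9]},
    {"id": 2, "label": 1, "vector": [0.1, 0.0]},
    {"id": 3, "label": 4, "vector": [0.2, 0.2]},
    {"id": 4, "label": 3, "vector": [1.0, 1.0]},
]
query = [0.0, 0.0]


def top_k(candidates, k):
    """Return the k candidates closest to the query by L2 distance."""
    return sorted(candidates, key=lambda r: math.dist(r["vector"], query))[:k]


def keep(row):
    """The filter predicate, equivalent to `label > 2`."""
    return row["label"] > 2


def prefilter_search(k):
    # Filter first, then take the k nearest among the survivors
    return top_k([r for r in rows if keep(r)], k)


def postfilter_search(k):
    # Take the k nearest overall, then drop rows that fail the filter
    return [r for r in top_k(rows, k) if keep(r)]


print([r["id"] for r in prefilter_search(3)])   # [3, 1, 4]
print([r["id"] for r in postfilter_search(3)])  # [3]
```

In this sketch, post-filtering returns fewer rows than the requested limit because two of the three nearest neighbors fail the predicate, while pre-filtering fills the limit from the matching rows only.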
You can use post-filtering as follows:

Python:

    results_post_filtered = (
        table.search(query_embed)
        .where("label > 1", prefilter=False)
        .select(["text", "keywords", "label"])
        .limit(5)
        .to_pandas()
    )

    print("Vector search results with post-filter:")
    print(results_post_filtered)

TypeScript:

    const vectorResultsWithPostFilter = await (table.search(queryEmbed) as lancedb.VectorQuery)
      .where("label > 2")
      .postfilter()
      .select(["text", "keywords", "label"])
      .limit(5)
      .toArray();

    console.log("Vector search results with post-filter:");
    console.log(vectorResultsWithPostFilter);

Batch query

LanceDB can process multiple similarity search requests in a single operation, rather than handling each query individually.

Python:

    # Load a batch of query embeddings
    query_dataset = load_dataset(
        "sunhaozhepy/ag_news_sbert_keywords_embeddings", split="test[5000:5005]"
    )
    query_embeds = query_dataset["keywords_embeddings"]
    batch_results = table.search(query_embeds).limit(5).to_pandas()
    print(batch_results)

TypeScript:

    // Batch query
    console.log("Performing batch vector search...");
    const batchSize = 5;
    const queryVectors = Array.from(
      { length: batchSize },
      () => Array.from(
        { length: dimensions },
        () => Math.random() * 2 - 1,
      ),
    );

    // Start from the first vector, then attach the remaining ones
    let batchQuery = table.search(queryVectors[0]) as lancedb.VectorQuery;
    for (let i = 1; i < batchSize; i++) {
      batchQuery = batchQuery.addQueryVector(queryVectors[i]);
    }

    const batchResults = await batchQuery
      .select(["text", "keywords", "label"])
      .limit(5)
      .toArray();

    console.log("Batch vector search results:");
    console.log(batchResults);

Note (batch query results): When processing batch queries, the results include a query_index field that associates each result set with its corresponding query in the input batch.

Other search options

Fast search

While vector indexing occurs asynchronously, newly added vectors are immediately searchable through a fallback brute-force search mechanism.
This means there is no delay between inserting data and being able to search it, though query response times may increase temporarily. To optimize for speed over completeness, enable the fast_search flag in your query to skip searching unindexed data.

Python:

    # sync API
    table.search(embedding, fast_search=True).limit(5).to_pandas()

    # async API
    await table.query().nearest_to(embedding).fast_search().limit(5).to_pandas()

TypeScript:

    await table
      .query()
      .nearestTo(embedding)
      .fastSearch()
      .limit(5)
      .toArray();

Bypass Vector Index

The bypass vector index feature prioritizes search accuracy over query speed by performing an exhaustive search across all vectors. Instead of using the approximate nearest neighbor (ANN) index, it compares the query vector directly against every vector in the table.

While this approach increases query latency, especially with large datasets, it provides exact, ground-truth results. This is particularly useful when:

- Evaluating ANN index quality
- Calculating recall metrics to tune the nprobes parameter
- Verifying search accuracy for critical applications
- Benchmarking approximate vs exact search results

Python:

    # sync API
    table.search(embedding).bypass_vector_index().limit(5).to_pandas()

    # async API
    await table.query().nearest_to(embedding).bypass_vector_index().limit(5).to_pandas()

TypeScript:

    await table
      .query()
      .nearestTo(embedding)
      .bypassVectorIndex()
      .limit(5)
      .toArray();
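One way to use the exact results for the recall calculation mentioned above is sketched below in plain Python. The helper `recall_at_k` is a hypothetical name, and the id lists are made-up placeholders standing in for the ids returned by an indexed search and by a bypass_vector_index() search for the same query.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact top-k ids that the approximate search also returned."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Placeholder ids for one query: exact (index bypassed) vs approximate (ANN index)
exact_ids = [12, 7, 33, 5, 18]
approx_ids = [12, 33, 7, 41, 18]
print(recall_at_k(approx_ids, exact_ids))  # 0.8

# Averaging over several queries gives a more stable estimate
pairs = [
    ([12, 33, 7, 41, 18], [12, 7, 33, 5, 18]),  # 4 of 5 match
    ([2, 9, 4, 1, 6], [2, 9, 4, 1, 6]),         # 5 of 5 match
]
mean_recall = sum(recall_at_k(a, e) for a, e in pairs) / len(pairs)
print(mean_recall)
```

Recall ignores ordering within the top k, so the two lists only need to contain the same ids. If the averaged value is too low for your application, raising nprobes and measuring again is the usual next step.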