Vector Search Tutorials

| Feature | Description |
| --- | --- |
| Vector Search | Learn the fundamentals of vector search, including how to perform similarity searches, use different distance metrics, and optimize performance. |
| Hybrid Search | Combine keyword-based search with vector search to improve retrieval accuracy and relevance. |
| Full-Text Search | Perform full-text search on your text data, and combine it with vector search for a powerful hybrid search experience. |
| Reranking | Refine your search results using reranking models to improve the relevance of the top-k results. |
| Multi-vector Search | Use multiple vector embeddings per document to perform more nuanced and accurate searches. |
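To make "similarity search" and "distance metrics" concrete, here is a minimal sketch of an exhaustive nearest-neighbor search in plain Python. It does not use the LanceDB API; the function names (`l2_distance`, `cosine_distance`, `top_k`) and the toy data are assumptions made up for illustration. It only shows how the choice of metric can change which rows count as "nearest".

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance: smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity: compares direction only, ignoring vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def top_k(query, rows, distance, k=2):
    """Brute-force search: score every row, then keep the k closest ids."""
    scored = [(distance(query, vec), doc_id) for doc_id, vec in rows.items()]
    return [doc_id for _, doc_id in sorted(scored)[:k]]

# Hypothetical toy data: three 2-dimensional "embeddings".
rows = {
    "a": [1.0, 0.0],
    "b": [10.0, 1.0],   # same direction as the query, but much longer
    "c": [0.0, 1.0],
}
query = [2.0, 0.2]

nearest_l2 = top_k(query, rows, l2_distance)
nearest_cosine = top_k(query, rows, cosine_distance)
print(nearest_l2, nearest_cosine)
```

With L2, row `b` ranks last because it is far from the query in absolute terms. With cosine distance it ranks first because it points in the same direction. A real vector database typically replaces the exhaustive loop with an index, which is where the performance tuning covered in the tutorials comes in.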

Examples

This section provides handpicked examples of applications built with LanceDB, showcasing its versatility and power.

| Example | Description |
| --- | --- |
| Hybrid Search & Reranking on BEIR | Demonstrates LanceDB’s built-in hybrid search feature, which combines semantic and full-text search. Using the BEIR dataset, it shows how searching for both the meaning of a query and the specific keywords it contains yields more relevant results. |
| Semantic Search Across Videos | Learn how to build a video search application using V-JEPA (Video Joint Embedding Predictive Architecture) and LanceDB. This example shows how to generate vector embeddings for videos and then use LanceDB to run similarity searches that find videos visually similar to a given query. |
| Semantic Result Merging | Explore vector arithmetic with LanceDB. This notebook demonstrates how to manipulate vector embeddings to capture more complex relationships in your data. For instance, you can modify a search query by adding or subtracting the vector representations of different concepts. |
| Reddit Concept Summarizer | A complete pipeline that acquires text data from Reddit, converts it into vector embeddings, and stores and manages those vectors in LanceDB. It demonstrates how to build applications on top of this data, such as summarization and semantic search. |
| NER-Powered Vector Search | Demonstrates how to use Named Entity Recognition (NER) to power vector search. Extracting entities (such as people, places, and organizations) from text and creating vector embeddings of them can improve the accuracy of your search results. |
| Multivector Search with XTR | Covers LanceDB’s multivector search capabilities combined with the XTR (ConteXtualized Token Retriever) technique. It shows how to represent complex data with multiple vectors per item and how XTR speeds up retrieval by prioritizing the most important tokens. |
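The multivector idea in the last example can be illustrated with a simplified late-interaction score, often called MaxSim: each query vector is matched against its best-scoring document vector, and those best matches are summed. This is a sketch in plain Python under stated assumptions. It does not implement XTR's token-retrieval step, it does not use the LanceDB API, and `maxsim_score` and the toy vectors are hypothetical names and data chosen for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vectors, doc_vectors):
    """For each query vector take its best match in the document, then sum."""
    return sum(max(dot(q, d) for d in doc_vectors) for q in query_vectors)

# Hypothetical toy data: each document is a list of per-token vectors.
docs = {
    "doc1": [[1.0, 0.0], [0.0, 1.0]],  # two distinct token vectors
    "doc2": [[0.7, 0.7]],              # one blended vector
}
query_tokens = [[1.0, 0.0], [0.0, 1.0]]

# Rank documents from highest to lowest score.
ranking = sorted(
    docs,
    key=lambda doc_id: maxsim_score(query_tokens, docs[doc_id]),
    reverse=True,
)
print(ranking)
```

Here `doc1` scores higher because each query token finds an exact match among its vectors, while `doc2` can only offer one blended vector for both. That is the sense in which multiple vectors per document capture more nuance than a single embedding.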