Vector embedding search using TransformersJS

Embed and query data from LanceDB using TransformersJS

This example shows how to use the transformers.js library to perform vector embedding search using LanceDB's JavaScript API.

Setting up

First, install the dependencies:

    npm install vectordb
    npm i @xenova/transformers

We will use the Xenova/all-MiniLM-L6-v2 model, a version of all-MiniLM-L6-v2 that is compatible with Transformers.js.

Within our index.js file we will import the necessary libraries and load our model. The snippets use await, so they need to run inside an async function. @xenova/transformers is an ES module, which is why it is loaded with a dynamic import():

    // LanceDB's Node.js client
    const lancedb = require('vectordb')
    // Transformers.js is an ES module, so load it dynamically
    const { pipeline } = await import('@xenova/transformers')
    // Feature-extraction pipeline that turns text into embeddings
    const pipe = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2')

Creating the embedding function

Next, we will create a function that takes a batch of strings and returns the vector embedding of each one. We will use the pipe function we defined earlier to compute the embeddings.

    // Define the function. `sourceColumn` is required for LanceDB to know
    // which column to use as input.
    const embed_fun = {}
    embed_fun.sourceColumn = 'text'
    embed_fun.embed = async function (batch) {
      const result = []
      // Given a batch of strings, we will use the `pipe` function to get
      // the vector embedding of each string.
      for (const text of batch) {
        // 'mean' pooling averages the per-token vectors into a single
        // fixed-size vector, and normalizing scales it to unit length.
        const res = await pipe(text, { pooling: 'mean', normalize: true })
        result.push(Array.from(res.data))
      }
      return result
    }

Creating the database

Now, we will create the LanceDB database and add the embedding function we defined earlier.

    // Link a folder and create a table with data
    const db = await lancedb.connect('data/sample-lancedb')

    // You can also import any other data, but make sure that you have a column
    // for the embedding function to use.
    const data = [
      { id: 1, text: 'Cherry', type: 'fruit' },
      { id: 2, text: 'Carrot', type: 'vegetable' },
      { id: 3, text: 'Potato', type: 'vegetable' },
      { id: 4, text: 'Apple', type: 'fruit' },
      { id: 5, text: 'Banana', type: 'fruit' }
    ]

    // Create the table with the embedding function
    const table = await db.createTable('food_table', data, "create", embed_fun)

Performing the search

Now, we can perform the search using the search function. LanceDB automatically uses the embedding function we defined earlier to get the vector embedding of the query string.

    // Query the table
    const results = await table
      .search("a sweet fruit to eat")
      .metricType("cosine")
      .limit(2)
      .execute()
    console.log(results.map(r => r.text))

This prints:

    [ 'Banana', 'Cherry' ]

Output of results:

    [
      {
        vector: Float32Array(384) [
          -0.057455405592918396,
          0.03617725893855095,
          -0.0367760956287384,
          ... 381 more items
        ],
        id: 5,
        text: 'Banana',
        type: 'fruit',
        _distance: 0.4919965863227844
      },
      {
        vector: Float32Array(384) [
          0.0009714411571621895,
          0.008223623037338257,
          0.009571489877998829,
          ... 381 more items
        ],
        id: 1,
        text: 'Cherry',
        type: 'fruit',
        _distance: 0.5540297031402588
      }
    ]

Each result includes the stored vector, the original columns, and a _distance field. Results are ordered by ascending distance, so the closest match comes first.

Wrapping it up

In this example, we showed how to use the transformers.js library to perform vector embedding search using LanceDB's JavaScript API. You can find the full code for this example on GitHub!
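To build intuition for the `pooling: 'mean', normalize: true` options and the cosine metric used in this example, here is a minimal, dependency-free sketch of the underlying arithmetic. It is not how LanceDB or Transformers.js implement these steps internally: the helper names (`meanPool`, `l2Normalize`, `cosineDistance`, `nearest`) and the 2-D toy vectors are made up for illustration, and the brute-force scan only stands in for a real vector search.

```javascript
// Illustrative helpers only; none of these names come from LanceDB or Transformers.js.

// Mean pooling: average a list of per-token vectors into one fixed-size vector.
function meanPool(tokenVectors) {
  const dim = tokenVectors[0].length
  const pooled = new Array(dim).fill(0)
  for (const vec of tokenVectors) {
    for (let i = 0; i < dim; i++) {
      pooled[i] += vec[i] / tokenVectors.length
    }
  }
  return pooled
}

// L2 normalization: scale a vector so that its Euclidean length is 1.
function l2Normalize(vec) {
  const norm = Math.sqrt(vec.reduce((sum, x) => sum + x * x, 0))
  return vec.map(x => x / norm)
}

// Cosine distance = 1 - cosine similarity (0 means same direction).
function cosineDistance(a, b) {
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Brute-force search: score every row, sort by ascending distance, keep `limit` rows.
function nearest(queryVector, rows, limit) {
  return rows
    .map(row => ({ ...row, _distance: cosineDistance(queryVector, row.vector) }))
    .sort((a, b) => a._distance - b._distance)
    .slice(0, limit)
}

// Toy data with hypothetical 2-D "embeddings".
const rows = [
  { text: 'A', vector: l2Normalize([1, 0]) },
  { text: 'B', vector: l2Normalize([1, 1]) },
  { text: 'C', vector: l2Normalize([0, 1]) }
]

// Pretend the query has two token vectors; pooling gives [2, 1] before normalizing.
const query = l2Normalize(meanPool([[2, 0], [2, 2]]))

console.log(nearest(query, rows, 2).map(r => r.text)) // [ 'B', 'A' ]
```

Because the vectors are normalized to unit length, the cosine distance reduces to one minus their dot product, which is why smaller `_distance` values indicate closer matches.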