In LanceDB OSS, users can set the read_consistency_interval parameter on connections to achieve different levels of read consistency. This parameter determines how frequently the database synchronizes with the underlying storage system to check for updates made by other processes. If another process updates a table, the database will not see the changes until the next synchronization.
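There are three possible settings for read_consistency_interval:
Unset (default): The database never checks for updates made by other processes.
Zero (strong consistency): The database checks for updates on every read.
Custom interval (eventual consistency): The database checks for updates once the given interval has elapsed since the last check.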
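The difference between leaving the interval unset, setting it to zero, and setting a custom interval can be illustrated with a small toy model. This is only a sketch of the semantics described above, not LanceDB's implementation: SharedStorage, TableHandle, and FakeClock are hypothetical names invented for this example, and the storage is reduced to a version counter.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Optional, Tuple


class SharedStorage:
    """Hypothetical stand-in for the underlying storage: a version counter."""

    def __init__(self) -> None:
        self.version = 0

    def commit(self) -> None:
        # A write from another process bumps the table version.
        self.version += 1


class FakeClock:
    """Deterministic clock so the example does not depend on real time."""

    def __init__(self) -> None:
        self.now = datetime(2024, 1, 1)

    def __call__(self) -> datetime:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += timedelta(seconds=seconds)


class TableHandle:
    """Toy reader that refreshes according to a read consistency interval."""

    def __init__(
        self,
        storage: SharedStorage,
        interval: Optional[timedelta],
        clock: Callable[[], datetime],
    ) -> None:
        self._storage = storage
        self._interval = interval
        self._clock = clock
        self._seen_version = storage.version
        self._last_check = clock()

    def checkout_latest(self) -> None:
        # Manual refresh, independent of the interval.
        self._seen_version = self._storage.version
        self._last_check = self._clock()

    def read_version(self) -> int:
        # None: never refresh automatically.
        # timedelta(0): the elapsed time is always >= 0, so every read refreshes.
        # Otherwise: refresh only once the interval has elapsed.
        if self._interval is not None:
            if self._clock() - self._last_check >= self._interval:
                self.checkout_latest()
        return self._seen_version


def demo() -> List[Tuple[int, int, int]]:
    clock = FakeClock()
    storage = SharedStorage()
    unset = TableHandle(storage, None, clock)
    strong = TableHandle(storage, timedelta(0), clock)
    eventual = TableHandle(storage, timedelta(seconds=5), clock)

    storage.commit()  # another process writes
    first = (unset.read_version(), strong.read_version(), eventual.read_version())
    clock.advance(5)  # five seconds pass
    second = (unset.read_version(), strong.read_version(), eventual.read_version())
    return [first, second]


print(demo())  # [(0, 1, 0), (0, 1, 1)]
```

In this model the strongly consistent handle sees the write immediately, the handle with a 5-second interval sees it once the interval has passed, and the handle with no interval keeps the old version until checkout_latest is called explicitly.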
To set strong consistency, set the interval to 0:
Python:

import lancedb
from datetime import timedelta

uri = "data/sample-lancedb"
db = lancedb.connect(uri, read_consistency_interval=timedelta(0))
tbl = db.open_table("test_table")

TypeScript:

const db = await lancedb.connect({ uri: "./.lancedb", readConsistencyInterval: 0 });
const tbl = await db.openTable("my_table");

For eventual consistency, use a custom interval:
Python:

import lancedb
from datetime import timedelta

uri = "data/sample-lancedb"
db = lancedb.connect(uri, read_consistency_interval=timedelta(seconds=5))
tbl = db.open_table("test_table")

TypeScript:

const db = await lancedb.connect({ uri: "./.lancedb", readConsistencyInterval: 5 });
const tbl = await db.openTable("my_table");

By default, a Table will never check for updates from other writers. To manually check for updates you can use checkout_latest:
tbl = db.open_table("test_table")

# (Other writes happen to test_table from another process)

# Check for updates
tbl.checkout_latest()

In LanceDB Python, you can use the on_bad_vectors parameter to choose how invalid vector values are handled. Invalid vectors are vectors that cannot be stored as given, for example because they contain NaN values or have the wrong number of dimensions.
By default, LanceDB will raise an error if it encounters a bad vector. You can also choose one of the following options:
drop: Ignore rows with bad vectors.
fill: Replace bad values (NaNs) or missing values (too few dimensions) with the fill value specified in the fill_value parameter. An input like [1.0, NaN, 3.0] will be replaced with [1.0, 0.0, 3.0] if fill_value=0.0.
null: Replace bad vectors with null. For example, the bad vector [1.0, NaN, 3.0] is replaced with null. This only works if the vector column is nullable; on a non-nullable column, bad vectors still cause an error.
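The four behaviours described above (error, drop, fill, null) can be sketched in plain Python. This is an illustration of the described semantics only, not LanceDB's code: handle_bad_vectors, is_bad, and the nullable flag are hypothetical names made up for this example. The text does not say how vectors with too many dimensions are filled, so the sketch refuses them in fill mode.

```python
import math
from typing import List, Optional, Sequence


def is_bad(vec: Sequence[float], dim: int) -> bool:
    # A vector is treated as bad if it has the wrong dimension or contains NaN.
    return len(vec) != dim or any(math.isnan(x) for x in vec)


def handle_bad_vectors(
    vectors: Sequence[Sequence[float]],
    dim: int,
    on_bad_vectors: str = "error",
    fill_value: Optional[float] = None,
    nullable: bool = True,  # hypothetical flag standing in for column nullability
) -> List[Optional[List[float]]]:
    out: List[Optional[List[float]]] = []
    for vec in vectors:
        if not is_bad(vec, dim):
            out.append(list(vec))
        elif on_bad_vectors == "error":
            raise ValueError(f"bad vector: {list(vec)!r}")
        elif on_bad_vectors == "drop":
            continue  # skip the whole row
        elif on_bad_vectors == "fill":
            if fill_value is None:
                raise ValueError("fill_value is required when on_bad_vectors='fill'")
            if len(vec) > dim:
                # Assumption of this sketch: overly long vectors are not filled.
                raise ValueError(f"vector has more than {dim} dimensions")
            # Replace NaNs, then pad missing trailing dimensions.
            fixed = [fill_value if math.isnan(x) else x for x in vec]
            fixed += [fill_value] * (dim - len(fixed))
            out.append(fixed)
        elif on_bad_vectors == "null":
            if not nullable:
                raise ValueError("cannot store null in a non-nullable column")
            out.append(None)
        else:
            raise ValueError(f"unknown on_bad_vectors option: {on_bad_vectors!r}")
    return out


nan = float("nan")
data = [[1.0, 2.0, 3.0], [1.0, nan, 3.0], [1.0, 2.0]]

print(handle_bad_vectors(data, 3, "drop"))
# [[1.0, 2.0, 3.0]]
print(handle_bad_vectors(data, 3, "fill", fill_value=0.0))
# [[1.0, 2.0, 3.0], [1.0, 0.0, 3.0], [1.0, 2.0, 0.0]]
print(handle_bad_vectors(data, 3, "null"))
# [[1.0, 2.0, 3.0], None, None]
```

With the default option the same call raises a ValueError on the first bad vector, which mirrors the default behaviour of raising an error.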