Getting Started with LanceDB

This is a minimal tutorial for Python users of LanceDB Cloud. You can also run it as a notebook in Colab.

  1. Sign up for LanceDB Cloud
  2. Follow our tutorial video to create a LanceDB Cloud API key

1. Install LanceDB

LanceDB requires Python 3.8+ and can be installed via pip. The pandas package is optional but recommended for data manipulation.

bash
pip install lancedb pandas
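
If you want to confirm the installation before going further, a small check like the one below can help. It is only a sketch: the helper name `check_environment` is made up for this tutorial, and it uses nothing beyond the Python standard library to test the interpreter version and whether the packages can be found.

```python
import importlib.util
import sys


def check_environment(packages=("lancedb", "pandas"), min_python=(3, 8)):
    """Return a list of problems found; an empty list means the setup looks fine."""
    problems = []
    # Compare the running interpreter against the minimum version stated above.
    if sys.version_info < min_python:
        wanted = ".".join(str(part) for part in min_python)
        problems.append(f"Python {wanted}+ is required")
    # find_spec returns None when a top-level package cannot be located.
    for name in packages:
        if importlib.util.find_spec(name) is None:
            problems.append(f"package '{name}' is not installed")
    return problems


for problem in check_environment():
    print(problem)
```

An empty result means both packages were found. Since pandas is optional, you can pass `packages=("lancedb",)` if you do not plan to use it.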

2. Import Libraries

Import the libraries. lancedb provides the core vector database functionality, while pandas helps with data handling.

python
import lancedb
import pandas as pd

3. Connect to LanceDB Cloud

LanceDB Cloud provides managed infrastructure, security, and automatic backups. The connection uri determines where your data is stored. Replace the placeholder project slug, API key, and region below with your own values.

python
db = lancedb.connect(
    uri="db://your-project-slug",
    api_key="your-api-key",
    region="us-east-1"
)
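
Hard-coding an API key in a notebook makes it easy to leak. One possible alternative, sketched below, is to read the connection settings from environment variables. The variable names `LANCEDB_URI`, `LANCEDB_API_KEY`, and `LANCEDB_REGION` and both helper functions are assumptions made for this example, not an official convention. Only the `lancedb.connect(uri=..., api_key=..., region=...)` call comes from the step above.

```python
import os


def cloud_connection_kwargs(env=None):
    """Build the keyword arguments for lancedb.connect from environment variables.

    The variable names are assumptions for this sketch; adapt them to your setup.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("LANCEDB_URI", "LANCEDB_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "uri": env["LANCEDB_URI"],
        "api_key": env["LANCEDB_API_KEY"],
        # Fall back to the region used in this tutorial when none is set.
        "region": env.get("LANCEDB_REGION", "us-east-1"),
    }


def connect_from_env():
    """Connect to LanceDB Cloud using the settings found in the environment."""
    import lancedb  # imported lazily so the helper above works without the package

    return lancedb.connect(**cloud_connection_kwargs())
```

With the variables exported in your shell, `db = connect_from_env()` would take the place of the explicit call shown above.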

4. Add Data

Create a pandas DataFrame with your data. Each row must contain a vector field (list of floats) and can include additional metadata.

python
# Each record holds an id, a 3-dimensional vector, and a text label.
data = pd.DataFrame([
    {"id": "1", "vector": [0.9, 0.4, 0.8], "text": "knight"},
    {"id": "2", "vector": [0.8, 0.5, 0.3], "text": "ranger"},
    {"id": "3", "vector": [0.5, 0.9, 0.6], "text": "cleric"},
    {"id": "4", "vector": [0.3, 0.8, 0.7], "text": "rogue"},
    {"id": "5", "vector": [0.2, 1.0, 0.5], "text": "thief"},
])
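
Because every row needs a vector of floats, it can be worth checking the records before ingesting them. The sketch below does this in plain Python. `validate_records` is a hypothetical helper written for this tutorial, not part of LanceDB, and it also requires all vectors to share one length, since the search step expects a single dimensionality.

```python
def validate_records(records, vector_key="vector"):
    """Check that every record has a numeric vector of one shared length.

    Returns the common dimensionality; raises ValueError on the first problem.
    """
    if not records:
        raise ValueError("no records to validate")
    dim = None
    for index, row in enumerate(records):
        vector = row.get(vector_key)
        if not isinstance(vector, (list, tuple)) or len(vector) == 0:
            raise ValueError(f"row {index}: '{vector_key}' must be a non-empty list")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if not all(isinstance(x, (int, float)) and not isinstance(x, bool) for x in vector):
            raise ValueError(f"row {index}: vector contains non-numeric values")
        if dim is None:
            dim = len(vector)
        elif len(vector) != dim:
            raise ValueError(f"row {index}: expected {dim} dimensions, got {len(vector)}")
    return dim


records = [
    {"id": "1", "vector": [0.9, 0.4, 0.8], "text": "knight"},
    {"id": "2", "vector": [0.8, 0.5, 0.3], "text": "ranger"},
]
print(validate_records(records))  # prints 3
```

For the DataFrame above, `validate_records(data.to_dict("records"))` would run the same check on all five rows.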

5. Create a Table

Create a table in the database. The table takes on the schema of your ingested data.

python
table = db.create_table("adventurers", data)

Now, go to LanceDB Cloud and verify that your remote table has been created.

6. Vector Search

Perform a vector similarity search. The query vector must have the same dimensionality as your data vectors. By default, the search ranks results by Euclidean (L2) distance.

Our query is “warrior”, represented by a vector [0.8, 0.3, 0.8]. Let’s find the most similar adventurer:

python
query_vector = [0.8, 0.3, 0.8]  # warrior
results = table.search(query_vector).limit(3).to_pandas()
print(results)

7. Results

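The results list the vectors most similar to your query, sorted by ascending distance. Lower distance means higher similarity. The values shown work out to squared Euclidean distances: for the knight, (0.8 - 0.9)^2 + (0.3 - 0.4)^2 + (0.8 - 0.8)^2 = 0.02.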

| id | vector          | text    | distance  |
|----|-----------------|---------|-----------|
| 1  | [0.9, 0.4, 0.8] | knight  | 0.02      |
| 2  | [0.8, 0.5, 0.3] | ranger  | 0.29      |
| 3  | [0.5, 0.9, 0.6] | cleric  | 0.49      |

Looks like the knight is the most similar to the warrior.
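
To see where these numbers come from, the ranking can be reproduced by hand. The sketch below is plain Python, not a LanceDB call. It computes the squared Euclidean distance between the query and each of the five vectors from the Add Data step, then keeps the three closest. `squared_l2` is a helper defined here for illustration.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


adventurers = {
    "knight": [0.9, 0.4, 0.8],
    "ranger": [0.8, 0.5, 0.3],
    "cleric": [0.5, 0.9, 0.6],
    "rogue": [0.3, 0.8, 0.7],
    "thief": [0.2, 1.0, 0.5],
}
query_vector = [0.8, 0.3, 0.8]  # warrior

# Sort by distance (smallest first) and keep the three nearest neighbours.
ranked = sorted((squared_l2(query_vector, vec), text) for text, vec in adventurers.items())
top3 = ranked[:3]

for distance, text in top3:
    print(text, f"{distance:.2f}")
# knight 0.02
# ranger 0.29
# cleric 0.49
```

The order and the rounded values match the table, which suggests the distance column holds squared L2 values rather than their square roots.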

This is, of course, a simplified scenario, but the same LanceDB engine is designed to run this kind of search over large volumes of data at high speed.

In real-world scenarios, embeddings capture meaning, and vector search lets you find items that are semantically or contextually related.

💡 What's Next?
Check out the next tutorial on Common Database Operations.