Semantic Search with RavenDB, Python, and FastAPI

Name: RavenDB
Author: RavenDB

Paweł Lachowski

Technical Writer

Published on July 22, 2025

Tags: Python FastAPI AI Demo Use Case

Introduction

Semantic search gives your users a more intuitive way to find content: instead of guessing specific keywords, they search by meaning. If you want to improve your users' experience, this makes it easier for them to find what they are looking for.

To achieve this, your data needs to be vectorized into embeddings: numerical representations of the meaning of your data, as produced by a specific AI model.

To reduce data logistics and the amount of code you have to write and maintain, RavenDB offers automatic embedding generation. If you don't need a custom solution and want to ship faster, it handles the data logistics out of the box, simplifying app development.

In this article, we will create a sample FastAPI application to show you how vector search works. We will implement both manual and automatic embedding generation.

Setup

Application

Using FastAPI, we can quickly build a web search endpoint to demonstrate how semantic search works. We'll use FastAPI's built-in OpenAPI interface to try it out.

Here, we query for "Cheese" and get all kinds of cheese products from our database:

Under the hood, the application translates our query term "Cheese" into an embedding (vector) on the fly and compares it with the vectors stored in the database, finding the closest "meanings". Let's see how to build that.
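To make the idea of "closest meanings" concrete, here is a minimal sketch of comparing vectors. It assumes cosine similarity as the measure and uses made-up three-dimensional vectors; the article does not describe how RavenDB computes similarity internally, so treat this as an illustration of the concept rather than the database's implementation.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, documents, minimum_similarity=0.75):
    """Return (name, score) pairs at or above the threshold, best match first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in documents.items()]
    matches = [(name, score) for name, score in scored if score >= minimum_similarity]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Made-up vectors for illustration; real embeddings have far more dimensions.
toy_documents = {
    "Camembert": [0.9, 0.1, 0.0],
    "Gorgonzola": [0.8, 0.3, 0.1],
    "Chai tea": [0.1, 0.9, 0.2],
}
toy_query = [1.0, 0.2, 0.0]  # stands in for the embedding of "Cheese"

ranked = rank_by_similarity(toy_query, toy_documents)
```

With these toy values, the two cheese vectors point in nearly the same direction as the query and pass the threshold, while the tea vector does not.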

Without the query implementation, our application looks like this:

from fastapi import FastAPI
from pydantic import BaseModel
from ravendb import DocumentStore

app = FastAPI()

# RavenDB setup
document_store = DocumentStore(
    urls=["http://127.0.0.1:8080"],
    database="Northwind"
)
document_store.initialize()

# Northwind Product schema
class Product(BaseModel):
    name: str
    supplier: str
    category: str
    quantity_per_unit: str
    price_per_unit: float
    units_in_stock: int
    units_on_order: int
    discontinued: bool
    reorder_level: int


@app.get("/products")
async def search_products(query: str):
    ...

if __name__ == "__main__":
    import uvicorn
    uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)

To make it work, we need a few Python packages:

pip install fastapi
pip install uvicorn
pip install ravendb

fastapi: framework library
uvicorn: ASGI web server
ravendb: Python SDK for RavenDB

This fragment sets up RavenDB. It connects to a database running on our local machine with the Northwind sample data set.

# RavenDB setup
document_store = DocumentStore(
    urls=["http://127.0.0.1:8080"],
    database="Northwind"
)
document_store.initialize()

Then, we create the Product class to represent the JSON document schema for documents in the Products collection of the "Northwind" database. This allows us to work with Product documents as objects.

# Northwind Product schema
class Product(BaseModel):
    name: str
    supplier: str
    category: str
    quantity_per_unit: str
    price_per_unit: float
    units_in_stock: int
    units_on_order: int
    discontinued: bool
    reorder_level: int
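One detail worth keeping in mind: the class above uses snake_case attribute names, while later in this article the product name field is referred to as 'Name', which suggests PascalCase keys in the stored documents. Whether the RavenDB client maps between the two for you is not covered here. If you ever need to do it yourself, a minimal sketch with the standard library could look like this (the helper names and the sample document are hypothetical):

```python
import re


def pascal_to_snake(name: str) -> str:
    """Convert a PascalCase key such as 'PricePerUnit' to 'price_per_unit'."""
    # Insert an underscore before every uppercase letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def to_snake_case_keys(document: dict) -> dict:
    """Return a copy of a document with its top-level keys in snake_case."""
    return {pascal_to_snake(key): value for key, value in document.items()}


# Hypothetical document shaped like a product (values are made up).
raw_document = {"Name": "Chai", "PricePerUnit": 18.0, "UnitsInStock": 39, "Discontinued": False}
converted = to_snake_case_keys(raw_document)
```

The converted dictionary has keys that line up with the attribute names of the Product class.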

Next, we define the endpoint that queries products. We'll add the query logic once we have embeddings set up.

@app.get("/products")
async def search_products(query: str):
    ...

The last step is to launch the app with Uvicorn. It acts as a bridge between the network and our application: it receives web requests and passes them on to FastAPI.

if __name__ == "__main__":
    import uvicorn
    uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)
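Once the server is running, the endpoint can also be called from code instead of the interactive OpenAPI page. The following is a small sketch using only the standard library; the helper names build_search_url and search are our own, and the base URL assumes the host and port used above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:8000"  # host and port passed to uvicorn.run above


def build_search_url(term: str) -> str:
    """Build the /products URL with the search term safely URL-encoded."""
    return f"{BASE_URL}/products?{urlencode({'query': term})}"


def search(term: str):
    """Call the running API and decode its JSON response."""
    with urlopen(build_search_url(term)) as response:
        return json.load(response)


# Building the URL does not need a running server; search() does.
url = build_search_url("blue cheese")
```

Calling search("Cheese") only returns useful data once the query logic from the following sections is in place and the application is running.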

Vector Search with manual embeddings generation

We can generate embeddings manually, but it requires more effort on our part. We need to take care of data logistics and structure, and we would still be missing more advanced features such as caching or chunking. Let's see how to generate embeddings using the popular OpenAI embedding model "text-embedding-3-small".

import openai

def get_embedding(text: str) -> list[float]:
    response = openai.embeddings.create(input=text, model="text-embedding-3-small")
    return response.data[0].embedding

# Example usage
embedding = get_embedding("Example product name")

This requires the openai Python package (pip install openai), which lets us call OpenAI models; by default, the library reads the API key from the OPENAI_API_KEY environment variable. Next, we generate embeddings with get_embedding for the entire Products collection: we query all Products, store each vector in a vector_embedding field, and then call save_changes.

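with document_store.open_session() as session:
    # Load every product; the session tracks loaded entities for changes
    products = list(session.query(object_type=Product))
    for product in products:
        # Embed the product name and store the vector on the document.
        # Note: Pydantic rejects undeclared attributes by default, so Product
        # needs a matching vector_embedding field for this assignment to work.
        product.vector_embedding = get_embedding(product.name)
    # Persist the modified documents
    session.save_changes()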

Note: The Northwind database is relatively small, allowing us to query and update the documents directly. However, for larger datasets, you'd need to explore a more effective strategy. If that's your case, see how to use automatic embeddings generation later in this article.
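As one example of what a more effective strategy might involve, the sketch below sends texts to the model in batches instead of one request per document. It is only an illustration under assumptions: the batch size of 100 is arbitrary, and fake_embed_batch is a stand-in so the code runs without an API key.

```python
from typing import Callable, Iterator


def batched(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_in_batches(
    texts: list[str],
    embed_batch: Callable[[list[str]], list[list[float]]],
    batch_size: int = 100,
) -> list[list[float]]:
    """Embed texts batch by batch, preserving the input order."""
    vectors: list[list[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


def fake_embed_batch(batch: list[str]) -> list[list[float]]:
    """Stand-in for a real model call so the sketch runs offline."""
    return [[float(len(text))] for text in batch]


names = ["Chai", "Chang", "Tofu", "Ikura", "Konbu"]
vectors = embed_in_batches(names, fake_embed_batch, batch_size=2)
```

With the real model, embed_batch could wrap openai.embeddings.create, which accepts a list of strings as input, and return the embedding of each item in response.data. Rate limits and retries would still need to be handled separately.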

Products now have an embedding field, so we can run RavenDB Vector Search. But to compare vectors, we also need an embedding for the search term. We generate it on the fly for each incoming search term with the same get_embedding function, so that documents and queries are embedded by the same model.

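@app.get("/products")
async def search_products(query: str):
    # Vectorize the incoming search term with the same model as the documents
    query_embedding = get_embedding(query)

    with document_store.open_session() as session:
        # Find the products whose stored vectors are closest to the query vector
        results = list(
            session.query(object_type=Product)
            .vector_search("vector_embedding", query_embedding, minimum_similarity=0.75)
            .take(10)
        )

        # Return only the fields the client needs
        return [
            {
                "name": p.name,
                "price_per_unit": p.price_per_unit,
                "units_in_stock": p.units_in_stock,
                "discontinued": p.discontinued,
            }
            for p in results
        ]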

This gives us results not based on an exact match but on semantic meaning.

With this approach, we need to:

Vectorize our entire Products dataset with OpenAI manually
Put the vector embeddings in a separate document field
Vectorize the query terms manually on the fly
Compare the search term embedding with the vectors from the Products collection using RavenDB Vector Search

… and we still lack features like caching of repeated search terms and text chunking.
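To give a sense of what building those missing features ourselves would involve, here is a rough sketch of both: an in-memory cache for repeated search terms and a simple word-based chunker. This is not how RavenDB implements either feature; the function names, the cache size, and the chunk sizes are all assumptions, and fake_get_embedding stands in for get_embedding so the example runs without an API key.

```python
from functools import lru_cache

calls = 0  # counts how often the (fake) model is actually invoked


def fake_get_embedding(text: str) -> list[float]:
    """Stand-in for get_embedding so the sketch runs offline."""
    global calls
    calls += 1
    return [float(len(text))]


@lru_cache(maxsize=1024)
def cached_embedding(text: str) -> tuple[float, ...]:
    """Memoize embeddings per term; a tuple keeps the cached value immutable."""
    return tuple(fake_get_embedding(text))


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= overlap:
        raise ValueError("size must be greater than overlap")
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


cached_embedding("cheese")
cached_embedding("cheese")  # served from the cache, no second model call
chunks = chunk_words("a b c d e f g", size=4, overlap=2)
```

Even this small version leaves open questions such as cache invalidation when the model changes, sharing the cache between processes, and choosing chunk boundaries that respect sentences.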

Aspect	Manual embeddings	Automatic embeddings
Embedding generation	Your code calls OpenAI directly	RavenDB calls the model via a configured task
Initial data vectorization	You loop through documents and save vectors yourself	RavenDB handles the backfill automatically
Query-time vectorization	Your code calls `get_embedding()` on each request	`vector_search_text_using_task` handles it internally
Caching repeated queries	Not included; you build it yourself	Built in
Chunking support	Not included; you build it yourself	Built in
Switching AI models	Requires code changes and re-vectorization	Change the connection string in Studio
Application code size	~50 lines including OpenAI setup	~30 lines, no AI client code
Best for	Full control over embedding logic	Faster delivery, less maintenance overhead

Vector search with automatic embeddings

Now, let's try automatic embeddings generation in RavenDB. The code is the same as our starting application, minus the manual communication with OpenAI. We only need to configure embeddings generation in RavenDB Studio.

Adding automatic embeddings generation starts in the AI Hub. We will automate the communication with the external AI model. To do so, we need to define which model we want to use.

AI Connection String

Create a new AI connection string in RavenDB Studio:

Enter a name and identifier of your choice and pick the service you want to use; we chose OpenAI. Then, in the fields that appear, select the endpoint and model, and paste your API key. The remaining fields are optional and not relevant here.

You can test the connection to ensure everything works properly.

AI Task

We can now connect to the OpenAI model, but we still need a task that generates the embeddings. Go back to the AI Hub and choose 'AI Tasks'. Create a new embeddings generation task and fill in its name and identifier; the application code later refers to the task by this identifier ("my-ai-task" in our example). Select the connection string we just created, choose the 'Products' collection, and enter 'Name' as the path in the field just beneath the collection. Save the task, and it's ready.

The task configuration is minimal: just a name, a connection string, the target collection, and the field path. RavenDB handles everything else: scheduling, API calls, and storing the resulting vectors.

Application code

With the task in place, everything related to generating embeddings ourselves and adding the extra field can be removed, which makes the code smaller.

from fastapi import FastAPI
from pydantic import BaseModel
from ravendb import DocumentStore

app = FastAPI()

# RavenDB setup
document_store = DocumentStore(
    urls=["http://127.0.0.1:8080"],
    database="Northwind"
)
document_store.initialize()

# Northwind Product schema
class Product(BaseModel):
    name: str
    supplier: str
    category: str
    quantity_per_unit: str
    price_per_unit: float
    units_in_stock: int
    units_on_order: int
    discontinued: bool
    reorder_level: int

@app.get("/products")
async def search_products(query: str):
    with document_store.open_session() as session:
        results = list(
            session.query(object_type=Product)
            .vector_search_text_using_task("Name", query, "my-ai-task", minimum_similarity=0.7)
        )

        return [
            {
                "name": p.name,
                "price_per_unit": p.price_per_unit,
                "units_in_stock": p.units_in_stock,
                "discontinued": p.discontinued,
            }
            for p in results
        ]

if __name__ == "__main__":
    import uvicorn
    uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)

Compared to the manual approach, the application code no longer needs to manage an OpenAI client, generate embeddings per document, or store them in a separate field. The vector_search_text_using_task call replaces the full manual pipeline: RavenDB resolves the task, vectorizes the query on the fly, and returns ranked results.

This way we get:

Out-of-the-box embeddings for your collection
Out-of-the-box embeddings for query terms
Caching without the need to write it
Ability to switch models by changing the connection string, without touching application code

Works perfectly:

In the studio, RavenDB created a separate collection for embeddings and cached terms:

Everything's working on its own, and our semantic search can be delivered much quicker.

Summary

In this article, you built a semantic search endpoint using RavenDB, Python, and FastAPI, twice.

Manual approach: You vectorized the Products collection yourself using openai.embeddings.create, stored the vectors in a document field, and queried them with vector_search. This gives you full control, but it requires managing API calls and data updates, and it lacks built-in caching.
Automatic approach: You configured an embeddings generation task in RavenDB Studio and called vector_search_text_using_task in a single line. RavenDB handles vectorization, storage, caching, and query-time embedding, with no OpenAI client code needed in the application.

For most projects, the automatic approach is the better starting point: less code, easier to maintain, and caching included out of the box.

Built something with RavenDB? Share it with the community on Discord.
