Skip to main content

Semantic Search with RavenDB, Python, and FastAPI

Paweł Lachowski
Paweł Lachowski
Technical Writer
Published on July 22, 2025
Semantic Search with RavenDB, Python, and FastAPI

Introduction

Semantic search is a more intuitive way for your users to find the content they need. They do not need to look for specific keywords but search by meaning. If you want to enhance your users' experience, it will simplify finding the content your users are looking for.

To achieve this, your data needs to be vectorized into AI model embeddings, as they are a digital representation of the meaning of your data within a specific model.

To reduce data logistics and the amount of code to be written and maintained, RavenDB offers automatic embedding generation. If you don't need another custom solution for that and want to ship faster, this is the way. Automatic embedding generation with RavenDB handles your data logistics out of the box, simplifying app development.

In this article, we will create a sample FastAPI application to show you how vector search works. We will implement both manual and automatic embedding generation.

Setup

Application

Using FastAPI, we can quickly build a web AI search endpoint to demonstrate how semantic search works. We'll use a built-in OpenAPI interface to picture that.

See, we query for "Cheese" and we get all kinds of cheese products from our database:

FastAPI Swagger UI with a 'Cheese' query entered into the search endpoint
FastAPI Swagger UI response showing cheese products returned by RavenDB vector search

Under the hood, the application translates our query term "Cheese" to an embedding (vector) on the fly and compares other vectors within the database, finding the closest "meanings". Let's show how to build that.

Without the query implementation, our application looks like this:

from fastapi import FastAPI
from pydantic import BaseModel
from ravendb import DocumentStore

app = FastAPI()

# RavenDB setup
document_store = DocumentStore(
urls=["http://127.0.0.1:8080"],
database="Northwind"
)
document_store.initialize()

# Northwind Product schema
class Product(BaseModel):
name: str
supplier: str
category: str
quantity_per_unit: str
price_per_unit: float
units_in_stock: int
units_on_order: int
discontinued: bool
reorder_level: int


@app.get("/products")
async def search_products(query: str):
...

if __name__ == "__main__":
import uvicorn
uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)

To make it work, we need a few Python packages:

pip install fastapi
pip install uvicorn
pip install ravendb
  • fastapi: framework library
  • uvicorn: ASGI web server
  • ravendb: Python SDK for RavenDB

This fragment sets up RavenDB. It connects to a database running on our local machine with the Northwind sample data set.

# RavenDB setup
document_store = DocumentStore(
urls=["http://127.0.0.1:8080"],
database="Northwind"
)
document_store.initialize()

Then, we create the Product class to represent the JSON document schema for documents in the Products collection of the "Northwind" database. This allows us to work with Product documents as objects.

# Northwind Product schema
class Product(BaseModel):
name: str
supplier: str
category: str
quantity_per_unit: str
price_per_unit: float
units_in_stock: int
units_on_order: int
discontinued: bool
reorder_level: int

Next, we define endpoints to query products. We'll add the detailed query logic once we have embeddings set up.

@app.get("/products")
async def search_products(query: str):
...

The last step is to use Uvicorn to launch this app. It handles your web requests, allowing them to reach our application. Uvicorn serves as a simple bridge between the network and your application.

if __name__ == "__main__":
import uvicorn
uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)

Vector Search with manual embeddings generation

We can generate embeddings manually, but it requires more effort on our part. We need to take care of data logistics and structure, and we may still be missing some more advanced functionalities, for example, caching or chunking. Let's show how to generate embeddings using the popular OpenAI embedding model "text-embedding-3-small".

import openai

def get_embedding(text: str) -> list[float]:
response = openai.embeddings.create(input=text, model="text-embedding-3-small")
return response.data[0].embedding

# Example usage
embedding = get_embedding("Example product name")

We need to install and use the openai Python package. This allows us to interact with AI models from OpenAI. Next, we generate embeddings with get_embedding for the entire Products collection. Then we put the vector in the vector_embedding field. We will query all Products, add the field vector_embedding, and then call save_changes.

with document_store.open_session() as session:
products = list(session.query(object_type=Product))
for product in products:
product_name_embedding = get_embedding(product.name)
product.vector_embedding = product_name_embedding
session.save_changes()

Note: The Northwind database is relatively small, allowing us to query and update the documents directly. However, for larger datasets, you'd need to explore a more effective strategy. If that's your case, see how to use automatic embeddings generation later in this article.

Products now have an embedding field, allowing us to run RavenDB Vector Search. But to compare vectors, we also need an embedding vector for the query search term. We use the same function to generate embeddings for incoming search terms on the fly using the same method (get_embedding).

query_embedding = get_embedding("search term")

with document_store.open_session() as session:
results = list(
session.query(object_type=Product)
.vector_search("vector_embedding", query_embedding, minimum_similarity=0.75)
.take(10)
)

return [
{
"name": p.name,
"price_per_unit": p.price_per_unit,
"units_in_stock": p.units_in_stock,
"discontinued": p.discontinued,
}
for p in results
]

This gives us results not based on an exact match but on semantic meaning.

This way we need to:

  1. Vectorize our entire Products dataset with OpenAI manually
  2. Put vector embeddings in the separate document fields
  3. Vectorize the query terms manually on the fly
  4. Compare the search term embedding vector with other vectors from the Products collection using RavenDB Vector Search.

… and we still lack features like caching repetitive search terms and text chunking.

AspectManual embeddingsAutomatic embeddings
Embedding generationYour code calls OpenAI directlyRavenDB calls the model via a configured task
Initial data vectorizationYou loop through documents and save vectors yourselfRavenDB handles the backfill automatically
Query-time vectorizationYour code calls get_embedding() on each requestvector_search_text_using_task handles it internally
Caching repeated queriesNot included; you build it yourselfBuilt in
Chunking supportNot included; you build it yourselfBuilt in
Switching AI modelsRequires code changes and re-vectorizationChange the connection string in Studio
Application code size~50 lines including OpenAI setup~30 lines, no AI client code
Best forFull control over embedding logicFaster delivery, less maintenance overhead

Vector search with automatic embeddings

Now, let's try automatic embeddings generation in RavenDB. The code will be the same as starting one, but reduced by manual communication with OpenAI. We just need to add the embeddings generation in RavenDB Studio.

Adding automatic embeddings generation starts in the AI Hub. We will automate the communication with the external AI model. To do so, we need to define which model we want to use.

AI Connection String

Create a new AI connection string in RavenDB Studio:

RavenDB Studio form for creating a new AI connection string, with the OpenAI provider selected

Define your custom name and identifier and pick the service you want to use; we chose OpenAI. Then, in the new fields, select the endpoint & model, and paste your API key. Other fields are optional and not currently relevant to us.

You can test the connection to ensure everything works properly.

AI Task

We can connect to the OpenAI model, but we need to create a task that generates embeddings. Go back to the AI Hub and choose 'AI Tasks'. Create a new embeddings generation task and fill in its name and identifier. Select our new connection string. We select the 'Products' collection and type 'Name' for a path just beneath the collection. Just save it, and it's ready.

RavenDB Studio form for a new embeddings generation task, targeting the Products collection Name field

The task configuration is minimal: just a name, a connection string, the target collection, and the field path. RavenDB handles everything else: scheduling, API calls, and storing the resulting vectors.

Application code

All the required parts for your own embeddings generation and adding the field can be removed, making the code smaller.

from fastapi import FastAPI
from pydantic import BaseModel
from ravendb import DocumentStore

# RavenDB setup
document_store = DocumentStore(
urls=["http://127.0.0.1:8080"],
database="Northwind"
)
document_store.initialize()

# Northwind Product schema
class Product(BaseModel):
name: str
supplier: str
category: str
quantity_per_unit: str
price_per_unit: float
units_in_stock: int
units_on_order: int
discontinued: bool
reorder_level: int

@app.get("/products")
async def search_products(query: str):
with document_store.open_session() as session:
results = list(session.query(object_type=Product).vector_search_text_using_task("Name", query, "my-ai-task", minimum_similarity=0.7))

return [
{
"name": p.name,
"price_per_unit": p.price_per_unit,
"units_in_stock": p.units_in_stock,
"discontinued": p.discontinued,
}
for p in results
]

if __name__ == "__main__":
import uvicorn
uvicorn.run("app:app", host="127.0.0.1", port=8000, reload=True)

Compared to the manual approach, the application code no longer needs to manage an OpenAI client, generate embeddings per document, or store them in a separate field. The vector_search_text_using_task call replaces the full manual pipeline: RavenDB resolves the task, vectorizes the query on the fly, and returns ranked results.

This way we get:

  1. Out-of-the-box embeddings for your collection
  2. Out-of-the-box embeddings for query terms
  3. Caching without the need to write it
  4. Ability to change the model without the need to rewire the connection

Works perfectly:

FastAPI Swagger UI showing a semantic search query returning relevant products using automatic RavenDB embeddings
FastAPI Swagger UI response body with product results returned by automatic vector search

In the studio, RavenDB created a separate collection for embeddings and cached terms:

RavenDB Studio collections view showing an auto-generated embeddings collection and cached query terms

Everything's working on its own, and our semantic search can be delivered much quicker.

Summary

In this article, you built a semantic search endpoint using RavenDB, Python, and FastAPI, twice.

  • Manual approach: You vectorized the Products collection yourself using openai.embeddings.create, stored the vectors in a document field, and queried them with vector_search. This gives you full control but requires managing API calls, data updates, and lacks built-in caching.
  • Automatic approach: You configured an embeddings generation task in RavenDB Studio and called vector_search_text_using_task in a single line. RavenDB handles vectorization, storage, caching, and query-time embedding, with no OpenAI client code needed in the application.

For most projects, the automatic approach is the better starting point: less code, easier to maintain, and caching included out of the box.

Built something with RavenDB? Share it with the community on Discord.

In this article