Connection String to Ollama

This article explains how to define a connection string to Ollama,
enabling RavenDB to use Ollama models for Embeddings generation tasks, Gen AI tasks, and AI agents.

Define the connection string - from the Studio

Configuring a text embedding model

Name
Enter a name for this connection string.
Identifier (optional)
Learn more about the identifier in the connection string identifier section.
Model Type
Select "Text Embeddings".
Connector
Select Ollama from the dropdown menu.
URI
Enter the Ollama API URI.
Model
Specify the Ollama text embedding model to use.
Max concurrent query batches (optional)
- When making vector search queries, the content of the search terms must also be converted to embeddings to compare them against the stored vectors.
  Requests to generate such query embeddings via the AI provider are sent in batches.
- This parameter defines the maximum number of these batches that can be processed concurrently.
  You can set a default value using the Ai.Embeddings.MaxConcurrentBatches configuration key.
Click Test Connection to confirm the connection string is set up correctly.
Click Save to store the connection string or Cancel to discard changes.
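
Before entering the URI and Model values, it can help to confirm that the Ollama server is reachable and that the model has been pulled. The following is a minimal sketch, assuming a local Ollama instance at http://localhost:11434 and using Ollama's /api/tags endpoint, which lists locally available models. It is an independent check, not a description of what the Studio's Test Connection button does internally.

```csharp
using System;
using System.Net.Http;
using System.Text.Json;

// Assumptions: a local Ollama server and the embedding model name used in this article
const string ollamaUri = "http://localhost:11434";
const string modelName = "mxbai-embed-large";

using var http = new HttpClient { BaseAddress = new Uri(ollamaUri) };

// GET /api/tags returns the models available on the Ollama server
using var response = await http.GetAsync("/api/tags");
response.EnsureSuccessStatusCode();

using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());

var found = false;
foreach (var model in doc.RootElement.GetProperty("models").EnumerateArray())
{
    // Model names usually carry a tag suffix, e.g. "mxbai-embed-large:latest"
    var name = model.GetProperty("name").GetString();
    if (name != null && name.StartsWith(modelName, StringComparison.OrdinalIgnoreCase))
        found = true;
}

Console.WriteLine(found
    ? $"Model '{modelName}' is available on {ollamaUri}"
    : $"Model '{modelName}' was not found; it may need to be pulled first");
```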

Configuring a chat model

When configuring a chat model, the UI displays the same base fields as those used for text embedding models,
including the connection string Name, optional Identifier, URI, and Model name.
In addition, two fields are specific to chat models: Temperature and Thinking mode.

Model Type
Select "Chat".
Model
Enter the name of the Ollama model to use for chat completions.
Thinking mode (optional)
The thinking mode setting controls whether the model outputs its internal reasoning steps before returning the final answer.
- When set to Enabled:
  the model outputs a series of intermediate reasoning steps (chain of thought) before the final answer.
  This may improve output quality for complex tasks, but increases response time and token usage.
- When set to Disabled:
  the model returns only the final answer, without exposing intermediate steps.
  This is typically faster and more cost-effective (uses fewer tokens),
  but may reduce quality on complex reasoning tasks.
- When set to Default:
  the model’s built-in default is used. This value may vary depending on the selected model.
Choose the setting based on the trade-off between task complexity and speed/cost requirements.
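
To see what thinking output looks like at the Ollama level, the sketch below sends a chat request directly to Ollama's /api/chat endpoint with the think field enabled. It assumes a local Ollama server and a thinking-capable model ("qwen3" is used here only as an illustrative choice); how RavenDB forwards the Thinking mode setting to Ollama is not covered by this article, so treat this as an illustration rather than a description of RavenDB's internals.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Json;
using System.Text.Json;

// Assumption: a local Ollama server
using var http = new HttpClient { BaseAddress = new Uri("http://localhost:11434") };

var request = new
{
    model = "qwen3", // assumption: any model that supports thinking
    messages = new[] { new { role = "user", content = "How many minutes are in a day?" } },
    think = true,    // ask the model to return its reasoning separately
    stream = false   // return a single JSON response instead of a stream
};

using var response = await http.PostAsJsonAsync("/api/chat", request);
response.EnsureSuccessStatusCode();

using var doc = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
var message = doc.RootElement.GetProperty("message");

// The reasoning, when present, is returned in a separate "thinking" field
if (message.TryGetProperty("thinking", out var thinking))
    Console.WriteLine("Reasoning: " + thinking.GetString());

Console.WriteLine("Answer: " + message.GetProperty("content").GetString());
```

Setting think to false in the same request omits the reasoning, which corresponds to the faster, lower-token behavior described above.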
Temperature (optional)
The temperature setting controls the randomness and creativity of the model’s output.
Valid values typically range from 0.0 to 2.0:
- Higher values (e.g., 1.0 or above) produce more diverse and creative responses.
- Lower values (e.g., 0.2) result in more focused, consistent, and deterministic output.
- If not explicitly set, Ollama defaults to a temperature of 0.8.
  See Ollama's parameters reference.
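
The effect of temperature can be illustrated with a temperature-scaled softmax, the standard way sampling temperature is applied to a model's raw token scores. This is a generic sketch with made-up scores, not Ollama's actual sampling code; the function name is hypothetical.

```csharp
using System;
using System.Linq;

// Converts raw scores (logits) into probabilities, scaled by temperature.
// Temperature must be greater than 0 in this sketch.
static double[] SoftmaxWithTemperature(double[] logits, double temperature)
{
    var scaled = logits.Select(l => l / temperature).ToArray();
    var max = scaled.Max(); // subtract the max for numerical stability
    var exps = scaled.Select(s => Math.Exp(s - max)).ToArray();
    var sum = exps.Sum();
    return exps.Select(e => e / sum).ToArray();
}

// Made-up scores for three candidate tokens
double[] logits = { 2.0, 1.0, 0.1 };

foreach (var t in new[] { 0.2, 0.8, 2.0 })
{
    var probs = SoftmaxWithTemperature(logits, t);
    Console.WriteLine($"T={t}: " + string.Join(", ", probs.Select(p => p.ToString("F3"))));
}
```

At a low temperature almost all probability mass falls on the highest-scoring token, so output is more deterministic; at a high temperature the distribution flattens and less likely tokens are sampled more often.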

Define the connection string - from the Client API

The two examples below define a connection string for a text embedding model and for a chat model, respectively.

using (var store = new DocumentStore())
{
    // Define the connection string to Ollama
    var connectionString = new AiConnectionString
    {
        // Connection string Name & Identifier
        Name = "ConnectionStringToOllama", 
        Identifier = "identifier-to-the-connection-string", // optional
        
        // Model type
        ModelType = AiModelType.TextEmbeddings,
        
        // Ollama connection settings
        OllamaSettings = new OllamaSettings
        {
            Uri = "http://localhost:11434",
            
            // Name of text embedding model to use
            Model = "mxbai-embed-large",
            
            // Optionally, override the default maximum number of query embedding batches
            // that can be processed concurrently 
            EmbeddingsMaxConcurrentBatches = 10
        }
    };
    
    // Deploy the connection string to the server
    var putConnectionStringOp = new PutConnectionStringOperation<AiConnectionString>(connectionString);
    var putConnectionStringResult = store.Maintenance.Send(putConnectionStringOp);
}

using (var store = new DocumentStore())
{
    // Define the connection string to Ollama
    var connectionString = new AiConnectionString
    {
        // Connection string Name & Identifier
        Name = "ConnectionStringToOllama", 
        Identifier = "identifier-to-the-connection-string", // optional
        
        // Model type
        ModelType = AiModelType.Chat,
        
        // Ollama connection settings
        OllamaSettings = new OllamaSettings
        {
            Uri = "http://localhost:11434",
            
            // Name of chat model to use
            Model = "llama3:8b-instruct",
            
            // Optionally, set the model's temperature
            Temperature = 0.4,
            
            // Optionally, set the model's thinking behavior
            // (relevant only for models that support thinking)
            Think = true
        }
    };
    
    // Deploy the connection string to the server
    var putConnectionStringOp = new PutConnectionStringOperation<AiConnectionString>(connectionString);    
    var putConnectionStringResult = store.Maintenance.Send(putConnectionStringOp);
}

Syntax

public class AiConnectionString
{
    public string Name { get; set; }
    public string Identifier { get; set; }
    public AiModelType ModelType { get; set; }
    public OllamaSettings OllamaSettings { get; set; }
}

public class OllamaSettings : AbstractAiSettings
{
    // The base URI of your Ollama server
    // For a local setup, use: "http://localhost:11434"
    public string Uri { get; set; }
    
    // The name of the model to use 
    public string Model { get; set; }
    
    // Relevant only for chat models:
    // Control whether the model outputs its internal reasoning steps before returning the final answer.
    // 'true'  - the model outputs intermediate reasoning steps (chain of thought) before the final answer.
    // 'false' - the model returns only the final answer, without exposing intermediate steps.
    // 'null'  - the model’s default behavior is used.
    public bool? Think { get; set; }
    
    // Relevant only for chat models:
    // Controls the randomness and creativity of the model’s output.
    // Higher values (e.g., 1.0 or above) produce more diverse and creative responses.
    // Lower values (e.g., 0.2) result in more focused and deterministic output.
    // If set to 'null', the temperature is not sent and the model's default will be used.
    public double? Temperature { get; set; }
}

public class AbstractAiSettings
{
    public int? EmbeddingsMaxConcurrentBatches { get; set; }
}
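
As a usage sketch of the nullable properties above, a chat connection string can leave Think and Temperature as null so that the model's own defaults apply. The connection string name is hypothetical, and the namespace in the using directive is an assumption to verify against your RavenDB client version.

```csharp
// Assumption: namespace of the AI connection string types in the RavenDB client
using Raven.Client.Documents.Operations.AI;

var connectionString = new AiConnectionString
{
    Name = "OllamaChatWithModelDefaults", // hypothetical name
    ModelType = AiModelType.Chat,

    OllamaSettings = new OllamaSettings
    {
        Uri = "http://localhost:11434",
        Model = "llama3:8b-instruct",

        // null: use the model's default thinking behavior
        Think = null,

        // null: the temperature is not sent and the model's default is used
        Temperature = null
    }
};
```

Since both properties are nullable, omitting them from the initializer has the same effect as setting them to null explicitly.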
