
Filter with Lucene Syntax

Basic usage

  • WhereLucene(string fieldName, string whereClause) takes 2 parameters:

    • Field name:
      The field name tells Lucene which field to search when the clause does not specify a field explicitly.
      In .WhereLucene("Name", "bistro"), the term bistro is searched in the Name field.

    • Where clause:
      The where clause is a string written in Lucene query syntax.
      RavenDB parses this string on the server with Lucene's query parser.

  • When making a dynamic query on a collection:
    RavenDB creates or reuses an auto-index with a full-text search field for the requested field.
    This field uses the default search analyzer. By default, this is RavenStandardAnalyzer, which lowercases and tokenizes the text, so matching is case-insensitive (e.g. bistro matches "Bistro du Centre").

  • When querying a static index:
    The Lucene query clause is parsed using the analyzer configured for the target index field.
    Configure the analyzer in the index definition, for example with Analyze(...).
    If a Search field has no analyzer configured, RavenDB uses the default search analyzer.


The following query returns all companies whose Name field contains the term bistro:

List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "bistro")
    .ToList();

If no suitable index already exists for this dynamic query, RavenDB creates an auto-index for Companies with Name indexed as a full-text search field, for example, Auto/Companies/BySearch(Name).

Common Lucene clauses

The whereClause argument specifies the filtering predicate in Lucene query syntax.
Common clause types include:

Clause type               | Example clause                 | Meaning
--------------------------|--------------------------------|--------
Single term               | bistro                         | Match documents whose field contains the term bistro.
Wildcards                 | bist*                          | Match terms that start with bist.
Single-character wildcard | b?stro                         | Match exactly one character at the ? position.
Boolean operators         | (bistro OR cafe) AND NOT grill | Combine terms with AND, OR, NOT, and parentheses.
Phrase                    | "sales representative"         | Match the exact phrase, with terms in order.
Proximity                 | "fluent french"~5              | Match both terms within 5 words of each other.
Fuzzy                     | bistor~                        | Match terms similar to bistor, such as minor typos.
Term boost                | bistro^2 OR cafe               | Match either term, but rank bistro matches higher when ordering by score.
Term range                | [apple TO banana]              | Match terms in the inclusive term range. Use {...} for exclusive bounds.

Wildcards

Use * to match any number of characters and ? to match exactly one character:

// Matches "bistro", "bistros", "bistrot", ...
List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "bist*")
    .ToList();

// Matches terms such as "bistro" and "bystro"
List<Company> companiesWithSingleCharacterWildcard = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "b?stro")
    .ToList();

Leading wildcards

Leading wildcards (e.g. *stro) are allowed as well.
Note that a leading wildcard forces the server to scan all terms of the field, which can be slow on large indexes.


Boolean operators

Use AND, OR, NOT, and parentheses to combine terms inside the Lucene clause.
Lucene operator keywords are case-sensitive, so they must be written in UPPERCASE.

The following query matches companies whose Name field contains bistro OR cafe,
but excludes names that contain grill:

List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "(bistro OR cafe) AND NOT grill")
    .ToList();

Phrases and proximity

Surround multiple terms with double quotes to match a phrase.
In C# string literals, escape those quote characters with \".

Add ~<number> after the closing quote to perform a proximity search:
the terms must appear near each other within the specified proximity distance.

// Exact phrase:
List<Employee> employees = session.Advanced
    .DocumentQuery<Employee>()
    .WhereLucene("Notes", "\"sales representative\"")
    .ToList();

// Proximity - "sales" and "french" within a distance of 10 words:
List<Employee> employeesProx = session.Advanced
    .DocumentQuery<Employee>()
    .WhereLucene("Notes", "\"sales french\"~10")
    .ToList();

Fuzzy search

Append ~ to a single term to match indexed terms that are similar to it, for example, to tolerate minor typos.

You can add a minimum similarity value after ~, such as bistor~0.7.
The value is between 0 and 1 (default: 0.5); higher values require closer similarity:

// Matches "bistro" although the searched term is misspelled
List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "bistor~")
    .ToList();
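When the minimum similarity comes from a variable, the clause has to be built as a string. The following is a minimal sketch of one way to do that; the FuzzyClause helper is hypothetical and not part of the RavenDB client. It formats the number with the invariant culture so that the clause contains 0.7 rather than 0,7 on machines whose culture uses a decimal comma.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for this sketch; not part of the RavenDB client API
public static class FuzzyClause
{
    // Builds a clause such as "bistor~0.7" from a term and a minimum similarity
    public static string Build(string term, double minSimilarity)
    {
        if (string.IsNullOrWhiteSpace(term))
            throw new ArgumentException("A term is required.", nameof(term));

        // Range taken from the text above: the value is between 0 and 1
        if (minSimilarity < 0 || minSimilarity > 1)
            throw new ArgumentOutOfRangeException(nameof(minSimilarity));

        // InvariantCulture keeps '.' as the decimal separator in the clause
        return term + "~" + minSimilarity.ToString("0.0##", CultureInfo.InvariantCulture);
    }
}
```

The result can then be passed as the whereClause argument, for example WhereLucene("Name", FuzzyClause.Build("bistor", 0.7)). The helper does not escape the term; it assumes the term contains no Lucene special characters.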

Boosting terms

Append ^<number> to a term to increase its relevance score.

Boosting does not change which documents match the clause.
It only affects ranking when results are ordered by score:

// When ordered by score,
// companies matching "bistro" rank higher than those matching "cafe"
List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "bistro^2 OR cafe")
    .OrderByScore()
    .ToList();

Range clauses

Use square brackets for an inclusive range and curly braces for an exclusive range.
Lucene range clauses compare indexed terms lexicographically:

// Match company names from "bistro" through "cafe", inclusive
List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "[bistro TO cafe]")
    .ToList();
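For comparison, an exclusive range uses curly braces instead of square brackets. The sketch below wraps the query in a method so that it is self-contained; the Company class shown here is a minimal stand-in defined only for the example, and the method and class names are assumptions.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents.Session;

// Minimal stand-in document class, defined only for this sketch
public class Company
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public static class RangeClauseExamples
{
    // Curly braces exclude both bounds:
    // terms equal to "bistro" or "cafe" themselves do not match
    public static List<Company> NamesBetweenExclusive(IDocumentSession session)
    {
        return session.Advanced
            .DocumentQuery<Company>()
            .WhereLucene("Name", "{bistro TO cafe}")
            .ToList();
    }
}
```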

Numeric and date ranges

For range queries over numeric or date fields, prefer strongly typed methods:
use comparison operators such as WhereGreaterThan / WhereLessThan, or WhereBetween.
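A minimal sketch of the typed alternatives follows. The Order class and its Freight field are assumptions made for the example and do not appear elsewhere in this article; substitute your own numeric or date field.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents.Session;

// Hypothetical document class with a numeric field, defined only for this sketch
public class Order
{
    public string Id { get; set; }
    public decimal Freight { get; set; }
}

public static class TypedRangeExamples
{
    // Orders whose Freight is between 10 and 50
    public static List<Order> FreightBetween(IDocumentSession session)
    {
        return session.Advanced
            .DocumentQuery<Order>()
            .WhereBetween("Freight", 10, 50)
            .ToList();
    }

    // Orders whose Freight is greater than 100
    public static List<Order> FreightAbove(IDocumentSession session)
    {
        return session.Advanced
            .DocumentQuery<Order>()
            .WhereGreaterThan("Freight", 100)
            .ToList();
    }
}
```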

A Lucene range clause compares string terms lexicographically,
so numeric and date ranges can produce unexpected results.
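To see why lexicographic comparison surprises with numbers, the following sketch applies an inclusive range check to plain strings. It only illustrates string ordering in C#; it is not how RavenDB or Lucene evaluate a range internally, and the InTermRange helper is hypothetical.

```csharp
using System;

// Hypothetical helper that mimics an inclusive [lower TO upper] check
// using character-by-character (ordinal) string comparison
public static class LexicographicOrderDemo
{
    public static bool InTermRange(string value, string lower, string upper)
    {
        return string.CompareOrdinal(value, lower) >= 0
            && string.CompareOrdinal(value, upper) <= 0;
    }
}
```

Under this ordering, InTermRange("5", "1", "10") is false because the character 5 sorts after the character 1, while InTermRange("100000", "1", "2") is true because the string starts with 1. Neither result matches the numeric intuition.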

Filter on multiple fields

  • In WhereLucene(fieldName, whereClause), the fieldName argument is Lucene's default field:
    terms in the clause that are not prefixed with another field name are checked against this field.

  • Inside the Lucene whereClause, prefix a term with <FieldName>: to search another indexed field.

  • Use this pattern when querying a static index that defines every field referenced in the Lucene clause.
    In a dynamic query, RavenDB creates or chooses an auto-index based only on the fieldName argument;
    it does not infer additional fields from inside the raw Lucene clause.


Assume the following static index exists:

public class Companies_ByNameAndPhone : AbstractIndexCreationTask<Company, Companies_ByNameAndPhone.IndexEntry>
{
    public class IndexEntry
    {
        public string Name { get; set; }
        public string Phone { get; set; }
    }

    public Companies_ByNameAndPhone()
    {
        Map = companies => from company in companies
                           select new IndexEntry
                           {
                               Name = company.Name,
                               Phone = company.Phone
                           };

        // Index both fields for full-text search
        Index(x => x.Name, FieldIndexing.Search);
        Index(x => x.Phone, FieldIndexing.Search);
    }
}

The following query targets both the Name and Phone fields:

  • bistro is not prefixed, so it is checked against the default field, Name.
  • Phone:981* is prefixed with Phone:, so it is checked against the Phone field.

The query matches companies whose Name contains bistro
and whose Phone field contains a term that starts with 981.

List<Company> companies = session.Advanced
    .DocumentQuery<Company>("Companies/ByNameAndPhone")
    .WhereLucene("Name", "bistro AND Phone:981*")
    .ToList();

Case-sensitive matching

To keep query terms unchanged, use exact: true in the client API, or wrap lucene() with exact(...) in RQL.
RavenDB then parses the Lucene clause with Lucene's KeywordAnalyzer.

For case-sensitive matching, the indexed field must also preserve casing.
The example below uses a static index that defines Name as an exact field.

public class Companies_ByNameExact : AbstractIndexCreationTask<Company, Companies_ByNameExact.IndexEntry>
{
    public class IndexEntry
    {
        public string Name { get; set; }
    }

    public Companies_ByNameExact()
    {
        Map = companies => from company in companies
                           select new IndexEntry
                           {
                               Name = company.Name
                           };

        // Index Name as an exact field so the original casing is preserved
        Index(x => x.Name, FieldIndexing.Exact);
    }
}

Query the index with:

// Matches "Bistro...", but not "bistro..." or "BISTRO..."
List<Company> companies = session.Advanced
    .DocumentQuery<Company>("Companies/ByNameExact")
    .WhereLucene("Name", "Bistro*", exact: true)
    .ToList();
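The same query can be sketched in RQL by wrapping lucene() with exact(...), here sent through the session's RawQuery method. The wrapper method and the minimal Company class are assumptions made to keep the example self-contained; the index name is the one defined above.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents.Session;

// Minimal stand-in document class, defined only for this sketch
public class Company
{
    public string Id { get; set; }
    public string Name { get; set; }
}

public static class ExactLuceneRqlExample
{
    // RQL form of WhereLucene("Name", "Bistro*", exact: true)
    // against the Companies/ByNameExact static index
    public static List<Company> NamesStartingWithBistro(IDocumentSession session)
    {
        return session.Advanced
            .RawQuery<Company>(
                "from index 'Companies/ByNameExact' " +
                "where exact(lucene(Name, 'Bistro*'))")
            .ToList();
    }
}
```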

Exact analyzer

FieldIndexing.Exact indexes the field with the default exact analyzer.
The default is KeywordAnalyzer.

WhereLucene(..., exact: true) and exact(lucene(...)) use the KeywordAnalyzer at query time.
If you configure a different exact analyzer for indexing, make sure the query terms can still match the indexed terms.

Combine with other filtering methods

You can combine a Lucene clause with other DocumentQuery filters.
Use AndAlso(), OrElse(), and the Not modifier to compose filters outside the Lucene clause.

The following query matches companies whose Name contains bistro or cafe, and whose country is France:

List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Name", "bistro OR cafe")
    .AndAlso()
    .WhereEquals("Address.Country", "France")
    .ToList();
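OrElse() and Not compose in the same way. The sketch below is one possible arrangement, wrapped in methods so that it is self-contained; the Company and Address classes are minimal stand-ins defined only for the example.

```csharp
using System.Collections.Generic;
using System.Linq;
using Raven.Client.Documents.Session;

// Minimal stand-in document classes, defined only for this sketch
public class Address
{
    public string Country { get; set; }
}

public class Company
{
    public string Id { get; set; }
    public string Name { get; set; }
    public Address Address { get; set; }
}

public static class CombinedFilterExamples
{
    // Name contains bistro or cafe, OR the company is located in France
    public static List<Company> BistroOrFrench(IDocumentSession session)
    {
        return session.Advanced
            .DocumentQuery<Company>()
            .WhereLucene("Name", "bistro OR cafe")
            .OrElse()
            .WhereEquals("Address.Country", "France")
            .ToList();
    }

    // Name contains bistro or cafe, AND the company is NOT located in France
    public static List<Company> BistroOutsideFrance(IDocumentSession session)
    {
        return session.Advanced
            .DocumentQuery<Company>()
            .WhereLucene("Name", "bistro OR cafe")
            .AndAlso()
            .Not
            .WhereEquals("Address.Country", "France")
            .ToList();
    }
}
```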

Escaping special characters

The following characters are part of Lucene query syntax.
To match them as literal characters, escape them with a backslash (\):

+ - && || ! ( ) { } [ ] ^ " ~ * ? : \

For example, the following clause searches for the literal text (171) in the Phone field:

// The C# string "\\(" becomes \( in the Lucene clause
List<Company> companies = session.Advanced
    .DocumentQuery<Company>()
    .WhereLucene("Phone", "\\(171\\)")
    .ToList();

Escape user input

The whereClause is parsed as Lucene query syntax on the server.

If you build the clause from user input, escape or validate the user-provided value before embedding it.
Otherwise, characters such as *, ?, :, or operators such as OR can change the meaning of the query.

For example, a user value that contains OR Name:* could turn a literal search value into a broader Lucene query.
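One way to escape a user-provided value is to prefix each special character from the list above with a backslash before embedding it in the clause. The following is a minimal sketch; the LuceneClauseEscaper class is hypothetical and not part of the RavenDB client. It handles special characters only: operator keywords such as OR, AND, and NOT pass through unchanged, so validate or reject them separately if users must not be able to combine terms.

```csharp
using System;
using System.Text;

// Hypothetical helper for this sketch; not part of the RavenDB client API
public static class LuceneClauseEscaper
{
    // Characters with special meaning in Lucene query syntax.
    // "&&" and "||" are covered by escaping '&' and '|' individually.
    private const string SpecialCharacters = "+-&|!(){}[]^\"~*?:\\";

    // Returns the value with every special character prefixed by a backslash
    public static string Escape(string userValue)
    {
        if (userValue == null)
            throw new ArgumentNullException(nameof(userValue));

        var escaped = new StringBuilder(userValue.Length * 2);
        foreach (char c in userValue)
        {
            if (SpecialCharacters.IndexOf(c) >= 0)
                escaped.Append('\\');

            escaped.Append(c);
        }

        return escaped.ToString();
    }
}
```

With this helper, WhereLucene("Phone", LuceneClauseEscaper.Escape(userInput)) treats a value such as (171) as literal text, and the Name:* part of a value such as OR Name:* is escaped to Name\:\* rather than parsed as a field prefix and wildcard.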

Restrictions

  • Lucene search engine only
    WhereLucene / lucene() is supported only by indexes that use the Lucene search engine.
    Running the query against an index that uses Corax throws an exception.

    The search engine is determined by the Indexing.Auto.SearchEngineType (auto-indexes) and Indexing.Static.SearchEngineType (static indexes) configuration keys. Static indexes can also set the search engine per index.


  • DocumentQuery only
    In the client API, WhereLucene is available in DocumentQuery / AsyncDocumentQuery and has no LINQ (session.Query) equivalent. In raw RQL, use lucene().

  • The clause must be a string or null
    The whereClause argument, or the value passed to lucene() in RQL, must be a string or null.
    Passing any other value type throws an exception.
    A null clause is treated as a null-value query for the specified field.

  • The clause must be valid Lucene syntax
    The clause is parsed by Lucene's query parser when the query executes.
    Invalid Lucene syntax fails the query with a parse error.

  • Use static indexes for multi-field Lucene clauses
    In a dynamic query, RavenDB creates or chooses an auto-index based on the field passed as the first argument.
    Field prefixes inside the raw Lucene clause, such as Phone:981*, are not used to infer additional auto-index fields.
    Use a static index when the Lucene clause references fields other than the first-argument field.

    from Companies
    where lucene(Name, 'bistro AND Phone:981*')

    In this dynamic query, Name is the field RavenDB can infer for the auto-index.
    Phone appears only inside the Lucene clause string, so define a static index that maps both Name and Phone.

Syntax

IDocumentQuery<T> WhereLucene(string fieldName, string whereClause);
IDocumentQuery<T> WhereLucene(string fieldName, string whereClause, bool exact);
Parameter   | Type   | Description
------------|--------|------------
fieldName   | string | Name of the field the clause is checked against (the Lucene default field). Terms inside the clause can target other fields using the <FieldName>: prefix.
whereClause | string | A predicate written in Lucene query syntax.
exact       | bool   | true - parse the clause with KeywordAnalyzer for exact matching. Default: false.
