Full-Text Search with Index

Indexing single field for FTS

The index:

using System.Linq;
using Raven.Client.Documents.Indexes;

public class Employees_ByNotes :
    AbstractIndexCreationTask<Employee, Employees_ByNotes.IndexEntry>
{
    // The IndexEntry class defines the index-fields
    public class IndexEntry
    {
        public string EmployeeNotes { get; set; }
    }

    public Employees_ByNotes()
    {
        // The 'Map' function defines the content of the index-fields
        Map = employees => from employee in employees
                           select new IndexEntry()
                           {
                               EmployeeNotes = employee.Notes[0]
                           };

        // Configure the index-field for FTS:
        // Set 'FieldIndexing.Search' on index-field 'EmployeeNotes'
        Index(x => x.EmployeeNotes, FieldIndexing.Search);

        // Optionally: Set your choice of analyzer for the index-field.
        // Here the text from index-field 'EmployeeNotes' will be tokenized by 'WhitespaceAnalyzer'.
        Analyze(x => x.EmployeeNotes, "WhitespaceAnalyzer");

        // Note:
        // If no analyzer is set then the default 'RavenStandardAnalyzer' is used.
    }
}

Sample query:

  • Use Search to make a full-text search when querying the index.

  • Refer to Full-Text search with dynamic queries for all available Search options,
    such as using wildcards, searching for multiple terms, etc.

List<Employee> employees = session
    // Query the index
    .Query<Employees_ByNotes.IndexEntry, Employees_ByNotes>()
    // Call 'Search':
    // pass the index field that was configured for FTS and the term to search for.
    .Search(x => x.EmployeeNotes, "French")
    .OfType<Employee>()
    .ToList();

// * Results will contain all Employee documents that have 'French'
//   in the indexed entry of their 'Notes' field (the index maps 'Notes[0]').
//
// * Search is case-sensitive since the field was indexed using the 'WhitespaceAnalyzer',
//   which preserves casing.
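
To make the case-sensitivity point concrete, here is a minimal sketch in plain .NET. It does not call RavenDB or Lucene: `AnalyzerCasingSketch`, `WhitespaceTokens`, and `LowerCasedTokens` are hypothetical helpers that only approximate how a whitespace-only analyzer and a lower-casing analyzer might split the same note.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of RavenDB: a rough stand-in for analyzer behavior.
public static class AnalyzerCasingSketch
{
    private static readonly char[] Whitespace = { ' ', '\t', '\r', '\n' };

    // Roughly what a whitespace-only analyzer does: split on whitespace, keep casing.
    public static HashSet<string> WhitespaceTokens(string text) =>
        new HashSet<string>(
            text.Split(Whitespace, StringSplitOptions.RemoveEmptyEntries),
            StringComparer.Ordinal);

    // Roughly what a lower-casing analyzer does: same split, then lower-case each token.
    public static HashSet<string> LowerCasedTokens(string text) =>
        new HashSet<string>(
            WhitespaceTokens(text).Select(token => token.ToLowerInvariant()),
            StringComparer.Ordinal);
}
```

Under this simplification, for the note "Fluent in French and German", `WhitespaceTokens` contains "French" but not "french", so only a query term with the same casing finds it. With `LowerCasedTokens`, the stored token is "french", which is why a lower-casing analyzer gives case-insensitive matching once the query term is lower-cased as well.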

Indexing multiple fields for FTS

The index:

public class Employees_ByEmployeeData :
    AbstractIndexCreationTask<Employee, Employees_ByEmployeeData.IndexEntry>
{
    public class IndexEntry
    {
        public object[] EmployeeData { get; set; }
    }

    public Employees_ByEmployeeData()
    {
        Map = employees => from employee in employees
                           select new IndexEntry()
                           {
                               EmployeeData = new object[]
                               {
                                   // Multiple document-fields can be indexed
                                   // into the single index-field 'EmployeeData'
                                   employee.FirstName,
                                   employee.LastName,
                                   employee.Title,
                                   employee.Notes
                               }
                           };

        // Configure the index-field for FTS:
        // Set 'FieldIndexing.Search' on index-field 'EmployeeData'
        Index(x => x.EmployeeData, FieldIndexing.Search);

        // Note:
        // Since no analyzer is set, the default 'RavenStandardAnalyzer' is used.
    }
}

Sample query:

List<Employee> employees = session
    // Query the static-index
    .Query<Employees_ByEmployeeData.IndexEntry, Employees_ByEmployeeData>()
    // A logical OR is applied between the following two Search calls:
    .Search(x => x.EmployeeData, "Manager")
    // A logical AND is applied between the following two terms:
    .Search(x => x.EmployeeData, "French Spanish", @operator: SearchOperator.And)
    .OfType<Employee>()
    .ToList();

// * Results will contain all Employee documents that have:
//   ('Manager' in any of the 4 document-fields that were indexed)
//   OR
//   ('French' AND 'Spanish' in any of the 4 document-fields that were indexed)
//
// * Search is case-insensitive since the default analyzer is used
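
The OR/AND combination described in the comments above can be sketched in plain .NET as follows. This is an illustration only, not RavenDB code: `SearchOperatorSketch`, `Tokens`, and `Matches` are hypothetical names, and the tokenization is a rough stand-in for the default analyzer (split on whitespace and simple punctuation, then lower-case).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of RavenDB.
public static class SearchOperatorSketch
{
    private static readonly char[] Separators = { ' ', '\t', '\r', '\n', '.', ',', ';' };

    // Collects lower-cased tokens from all values placed in the single index-field.
    public static HashSet<string> Tokens(params string[] fieldValues) =>
        new HashSet<string>(
            fieldValues
                .SelectMany(value => value.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
                .Select(token => token.ToLowerInvariant()),
            StringComparer.Ordinal);

    // 'Manager' OR ('French' AND 'Spanish'), evaluated on the lower-cased tokens.
    public static bool Matches(HashSet<string> tokens)
    {
        bool hasManager = tokens.Contains("manager");
        bool hasFrenchAndSpanish = tokens.Contains("french") && tokens.Contains("spanish");
        return hasManager || hasFrenchAndSpanish;
    }
}
```

Because all four document-fields feed one index-field, the two AND-ed terms do not have to come from the same document-field in this sketch: 'French' in one value and 'Spanish' in another still satisfies the second clause.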

Indexing all fields for FTS (using AsJson)

  • To search across ALL fields in a document without defining each one explicitly, use the AsJson method in the Map function to extract all property values and index them in a single searchable field.

  • This approach makes the index robust to changes in the document schema.
    Calling .Select(x => x.Value) on the result of AsJson(...) makes the index include the values of ALL properties, both existing and newly added, so there is no need to update the index when the document structure changes.

  • This indexing method is supported only when using Lucene as the indexing engine.
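
The idea behind extracting every property value can be illustrated outside RavenDB with System.Text.Json. The sketch below is an analogy under stated assumptions, not how AsJson is implemented: `AllValuesSketch` and `AllValues` are hypothetical names, and only top-level property values are collected, each rendered as text.

```csharp
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical helper, not part of RavenDB.
public static class AllValuesSketch
{
    // Serializes the document to JSON and returns the value of every
    // top-level property as text, whatever the properties are called.
    public static List<string> AllValues<T>(T document)
    {
        using JsonDocument json = JsonDocument.Parse(JsonSerializer.Serialize(document));
        return json.RootElement
            .EnumerateObject()
            .Select(property => property.Value.ToString())
            .ToList(); // materialize before 'json' is disposed
    }
}
```

Since the helper never names a property, a document type that later gains a new property yields one more value without any change to the helper, which mirrors why the index definition does not need updating.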

The index:

public class Products_ByAllValues :
    AbstractIndexCreationTask<Product, Products_ByAllValues.IndexEntry>
{
    public class IndexEntry
    {
        // This index field will contain all values from all properties in the document
        public string AllValues { get; set; }

        // Note:
        // RavenDB seamlessly supports multi-value indexing on this field.
        // Even though the 'AllValues' index-field is declared as a 'string',
        // it can accept a collection of values, as defined in the Map function.
        // The engine treats the field as if it contains multiple strings
        // and indexes each one individually.
    }

    public Products_ByAllValues()
    {
        Map = products => from product in products
                          select new
                          {
                              // Use the 'AsJson' method to convert the document into a JSON-like structure
                              // and call 'Select' to extract only the values of each property
                              AllValues = AsJson(product).Select(x => x.Value)
                          };

        // Configure the index-field for FTS:
        // Set 'FieldIndexing.Search' on index-field 'AllValues'
        Index(x => x.AllValues, FieldIndexing.Search);

        // Note:
        // Since no analyzer is set, the default 'RavenStandardAnalyzer' is used.

        // Set the search engine type to Lucene:
        SearchEngineType = Raven.Client.Documents.Indexes.SearchEngineType.Lucene;
    }
}

Sample query:

List<Product> products = session
    .Query<Products_ByAllValues.IndexEntry, Products_ByAllValues>()
    .Search(x => x.AllValues, "tofu")
    .OfType<Product>()
    .ToList();

// * Results will contain all Product documents that have 'tofu'
//   in ANY of their fields.
//
// * Search is case-insensitive since the default analyzer is used.

Boosting search results

  • To prioritize results, you can assign a boost value to the searched terms.
    A boost can be applied in either of the following ways:

    • Add a boost value to the relevant index-field inside the index definition.
      Refer to the article indexes - boosting.

    • Add a boost value to the queried terms at query time.
      Refer to the article Boost search results.

Searching with wildcards

When using RavenStandardAnalyzer, StandardAnalyzer, or NGramAnalyzer:

Usually, the same analyzer used to tokenize field content at indexing time is also used to process the terms provided in the full-text search query before they are sent to the search engine to retrieve matching documents.

However, when the index-field uses one of the analyzers listed above, the queried terms in the Search method are processed with the LowerCaseKeywordAnalyzer before being sent to the search engine.

This analyzer does not remove the *, so the terms are sent with the *, exactly as provided in the search terms.
For example:

public class Employees_ByNotes_usingDefaultAnalyzer :
    AbstractIndexCreationTask<Employee, Employees_ByNotes_usingDefaultAnalyzer.IndexEntry>
{
    public class IndexEntry
    {
        public string EmployeeNotes { get; set; }
    }

    public Employees_ByNotes_usingDefaultAnalyzer()
    {
        Map = employees => from employee in employees
                           select new IndexEntry()
                           {
                               EmployeeNotes = employee.Notes[0]
                           };

        // Configure the index-field for FTS:
        Index(x => x.EmployeeNotes, FieldIndexing.Search);

        // Since no analyzer is explicitly set
        // then the default 'RavenStandardAnalyzer' will be used at indexing time.

        // However, when making a search query with wildcards,
        // the 'LowerCaseKeywordAnalyzer' will be used to process the search terms
        // prior to sending them to the search engine.
    }
}
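
The difference between the two kinds of term processing can be sketched in plain .NET. This is a simplification and not the Lucene analyzers themselves: `WildcardTermSketch`, `LowerCaseKeyword`, and `TokenizeAndLowerCase` are hypothetical helpers, and the second one assumes that a tokenizing analyzer treats `*` as a non-word character and drops it.

```csharp
using System;
using System.Linq;

// Hypothetical helper, not part of RavenDB or Lucene.
public static class WildcardTermSketch
{
    // Approximates keyword-style processing with lower-casing:
    // the term stays one token and '*' is left in place.
    public static string LowerCaseKeyword(string term) => term.ToLowerInvariant();

    // Approximates a tokenizing analyzer that keeps only letters and digits,
    // so '*' is dropped (assumption made for illustration).
    public static string TokenizeAndLowerCase(string term) =>
        new string(term.Where(c => char.IsLetterOrDigit(c)).ToArray()).ToLowerInvariant();
}
```

Under these assumptions, the search term "Fren*" becomes "fren*" with keyword-style processing, so the wildcard reaches the search engine, whereas the tokenizing variant would reduce it to "fren" and the wildcard would be lost.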

When using a custom analyzer:

  • When a custom analyzer is set in your index to tokenize field content, the search terms in a query on that index are processed according to the custom analyzer's logic.

  • The * will remain in the terms if the custom analyzer allows it. It is the user’s responsibility to ensure that wildcards are not removed by the custom analyzer if they should be included in the query.

  • Note:
    An exception to the above is when the wildcard is used as a suffix in the search term (e.g. Fren*).
    In this case the wildcard will be included in the query regardless of the analyzer's logic.

For example:

public class Employees_ByNotes_usingCustomAnalyzer :
    AbstractIndexCreationTask<Employee, Employees_ByNotes_usingCustomAnalyzer.IndexEntry>
{
    public class IndexEntry
    {
        public string EmployeeNotes { get; set; }
    }

    public Employees_ByNotes_usingCustomAnalyzer()
    {
        Map = employees => from employee in employees
                           select new IndexEntry()
                           {
                               EmployeeNotes = employee.Notes[0]
                           };

        // Configure the index-field for FTS:
        Index(x => x.EmployeeNotes, FieldIndexing.Search);

        // Set a custom analyzer for the index-field:
        Analyze(x => x.EmployeeNotes, "CustomAnalyzers.RemoveWildcardsAnalyzer");
    }
}
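
The outcome described in the bullets above, including the suffix-wildcard exception, can be sketched in plain .NET. Only the observable result is taken from the text; the mechanics shown here are an assumption made for illustration. `CustomAnalyzerWildcardSketch`, `RemoveWildcards`, and `ProcessQueryTerm` are hypothetical names, and `RemoveWildcards` merely stands in for a custom analyzer that strips `*`.

```csharp
using System;

// Hypothetical helper, not part of RavenDB.
public static class CustomAnalyzerWildcardSketch
{
    // Stand-in for a custom analyzer that strips wildcards and lower-cases the term.
    public static string RemoveWildcards(string term) =>
        term.Replace("*", string.Empty).ToLowerInvariant();

    // Applies the analyzer, then restores a trailing '*' if the original term
    // had one and the analyzer removed it (assumed mechanics, for illustration).
    public static string ProcessQueryTerm(string term, Func<string, string> analyze)
    {
        string analyzed = analyze(term);
        bool hadSuffixWildcard = term.EndsWith("*", StringComparison.Ordinal);
        bool lostSuffixWildcard = !analyzed.EndsWith("*", StringComparison.Ordinal);
        return hadSuffixWildcard && lostSuffixWildcard ? analyzed + "*" : analyzed;
    }
}
```

In this sketch, "*rench" is reduced to "rench" because the analyzer removes the leading wildcard, while "Fren*" ends up as "fren*" because a suffix wildcard is kept regardless of the analyzer's logic.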

When using the Exact analyzer:

When your index uses the default Exact analyzer (which is KeywordAnalyzer),
wildcards in the search terms of a query on that index remain untouched.
The terms are sent to the search engine exactly as produced by the analyzer.

For example:

public class Employees_ByFirstName_usingExactAnalyzer :
    AbstractIndexCreationTask<Employee, Employees_ByFirstName_usingExactAnalyzer.IndexEntry>
{
    public class IndexEntry
    {
        public string FirstName { get; set; }
    }

    public Employees_ByFirstName_usingExactAnalyzer()
    {
        Map = employees => from employee in employees
                           select new IndexEntry()
                           {
                               FirstName = employee.FirstName
                           };

        // Set the Exact analyzer for the index-field:
        // (The field will not be tokenized)
        Indexes.Add(x => x.FirstName, FieldIndexing.Exact);
    }
}
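
As a rough picture of what untouched wildcards mean for an untokenized field, the following plain .NET sketch matches a wildcard pattern against a whole, case-preserved field value. It is an approximation and not the search engine's matching code: `ExactWildcardSketch`, `ExactTerm`, and `WildcardMatch` are hypothetical names, and `*` is assumed to stand for any run of characters.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of RavenDB or Lucene.
public static class ExactWildcardSketch
{
    // With keyword-style analysis the whole field value is one term, unchanged.
    public static string ExactTerm(string fieldValue) => fieldValue;

    // Matches a term against a pattern where '*' stands for any run of characters.
    // Casing is significant because neither side is lower-cased.
    public static bool WildcardMatch(string term, string pattern)
    {
        string regex = "^" + Regex.Escape(pattern).Replace("\\*", ".*") + "$";
        return Regex.IsMatch(term, regex);
    }
}
```

In this simplification, the stored term "Michael" matches the patterns "Mich*" and "*ael", but not "mich*", since the value is neither tokenized nor lower-cased.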