Sharding: Indexing
-
Indexes in a sharded database are defined and deployed the same way as in a non-sharded database,
using the same syntax and the same client API. -
Most indexing features available in a non-sharded database are also available in a sharded database.
Unsupported features are listed below. -
In this article:
Indexing in a sharded database
-
The same index definition is deployed across the database to all shards.
However, each shard indexes only its own local data - there is no cross-shard indexing process.
Each shard executes the index definition independently on the documents it stores locally. -
As a result, each shard maintains its own local index entries for the data stored on that shard.
There is no indexing stage that reads documents from multiple shards and builds a single shared index. -
Querying a sharded index is coordinated by the orchestrator, which combines results from all shards.
The orchestrator is a RavenDB server that mediates all communication between the client and the database shards.
Learn more in Client-server communication.
Map-Reduce indexes in a sharded database
Map-reduce indexes in a sharded database work in two stages:
- At indexing time:
During indexing, each shard maps and reduces only the documents it stores locally,
just as a non-sharded database reduces its local data. - At query time:
When a query uses a map-reduce index, the orchestrator distributes the query to the shards,
gathers the partial reduce results returned from each shard, and reduces them to produce the final query result.
The data retrieved from the shards depends on the query shape.
See order by and limit in a Map-Reduce query for details.
Learn more about querying map-reduce indexes in a sharded database in Sharding: querying.
Unsupported indexing features
Unsupported or not-yet-implemented indexing features include:
-
Custom sorters:
Custom sorters are not supported in a sharded database. -
Rolling index deployment:
Rolling index deployment is not supported in a sharded database. -
Outputting Map-Reduce results to a collection:
Outputting map-reduce index results to an artificial documents collection is not supported in a sharded database. -
Loading a document from another shard:
Loading a document during indexing is possible only if the document resides on the same shard where the index is running. If the requested document is stored on a different shard,LoadDocumentwill returnnull.For example, consider the following index, which attempts to load a related Category document.
To ensure that all documents are properly indexed - including those whose related document resides on another shard - handle this null case explicitly in your index definition, as shown below:public class Products_ByCategoryName :AbstractIndexCreationTask<Product, Products_ByCategoryName.IndexEntry>{public class IndexEntry{public string CategoryName { get; set; }}public Products_ByCategoryName(){Map = products =>from product in products// In a sharded database, LoadDocument returns null// if the related document resides on a different shard.let category = LoadDocument<Category>(product.Category)select new IndexEntry{// Handle the null case explicitly:CategoryName = category != null ? category.Name : null};}}Why the explicit null check matters:
Without the explicit null check (e.g., assigning
category.Namedirectly toCategoryName),
RavenDB treats the resulting null as an implicit null and omits the field entirely from the index entry.
Products whose category resides on another shard would then be missing theCategoryNamefield in the index,
making them invisible to queries that filter on this field (includingwhere CategoryName == null).Using
category != null ? category.Name : nullstores an explicit null in the index entry,
keeping those products queryable.Storing documents in the same shard:
You can make sure related documents are stored in the same bucket, and therefore on the same shard,
by using the$syntax. Learn more in Anchoring documents to a bucket.