Index Attachments
- Indexing attachments allows you to query for documents based on their attachments' details and content.
- Static indexes:
  Both attachments' details and content can be indexed within a static-index definition.
- Auto-indexes:
  Auto-indexing attachments via dynamic queries is not available at this time.
Index attachments details
The index:
- To index attachments' details, call AttachmentsFor() within the index definition.
- AttachmentsFor() provides access to the name, size, hash, and content-type of each attachment a document has. These details can then be used when defining the index-fields. Once the index is deployed, you can query the index to find Employee documents based on these attachment properties.
- To index attachments' content, see the examples below.
The index is shown below first as a LINQ index and then as a JavaScript index.
public class Employees_ByAttachmentDetails :
    AbstractIndexCreationTask<Employee, Employees_ByAttachmentDetails.IndexEntry>
{
    public class IndexEntry
    {
        public string EmployeeName { get; set; }
        public string[] AttachmentNames { get; set; }
        public string[] AttachmentContentTypes { get; set; }
        public long[] AttachmentSizes { get; set; }
    }

    public Employees_ByAttachmentDetails()
    {
        Map = employees => from employee in employees
            // Call 'AttachmentsFor' to get attachments details
            let attachments = AttachmentsFor(employee)
            select new IndexEntry()
            {
                // Can index info from document properties:
                EmployeeName = employee.FirstName + " " + employee.LastName,
                // Index DETAILS of attachments:
                AttachmentNames = attachments.Select(x => x.Name).ToArray(),
                AttachmentContentTypes = attachments.Select(x => x.ContentType).ToArray(),
                AttachmentSizes = attachments.Select(x => x.Size).ToArray()
            };
    }
}
public class Employees_ByAttachmentDetails_JS : AbstractJavaScriptIndexCreationTask
{
    public Employees_ByAttachmentDetails_JS()
    {
        Maps = new HashSet<string>
        {
            @"map('Employees', function (employee) {
                var attachments = attachmentsFor(employee);
                return {
                    EmployeeName: employee.FirstName + ' ' + employee.LastName,
                    AttachmentNames: attachments.map(function(attachment) { return attachment.Name; }),
                    AttachmentContentTypes: attachments.map(function(attachment) { return attachment.ContentType; }),
                    AttachmentSizes: attachments.map(function(attachment) { return attachment.Size; })
                };
            })"
        };
    }
}
Query the Index:
You can now query for Employee documents based on their attachments' details.
The same query is shown below using Query, async Query, DocumentQuery, and RQL, in that order.
List<Employee> employees = session
    // Query the index for matching employees
    .Query<Employees_ByAttachmentDetails.IndexEntry, Employees_ByAttachmentDetails>()
    // Filter employee results by their attachments details
    .Where(x => x.AttachmentNames.Contains("photo.jpg"))
    .Where(x => x.AttachmentSizes.Any(size => size > 20_000))
    // Return matching Employee docs
    .OfType<Employee>()
    .ToList();

// Results:
// ========
// Running this query on the Northwind sample data,
// results will include 'employees/4-A' and 'employees/5-A'.
// These 2 documents contain an attachment by name 'photo.jpg' with a matching size.
List<Employee> employees = await asyncSession
    .Query<Employees_ByAttachmentDetails.IndexEntry, Employees_ByAttachmentDetails>()
    .Where(x => x.AttachmentNames.Contains("photo.jpg"))
    .Where(x => x.AttachmentSizes.Any(size => size > 20_000))
    .OfType<Employee>()
    .ToListAsync();
List<Employee> employees = session.Advanced
    .DocumentQuery<Employees_ByAttachmentDetails.IndexEntry, Employees_ByAttachmentDetails>()
    .WhereEquals("AttachmentNames", "photo.jpg")
    .WhereGreaterThan("AttachmentSizes", 20_000)
    .OfType<Employee>()
    .ToList();
from index "Employees/ByAttachmentDetails"
where AttachmentNames == "photo.jpg" and AttachmentSizes > 20000
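Because this index stores attachment names and sizes as two separate multi-value fields, the name filter and the size filter are expected to be evaluated independently of each other. The sketch below is a plain in-memory LINQ analogy, not RavenDB code: the anonymous entries, the document IDs, and the sample values are all made up for illustration. It applies the same two predicates as the query above and shows that an entry can match even when no single attachment satisfies both conditions.

```csharp
using System;
using System.Linq;

// Hypothetical in-memory stand-ins for index entries (not RavenDB types).
var entries = new[]
{
    // A single attachment that is named 'photo.jpg' AND is larger than 20,000 bytes.
    new { Id = "employees/A", AttachmentNames = new[] { "photo.jpg" }, AttachmentSizes = new long[] { 25_000 } },
    // 'photo.jpg' is small here; a different attachment is the large one.
    new { Id = "employees/B", AttachmentNames = new[] { "photo.jpg", "scan.pdf" }, AttachmentSizes = new long[] { 5_000, 90_000 } },
    // A large attachment, but none named 'photo.jpg'.
    new { Id = "employees/C", AttachmentNames = new[] { "notes.txt" }, AttachmentSizes = new long[] { 30_000 } },
};

// The same two predicates as in the query above, applied in memory.
var matches = entries
    .Where(x => x.AttachmentNames.Contains("photo.jpg"))
    .Where(x => x.AttachmentSizes.Any(size => size > 20_000))
    .Select(x => x.Id)
    .ToList();

// 'employees/B' matches although its 'photo.jpg' is only 5,000 bytes.
Console.WriteLine(string.Join(", ", matches));
```

If the name and the size must belong to the same attachment, an index that emits one entry per attachment, such as the Fanout index in the "all attachments" section of this page, keeps those details together.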
Index details & content - by attachment name
Sample data:
- Each Employee document in RavenDB's sample data already includes a photo.jpg attachment.
- For all following examples, let's store a textual attachment (file notes.txt) on 3 documents in the 'Employees' collection.
// Create some sample attachments:
for (var i = 1; i <= 3; i++)
{
    var id = $"employees/{i}-A";

    // Load an employee document:
    var employee = session.Load<Employee>(id);
    if (employee?.Notes == null || employee.Notes.Count == 0)
        continue;

    // Store the employee's notes as an attachment on the document:
    byte[] bytes = System.Text.Encoding.UTF8.GetBytes(employee.Notes[0]);
    using (var stream = new MemoryStream(bytes))
    {
        session.Advanced.Attachments.Store(id, "notes.txt", stream, "text/plain");

        // Save while the stream is still open
        session.SaveChanges();
    }
}
The index:
- To index the details & content for a specific attachment, call LoadAttachment() within the index definition.
- In addition to accessing the attachment details, LoadAttachment() provides access to the attachment's content, which can be used when defining the index-fields.

The index is shown below first as a LINQ index and then as a JavaScript index.
public class Employees_ByAttachment :
    AbstractIndexCreationTask<Employee, Employees_ByAttachment.IndexEntry>
{
    public class IndexEntry
    {
        public string AttachmentName { get; set; }
        public string AttachmentContentType { get; set; }
        public long AttachmentSize { get; set; }
        public string AttachmentContent { get; set; }
    }

    public Employees_ByAttachment()
    {
        Map = employees =>
            from employee in employees
            // Call 'LoadAttachment' to get the attachment's details and content;
            // pass the attachment name, e.g. "notes.txt"
            let attachment = LoadAttachment(employee, "notes.txt")
            select new IndexEntry()
            {
                // Index DETAILS of attachment:
                AttachmentName = attachment.Name,
                AttachmentContentType = attachment.ContentType,
                AttachmentSize = attachment.Size,
                // Index CONTENT of attachment:
                // Call 'GetContentAsString' to access content
                AttachmentContent = attachment.GetContentAsString()
            };

        // It can be useful to configure Full-Text search on the attachment content index-field
        Index(x => x.AttachmentContent, FieldIndexing.Search);

        // Documents with an attachment named 'notes.txt' will be indexed,
        // allowing you to query them by either the attachment's details or its content.
    }
}
public class Employees_ByAttachment_JS : AbstractJavaScriptIndexCreationTask
{
    public Employees_ByAttachment_JS()
    {
        Maps = new HashSet<string>
        {
            @"map('Employees', function (employee) {
                var attachment = loadAttachment(employee, 'notes.txt');
                return {
                    AttachmentName: attachment.Name,
                    AttachmentContentType: attachment.ContentType,
                    AttachmentSize: attachment.Size,
                    AttachmentContent: attachment.getContentAsString()
                };
            })"
        };

        Fields = new Dictionary<string, IndexFieldOptions>
        {
            {
                "AttachmentContent", new IndexFieldOptions
                {
                    Indexing = FieldIndexing.Search
                }
            }
        };
    }
}
Query the Index:
You can now query for Employee documents based on the attachment's details and/or its content.
The same query is shown below using Query, async Query, DocumentQuery, and RQL, in that order.
List<Employee> employees = session
    // Query the index for matching employees
    .Query<Employees_ByAttachment.IndexEntry, Employees_ByAttachment>()
    // Can make a full-text search
    // Looking for employees with an attachment content that contains 'Colorado' OR 'Dallas'
    .Search(x => x.AttachmentContent, "Colorado Dallas")
    .OfType<Employee>()
    .ToList();

// Results:
// ========
// Results will include 'employees/1-A' and 'employees/2-A'.
// Only these 2 documents have an attachment by name 'notes.txt'
// that contains either 'Colorado' or 'Dallas'.
List<Employee> employees = await asyncSession
    // Query the index for matching employees
    .Query<Employees_ByAttachment.IndexEntry, Employees_ByAttachment>()
    // Can make a full-text search
    // Looking for employees with an attachment content that contains 'Colorado' OR 'Dallas'
    .Search(x => x.AttachmentContent, "Colorado Dallas")
    .OfType<Employee>()
    .ToListAsync();
List<Employee> employees = session.Advanced
    .DocumentQuery<Employees_ByAttachment.IndexEntry, Employees_ByAttachment>()
    .Search(x => x.AttachmentContent, "Colorado Dallas")
    .OfType<Employee>()
    .ToList();
from index "Employees/ByAttachment"
where search(AttachmentContent, "Colorado Dallas")
Index details & content - all attachments
The index:
- Use LoadAttachments() to index the details & content of ALL attachments.
- Note that the index below uses the Fanout index pattern: it generates an index-entry for each attachment of each document.

The index is shown below first as a LINQ index and then as a JavaScript index.
public class Employees_ByAllAttachments :
    AbstractIndexCreationTask<Employee, Employees_ByAllAttachments.IndexEntry>
{
    public class IndexEntry
    {
        public string AttachmentName { get; set; }
        public string AttachmentContentType { get; set; }
        public long AttachmentSize { get; set; }
        public string AttachmentContent { get; set; }
    }

    public Employees_ByAllAttachments()
    {
        Map = employees =>
            // Call 'LoadAttachments' to get details and content for ALL attachments
            from employee in employees
            from attachment in LoadAttachments(employee)
            // This will be a Fanout index -
            // the index will generate an index-entry for each attachment per document
            select new IndexEntry
            {
                // Index DETAILS of attachment:
                AttachmentName = attachment.Name,
                AttachmentContentType = attachment.ContentType,
                AttachmentSize = attachment.Size,
                // Index CONTENT of attachment:
                // Call 'GetContentAsString' to access content
                AttachmentContent = attachment.GetContentAsString()
            };

        // It can be useful to configure Full-Text search on the attachment content index-field
        Index(x => x.AttachmentContent, FieldIndexing.Search);
    }
}
public class Employees_ByAllAttachments_JS : AbstractJavaScriptIndexCreationTask
{
    public Employees_ByAllAttachments_JS()
    {
        Maps = new HashSet<string>
        {
            // Field names use the same casing as the LINQ index,
            // so the same queries can be applied to either index
            @"map('Employees', function (employee) {
                const allAttachments = loadAttachments(employee);
                return allAttachments.map(function (attachment) {
                    return {
                        AttachmentName: attachment.Name,
                        AttachmentContentType: attachment.ContentType,
                        AttachmentSize: attachment.Size,
                        AttachmentContent: attachment.getContentAsString()
                    };
                });
            })"
        };

        Fields = new Dictionary<string, IndexFieldOptions>
        {
            {
                "AttachmentContent", new IndexFieldOptions
                {
                    Indexing = FieldIndexing.Search
                }
            }
        };
    }
}
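To make the Fanout pattern concrete, here is a minimal in-memory sketch in plain LINQ. It does not use RavenDB at all: the dictionary of document IDs and attachment names is a made-up stand-in for documents and their attachments. The nested 'from' clauses have the same shape as the map function above and produce one entry per (document, attachment) pair.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory data (not RavenDB types): document ID -> attachment names.
var attachmentsByDocument = new Dictionary<string, string[]>
{
    ["employees/1-A"] = new[] { "photo.jpg", "notes.txt" },
    ["employees/2-A"] = new[] { "photo.jpg" },
};

// Nested 'from' clauses: every attachment of every document yields its own entry.
var indexEntries =
    (from doc in attachmentsByDocument
     from attachmentName in doc.Value
     select (DocumentId: doc.Key, AttachmentName: attachmentName))
    .ToList();

foreach (var entry in indexEntries)
    Console.WriteLine($"{entry.DocumentId} -> {entry.AttachmentName}");

// Two documents with three attachments in total give three entries.
Console.WriteLine(indexEntries.Count);
```

The number of entries grows with the number of attachments per document, which is worth keeping in mind when documents carry many attachments.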
Query the Index:
The same query is shown below using Query, async Query, DocumentQuery, and RQL, in that order.
// Query the index for matching employees
List<Employee> employees = session
    .Query<Employees_ByAllAttachments.IndexEntry, Employees_ByAllAttachments>()
    // Filter employee results by their attachments details and content:
    // Using 'SearchOptions.Or' combines the full-text search on 'AttachmentContent'
    // with the following 'Where' condition using OR logic.
    .Search(x => x.AttachmentContent, "Colorado Dallas", options: SearchOptions.Or)
    .Where(x => x.AttachmentSize > 20_000)
    .OfType<Employee>()
    .ToList();

// Results:
// ========
// Results will include:
// 'employees/1-A' and 'employees/2-A' that match the content criteria
// 'employees/4-A' and 'employees/5-A' that match the size criteria
List<Employee> employees = await asyncSession
    .Query<Employees_ByAllAttachments.IndexEntry, Employees_ByAllAttachments>()
    .Search(x => x.AttachmentContent, "Colorado Dallas", options: SearchOptions.Or)
    .Where(x => x.AttachmentSize > 20_000)
    .OfType<Employee>()
    .ToListAsync();
List<Employee> employees = session
    .Advanced
    .DocumentQuery<Employees_ByAllAttachments.IndexEntry, Employees_ByAllAttachments>()
    .Search(x => x.AttachmentContent, "Colorado Dallas")
    .OrElse()
    .WhereGreaterThan(x => x.AttachmentSize, 20_000)
    .OfType<Employee>()
    .ToList();
from index "Employees/ByAllAttachments"
where search(AttachmentContent, "Colorado Dallas") or AttachmentSize > 20000
Leveraging indexed attachments
- Access to the indexed attachment content opens the door to many different applications, including many that can be integrated directly into RavenDB.
- This blog post demonstrates how image recognition can be applied to indexed attachments using the additional sources feature. The resulting index allows filtering and querying based on image content.
Syntax
AttachmentsFor
// Returns a list of attachment details for the specified document.
IEnumerable<AttachmentName> AttachmentsFor(object document);
Parameter | Type | Description |
---|---|---|
document | object | The document object whose attachment details you want to load. |
// AttachmentsFor returns a list containing the following attachment details object:
public class AttachmentName
{
    public string Name;
    public string Hash;
    public string ContentType;
    public long Size;
}
LoadAttachment
// Returns the details and content of the named attachment for the specified document.
public IAttachmentObject LoadAttachment(object doc, string name);
Parameter | Type | Description |
---|---|---|
doc | object | The document whose attachment you want to load. |
name | string | The name of the attachment to load. |
public interface IAttachmentObject
{
    public string Name { get; }
    public string Hash { get; }
    public string ContentType { get; }
    public long Size { get; }

    public string GetContentAsString();
    public string GetContentAsString(Encoding encoding);
    public Stream GetContentAsStream();
}
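The GetContentAsString(Encoding encoding) overload is relevant when an attachment was not stored as UTF-8 text. The standalone sketch below does not call RavenDB; it only uses plain .NET APIs to show that the same bytes decode differently depending on the encoding. The sample text is made up, and the choice of UTF-16 as the stored encoding is an assumption for illustration.

```csharp
using System;
using System.Text;

// Made-up sample text, stored as UTF-16 bytes (Encoding.Unicode).
string original = "Zürich office";
byte[] storedBytes = Encoding.Unicode.GetBytes(original);

// Decoding with the matching encoding restores the text.
string decodedCorrectly = Encoding.Unicode.GetString(storedBytes);

// Decoding the same bytes as UTF-8 yields a different, garbled string.
string decodedAsUtf8 = Encoding.UTF8.GetString(storedBytes);

Console.WriteLine(decodedCorrectly == original); // True
Console.WriteLine(decodedAsUtf8 == original);    // False
```

In an index definition, the corresponding step would be to pass the encoding the attachment was written with, for example attachment.GetContentAsString(Encoding.Unicode) for the UTF-16 case sketched here. Which encoding the parameterless overload assumes is not stated on this page, so it is worth checking before relying on it for non-UTF-8 content.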
LoadAttachments
// Returns a list of all attachments for the specified document.
public IEnumerable<IAttachmentObject> LoadAttachments(object doc);