Index Attachments

Index attachments details

The index:

  • To index attachments' details, call AttachmentsFor() within the index definition.

  • AttachmentsFor() provides access to the name, size, hash, and content-type of each attachment a document has. These details can then be used when defining the index-fields. Once the index is deployed, you can query the index to find Employee documents based on these attachment properties.

  • To index attachments' content, see the examples below.

public class Employees_ByAttachmentDetails :
    AbstractIndexCreationTask<Employee, Employees_ByAttachmentDetails.IndexEntry>
{
    public class IndexEntry
    {
        public string EmployeeName { get; set; }

        public string[] AttachmentNames { get; set; }
        public string[] AttachmentContentTypes { get; set; }
        public long[] AttachmentSizes { get; set; }
    }

    public Employees_ByAttachmentDetails()
    {
        Map = employees => from employee in employees

            // Call 'AttachmentsFor' to get the attachments' details
            let attachments = AttachmentsFor(employee)

            select new IndexEntry()
            {
                // Can index info from document properties:
                EmployeeName = employee.FirstName + " " + employee.LastName,

                // Index DETAILS of attachments:
                AttachmentNames = attachments.Select(x => x.Name).ToArray(),
                AttachmentContentTypes = attachments.Select(x => x.ContentType).ToArray(),
                AttachmentSizes = attachments.Select(x => x.Size).ToArray()
            };
    }
}

Query the Index:

You can now query for Employee documents based on their attachments' details.

List<Employee> employees = session
    // Query the index for matching employees
    .Query<Employees_ByAttachmentDetails.IndexEntry, Employees_ByAttachmentDetails>()
    // Filter employee results by their attachments' details
    .Where(x => x.AttachmentNames.Contains("photo.jpg"))
    .Where(x => x.AttachmentSizes.Any(size => size > 20_000))
    // Return matching Employee docs
    .OfType<Employee>()
    .ToList();

// Results:
// ========
// Running this query on the Northwind sample data,
// results will include 'employees/4-A' and 'employees/5-A'.
// These 2 documents contain an attachment by name 'photo.jpg' with a matching size.
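
Because AttachmentNames and AttachmentSizes are separate multi-valued fields, the two Where clauses are each checked against all values of their own field. The sketch below is only an in-memory analogy using plain LINQ to Objects, not RavenDB itself; the Entry type, the MultiValueFilterSketch class, and the sample values are made up for illustration. It shows that the name condition and the size condition can be satisfied by different attachments of the same document, which is assumed to mirror how the index query above behaves.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for an index entry; not a RavenDB type.
public class Entry
{
    public string Id { get; set; }
    public string[] AttachmentNames { get; set; }
    public long[] AttachmentSizes { get; set; }
}

public static class MultiValueFilterSketch
{
    // Applies the same two filters as the query above, using LINQ to Objects.
    public static List<string> MatchingIds(IEnumerable<Entry> entries)
    {
        return entries
            .Where(x => x.AttachmentNames.Contains("photo.jpg"))
            .Where(x => x.AttachmentSizes.Any(size => size > 20_000))
            .Select(x => x.Id)
            .ToList();
    }

    // Made-up sample data for the sketch.
    public static List<Entry> SampleEntries()
    {
        return new List<Entry>
        {
            // A small photo plus a large, unrelated attachment:
            new Entry
            {
                Id = "a",
                AttachmentNames = new[] { "photo.jpg", "scan.pdf" },
                AttachmentSizes = new long[] { 5_000, 90_000 }
            },
            // A large photo:
            new Entry
            {
                Id = "b",
                AttachmentNames = new[] { "photo.jpg" },
                AttachmentSizes = new long[] { 30_000 }
            },
            // A small photo only:
            new Entry
            {
                Id = "c",
                AttachmentNames = new[] { "photo.jpg" },
                AttachmentSizes = new long[] { 10_000 }
            }
        };
    }
}
```

In this sketch, entry "a" passes both filters even though its photo.jpg is small, because the size condition is met by scan.pdf. If both conditions must hold for the same attachment, an index that produces one entry per attachment (such as Employees_ByAllAttachments in this article) is likely the better fit.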

Index details & content - by attachment name

Sample data:

  • Each Employee document in RavenDB's sample data already includes a photo.jpg attachment.

  • For all following examples, let's store a textual attachment (file notes.txt) on 3 documents in the 'Employees' collection.

// Create some sample attachments:
for (var i = 1; i <= 3; i++)
{
    var id = $"employees/{i}-A";

    // Load an employee document:
    var employee = session.Load<Employee>(id);
    if (employee?.Notes == null || employee.Notes.Count == 0)
        continue;

    // Store the employee's notes as an attachment on the document:
    byte[] bytes = System.Text.Encoding.UTF8.GetBytes(employee.Notes[0]);
    using (var stream = new MemoryStream(bytes))
    {
        session.Advanced.Attachments.Store(
            id,
            "notes.txt", stream,
            "text/plain");

        // Save inside the 'using' block, while the stream is still open
        session.SaveChanges();
    }
}

The index:

  • To index the details & content for a specific attachment, call LoadAttachment() within the index definition.

  • In addition to accessing the attachment details, LoadAttachment() provides access to the attachment's content, which can be used when defining the index-fields.

public class Employees_ByAttachment :
    AbstractIndexCreationTask<Employee, Employees_ByAttachment.IndexEntry>
{
    public class IndexEntry
    {
        public string AttachmentName { get; set; }
        public string AttachmentContentType { get; set; }
        public long AttachmentSize { get; set; }

        public string AttachmentContent { get; set; }
    }

    public Employees_ByAttachment()
    {
        Map = employees =>
            from employee in employees

            // Call 'LoadAttachment' to get the attachment's details and content;
            // pass the attachment name, e.g. "notes.txt"
            let attachment = LoadAttachment(employee, "notes.txt")

            select new IndexEntry()
            {
                // Index DETAILS of attachment:
                AttachmentName = attachment.Name,
                AttachmentContentType = attachment.ContentType,
                AttachmentSize = attachment.Size,

                // Index CONTENT of attachment:
                // Call 'GetContentAsString' to access content
                AttachmentContent = attachment.GetContentAsString()
            };

        // It can be useful to configure full-text search on the attachment content index-field
        Index(x => x.AttachmentContent, FieldIndexing.Search);

        // Documents with an attachment named 'notes.txt' will be indexed,
        // allowing you to query them by either the attachment's details or its content.
    }
}

Query the Index:

You can now query for Employee documents based on the attachment's details and/or its content.

List<Employee> employees = session
    // Query the index for matching employees
    .Query<Employees_ByAttachment.IndexEntry, Employees_ByAttachment>()
    // Can make a full-text search
    // Looking for employees with an attachment content that contains 'Colorado' OR 'Dallas'
    .Search(x => x.AttachmentContent, "Colorado Dallas")
    .OfType<Employee>()
    .ToList();

// Results:
// ========
// Results will include 'employees/1-A' and 'employees/2-A'.
// Only these 2 documents have an attachment by name 'notes.txt'
// that contains either 'Colorado' or 'Dallas'.

Index details & content - all attachments

The index:

  • Use LoadAttachments() to index the details & content of ALL attachments.

  • Note that the index example below employs the Fanout index pattern.

public class Employees_ByAllAttachments :
    AbstractIndexCreationTask<Employee, Employees_ByAllAttachments.IndexEntry>
{
    public class IndexEntry
    {
        public string AttachmentName { get; set; }
        public string AttachmentContentType { get; set; }
        public long AttachmentSize { get; set; }
        public string AttachmentContent { get; set; }
    }

    public Employees_ByAllAttachments()
    {
        Map = employees =>

            // Call 'LoadAttachments' to get details and content for ALL attachments
            from employee in employees
            from attachment in LoadAttachments(employee)

            // This will be a Fanout index -
            // the index will generate an index-entry for each attachment per document

            select new IndexEntry
            {
                // Index DETAILS of attachment:
                AttachmentName = attachment.Name,
                AttachmentContentType = attachment.ContentType,
                AttachmentSize = attachment.Size,

                // Index CONTENT of attachment:
                // Call 'GetContentAsString' to access content
                AttachmentContent = attachment.GetContentAsString()
            };

        // It can be useful to configure full-text search on the attachment content index-field
        Index(x => x.AttachmentContent, FieldIndexing.Search);
    }
}
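
To make the fanout shape concrete, here is a minimal sketch in plain LINQ to Objects. It does not use RavenDB; the Doc type, the FanoutSketch class, and the sample data are hypothetical stand-ins. It only illustrates how two consecutive 'from' clauses, as in the Map above, yield one output entry per (document, attachment) pair.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a document with attachments; not a RavenDB type.
public class Doc
{
    public string Id { get; set; }
    public List<string> AttachmentNames { get; set; }
}

public static class FanoutSketch
{
    // Two 'from' clauses flatten the nested collection (a SelectMany):
    // the result holds one entry per (document, attachment) pair.
    public static List<string> Entries(IEnumerable<Doc> docs)
    {
        return (from doc in docs
                from name in doc.AttachmentNames
                select doc.Id + " -> " + name).ToList();
    }

    // Made-up sample data for the sketch.
    public static List<Doc> SampleDocs()
    {
        return new List<Doc>
        {
            new Doc
            {
                Id = "employees/1-A",
                AttachmentNames = new List<string> { "photo.jpg", "notes.txt" }
            },
            new Doc
            {
                Id = "employees/4-A",
                AttachmentNames = new List<string> { "photo.jpg" }
            }
        };
    }
}
```

With the sample data, FanoutSketch.Entries(FanoutSketch.SampleDocs()) yields three entries: two for employees/1-A and one for employees/4-A. In this in-memory sketch, a document with an empty attachment list contributes no entries at all.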

Query the Index:

// Query the index for matching employees
List<Employee> employees = session
    .Query<Employees_ByAllAttachments.IndexEntry, Employees_ByAllAttachments>()
    // Filter employee results by their attachments' details and content:
    // Using 'SearchOptions.Or' combines the full-text search on 'AttachmentContent'
    // with the following 'Where' condition using OR logic.
    .Search(x => x.AttachmentContent, "Colorado Dallas", options: SearchOptions.Or)
    .Where(x => x.AttachmentSize > 20_000)
    .OfType<Employee>()
    .ToList();

// Results:
// ========
// Results will include:
// 'employees/1-A' and 'employees/2-A' that match the content criteria
// 'employees/4-A' and 'employees/5-A' that match the size criteria

Leveraging indexed attachments

  • Access to the indexed attachment content opens the door to many different applications,
    including many that can be integrated directly into RavenDB.

  • This blog post demonstrates how image recognition can be applied to indexed attachments using the additional sources feature. The resulting index allows filtering and querying based on image content.

Syntax

AttachmentsFor

// Returns a list of attachment details for the specified document.
IEnumerable<AttachmentName> AttachmentsFor(object document);

Parameter   Type     Description
document    object   The document object whose attachments' details you want to load.

// AttachmentsFor returns a list of objects of the following type:
public class AttachmentName
{
    public string Name;
    public string Hash;
    public string ContentType;
    public long Size;
}

LoadAttachment

// Returns the details and content of the specified attachment.
public IAttachmentObject LoadAttachment(object doc, string name);

Parameter   Type     Description
doc         object   The document whose attachment you want to load.
name        string   The name of the attachment to load.

public interface IAttachmentObject
{
    public string Name { get; }
    public string Hash { get; }
    public string ContentType { get; }
    public long Size { get; }

    public string GetContentAsString();
    public string GetContentAsString(Encoding encoding);
    public Stream GetContentAsStream();
}

LoadAttachments

// Returns a list of all attachments for the specified document.
public IEnumerable<IAttachmentObject> LoadAttachments(object doc);