Monitoring: Prometheus

Prometheus

Prometheus is commonly used alongside services that expose numeric data through HTTP endpoints.
A Prometheus server can be used (either manually or by an application using a dedicated library) to scrape the data from an endpoint, monitor and store it, and put it to further use, for example by raising alerts based on it or forwarding it to analysis services.

RavenDB Prometheus Endpoint

The path to the Prometheus endpoint of a RavenDB instance is: /admin/monitoring/v1/prometheus
To inspect the endpoint's output using a browser, add the endpoint path to the RavenDB server's URL.
E.g. http://live-test.ravendb.net/admin/monitoring/v1/prometheus
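The same output can also be read from a script instead of a browser. The following is a minimal sketch, assuming an unsecured server and that the endpoint returns the standard Prometheus text format (comment lines starting with `#`, then one `name{labels} value` sample per line). The function names `parse_samples` and `fetch_metrics` are made up for this illustration.

```python
from urllib.request import urlopen

ENDPOINT_PATH = "/admin/monitoring/v1/prometheus"


def parse_samples(text):
    """Parse Prometheus text-format output into {series: value}.

    The series key is the metric name together with its label set,
    exactly as it appears in the output.
    """
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        if "}" in line:
            # Labeled sample: everything up to the last '}' is the series.
            end = line.rfind("}") + 1
            series, rest = line[:end], line[end:].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            # The first token after the series is the value;
            # an optional timestamp may follow and is ignored here.
            samples[series] = float(rest[0])
    return samples


def fetch_metrics(server_url):
    """Fetch and parse the metrics of one RavenDB server (unsecured assumed)."""
    url = server_url.rstrip("/") + ENDPOINT_PATH
    with urlopen(url, timeout=10) as response:
        return parse_samples(response.read().decode("utf-8"))
```

Calling `fetch_metrics("http://localhost:8080")` would return a dictionary of samples for a local server. A secured server would additionally require a client certificate, which this sketch does not handle.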

RavenDB: Prometheus Endpoint Output

  • As Prometheus handles only numeric values, the endpoint outputs all values as numbers and provides legends that explain what the numbers mean.
    Metric values are also explained in the table below.

Omit or Include Selected Metrics

To omit metrics related to selected topics from the endpoint's output, set one or more of the flags listed below in the URL query string. Set a flag to true to omit its topic from the results, or to false to include it.
skipServerMetrics
skipDatabasesMetrics
skipIndexesMetrics
skipCollectionsMetrics

E.g., to skip indexing metrics, use:
http://localhost:8080/admin/monitoring/v1/prometheus?skipIndexesMetrics=true
And to skip both indexing and server metrics, use:
http://localhost:8080/admin/monitoring/v1/prometheus?skipIndexesMetrics=true&skipServerMetrics=true
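When the URL is assembled in code, the flags can be added as ordinary query parameters. The sketch below is illustrative only: `build_metrics_url` is a hypothetical helper, and it assumes the server accepts the lowercase strings `true` and `false` as shown in the URLs above.

```python
from urllib.parse import urlencode

ENDPOINT_PATH = "/admin/monitoring/v1/prometheus"

# The four flags listed above.
SKIP_FLAGS = (
    "skipServerMetrics",
    "skipDatabasesMetrics",
    "skipIndexesMetrics",
    "skipCollectionsMetrics",
)


def build_metrics_url(server_url, **skip):
    """Return the endpoint URL with the requested skip flags appended."""
    unknown = set(skip) - set(SKIP_FLAGS)
    if unknown:
        raise ValueError(f"unknown flag(s): {sorted(unknown)}")
    # Python booleans become the lowercase strings used in the URL examples.
    query = urlencode({flag: str(bool(value)).lower() for flag, value in skip.items()})
    url = server_url.rstrip("/") + ENDPOINT_PATH
    return f"{url}?{query}" if query else url


print(build_metrics_url("http://localhost:8080",
                        skipIndexesMetrics=True, skipServerMetrics=True))
```

Rejecting unknown flag names up front guards against typos that would otherwise be sent to the server unnoticed.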

Metrics Provided by the Prometheus Endpoint

Here is the list of metrics made available by the /admin/monitoring/v1/prometheus endpoint.

Metric | Description
archived_data_processing_behavior | Archived data processing behavior (0 => ExcludeArchived, 1 => IncludeArchived, 2 => ArchivedOnly)
backup_current_number_of_running_backups | Number of currently running backups
backup_max_number_of_concurrent_backups | Maximum number of concurrent backups
certificate_server_certificate_expiration_left_seconds | Server certificate expiration left in seconds
cluster_current_term | Cluster term
cluster_index | Cluster index
cluster_node_state | Current node state (0 => Passive, 1 => Candidate, 2 => Follower, 3 => LeaderElect, 4 => Leader)
collection_documents_count | Number of documents in collection
collection_documents_size_bytes | Size of documents
collection_revisions_size_bytes | Size of revisions
collection_tombstones_size_bytes | Size of tombstones
collection_total_size_bytes | Total size of collection
cpu_assigned_processor_count | Number of assigned processors on the machine
cpu_machine_io_wait | IO wait in %
cpu_machine_usage | Machine CPU usage in %
cpu_process_usage | Process CPU usage in %
cpu_processor_count | Number of processors on the machine
cpu_thread_pool_available_completion_port_threads | Number of available completion port threads in the thread pool
cpu_thread_pool_available_worker_threads | Number of available worker threads in the thread pool
database_alerts_count | Number of alerts
database_attachments_count | Number of attachments
database_documents_count | Number of documents
database_indexes_auto_count | Number of auto indexes
database_indexes_count | Number of indexes
database_indexes_errored_count | Number of error indexes
database_indexes_disabled_count | Number of disabled indexes
database_indexes_errors_count | Number of indexing errors
database_indexes_idle_count | Number of idle indexes
database_indexes_stale_count | Number of stale indexes
database_indexes_static_count | Number of static indexes
database_performance_hints_count | Number of performance hints
database_rehabs_count | Number of rehabs
database_replication_factor | Database replication factor
database_revisions_count | Number of revision documents
database_statistics_doc_puts_per_second | Number of document puts per second (one minute rate)
database_statistics_map_index_indexes_per_second | Number of indexed documents per second for map indexes (one minute rate)
database_statistics_map_reduce_index_mapped_per_second | Number of maps per second for map-reduce indexes (one minute rate)
database_statistics_map_reduce_index_reduced_per_second | Number of reduces per second for map-reduce indexes (one minute rate)
database_statistics_request_average_duration_seconds | Average request time in seconds
database_statistics_requests_count | Number of requests from database start
database_statistics_requests_per_second | Number of requests per second (one minute rate)
database_storage_documents_allocated_data_file_bytes | Documents storage allocated size
database_storage_documents_used_data_file_bytes | Documents storage used size
database_storage_indexes_allocated_data_file_bytes | Index storage allocated size
database_storage_indexes_used_data_file_bytes | Index storage used size
database_storage_io_read_operations | Disk IO Read operations
database_storage_io_write_operations | Disk IO Write operations
database_storage_queue_length | Disk Queue length
database_storage_read_throughput_bytes | Disk Read Throughput
database_storage_total_allocated_storage_file_bytes | Total storage size
database_storage_total_free_space_bytes | Remaining storage disk space
database_storage_write_throughput_bytes | Disk Write Throughput
database_time_since_last_backup_seconds | Last backup
database_unique_attachments_count | Number of unique attachments
database_uptime_seconds | Database up-time
databases_loaded_count | Number of loaded databases
databases_total_count | Number of all databases
index_entries_count | Number of entries in the index
index_errors | Number of index errors
index_is_invalid | Indicates if index is invalid
index_lock_mode | Index lock mode (0 => Unlock, 1 => LockedIgnore, 2 => LockedError)
index_mapped_per_second | Number of maps per second (one minute rate)
index_priority | Index priority (0 => Low, 1 => Normal, 2 => High)
index_reduced_per_second | Number of reduces per second (one minute rate)
index_state | Index state (0 => Normal, 1 => Disabled, 2 => Idle, 3 => Error)
index_status | Index status (0 => Running, 1 => Paused, 2 => Disabled, 3 => Pending (for rolling indexes))
index_time_since_last_indexing_seconds | Time since last indexing
index_time_since_last_query_seconds | Time since last query
index_type | Index type (0 => None, 1 => AutoMap, 2 => AutoMapReduce, 3 => Map, 4 => MapReduce, 5 => Faulty, 6 => JavaScriptMap, 7 => JavaScriptMapReduce)
license_expiration_left_seconds | Server license expiration left
license_max_cores | Server license max CPU cores
license_type | Server license type (-1 => Invalid, 0 => None, 1 => Community, 2 => Reserved, 3 => Professional, 4 => Enterprise, 5 => Developer, 6 => Essential)
license_utilized_cpu_cores | Server license utilized CPU cores
memory_allocated_bytes | Server allocated memory
memory_installed_bytes | Installed memory
memory_low_memory_severity | Server low memory flag value (0 => None, 1 => Low, 2 => ExtremelyLow)
memory_physical_bytes | Physical memory
memory_total_dirty_bytes | Dirty memory that is used by the scratch buffers
memory_total_swap_size_bytes | Server total swap size
memory_total_swap_usage_bytes | Server total swap usage
memory_working_set_swap_usage_bytes | Server working set swap usage
network_concurrent_requests_count | Number of concurrent requests
network_last_authorized_non_cluster_admin_request_time_in_sec | Server last authorized non cluster admin request time
network_last_request_time_in_seconds | Server last request time
network_requests_per_second | Number of requests per second (one minute rate)
network_tcp_active_connections | Number of active TCP connections
network_total_requests | Total number of requests since server startup
server_disk_remaining_storage_space_percentage | Remaining server storage disk space in %
server_disk_system_store_total_data_file_size_bytes | Server storage total size
server_disk_system_store_used_data_file_size_bytes | Server storage used size
server_info | Server Info
server_process_id | Server process ID
server_storage_io_read_operations | Disk IO Read operations
server_storage_io_write_operations | Disk IO Write operations
server_storage_queue_length | Disk Queue length
server_storage_read_throughput_bytes | Disk Read Throughput
server_storage_total_free_space_bytes | Remaining storage disk space
server_storage_write_throughput_bytes | Disk Write Throughput
server_uptime_seconds | Server up-time
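Because state-like metrics are reported as numbers, a consumer usually has to translate them back into names. The sketch below copies a few of the value legends from the table above into a lookup. The `describe` helper is hypothetical, and stripping a `ravendb_` name prefix is an assumption based on the `ravendb_server_info` metric name mentioned later on this page. It requires Python 3.9 or later for `str.removeprefix`.

```python
# Value legends copied from the metrics table (a subset, for illustration).
VALUE_LABELS = {
    "cluster_node_state": {0: "Passive", 1: "Candidate", 2: "Follower",
                           3: "LeaderElect", 4: "Leader"},
    "index_lock_mode": {0: "Unlock", 1: "LockedIgnore", 2: "LockedError"},
    "index_priority": {0: "Low", 1: "Normal", 2: "High"},
    "index_state": {0: "Normal", 1: "Disabled", 2: "Idle", 3: "Error"},
    "index_status": {0: "Running", 1: "Paused", 2: "Disabled", 3: "Pending"},
    "license_type": {-1: "Invalid", 0: "None", 1: "Community", 2: "Reserved",
                     3: "Professional", 4: "Enterprise", 5: "Developer",
                     6: "Essential"},
    "memory_low_memory_severity": {0: "None", 1: "Low", 2: "ExtremelyLow"},
}


def describe(metric, value):
    """Translate a numeric sample into its legend name, if the metric has one."""
    # Assumption: scraped names may carry a "ravendb_" prefix.
    labels = VALUE_LABELS.get(metric.removeprefix("ravendb_"))
    if labels is None:
        return str(value)  # plain numeric metric, nothing to translate
    try:
        return labels.get(int(value), f"unknown ({value})")
    except (ValueError, OverflowError):  # NaN or infinite sample
        return f"unknown ({value})"


print(describe("index_priority", 2.0))
print(describe("ravendb_cluster_node_state", 4))
```

Metrics without a legend are returned unchanged as strings, so the helper can be applied to every sample without checking its type first.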

Using the RavenDB Endpoint with a Prometheus Server

To direct a Prometheus server to the Prometheus endpoint of a RavenDB instance, add an entry to the Prometheus YAML configuration file.

  • Prometheus.yml:
    scrape_configs:
      - job_name: "local-raven-instance"
        # Monitor a local, non-secure RavenDB server
        metrics_path: "/admin/monitoring/v1/prometheus"
        static_configs:
          - targets:
              - "localhost:8080"
      - job_name: "cloud-raven-instance"
        # Monitor 3 RavenDB nodes on the cloud
        scheme: https
        tls_config:
          cert_file: 'a.client.certificate.crt'
          key_file: 'a.client.certificate.key'
        metrics_path: "/admin/monitoring/v1/prometheus"
        static_configs:
          - targets:
              - "a.cloudtest.ravendb.org"
              - "b.cloudtest.ravendb.org"
              - "c.cloudtest.ravendb.org"
  • When you run the Prometheus server, direct it to the configuration file you want it to use.
    ./prometheus --config.file=prometheus.yml

  • When the Prometheus server is running, you can query it for the various RavenDB metrics it has scraped.

    Prometheus: Select Metrics

    Prometheus: Metrics Graph

    • Use the search bar to search for relevant metrics.
      Typing raven will display a list of metrics provided by the endpoint.
  • Metrics can also be found in RavenDB's endpoint output, using the browser.
    In the following screenshot, for example, we can see that the priority of one of the indexes was updated to 2 (High).

    RavenDB Endpoint Output: Index Priority

    Entering the metric's name in the Prometheus server displays this change graphically.

    Prometheus Server: Index Priority

Fetching Additional RavenDB Information

To retrieve additional information about RavenDB, including its version, HTTP and TCP URLs, and other details, query Prometheus for the ravendb_server_info metric.
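The same metric can also be read programmatically through the Prometheus HTTP API's instant query endpoint, `/api/v1/query`. The sketch below makes several assumptions: the Prometheus server listens on its default address `http://localhost:9090`, and the server details are carried as labels on the `ravendb_server_info` series, which is the usual convention for info-style metrics but is not confirmed here. The helper names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(prometheus_url, expression):
    """Build the URL of a Prometheus instant query for the given expression."""
    return (prometheus_url.rstrip("/") + "/api/v1/query?"
            + urlencode({"query": expression}))


def extract_labels(payload):
    """Return the label set of every series in an instant-query response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return [series["metric"] for series in payload["data"]["result"]]


def fetch_server_info(prometheus_url="http://localhost:9090"):
    """Query Prometheus (assumed address) for ravendb_server_info."""
    url = instant_query_url(prometheus_url, "ravendb_server_info")
    with urlopen(url, timeout=10) as response:
        return extract_labels(json.load(response))
```

`fetch_server_info()` would return one dictionary of labels per monitored RavenDB node. Which labels actually appear depends on the RavenDB version, so inspect the result rather than relying on specific keys.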

Additional Information