Send Metrics: Implementing a Custom Producer
If you have arrived at this section, it is because you have decided that you can use neither the monit agent nor the provided Kubernetes Helm chart to forward your metrics to us.
As part of the MONIT infrastructure we offer two endpoints where you can forward your metrics: one based on HTTP+JSON and one based on OTLP (HTTP and gRPC). Please refer to the first step section to know what you should do before using any of the integrations.
Open Telemetry Metrics
There is a big ongoing effort in the Monitoring team to provide support for OTLP-based documents, including metrics. This will be our main integration point going forward and will be used to provide support for the MONIT-provided forwarders.
Storages
Any metric that you send to the OTLP endpoint will be stored in our Mimir backend, so you will need to request a tenant if you don't have one already.
Send data
Sending data over OTLP can be done with any tool that supports this kind of output. In our case we have more experience with Fluentbit, so we can provide extra support when configuring it if required.
Otherwise, here are the connection details needed to start sending your data to the "monit-otlp" endpoint. We keep the default endpoint URIs, so there is no need to specify them.
- Endpoint: monit-otlp.cern.ch
- Port: 4319 (HTTP), 4316 (GRPC)
- User: Your tenant name
- Password: Your tenant password
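As a minimal sketch, and assuming the endpoint accepts the standard OTLP/HTTP JSON encoding on the default /v1/metrics path (the metric name, value, and credentials below are placeholders), a single gauge data point could be posted like this:

```python
import json
import time

import requests


def build_otlp_payload(name, value):
    """Build a minimal OTLP/JSON metrics payload with one gauge data point."""
    return {
        "resourceMetrics": [{
            "scopeMetrics": [{
                "metrics": [{
                    "name": name,
                    "gauge": {
                        "dataPoints": [{
                            "timeUnixNano": str(time.time_ns()),
                            "asDouble": value,
                        }]
                    },
                }]
            }]
        }]
    }


def send_metric(name, value, tenant, password):
    # Tenant credentials are passed as HTTP basic auth, as described above;
    # /v1/metrics is the standard OTLP/HTTP route for metrics.
    return requests.post(
        "https://monit-otlp.cern.ch:4319/v1/metrics",
        auth=(tenant, password),
        data=json.dumps(build_otlp_payload(name, value)),
        headers={"Content-Type": "application/json"},
    )
```

In practice you would let an OpenTelemetry SDK or a forwarder such as Fluentbit build and batch these payloads for you; the sketch only shows the shape of what travels over the wire.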
Access data
Once your data has been sent, you can access it through Grafana by configuring one of the MONIT-managed Prometheus datasources.
JSON Metrics
The MONIT infrastructure has historically offered this endpoint for integration; although our recommendation is to use the OTLP one when possible, we will keep maintaining this one for a while.
Your data must be represented as a valid JSON object with the following fields (as strings):
- (mandatory) producer: your tenant name, used to name your data set; only one value allowed
- (mandatory) type: used to classify your data set; you can define multiple values, but please try to keep the list limited
- (optional) timestamp: used to indicate the event submission time
- (optional) _id: if you wish to set your own ID (needed to override documents in OpenSearch); we assign a random ID by default
- (optional) host: used to add extra information about the node submitting your data
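As an illustration, a minimal document carrying these fields could be built like this (the tenant name, type, and metric values are placeholders):

```python
import json
import time

# Illustrative document; "mytenant", "dbmetric" and the values are placeholders.
document = {
    "producer": "mytenant",                # mandatory: names your data set
    "type": "dbmetric",                    # mandatory: classifies your data set
    "timestamp": int(time.time() * 1000),  # optional: UTC milliseconds, no decimal part
    "host": "hostnameA",                   # optional: node submitting the data
    "cpu_load": 0.42,                      # your actual metric
}
print(json.dumps(document))  # serialises to a single line with double quotes
```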
Storages
By default, metrics that you send to the JSON endpoint are written into the MONIT OpenSearch short-term cluster. Other options can be requested:
- HDFS: If you would like your metrics to be stored for archival (13 months by default)
- InfluxDB: Since we are moving most of the MONIT use cases out of InfluxDB, we don't write data there by default. You can still request it; follow the section below on how to send data to InfluxDB if needed.
Send data
Metrics can be sent to the HTTP endpoint listening at https://monit-metrics.cern.ch:10014/<producer>. This is a secured endpoint, so you will need to use your tenant credentials to communicate with it.
Please pay attention to the following:
- Provide all the mandatory fields.
- All timestamps must be in UTC milliseconds or seconds, without any decimal part.
- Use double quotes, not single quotes (single quotes are not valid in JSON).
- Send multiple documents in the same batch by grouping them in a JSON array (max size ~1MB).
- Make sure each document fits on one line, as we don't support multi-line JSON.
- Anything considered metadata by the infrastructure will be promoted to the metadata field in the produced JSON; the rest will be put inside data.
- Only the UTF-8 charset is accepted, and it must be explicitly specified in the Content-Type header of the HTTP request.
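The rules above can be sketched as a small pre-flight check before posting a batch (an illustrative helper, not part of any MONIT library):

```python
import json

MAX_BATCH_BYTES = 1_000_000  # the ~1MB batch limit mentioned above


def validate_batch(documents):
    """Illustrative pre-flight checks mirroring the rules listed above."""
    for doc in documents:
        # Mandatory fields must be present in every document.
        assert "producer" in doc and "type" in doc, "missing mandatory field"
        ts = doc.get("timestamp")
        if ts is not None:
            # UTC milliseconds or seconds, with no decimal part.
            assert isinstance(ts, int), "timestamp must be an integer"
    body = json.dumps(documents)  # a JSON array, double quotes, single line
    assert "\n" not in body, "multi-line JSON is not supported"
    assert len(body.encode("utf-8")) <= MAX_BATCH_BYTES, "batch exceeds ~1MB"
    return body
```

The returned string can then be posted with a Content-Type of "application/json; charset=UTF-8", satisfying the charset requirement above.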
Writing to InfluxDB
Please note that we are gradually moving away from InfluxDB; if you still want to keep writing data there, please continue reading the section below.
If you wish to write your data to InfluxDB, you should specify which entities should be treated as tags and which as fields. To do this, define two arrays in every message: idb_tags and idb_fields. Without them, all entities are treated as fields. Note that with InfluxDB the type entity is mandatory and will be used to create the measurement.
Remarks:
- General rule of thumb: tags are used for filtering data, and fields are values seen in the plots.
- Specifying only idb_tags indicates that all remaining entities should be used as fields.
- Specifying idb_fields is used either to treat the same entity as both a field and a tag, or to exclude some entities from being sent to InfluxDB (when both idb_tags and idb_fields are set, remaining entities are not used).
- You should avoid writing any sort of IDs (as tags) to InfluxDB, to keep cardinality low.
- Strings like descriptions, logs, etc. should not be written to InfluxDB.
- If you send some nested structure you may access entities with dot notation (e.g. struct.field)
- Fields mentioned at the beginning of the document (producer, type, etc.) are not written to InfluxDB by default, but they might be used in idb_tags and idb_fields.
Example:
{
"producer": "myproducer",
"type": "mytype",
"field1": 42,
"mytag1": "CERN_PROD",
"both": "available"
"idb_tags": ["mytag1", "both"],
"idb_fields": ["field1", "both"]
}
Warning
Even though we present this JSON in multiple lines, please make sure yours fits in one line, as we don't support multi-line JSON.
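To make the tag/field split concrete, here is a sketch of the mapping described above (our own illustration of the rules, not MONIT's actual implementation):

```python
# Top-level infrastructure fields that are not written to InfluxDB by default.
RESERVED = {"producer", "type", "timestamp", "_id", "host",
            "idb_tags", "idb_fields"}


def split_for_influx(doc):
    """Sketch of how idb_tags and idb_fields select tags and fields."""
    measurement = doc["type"]  # type is mandatory and names the measurement
    tags = {k: doc[k] for k in doc.get("idb_tags", []) if k in doc}
    if "idb_fields" in doc:
        # When both arrays are set, remaining entities are not used.
        fields = {k: doc[k] for k in doc["idb_fields"] if k in doc}
    else:
        # With only idb_tags set, all remaining entities become fields.
        fields = {k: v for k, v in doc.items()
                  if k not in RESERVED and k not in tags}
    return measurement, tags, fields
```

Applied to the example above, this yields measurement "mytype", tags mytag1 and both, and fields field1 and both.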
Example
Here's a simple Python example on how to send data to the JSON endpoint:
import json
import requests
from requests.auth import HTTPBasicAuth

def send(document):
    # <tenant> and <password> are placeholders for your tenant credentials.
    return requests.post('https://monit-metrics.cern.ch:10014/<tenant>',
                         auth=HTTPBasicAuth('<tenant>', '<password>'),
                         data=json.dumps(document),
                         headers={"Content-Type": "application/json"})

def send_and_check(document, should_fail=False):
    response = send(document)
    assert (response.status_code in [200]) != should_fail, \
        'With document: {0}. Status code: {1}. Message: {2}'.format(
            document, response.status_code, response.text)

basic_document = [{
    "producer": "<tenant>",
    "type_prefix": "raw",
    "type": "dbmetric",
    "hostname": "hostnameA",
    "timestamp": 1483696735836,
    "data": {  # Your metric fields
        "foo": "bar"
    },
    "field2": "value"  # Another metric
}]

send_and_check(basic_document)
Access data
Depending on where you have requested to send your data, the access method may vary. Here are the different options:
- OpenSearch: You can of course use MONIT OpenSearch directly to access your metrics, but the recommendation is to do it from Grafana by configuring a new datasource
- InfluxDB: Configure a new Grafana datasource for InfluxDB
- Hadoop: Current recommendation to access data in HDFS is through the SWAN service
- Please note your data will be under the path /project/monitoring/archive/<producer>/raw/<type>/YYYY/MM/DD