Send Logs: Implementing a Custom Producer
If you have arrived at this section, it is because you have decided that you cannot use either the monit agent or the provided Kubernetes Helm chart to forward your logs to us.
As part of the MONIT infrastructure we offer two endpoints to which you can forward your logs: one based on HTTP+JSON and one on OTLP (HTTP+gRPC). Please refer to the first step section to know what you need to do before using any of these integrations.
OpenTelemetry Logs
There is an ongoing effort in the monitoring team to provide support for OTLP-based documents, including logs. This will be our main integration point going forward and will be used to support the MONIT-provided forwarders.
Storages
Any logs that you send to the OTLP endpoint will be stored in our Timberprivate OpenSearch cluster, so you will need to request a tenant if you do not have one already.
Send data
Sending data using OTLP can be done with any tool that supports this kind of output. We have the most experience with Fluentbit, so we can provide extra support with its configuration if required.
Below you will find the connection details needed to start sending your data to the "monit-otlp" endpoint. We keep the default endpoint URIs, so there is no need to specify them. A minimal sketch of pushing a log record over OTLP/HTTP follows the list.
- Endpoint: monit-otlp.cern.ch
- Port: 4319 (HTTP), 4316 (GRPC)
- User: Your tenant name
- Password: Your tenant password
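For instance, with a plain HTTP client a single log record in the OTLP/HTTP JSON encoding could be pushed roughly as follows. This is only a sketch: it assumes HTTPS with basic authentication using your tenant credentials and the default OTLP logs path /v1/logs; the service name and log body are placeholders.

import json
import time
import requests
from requests.auth import HTTPBasicAuth

# Placeholders: your tenant name and password
TENANT = "<tenant>"
PASSWORD = "<tenant password>"

# Assumes the default OTLP/HTTP logs path on the HTTP port listed above
URL = "https://monit-otlp.cern.ch:4319/v1/logs"

# One log record in the OTLP/HTTP JSON encoding
payload = {
    "resourceLogs": [{
        "resource": {
            "attributes": [
                {"key": "service.name", "value": {"stringValue": "my-service"}}
            ]
        },
        "scopeLogs": [{
            "logRecords": [{
                "timeUnixNano": str(time.time_ns()),
                "severityText": "INFO",
                "body": {"stringValue": "my log line"}
            }]
        }]
    }]
}

response = requests.post(URL,
                         auth=HTTPBasicAuth(TENANT, PASSWORD),
                         data=json.dumps(payload),
                         headers={"Content-Type": "application/json"})
response.raise_for_status()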
Access data
Once your data has been sent, you can access it through Grafana by configuring one of the MONIT-managed OpenSearch datasources, or directly in OpenSearch Timberprivate under your own tenant.
JSON Logs
The MONIT infrastructure has historically offered this endpoint for integration, and although our recommendation is to use the OTLP one when possible, we will keep maintaining this for a while.
Your data must be represented as a valid JSON object with the following fields (as strings); a minimal example document is shown after the list:
- (mandatory) producer: (your tenant) used to name your data set, only one value allowed
- (mandatory) type: used to classify your data set, you can define multiple values but please try to keep it limited
- (optional) timestamp: used to indicate the event submission time
- (optional) _id: if you wish to set your own ID (needed to overwrite documents in OpenSearch); we assign a random ID by default
- (optional) host: used to add extra information about the node submitting your data
- (optional) raw: in case logs are written to OpenSearch, this field gets indexed for full-text search
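For illustration, a minimal document carrying only the mandatory fields plus a raw log line could look like this (all values are placeholders; a complete Python example is shown further down):

{
  "producer": "<tenant>",
  "type": "apache_access",
  "raw": "GET /index.html HTTP/1.1 200"
}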
Storages
By default, logs that you send to the JSON endpoint are written into the MONIT Timberprivate OpenSearch cluster. You can also request that the logs be written into your own cluster; please keep reading to understand how this works.
You can also request to have the logs stored in HDFS for archival.
Writing to a custom cluster
As part of the services offered by MONIT we can redirect the logs we forward to a custom OpenSearch cluster; there is some small preparatory work you will need to do beforehand. Once you have completed the steps below, please communicate to us the cluster endpoint and credentials that we should use.
Create a user with write permissions
MONIT flows will always write into indices prefixed with monit_, so we need write access to them. The current recommendation from the OpenSearch team is to grant the following permissions using a security role (a sketch of creating such a role is shown after this list):
- cluster permissions: cluster_composite_ops_ro and cluster_monitor
- index permissions: indices_all for monit_*
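As an illustration, assuming your cluster exposes the OpenSearch security REST API and you have an account allowed to manage security, creating such a role could be scripted as follows (the cluster URL, the admin credentials and the role name monit_writer are placeholders):

import requests
from requests.auth import HTTPBasicAuth

# Placeholders: your cluster endpoint and an account allowed to manage security
CLUSTER = "https://your-cluster.cern.ch:9200"
ADMIN = HTTPBasicAuth("admin", "<admin password>")

# Role granting the permissions recommended above
role = {
    "cluster_permissions": ["cluster_composite_ops_ro", "cluster_monitor"],
    "index_permissions": [{
        "index_patterns": ["monit_*"],
        "allowed_actions": ["indices_all"]
    }]
}

# "monit_writer" is only an illustrative role name
response = requests.put(CLUSTER + "/_plugins/_security/api/roles/monit_writer",
                        json=role, auth=ADMIN)
response.raise_for_status()

You would then map this role to the user whose credentials you hand over to MONIT.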
Create mapping template
To make sure that the basic fields get the proper formatting, this is the minimum template that you will need to create (for recommendations on other fields, check with the OpenSearch team); a sketch of creating it through the index template API is shown after the mapping:
- Index patterns: monit_*
- Priority: 0
- Index mapping:
{
  "dynamic_templates": [
    {
      "strings": {
        "match_mapping_type": "string",
        "mapping": {
          "ignore_above": 256,
          "type": "keyword"
        }
      }
    }
  ],
  "date_detection": true,
  "numeric_detection": true,
  "properties": {
    "data": {
      "type": "object",
      "properties": {
        "raw": {
          "type": "text"
        }
      }
    },
    "metadata": {
      "type": "object",
      "properties": {
        "timestamp": {
          "type": "date"
        }
      }
    }
  }
}
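For example, assuming again an admin account on your cluster, the template above could be created through the composable index template API roughly like this (the template name monit_logs and the cluster URL are placeholders):

import requests
from requests.auth import HTTPBasicAuth

# Placeholders: your cluster endpoint and an account allowed to manage templates
CLUSTER = "https://your-cluster.cern.ch:9200"
ADMIN = HTTPBasicAuth("admin", "<admin password>")

template = {
    "index_patterns": ["monit_*"],
    "priority": 0,
    "template": {
        "mappings": {
            "dynamic_templates": [{
                "strings": {
                    "match_mapping_type": "string",
                    "mapping": {"ignore_above": 256, "type": "keyword"}
                }
            }],
            "date_detection": True,
            "numeric_detection": True,
            "properties": {
                "data": {"type": "object", "properties": {"raw": {"type": "text"}}},
                "metadata": {"type": "object", "properties": {"timestamp": {"type": "date"}}}
            }
        }
    }
}

# "monit_logs" is only an illustrative template name
response = requests.put(CLUSTER + "/_index_template/monit_logs",
                        json=template, auth=ADMIN)
response.raise_for_status()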
Create index pattern
Once data has been written to your cluster you should be able to create an index pattern; please make sure to select metadata.timestamp as the time field.
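This is normally done in the OpenSearch Dashboards UI. If you prefer to script it, and assuming the Dashboards saved-objects API is reachable on your cluster, a rough sketch could look like this (the Dashboards URL and credentials are placeholders):

import requests
from requests.auth import HTTPBasicAuth

# Placeholders: your OpenSearch Dashboards URL and credentials
DASHBOARDS = "https://your-cluster.cern.ch:5601"
AUTH = HTTPBasicAuth("admin", "<admin password>")

index_pattern = {
    "attributes": {
        "title": "monit_*",
        "timeFieldName": "metadata.timestamp"
    }
}

# The osd-xsrf header is required by the Dashboards API
response = requests.post(DASHBOARDS + "/api/saved_objects/index-pattern",
                         json=index_pattern, auth=AUTH,
                         headers={"osd-xsrf": "true"})
response.raise_for_status()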
Send data
Logs can be sent to the HTTP endpoint listening at https://monit-logs.cern.ch:10013/\<producer\>. This is a secured endpoint, so you will need to use your tenant credentials to communicate with it.
Please pay attention to the following:
- Make sure you provide all the mandatory fields
- All timestamps must be in UTC milliseconds or seconds, without any fractional part
- Use double quotes, not single quotes (single quotes are not valid in JSON)
- Send multiple documents in the same batch by grouping them in a JSON array.
- Make sure your document fits in one line as we don't support multi-line JSON.
- Anything considered metadata for the infrastructure will be promoted to the metadata field in the produced JSON, and the rest will be put inside data.
- Only the UTF-8 charset is accepted and it must be explicitly specified in the Content-Type header of the HTTP request
Example
Here is a very basic Python example of how to send data to the JSON endpoint:
import requests
import json
from requests.auth import HTTPBasicAuth

# Replace with your own tenant name and password
tenant = "<tenant>"
password = "<password>"

def send(document):
    # Logs endpoint described above; the UTF-8 charset must be declared explicitly
    return requests.post('https://monit-logs.cern.ch:10013/{0}'.format(tenant),
                         auth=HTTPBasicAuth(tenant, password),
                         data=json.dumps(document),
                         headers={"Content-Type": "application/json; charset=UTF-8"})

def send_and_check(document, should_fail=False):
    response = send(document)
    assert (response.status_code in [200]) != should_fail, \
        'With document: {0}. Status code: {1}. Message: {2}'.format(
            document, response.status_code, response.text)

basic_document = [{
    "producer": tenant,
    "type_prefix": "raw",
    "type": "apache_access",
    "hostname": "hostname.cern.ch",
    "timestamp": 1483696735836,
    "data": {  # your log fields
        "foo": "bar"
    },
    "raw": "myfulllogmessagethatiwanttosearchinopensearch"
}]

send_and_check(basic_document)
Access data
Depending on where you have requested your data to be sent, the access method varies; here are the different options:
- OpenSearch: You can of course use MONIT Timberprivate directly to access your logs, but you can also do it from Grafana by configuring a new datasource
- Hadoop: The current recommendation for accessing data in HDFS is through the SWAN service (see the sketch after this list)
- Please note your data will be under the path /project/monitoring/archive/<producer>/logs/<type>/YYYY/MM/DD
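For example, from a SWAN notebook attached to a Spark cluster, reading one day of archived logs could look like the following sketch (the date and the <producer>/<type> placeholders need to be filled in, and the exact layout of the archived documents is an assumption):

from pyspark.sql import SparkSession

# On SWAN a Spark session is normally provided for you; this creates one otherwise
spark = SparkSession.builder.getOrCreate()

# Fill in your producer, type and the day you are interested in
path = "/project/monitoring/archive/<producer>/logs/<type>/2024/01/01"

logs = spark.read.json(path)
logs.printSchema()             # inspect the document layout
logs.show(10, truncate=False)  # preview a few records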