Service Availability Metrics

The Service Availability (SA) metric is mandatory for all CERN IT Services and represents the general status of a service. The Availability Dashboard provides the historical view of all services' availability.

Service managers can send their service availability data via JSON/HTTP, as described in the custom integrations section. Service Availability metrics have to comply with a specific format, described below, and have to be reported at least once per hour for each service. If no metric is received for more than one hour, the service will be flagged as 'unknown'.
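As an illustration of the one-hour rule above, the following is a minimal Python sketch. It is not the dashboard's actual implementation, which is not described here. The function name `displayed_status` and the constant `MAX_SILENCE` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "more than one hour" is read as strictly greater than 60 minutes.
MAX_SILENCE = timedelta(hours=1)


def displayed_status(last_status, last_received, now):
    """Return the last reported status, or 'unknown' once it is over an hour old."""
    if now - last_received > MAX_SILENCE:
        return "unknown"
    return last_status


# Fixed reference time so the example is deterministic.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
fresh = displayed_status("available", now - timedelta(minutes=30), now)
stale = displayed_status("available", now - timedelta(minutes=61), now)
```

In practice, sending the metric somewhat more often than once per hour leaves a margin for a delayed or failed submission.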

Availability format:

  • (mandatory) producer: "myproducer"
  • (mandatory) type: "availability"
  • (mandatory) serviceid: your service id as registered in the SNOW service catalogue
  • (mandatory) service_status: the current service status [available|degraded|unavailable]
  • (optional) timestamp: the event submission time
  • (optional) availabilityinfo: extra information about the service status
  • (optional) availabilitydesc: detailed description of the service
  • (optional) contact: contact email information for the sender of the availability
  • (optional) webpage: url pointing to the service website

If you are not sure what the serviceid of your service is, refer to this link for the mapping between SE and serviceid.

In summary the document will look something like this:

{
  "producer": "myproducer",
  "type": "availability",
  "serviceid": "arealserviceid",
  "service_status": "available",
  "availabilitydesc": "Indicates availability of this service based on X, Y and Z",
  "availabilityinfo": "100 out of 100 machines happily running"
}

Warning

Even though this JSON is shown on multiple lines for readability, please make sure yours fits on a single line, as multi-line JSON is not supported.
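One way to build the availability document and keep it on a single line is sketched below in Python. This is only an illustration under stated assumptions. The helper names `build_availability_document` and `to_single_line` are hypothetical, and the strict rejection of unlisted fields is a choice made for this example, not a documented rule. Sending the payload is left out, since the endpoint is covered by the custom integrations section.

```python
import json

# Field names and status values are taken from the availability format above.
OPTIONAL_FIELDS = ("timestamp", "availabilityinfo", "availabilitydesc",
                   "contact", "webpage")
VALID_STATUSES = ("available", "degraded", "unavailable")


def build_availability_document(producer, serviceid, service_status, **optional):
    """Return a dict following the availability format (hypothetical helper)."""
    if service_status not in VALID_STATUSES:
        raise ValueError(f"service_status must be one of {VALID_STATUSES}")
    unexpected = set(optional) - set(OPTIONAL_FIELDS)
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    document = {
        "producer": producer,
        "type": "availability",
        "serviceid": serviceid,
        "service_status": service_status,
    }
    document.update(optional)
    return document


def to_single_line(document):
    """Serialise to one line: json.dumps adds no line breaks unless indent is set."""
    return json.dumps(document)


doc = build_availability_document(
    "myproducer",
    "arealserviceid",
    "available",
    availabilityinfo="100 out of 100 machines happily running",
)
payload = to_single_line(doc)
```

Serialising with a JSON library rather than writing the string by hand also avoids invalid constructs such as a trailing comma after the last field.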

Warning

Please note that additional numerical metrics (previously known as 'numerical values') are no longer part of the availability document. They have to be sent as a separate JSON document, as generic JSON/HTTP metrics, as described in the custom integrations section.

Notifications

If you wish to receive notifications when your service leaves the "available" status, there are two possible ways to achieve this.

  • Rely on SNOW to handle the notification: please check this KB. SNOW will send an email to the "Groups receiving notifications" registered when your service was created. If your service is already registered, a ticket to the SNOW admins is needed to change the "Groups receiving notifications".

  • Configure an alarm in Grafana. This gives you better control over the notification flow at the cost of some Grafana work; please check the docs. If you need the datasource configured in your organisation, contact us through SNOW.