Integrate Alertmanager with Grafana to send alerts!

Alerting Overview

Alerting with Prometheus is separated into two parts. Alerting rules in Prometheus servers send alerts to the Alertmanager. The Alertmanager then manages these alerts, including silencing, aggregation, and inhibition, and sends out notifications through channels such as email, on-call notification systems, and chat platforms.

The primary steps involved in setting up alerts and notifications are listed below, followed by a minimal configuration sketch.

  • Set up and configure the Alertmanager
  • Configure Prometheus to communicate with the Alertmanager
  • Write alerting rules in Prometheus
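
As a rough sketch of the last two steps, the snippet below shows a minimal Prometheus configuration that points at a single Alertmanager and loads one alerting rule file. The host name, port, file name, and rule contents are illustrative placeholders, not values from this article.

# prometheus.yml (illustrative)
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["alertmanager:9093"]

rule_files:
  - "alert_rules.yml"

# alert_rules.yml (illustrative)
groups:
  - name: example-alerts
    rules:
      - alert: InstanceDown
        # Fire when a scrape target has been down for five minutes
        expr: up == 0
        for: 5m
        labels:
          severity: page
        annotations:
          summary: "Instance {{ $labels.instance }} is down"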

Image source: https://www.aliptic.net/wp-content/uploads/2018/10/grafana-prometheus.png

Alertmanager

Alertmanager handles the alerts sent by client applications such as the Prometheus server. It takes care of grouping, deduplicating, and routing these alerts to the correct receiver integration, such as email, OpsGenie, or PagerDuty. The core concepts implemented by the Alertmanager are given below.
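
As a minimal sketch of a receiver integration, the alertmanager.yml fragment below defines an email receiver and a PagerDuty receiver. The names, address, and key are illustrative placeholders (email delivery also needs the global SMTP settings), and the routing tree described below decides which receiver handles which alerts.

# alertmanager.yml (fragment, illustrative values)
receivers:
  - name: ops-email
    email_configs:
      - to: "oncall@example.com"
  - name: ops-pagerduty
    pagerduty_configs:
      - routing_key: "<pagerduty-integration-key>"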

Grouping

Grouping combines alerts of a similar nature into a single notification. This is especially useful during larger outages, when many systems fail at once and hundreds or thousands of alerts may be firing simultaneously.

For example, suppose dozens or hundreds of service instances are running in a cluster when a network partition takes place, and about half of the instances can no longer reach the database. If the Prometheus alerting rules are configured to send an alert for every service instance that cannot communicate with the database, hundreds of such alerts are sent to the Alertmanager.

As a user, you want to receive a single page while still being able to see exactly which service instances are affected. You can therefore configure the Alertmanager to group alerts by cluster and alert name so that it sends only a single compact notification.

The grouping of alerts, the timing of grouped notifications, and the receivers of those notifications are configured with a routing tree in the configuration file.
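
A minimal routing tree along these lines might look like the following in alertmanager.yml; the receiver name and timing values are illustrative assumptions.

route:
  # Group alerts that share the same alert name and cluster label
  group_by: ['alertname', 'cluster']
  # How long to wait before sending the first notification for a new group
  group_wait: 30s
  # How long to wait before notifying about new alerts added to an existing group
  group_interval: 5m
  # How long to wait before repeating a notification that was already sent
  repeat_interval: 4h
  receiver: ops-email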

Inhibition

Inhibition is the concept of suppressing notifications for certain alerts if certain other alerts are already firing.

For example, suppose an alert is firing that indicates an entire cluster is unreachable. The Alertmanager can be configured to mute all other alerts concerning that cluster while this alert is firing. This prevents notifications for hundreds or thousands of firing alerts that are unrelated to the actual issue.

Inhibitions can be configured via Alertmanager’s configuration file.
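
A sketch of such an inhibition rule in alertmanager.yml might look like the following, assuming hypothetical alert names ClusterUnreachable and InstanceDown:

inhibit_rules:
  # Mute InstanceDown alerts while ClusterUnreachable is firing for the same cluster
  - source_match:
      alertname: ClusterUnreachable
    target_match:
      alertname: InstanceDown
    # Only inhibit alerts whose cluster label matches the source alert
    equal: ['cluster']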

Silences

Silences are a straightforward way to mute particular alerts for a specified period of time. A silence is configured based on matchers, just like the routing tree. Every incoming alert is checked against the equality or regular-expression matchers of each active silence; if it matches them all, no notifications are sent out for that alert.

Silences are configured in the web interface of the Alertmanager.

Client behavior

The Alertmanager has special requirements for the behavior of its clients. These are relevant only for advanced use cases in which Prometheus is not used to send alerts.

High Availability

Alertmanager can be configured to form a cluster for high availability, using its --cluster.* flags (for example, --cluster.listen-address and --cluster.peer).

It is important not to load balance traffic between Prometheus and its Alertmanagers. Instead, point Prometheus to a list of all Alertmanager instances.
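
For example, the prometheus.yml fragment below lists every Alertmanager instance explicitly; the host names are placeholders and would match your own cluster members.

# prometheus.yml (fragment, illustrative host names)
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # List each Alertmanager instance directly; do not put a load balancer in front
            - "alertmanager-1:9093"
            - "alertmanager-2:9093"
            - "alertmanager-3:9093"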

Sending Alerts to Clients via Slack

Let us take a brief look at how clients send alerts to the Alertmanager and how notifications are generated from them.

Note: Prometheus automatically takes care of sending the alerts generated by its configured alerting rules. It is highly recommended to configure alerting rules in Prometheus based on time-series data rather than implementing a direct client.

The Alertmanager listens for alerts on two APIs, v1 and v2. The v1 scheme is shown in the code snippet below, while the v2 scheme is specified as an OpenAPI specification that can be found in the Alertmanager repository. Clients are expected to continually re-send alerts as long as they are still active (generally on the order of 30 seconds to 3 minutes). Clients push a list of alerts to the Alertmanager through a POST request.

The labels of each alert are used to identify identical instances of the alert and to perform deduplication. The annotations are always set to those received most recently and do not identify an alert.

Both timestamps, startsAt and endsAt, are optional. If startsAt is omitted, the Alertmanager assigns the current time. endsAt is only set if the end time of the alert is known; otherwise, it is set to a configurable timeout period after the alert was last received.

The generatorURL field is a unique back-link that identifies the causing entity of this alert in the client.

The structure of the v1 scheme, as of Alertmanager version 0.21, is shown in the snippet below.

[
  {
    "labels": {
      "alertname": "<requiredAlertName>",
      "<labelname>": "<labelvalue>",
      ...
    },
    "annotations": {
      "<labelname>": "<labelvalue>",
    },
    "startsAt": "<rfc3339>",
    "endsAt": "<rfc3339>",
    "generatorURL": "<generator_url>"
  },
  ...
]

Integration

Alertmanager can be integrated with tools such as Grafana, Jenkins, and many more. Let us have a look at how to integrate Alertmanager with Grafana.

Grafana Integration

Iris, an incident notification tool, can be integrated easily into an existing Grafana installation.

Iris Configuration

Enable the built-in Grafana webhook in the Iris configuration, as shown in the snippet below.

webhooks:
  - grafana

Now create an application using the Iris UI; let us use the name ‘grafana’ in this example. Once the application has been created, we can retrieve the application’s key. Here, we will use “abc” as the application key.

Grafana Configuration

In Grafana, we can then configure Iris as a notification channel, using the application name, the application’s key, and the target plan as parameters in the webhook URL.

The configuration of the webhook URL is given below.

Name: iris-team1
Type: webhook
Url: http://iris:16649/v0/webhooks/grafana?application=grafana&key=abc&plan=team1
Http Method: POST

This simple configuration adds the notification channel to our alerts in Grafana.

Conclusion

As shown with Prometheus above, the Alertmanager can also be integrated with other tools such as Grafana, Jenkins, the Iris API, and many more.

At ScriptBees, we take care of these integrations and deliver the right build based on your requirements. Our team of technical experts can also advise you on selecting the right integrations and configuration files.
