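Preface
This series shows how to build Docker monitoring with Prometheus + cAdvisor + Alertmanager, with the main goal of detecting when a specified Docker container has gone down.
This installment covers deploying Prometheus + Alertmanager and their basic usage.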
1. Overview
Prometheus evaluates alerting rules itself, but it does not deliver notifications; for that it has to be paired with Alertmanager.
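2. Download and install Alertmanager
Go to the download page and select darwin (macOS) as the operating system.
Scroll down until you see alertmanager, then download it.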
3. Configure Prometheus so that it can talk to Alertmanager
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]  # address Prometheus uses to reach Alertmanager, i.e. the IP and port Alertmanager listens on
4. Add prometheus.rules.yml with a Prometheus rule that raises an alert when an instance goes down
groups:
  - name: Instances
    rules:
      - alert: InstanceDown
        expr: up == 0
        for: 5s
        labels:
          severity: page
        # Prometheus templates apply here in the annotation and label fields of the alert.
        annotations:
          description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 s.'
          summary: 'Instance {{ $labels.instance }} down'
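The `for: 5s` clause means the alert does not fire the moment `up == 0` is first seen: it is first held as pending and only becomes firing once the condition has stayed true for at least 5 seconds across rule evaluations. The following is a simplified Python model of that behaviour, not Prometheus's actual implementation; the class name `AlertState` and the timestamps are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertState:
    """Toy model of one alert instance governed by a `for` duration."""

    for_seconds: float                    # corresponds to `for: 5s` in the rule
    active_since: Optional[float] = None  # time the condition first became true
    state: str = "inactive"

    def evaluate(self, now: float, condition_true: bool) -> str:
        """Simulate one rule evaluation at time `now` (in seconds)."""
        if not condition_true:
            # The expression returned nothing: the alert is cleared.
            self.active_since = None
            self.state = "inactive"
            return self.state
        if self.active_since is None:
            # First evaluation where the condition holds.
            self.active_since = now
            self.state = "pending"
        if now - self.active_since >= self.for_seconds:
            # The condition has held long enough to fire.
            self.state = "firing"
        return self.state
```

Assuming, for example, that rules are evaluated every 15 seconds, `evaluate(0, True)` leaves the alert pending and `evaluate(15, True)` moves it to firing, so the real delay before an alert fires depends on the evaluation interval as well as on `for`.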
Point Prometheus at the rule file in prometheus.yml:
rule_files:
  - 'prometheus.rules.yml'
5. Edit alertmanager.yml and configure webhook_configs, i.e. the endpoint that is called when an alert fires
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
  - name: 'web.hook'
    webhook_configs:
      - url: 'http://localhost:5200/auth/instanceDown'
The url 'http://localhost:5200/auth/instanceDown' is the endpoint your alerts will call: Alertmanager sends it an HTTP POST with a JSON body describing the alerts.
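The service behind http://localhost:5200/auth/instanceDown is not shown in this post. As a stand-in for testing, here is a minimal sketch of a receiver using only the Python standard library. The names `summarize_alerts`, `AlertHandler` and `make_server` are hypothetical; the payload fields read here (`alerts`, `status`, `labels`, `annotations`) are the ones found in Alertmanager's webhook JSON.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alerts(payload: dict) -> list:
    """Turn an Alertmanager webhook payload into one text line per alert."""
    lines = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        lines.append(
            f"[{alert.get('status', 'unknown')}] "
            f"{labels.get('alertname', '?')} "
            f"instance={labels.get('instance', '?')}: "
            f"{annotations.get('summary', '')}"
        )
    return lines


class AlertHandler(BaseHTTPRequestHandler):
    """Accepts POSTs on the path configured in alertmanager.yml."""

    def do_POST(self):
        if self.path != "/auth/instanceDown":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self.send_error(400, "invalid JSON")
            return
        for line in summarize_alerts(payload):
            print(line)
        # Any 2xx response tells Alertmanager the notification was delivered.
        self.send_response(200)
        self.end_headers()


def make_server(port: int = 5200) -> HTTPServer:
    """Build (but do not start) a server on the port used in the webhook url."""
    return HTTPServer(("localhost", port), AlertHandler)
```

To try it, call `make_server().serve_forever()` in a separate terminal before starting Alertmanager; each notification should then print one line per alert in the group.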
Start Alertmanager
./alertmanager --config.file=alertmanager.yml
Start Prometheus
./prometheus --config.file=prometheus.yml
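Before triggering an alert, it can help to confirm that both processes are actually up. Prometheus and Alertmanager each expose a `/-/healthy` endpoint, so a small check could look like the sketch below. The function names `is_healthy` and `check_all` are hypothetical, and the ports assume the defaults used in this post (9090 for Prometheus, 9093 for Alertmanager).

```python
import http.client
import urllib.request


def is_healthy(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/-/healthy answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/-/healthy", timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error status, malformed reply, ...
        return False


def check_all(targets: dict) -> dict:
    """Map each service name to its health status."""
    return {name: is_healthy(url) for name, url in targets.items()}
```

For the setup described here, `check_all({"prometheus": "http://localhost:9090", "alertmanager": "http://localhost:9093"})` returns a dictionary with one boolean per service.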
Shut down any one of the processes started in the previous post, for example the one at http://localhost:8080
Then open http://localhost:9090/alerts to see the alert.
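The same information shown on the alerts page is also available from the Prometheus HTTP API at `/api/v1/alerts`, which is convenient for scripting. The sketch below lists the instances for which `InstanceDown` is currently firing; the function names `firing_instances` and `fetch_alerts` are hypothetical, and the base URL assumes Prometheus runs on localhost:9090 as in this post.

```python
import json
import urllib.request


def firing_instances(api_response: dict, alertname: str = "InstanceDown") -> list:
    """Extract the instances of firing alerts from an /api/v1/alerts response."""
    alerts = api_response.get("data", {}).get("alerts", [])
    return sorted(
        alert.get("labels", {}).get("instance", "?")
        for alert in alerts
        if alert.get("state") == "firing"
        and alert.get("labels", {}).get("alertname") == alertname
    )


def fetch_alerts(base_url: str = "http://localhost:9090") -> dict:
    """Fetch the current alerts from a running Prometheus server."""
    with urllib.request.urlopen(f"{base_url}/api/v1/alerts", timeout=5) as resp:
        return json.load(resp)
```

With Prometheus running, `firing_instances(fetch_alerts())` should include the stopped target once the alert has moved from pending to firing.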