本篇文章不讲述如何使用prometheus监控容器,详细来说一下监控容器后续使用钉钉报警的讲解
前言:
之前一直使用的是prometheus-webhook-dingtalk0.3.0版本来进行告警的
效果图如下:
作者关注了很久项目今天迎来了重大更新Click me迎来了prometheus-webhook-dingtalk1.2.2版本,整体有很大的革新,所以按奈不住躁动的心,前来尝试一番:
效果图如下:
我使用的是prometheus-operator来部署的监控容器,这里不做过多赘述。
部署完成后效果图如下:
可以看到已经都部署完成了,但是现在还没有关联钉钉告警,接下来我们来部署钉钉
- 部署钉钉
- 这里挂载了一个名为dingding的cm
vim dingtalk.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
name: dingtalk-hook
namespace: monitoring
spec:
template:
metadata:
labels:
app: dingtalk-hook
spec:
containers:
- name: dingtalk-hook
image: timonwong/prometheus-webhook-dingtalk:v1.2.2
args:
- '--web.listen-address=0.0.0.0:8060'
- '--log.level=info'
- '--config.file=/etc/prometheus-webhook-dingtalk/config.yml'
imagePullPolicy: IfNotPresent
ports:
- name: http
containerPort: 8060
resources:
requests:
cpu: 100m
memory: 32Mi
limits:
cpu: 200m
memory: 64Mi
volumeMounts:
- mountPath: /etc/prometheus-webhook-dingtalk/
name: config-yml
volumes:
- configMap:
defaultMode: 420
name: dingding
name: config-yml
---
apiVersion: v1
kind: Service
metadata:
name: dingtalk-hook
namespace: monitoring
spec:
ports:
- port: 8060
protocol: TCP
targetPort: 8060
name: http
selector:
app: dingtalk-hook
type: ClusterIP
- 创建dingding cm
vim prometheus-webhook-dingtalk-config.yaml
apiVersion: v1
data:
config.yml: >-
targets:
webhook2:
url: https://oapi.dingtalk.com/robot/send?access_token=10bda98979ae2155b6822b699cde1841d4fbd8514c0441bbbb4485caddf3a388x
webhook_legacy:
url: https://oapi.dingtalk.com/robot/send?access_token=10bda98979ae2155b6822b699cde1841d4fbd8514c0441bbbb4485caddf3a388x
message:
title: '{{ template "default.title" . }}'
text: '{{ template "default.content" . }}'
webhook_mention_all:
url: https://oapi.dingtalk.com/robot/send?access_token=10bda98979ae2155b6822b699cde1841d4fbd8514c0441bbbb4485caddf3a388x
mention:
all: true
webhook_mention_users:
url: https://oapi.dingtalk.com/robot/send?access_token=10bda98979ae2155b6822b699cde1841d4fbd8514c0441bbbb4485caddf3a388x
mention:
mobiles: ['15xxxxxxxx0']
kind: ConfigMap
apiVersion: v1
metadata:
name: dingding
namespace: monitoring
- 上面的两个文件编辑完成后进行创建
kubectl create -f dingtalk.yaml
kubectl create -f prometheus-webhook-dingtalk-config.yaml
kubectl get pod -n monitoring
#你会看到一个dingtalk-hook-xxxxx的pod状态已经是running
- 钉钉也创建完成了配置文件也创建完成了,但是现在还没有和alertmanager进行关联,接下来把alertmanager和钉钉进行关联
# 配置alertmanager
vim alertmanager.yaml
global:
resolve_timeout: 1m
route:
group_by: ['job', 'severity']
group_wait: 30s
group_interval: 1m
repeat_interval: 1m
receiver: 'webhook'
routes:
- receiver: webhook
group_wait: 30s
match:
severity: none
- receiver: webhook
group_wait: 30s
match_re:
severity: none|warning|critical
receivers:
- name: 'webhook'
webhook_configs:
- url: 'http://dingtalk-hook:8060/dingtalk/webhook2/send'
send_resolved: true
- 先删除prometheus-operator自带的告警,再创建自定义的告警
kubectl delete secret alertmanager-main -n monitoring
kubectl create secret generic alertmanager-main --from-file=alertmanager.yaml -n monitoring
删除默认自带的告警
- 不能直接改cm需要改crd
kubectl get crd -n monitoring
kubectl get prometheusrules -n monitoring
kubectl edit prometheusrules prometheus-k8s-rules -n monitoring
注意事项:
- 每个告警规则的annotation下面都要增加一个description
- description是这个的关键字,否则显示不正常