Prometheus Setup Tutorial

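Key Concepts

Prometheus's main job is to collect and store metric data, and that data comes from exporters: for example, a MySQL exporter for MySQL, an exporter for server performance metrics, and so on.

So to monitor something, such as a host's CPU usage, we need an exporter. Prometheus periodically pulls (scrapes) metric samples from the HTTP endpoint that the exporter exposes (usually /metrics).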
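To make the pull model concrete, here is a small sketch of what a /metrics response looks like and how a single sample can be read out of it. The sample text is illustrative (made-up values, not captured from a real exporter), and `sample_metrics` and `metric_value` are hypothetical helper names.

```shell
#!/bin/sh
# Illustrative sample of the Prometheus text exposition format
# (made-up values, not output captured from a real exporter).
sample_metrics() {
cat <<'EOF'
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
node_cpu_seconds_total{cpu="0",mode="user"} 67.8
EOF
}

# metric_value NAME: print the value of every sample whose series name
# (including any {labels}) is exactly NAME; comment lines are skipped.
metric_value() {
    awk -v name="$1" '!/^#/ && $1 == name { print $2 }'
}

sample_metrics | metric_value node_load1
# Against a live exporter the same filter could be used as:
#   curl -s http://10.50.51.30:9100/metrics | metric_value node_load1
```

Each non-comment line is one sample: a series name, optional labels in braces, and the current value. Prometheus attaches the timestamp itself when it scrapes.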

Installing and Deploying Prometheus

Visit the official website

https://prometheus.io/download/

Download the Linux build

https://github.com/prometheus/prometheus/releases/download/v2.42.0/prometheus-2.42.0.linux-amd64.tar.gz

Upload it to the server

Extract the archive
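If the server has internet access, the download and extract steps can be done on it directly. The following is a minimal sketch under that assumption; `prometheus_url` and `fetch_prometheus` are hypothetical helper names, and wget is assumed to be installed.

```shell
#!/bin/sh
# Build the release tarball URL for a given Prometheus version.
prometheus_url() {
    version="$1"
    printf 'https://github.com/prometheus/prometheus/releases/download/v%s/prometheus-%s.linux-amd64.tar.gz\n' "$version" "$version"
}

# Download the tarball and unpack it into the current directory.
fetch_prometheus() {
    url="$(prometheus_url "$1")"
    wget "$url" &&
    tar -zxvf "$(basename "$url")"
}

prometheus_url 2.42.0   # prints the download URL for v2.42.0
# On a server with internet access:
#   fetch_prometheus 2.42.0 && cd prometheus-2.42.0.linux-amd64
```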

Edit the configuration file

The prometheus.yml file

# my global config
global: # general global settings; only two parameters are set here
  scrape_interval: 15s # Scrape targets every 15 seconds. The default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules (alert checks) every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration 
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093


# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files: # lists the alerting rule files to load
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
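  # This job monitors the Prometheus server itself; more jobs can be added for other nodes to be monitored.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    # targets can list several nodes, separated by commas, each written as host:port.
    # The port is normally the exporter's port; for example, 9100 is node_exporter's default port.
    # Once the targets are configured, Prometheus reads them from this file and scrapes them continuously.

    static_configs:
      - targets: ["localhost:9090"]  # the port Prometheus itself listens on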
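The comments in the file mention adding further nodes, but the file only scrapes Prometheus itself. A minimal sketch of a scrape_configs section with one extra job is shown here; the job name "node" is a free choice, the target assumes node_exporter is listening on 10.50.51.30:9100 (the host used elsewhere in this tutorial), and `scrape_configs` is a hypothetical helper that only prints the fragment.

```shell
#!/bin/sh
# Print a scrape_configs section with an additional node_exporter job.
# "node" is an arbitrary job name; the target assumes node_exporter
# runs on 10.50.51.30 with its default port 9100.
scrape_configs() {
cat <<'EOF'
scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["10.50.51.30:9100"]
EOF
}

scrape_configs
# After editing prometheus.yml, the promtool binary shipped in the same
# tarball can validate the file before Prometheus is restarted:
#   ./promtool check config prometheus.yml
```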

Start command

nohup ./prometheus --config.file=prometheus.yml > ./prometheus.log 2>&1 &

Installing and Deploying the Exporter

Download URL

https://github.com/prometheus/node_exporter/releases/download/v1.5.0/node_exporter-1.5.0.linux-amd64.tar.gz

nohup ./node_exporter > node_exporter.log 2>&1 &

ps -ef |grep node_exporter

Check that it started correctly

Open http://10.50.51.30:9100/metrics in a browser

Configure node_exporter to start at boot

vi /usr/lib/systemd/system/node_exporter.service

[Unit]
Description=node_exporter
Documentation=https://github.com/prometheus/node_exporter
After=network.target
[Service]
Type=simple
# Adjust this path to the directory where node_exporter was actually extracted (the download above is v1.5.0).
ExecStart=/usr/local/node_exporter-1.4.0/node_exporter
Restart=on-failure
[Install]
WantedBy=multi-user.target
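Writing the unit file is not enough on its own: systemd has to reload its configuration and the service has to be enabled. The sketch below prints the commands by default and only runs them when DRY_RUN=0 is set as root; `run` and `enable_node_exporter` are hypothetical helper names. If node_exporter is still running from the earlier nohup command, stop that process first so port 9100 is free.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 (as root) to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_node_exporter() {
    run systemctl daemon-reload                    # pick up the new unit file
    run systemctl enable node_exporter             # start automatically at boot
    run systemctl start node_exporter              # start it now
    run systemctl status node_exporter --no-pager  # confirm it is active
}

enable_node_exporter
```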

Installing and Deploying Grafana

Official Grafana download page: https://grafana.com/grafana/download?pg=get&plcmt=selfmanaged-box1-cta1

wget https://dl.grafana.com/enterprise/release/grafana-enterprise-9.3.6.linux-amd64.tar.gz
tar -zxvf grafana-enterprise-9.3.6.linux-amd64.tar.gz

Reference guide

https://grafana.com/grafana/download?pg=get&plcmt=selfmanaged-box1-cta1

Start command (run from Grafana's bin directory)

nohup ./grafana-server &

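Email (SMTP) settings need to be configured

Create a config directory under the Grafana directory and create a grafana.ini file in it. Grafana does not look there by default, so pass the file at startup with --config (alternatively, put the settings in conf/custom.ini).

#################################### SMTP / Emailing ##########################
# Mail server settings
[smtp]
enabled = true
# Outgoing mail server
host = smtp.qq.com:465
# SMTP account
user = 2469278741@qq.com
# SMTP authorization code
password = 123456
# Sender address
from_address = 2469278741@qq.com
# Sender name
from_name = zhiweiliao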

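The data source also needs to be configured, in conf/provisioning/datasources/datasource.yml

# config file version
apiVersion: 1

deleteDatasources:  # first delete any existing data source named Prometheus with orgId 1
- name: Prometheus
  orgId: 1

datasources:  # configure the Prometheus data source
- name: Prometheus
  type: prometheus
  access: proxy
  orgId: 1
  url: http://prometheus:9090  # inside the same Docker Compose project the prometheus service name resolves directly; for the install in this tutorial use http://localhost:9090
  basicAuth: false
  isDefault: true
  version: 1
  editable: true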

Open the page

http://10.50.51.30:3000/

Log in (Grafana's default credentials are admin / admin; the prompt to change the password can be skipped)

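Add the Prometheus data source

Click the gear icon in the sidebar, then Add data source

Select Prometheus, enter the URL http://localhost:9090, and click Save & test; it should report success

Test a query

Click the Explore icon

In the drop-down to the right of Explore at the top, select Prometheus

Under Metric, select go_gc_duration_seconds

Under Label filters, select instance = localhost:9090, then click Run query and the chart data appears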

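Installing and Deploying Pushgateway

Some metrics can be collected by pulling, but some data is event-driven, or we simply want to push it to Prometheus ourselves. That is what Pushgateway is for.

Download URL: https://github.com/prometheus/pushgateway/releases/download/v1.5.1/pushgateway-1.5.1.linux-amd64.tar.gz

Start command: nohup ./pushgateway &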

Check the port

netstat -apn | grep 9091

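View the Pushgateway page

Open Pushgateway's web page at http://10.50.51.30:9091. The Metrics tab shows no data, because no client has pushed anything to Pushgateway yet.

Edit the Prometheus server configuration file and define a job

In the Prometheus server's prometheus.yml, define a job whose targets point to the Pushgateway host's IP and port 9091, then restart Prometheus so that it loads the new job.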
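The job definition itself is not shown in this tutorial, so here is a minimal sketch of what it could look like. The job name "pushgateway" is a free choice, and `pushgateway_job` is a hypothetical helper that only prints the fragment. Setting honor_labels: true is the usual choice for Pushgateway so that the job and instance labels set by the pushing client are kept rather than overwritten by the scrape.

```shell
#!/bin/sh
# Print a scrape job for Pushgateway, to be added under scrape_configs
# in prometheus.yml. The job name "pushgateway" is arbitrary.
pushgateway_job() {
cat <<'EOF'
  - job_name: "pushgateway"
    honor_labels: true   # keep the job/instance labels set by the pushing client
    static_configs:
      - targets: ["10.50.51.30:9091"]
EOF
}

pushgateway_job
```

Add these lines under scrape_configs before restarting Prometheus.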

[root@ip-10-50-51-30 prometheus-2.42.0.linux-amd64]# ps -ef|grep prometheus
root     25157 22948  0 Feb20 pts/0    00:02:01 ./prometheus --config.file=prometheus.yml
root     25310 22948  2 14:35 pts/0    00:00:00 grep --color=auto prometheus
[root@ip-10-50-51-30 prometheus-2.42.0.linux-amd64]# kill -9 25157
[root@ip-10-50-51-30 prometheus-2.42.0.linux-amd64]# nohup ./prometheus --config.file=prometheus.yml > ./prometheus.log 2>&1 &
[root@ip-10-50-51-30 prometheus-2.42.0.linux-amd64]# ps -ef|grep prometheus

Write a collection script that gathers host data and pushes it to Pushgateway

vim pushgateway.sh                          # write the Pushgateway collection script
#!/bin/bash
instance_name=$(hostname -f | cut -d'.' -f1) # take the short host name
if [ "$instance_name" = "localhost" ]; then
   echo "Hostname must not be localhost"     # the host name identifies the machine, so localhost cannot be used
   exit 1
fi
label="count_netstat_wait_connections"                               # metric name (the key)
count_netstat_wait_connections=$(netstat -an | grep -i wait | wc -l) # metric value: connections in a *WAIT state

# Push the sample to Pushgateway
echo "$label $count_netstat_wait_connections" | curl --data-binary @- "http://10.50.51.30:9091/metrics/job/${instance_name}"
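The script above uses the host name as the job label. Pushgateway also accepts further grouping labels in the URL path, so a variant is to keep a fixed job name and put the host in an instance label. The following is a sketch of that idea, not a replacement for the script: the job name host_stats, the instance web01, and the helper names `push_url`, `payload`, and `push_metric` are all hypothetical.

```shell
#!/bin/sh
# Base address of the Pushgateway used in this tutorial.
PUSHGATEWAY="http://10.50.51.30:9091"

# push_url JOB INSTANCE: build the grouping-key URL for one job/instance pair.
push_url() {
    printf '%s/metrics/job/%s/instance/%s\n' "$PUSHGATEWAY" "$1" "$2"
}

# payload NAME VALUE: emit one gauge sample preceded by its TYPE line.
payload() {
    printf '# TYPE %s gauge\n%s %s\n' "$1" "$1" "$2"
}

# push_metric JOB INSTANCE NAME VALUE: send the sample (needs network access).
push_metric() {
    payload "$3" "$4" | curl --silent --fail --data-binary @- "$(push_url "$1" "$2")"
}

push_url host_stats web01
payload count_netstat_wait_connections 7
# Real push:    push_metric host_stats web01 count_netstat_wait_connections 7
# Remove group: curl -X DELETE "$(push_url host_stats web01)"
```

Pushgateway keeps the last pushed value for a group until it is overwritten or deleted, so the DELETE call is useful when a host is retired.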

Then make the script executable (chmod +x pushgateway.sh) and run it

After that, open Prometheus or Grafana and query the metric

http://10.50.51.30:9090/graph?

count_netstat_wait_connections

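Switching to pushing from Java

Pushing to Pushgateway from Java

Option 1: push to Pushgateway

Pushing produces too much data. Every push is an HTTP request; that is acceptable for physical-machine metrics, but the volume of user data is too large. At present, user data is reported in real time over gRPC, while the UDP data is reported on a schedule.

Option 2: write to Redis

Option 3: write to local logs

Option 4: consume from Kafka when Prometheus scrapes

and return once a large batch has been read

The edge nodes already run Prometheus, so why does everything have to be centralized on the central node?

After discussing with colleagues: Prometheus is suited to storing monitoring metrics, not to recording every individual record. It periodically records the instantaneous state of its monitoring targets. If you want it to store complete records, does it have a translog? Because it ingests data by pulling, it is not suited to being used as a database!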
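One way to picture the difference between individual records and metrics: the raw events are collapsed into a handful of counters, and only those counters are exposed for scraping. The sketch below illustrates this with made-up data; the event format, the metric name requests_total, and the helper names `events` and `aggregate` are all hypothetical.

```shell
#!/bin/sh
# Hypothetical per-event records: one line per request, "user status".
events() {
cat <<'EOF'
alice ok
bob error
alice ok
carol ok
EOF
}

# Collapse the records into one sample per status value.
aggregate() {
    awk '{ n[$2]++ }
         END { for (s in n) printf "requests_total{status=\"%s\"} %d\n", s, n[s] }' | sort
}

events | aggregate
```

Four records become two samples. Who made each request is lost, which is exactly why the complete records belong in a log or a database rather than in Prometheus.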

--------------------------------------------------- The End -------------------------------------------

Docker Deployment

To be continued

References

Official site: https://prometheus.io/download/

CSDN: https://blog.csdn.net/weixin_44352521/article/details/127947313

      https://blog.csdn.net/MssGuo/article/details/127599745

Pushing from Java: https://blog.csdn.net/qq_21389711/article/details/125183313

Installing, deploying, and testing Prometheus + Pushgateway + Grafana