Monitoring Network Devices with Prometheus and Grafana (Part 3)

The previous post covered the installation; this one covers the changes to the configuration files.

Modifying the snmp_exporter configuration file

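The file to change is /opt/snmp_exporter/snmp.yml, which already contains a number of commonly used OIDs. There are two ways to change it: build the generator with Go and generate a new snmp.yml, or edit the bundled snmp.yml directly.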

Building with Go

  • Some of the required packages are not in the distribution's official repositories, so install the EPEL repository first.
yum -y install epel-release
  • Install the packages needed for the build.
yum -y install git gcc gcc-c++ make net-snmp net-snmp-utils net-snmp-libs net-snmp-devel go p7zip p7zip-plugins
  • Clone the snmp_exporter source code.
git clone https://github.com/prometheus/snmp_exporter.git
  • Build the generator.
go get github.com/prometheus/snmp_exporter/generator
cd snmp_exporter/generator
go build
  • Decide at this point whether to generate the default configuration or a custom one. For the default configuration, run the following commands:
make mibs
export MIBDIRS=mibs
./generator generate
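  • When the commands finish, snmp.yml appears in the current directory. A custom configuration needs the vendor's MIB files; Juniper is used as the example here. First upload the downloaded MIB archive to the snmp_exporter/generator/ directory.
unzip juniper-mibs-15.1R7.8.zip # extract the zip archive
mv juniper-mibs-15.1R7.8 juniper # rename the directory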
  • Edit the generator.yml configuration file. Backing up the original first is recommended.
mv generator.yml generator.yml.bak
vim generator.yml # create a new configuration file and add the content below

+ modules:
+   juniper: # module name, chosen freely
+     walk:
+       - 1.3.6.1.4.1.2636.3.39
+       - 1.3.6.1.4.1.2636.3.13
+     version: 2 # SNMP version
+     auth:
+       community: public # community string
  • Then run the generator again, this time with MIBDIRS pointing at the Juniper MIBs.
export MIBDIRS=juniper:/usr/share/snmp/mibs
./generator generate
  • Move the generated snmp.yml to /opt/snmp_exporter/ and restart the service.
mv snmp.yml /opt/snmp_exporter/
systemctl restart snmp_exporter
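
When several vendors each need their own module, writing generator.yml by hand gets repetitive. The following is a minimal sketch, not part of snmp_exporter, that renders one module in the same layout as the Juniper example above. The function name render_module and its parameters are hypothetical, and the layout assumes the generator.yml format used in this post, where version and auth sit directly under the module.

```python
def render_module(name, walk_oids, version=2, community="public"):
    """Return generator.yml text for a single module (layout as used in this post)."""
    lines = ["modules:", f"  {name}:", "    walk:"]
    # One list entry per OID subtree to walk.
    lines += [f"      - {oid}" for oid in walk_oids]
    lines += [
        f"    version: {version}",
        "    auth:",
        f"      community: {community}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Reproduces the Juniper module from this post.
    print(render_module("juniper",
                        ["1.3.6.1.4.1.2636.3.39", "1.3.6.1.4.1.2636.3.13"]),
          end="")
```

The output can be redirected into generator.yml before running ./generator generate.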

Editing the default snmp.yml directly

  • Add the OIDs of the data you want to monitor to the configuration file: vim /opt/snmp_exporter/snmp.yml.
  if_mib:
+   auth:
+     community: public
    walk:
    - 1.3.6.1.2.1.2
    - 1.3.6.1.2.1.31.1.1
+   - 1.3.6.1.4.1.2636.3.39.1.12
+   - 1.3.6.1.4.1.2636.3.1
    get:
    - 1.3.6.1.2.1.1.3.0
    metrics:
+   - name: jnxAvailableSession
+     oid: 1.3.6.1.4.1.2636.3.39.1.12.1.1.1.7
+     type: gauge
+     help: Juniper_SRX_MonitoringMaxFlowSession
+   - name: FlowSessionsCount
+     oid: 1.3.6.1.4.1.2636.3.39.1.12.1.1.1.6.0
+     type: gauge
+     help: Current number of Flow sessions
+   - name: jnxJsSPUMonitoringMemoryUsage
+     oid: 1.3.6.1.4.1.2636.3.39.1.12.1.1.1.5.0
+     type: gauge
+     help: Current memory usage of SPU(CPU) in percentage.
+   - name: jnxJsSPUMonitoringCPUUsage
+     oid: 1.3.6.1.4.1.2636.3.1.13.1.8.9.1.0.0
+     type: gauge
+     help: Current Services Processing Unit's -SPU (CPU) Utilization in percentage.
+   - name: jnxOperatingTemp
+     oid: 1.3.6.1.4.1.2636.3.1.13.1.7.9.1.0
+     type: gauge
+     help: The temperature in Celsius (degrees C) of this subject. Zero if unavailable or inapplicable
+   - name: jnxOperatingMemoryRE1
+     oid: 1.3.6.1.4.1.2636.3.1.13.1.15.9.2.0.0
+     type: gauge
+     help: The installed memory size in Megabytes of this subject. Zero if unavailable or inapplicable.
+   - name: jnxOperatingMemoryRE0
+     oid: 1.3.6.1.4.1.2636.3.1.13.1.15.9.1.0.0
+     type: gauge
+     help: The installed memory size in Megabytes of this subject. Zero if unavailable or inapplicable.
    - name: sysUpTime
      oid: 1.3.6.1.2.1.1.3
      type: gauge
      help: The time (in hundredths of a second) since the network management portion
        of the system was last re-initialized. - 1.3.6.1.2.1.1.3
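
An easy mistake when adding entries by hand is to add a metric whose OID is not below any of the walked subtrees, in which case nothing is collected for it. Below is a rough sketch of a check for that. The helper names (oid_parts, is_under, uncovered) are hypothetical and the lists are copied from the module above. OIDs are compared component by component, because a plain string prefix test would wrongly treat 1.3.6.1.4.1.2636.3.13 as part of 1.3.6.1.4.1.2636.3.1.

```python
def oid_parts(oid):
    """Split a dotted OID string into a tuple of integers."""
    return tuple(int(p) for p in oid.strip(".").split("."))


def is_under(oid, root):
    """True if `oid` equals `root` or lies in the subtree below it."""
    o, r = oid_parts(oid), oid_parts(root)
    return o[:len(r)] == r


def uncovered(metric_oids, walk_roots):
    """Return the metric OIDs that none of the walked subtrees covers."""
    return [m for m in metric_oids
            if not any(is_under(m, w) for w in walk_roots)]


if __name__ == "__main__":
    walk = ["1.3.6.1.2.1.2", "1.3.6.1.2.1.31.1.1",
            "1.3.6.1.4.1.2636.3.39.1.12", "1.3.6.1.4.1.2636.3.1"]
    metrics = ["1.3.6.1.4.1.2636.3.39.1.12.1.1.1.7",
               "1.3.6.1.4.1.2636.3.1.13.1.7.9.1.0"]
    print(uncovered(metrics, walk))
```

The check only looks at the walk list; metrics fetched through get, such as sysUpTime, are outside its scope.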

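  • After editing, restart the service again for the change to take effect.
  • Open the snmp_exporter web page to check that the configuration has taken effect.
  • The target field is the IP address of the monitored device.
  • This essentially completes the snmp_exporter configuration.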
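
The same check can be done without the web form, since the form only builds a request to the /snmp path with target and module parameters. A small sketch of building that URL follows; the function name snmp_probe_url is hypothetical, and the addresses are the ones used elsewhere in this post (exporter on 192.168.254.112:9116, device 192.168.254.99).

```python
from urllib.parse import urlencode


def snmp_probe_url(exporter, target, module="if_mib"):
    """Build the snmp_exporter URL that probes `target` with `module`."""
    query = urlencode({"target": target, "module": module})
    return f"http://{exporter}/snmp?{query}"


if __name__ == "__main__":
    print(snmp_probe_url("192.168.254.112:9116", "192.168.254.99"))
```

Opening the printed URL in a browser, or fetching it with curl, should return the metrics defined in the module if the device answers SNMP.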

Modifying the Prometheus configuration file

  • This file is simple: just add a scrape job for the device. vim /etc/opt/prometheus/prometheus.yml
  # my global config
  global:
    scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
    evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
    # scrape_timeout is set to the global default (10s).

  # Alertmanager configuration
  alerting:
    alertmanagers:
      - static_configs:
          - targets:
            # - alertmanager:9093

  # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
  rule_files:
    # - "first_rules.yml"
    # - "second_rules.yml"

  # A scrape configuration containing exactly one endpoint to scrape:
  # Here it's Prometheus itself.
  scrape_configs:
    # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
    - job_name: "prometheus"

      # metrics_path defaults to '/metrics'
      # scheme defaults to 'http'.

      static_configs:
        - targets: ["localhost:9090"]
+   - job_name: 'test_juniper'
+     static_configs:
+       - targets: ['192.168.254.99']
+     metrics_path: /snmp
+     params:
+       module: [if_mib]
+     relabel_configs:
+       - source_labels: [__address__]
+         target_label: __param_target
+       - source_labels: [__param_target]
+         target_label: instance
+       - target_label: __address__
+         replacement: 192.168.254.112:9116
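
The three relabel rules are the least obvious part of this job, so here is a simplified sketch of what they do to a target's labels. It is not Prometheus code: apply_replace and relabel_snmp_target are hypothetical names, and only the default replace action with the default regex is modelled (Prometheus additionally supports custom regexes and drops labels whose value ends up empty).

```python
def apply_replace(labels, target_label, source_labels=(), replacement=None,
                  separator=";"):
    """Simplified `replace` action: copy joined source labels, or a fixed value."""
    joined = separator.join(labels.get(name, "") for name in source_labels)
    out = dict(labels)
    out[target_label] = joined if replacement is None else replacement
    return out


def relabel_snmp_target(address, exporter):
    """Apply the three rules from the test_juniper job in order."""
    labels = {"__address__": address}
    # 1. The device address becomes the ?target= URL parameter.
    labels = apply_replace(labels, "__param_target", source_labels=["__address__"])
    # 2. The device address is also kept as the instance label.
    labels = apply_replace(labels, "instance", source_labels=["__param_target"])
    # 3. The scrape itself is sent to snmp_exporter instead of the device.
    labels = apply_replace(labels, "__address__", replacement=exporter)
    return labels


if __name__ == "__main__":
    print(relabel_snmp_target("192.168.254.99", "192.168.254.112:9116"))
```

In other words, Prometheus scrapes the exporter at 192.168.254.112:9116 and asks it to query the device at 192.168.254.99, while the resulting series still carry the device address as instance.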

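  • job_name: the name of the scrape job
  • scrape_interval: how often targets are scraped
  • targets: the devices to monitor
  • Check that the configuration file is valid. The command below starts Prometheus in the foreground with this file and exits with an error if the file is invalid; promtool check config prometheus.yml validates the file without starting the server.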
./prometheus --config.file=prometheus.yml
  • Restart the Prometheus service.
systemctl restart prometheus
  • Open the web page and check that monitoring data is arriving. Prometheus listens on port 9090.

  • Check that the graphs are displayed.
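
Besides the web page, the data can also be checked through the Prometheus HTTP API. The following is a sketch under a few assumptions: Prometheus is reachable on localhost:9090 as configured above, and the helper names (query_url, instant_values, fetch) are made up for this example. It asks for the built-in up metric of the test_juniper job, which is 1 when the last scrape succeeded.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(prometheus, expr):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"http://{prometheus}/api/v1/query?{urlencode({'query': expr})}"


def instant_values(payload):
    """Map each series' instance label to its sample value."""
    if payload.get("status") != "success":
        raise ValueError("query failed")
    return {s["metric"].get("instance", ""): float(s["value"][1])
            for s in payload["data"]["result"]}


def fetch(prometheus, expr, timeout=5):
    """Run the query against a live Prometheus server (needs network access)."""
    with urlopen(query_url(prometheus, expr), timeout=timeout) as resp:
        return instant_values(json.load(resp))


if __name__ == "__main__":
    print(query_url("localhost:9090", 'up{job="test_juniper"}'))
```

Calling fetch("localhost:9090", 'up{job="test_juniper"}') on the monitoring host should return a value of 1.0 for the instance 192.168.254.99 once scraping works.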

Follow-up