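Overview
I recently hit a pitfall with the multitenant network plugin that left the cluster-console page in OpenShift 3.11 unable to fetch monitoring data. cluster-console belongs to the openshift-console project and Prometheus monitoring belongs to the openshift-monitoring project. By default, openshift-console can reach openshift-monitoring.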
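Requirement
A customer wanted Prometheus to monitor the Fuse applications under ipaas-dev, but Prometheus never received any data. I suspected the multitenant network, so I used the following command to let the two projects reach each other: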
oc adm pod-network join-projects --to=ipaas-dev openshift-monitoring
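It made no difference: Prometheus still received no data. I moved on to the actual problem of getting Prometheus to monitor Fuse and did not roll back the network change I had just made. I eventually solved the monitoring problem, and I will write a separate post on how to monitor Fuse.
The problem appears
The next day the customer reported that the cluster-console page could not fetch monitoring data. It looked like this:
I was baffled at first. The browser's network panel showed that the data requests were very slow, all red, and never finished loading. The cluster-console pod logs were full of messages saying it could not connect to the Prometheus service address. Accessing the Prometheus pod from inside the cluster-console pod also failed, yet the same access worked from the node, which ruled out a problem in Prometheus itself. Then I remembered the previous day's change. I checked the documentation but could not find a rollback command, so to clear the outage first I opened network access between openshift-monitoring and openshift-console with the following command, and the page recovered.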
oc adm pod-network join-projects --to=openshift-console openshift-monitoring
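Digging deeper
- First I checked the official documentation. oc adm pod-network has three main subcommands:
isolate-projects: isolates a project so that it cannot communicate with any other project, in either direction.
join-projects: joins one or more projects to the network of the project given with --to, so that they can all reach each other.
make-projects-global: makes a project global, so that every other project can reach it and it can reach them.
- Projects with the same NETID can reach each other, and a NETID of 0 marks a global project that every other project can reach. As the listing shows, openshift-monitoring has NETID 0, so by default it can already communicate with every other project and the access I opened at the start was pointless. Worse, join-projects gave openshift-monitoring the NETID of ipaas-dev, so it could talk only to ipaas-dev, and openshift-console and openshift-monitoring could no longer reach each other.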
[root@master ~]# oc get netnamespaces
NAME NETID EGRESS IPS
default 0 []
fuse 11932360 []
kube-public 16037396 []
kube-system 5893429 []
management-infra 888177 []
openshift 11855152 []
openshift-console 3638287 []
openshift-infra 8183415 []
openshift-logging 10994998 []
openshift-monitoring 0 []
openshift-node 2169558 []
openshift-sdn 5037844 []
openshift-web-console 6908028 []
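The NETID rule described above can be turned into a quick check. The following is a minimal sketch, assuming the table format printed by `oc get netnamespaces` above; `netid_of` and `can_reach` are hypothetical helper names, not `oc` subcommands, and the check covers only NETID-based isolation, not any other reason two pods might fail to connect.

```shell
#!/bin/sh
# Sketch: decide from NETIDs alone whether two projects can reach each other
# under the multitenant plugin. Helper names are hypothetical.

# Print the NETID of project $1, reading the `oc get netnamespaces`
# table on stdin. Exits non-zero if the project is not listed.
netid_of() {
    awk -v p="$1" 'NR > 1 && $1 == p { print $2; found = 1 }
                   END { exit !found }'
}

# can_reach A B: reads the table on stdin. Exit status 0 means A and B
# share a NETID or at least one of them is global (NETID 0); 1 means they
# are isolated from each other; 2 means a project was not found.
can_reach() {
    table=$(cat)
    a=$(printf '%s\n' "$table" | netid_of "$1") || return 2
    b=$(printf '%s\n' "$table" | netid_of "$2") || return 2
    [ "$a" = "$b" ] || [ "$a" = 0 ] || [ "$b" = 0 ]
}
```

On a cluster it could be used as `oc get netnamespaces | can_reach openshift-console openshift-monitoring && echo reachable`.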
- Create three new projects. Each one gets a different NETID.
[root@master ~]# oc new-project test-1
[root@master ~]# oc new-project test-2
[root@master ~]# oc new-project test-3
[root@master ~]# oc get netnamespaces
NAME NETID EGRESS IPS
default 0 []
fuse 11932360 []
kube-public 16037396 []
kube-system 5893429 []
management-infra 888177 []
openshift 11855152 []
openshift-console 3638287 []
openshift-infra 8183415 []
openshift-logging 10994998 []
openshift-monitoring 0 []
openshift-node 2169558 []
openshift-sdn 5037844 []
openshift-web-console 6908028 []
test-1 9984602 []
test-2 2938297 []
test-3 5782801 []
- Join test-1, test-2, and test-3. All three now share the same NETID.
[root@master ~]# oc adm pod-network join-projects --to=test-1 test-2 test-3
[root@master ~]# oc get netnamespaces
NAME NETID EGRESS IPS
default 0 []
fuse 11932360 []
kube-public 16037396 []
kube-system 5893429 []
management-infra 888177 []
openshift 11855152 []
openshift-console 3638287 []
openshift-infra 8183415 []
openshift-logging 10994998 []
openshift-monitoring 0 []
openshift-node 2169558 []
openshift-sdn 5037844 []
openshift-web-console 6908028 []
test-1 9984602 []
test-2 9984602 []
test-3 9984602 []
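With many projects it gets hard to spot shared NETIDs by eye. As a rough sketch, again assuming the table format shown above, the listing can be grouped by NETID; `netid_groups` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: print one line per NETID followed by the projects that share it.
# Reads the `oc get netnamespaces` table on stdin; skips the header row.
netid_groups() {
    awk 'NR > 1 {
             if ($2 in g) g[$2] = g[$2] " " $1
             else         g[$2] = $1
         }
         END { for (id in g) print id ": " g[id] }' | sort -n
}
```

It would be run as `oc get netnamespaces | netid_groups`. Projects listed on the same line can reach each other, and the line for NETID 0 lists the global projects.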
- Isolate test-1. Its NETID changes.
[root@master ~]# oc adm pod-network isolate-projects test-1
[root@master ~]# oc get netnamespaces
NAME NETID EGRESS IPS
default 0 []
fuse 11932360 []
kube-public 16037396 []
kube-system 5893429 []
management-infra 888177 []
openshift 11855152 []
openshift-console 3638287 []
openshift-infra 8183415 []
openshift-logging 10994998 []
openshift-monitoring 0 []
openshift-node 2169558 []
openshift-sdn 5037844 []
openshift-web-console 6908028 []
test-1 8048714 []
test-2 9984602 []
test-3 9984602 []
- Make test-1 a global project. I later used this same command to turn openshift-monitoring into a global project.
[root@master ~]# oc adm pod-network make-projects-global test-1
[root@master ~]# oc get netnamespaces
NAME NETID EGRESS IPS
default 0 []
fuse 11932360 []
kube-public 16037396 []
kube-system 5893429 []
management-infra 888177 []
openshift 11855152 []
openshift-console 3638287 []
openshift-infra 8183415 []
openshift-logging 10994998 []
openshift-monitoring 0 []
openshift-node 2169558 []
openshift-sdn 5037844 []
openshift-web-console 6908028 []
test-1 0 []
test-2 9984602 []
test-3 9984602 []
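For the openshift-monitoring case, the repair and its verification can be combined. This is a sketch under the assumption that the NETID shown by `oc get netnamespaces` has already been updated by the time the first command returns; `make_global_and_verify` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: make a project global and confirm that its NETID is now 0.
# Uses only the two oc commands that appear in this post.
make_global_and_verify() {
    project=$1
    oc adm pod-network make-projects-global "$project" || return 1
    # Read the NETID column for the project from the table output.
    netid=$(oc get netnamespaces | awk -v p="$project" '$1 == p { print $2 }')
    [ "$netid" = 0 ]
}
```

Calling `make_global_and_verify openshift-monitoring` returns success only if the project ends up with NETID 0.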
Help output for reference:
[root@master ~]# oc adm pod-network join-projects --help
Join project network
Allows projects to join existing project network when using the redhat/openshift-ovs-multitenant network plugin.
Usage:
oc adm pod-network join-projects [flags]
Examples:
# Allow project p2 to use project p1 network
oc adm pod-network join-projects --to=<p1> <p2>
# Allow all projects with label name=top-secret to use project p1 network
oc adm pod-network join-projects --to=<p1> --selector='name=top-secret'
Options:
--selector='': Label selector to filter projects. Either pass one/more projects as arguments or use this project
selector
--to='': Join network of the given project name
Use "oc adm options" for a list of global command-line options (applies to all commands).
[root@master ~]# oc adm pod-network isolate-projects --help
Isolate project network
Allows projects to isolate their network from other projects when using the redhat/openshift-ovs-multitenant network
plugin.
Usage:
oc adm pod-network isolate-projects [flags]
Examples:
# Provide isolation for project p1
oc adm pod-network isolate-projects <p1>
# Allow all projects with label name=top-secret to have their own isolated project network
oc adm pod-network isolate-projects --selector='name=top-secret'
Options:
--selector='': Label selector to filter projects. Either pass one/more projects as arguments or use this project
selector
Use "oc adm options" for a list of global command-line options (applies to all commands).
[root@master ~]# oc adm pod-network make-projects-global --help
Make project network global
Allows projects to access all pods in the cluster and vice versa when using the redhat/openshift-ovs-multitenant network
plugin.
Usage:
oc adm pod-network make-projects-global [flags]
Examples:
# Allow project p1 to access all pods in the cluster and vice versa
oc adm pod-network make-projects-global <p1>
# Allow all projects with label name=share to access all pods in the cluster and vice versa
oc adm pod-network make-projects-global --selector='name=share'
Options:
--selector='': Label selector to filter projects. Either pass one/more projects as arguments or use this project
selector
Use "oc adm options" for a list of global command-line options (applies to all commands).