1. Introduction
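There are many ways to build a docker cross-host network, including bridging, routing, and using OVS (Open vSwitch). This article uses flannel on two machines to let docker containers communicate across hosts.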
Environment:
Machine 1: 172.21.0.16, hostname: master
Machine 2: 172.21.0.12, hostname: worker
flannel
1. [docker network][flannel] Configuration, installation, and testing
2. [docker network][flannel] What happens behind the scenes
3. [docker network][flannel] A brief source code analysis
2. etcd
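To avoid assigning the same IP range twice, flannel uses etcd to resolve conflicts. Since this is only a test, a single-node etcd runs on master (172.21.0.16). For installation, see the guide on manually installing a single-node etcd.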
etcdctl --endpoints http://172.21.0.16:2379 set /coreos.com/network/config '{"Network": "10.0.0.0/16", "SubnetLen": 24, "SubnetMin": "10.0.1.0","SubnetMax": "10.0.20.0", "Backend": {"Type": "vxlan"}}'
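Network: the flannel address pool. The whole overlay network is 10.0.0.0/16.
SubnetLen: the prefix length of the subnet allocated to docker0 on each host.
SubnetMin: the lowest subnet that can be allocated.
SubnetMax: the highest subnet that can be allocated. In the example above, each host gets one subnet with a 24-bit mask, taken from the range 10.0.1.0/24 through 10.0.20.0/24, so this range supports at most 20 hosts.
Backend: how packets are forwarded. The default is udp; vxlan is used here.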
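The limit of 20 hosts follows from simple arithmetic on the range. As a rough sketch (the helper names ip_to_int and count_subnets are made up for this illustration, and this is not flannel's own code), the count can be checked in POSIX shell:

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into $1..$4 on the dots
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Count the subnets of prefix length $3 from SubnetMin ($1) to SubnetMax ($2), inclusive.
count_subnets() {
    min=$(ip_to_int "$1")
    max=$(ip_to_int "$2")
    size=$(( 1 << (32 - $3) ))   # number of addresses in one subnet
    echo $(( ($max - $min) / $size + 1 ))
}

count_subnets 10.0.1.0 10.0.20.0 24   # prints 20
```

With a SubnetLen of 24 each subnet spans 256 addresses, so the range from 10.0.1.0 to 10.0.20.0 holds 20 of them.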
Run the following to store the configuration in etcd.
[root@master ~]# etcdctl --endpoints http://172.21.0.16:2379 set /coreos.com/network/config '{"Network": "10.0.0.0/16", "SubnetLen": 24, "SubnetMin": "10.0.1.0","SubnetMax": "10.0.20.0", "Backend": {"Type": "vxlan"}}'
{"Network": "10.0.0.0/16", "SubnetLen": 24, "SubnetMin": "10.0.1.0","SubnetMax": "10.0.20.0", "Backend": {"Type": "vxlan"}}
[root@master ~]# etcdctl get /coreos.com/network/config
{"Network": "10.0.0.0/16", "SubnetLen": 24, "SubnetMin": "10.0.1.0","SubnetMax": "10.0.20.0", "Backend": {"Type": "vxlan"}}
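3. Install flannel
3.1 Stop docker
Because docker on this host gets its subnet from flannel instead of using docker's default 172.17.0.1/16, flannel has to start before docker. The master configuration is used as the example; the worker configuration is essentially the same.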
[root@master ~]# systemctl stop docker
3.2 Download flannel
[root@master flannel]# pwd
/root/flannel
[root@master flannel]# wget https://github.com/coreos/flannel/releases/download/v0.11.0/flannel-v0.11.0-linux-amd64.tar.gz
[root@master flannel]# tar -zxvf flannel-v0.11.0-linux-amd64.tar.gz
flanneld
mk-docker-opts.sh
README.md
[root@master flannel]# cp flanneld mk-docker-opts.sh /usr/local/bin/
[root@master flannel]#
3.3 Start flannel
[root@master flannel]# /usr/local/bin/flanneld --etcd-endpoints="http://172.21.0.16:2379"
I1102 16:38:51.015597 20734 main.go:514] Determining IP address of default interface
I1102 16:38:51.015795 20734 main.go:527] Using interface with name eth0 and address 172.21.0.16
I1102 16:38:51.015813 20734 main.go:544] Defaulting external address to interface address (172.21.0.16)
I1102 16:38:51.015887 20734 main.go:244] Created subnet manager: Etcd Local Manager with Previous Subnet: None
I1102 16:38:51.015892 20734 main.go:247] Installing signal handlers
I1102 16:38:51.016953 20734 main.go:386] Found network config - Backend type: vxlan
I1102 16:38:51.016988 20734 vxlan.go:120] VXLAN config: VNI=1 Port=0 GBP=false DirectRouting=false
I1102 16:38:51.060136 20734 local_manager.go:234] Picking subnet in range 10.0.1.0 ... 10.0.20.0
I1102 16:38:51.060882 20734 local_manager.go:220] Allocated lease (10.0.13.0/24) to current node (172.21.0.16)
I1102 16:38:51.061160 20734 main.go:317] Wrote subnet file to /run/flannel/subnet.env
I1102 16:38:51.061169 20734 main.go:321] Running backend.
I1102 16:38:51.061420 20734 vxlan_network.go:60] watching for new subnet leases
I1102 16:38:51.063824 20734 iptables.go:145] Some iptables rules are missing; deleting and recreating rules
I1102 16:38:51.063840 20734 iptables.go:167] Deleting iptables rule: -s 10.0.0.0/16 -j ACCEPT
I1102 16:38:51.063904 20734 main.go:429] Waiting for 22h59m59.996699728s to renew lease
I1102 16:38:51.064971 20734 iptables.go:167] Deleting iptables rule: -d 10.0.0.0/16 -j ACCEPT
I1102 16:38:51.065938 20734 iptables.go:155] Adding iptables rule: -s 10.0.0.0/16 -j ACCEPT
I1102 16:38:51.067710 20734 iptables.go:155] Adding iptables rule: -d 10.0.0.0/16 -j ACCEPT
After startup, check what has changed:
[root@master ~]# etcdctl ls /coreos.com/network/subnets
/coreos.com/network/subnets/10.0.13.0-24
[root@master ~]# etcdctl get /coreos.com/network/subnets/10.0.13.0-24
{"PublicIP":"172.21.0.16","BackendType":"vxlan","BackendData":{"VtepMAC":"aa:52:69:c2:8a:ef"}}
[root@master ~]# ifconfig flannel.1
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.0.13.0 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::a852:69ff:fec2:8aef prefixlen 64 scopeid 0x20<link>
ether aa:52:69:c2:8a:ef txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 8 overruns 0 carrier 0 collisions 0
[root@master ~]#
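1. The host has a new device, flannel.1, with address 10.0.13.0/32, and its MAC address has been stored in etcd.
2. flannel allocated the subnet 10.0.13.0/24 and recorded the lease in etcd, so docker on this host should use that range. docker0 therefore has to be reconfigured. flannel provides a helper for this: it generates the network options to add to docker's startup command.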
[root@master ~]# cat /run/flannel/subnet.env
FLANNEL_NETWORK=10.0.0.0/16
FLANNEL_SUBNET=10.0.13.1/24
FLANNEL_MTU=1450
FLANNEL_IPMASQ=false
[root@master ~]# /root/flannel/mk-docker-opts.sh -c
[root@master ~]# cat /run/docker_opts.env
DOCKER_OPTS=" --bip=10.0.13.1/24 --ip-masq=true --mtu=1450"
[root@master ~]#
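To make the translation from subnet.env to DOCKER_OPTS easier to follow, here is a simplified sketch of the mapping seen above. It is not the real mk-docker-opts.sh; the function name docker_opts_from_flannel is hypothetical, and only the three options shown in the output are handled:

```shell
# Simplified sketch: map flannel's subnet.env values to dockerd options.
docker_opts_from_flannel() {
    subnet=$1   # FLANNEL_SUBNET, e.g. 10.0.13.1/24
    mtu=$2      # FLANNEL_MTU
    ipmasq=$3   # FLANNEL_IPMASQ, true or false
    # Only one side should masquerade: docker does it when flannel does not.
    if [ "$ipmasq" = "true" ]; then
        docker_masq=false
    else
        docker_masq=true
    fi
    echo " --bip=$subnet --ip-masq=$docker_masq --mtu=$mtu"
}

# Values copied from /run/flannel/subnet.env above.
echo "DOCKER_OPTS=\"$(docker_opts_from_flannel 10.0.13.1/24 1450 false)\""
```

This explains why FLANNEL_IPMASQ=false turns into --ip-masq=true: since flannel was not asked to masquerade, docker keeps doing it.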
3.4 Modify the docker unit file
Append --bip=10.0.13.1/24 --ip-masq=true --mtu=1450 to the startup command.
[root@master flannel]# vim /lib/systemd/system/docker.service
...
EnvironmentFile=/run/docker_opts.env
ExecStart=/usr/bin/dockerd $DOCKER_OPTS
...
[root@master flannel]# systemctl daemon-reload
[root@master flannel]# systemctl restart docker
[root@master flannel]#
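The --mtu=1450 value passed to docker is consistent with the overhead of VXLAN encapsulation. A small sketch, assuming the underlying eth0 MTU is the usual 1500 (not shown in the output above) and an IPv4 underlay; the function names are hypothetical:

```shell
# Bytes added by VXLAN over IPv4:
# outer IP header 20 + UDP header 8 + VXLAN header 8 + inner Ethernet header 14.
vxlan_overhead() {
    echo $(( 20 + 8 + 8 + 14 ))
}

# MTU left for the overlay given the MTU of the underlying interface ($1).
vxlan_mtu() {
    echo $(( $1 - $(vxlan_overhead) ))
}

vxlan_mtu 1500   # prints 1450
```

Keeping the overlay MTU below the physical one means encapsulated packets still fit into a single frame on eth0.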
3.5 Check docker0
[root@master flannel]# ifconfig docker0
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 10.0.13.1 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fe80::42:62ff:fe53:ac4b prefixlen 64 scopeid 0x20<link>
ether 02:42:62:53:ac:4b txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 8 bytes 648 (648.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
flannel.1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1450
inet 10.0.13.0 netmask 255.255.255.255 broadcast 0.0.0.0
inet6 fe80::a852:69ff:fec2:8aef prefixlen 64 scopeid 0x20<link>
ether aa:52:69:c2:8a:ef txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 8 overruns 0 carrier 0 collisions 0
...
As shown, docker0 has changed from 172.17.0.1/16 to 10.0.13.1/24.
4. Verification
Both machines are now running flannel, with the following configuration:
[root@master flannel]# etcdctl ls /coreos.com/network/subnets
/coreos.com/network/subnets/10.0.13.0-24
/coreos.com/network/subnets/10.0.10.0-24
[root@master flannel]# etcdctl get /coreos.com/network/subnets/10.0.10.0-24
{"PublicIP":"172.21.0.12","BackendType":"vxlan","BackendData":{"VtepMAC":"5e:44:e9:fd:6a:61"}}
[root@master flannel]#
As shown, the subnet of the worker (172.21.0.12) node is 10.0.10.0/24, and the MAC address of flannel.1 on that machine is 5e:44:e9:fd:6a:61.
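The lease keys listed above follow a simple naming pattern: the subnet in CIDR form with the slash replaced by a dash, under the subnets directory of the flannel prefix. A minimal sketch (subnet_to_key is a made-up helper name; the prefix defaults to /coreos.com/network as used in this article):

```shell
# Build the etcd lease key for a subnet given in CIDR form.
# $1: subnet such as 10.0.13.0/24; $2: optional etcd prefix.
subnet_to_key() {
    prefix=${2:-/coreos.com/network}
    echo "$prefix/subnets/$(echo "$1" | tr '/' '-')"
}

subnet_to_key 10.0.13.0/24   # prints /coreos.com/network/subnets/10.0.13.0-24
subnet_to_key 10.0.10.0/24   # prints /coreos.com/network/subnets/10.0.10.0-24
```

The resulting key can be passed to etcdctl get, as done above, to look up the PublicIP and VtepMAC of the host that owns a subnet.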
4.1 Start containers on the master (172.21.0.16) and worker (172.21.0.12) nodes
// master(172.21.0.16)
[root@master flannel]# docker run -d --name con1 busybox top
[root@master flannel]# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
b8242a6be998 busybox "top" 25 seconds ago Up 24 seconds con1
[root@master flannel]# docker exec -it con1 sh
/ # ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:0A:00:0D:02
inet addr:10.0.13.2 Bcast:0.0.0.0 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:648 (648.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
/ # route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.13.1 0.0.0.0 UG 0 0 0 eth0
10.0.13.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
// worker(172.21.0.12)
[root@worker flannel]# docker run -d --name con1 busybox top
[root@worker flannel]# docker exec -it con1 sh
/ # ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:0A:00:0A:02
inet addr:10.0.10.2 Bcast:0.0.0.0 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:648 (648.0 B) TX bytes:0 (0.0 B)
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
UP LOOPBACK RUNNING MTU:65536 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
/ # route -n
Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 10.0.10.1 0.0.0.0 UG 0 0 0 eth0
10.0.10.0 0.0.0.0 255.255.255.0 U 0 0 0 eth0
/ #
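One detail visible in both ifconfig outputs: the container MAC address looks like 02:42 followed by the four bytes of the container IP in hex. The sketch below reproduces that pattern. It is an observation about this setup rather than a guarantee for every docker version, and ip_to_docker_mac is a made-up helper name:

```shell
# Derive the MAC address pattern seen above from a container IPv4 address.
ip_to_docker_mac() {
    old_ifs=$IFS
    IFS=.
    set -- $1            # split the address into $1..$4 on the dots
    IFS=$old_ifs
    printf '02:42:%02X:%02X:%02X:%02X\n' "$1" "$2" "$3" "$4"
}

ip_to_docker_mac 10.0.13.2   # prints 02:42:0A:00:0D:02
ip_to_docker_mac 10.0.10.2   # prints 02:42:0A:00:0A:02
```

Because the two hosts use different subnets, containers on different hosts get different addresses and therefore different MACs under this pattern.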
4.2 Verify mutual access
example-1.png
From container con1 on master, access in turn container con1, docker0, flannel.1, and the host on worker.
[root@master flannel]# docker exec -it con1 sh
===> Access container con1 on worker
/ # ping -c 1 10.0.10.2
PING 10.0.10.2 (10.0.10.2): 56 data bytes
64 bytes from 10.0.10.2: seq=0 ttl=62 time=0.559 ms
--- 10.0.10.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.559/0.559/0.559 ms
===> Access docker0 on worker
/ # ping -c 1 10.0.10.1
PING 10.0.10.1 (10.0.10.1): 56 data bytes
64 bytes from 10.0.10.1: seq=0 ttl=63 time=0.454 ms
--- 10.0.10.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.454/0.454/0.454 ms
===> Access flannel.1 on worker
/ # ping -c 1 10.0.10.0
PING 10.0.10.0 (10.0.10.0): 56 data bytes
64 bytes from 10.0.10.0: seq=0 ttl=63 time=0.475 ms
--- 10.0.10.0 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.475/0.475/0.475 ms
===> Access the worker host
/ # ping -c 1 172.21.0.12
PING 172.21.0.12 (172.21.0.12): 56 data bytes
64 bytes from 172.21.0.12: seq=0 ttl=63 time=0.384 ms
--- 172.21.0.12 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.384/0.384/0.384 ms
From container con1 on worker, access in turn container con1, docker0, flannel.1, and the host on master.
[root@worker flannel]# docker exec -it con1 sh
===> Access container con1 on master
/ # ping -c 1 10.0.13.2
PING 10.0.13.2 (10.0.13.2): 56 data bytes
64 bytes from 10.0.13.2: seq=0 ttl=62 time=0.522 ms
--- 10.0.13.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.522/0.522/0.522 ms
===> Access docker0 on master
/ # ping -c 1 10.0.13.1
PING 10.0.13.1 (10.0.13.1): 56 data bytes
64 bytes from 10.0.13.1: seq=0 ttl=63 time=0.376 ms
--- 10.0.13.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.376/0.376/0.376 ms
===> Access flannel.1 on master
/ # ping -c 1 10.0.13.0
PING 10.0.13.0 (10.0.13.0): 56 data bytes
64 bytes from 10.0.13.0: seq=0 ttl=63 time=0.447 ms
--- 10.0.13.0 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.447/0.447/0.447 ms
===> Access the master host
/ # ping -c 1 172.21.0.16
PING 172.21.0.16 (172.21.0.16): 56 data bytes
64 bytes from 172.21.0.16: seq=0 ttl=63 time=0.403 ms
--- 172.21.0.16 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.403/0.403/0.403 ms
As shown, the two containers can now reach each other.
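The ttl values in the replies hint at the path each packet took. Assuming the replying side starts from the common Linux default TTL of 64 (an assumption, not something the output shows), every routing hop lowers it by one. The helper name hops_from_ttl is hypothetical:

```shell
# Estimate how many routing hops a reply crossed.
# $1: ttl seen in the ping reply; $2: assumed initial TTL (default 64).
hops_from_ttl() {
    initial=${2:-64}
    echo $(( $initial - $1 ))
}

hops_from_ttl 62   # prints 2: reply from the remote container
hops_from_ttl 63   # prints 1: reply from the remote docker0, flannel.1, or host
```

Under that assumption, a reply from the remote container is forwarded by both hosts (ttl=62), while a reply from an address owned by the remote host itself is forwarded only by the local host (ttl=63).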
4.3 Start another container on master
Start another container on master to verify access within the same machine.
[root@master flannel]# docker run -d --name con2 busybox top
fb4c2e01f937489e836ae59a513ea5afdd06bd76d101d4543474ddf337a7902f
[root@master flannel]#
[root@master flannel]# docker exec -it con2 sh
/ # ifconfig
eth0 Link encap:Ethernet HWaddr 02:42:0A:00:0D:03
inet addr:10.0.13.3 Bcast:0.0.0.0 Mask:255.255.255.0
UP BROADCAST RUNNING MULTICAST MTU:1450 Metric:1
RX packets:8 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:648 (648.0 B) TX bytes:0 (0.0 B)
...
===> Access container con1 on the same machine
/ # ping -c 1 10.0.13.2
PING 10.0.13.2 (10.0.13.2): 56 data bytes
64 bytes from 10.0.13.2: seq=0 ttl=64 time=0.097 ms
--- 10.0.13.2 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.097/0.097/0.097 ms
===> Access docker0
/ # ping -c 1 10.0.13.1
PING 10.0.13.1 (10.0.13.1): 56 data bytes
64 bytes from 10.0.13.1: seq=0 ttl=64 time=0.077 ms
--- 10.0.13.1 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.077/0.077/0.077 ms
===> Access the local host
/ # ping -c 1 172.21.0.16
PING 172.21.0.16 (172.21.0.16): 56 data bytes
64 bytes from 172.21.0.16: seq=0 ttl=64 time=0.084 ms
--- 172.21.0.16 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 0.084/0.084/0.084 ms
===> Access the external network
/ # ping -c 1 www.baidu.com
PING www.baidu.com (220.181.38.150): 56 data bytes
64 bytes from 220.181.38.150: seq=0 ttl=249 time=5.879 ms
--- www.baidu.com ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max = 5.879/5.879/5.879 ms
/ #
example-2.png