Eight Common Docker Failures
https://mp.weixin.qq.com/s/2GNKmRJtBGHhUyVBRbRgeA
Error 1: error initializing graphdriver
Docker fails to start.
The system is CentOS 7.2.
The kernel and Docker versions are as follows:
[root@docker ~]# uname -r
3.10.0-327.el7.x86_64
[root@docker ~]# docker version
Client:
Version: 18.04.0-ce
API version: 1.37
Go version: go1.9.4
Git commit: 3d479c0
Built: Tue Apr 10 18:21:36 2018
OS/Arch: linux/amd64
Experimental: false
Orchestrator: swarm
Server:
Engine:
Version: 18.04.0-ce
API version: 1.37 (minimum version 1.12)
Go version: go1.9.4
Git commit: 3d479c0
Built: Tue Apr 10 18:25:25 2018
OS/Arch: linux/amd64
Experimental: false
The startup error is as follows:
[root@docker ~]# systemctl start docker
Job for docker.service failed because the control process exited with error code. See "systemctl status docker.service" and "journ
[root@docker ~]# systemctl status docker
● docker.service - Docker Application Container Engine
Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
Active: failed (Result: start-limit) since 日 2018-04-22 20:52:39 CST; 5s ago
Docs: https://docs.docker.com
Process: 4810 ExecStart=/usr/bin/dockerd (code=exited, status=1/FAILURE)
Main PID: 4810 (code=exited, status=1/FAILURE)
4月 22 20:52:39 docker.cgy.com systemd[1]: Failed to start Docker Application Container Engine.
4月 22 20:52:39 docker.cgy.com systemd[1]: Unit docker.service entered failed state.
4月 22 20:52:39 docker.cgy.com systemd[1]: docker.service failed.
4月 22 20:52:39 docker.cgy.com systemd[1]: docker.service holdoff time over, scheduling restart.
4月 22 20:52:39 docker.cgy.com systemd[1]: start request repeated too quickly for docker.service
4月 22 20:52:39 docker.cgy.com systemd[1]: Failed to start Docker Application Container Engine.
4月 22 20:52:39 docker.cgy.com systemd[1]: Unit docker.service entered failed state.
4月 22 20:52:39 docker.cgy.com systemd[1]: docker.service failed.
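The messages above do not reveal the actual cause of the failure. I then started the daemon directly with dockerd, and the last line of its output showed the real error: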
[root@docker ~]# dockerd
INFO[2018-04-22T21:12:46.111704443+08:00] libcontainerd: started new docker-containerd process pid=5903
INFO[0000] starting containerd module=containerd revision=773c489c9c1b21a6d78b5c538cd395416ec50f88 version=v1.0.3
...... (part of the output omitted) ......
INFO[0000] loading plugin "io.containerd.grpc.v1.introspection"... module=containerd type=io.containerd.grpc.v1
INFO[0000] serving... address="/var/run/docker/containerd/docker-containerd-debug.sock" module="containerd/debug"
INFO[0000] serving... address="/var/run/docker/containerd/docker-containerd.sock" module="containerd/grpc"
INFO[0000] containerd successfully booted in 0.002763s module=containerd
Error starting daemon: error initializing graphdriver: overlay: the backing xfs filesystem is formatted without d_type support, which leads to incorrect behavior. Reformat the filesystem with ftype=1 to enable d_type support. Backing filesystems without d_type support are not supported.
Searching for the final "Error starting daemon:" message led to the following blog post, which solved the problem.
https://blog.csdn.net/liu9718214/article/details/79134900
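Before changing the storage driver, it can help to confirm that the XFS filesystem really was formatted with ftype=0, as the error message claims. The following is a minimal sketch, not part of the original fix: the helper name xfs_ftype is made up for illustration, and it only parses the text that xfs_info prints.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read xfs_info output on stdin and print the ftype
# value (0 or 1) from the first line that mentions it.
xfs_ftype() {
  sed -n 's/.*ftype=\([01]\).*/\1/p' | head -n 1
}

# Assumed usage on the affected host (replace the placeholder with the mount
# point of the filesystem that holds /var/lib/docker):
#   xfs_info <mount-point> | xfs_ftype
# A result of 0 matches the "formatted without d_type support" error.
```

If the value is 0, the overlay driver cannot be used on that filesystem without reformatting it, which is why switching to another storage driver works as a fix.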
The fix is as follows.
vim /etc/sysconfig/docker
Add the following line:
OPTIONS="--selinux-enabled --log-driver=journald --signature-verification=false"
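/etc/docker/daemon.json
Add the following. The registry-mirrors entry only speeds up image pulls and is unrelated to this problem; the storage-driver entry is what fixes it. JSON does not allow comments, so these notes must stay out of the file itself:
{
"registry-mirrors": ["http://4a1df5ef.m.daocloud.io"],
"storage-driver": "devicemapper"
}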
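One way to create the file without hand-editing is a heredoc, which also keeps stray "#" notes out of the JSON. This is only a sketch: the function name write_daemon_json is hypothetical, and it overwrites whatever file already exists at the target path, so back up an existing daemon.json first.

```shell
#!/usr/bin/env bash
# Hypothetical helper: write a comment-free daemon.json to the given path
# (default: /etc/docker/daemon.json). Overwrites any existing file.
write_daemon_json() {
  target="${1:-/etc/docker/daemon.json}"
  mkdir -p "$(dirname "$target")"
  cat > "$target" <<'EOF'
{
  "registry-mirrors": ["http://4a1df5ef.m.daocloud.io"],
  "storage-driver": "devicemapper"
}
EOF
}

# Assumed usage as root on the affected host:
#   write_daemon_json
#   systemctl restart docker
```

After the restart, docker info --format '{{.Driver}}' can be used to confirm which storage driver is active.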
Then restart Docker; the problem is resolved:
[root@docker ~]# systemctl restart docker
[root@docker ~]# ps aux | grep docker
root 5922 1.7 1.6 528432 62568 ? Ssl 21:15 0:00 /usr/bin/dockerd
root 5927 1.1 0.5 356984 22100 ? Ssl 21:15 0:00 docker-containerd --config /var/run/docker/containerd/containerd.toml
root 6028 0.0 0.0 112664 964 pts/0 S+ 21:15 0:00 grep --color=auto docker
Error 2: iptables failed
FirewallD
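CentOS 7 introduced firewalld. firewalld is built on top of iptables and uses it for packet filtering, which can conflict with Docker.
When firewalld starts or restarts, it removes the DOCKER rules from iptables, which breaks Docker.
Under systemd, firewalld starts before Docker. If you start or restart firewalld after Docker is already running, however, you need to restart the Docker daemon.
System: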
[root@controller ~]# cat /etc/redhat-release
CentOS Linux release 7.0.1406 (Core)
The error is as follows:
[root@controller ~]# docker run -it -P docker.io/nginx
/usr/bin/docker-current: Error response from daemon: driver failed programming external connectivity on endpoint gloomy_kirch (10289e7a87e65771da90cda531951b7339bee9cb5953474460451cd48013aff0): iptables failed: iptables --wait -t nat -A DOCKER -p tcp -d 0/0 --dport 32810 -j DNAT --to-destination 172.17.0.2:80 ! -i docker0: iptables: No chain/target/match by that name.
(exit status 1).
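This happened because a container had already been started successfully before this run. At that time Nginx could not be reached because of the firewall, so the iptables filter table was flushed and iptables was restarted. Restarting iptables also removed the DOCKER chain that the error message refers to (-t nat -A DOCKER), so the next docker run failed with the error above.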
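To confirm this diagnosis before restarting anything, one can check whether the DOCKER chain still exists in the nat table. The sketch below is an illustration only: the helper name has_docker_chain is made up, and it simply looks for the "-N DOCKER" line that iptables -S prints for a user-defined chain.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the rule listing on stdin (in the format
# printed by "iptables -S") declares a user-defined chain named DOCKER.
has_docker_chain() {
  grep -q -x -e '-N DOCKER'
}

# Assumed usage as root on the affected host:
#   iptables -t nat -S | has_docker_chain && echo "DOCKER chain present" || echo "DOCKER chain missing"
```

If the chain is missing, Docker has nothing to append its DNAT rule to, which matches the "No chain/target/match by that name" message.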
Solution
Restart the firewall:
# On CentOS 7
[root@controller ~]# systemctl restart firewalld
Then restart the Docker daemon:
[root@controller ~]# systemctl restart docker
Running an nginx container again no longer produces the error:
[root@controller ~]# docker run -it --name nginx -p 80:80 -v /www:/wwwroot docker.io/nginx /bin/bash
root@a8a92c8f7760:/#
Error 3: Unable to take ownership of thin-pool
The Docker daemon fails to start: Unable to take ownership of thin-pool
Apr 27 13:51:59 master systemd: Started Docker Storage Setup.
Apr 27 13:51:59 master systemd: Starting Docker Application Container Engine...
Apr 27 13:51:59 master dockerd-current: time="2018-04-27T13:51:59.088441356+08:00" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
Apr 27 13:51:59 master dockerd-current: time="2018-04-27T13:51:59.091166189+08:00" level=info msg="libcontainerd: new containerd process, pid: 20930"
Apr 27 13:52:00 master dockerd-current: Error starting daemon: error initializing graphdriver: devmapper: Unable to take ownership of thin-pool (docker--vg-docker--pool) that already has used data blocks
Apr 27 13:52:00 master systemd: docker.service: main process exited, code=exited, status=1/FAILURE
Apr 27 13:52:00 master systemd: Failed to start Docker Application Container Engine.
Apr 27 13:52:00 master systemd: Unit docker.service entered failed state.
Apr 27 13:52:00 master systemd: docker.service failed
Cause: the metadata under /var/lib/docker/devicemapper/metadata/ has been lost.
Workaround:
https://bugzilla.redhat.com/show_bug.cgi?id=1321640#c5
Eric Paris 2016-04-27 08:20:10 EDT
I feel like the kcs kinda misses telling users the actual problem. Nor does it really make it clear the solution.
IF you are using device mapper (instead of loopback) /var/lib/docker contains metadata informing docker about the contents of the device mapper storage area. If you delete /var/lib/docker that metadata is lost. Docker is then able to detect that the thin pool has data but docker is unable to make use of that information. The only solution is to delete the thin pool and recreate it so that both the thin pool and the metadata in /var/lib/docker will be empty.
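Solution (this deletes the thin pool and everything under /var/lib/docker, so all existing images and containers are lost):
- Run:
rm -rf /var/lib/docker/*
- Run:
rm -rf /etc/sysconfig/docker-storage
- Run:
lvremove /dev/docker-vg/docker-pool
- Reuse the existing docker-vg LVM volume group:
cat <<EOF > /etc/sysconfig/docker-storage-setup
VG=docker-vg
EOF
- Run:
docker-storage-setup
- Then start Docker:
systemctl start docker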
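Because these steps are destructive, it may be safer to preview them before running anything. The following is a minimal sketch under stated assumptions: the function names run and reset_thin_pool and the DRY_RUN variable are made up for illustration, and the commands are exactly the ones listed above. By default it only prints the steps; it executes them only when DRY_RUN=0 is set explicitly.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: print the command by default, execute it only when
# DRY_RUN=0 is set explicitly.
run() {
  if [ "${DRY_RUN:-1}" = "0" ]; then
    sh -c "$1"
  else
    printf '+ %s\n' "$1"
  fi
}

# The steps from the list above, in the same order; stops at the first failure.
reset_thin_pool() {
  run 'rm -rf /var/lib/docker/*' &&
  run 'rm -rf /etc/sysconfig/docker-storage' &&
  run 'lvremove /dev/docker-vg/docker-pool' &&
  run 'echo "VG=docker-vg" > /etc/sysconfig/docker-storage-setup' &&
  run 'docker-storage-setup' &&
  run 'systemctl start docker'
}

# Assumed usage as root:
#   reset_thin_pool             # preview only
#   DRY_RUN=0 reset_thin_pool   # actually run the steps
```

The volume group and pool names are the ones from the log above (docker-vg, docker-pool); on another host they would need to be adjusted.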
Error 4: write /sys/fs/cgroup/cpuset/docker/cpuset.cpus: invalid argument
docker run fails with the following error when starting a container:
[root@backup-system cpu]# docker run -ti --name hkp_ubuntu --cpuset-cpus=0-3 ubuntu bash
docker: Error response from daemon: OCI runtime create failed: container_linux.go:370: starting container process caused: process_linux.go:326: applying cgroup configuration for process caused: failed to write "0-3\n" to "/sys/fs/cgroup/cpuset/docker/cpuset.cpus": write /sys/fs/cgroup/cpuset/docker/cpuset.cpus: invalid argument: unknown.
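This error occurs because the CPUs requested for this cgroup are already in use by another cgroup, so they cannot be claimed exclusively.
You therefore need to check and adjust cpuset.cpus in each cgroup first, making sure the CPUs the current cgroup needs are assigned only to it; only then can cpu_exclusive be set.
In this case, the specific cause was an earlier experiment that created a container directory under /sys/fs/cgroup/cpuset/ and set container/cpuset.cpus to 0-3: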
[root@backup-system docker]# cat /sys/fs/cgroup/cpuset/container/cpuset.cpus
0-3
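To find which cgroup is holding the CPUs, it can help to list cpuset.cpus and cpuset.cpu_exclusive for every cgroup in the hierarchy. This is a minimal sketch assuming the cgroup v1 layout shown in the paths above; the function name list_cpusets is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: for every cgroup under the given cpuset root, print
#   <cgroup dir>: cpus=<cpuset.cpus> exclusive=<cpuset.cpu_exclusive>
# The root defaults to the cgroup v1 cpuset mount point.
list_cpusets() {
  root="${1:-/sys/fs/cgroup/cpuset}"
  find "$root" -name cpuset.cpus | sort | while read -r f; do
    dir=$(dirname "$f")
    cpus=$(cat "$f")
    # Print "?" if the cpu_exclusive file cannot be read.
    excl=$(cat "$dir/cpuset.cpu_exclusive" 2>/dev/null || echo "?")
    printf '%s: cpus=%s exclusive=%s\n' "$dir" "$cpus" "$excl"
  done
}

# Assumed usage on the affected host:
#   list_cpusets
```

Any sibling of the docker cgroup that lists the requested CPUs with exclusive=1 is a candidate for the conflict.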
Solution:
After /sys/fs/cgroup/cpuset/container/cpuset.cpus was cleared (set to empty), the problem was resolved.
For the underlying cause, see this blog post: https://www.lenky.info/archives/2019/03/2679