Kubernetes cluster installation notes

Preparation

Two CentOS 7 hosts

192.168.209.102  node102
192.168.209.103  node103  (master)
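
kubeadm's preflight checks may warn when a node's hostname cannot be resolved, so it can help to map both hostnames in /etc/hosts on every node. The following is a minimal sketch using the two addresses listed above; the helper names hosts_entries and add_missing_hosts are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print the /etc/hosts lines for the two nodes in this note.
hosts_entries() {
  printf '%s %s\n' \
    192.168.209.102 node102 \
    192.168.209.103 node103
}

# Append only the lines that are not already present in the given file,
# so running it twice does not create duplicates.
add_missing_hosts() {
  local file="$1" line
  hosts_entries | while IFS= read -r line; do
    grep -qxF "$line" "$file" || printf '%s\n' "$line" >> "$file"
  done
}
```

As root, add_missing_hosts /etc/hosts would apply it to the real file; pointing it at a scratch copy first is a safe way to preview the result.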

Setting the hostname

# Set the hostname
hostnamectl set-hostname node102     # run on 192.168.209.102
hostnamectl set-hostname node103     # run on 192.168.209.103
hostnamectl --static                 # show the result

Software versions

docker 18.09.0
kubelet-1.15.4 kubeadm-1.15.4 kubectl-1.15.4 
helm 2.14.0

Unless stated otherwise, run every step on all nodes (both the master and the worker node)

Environment

Disable the firewall

systemctl stop firewalld.service
systemctl disable firewalld.service

If you would rather keep the firewall enabled, see the list of ports Kubernetes needs open: https://kubernetes.io/docs/setup/independent/install-kubeadm/#check-required-ports
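
For readers who keep firewalld running, the sketch below prints the firewall-cmd calls for each node role. The TCP ports follow the kubeadm documentation for this Kubernetes generation; 8472/udp is an assumption that flannel runs with its default VXLAN backend. The helper names required_ports and firewall_commands are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: list the ports to open for a node role.
# 8472/udp assumes flannel's default VXLAN backend.
required_ports() {
  case "$1" in
    master) echo "6443/tcp 2379-2380/tcp 10250/tcp 10251/tcp 10252/tcp 8472/udp" ;;
    worker) echo "10250/tcp 30000-32767/tcp 8472/udp" ;;
    *) echo "usage: required_ports master|worker" >&2; return 1 ;;
  esac
}

# Print (not execute) the firewall-cmd invocations for a role.
firewall_commands() {
  local port ports
  ports=$(required_ports "$1") || return 1
  for port in $ports; do
    echo "firewall-cmd --permanent --add-port=$port"
  done
  echo "firewall-cmd --reload"
}
```

Because the function only prints commands, the output can be reviewed first and then piped to sh as root, for example firewall_commands master | sh on node103.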

Disable swap

Since Kubernetes 1.8, the kubelet by default refuses to start while swap is enabled

Turn off the swap partition

swapoff -a && sed -i '/ swap / s/^/#/' /etc/fstab

The sed command comments out the swap entry in /etc/fstab, for example the line /dev/mapper/centos-swap swap swap defaults, so that swap stays off after a reboot
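
To see what the sed expression does before editing /etc/fstab in place, it can be run without -i. The sketch below also checks a /proc/swaps-style file for active swap devices. Both helper names are hypothetical. Note that the pattern / swap / expects spaces around the word swap, so an fstab that separates fields with tabs would not match.

```shell
#!/bin/bash
# Apply the same sed expression to a file and print the result.
# Without -i the file itself is left untouched.
preview_swap_comment() {
  sed '/ swap / s/^/#/' "$1"
}

# Succeed when a /proc/swaps-style file lists no active swap device,
# i.e. at most the header line is present.
swap_is_off() {
  awk 'END { exit (NR > 1) }' "$1"
}
```

Typical use would be preview_swap_comment /etc/fstab to inspect the change, and swap_is_off /proc/swaps after running swapoff -a.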

Disable SELinux

Set SELinux to permissive mode (which effectively disables it)

setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
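
The sed command above only rewrites a line that reads exactly SELINUX=enforcing, so it is worth confirming what the config file says afterwards. A small sketch with a hypothetical helper name:

```shell
#!/bin/bash
# Print the persistent SELinux mode from a config file.
# The pattern requires "=" right after SELINUX, so SELINUXTYPE= is ignored.
selinux_config_mode() {
  sed -n 's/^SELINUX=//p' "$1"
}
```

selinux_config_mode /etc/selinux/config shows the mode that applies after a reboot, while getenforce shows the mode currently in effect.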

Install Docker

Configure the Docker yum repository

curl -o /etc/yum.repos.d/Docker-ce-Ali.repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo


yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum makecache fast

List the available Docker versions
yum list docker-ce --showduplicates | sort -r


Install a specific Docker version
yum install docker-ce-18.09.0-3.el7 -y

Start Docker once the installation has finished

systemctl start docker

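If you are not starting from a fresh environment, it is best to clean Docker up first

docker ps   # should be empty
docker ps -a   # should be empty
docker network ls   # only the three default networks
docker volume ls   # should be empty
docker service ls   # on a swarm manager node only: should be empty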

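Adjust the iptables-related kernel settings

Some users on CentOS 7 have reported traffic being routed incorrectly because iptables was bypassed. Create /etc/sysctl.d/k8s.conf with the following content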

cat <<EOF >  /etc/sysctl.d/k8s.conf
vm.swappiness = 0
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

Apply the settings

# Load br_netfilter first, then apply the settings
modprobe br_netfilter
sysctl -p /etc/sysctl.d/k8s.conf
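
The net.bridge.* keys only exist once br_netfilter is loaded, which is why modprobe comes first. As a quick check that the settings took effect, the sketch below reads the three keys back from /proc/sys. The helper names are hypothetical, and the optional root argument exists only so the check can be tried against a scratch directory.

```shell
#!/bin/bash
# Read one sysctl key from a /proc/sys-style tree (default: /proc/sys).
# Dots in the key name map to directory separators.
sysctl_value() {
  local key="$1" root="${2:-/proc/sys}"
  cat "$root/$(printf '%s' "$key" | tr . /)"
}

# Return non-zero and name the key if any required setting is not 1.
check_k8s_sysctls() {
  local root="${1:-/proc/sys}" key rc=0
  for key in net.bridge.bridge-nf-call-ip6tables \
             net.bridge.bridge-nf-call-iptables \
             net.ipv4.ip_forward; do
    if [ "$(sysctl_value "$key" "$root" 2>/dev/null)" != "1" ]; then
      echo "not set: $key" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_k8s_sysctls with no argument inspects the live system; a missing net.bridge key usually means br_netfilter is not loaded.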

Load the IPVS kernel modules

cat > /etc/sysconfig/modules/ipvs.modules <<EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack_ipv4
EOF

# This is one long command: make the script executable, run it, and confirm the modules are loaded
chmod 755 /etc/sysconfig/modules/ipvs.modules && bash /etc/sysconfig/modules/ipvs.modules && lsmod | grep -e ip_vs -e nf_conntrack_ipv4
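
The module list above matches the stock CentOS 7 kernel (3.10). On kernels 4.19 and later, nf_conntrack_ipv4 was merged into nf_conntrack, so the old name no longer loads there. The sketch below picks the module name from the kernel release string; the function name is hypothetical and only the major and minor numbers are compared.

```shell
#!/bin/bash
# Print the conntrack module name for a kernel release string
# (defaults to the running kernel).
conntrack_module() {
  local release="${1:-$(uname -r)}" major rest minor
  major=${release%%.*}
  rest=${release#*.}
  minor=${rest%%.*}
  minor=${minor%%[!0-9]*}   # drop suffixes such as "-rc1"
  if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 19 ]; }; then
    echo nf_conntrack
  else
    echo nf_conntrack_ipv4
  fi
}
```

With this, modprobe -- "$(conntrack_module)" could stand in for the hard-coded module name if the hosts are later moved to a newer kernel.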

Getting started

Deploying Kubernetes with kubeadm

Install kubeadm and kubelet. Note: when running yum install, check which Kubernetes version gets installed; it is needed later for kubeadm init

cat <<EOF > /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kube*
EOF


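# Install. Note: check the version number here, because the version passed to kubeadm init must not be lower than the installed Kubernetes version
# Option 1: install the latest available packages
yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Option 2: pin a specific version using the form kubelet-<version> (this note uses 1.15.4)
yum install -y kubelet-1.15.4 kubeadm-1.15.4 kubectl-1.15.4 --disableexcludes=kubernetes

# Enable and start kubelet
systemctl enable kubelet.service && systemctl start kubelet.service

# Check the kubelet status
[root@node103 ~]#  systemctl status kubelet.service
# Show the error log
[root@node103 ~]#  journalctl -xefu kubelet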

After starting kubelet.service, its status shows that it is not running. The log shows that the file /var/lib/kubelet/config.yaml does not exist. This can be ignored for now, because kubeadm init creates the file

Master node

kubeadm init \
--apiserver-advertise-address=192.168.209.103 \
--image-repository registry.aliyuncs.com/google_containers \
--kubernetes-version v1.15.4 \
--pod-network-cidr=10.244.0.0/16 \
--token-ttl 0

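--apiserver-advertise-address is the IP of the master node
--image-repository points at the Aliyun mirror, so that images which cannot be pulled from the default registry are still available
--kubernetes-version disables version detection. Its default is stable-1, which fetches the latest version number from https://storage.googleapis.com/kubernetes-release/release/stable-1.txt; giving an explicit version skips that network request. To stress it again: it must match the installed Kubernetes version
--pod-network-cidr is the Pod network range; 10.244.0.0/16 is the default network in flannel's kube-flannel.yml
--token-ttl 0 makes the bootstrap token never expire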
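
Since a failed image pull is the most common reason for kubeadm init to stall behind a restricted network, it can help to pull the control-plane images first with kubeadm config images pull. The sketch below wraps both steps; the function names run and init_master and the DRY_RUN switch are illustrative, not part of kubeadm.

```shell
#!/bin/bash
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: pre-pull the images, then initialize the master
# with the same flags used in this note.
init_master() {
  local ip="$1" version="$2"
  local repo="registry.aliyuncs.com/google_containers"
  # Pull first so that a mirror problem shows up before init starts.
  run kubeadm config images pull \
    --image-repository "$repo" --kubernetes-version "$version" || return 1
  run kubeadm init \
    --apiserver-advertise-address="$ip" \
    --image-repository "$repo" \
    --kubernetes-version "$version" \
    --pod-network-cidr=10.244.0.0/16 \
    --token-ttl 0
}
```

DRY_RUN=1 init_master 192.168.209.103 v1.15.4 prints the two commands without touching the machine; dropping DRY_RUN runs them for real.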

[init] Using Kubernetes version: v1.15.4
[preflight] Running pre-flight checks
......
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.209.103:6443 --token 9yh5wi.15p63wyw19kkzxsl \
    --discovery-token-ca-cert-hash sha256:e6ddd0a8419514ab5e98bceea40b347aaf8b2044db9fceade811da7bec51c362

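After a successful init, the output says that some configuration is needed before the cluster can be used and shows how to do it. It also prints a bootstrap token and the command for adding nodes

# A regular user needs to run the following before using k8s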
  
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config


# As root you can simply run
export KUBECONFIG=/etc/kubernetes/admin.conf

# Either of the two options above is enough. I am working as root here, so I run
export KUBECONFIG=/etc/kubernetes/admin.conf

Checking kubelet again, it is now in the running state, so it started successfully

[root@node103 ~]#  systemctl status kubelet.service
● kubelet.service - kubelet: The Kubernetes Node Agent
   Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
  Drop-In: /usr/lib/systemd/system/kubelet.service.d
           └─10-kubeadm.conf
   Active: active (running) since Sat 2019-10-12 14:24:49 CST; 47min ago
     Docs: https://kubernetes.io/docs/
 Main PID: 2092 (kubelet)
   CGroup: /system.slice/kubelet.service
           └─2092 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf -...

Oct 12 14:25:11 node103 kubelet[2092]: E1012 14:25:11.090966    2092 kuberuntime_manager.go:692] createPodSandbox for pod "coredns-bccdc95cf...
Oct 12 14:25:11 node103 kubelet[2092]: E1012 14:25:11.091062    2092 pod_workers.go:190] Error syncing pod a6dd2502-9bcf-426e-a5a8-5...a8-53caf
Oct 12 14:25:11 node103 kubelet[2092]: W1012 14:25:11.522357    2092 docker_sandbox.go:384] failed to read pod IP from plugin/docker: Networ...
Oct 12 14:25:11 node103 kubelet[2092]: W1012 14:25:11.536992    2092 pod_container_deletor.go:75] Container "40bb67673e4888b1b69b9f4...ntainers
Oct 12 14:25:11 node103 kubelet[2092]: W1012 14:25:11.547593    2092 cni.go:309] CNI failed to retrieve network namespace path: cann...25eb6a8"
Oct 12 14:25:11 node103 kubelet[2092]: W1012 14:25:11.548785    2092 docker_sandbox.go:384] failed to read pod IP from plugin/docker: Networ...
Oct 12 14:25:11 node103 kubelet[2092]: W1012 14:25:11.562485    2092 pod_container_deletor.go:75] Container "e9e61a73a517720c96508bd...ntainers
Oct 12 14:25:11 node103 kubelet[2092]: W1012 14:25:11.566360    2092 cni.go:309] CNI failed to retrieve network namespace path: cann...bb38bc4"
Oct 12 14:25:12 node103 kubelet[2092]: W1012 14:25:12.806694    2092 pod_container_deletor.go:75] Container "cfb51541812189c587dbdf8...ntainers
Oct 12 14:25:12 node103 kubelet[2092]: W1012 14:25:12.843299    2092 pod_container_deletor.go:75] Container "27626aad865161213e9e2e1...ntainers
Hint: Some lines were ellipsized, use -l to show in full.

Check the status

Confirm that every component is Healthy

[root@node103 ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
controller-manager   Healthy   ok                  
scheduler            Healthy   ok                  
etcd-0               Healthy   {"health":"true"}

Check the node status

[root@node103 ~]# kubectl get node
NAME      STATUS    ROLES    AGE     VERSION
node103   NotReady  master   18h     v1.15.4
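
NotReady is expected at this point, because no Pod network has been installed yet. When there are more nodes, a small filter over the kubectl get node output makes the ones that need attention stand out. A sketch with a hypothetical helper name:

```shell
#!/bin/bash
# Read `kubectl get node` output on stdin and print the names of nodes
# whose STATUS is not exactly "Ready". A cordoned node
# ("Ready,SchedulingDisabled") is therefore listed as well.
not_ready_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}
```

Usage would be kubectl get node | not_ready_nodes; empty output means every node reports Ready.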

Install a Pod network (flannel)

A k8s cluster needs a Pod network to work; without one, pods cannot talk to each other. k8s supports several options, and flannel is used here

[root@node103 ~]# kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml

podsecuritypolicy.extensions/psp.flannel.unprivileged created
clusterrole.rbac.authorization.k8s.io/flannel created
clusterrolebinding.rbac.authorization.k8s.io/flannel created
serviceaccount/flannel created
configmap/kube-flannel-cfg created
daemonset.extensions/kube-flannel-ds-amd64 created
daemonset.extensions/kube-flannel-ds-arm64 created
daemonset.extensions/kube-flannel-ds-arm created
daemonset.extensions/kube-flannel-ds-ppc64le created
daemonset.extensions/kube-flannel-ds-s390x created

Check the Pod status

[root@node103 ~]# kubectl get pod --all-namespaces -o wide
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE   IP                NODE      NOMINATED NODE   READINESS GATES
kube-system   coredns-bccdc95cf-4x6dk           1/1     Running   10         13h   10.244.0.4        node103   <none>           <none>
kube-system   coredns-bccdc95cf-tjvv9           1/1     Running   10         13h   10.244.0.5        node103   <none>           <none>
kube-system   etcd-node103                      1/1     Running   3          13h   192.168.209.103   node103   <none>           <none>
kube-system   kube-apiserver-node103            1/1     Running   3          13h   192.168.209.103   node103   <none>           <none>
kube-system   kube-controller-manager-node103   1/1     Running   3          13h   192.168.209.103   node103   <none>           <none>
kube-system   kube-flannel-ds-amd64-qxchh       1/1     Running   1          13h   192.168.209.103   node103   <none>           <none>
kube-system   kube-proxy-kpbkx                  1/1     Running   3          13h   192.168.209.103   node103   <none>           <none>
kube-system   kube-scheduler-node103            1/1     Running   3          13h   192.168.209.103   node103   <none>           <none>

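It was not smooth sailing: kube-flannel-ds-amd64-qxchh, coredns-bccdc95cf-tjvv9 and coredns-bccdc95cf-4x6dk were unhealthy, and sorting that out cost me most of a day. Here is a brief summary of the problems and their fixes.
The best way to track down an error is to follow the logs. The command for a pod's status details is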

kubectl describe pod coredns-bccdc95cf-4x6dk   --namespace=kube-system
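
To find which pods are worth describing in the first place, the kubectl get pod --all-namespaces output can be filtered for pods that are not Running or not fully ready. This is a sketch with a hypothetical helper name; it assumes the default column order (NAMESPACE, NAME, READY, STATUS), and it also lists Completed pods, which may be fine.

```shell
#!/bin/bash
# Read `kubectl get pod --all-namespaces` output on stdin and print
# namespace/name for every pod that is not Running or whose READY
# column (for example 0/1) shows containers that are not ready.
unhealthy_pods() {
  awk 'NR > 1 {
    split($3, ready, "/")
    if ($4 != "Running" || ready[1] != ready[2]) print $1 "/" $2
  }'
}
```

Usage would be kubectl get pod --all-namespaces | unhealthy_pods, followed by kubectl describe pod on each name it prints.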

Images cannot be downloaded
Configure the Aliyun mirror and run the docker pull command by hand. In my case this image failed to pull

docker pull quay.io/coreos/flannel:v0.11.0-amd64

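The error reported was: plugin flannel does not support config version

Edit the flannel CNI config file so that it declares cniVersion (on a default flannel install this is typically /etc/cni/net.d/10-flannel.conflist; verify the path on your node)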


{
  "name": "cbr0",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "flannel",
      "delegate": {
        "hairpinMode": true,
        "isDefaultGateway": true
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    }
  ]
}

systemctl daemon-reload
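
The error above goes away once the config declares a cniVersion, so a quick check of the file can confirm the edit. This is a sketch with a hypothetical helper name; it only greps for the field and does not validate the JSON.

```shell
#!/bin/bash
# Succeed when the given CNI config file contains a non-empty
# "cniVersion" string field.
has_cni_version() {
  grep -Eq '"cniVersion"[[:space:]]*:[[:space:]]*"[^"]+"' "$1"
}
```

For example, has_cni_version /etc/cni/net.d/10-flannel.conflist && echo ok, where the path is an assumption about a default flannel install.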

kubectl reports an error

[root@node103 ~]#  kubectl get pod
The connection to the server localhost:8080 was refused - did you specify the right host or port?

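Cause: kubectl has no kubeconfig and falls back to localhost:8080. A cluster installed with kubeadm has to be accessed with the kubernetes-admin credentials stored in admin.conf.

Solution: point kubectl at admin.conf as in the Master node section (export KUBECONFIG=/etc/kubernetes/admin.conf). If admin.conf is missing on the current node, copy it over from the master node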
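
kubectl itself looks at the KUBECONFIG variable and then at ~/.kube/config. The sketch below adds /etc/kubernetes/admin.conf as a last fallback, which is this note's convention rather than kubectl behaviour. The function name and its two optional arguments are hypothetical; the arguments exist so the lookup can be tried against scratch paths.

```shell
#!/bin/bash
# Print the kubeconfig path to use, in this order:
#   1. $KUBECONFIG if set
#   2. <home>/.kube/config
#   3. the kubeadm admin.conf (fallback chosen for this note)
pick_kubeconfig() {
  local home="${1:-$HOME}" admin="${2:-/etc/kubernetes/admin.conf}"
  if [ -n "${KUBECONFIG:-}" ]; then
    echo "$KUBECONFIG"
  elif [ -f "$home/.kube/config" ]; then
    echo "$home/.kube/config"
  elif [ -f "$admin" ]; then
    echo "$admin"
  else
    echo "no kubeconfig found; copy admin.conf from the master" >&2
    return 1
  fi
}
```

A shell profile could then use export KUBECONFIG="$(pick_kubeconfig)" so that kubectl stops falling back to localhost:8080.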

dial tcp 10.96.0.1:443: getsockopt: no route to host (the Kubernetes DNS service keeps restarting)

systemctl stop kubelet
systemctl stop docker
iptables --flush
iptables -t nat --flush
systemctl start kubelet
systemctl start docker

Adding a node

Run the following command on the worker node node102:

kubeadm join --token <token> <master-ip>:<master-port> --discovery-token-ca-cert-hash sha256:<hash>

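<master-ip>:<master-port> is 192.168.209.103:6443 in this note
<token>: by default a token expires after 24 hours (the master here was initialized with --token-ttl 0, so its token never expires). If the token has expired you need to create a new one; kubeadm token list shows the tokens and kubeadm token create creates one, as follows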

[root@node103 ~]#  kubeadm token list
TOKEN                     TTL         EXPIRES   USAGES                   DESCRIPTION                                                EXTRA GROUPS
9yh5wi.15p63wyw19kkzxsl   <forever>   <never>   authentication,signing   The default bootstrap token generated by 'kubeadm init'.   system:boot

[root@node103 ~]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'
e6ddd0a8419514ab5e98bceea40b347aaf8b2044db9fceade811da7bec51c362
# The resulting command for adding a node (run it on node102):
[root@node102 ~]# kubeadm join 192.168.209.103:6443 --token 9yh5wi.15p63wyw19kkzxsl    --discovery-token-ca-cert-hash sha256:e6ddd0a8419514ab5e98bceea40b347aaf8b2044db9fceade811da7bec51c362

Adding a node with kubeadm when the token is forgotten or expired, and how to find the --discovery-token-ca-cert-hash value

# List the existing tokens
[root@]# kubeadm token list

# Create a new token
[root@]# kubeadm token create

# List the tokens again; there is now one more
[root@]# kubeadm token list


# Compute the --discovery-token-ca-cert-hash value
[root@]# openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null |    openssl dgst -sha256 -hex | sed 's/^.* //'

# Delete a token
[root@]# kubeadm token delete <token>
bootstrap token with id "hpvhe4" deleted
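
kubeadm can also print a complete join command in one step with kubeadm token create --print-join-command. When the token and hash were collected separately as above, the pieces can be assembled as in the sketch below. The function name is hypothetical; the format check reflects the bootstrap token layout of six characters, a dot, and sixteen characters from a-z and 0-9.

```shell
#!/bin/bash
# Assemble a kubeadm join command from its three parts.
# Arguments: <master-ip>:<port>  <token>  <hex hash without the sha256: prefix>
join_command() {
  local endpoint="$1" token="$2" hash="$3"
  # Reject anything that does not look like a bootstrap token.
  if ! printf '%s' "$token" | grep -Eq '^[a-z0-9]{6}\.[a-z0-9]{16}$'; then
    echo "malformed token: $token" >&2
    return 1
  fi
  echo "kubeadm join $endpoint --token $token --discovery-token-ca-cert-hash sha256:$hash"
}
```

The printed line is then run as root on the node that should join, node102 in this note.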

Once the join has finished, check the nodes

[root@node103 linux-amd64]# kubectl get nodes
NAME      STATUS   ROLES    AGE    VERSION
node102   Ready    <none>   173m   v1.15.4
node103   Ready    master   17h    v1.15.4
[root@node103 linux-amd64]# kubectl get pods -n kube-system
NAME                              READY   STATUS    RESTARTS   AGE
coredns-bccdc95cf-4x6dk           1/1     Running   10         17h
coredns-bccdc95cf-tjvv9           1/1     Running   10         17h
etcd-node103                      1/1     Running   3          17h
kube-apiserver-node103            1/1     Running   3          17h
kube-controller-manager-node103   1/1     Running   3          17h
kube-flannel-ds-amd64-qxchh       1/1     Running   1          16h
kube-flannel-ds-amd64-txjq8       1/1     Running   0          174m
kube-proxy-4fgbn                  1/1     Running   0          174m
kube-proxy-kpbkx                  1/1     Running   3          17h
kube-scheduler-node103            1/1     Running   3          17h
tiller-deploy-7bbf796b9c-f4sws    1/1     Running   0          2m51s

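Removing a node

After a node has been deleted, it must run kubeadm reset before it can rejoin the cluster with kubeadm join
[root@node103 ~]# kubectl delete node node102
# node102 is the node name. This is not the only way to remove a node; I won't list the others here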
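
A gentler variant is to drain the node before deleting it, so that its pods are evicted first. The sketch below uses flags that exist in the kubectl 1.15 used here (--delete-local-data was renamed in later releases); the function names and the DRY_RUN switch are illustrative.

```shell
#!/bin/bash
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Hypothetical wrapper: evict the pods from a node, then remove it
# from the cluster. Run on the master.
remove_node() {
  local node="$1"
  # DaemonSet pods (flannel, kube-proxy) cannot be evicted, so skip them.
  run kubectl drain "$node" --ignore-daemonsets --delete-local-data || return 1
  run kubectl delete node "$node"
}
```

DRY_RUN=1 remove_node node102 prints the two kubectl commands for review. The removed node still needs kubeadm reset before it can join again.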

Install Helm

helm

Download the Helm release you want from Helm's GitHub page; this note uses helm-v2.14.0-linux-amd64.tar.gz

[root@node103 ~]# tar -zxvf helm-v2.14.0-linux-amd64.tar.gz 
[root@node103 ~]# mv linux-amd64/helm  /usr/bin

tiller

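Initialize and verify Helm; this also installs the server-side component, Tiller. Note: because of network restrictions in mainland China, installing Tiller needs the image gcr.io/kubernetes-helm/tiller:v2.14.0, and that pull is quite likely to fail. So an Aliyun mirror is used here to install Tiller.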

[root@node103 linux-amd64]# helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.14.0 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts
Creating /root/.helm 
Creating /root/.helm/repository 
Creating /root/.helm/repository/cache 
Creating /root/.helm/repository/local 
Creating /root/.helm/plugins 
Creating /root/.helm/starters 
Creating /root/.helm/cache/archive 
Creating /root/.helm/repository/repositories.yaml 
Adding stable repo with URL: https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts 
Adding local repo with URL: http://127.0.0.1:8879/charts 
$HELM_HOME has been configured at /root/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation

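Since Kubernetes 1.6, the API server has RBAC authorization enabled. The Tiller deployment does not define an authorized ServiceAccount, so its requests to the API server are rejected. The following commands grant the Tiller deployment the access it needs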

kubectl create serviceaccount --namespace kube-system tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
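
The first two kubectl create commands can also be expressed declaratively, which makes them repeatable with kubectl apply. The sketch below prints an equivalent manifest using the same names as above (tiller, tiller-cluster-rule, cluster-admin); the function name is hypothetical. Helm 2 also accepts helm init --service-account tiller, which may make the patch step unnecessary on a fresh install.

```shell
#!/bin/bash
# Print a manifest equivalent to the `kubectl create serviceaccount` and
# `kubectl create clusterrolebinding` commands used in this note.
tiller_rbac_manifest() {
  cat <<'EOF'
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller-cluster-rule
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: tiller
  namespace: kube-system
EOF
}
```

Usage would be tiller_rbac_manifest | kubectl apply -f -. Binding cluster-admin mirrors what this note does; it is a broad grant and may be too permissive outside a test cluster.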

If the Tiller pod cannot be found, run the init command again

helm init --upgrade -i registry.cn-hangzhou.aliyuncs.com/google_containers/tiller:v2.14.0 --stable-repo-url https://kubernetes.oss-cn-hangzhou.aliyuncs.com/charts

To remove the server side (Tiller), use one of the following commands

helm reset

helm reset -f

To also remove the directories and other data created by helm init, run helm reset --remove-helm-home

Problems

Error: forwarding ports: error upgrading connection: error dialing backend: dial tcp 192.168.209.102:10250: connect: no route to host

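Rebooting the virtual machine fixed it for me. It is probably an iptables problem; see the iptables flush fix for the "no route to host" error above.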


Last edited: 2019.10.14 09:37:43