Once you are on Docker, how can you not move on to Kubernetes (hereafter k8s)? k8s is a platform for automating the deployment, scaling, and operation of application containers across clusters of hosts. As for what exactly it is and what its benefits are, a web search will tell you far more than can be listed here.

This article covers how to install k8s on your servers and how to set up a single-master k8s cluster.

Series

This article is part of the Docker & Kubernetes & GPU series:

Environment

The server environment used in this article:

  • CentOS Linux release 7.5.1804 (Core)
  • Kernel: 3.10.0-862.14.4.el7.x86_64
  • Docker-CE: 18.06.1-ce
  • k8s version installed: 1.12.2

1. Installing the k8s components

The following is condensed from the official installation guide; for the full version see: https://kubernetes.io/docs/setup/independent/install-kubeadm/

Because k8s is open-sourced by Google, the downloads below normally go to Google's servers, which are unreachable from here. There are two workarounds: run a proxy on the server, or replace the download URLs with mirrors.

Note: k8s (kubelet, kubeadm, kubectl) and Docker must be installed on every node.

Add the k8s yum repository

Create and edit /etc/yum.repos.d/kubernetes.repo with the following content (the URLs have already been replaced with the Aliyun mirror [Aliyun mirror site]):

[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
exclude=kube*
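
As an optional sanity check (not part of the official guide), you can refresh the yum metadata and confirm the packages are visible from the new repository:

yum makecache fast
yum list kubelet kubeadm kubectl --disableexcludes=kubernetes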

Set SELinux to permissive

Because k8s pod networking needs access to the host filesystem, SELinux must be set to permissive mode:

# Set SELinux in permissive mode (effectively disabling it)
setenforce 0
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config

Disable swap

Since Kubernetes 1.8, swap must be turned off (for performance reasons); with the default configuration the kubelet will refuse to start if swap is enabled:

swapoff -a 
sed -i 's/.*swap.*/#&/' /etc/fstab
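
As an optional check, free should now report zero swap:

free -h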

Install k8s

To skip the tedious confirmations during installation, add the -y flag and install everything in one go:

yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
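
An optional check that the packages landed correctly:

kubeadm version
kubectl version --client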

A bit of network configuration

Some RHEL/CentOS 7 users have reported traffic being routed incorrectly because iptables was bypassed, so the following settings are required.

Create and edit /etc/sysctl.d/k8s.conf with the following content:

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1

Then run sysctl --system to reload all sysctl configuration files and apply the new settings.
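
On some CentOS 7 machines the net.bridge.* keys only exist once the br_netfilter kernel module is loaded; if sysctl complains that the keys are missing, loading the module first usually resolves it (this extra step is an addition, not in the guide excerpt above):

modprobe br_netfilter
sysctl --system
# both values should now print 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables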

Start the kubelet service

  1. Create the service symlink: systemctl enable kubelet

    On success this prints Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /etc/systemd/system/kubelet.service.

  2. Start the service: systemctl start kubelet

  3. Check the service status: systemctl status kubelet

    (Do this check after running the master initialization below; before that, the service will show as failed.)

    If you see Active: active (running), the service started successfully:

    ● kubelet.service - kubelet: The Kubernetes Node Agent
    Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
    └─10-kubeadm.conf
    Active: active (running) since 四 2018-11-08 10:33:25 CST; 5h 30min ago
    Docs: https://kubernetes.io/docs/
    Main PID: 29230 (kubelet)
    Tasks: 30
    Memory: 47.9M
    CGroup: /system.slice/kubelet.service
    └─29230 /usr/bin/kubelet --bootst
  4. If you see the warning [WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service', do as it says and run systemctl enable docker.service

2. Setting up the k8s cluster

This sets up a single-master cluster; for the full guide see: https://kubernetes.io/docs/setup/independent/create-cluster-kubeadm/

Initialize the master node

The basic command is kubeadm init, but don't run it just yet.

Initializing the master downloads a number of images, because many of the k8s system components themselves run as containers. You can try kubeadm config images pull first; without a proxy it will certainly fail with network errors, since Google's servers are unreachable.

There are several ways to work around the Google image problem:

  • Build your own mirror on Aliyun and change the k8s configuration to pull from it
  • Use a GitHub sync combined with Docker Hub automated builds
  • Pull the images manually and re-tag them

(Also attached: the complete list of Google's images.)

Here we use an existing solution found online: https://anjia0532.github.io/2017/11/15/gcr-io-image-mirror/ . The author syncs all of the Google images into his own GitHub repositories (still maintained at the time of writing) and pushes them to Docker Hub; we just pull them from there and re-tag them.

Working around the Google image problem

Which images do we need? Run kubeadm config images list to find out. For k8s 1.12 the required images and versions are the following (note that different k8s versions require different image versions):

k8s.gcr.io/kube-apiserver:v1.12.2
k8s.gcr.io/kube-controller-manager:v1.12.2
k8s.gcr.io/kube-scheduler:v1.12.2
k8s.gcr.io/kube-proxy:v1.12.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.2

Save the list above to a file; here it is ~/k8s_need_images.dat.
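
If you would rather not copy the list by hand, the same file can be generated directly (kubeadm may warn that it cannot reach the Internet to resolve the latest stable version, but it falls back to its built-in default):

kubeadm config images list > ~/k8s_need_images.dat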

Then create and edit ~/retag_images.sh with the following content:

#!/bin/bash
# Read the image list saved above (use an absolute path so the script works from any directory)
images=(`cat ~/k8s_need_images.dat`)
echo ${images[@]}
for img in ${images[@]}
do
    # To download more images later, just append them to ~/k8s_need_images.dat
    # Prefix entries you no longer need (e.g. already downloaded) with # to skip them
    # Names starting with either k8s.gcr.io or gcr.io/google_containers are both handled
    if [[ "${img:0:1}"x != "#"x ]]; then
        # Keep only the last path component, e.g. kube-apiserver:v1.12.2
        img_name=`echo $img | awk -F '/' '{print $NF}'`
        # Pull the mirrored copy from Docker Hub, re-tag it with the original name, then drop the mirror tag
        download_img="anjia0532/google-containers.${img_name}"
        echo "dealing with $img_name"
        docker pull $download_img
        docker tag $download_img $img
        docker rmi $download_img
    fi
done

Run sh ~/retag_images.sh and wait a while. When it finishes, docker images should show that all the required k8s images are present.
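
An optional check that the re-tagged images are all there:

docker images | grep k8s.gcr.io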

Choose a pod network add-on

One more thing to decide before initializing the master is the pod network add-on. k8s supports many network solutions; here we use Flannel, which requires the extra flag --pod-network-cidr=10.244.0.0/16 at init time.

Run the initialization

Run kubeadm init --pod-network-cidr=10.244.0.0/16

You should then see output like the following:

[init] using Kubernetes version: v1.12.2
[preflight] running pre-flight checks
[preflight/images] Pulling images required for setting up a Kubernetes cluster
[preflight/images] This might take a minute or two, depending on the speed of your internet connection
[preflight/images] You can also perform this action in beforehand using 'kubeadm config images pull'
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[preflight] Activating the kubelet service
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [your.hostname1.com kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 hostname1-ip]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Generated etcd/ca certificate and key.
[certificates] Generated etcd/healthcheck-client certificate and key.
[certificates] Generated apiserver-etcd-client certificate and key.
[certificates] Generated etcd/server certificate and key.
[certificates] etcd/server serving cert is signed for DNS names [your.hostname1.com localhost] and IPs [127.0.0.1 ::1]
[certificates] Generated etcd/peer certificate and key.
[certificates] etcd/peer serving cert is signed for DNS names [your.hostname1.com localhost] and IPs [hostname1-ip 127.0.0.1 ::1]
[certificates] valid certificates and keys now exist in "/etc/kubernetes/pki"
[certificates] Generated sa key and public key.
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "/etc/kubernetes/scheduler.conf"
[controlplane] wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests"
[init] this might take a minute or longer if the control plane images have to be pulled
[apiclient] All control plane components are healthy after 20.502026 seconds
[uploadconfig] storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.12" in namespace kube-system with the configuration for the kubelets in the cluster
[markmaster] Marking the node your.hostname1.com as master by adding the label "node-role.kubernetes.io/master=''"
[markmaster] Marking the node your.hostname1.com as master by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "your.hostname1.com" as an annotation
[bootstraptoken] using token: gnafk2.7b1lq8543rhbcsbz
[bootstraptoken] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of machines by running the following on each node
as root:

kubeadm join hostname1-ip:6443 --token gnafk2.7b1lq8543rhbcsbz --discovery-token-ca-cert-hash sha256:d2296123b1364d26678b1f92210d54fa4bb36455ffbcd665e9f04e05288b7b34

If you see successfully and the kubelet service status is active (running), the master node has been initialized successfully.

According to the log output, there are still three things left to do.

Configure kubectl

To let a non-root user (for example a work user) manage the k8s cluster with kubectl, run the following:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

For the root user you can either use the method above (recommended) or simply run export KUBECONFIG=/etc/kubernetes/admin.conf; since exporting in every new shell quickly gets tiresome, the method above is still preferable.

Any user who works with kubectl can also enable command completion: echo "source <(kubectl completion bash)" >> ~/.bashrc
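
With the kubeconfig in place, a quick sanity check that kubectl can reach the API server (the master will still show NotReady until the pod network is installed in the next step):

kubectl cluster-info
kubectl get nodes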

Install the pod network

To allow pods in the cluster to communicate with each other, install the pod network by running:

kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/bc79dd1505b0c8681ece4de4c0d86c5cd2643275/Documentation/kube-flannel.yml
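
You can watch the flannel DaemonSet pods come up with the following command (the pod names will differ on your cluster):

kubectl get pods -n kube-system | grep flannel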

Add worker nodes

On each server that will act as a worker node, run the join command printed at the end of the successful master initialization:

kubeadm join hostname1-ip:6443 --token gnafk2.7b1lq8543rhbcsbz --discovery-token-ca-cert-hash sha256:d2296123b1364d26678b1f92210d54fa4bb36455ffbcd665e9f04e05288b7b34

Note that the token generated when the master was initialized is only valid for 24 hours; if it has expired you can generate a new one, see the Some errors section below.

If the join succeeds you will see output like this:

[preflight] running pre-flight checks
[discovery] Trying to connect to API Server "hostname1-ip:6443"
[discovery] Created cluster-info discovery client, requesting info from "https://hostname1-ip:6443"
[discovery] Requesting info from "https://hostname1-ip:6443" again to validate TLS against the pinned public key
[discovery] Cluster info signature and contents are valid and TLS certificate validates against pinned roots, will use API Server "hostname1-ip:6443"
[discovery] Successfully established connection with API Server "hostname1-ip:6443"
[kubelet] Downloading configuration for the kubelet from the "kubelet-config-1.12" ConfigMap in the kube-system namespace
[kubelet] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[preflight] Activating the kubelet service
[tlsbootstrap] Waiting for the kubelet to perform the TLS Bootstrap...
[patchnode] Uploading the CRI Socket information "/var/run/dockershim.sock" to the Node API object "your.hostname3.com" as an annotation

This node has joined the cluster:
* Certificate signing request was sent to apiserver and a response was received.
* The Kubelet was informed of the new secure connection details.

Run 'kubectl get nodes' on the master to see this node join the cluster.

Seeing joined confirms that the node was added successfully.

Configure the worker nodes

On the master, run kubectl get nodes to see the node status:

NAME                 STATUS     ROLES    AGE    VERSION
your.hostname3.com   NotReady   <none>   19m    v1.12.2
your.hostname2.com   NotReady   <none>   6s     v1.12.2
your.hostname1.com   Ready      master   110m   v1.12.2

Two worker nodes were added here, and showing up in this list also confirms that they joined successfully. Note, however, that both worker nodes report STATUS NotReady; this is because they still need to start a few components, and those components run in pods.

Use kubectl get pod --all-namespaces to list all pods running in the cluster:

NAMESPACE     NAME                                          READY   STATUS              RESTARTS   AGE
kube-system   coredns-576cbf47c7-dv4wr                      1/1     Running             0          117m
kube-system   coredns-576cbf47c7-mq9kp                      1/1     Running             0          117m
kube-system   etcd-your.hostname1.com                       1/1     Running             0          116m
kube-system   kube-apiserver-your.hostname1.com             1/1     Running             0          116m
kube-system   kube-controller-manager-your.hostname1.com    1/1     Running             0          116m
kube-system   kube-flannel-ds-amd64-fpw9b                   0/1     Init:0/1            0          26m
kube-system   kube-flannel-ds-amd64-lp8kn                   0/1     Init:0/1            0          7m47s
kube-system   kube-flannel-ds-amd64-qzrlb                   1/1     Running             0          34m
kube-system   kube-proxy-96kzl                              0/1     ContainerCreating   0          26m
kube-system   kube-proxy-vb28n                              1/1     Running             0          117m
kube-system   kube-proxy-w62pn                              0/1     ContainerCreating   0          7m47s
kube-system   kube-scheduler-your.hostname1.com             1/1     Running             0          116m

Some pods are clearly not running normally (here, two kube-proxy and two kube-flannel pods). Use kubectl describe pod kube-proxy-96kzl --namespace=kube-system to inspect one of them. The Events section at the bottom of the output shows the following:

Events:
  Type     Reason                  Age                  From                         Message
  ----     ------                  ----                 ----                         -------
  Warning  DNSConfigForming        10m (x151 over 80m)  kubelet, your.hostname3.com  Nameserver limits were exceeded, some nameservers have been omitted, the applied nameserver line is: 127.0.0.1 xx.xx.xx.xx xx.xx.xx.xx
  Warning  FailedCreatePodSandBox  46s (x172 over 80m)  kubelet, your.hostname3.com  Failed create pod sandbox: rpc error: code = Unknown desc = failed pulling image "k8s.gcr.io/pause:3.1": Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

The failure is caused by being unable to pull the k8s.gcr.io/pause:3.1 image, once again because Google's servers are unreachable. From testing, each worker node needs the following two images, plus quay.io/coreos/flannel:v0.10.0-amd64, which can be pulled over a normal connection:

k8s.gcr.io/pause:3.1
k8s.gcr.io/kube-proxy:v1.12.2

So on each worker node you can reuse the Google-mirror workaround from earlier in this article to pull these two images, as sketched below.
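
A minimal sketch of doing this on a worker node, reusing the file and script names from section 1 (copy ~/retag_images.sh to the node first, and adjust the paths if you placed them elsewhere):

cat > ~/k8s_need_images.dat <<EOF
k8s.gcr.io/pause:3.1
k8s.gcr.io/kube-proxy:v1.12.2
EOF
sh ~/retag_images.sh
# flannel itself is hosted on quay.io and can be pulled directly
docker pull quay.io/coreos/flannel:v0.10.0-amd64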

Once the downloads are done, wait a bit and run kubectl get pod --all-namespaces again; all pods should now be Running, and kubectl get nodes should show every node as Ready.

At this point the k8s cluster is up.

Some errors

  • [WARNING Service-Docker]: docker service is not enabled, please run ‘systemctl enable docker.service’

The docker service has not been enabled in systemd. Just run systemctl enable docker.service as the message suggests.

  • [WARNING RequiredIPVSKernelModulesAvailable]: the IPVS proxier will not be used, because the following required kernel modules are not loaded: [ip_vs_sh ip_vs ip_vs_rr ip_vs_wrr] or no builtin kernel ipvs support: map[ip_vs:{} ip_vs_rr:{} ip_vs_wrr:{} ip_vs_sh:{} nf_conntrack_ipv4:{}]
    you can solve this problem with following methods:

    1. Run 'modprobe -- ' to load missing kernel modules;

    2. Provide the missing builtin kernel ipvs support

Some kernel modules are not loaded; run modprobe -va ip_vs_sh ip_vs ip_vs_rr ip_vs_wrr to load them.

  • [discovery] Failed to connect to API Server “hostname1-ip:6443”: token id “ztsrrn” is invalid for this cluster or it has expired. Use “kubeadm token create” on the master node to creating a new valid token

This error occurs when joining a worker node and means the token has expired. Run kubeadm token create on the master to generate a new token. You also need to fetch the CA hash again, otherwise you will hit a cluster CA found in cluster-info configmap is invalid error; get it with openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'. With the new token and CA hash, run the join command on the node again.
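
Alternatively, kubeadm (including 1.12) can print a complete, ready-to-use join command with a fresh token and the CA hash already filled in:

kubeadm token create --print-join-command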

  • Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of “crypto/rsa: verification error” while trying to verify candidate authority certificate “kubernetes”)

kubectl has not been configured (the admin kubeconfig was not copied or exported, see Configure kubectl above), so the TLS certificate cannot be verified.

  • A worker node has downloaded the required images manually, but the pod status still does not change

In this case the pod is probably stuck somewhere. Delete it with kubectl delete pod xxx --namespace=kube-system; the cluster will recreate the pod automatically, and the new one will use the manually downloaded images.

3. Summary

That completes the single-master setup. For production use, a highly available multi-master setup is still recommended.

This article used the official kubeadm tool, which makes the whole process quite painless. There are also plenty of guides online that set everything up from the raw binaries; that is more tedious, but it gives a deeper understanding of how k8s works. Pick whichever approach suits you better.


There is a demo of microservices deployed on a k8s cluster here that you can try out to get a feel for it.