Kubernetes Taints and Tolerations in Detail

Taint effects:

  • NoSchedule: if a node carries a taint with effect NoSchedule and a Pod has no matching toleration, Kubernetes will not schedule that Pod onto the node.
  • PreferNoSchedule: if a node carries a taint with effect PreferNoSchedule, Kubernetes tries to avoid scheduling the Pod onto that node, but it is not guaranteed.
  • NoExecute: if a node carries a taint with effect NoExecute, Pods already running on the node that do not tolerate it are evicted, and Pods not yet running there will not be scheduled onto it.

Taint effects:

  • The effect of a taint must be one of NoSchedule, PreferNoSchedule, or NoExecute.

Taints and tolerations as properties:

  • A taint is a property set on a node; a toleration is the matching property set on a Pod that allows (but does not force) it to be scheduled onto nodes carrying that taint.
  • A taint's effect is one of the three listed above.

Taint format:

  • A taint consists of three parts: a key, a value, and an effect
<key>=<value>:<effect>

1. Set a single taint

kubectl taint nodes master1 node-role.kubernetes.io/master=:NoSchedule

kubectl taint node node1 key1=value1:NoSchedule       # with a value

kubectl taint node master1 key2=:PreferNoSchedule     # with an empty value
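A Pod opts in to a tainted node by declaring a matching toleration in its spec. A minimal sketch (the Pod name is hypothetical; the key, value, and effect match the taint set on node1 above):

apiVersion: v1
kind: Pod
metadata:
  name: nginx-toleration-demo   # hypothetical name, for illustration only
spec:
  containers:
  - name: nginx
    image: nginx
  tolerations:
  - key: "key1"           # must match the taint's key
    operator: "Equal"     # Equal compares key and value; Exists matches on the key alone
    value: "value1"
    effect: "NoSchedule"  # must match the taint's effect

The toleration allows the Pod to be scheduled onto the tainted node; it does not force the scheduler to place it there.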

2. Set multiple taints

kubectl taint nodes node1 key1=value1:NoSchedule  

kubectl taint nodes node1 key1=value1:NoExecute  

kubectl taint nodes node1 key2=value2:NoSchedule

3. View a node's taints

[root@master1 ~]# kubectl describe nodes master1
Name:               master1
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=master1
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
Annotations:        flannel.alpha.coreos.com/backend-data: {"VNI":1,"VtepMAC":"36:51:e1:31:e5:9e"}
                    flannel.alpha.coreos.com/backend-type: vxlan
                    flannel.alpha.coreos.com/kube-subnet-manager: true
                    flannel.alpha.coreos.com/public-ip: 192.168.200.3
                    kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 13 Jan 2021 06:04:10 -0500
Taints:             node-role.kubernetes.io/master:NoSchedule              # the node's taint (key:effect)
Unschedulable:      false
Lease:
  HolderIdentity:  master1
  AcquireTime:     <unset>
  RenewTime:       Thu, 14 Jan 2021 01:14:07 -0500
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Wed, 13 Jan 2021 06:12:43 -0500   Wed, 13 Jan 2021 06:12:43 -0500   FlannelIsUp                  Flannel is running on this node
  MemoryPressure       False   Thu, 14 Jan 2021 01:11:17 -0500   Wed, 13 Jan 2021 06:50:32 -0500   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Thu, 14 Jan 2021 01:11:17 -0500   Wed, 13 Jan 2021 06:50:32 -0500   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Thu, 14 Jan 2021 01:11:17 -0500   Wed, 13 Jan 2021 06:50:32 -0500   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Thu, 14 Jan 2021 01:11:17 -0500   Wed, 13 Jan 2021 06:50:32 -0500   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  192.168.200.3
  Hostname:    master1
Capacity:
  cpu:                4
  ephemeral-storage:  17394Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2897500Ki
  pods:               110
Allocatable:
  cpu:                4
  ephemeral-storage:  16415037823
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             2795100Ki
  pods:               110
System Info:
  Machine ID:                 feb4edfea2404d3c8ad028ca4593bb32
  System UUID:                C6F44D56-0F24-6114-23E7-8DF6CD4E4CFE
  Boot ID:                    afcc0ef6-d767-4b97-9a7b-9b2500757f2e
  Kernel Version:             3.10.0-862.el7.x86_64
  OS Image:                   CentOS Linux 7 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.0
  Kubelet Version:            v1.18.2
  Kube-Proxy Version:         v1.18.2
PodCIDR:                      10.244.0.0/24
PodCIDRs:                     10.244.0.0/24
Non-terminated Pods:          (6 in total)
  Namespace                   Name                               CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                               ------------  ----------  ---------------  -------------  ---
  kube-system                 etcd-master1                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         19h
  kube-system                 kube-apiserver-master1             250m (6%)     0 (0%)      0 (0%)           0 (0%)         19h
  kube-system                 kube-controller-manager-master1    200m (5%)     0 (0%)      0 (0%)           0 (0%)         19h
  kube-system                 kube-flannel-ds-wzf7w              100m (2%)     100m (2%)   50Mi (1%)        50Mi (1%)      19h
  kube-system                 kube-proxy-7h5sb                   0 (0%)        0 (0%)      0 (0%)           0 (0%)         19h
  kube-system                 kube-scheduler-master1             100m (2%)     0 (0%)      0 (0%)           0 (0%)         19h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests    Limits
  --------           --------    ------
  cpu                650m (16%)  100m (2%)
  memory             50Mi (1%)   50Mi (1%)
  ephemeral-storage  0 (0%)      0 (0%)
  hugepages-1Gi      0 (0%)      0 (0%)
  hugepages-2Mi      0 (0%)      0 (0%)
Events:              <none>

4. Check which nodes have taints and what they are

[root@master1 ~]# kubectl describe node master1 | grep Taints
Taints:             node-role.kubernetes.io/master:NoSchedule

[root@master1 ~]# kubectl describe node master2 | grep Taints
Taints:             node-role.kubernetes.io/master:NoSchedule

[root@master1 ~]# kubectl describe node master3 | grep Taints
Taints:             node-role.kubernetes.io/master:NoSchedule

5. Output with and without a taint

Taints:             node-role.kubernetes.io/master:NoSchedule     # taint present

Taints:             <none>                                        # no taint

6. Remove a taint so that Pods can be scheduled onto the node again

kubectl taint node master1 node-role.kubernetes.io/master:NoSchedule-

kubectl taint nodes master1 key:NoSchedule-

Pod created with kubectl create stays in ContainerCreating

Checking the Pod status shows that it still has not started successfully after quite a while.

guoqingsongmbp:k8s guo$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 0/1 ContainerCreating 0 15s


Check the Pod details:

guoqingsongmbp:k8s guo$ kubectl describe pod nginx
Name:         nginx
Namespace:    default
Node:         minikube/192.168.99.105
Start Time:   Tue, 25 Dec 2018 17:45:28 +0800
Labels:       app=nginx
Annotations:  <none>
Status:       Pending
IP:
Containers:
  nginx:
    Container ID:
    Image:          nginx
    Image ID:
    Port:           80/TCP
    State:          Waiting
      Reason:       ContainerCreating
    Ready:          False
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5bz7m (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          False
  PodScheduled   True
Volumes:
  default-token-5bz7m:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5bz7m
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type     Reason                  Age  From               Message
  ----     ------                  ---  ----               -------
  Normal   Scheduled               14s  default-scheduler  Successfully assigned nginx to minikube
  Normal   SuccessfulMountVolume   14s  kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-5bz7m"
  Warning  FailedCreatePodSandBox  0s   kubelet, minikube  Failed create pod sandbox.

 

The last event shows the failure: Warning FailedCreatePodSandBox 0s kubelet, minikube Failed create pod sandbox.

SSH into the minikube node to track down the problem.

guoqingsongmbp:k8s guo$ minikube ssh

# check the system logs
$ journalctl -xe


The log contains the following error:

Dec 25 09:40:03 minikube dockerd[2468]: time="2018-12-25T09:40:03.283646463Z" level=info msg="Attempting next endpoint for pull after error: Get https://gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
Dec 25 09:40:03 minikube dockerd[2468]: time="2018-12-25T09:40:03.283664032Z" level=error msg="Handler for POST /v1.31/images/create returned error: Get https://gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)"
Dec 25 09:40:03 minikube localkube[3258]: E1225 09:40:03.284457 3258 remote_runtime.go:92] RunPodSandbox from runtime service failed: rpc error: code = Unknown desc = failed pulling image "gcr.io/google_containers/pause-amd64:3.0": Error response from daemon: Get https://gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)


The cause is that kubelet pulls the Pod sandbox image gcr.io/google_containers/pause-amd64:3.0, and gcr.io is blocked, so the pull cannot succeed. The workaround is to pull an equivalent image from Docker Hub and then re-tag it, as follows:

$ docker pull docker.io/kubernetes/pause
Using default tag: latest
latest: Pulling from kubernetes/pause
a3ed95caeb02: Pull complete
f72a00a23f01: Pull complete
Digest: sha256:2088df8eb02f10aae012e6d4bc212cabb0ada93cb05f09e504af0c9811e0ca14
Status: Downloaded newer image for kubernetes/pause:latest

$ docker tag kubernetes/pause:latest gcr.io/google_containers/pause-amd64:3.0


Finally, delete the original Pod and create it again:

guoqingsongmbp:k8s guo$ kubectl delete -f pod_nginx.yml
pod "nginx" deleted

guoqingsongmbp:k8s guo$ kubectl create -f pod_nginx.yml
pod "nginx" created

guoqingsongmbp:k8s guo$ kubectl get pods
NAME READY STATUS RESTARTS AGE
nginx 1/1 Running 0 14m

guoqingsongmbp:k8s guo$ kubectl describe pod nginx
Name:         nginx
Namespace:    default
Node:         minikube/192.168.99.105
Start Time:   Tue, 25 Dec 2018 18:00:56 +0800
Labels:       app=nginx
Annotations:  <none>
Status:       Running
IP:           172.17.0.4
Containers:
  nginx:
    Container ID:   docker://cf22052ba5626cf6d99fbdb3867fa545a20c16d6f02c7eb9d9ad25b6ce6500ad
    Image:          nginx
    Image ID:       docker-pullable://nginx@sha256:5d32f60db294b5deb55d078cd4feb410ad88e6fe77500c87d3970eca97f54dba
    Port:           80/TCP
    State:          Running
      Started:      Tue, 25 Dec 2018 18:01:26 +0800
    Ready:          True
    Restart Count:  0
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from default-token-5bz7m (ro)
Conditions:
  Type           Status
  Initialized    True
  Ready          True
  PodScheduled   True
Volumes:
  default-token-5bz7m:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  default-token-5bz7m
    Optional:    false
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     <none>
Events:
  Type    Reason                 Age  From               Message
  ----    ------                 ---  ----               -------
  Normal  Scheduled              14m  default-scheduler  Successfully assigned nginx to minikube
  Normal  SuccessfulMountVolume  14m  kubelet, minikube  MountVolume.SetUp succeeded for volume "default-token-5bz7m"
  Normal  Pulling                14m  kubelet, minikube  pulling image "nginx"
  Normal  Pulled                 14m  kubelet, minikube  Successfully pulled image "nginx"
  Normal  Created                14m  kubelet, minikube  Created container
  Normal  Started                14m  kubelet, minikube  Started container

Kubernetes: YAML files

Writing Kubernetes YAML files
A YAML file can end in either .yaml or .yml; both are recognized.
A YAML file is like a script: it can be placed in any location.
Use the built-in reference while writing a YAML file:
kubectl explain deploy                 # look up the Deployment fields with explain
kubectl explain deploy.spec.template   # drill down level by level, joining fields with a dot

Things to know in advance:
imagePullPolicy has three values.
Never: never pull the image; if it is not present locally, the container cannot start.
Always: pull the image from the registry every time the Pod is deployed from the YAML file, whether or not it exists locally (this is the default when the image tag is :latest or omitted).
IfNotPresent: pull from the registry only if the image is not present locally.

Writing the YAML file:
# vim nginx.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx1
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent

kubectl apply -f nginx.yaml
kubectl delete -f nginx.yaml
# the Deployment can also be deleted with: kubectl delete deploy nginx1
# YAML is strict about formatting: indentation must line up exactly.
# vim nginx-pod.yml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent

kubectl apply -f nginx-pod.yml
# a standalone Pod like this is not managed by a Deployment, so it can be deleted directly with: kubectl delete pod nginx
kubectl delete -f nginx-pod.yml
kubectl get po
kubectl api-resources   # list resources and their short names

Labels
Labels are how a Service (svc) finds its Pods;
they are essentially tags attached to Pods.
kubectl get po --show-labels    # show the Pods' labels
Labelling a resource:
kubectl label no node3 disktype=ssd    # label node3 with disktype=ssd
kubectl get no --show-labels
After labelling a node, select it from a YAML file with a node selector:
kubectl explain deploy.spec.template.spec | grep -C 5 -i selector    # shows the nodeSelector field

# vim select.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx2
spec:
  replicas: 5
  template:
    metadata:
      labels:
        run: nginx
    spec:
      nodeSelector:
        disktype: ssd
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
kubectl apply -f select.yaml
kubectl get po -o wide -w    # all 5 Pods end up on node3, the node labelled disktype=ssd
kubectl get no --show-labels
Kubernetes: cluster resilience test
Suppose a Deployment defined in a YAML file runs 5 Pods, spread across the cluster's nodes.
If one of those nodes goes down for some reason, will there still be 5 Pods running?
Yes. The Deployment's desired replica count is 5, so when a node goes down, replacement Pods are started on the remaining nodes to satisfy it.
kubectl get po -o wide shows that there are still 5 Pods in the Running state; replicas are started on the healthy nodes to meet the desired count.
kubectl get deploy
kubectl get no -w   # watch node status (-w = watch)
Once the failed node has been repaired and is back, kubectl get pod -o wide shows that the Pods do not move back to ken3:
once a Pod has been scheduled onto a node, it stays on that node until the end of its lifecycle.
Kubernetes controllers
deploy; job (one-off, comparable to at); cj (CronJob, comparable to a cron job);
deploy and daemonset keep their workloads running continuously.
The Job controller
Writing the Job's YAML file:
kubectl explain job
# vim job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  template:
    spec:
      restartPolicy: Never   # Always is the default restart policy, but a Job may not use Always
      containers:
      - name: busybox
        image: busybox
        imagePullPolicy: IfNotPresent
        args:
        - /bin/sh
        - -c
        - echo "test"; sleep 3
kubectl apply -f job.yaml
kubectl get job
kubectl get po     # the Pod shows Completed: it exited once the Job finished
kubectl logs my-job-s2grt   # prints test, so echo "test" ran successfully
restartPolicy:
Never: never restart
Always: always restart (the default)
OnFailure: restart only when the Pod exits abnormally with a non-zero exit code
kubectl delete -f job.yaml
Now deliberately break the echo "test" command and start the Job again:
kubectl apply -f job.yaml
kubectl get job
kubectl get po -w  # wait for the Pod to finish
kubectl logs my-job-s2grt  # the log shows that /bin/sh could not find the command
If the echo command is broken and the sleep 3 is removed as well:
kubectl get po -w   # Pods keep being created, all ending up in the Error state
kubectl logs my-job-swe2sd   # again, /bin/sh could not find the command
kubectl get po    # new Pods keep appearing
Why does a failing Job create so many Pods?
The Job's desired number of completions is 1, but because of the error inside the Job it can never finish; since the restart policy is Never, the Job keeps creating new Pods
in an attempt to complete the task.
To delete it, simply run kubectl delete -f job.yaml.
With the restart policy set to OnFailure, the Job instead keeps restarting the same Pod to reach its desired completions;
kubectl get po then shows the Pod's RESTARTS count climbing steadily.
A Job cannot use Always: if the restart policy is set to Always, kubectl apply -f job.yaml fails with an unsupported-value error.
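A related field worth knowing: the Job spec also has a backoffLimit that caps how many failed Pods a Job creates before it is marked Failed, which stops the runaway Pod creation described above. A minimal sketch:

apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  backoffLimit: 4        # give up after 4 failed attempts instead of retrying indefinitely
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: busybox
        image: busybox
        imagePullPolicy: IfNotPresent
        args:
        - /bin/sh
        - -c
        - echo "test"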
parallelism: the maximum number of Pods the Job runs at the same time; the Job runs at this maximum, dropping below it only when fewer completions remain than the maximum.
vim job1.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  parallelism: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: busybox
        image: busybox
        imagePullPolicy: IfNotPresent
        args:
        - /bin/sh
        - -c
        - echo "test"
kubectl apply -f job1.yaml
kubectl get po
The total number of Pods to run can also be specified:
apiVersion: batch/v1
kind: Job
metadata:
  name: my-job
spec:
  completions: 6
  parallelism: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: busybox
        image: busybox
        imagePullPolicy: IfNotPresent
        args:
        - /bin/sh
        - -c
        - echo "test"

kubectl apply -f job.yaml
kubectl get po    # 2 Pods run at a time until 6 completions have been reached

Generating YAML files automatically
Generate a Job YAML:
kubectl create job job1 --image=busybox --dry-run -o yaml
kubectl create job job1 --image=busybox --dry-run -o yaml > job1.yaml
Then make a few small edits to the file; this is the quick way to write a YAML file.
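The generated file looks roughly like the following (fields such as creationTimestamp: null and resources: {} are filled in by kubectl and can vary slightly between versions):

apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: job1
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containers:
      - image: busybox
        name: job1
        resources: {}
      restartPolicy: Never
status: {}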
Generate a Deployment YAML:
kubectl create deploy deploy1 --image=busybox --dry-run -o yaml > deploy1.yaml
and edit as needed.
kubectl get po -o yaml     # produces a very detailed YAML file; many of the long YAML files on GitHub are generated this way and then modified
A YAML file is made up of: apiVersion, kind, metadata, spec
kubectl api-resources   # list API resources and their short names
The DaemonSet controller
kubectl explain ds
vim ds.yaml
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: myds
spec:
  template:
    metadata:
      labels:
        run: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        imagePullPolicy: IfNotPresent
kubectl apply -f ds.yaml
kubectl get po -o wide     # the DaemonSet runs exactly one Pod on every node
The CronJob controller
kubectl explain cj
vim cj.yaml
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: mycj
spec:
  schedule: '* * * * *'
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
          - name: busybox
            image: busybox
            imagePullPolicy: IfNotPresent
            args:
            - /bin/sh
            - -c
            - echo "test"
kubectl apply -f cj.yaml
kubectl get cj
kubectl get po -o wide  # a new Pod is started every minute
kubectl logs mycj-215123546-wert12    # prints test
Accessing Pods from outside the cluster through a Service
External access to Pods actually goes through a Service: the outside world talks to the Service IP, the Service finds its Pods through a label selector, and the traffic is forwarded by kube-proxy.
To use a Service, first run a Deployment.
vim svc.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy1
spec:
  replicas: 2
  selector:
    matchLabels:
      name: ken
  template:
    metadata:
      labels:
        name: ken
    spec:
      containers:
      - image: nginx
        name: nginx
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80   # declaring the port here is the equivalent of an expose

kubectl get po -o wide   # the Pod IPs are visible, but they are reachable only from inside the cluster
kubectl get ns     # list namespaces

kubectl explain svc
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy1
spec:
  replicas: 2
  selector:
    matchLabels:
      name: ken
  template:
    metadata:
      labels:
        name: ken
    spec:
      containers:
      - image: nginx
        name: nginx
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 80

---
apiVersion: v1
kind: Service
metadata:
  name: mysvc
spec:
  type: NodePort      # expose the Service on a node port (30000-32767)
  selector:
    name: ken
  ports:
  - port: 80          # the Service port
    targetPort: 80    # the Pod port

kubectl apply -f svc.yaml
kubectl get po -o wide
kubectl get svc    # shows the Service IP and the node port it maps to (30000 or above)
kubectl describe svc mysvc   # shows the Service's own IP plus the Pod endpoints (IP:port)
The Pods can now be reached from outside via any node's IP plus the mapped node port.

The node port can also be chosen explicitly, as long as it falls within the NodePort range (30000-32767 by default):

apiVersion: v1
kind: Service
metadata:
  name: mysvc
spec:
  type: NodePort
  selector:
    name: ken
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30002

kubectl apply -f svc.yaml
kubectl get svc    # the node port mapping has changed to 30002
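With the node port pinned to 30002, the Service can be reached from outside the cluster on any node's IP (the IP below is a placeholder):

curl http://<node-ip>:30002    # returns the nginx welcome page if the Service selector and Pods line up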

Finding out why a Pod fails to start

Focusing specifically on the example shown in the question.

The setup is following:

  • 1 GKE node with: 1 vCPU and 3.75 GB of RAM

The resources scheduled onto this single node cluster:

  • 4 Deployments, each of which has the following fields:
        resources:
          requests: # <-- IMPORTANT
            cpu: "100m" # <-- IMPORTANT
            memory: "128Mi"
          limits:
            cpu: "100m"
            memory: "128Mi"

As an example, I tried to replicate a setup as close as possible to the one in the question:

  • $ kubectl get pods
NAME                           READY   STATUS    RESTARTS   AGE
nginx-four-99d88fccb-v664b     0/1     Pending   0          51m
nginx-one-8584c66446-rcw4p     1/1     Running   0          53m
nginx-three-5bcb988986-jp22f   1/1     Running   0          51m
nginx-two-6c9545d7d4-mrpw6     1/1     Running   0          52m

As you can see, there is a Pod in the Pending state. Further investigation calls for:

  • $ kubectl describe pod/nginx-four-99d88fccb-v664b

A lot of information about the Pod will be shown, but the part that needs to be checked is Events:

Events:
  Type     Reason            Age                From               Message
  ----     ------            ----               ----               -------
  Warning  FailedScheduling  56m (x2 over 56m)  default-scheduler  0/1 nodes are available: 1 Insufficient cpu.
  Normal   Scheduled         56m                default-scheduler  Successfully assigned default/nginx-two-6c9545d7d4-mrpw6 to gke-gke-old-default-pool-641f10b7-36qb
  Normal   Pulling           56m                kubelet            Pulling image "nginx"
  Normal   Pulled            56m                kubelet            Successfully pulled image "nginx"
  Normal   Created           56m                kubelet            Created container nginx
  Normal   Started           56m                kubelet            Started container nginx

As you can see from the output above:

  • FailedScheduling: ... 0/1 nodes are available: 1 Insufficient cpu

As posted in the question:

I keep getting not having enough cpu availability even the node is using only 9% cpu at the same time.
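The arithmetic behind the message: scheduling works on CPU requests, not on measured usage, and on a 1 vCPU node the allocatable CPU is well below 1000m once the kube-system Pods have reserved their share, so the fourth 100m request no longer fits even though actual usage is only 9%. A quick way to check this on the node (the node name is a placeholder):

kubectl describe node <node-name> | grep -A 8 "Allocated resources"
# compare the cpu Requests total with the node's Allocatable cpu;
# whatever is left over is all that new Pods can request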

Fixing "http: server gave HTTP response to HTTPS client"

OS: CentOS 7.6 Minimal
Image: CentOS-7-x86_64-Everything-1810
Docker version: 18.09.6, build 481bc77156

If /etc/docker/daemon.json does not exist, simply create it.

Case 1: no Docker registry mirror (accelerator) configured

Put the following in /etc/docker/daemon.json:

// without a registry mirror

// a single private registry
{
  "insecure-registries": ["registry IP address:port"]
}
// multiple private registries
{
  "insecure-registries": ["registry1 IP address:port", "registry2 IP address:port"]
}


Case 2: a Docker registry mirror (accelerator) is configured

Put the following in /etc/docker/daemon.json:

// with a registry mirror

// a single private registry
{
  "registry-mirrors": ["http://f1361db2.m.daocloud.io"],
  "insecure-registries": ["registry IP address:port"]
}
// multiple private registries
{
  "registry-mirrors": ["http://f1361db2.m.daocloud.io"],
  "insecure-registries": ["registry1 IP address:port", "registry2 IP address:port"]
}


After the configuration is in place, reload and restart Docker:

systemctl daemon-reload
systemctl restart docker.service
systemctl enable docker.service
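To confirm that the daemon has picked up the setting, the registry should now be listed in the daemon info (the exact layout of the output varies by Docker version):

docker info | grep -A 3 "Insecure Registries"
# the IP:port entries from daemon.json should appear here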

Copyright notice: this part is based on an original article by CSDN blogger 「兔子不会武功」, under the CC 4.0 BY-SA license; reposts should include the original link and this notice.
Original link: https://blog.csdn.net/liyin6847/article/details/90599612

Creating a catalog entry in Rancher

Project description: add a new entry to Rancher's catalog page, such that stacks created from that catalog entry work correctly.
Environment
The experiment runs on four VMware ESX virtual machines on a LAN. Machine A (192.168.4.33) runs the Rancher server, and B (192.168.4.27) and C (192.168.4.46) are added through Add Host. After B and C are added, each host is edited and given a label hostname=B or hostname=C; this is used later to check that the machine specified in the catalog's docker-compose.yml is the one the services actually run on. Host D (192.168.4.12) serves as the Registry. The application chosen for the catalog is the GoToMyCloud website. (Note: GoToMyCloud is remote PC control software used for remote work.)
Steps:
Note: the GoToMyCloud web project consists of two servers:
A: a MySQL database server
This server is replaced by a container, which must start the MySQL service at container start and create the database tables on its first start.
B: a Tomcat web server
Java and Tomcat must be installed; the GoToMyCloud web code is packaged into a Java .war file and placed under Tomcat's deployment path, after which the site can be accessed. The .war contains a setting for the database server's IP address; when the .war is built, this address is replaced with the MySQL container's name, fixed here as mysqlhost, so that the link between containers can be used.
The concrete steps are:
1. Set up a private registry;
2. Write Dockerfiles and build the mysql and tomcat images;
3. Write docker-compose.yml and rancher-compose.yml in preparation for the new catalog entry;
4. Add the custom catalog to the Rancher UI and test it.
Setting up the private Registry

Setting up the Registry takes only one step: running the following command creates a registry container:
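A rough sketch of such a command, assuming the official registry:2 image and the address 192.168.4.12:80 used in this article (the host directory /opt/registry-data is just an example):

docker run -d --name registry --restart=always \
  -p 80:5000 \
  -v /opt/registry-data:/var/lib/registry \
  registry:2
# -p 80:5000 publishes the registry on host port 80, matching 192.168.4.12:80
# -v keeps the registry data on the host so it survives if the container is destroyed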


In the command above it is essential to use -v to map a host directory to the registry's storage directory, so that if the container crashes the data survives and the registry can be rebuilt with its data intact.
On the client side, if /etc/sysconfig/docker is not modified, using this registry fails with an authentication error. Find the OPTIONS key in that file and append to its value:
--insecure-registry 192.168.4.12:80

After saving, restart Docker with service docker restart and the private Registry becomes usable.

Writing the Dockerfiles and building the mysql and tomcat images
Building the mysql image:
1. Run docker pull ubuntu to fetch an ubuntu image to use as the base image;

2. From this image, create a container with docker run --name ubuntu_mysql -it ubuntu bash, and inside it install MySQL with apt-get install mysql-server. When prompted for a password during installation, use Cloudsoar12. After installation, MySQL must be configured to allow access from external hosts, which requires two changes:

A. In /etc/mysql/my.cnf, find the line bind-address = 127.0.0.1, comment it out, and restart the MySQL service;
B. Log in with the mysql client and grant other hosts access to the server.

3. Turn the configured container into a base service image with docker commit ubuntu_mysql 192.168.4.12:80/base_mysql, where 192.168.4.12:80 is the private Registry set up earlier, then push the image to the private Registry.
4. Write a Dockerfile to build the application image. Create a directory named mysql and place the following three files in it:
A: GoToMyCloudDB.sql
Used on the container's first start to create the database tables.
B: run.sh
Starts the MySQL service, checks whether this is the first start, and if so imports GoToMyCloudDB.sql.
C: the Dockerfile
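A rough sketch of what run.sh and the Dockerfile could look like, based on the description above (the flag file /var/lib/mysql/.initialized and the exact paths are assumptions, not the original listings):

run.sh:
#!/bin/bash
# start the MySQL service and import the schema on the first run only
service mysql start
if [ ! -f /var/lib/mysql/.initialized ]; then
    mysql -uroot -pCloudsoar12 < /GoToMyCloudDB.sql
    touch /var/lib/mysql/.initialized
fi
tail -f /dev/null    # keep the container in the foreground

Dockerfile:
FROM 192.168.4.12:80/base_mysql
COPY GoToMyCloudDB.sql /GoToMyCloudDB.sql
COPY run.sh /run.sh
RUN chmod +x /run.sh
EXPOSE 3306
CMD ["/run.sh"]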

With these three files in the mysql directory, build the service image with docker build -t 192.168.4.12:80/brank_mysql . and push it to the private Registry.

Building the tomcat image
The GoToMyCloud project requires Tomcat 8.0 or later. Docker Hub has an image called emedeiros/tomcat that comes with Tomcat 8.0.4 installed and the Java environment already configured, so it is chosen as the base service image. The steps are:

1. Use docker pull to bring emedeiros/tomcat to the local host;

2. Create a directory named tomcat and place 3 files in it:

A: GoToMyCloud.war, the packaged web code
B: Run.sh, the script executed when the container starts
C: the Dockerfile
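A rough sketch only, assuming Tomcat lives under /opt/tomcat inside the emedeiros/tomcat image (check the actual path in that image; these are not the original listings):

Dockerfile:
FROM emedeiros/tomcat
# deploy the web application; the webapps path is an assumption
COPY GoToMyCloud.war /opt/tomcat/webapps/GoToMyCloud.war
COPY Run.sh /Run.sh
RUN chmod +x /Run.sh
EXPOSE 8080
CMD ["/Run.sh"]

Run.sh:
#!/bin/bash
# start Tomcat in the foreground so the container stays alive
/opt/tomcat/bin/catalina.sh run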

After this, build the image with docker build and push it to the registry, for example as 192.168.4.12:80/brank_tomcat.

Testing the images
To confirm that the two images built above are sound, start two containers directly with docker commands first (without docker-compose) and check whether the whole project comes up. Verification steps:

1. Create the mysql container on host 192.168.4.46: docker run --name gotomycloud_mysql -d -p 3306:3306 192.168.4.12:80/brank_mysql

2. Create the tomcat container on host 192.168.4.46;

3. From a browser on the LAN, open 192.168.4.46:8080/GoToMyCloud; if the page loads normally, the two images are good.