refinement
This commit is contained in:
parent 4bc0d45de5
commit 2fc6180697
README.md
@@ -1,6 +1,6 @@
* [kubectl - BASH autocompletion](#kubectl-bash-autocompletion)
* [Install k3s](#install-k3s)
* [On on-premises](#install-k3s-on-premises)
* [On premises/IaaS](#install-k3s-on-premises)
* [Configure upstream DNS-resolver](#upstream-dns-resolver)
* [Change NodePort range](#nodeport-range)
* [Clustering](#clustering)
@@ -8,9 +8,8 @@
* [Namespaces and resource limits](#namespaces-limits)
* [Persistent volumes (StorageClass - dynamic provisioning)](#pv)
* [Rancher Local](#pv-local)
* [Rancher Longhorn - distributed in local cluster](#pv-longhorn)
* [Rancher Longhorn (distributed in local cluster) - MY FAVOURITE :-)](#pv-longhorn)
* [NFS](#pv-nfs)
* [Seaweedfs](#pv-seaweedfs)
* [Ingress controller](#ingress-controller)
* [Disable Traefik-ingress](#disable-traefik-ingress)
* [Enable NGINX-ingress with OCSP stapling](#enable-nginx-ingress)
@@ -31,12 +30,12 @@
* [Get deployment history](#helm-history)
* [Rollback](#helm-rollback)
* [Kubernetes in action](#kubernetes-in-action)
* [Running DaemonSets on `hostPort`](#running-daemonsets)
* [Running DaemonSets with `hostNetwork: true`](#running-daemonsets)
* [Running StatefulSet with NFS storage](#running-statefulset-nfs)
* [Services](#services)
* [Client-IP transparency and loadbalancing](#services-client-ip-transparency)
* [Session affinity/persistence](#services-session-persistence)
* [Keep your cluster balanced](#keep-cluster-balanced)
* [Keeping the cluster balanced](#keep-cluster-balanced)
* [Node maintenance](#node-maintenance)
* [What happens if a node goes down?](#what-happens-node-down)
* [Dealing with disruptions](#disruptions)
@@ -55,36 +54,12 @@ echo "source <(kubectl completion bash)" >> ~/.bashrc
```

# Install k3s <a name="user-content-install-k3s"></a>
## On premises <a name="user-content-install-k3s-on-premises"></a>
## On premises/IaaS <a name="user-content-install-k3s-on-premises"></a>
https://k3s.io/:
```
curl -sfL https://get.k3s.io | sh -
```
If desired, set a memory consumption limit on the systemd unit like so:
```
root#> mkdir /etc/systemd/system/k3s.service.d
root#> vi /etc/systemd/system/k3s.service.d/limits.conf
[Service]
MemoryMax=1024M

root#> systemctl daemon-reload
root#> systemctl restart k3s

root#> systemctl status k3s
k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/k3s.service.d
└─limits.conf
Active: active (running) since Thu 2020-11-26 10:46:26 CET; 13min ago
Docs: https://k3s.io
Process: 9618 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 9619 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 9620 (k3s-server)
Tasks: 229
Memory: 510.6M (max: 1.0G)
CGroup: /system.slice/k3s.service

```
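After the installation you can quickly verify that the node came up (k3s bundles its own kubectl, so this works before any kubeconfig setup):
```
root#> k3s kubectl get node
```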
### Upstream DNS-resolver <a name="user-content-upstream-dns-resolver"></a>
Docs: https://rancher.com/docs/rancher/v2.x/en/troubleshooting/dns/

@@ -114,7 +89,7 @@ ExecStart=/usr/local/bin/k3s \
If you want to build a K3s-cluster the default networking model is *overlay@VXLAN*. In this case make sure that
* all of your nodes can reach (ping) each other over the underlying network (local, routed/vpn). This is required for the overlay network to work properly. VXLAN spans a meshed network over all K3s-nodes.
* if your nodes are spread over public networks (like the internet) use a VPN (like IPSec or OpenVPN) to secure the traffic between the nodes. **VXLAN uses plain UDP for transport!**
* if your nodes are connected through VPN, `flannel` (overlay network daemon) should explicitly communicate over the vpn network interface instead of the public network interface. Following settings should be made on the nodes:
* if your nodes are connected through VPN, `flannel` (overlay network daemon) should explicitly communicate via the vpn network interface instead of the public network interface. Following settings should be made on the nodes:
```
/etc/systemd/system/k3s-agent.service:

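The hunk above is cut off; for reference, on an agent the VPN interface can be pinned explicitly with the `--flannel-iface` flag. A sketch (the interface name `wg0` is an assumption, use whatever device your VPN provides):
```
# /etc/systemd/system/k3s-agent.service (excerpt)
[Service]
ExecStart=/usr/local/bin/k3s agent \
    --flannel-iface wg0 \
    [...]
```
followed by `systemctl daemon-reload` and `systemctl restart k3s-agent`.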
@@ -164,7 +139,7 @@ and change the content to
```
forward . ipaddr.of.your.dns-resolver
```
Finally redeploy the CoreDNS deployment with:
Finally re-deploy the CoreDNS deployment with:
`kubectl -n kube-system rollout restart deployment coredns`

**Note:** If you restart the cluster (`k3d cluster stop your-cluster` and `k3d cluster start your-cluster`), the changes will be gone!
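To check that CoreDNS actually forwards to the configured upstream resolver, a throwaway pod is handy. A sketch (image tag and test domain are arbitrary):
```
kubectl run dns-test --rm -it --restart=Never --image=busybox:1.36 -- nslookup example.com
```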
@@ -180,7 +155,7 @@ Read more about [AccessModes](https://kubernetes.io/docs/concepts/storage/persis
https://rancher.com/docs/k3s/latest/en/storage/
Only supports *AccessMode*: ReadWriteOnce (RWO)

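For reference, a minimal PVC against the default `local-path` StorageClass could look like this (name and size are placeholders):
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: local-path-pvc
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
```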
## Longhorn (distributed in local cluster) <a name="user-content-pv-longhorn"></a>
## Rancher Longhorn (distributed in local cluster) - MY FAVOURITE :-) <a name="user-content-pv-longhorn"></a>
* Requirements: https://longhorn.io/docs/0.8.0/install/requirements/
* Debian: `apt install open-iscsi`
* Install: https://rancher.com/docs/k3s/latest/en/storage/
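If you prefer Helm over the linked install instructions, the upstream chart works too. A sketch (chart repo and namespace as documented by Longhorn):
```
helm repo add longhorn https://charts.longhorn.io
helm repo update
helm install longhorn longhorn/longhorn -n longhorn-system --create-namespace
```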
@@ -239,11 +214,6 @@ spec:
  requests:
    storage: 32Mi
```
## Seaweedfs <a name="user-content-pv-seaweedfs"></a>
Docs: https://github.com/seaweedfs
Docs: https://github.com/seaweedfs/seaweedfs-csi-driver

In order to use the CSI driver you need to have a working seaweedfs-cluster. As seaweedfs is really lightweight, it can be deployed on a handful (at least three) of Raspberry Pis (model 3 or newer) as well as on the K3s cluster itself.

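Once the CSI driver and a matching StorageClass exist, consuming seaweedfs is a normal PVC. A sketch (the StorageClass name `seaweedfs-storage` is an assumption; create it as described in the seaweedfs-csi-driver docs):
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: seaweedfs-pvc
spec:
  accessModes:
    - ReadWriteMany   # assumption: the CSI driver is deployed with RWX support
  storageClassName: seaweedfs-storage
  resources:
    requests:
      storage: 2Gi
```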
# Ingress controller <a name="user-content-ingress-controller"></a>
## Disable Traefik-ingress <a name="user-content-disable-traefik-ingress"></a>
@@ -251,7 +221,7 @@ edit /etc/systemd/system/k3s.service:
```
[...]
ExecStart=/usr/local/bin/k3s \
server --disable traefik --resolv-conf /etc/resolv.conf \
server [...] --disable traefik \
[...]
```
Finally `systemctl daemon-reload` and `systemctl restart k3s`
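Alternatively, newer k3s releases also read a config file, which avoids editing the unit file. A sketch with the same effect as the `--disable traefik` flag:
```
# /etc/rancher/k3s/config.yaml
disable:
  - traefik
```
followed by `systemctl restart k3s`.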
@@ -264,36 +234,37 @@ https://kubernetes.github.io/ingress-nginx/deploy/#using-helm
```
kubectl create ns ingress-nginx
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install my-release ingress-nginx/ingress-nginx -n ingress-nginx
helm install nginx-ingress ingress-nginx/ingress-nginx -n ingress-nginx
```

`kubectl -n ingress-nginx get all`:
```
NAME READY STATUS RESTARTS AGE
pod/svclb-my-release-ingress-nginx-controller-m6gxl 2/2 Running 0 110s
pod/my-release-ingress-nginx-controller-695774d99c-t794f 1/1 Running 0 110s
pod/svclb-nginx-ingress-controller-m6gxl 2/2 Running 0 110s
pod/nginx-ingress-controller-695774d99c-t794f 1/1 Running 0 110s

NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/my-release-ingress-nginx-controller-admission ClusterIP 10.43.116.191 <none> 443/TCP 110s
service/my-release-ingress-nginx-controller LoadBalancer 10.43.55.41 192.168.178.116 80:31110/TCP,443:31476/TCP 110s
service/nginx-ingress-controller-admission ClusterIP 10.43.116.191 <none> 443/TCP 110s
service/nginx-ingress-controller LoadBalancer 10.43.55.41 192.168.178.116 80:31110/TCP,443:31476/TCP 110s

NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/svclb-my-release-ingress-nginx-controller 1 1 1 1 1 <none> 110s
daemonset.apps/svclb-nginx-ingress-controller 1 1 1 1 1 <none> 110s

NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/my-release-ingress-nginx-controller 1/1 1 1 110s
deployment.apps/nginx-ingress-controller 1/1 1 1 110s

NAME DESIRED CURRENT READY AGE
replicaset.apps/my-release-ingress-nginx-controller-695774d99c 1 1 1 110s
replicaset.apps/nginx-ingress-controller-695774d99c 1 1 1 110s
```
As nginx ingress is hungry for memory, let´s reduce the number of workers to 1:
The nginx ingress global configuration can be modified as follows:
```
kubectl -n ingress-nginx edit configmap my-release-ingress-nginx-controller
kubectl -n ingress-nginx edit configmap ingress-nginx-controller

apiVersion: v1
<<<ADD BEGIN>>>
data:
  enable-ocsp: "true"
  use-gzip: "true"
  worker-processes: "1"
<<<ADD END>>>
kind: ConfigMap
@@ -301,13 +272,13 @@ kind: ConfigMap
```
Finally the deployment needs to be restarted:

`kubectl -n ingress-nginx rollout restart deployment my-release-ingress-nginx-controller`
`kubectl -n ingress-nginx rollout restart deployment ingress-nginx-controller`

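With `enable-ocsp: "true"` active and a certificate in place, stapling can be verified from outside the cluster. A sketch (replace `example.com` with a host served by the ingress):
```
openssl s_client -connect example.com:443 -servername example.com -status </dev/null 2>/dev/null | grep -A 2 "OCSP Response Status"
```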
**If you are facing deployment problems like the following one**
```
Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://my-release-ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded
Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://nginx-ingress-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded
```
A possible fix: `kubectl -n ingress-nginx delete ValidatingWebhookConfiguration my-release-ingress-nginx-admission`
A possible fix: `kubectl -n ingress-nginx delete ValidatingWebhookConfiguration ingress-nginx-admission`

# Cert-Manager (references ingress controller) <a name="user-content-cert-manager"></a>
## Installation <a name="user-content-cert-manager-install"></a>
@@ -325,7 +296,7 @@ kubectl -n cert-manager get all
## Cluster-internal CA Issuer <a name="user-content-cert-manager-cluster-ca-issuer"></a>
Docs: https://cert-manager.io/docs/configuration/ca/

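A minimal sketch of such a CA-backed ClusterIssuer (the secret name `ca-key-pair` is an assumption; it has to contain the CA's `tls.crt`/`tls.key` in the `cert-manager` namespace):
```
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: cluster-ca-issuer
spec:
  ca:
    secretName: ca-key-pair
```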
## Let´s Encrypt issuer <a name="user-content-cert-manager-le-issuer"></a>
## Let´s Encrypt HTTP-01 issuer <a name="user-content-cert-manager-le-issuer"></a>
Docs: https://cert-manager.io/docs/tutorials/acme/ingress/#step-6-configure-let-s-encrypt-issuer
```
ClusterIssuers are a resource type similar to Issuers. They are specified in exactly the same way,
@@ -338,7 +309,7 @@ lets-encrypt-cluster-issuers.yaml:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging-issuer
  name: letsencrypt-http01-staging-issuer
spec:
  acme:
    # You must replace this email address with your own.
@@ -358,7 +329,7 @@ spec:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-issuer
  name: letsencrypt-http01-prod-issuer
spec:
  acme:
    # The ACME server URL
@@ -386,7 +357,7 @@ data:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01-issuer
  name: letsencrypt-dns01-prod-issuer
spec:
  acme:
    email: user@example.com
@@ -411,9 +382,9 @@ spec:
`kubectl apply -f lets-encrypt-cluster-issuers.yaml`

## Deploying a LE-certificate with ingress <a name="user-content-cert-manager-ingress"></a>
All you need is an `Ingress` resource of class `nginx` which references a ClusterIssuer (`letsencrypt-prod-issuer`) resource.
All you need is an `Ingress` resource of class `nginx` which references a ClusterIssuer (`letsencrypt-http01-prod-issuer`) resource.

HTTP-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"`):
HTTP-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-http01-prod-issuer"`):
```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
@@ -423,7 +394,7 @@ metadata:
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"
    cert-manager.io/cluster-issuer: "letsencrypt-http01-prod-issuer"
spec:
  tls:
    - hosts:
@@ -438,7 +409,7 @@ spec:
          serviceName: some-target-service
          servicePort: some-target-service-port
```
DNS-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-dns01-issuer"`):
DNS-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-dns01-prod-issuer"`):
```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
@@ -448,7 +419,7 @@ metadata:
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-dns01-issuer"
    cert-manager.io/cluster-issuer: "letsencrypt-dns01-prod-issuer"
spec:
  tls:
    - hosts:
@@ -538,7 +509,7 @@ spec:
## Troubleshooting <a name="user-content-cert-manager-troubleshooting"></a>
Docs: https://cert-manager.io/docs/faq/acme/

ClusterIssuer runs in default namespace:
ClusterIssuers are *visible* in any namespace:
```
kubectl get clusterissuer
kubectl describe clusterissuer <object>
@@ -555,7 +526,9 @@ kubectl -n <stage> get challenge
kubectl -n <stage> describe challenge <object>
```

After successfull setup perform a TLS-test: `https://www.ssllabs.com/ssltest/index.html`
After successful setup perform a TLS-test:
* https://testssl.sh/ (`apt install testssl.sh`)
* https://www.ssllabs.com/ssltest/index.html

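Besides issuers and challenges, the intermediate cert-manager resources often show where an issuance is stuck:
```
kubectl -n <stage> get certificate,certificaterequest,order
kubectl -n <stage> describe certificate <object>
```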
# HELM charts <a name="user-content-helm"></a>
Docs:
@@ -690,13 +663,12 @@ NOTES:
```

# Kubernetes in action <a name="user-content-kubernetes-in-action"></a>
## Running DaemonSets on `hostPort` <a name="user-content-running-daemonsets"></a>
* Docs: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
* Good article: https://medium.com/stakater/k8s-deployments-vs-statefulsets-vs-daemonsets-60582f0c62d4
## Running DaemonSets with `hostNetwork: true` <a name="user-content-running-daemonsets"></a>
* [Docs: DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/)
* [(Security) hints on using `hostNetwork`](https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces)
* [Pod´s DNS policy](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy)

In this case no Service-related network configuration is needed.

This setup is suitable for legacy scenarios where static IP-address are required and a NodePort service is not an alternative:
This setup is suitable for scenarios where the Kubernetes nodes already run with a dual IP stack (IPv4 and IPv6) and the Pod needs IPv6 too, but k3s was deployed in IPv4-only mode. In this case the Pod can be deployed in the network namespace of the Kubernetes node.
```
kind: DaemonSet
apiVersion: apps/v1
@@ -704,54 +676,30 @@ metadata:
  name: netcat-daemonset
  labels:
    app: netcat-daemonset
spec:
  selector:
    matchLabels:
      app: netcat-daemonset
  template:
    metadata:
      labels:
        app: netcat-daemonset
    spec:
      containers:
      - command:
        - nc
        - -lk
        - -p
        - "23456"
        - -v
        - -e
        - /bin/true
        env:
        - name: DEMO_GREETING
          value: Hello from the environment
        image: dockreg-zdf.int.zwackl.de/alpine/latest/amd64:prod
        imagePullPolicy: IfNotPresent
        name: netcat-daemonset
        ports:
        - containerPort: 23456
          hostPort: 23456
          protocol: TCP
        resources:
          limits:
            cpu: 500m
            memory: 64Mi
          requests:
            cpu: 50m
            memory: 32Mi
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
spec:
  selector:
    matchLabels:
      app: netcat-daemonset
  template:
    metadata:
      labels:
        app: netcat-daemonset
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      restartPolicy: Always
      terminationGracePeriodSeconds: 10
      containers:
      - name: alpine-netcat-daemonset
        image: alpine
        imagePullPolicy: IfNotPresent
        command: ["nc", "-lk", "-p", "23456", "-v", "-e", "/bin/true"]
```
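Because of `hostNetwork: true` each Pod reports the IP of its node, which is easy to verify (label taken from the manifest above):
```
kubectl get pod -l app=netcat-daemonset -o wide
```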
## Running StatefulSet with NFS storage <a name="user-content-running-statefulset-nfs"></a>
* https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
* [Docs: StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/)
* [NFS dynamic volume provisioning deployed](#pv-nfs)

**Be careful:** *StatefulSets* are designed for stateful applications (like databases). To avoid split-brain scenarios StatefulSets behave as static as possible. If a node goes down, the StatefulSet controller will reschedule the pods to another nodes, which can meet the requirements! If you want to force a re-scheduling:
StatefulSets are designed for stateful applications (like databases). To avoid split-brain scenarios, StatefulSets behave as static as possible. If a node goes down, the StatefulSet controller will reschedule the pods to another node that meets the required conditions! If you want to force a re-scheduling:
`kubectl delete pod web-1 --grace-period=0 --force`

More details on this can be found [here](https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/)
@@ -824,7 +772,6 @@ spec:

`externalTrafficPolicy: Local` preserves the original client IP-address, which is visible to the PODs. In any case (`DaemonSet` or `StatefulSet`) traffic remains on the Node which receives it. In case of `StatefulSet`, if more than one POD of a `ReplicaSet` is scheduled on the same Node, the workload gets balanced over all PODs on that Node.


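For reference, a minimal Service sketch with client-IP preservation (name, selector and ports are placeholders):
```
apiVersion: v1
kind: Service
metadata:
  name: the-app
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  selector:
    app: the-app
  ports:
    - port: 80
      targetPort: 8080
```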
### Session affinity/persistence <a name="user-content-services-session-persistence"></a>
```
apiVersion: v1
@@ -841,7 +788,7 @@ spec:
Session persistence (`None` by default) is only supported on the client IP-address. Cookie-stickiness or stickiness on any other/higher layer is not supported yet.

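Since the hunk above elides them, these are the relevant Service fields. A sketch (the timeout value is just an example):
```
spec:
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600
```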
## What happens if a node goes down? <a name="user-content-what-happens-node-down"></a>
If a node goes down kubernetes marks this node as *NotReady*, but nothing else:
If a node goes down, Kubernetes marks this node as *NotReady*, but nothing else happens until [Pod tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration) take effect, which are by default configured with a *timeout* of `300s` (5 minutes!).
```
$ kubectl get node
NAME STATUS ROLES AGE VERSION
@@ -858,7 +805,7 @@ web-2 1/1 Running 0 26
ds-test-c6xx8 1/1 Running 0 18m
ds-test-w45dv 1/1 Running 5 28h
```
Kubernetes supports [Pod-tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration), which are per default configured with a *timeout* of `300s` (5 minutes!). This means, that affected Pods will *remain* for a timespan of 300s on a *broken* node before eviction takes place
Pod tolerations can be determined like this:
```
$ kubectl -n <namespace> describe pod <pod-name>

@@ -868,7 +815,7 @@ Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists fo
[...]
```

To be more reactive Pod-tolerations can be configured as follows:
To be more reactive to node failures, Pod tolerations can be adjusted within the Pod template as follows:
```
kind: Deployment or StatefulSet
apiVersion: apps/v1
@@ -891,10 +838,10 @@ spec:
[...]
```

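The hunk above elides the actual entries; what typically goes into the Pod template looks like this (the `tolerationSeconds` value of `30` is just an example):
```
spec:
  template:
    spec:
      tolerations:
      - key: node.kubernetes.io/not-ready
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 30
      - key: node.kubernetes.io/unreachable
        operator: Exists
        effect: NoExecute
        tolerationSeconds: 30
```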
## Keep your cluster balanced <a name="user-content-keep-cluster-balanced"></a>
Kubernetes, in first place, takes care of high availability, but not of well balance of pod/node.
## Keeping the cluster balanced <a name="user-content-keep-cluster-balanced"></a>
First and foremost, Kubernetes takes care of high availability, but not of balancing pods evenly across nodes.

In case of `Deployment` or `StatefulSet` a `topologySpreadConstraint` needs to be specified:
In case of `Deployment` or `StatefulSet` a [`topologySpreadConstraint`](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/) needs to be specified:
```
kind: Deployment or StatefulSet
apiVersion: apps/v1
@@ -909,7 +856,7 @@ spec:
  topologySpreadConstraints:
  - labelSelector:
      matchLabels:
        app: {{ .Chart.Name }}-{{ .Values.stage }}
        app: the-app
    maxSkew: 1
    topologyKey: kubernetes.io/hostname
    whenUnsatisfiable: DoNotSchedule

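Whether the constraint is honoured can be checked by listing the pods together with the nodes they run on (label as in the snippet above):
```
kubectl get pod -l app=the-app -o wide
```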