- kubectl - BASH autocompletion
- Install k3s
- Namespaces and resource limits
- Persistent volumes (StorageClass - dynamic provisioning)
- Ingress controller
- Cert-Manager (references ingress controller)
- HELM charts
- Kubernetes in action
kubectl - BASH autocompletion
For current shell only:
source <(kubectl completion bash)
Persistent:
echo "source <(kubectl completion bash)" >> ~/.bashrc
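Note: the echo above appends unconditionally on every run. A small idempotent variant (a sketch; the HOOK variable name is ours) only adds the line if it is not there yet:

```shell
# Append the completion hook to ~/.bashrc only if it is not already present,
# so re-running the setup stays idempotent.
HOOK='source <(kubectl completion bash)'
touch ~/.bashrc
grep -qxF "$HOOK" ~/.bashrc || echo "$HOOK" >> ~/.bashrc
```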
Install k3s
On premises
curl -sfL https://get.k3s.io | sh -
If desired, set a memory consumption limit for the systemd unit like so:
root#> mkdir /etc/systemd/system/k3s.service.d
root#> vi /etc/systemd/system/k3s.service.d/limits.conf
[Service]
MemoryMax=1024M
root#> systemctl daemon-reload
root#> systemctl restart k3s
root#> systemctl status k3s
k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/k3s.service.d
└─limits.conf
Active: active (running) since Thu 2020-11-26 10:46:26 CET; 13min ago
Docs: https://k3s.io
Process: 9618 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 9619 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 9620 (k3s-server)
Tasks: 229
Memory: 510.6M (max: 1.0G)
CGroup: /system.slice/k3s.service
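`systemctl show k3s -p MemoryMax` prints the applied limit in bytes; a tiny helper (the mem_mib name is ours) makes that human-readable without depending on a running unit:

```shell
# mem_mib: convert a byte count, as printed by `systemctl show k3s -p MemoryMax`,
# to MiB. Pure arithmetic, so it can be tried without a running k3s unit.
mem_mib() { awk -v b="$1" 'BEGIN { printf "%.1f MiB\n", b / 1048576 }'; }

mem_mib 1073741824   # prints "1024.0 MiB"
```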
Upstream DNS-resolver
Docs: https://rancher.com/docs/rancher/v2.x/en/troubleshooting/dns/
Default: 8.8.8.8 => does not resolve local domains!
- Create a local /etc/resolv.k3s.conf pointing to the IP address of your DNS resolver (127.0.0.1 does not work!)
- vi /etc/systemd/system/k3s.service:
[...]
ExecStart=/usr/local/bin/k3s \
server [...] --resolv-conf /etc/resolv.k3s.conf \
- Re-load systemd config:
systemctl daemon-reload
- Re-start k3s:
systemctl restart k3s.service
- Re-deploy coredns pods:
kubectl -n kube-system delete pod name-of-coredns-pods
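A sketch of the resolver file itself; 192.168.1.1 is a placeholder for your LAN resolver, and the file is built locally first, then installed:

```shell
# Build the dedicated resolver file for k3s. DNS_IP is a placeholder --
# use your real LAN resolver here; 127.0.0.1 does not work for the pods.
DNS_IP=192.168.1.1
printf 'nameserver %s\n' "$DNS_IP" > resolv.k3s.conf
# sudo cp resolv.k3s.conf /etc/resolv.k3s.conf
```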
Change NodePort range to 1 - 65535
- vi /etc/systemd/system/k3s.service:
[...]
ExecStart=/usr/local/bin/k3s \
server [...] --kube-apiserver-arg service-node-port-range=1-65535 \
- Re-load systemd config:
systemctl daemon-reload
- Re-start k3s:
systemctl restart k3s.service
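A variant worth considering (a sketch, same idea as the limits.conf drop-in above): keep the extra flags in a systemd drop-in instead of editing k3s.service, which the k3s installer may overwrite on upgrade. Adjust the server arguments to match your existing ExecStart:

```shell
# Write the drop-in locally, then install it. Note the empty ExecStart= line:
# systemd requires it to clear the old command before setting a new one.
cat > override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s server --kube-apiserver-arg service-node-port-range=1-65535
EOF
# sudo install -D override.conf /etc/systemd/system/k3s.service.d/override.conf
# sudo systemctl daemon-reload && sudo systemctl restart k3s
```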
On Docker with K3d
K3d is a wrapper that deploys a K3s cluster (masters and workers) directly on Docker, without needing a virtual machine for each node (master/worker).
- Prerequisites: a local docker installation without user-namespaces enabled.
- Warning: K3d deploys privileged containers!
curl -s https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash
Create a K3s cluster without Traefik and without the metrics-server:
k3d cluster create cluster1 \
--agents 2 \
--k3s-server-arg '--disable=traefik' \
--k3s-server-arg '--disable=metrics-server' \
--k3s-server-arg '--kube-apiserver-arg=service-node-port-range=1-65535'
If you encounter helm throwing errors like this one:
Error: Kubernetes cluster unreachable
... just do:
$ kubectl config view --raw > ~/kubeconfig-k3d.yaml
$ export KUBECONFIG=~/kubeconfig-k3d.yaml
Namespaces and resource limits
kubectl apply -f https://gitea.zwackl.de/dominik/k3s/raw/branch/master/namespaces_limits.yaml
Persistent Volumes (StorageClass - dynamic provisioning)
Read more about AccessModes
Rancher Local
https://rancher.com/docs/k3s/latest/en/storage/
Only supports AccessMode: ReadWriteOnce (RWO)
Longhorn (distributed in local cluster)
- Requirements: https://longhorn.io/docs/0.8.0/install/requirements/
- Debian:
apt install open-iscsi
- Install: https://rancher.com/docs/k3s/latest/en/storage/
NFS
For testing purposes and simplicity you may use the following NFS container image:
mkdir -p
docker run -d --name nfs-server \
--net=host \
--privileged \
-v /data/docker/nfs-server/data/:/nfsshare \
-e SHARED_DIRECTORY=/nfsshare \
itsthenetwork/nfs-server-alpine:latest
All Nodes need to have the NFS-client package (Ubuntu: nfs-common) installed
helm repo add ckotzbauer https://ckotzbauer.github.io/helm-charts
helm install my-nfs-client-provisioner --set nfs.server=<nfs-server/ip-addr> --set nfs.path=</data/nfs> ckotzbauer/nfs-client-provisioner
Check if NFS StorageClass is available:
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 101d
nfs-client cluster.local/my-nfs-client-provisioner Delete Immediate true 172m
Now you can use nfs-client as StorageClass like so:
apiVersion: apps/v1
kind: StatefulSet
[...]
volumeClaimTemplates:
- metadata:
name: nfs-backend
spec:
accessModes: [ "ReadWriteMany" ]
storageClassName: "nfs-client"
resources:
requests:
storage: 32Mi
or so:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-pvc-1
namespace: <blubb>
spec:
storageClassName: "nfs-client"
accessModes:
- ReadWriteMany
resources:
requests:
storage: 32Mi
Ingress controller
Disable Traefik-ingress
edit /etc/systemd/system/k3s.service:
[...]
ExecStart=/usr/local/bin/k3s \
server --disable traefik --resolv-conf /etc/resolv.conf \
[...]
Finally systemctl daemon-reload and systemctl restart k3s
Enable K8s own NGINX-ingress with OCSP stapling
Installation
This is the helm chart of the K8s own nginx ingress controller: https://kubernetes.github.io/ingress-nginx/deploy/#using-helm
kubectl create ns ingress-nginx
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install my-release ingress-nginx/ingress-nginx -n ingress-nginx
kubectl -n ingress-nginx get all:
NAME READY STATUS RESTARTS AGE
pod/svclb-my-release-ingress-nginx-controller-m6gxl 2/2 Running 0 110s
pod/my-release-ingress-nginx-controller-695774d99c-t794f 1/1 Running 0 110s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/my-release-ingress-nginx-controller-admission ClusterIP 10.43.116.191 <none> 443/TCP 110s
service/my-release-ingress-nginx-controller LoadBalancer 10.43.55.41 192.168.178.116 80:31110/TCP,443:31476/TCP 110s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/svclb-my-release-ingress-nginx-controller 1 1 1 1 1 <none> 110s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/my-release-ingress-nginx-controller 1/1 1 1 110s
NAME DESIRED CURRENT READY AGE
replicaset.apps/my-release-ingress-nginx-controller-695774d99c 1 1 1 110s
As the nginx ingress controller is hungry for memory, let's reduce the number of worker processes to 1:
kubectl -n ingress-nginx edit configmap my-release-ingress-nginx-controller
apiVersion: v1
<<<ADD BEGIN>>>
data:
enable-ocsp: "true"
worker-processes: "1"
<<<ADD END>>>
kind: ConfigMap
[...]
Finally the deployment needs to be restarted:
kubectl -n ingress-nginx rollout restart deployment my-release-ingress-nginx-controller
If you are facing deployment problems like the following one
Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://my-release-ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded
A possible fix: kubectl -n ingress-nginx delete ValidatingWebhookConfiguration my-release-ingress-nginx-admission
Cert-Manager (references ingress controller)
Installation
Docs: https://hub.helm.sh/charts/jetstack/cert-manager
helm repo add jetstack https://charts.jetstack.io
helm repo update
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml
kubectl create namespace cert-manager
helm install cert-manager --namespace cert-manager jetstack/cert-manager
kubectl -n cert-manager get all
Let's Encrypt issuer
Docs: https://cert-manager.io/docs/tutorials/acme/ingress/#step-6-configure-let-s-encrypt-issuer
ClusterIssuers are a resource type similar to Issuers. They are specified in exactly the same way,
but they do not belong to a single namespace and can be referenced by Certificate resources from
multiple different namespaces.
lets-encrypt-cluster-issuers.yaml:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-staging-issuer
spec:
acme:
# You must replace this email address with your own.
# Let's Encrypt will use this to contact you about expiring
# certificates, and issues related to your account.
email: user@example.com
server: https://acme-staging-v02.api.letsencrypt.org/directory
privateKeySecretRef:
# Secret resource that will be used to store the account's private key.
name: letsencrypt-staging-account-key
# Add a single challenge solver, HTTP01 using nginx
solvers:
- http01:
ingress:
class: nginx
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod-issuer
spec:
acme:
# The ACME server URL
server: https://acme-v02.api.letsencrypt.org/directory
# Email address used for ACME registration
email: user@example.com
# Name of a secret used to store the ACME account private key
privateKeySecretRef:
name: letsencrypt-prod-account-key
# Enable the HTTP-01 challenge provider
solvers:
- http01:
ingress:
class: nginx
kubectl apply -f lets-encrypt-cluster-issuers.yaml
Deploying a LE-certificate
All you need is an Ingress resource of class nginx which references a ClusterIssuer (letsencrypt-prod-issuer) resource:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
namespace: <stage>
name: some-ingress-name
annotations:
# use the shared ingress-nginx
kubernetes.io/ingress.class: "nginx"
cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"
spec:
tls:
- hosts:
- some-certificate.name.san
secretName: target-certificate-secret-name
rules:
- host: some-certificate.name.san
http:
paths:
- path: /
backend:
serviceName: some-target-service
servicePort: some-target-service-port
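Note: the manifest above uses networking.k8s.io/v1beta1, which was removed in Kubernetes 1.22. On newer clusters the same Ingress (same hypothetical names and placeholders; the port number 80 stands in for some-target-service-port) would look like this sketch:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: <stage>
  name: some-ingress-name
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"
spec:
  # replaces the kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
  - hosts:
    - some-certificate.name.san
    secretName: target-certificate-secret-name
  rules:
  - host: some-certificate.name.san
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: some-target-service
            port:
              number: 80   # stand-in for some-target-service-port
```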
Troubleshooting
Docs: https://cert-manager.io/docs/faq/acme/
ClusterIssuers are cluster-scoped (they do not belong to a namespace):
kubectl get clusterissuer
kubectl describe clusterissuer <object>
All other ingress-specific cert-manager resources live in specific namespaces:
kubectl -n <stage> get certificaterequest
kubectl -n <stage> describe certificaterequest <object>
kubectl -n <stage> get certificate
kubectl -n <stage> describe certificate <object>
kubectl -n <stage> get secret
kubectl -n <stage> describe secret <object>
kubectl -n <stage> get challenge
kubectl -n <stage> describe challenge <object>
After a successful setup, perform a TLS test: https://www.ssllabs.com/ssltest/index.html
HELM charts
Docs:
Prerequisites:
- running kubernetes installation
- kubectl with ENV[KUBECONFIG] pointing to appropriate config file
- helm
Create a chart
helm create helm-test
~/kubernetes/helm$ tree helm-test/
helm-test/
├── charts
├── Chart.yaml
├── templates
│ ├── deployment.yaml
│ ├── _helpers.tpl
│ ├── hpa.yaml
│ ├── ingress.yaml
│ ├── NOTES.txt
│ ├── serviceaccount.yaml
│ ├── service.yaml
│ └── tests
│ └── test-connection.yaml
└── values.yaml
Install local chart without packaging
helm install helm-test-dev helm-test/ --set image.tag=latest --debug --wait
or just a dry-run:
helm install helm-test-dev helm-test/ --set image.tag=latest --debug --dry-run
--wait: Waits until all Pods are in a ready state, PVCs are bound, Deployments have their minimum (Desired minus maxUnavailable)
number of Pods ready and Services have an IP address (and Ingress, if a LoadBalancer) before marking the release as successful.
It will wait for as long as the --timeout value; if the timeout is reached, the release will be marked as FAILED. Note: in
scenarios where a Deployment has replicas set to 1 and maxUnavailable is not set to 0 as part of the rolling update strategy,
--wait returns as ready as soon as the minimum number of Pods is in ready condition.
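The "Desired minus maxUnavailable" threshold above is plain arithmetic; a throwaway sketch (the min_ready helper is ours, not a helm feature):

```shell
# min_ready: the smallest number of ready pods with which `helm upgrade --wait`
# considers a Deployment ready (replicas minus maxUnavailable).
min_ready() { echo $(( $1 - $2 )); }

min_ready 3 1   # prints 2
min_ready 1 1   # prints 0 -- the replicas=1 caveat from the note above
```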
List deployed helm charts
~/kubernetes/helm$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
helm-test-dev default 4 2020-08-27 12:30:38.98457042 +0200 CEST deployed helm-test-0.1.0 1.16.0
Upgrade local chart without packaging
~/kubernetes/helm$ helm upgrade helm-test-dev helm-test/ --set image.tag=latest --wait --timeout 60s
Release "helm-test-dev" has been upgraded. Happy Helming!
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace default port-forward $POD_NAME 8080:80
helm upgrade [...] --wait is synchronous and exits with 0 on success, otherwise with >0 on failure. By default helm upgrade waits for 5 minutes; this can be changed with the --timeout flag. This makes it usable for CI/CD deployments with Jenkins.
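A minimal sketch of such a pipeline gate. deploy() is a stand-in so the control flow runs anywhere; the release/chart names in the comment mirror the examples above:

```shell
# Sketch of a CI gate around a synchronous helm upgrade. In a real Jenkins
# stage the body of deploy() would be something like:
#   helm upgrade helm-test-dev helm-test/ --install --wait --timeout 120s
deploy() { return "${DEPLOY_RC:-0}"; }

if deploy; then
  echo "release healthy"
else
  echo "deploy failed"   # a real pipeline could `helm rollback` here
  exit 1
fi
```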
Get status of deployed chart
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace default port-forward $POD_NAME 8080:80
Get deployment history
~/kubernetes/helm$ helm history helm-test-dev
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
10 Thu Aug 27 12:56:33 2020 failed helm-test-0.1.0 1.16.0 Upgrade "helm-test-dev" failed: timed out waiting for the condition
11 Thu Aug 27 13:08:34 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
12 Thu Aug 27 13:09:59 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
13 Thu Aug 27 13:10:24 2020 superseded helm-test-0.1.0 1.16.0 Rollback to 11
14 Thu Aug 27 13:23:22 2020 failed helm-test-0.1.1 blubb Upgrade "helm-test-dev" failed: timed out waiting for the condition
15 Thu Aug 27 13:26:43 2020 pending-upgrade helm-test-0.1.1 blubb Preparing upgrade
16 Thu Aug 27 13:27:12 2020 superseded helm-test-0.1.1 blubb Upgrade complete
17 Thu Aug 27 14:32:32 2020 superseded helm-test-0.1.1 Upgrade complete
18 Thu Aug 27 14:33:58 2020 superseded helm-test-0.1.1 Upgrade complete
19 Thu Aug 27 14:36:49 2020 failed helm-test-0.1.1 cosmetics Upgrade "helm-test-dev" failed: timed out waiting for the condition
Rollback
helm rollback helm-test-dev 18 --wait
~/kubernetes/helm$ helm history helm-test-dev
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
10 Thu Aug 27 12:56:33 2020 failed helm-test-0.1.0 1.16.0 Upgrade "helm-test-dev" failed: timed out waiting for the condition
11 Thu Aug 27 13:08:34 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
12 Thu Aug 27 13:09:59 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
13 Thu Aug 27 13:10:24 2020 superseded helm-test-0.1.0 1.16.0 Rollback to 11
14 Thu Aug 27 13:23:22 2020 failed helm-test-0.1.1 blubb Upgrade "helm-test-dev" failed: timed out waiting for the condition
15 Thu Aug 27 13:26:43 2020 pending-upgrade helm-test-0.1.1 blubb Preparing upgrade
16 Thu Aug 27 13:27:12 2020 superseded helm-test-0.1.1 blubb Upgrade complete
17 Thu Aug 27 14:32:32 2020 superseded helm-test-0.1.1 Upgrade complete
18 Thu Aug 27 14:33:58 2020 superseded helm-test-0.1.1 Upgrade complete
19 Thu Aug 27 14:36:49 2020 failed helm-test-0.1.1 cosmetics Upgrade "helm-test-dev" failed: timed out waiting for the condition
20 Thu Aug 27 14:37:36 2020 deployed helm-test-0.1.1 Rollback to 18
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 14:37:36 2020
NAMESPACE: default
STATUS: deployed
REVISION: 20
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace default port-forward $POD_NAME 8080:80
Kubernetes in action
Running DaemonSets on hostPort
- Docs: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
- Good article: https://medium.com/stakater/k8s-deployments-vs-statefulsets-vs-daemonsets-60582f0c62d4
In this case no Service networking needs to be configured.
This setup is suitable for legacy scenarios where static IP addresses are required and a NodePort service is not an alternative:
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: netcat-daemonset
labels:
app: netcat-daemonset
spec:
selector:
matchLabels:
app: netcat-daemonset
template:
metadata:
labels:
app: netcat-daemonset
spec:
containers:
- command:
- nc
- -lk
- -p
- "23456"
- -v
- -e
- /bin/true
env:
- name: DEMO_GREETING
value: Hello from the environment
image: dockreg-zdf.int.zwackl.de/alpine/latest/amd64:prod
imagePullPolicy: Always
name: netcat-daemonset
ports:
- containerPort: 23456
hostPort: 23456
protocol: TCP
resources:
limits:
cpu: 500m
memory: 64Mi
requests:
cpu: 50m
memory: 32Mi
restartPolicy: Always
securityContext: {}
terminationGracePeriodSeconds: 30
updateStrategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
Running StatefulSet with NFS storage
- https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
- NFS dynamic volume provisioning deployed
Be careful: StatefulSets are designed for stateful applications (like databases). To avoid split-brain scenarios StatefulSets behave as statically as possible. If a node goes down, the StatefulSet controller will not reschedule its pods to other functioning nodes; automatic rescheduling only happens for stateless Deployments! In this case you need to force the rescheduling by hand like this:
kubectl delete pod web-1 --grace-period=0 --force
More details on this can be found here
If you want DaemonSet-like Node-affinity on StatefulSets then read this
---
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx
serviceName: "nginx"
replicas: 2
template:
metadata:
labels:
app: nginx
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: nginx:alpine
ports:
- containerPort: 80
name: web
volumeMounts:
- name: nfs-backend
mountPath: /nfs-backend
volumeClaimTemplates:
- metadata:
name: nfs-backend
spec:
accessModes: [ "ReadWriteMany" ]
storageClassName: "nfs-client"
resources:
requests:
storage: 32Mi
Services
Client-IP transparency and loadbalancing
apiVersion: v1
kind: Service
[...]
spec:
type: NodePort
externalTrafficPolicy: <<Local|Cluster>>
[...]
externalTrafficPolicy: Cluster (default) spreads the incoming traffic evenly over all pods. To achieve this the client IP address must be source-NATted, so it is not visible to the pods.
externalTrafficPolicy: Local preserves the original client IP address, which is visible to the pods. In either case (DaemonSet or StatefulSet) traffic stays on the node that received it. With a StatefulSet, if more than one pod of a ReplicaSet is scheduled on the same node, the workload gets balanced over all pods on that node.
Session affinity/persistence
apiVersion: v1
kind: Service
[...]
spec:
type: NodePort
sessionAffinity: <<ClientIP|None>>
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10
[...]
Session persistence is only possible based on the client IP address (sessionAffinity: ClientIP).
What happens if a node goes down?
If a node goes down, Kubernetes marks this node as NotReady, but does nothing else:
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-node2 Ready <none> 103d v1.19.5+k3s2
k3s-master Ready master 103d v1.19.5+k3s2
k3s-node1 NotReady <none> 103d v1.19.5+k3s2
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
ds-test-5mlkt 1/1 Running 14 28h
my-nfs-client-provisioner-57ff8c84c7-p75ck 1/1 Running 0 31m
web-1 1/1 Running 0 26m
web-2 1/1 Running 0 26m
ds-test-c6xx8 1/1 Running 0 18m
ds-test-w45dv 1/1 Running 5 28h
Kubernetes has something like a --pod-eviction-timeout, which is a grace period (default: 5 minutes) for deleting pods on failed nodes. This timeout is useful to keep pods on nodes which are rebooted for maintenance. So, first of all, nothing happens to the pods on failed nodes until the pod eviction timeout is exceeded. After that, Kubernetes reschedules stateless Deployments to working nodes. DaemonSets as well as StatefulSets will not be rescheduled to other nodes at all.
Docs: https://kubernetes.io/docs/concepts/scheduling-eviction/eviction-policy/
Keep your cluster balanced
Kubernetes primarily takes care of high availability, not of an even distribution of pods across nodes. This project could be a solution! Pod/node balance is not an issue for DaemonSets.
Node maintenance
Mark a node for maintenance:
$ kubectl drain k3s-node2 --ignore-daemonsets
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-node1 Ready <none> 105d v1.19.5+k3s2
k3s-master Ready master 105d v1.19.5+k3s2
k3s-node2 Ready,SchedulingDisabled <none> 105d v1.19.5+k3s
All Deployment as well as StatefulSet pods have been rescheduled on remaining nodes. DaemonSet pods were not touched! Node maintenance can be performed now.
To bring the maintained node back into the cluster:
$ kubectl uncordon k3s-node2
node/k3s-node2 uncordoned