From 2fc6180697fcc507692d72ba914f3aedc66b0912 Mon Sep 17 00:00:00 2001 From: Dominik Chilla Date: Sat, 27 Nov 2021 20:56:14 +0100 Subject: [PATCH] refinement --- README.md | 189 ++++++++++++++++++++---------------------------------- 1 file changed, 68 insertions(+), 121 deletions(-) diff --git a/README.md b/README.md index b375a0d..be4c055 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ * [kubectl - BASH autocompletion](#kubectl-bash-autocompletion) * [Install k3s](#install-k3s) - * [On on-premises](#install-k3s-on-premises) + * [On premises/IaaS](#install-k3s-on-premises) * [Configure upstream DNS-resolver](#upstream-dns-resolver) * [Change NodePort range](#nodeport-range) * [Clustering](#clustering) @@ -8,9 +8,8 @@ * [Namespaces and resource limits](#namespaces-limits) * [Persistent volumes (StorageClass - dynamic provisioning)](#pv) * [Rancher Local](#pv-local) - * [Rancher Longhorn - distributed in local cluster](#pv-longhorn) + * [Rancher Longhorn (distributed in local cluster) - MY FAVOURITE :-)](#pv-longhorn) * [NFS](#pv-nfs) - * [Seaweedfs](#pv-seaweedfs) * [Ingress controller](#ingress-controller) * [Disable Traefik-ingress](#disable-traefik-ingress) * [Enable NGINX-ingress with OCSP stapling](#enable-nginx-ingress) @@ -31,12 +30,12 @@ * [Get deployment history](#helm-history) * [Rollback](#helm-rollback) * [Kubernetes in action](#kubernetes-in-action) - * [Running DaemonSets on `hostPort`](#running-daemonsets) + * [Running DaemonSets with `hostNetwork: true`](#running-daemonsets) * [Running StatefulSet with NFS storage](#running-statefulset-nfs) * [Services](#services) * [Client-IP transparency and loadbalancing](#services-client-ip-transparency) * [Session affinity/persistence](#services-session-persistence) - * [Keep your cluster balanced](#keep-cluster-balanced) + * [Keeping the cluster balanced](#keep-cluster-balanced) * [Node maintenance](#node-maintenance) * [What happens if a node goes down?](#what-happens-node-down) * [Dealing with disruptions](#disruptions) @@ -55,36 +54,12 @@ echo "source <(kubectl completion bash)" >> ~/.bashrc ``` # Install k3s -## On premises +## On premises/IaaS https://k3s.io/: ``` curl -sfL https://get.k3s.io | sh - ``` -If disired, set a memory consumption limit of the systemd-unit like so: -``` -root#> mkdir /etc/systemd/system/k3s.service.d -root#> vi /etc/systemd/system/k3s.service.d/limits.conf -[Service] -MemoryMax=1024M -root#> systemctl daemon-reload -root#> systemctl restart k3s - -root#> systemctl status k3s -k3s.service - Lightweight Kubernetes - Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled) - Drop-In: /etc/systemd/system/k3s.service.d - └─limits.conf - Active: active (running) since Thu 2020-11-26 10:46:26 CET; 13min ago - Docs: https://k3s.io - Process: 9618 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS) - Process: 9619 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS) - Main PID: 9620 (k3s-server) - Tasks: 229 - Memory: 510.6M (max: 1.0G) - CGroup: /system.slice/k3s.service - -``` ### Upstream DNS-resolver Docs: https://rancher.com/docs/rancher/v2.x/en/troubleshooting/dns/ @@ -114,7 +89,7 @@ ExecStart=/usr/local/bin/k3s \ If you want to build a K3s-cluster the default networking model is *overlay@VXLAN*. In this case make sure that * all of your nodes can reach (ping) each other over the underlying network (local, routed/vpn). This is required for the overlay network to work properly. VXLAN spans a mashed network over all K3s-nodes. 
* if your nodes are spread over public networks (like the internet) use a VPN (like IPSec or OpenVPN) to secure the traffic between the nodes. **VXLAN uses plain UDP for transport!** -* if your nodes are connected through VPN, `flannel` (overlay network daemon) should explicitly communicate over the vpn network interface instead of the public network interface. Following settings should be made on the nodes: +* if your nodes are connected through VPN, `flannel` (overlay network daemon) should explicitly communicate via the vpn network interface instead of the public network interface. Following settings should be made on the nodes: ``` /etc/systemd/system/k3s-agent.service: @@ -164,7 +139,7 @@ and change the content to ``` forward . ipaddr.of.your.dns-resolver ``` -Finally redeploy the CoreDNS deployment with: +Finally re-deploy the CoreDNS deployment with: `kubectl -n kube-system rollout restart deployment coredns` **Note:** If you restart the cluster (`k3d cluster stop your-cluster` and `k3d cluster start your-cluster`), the changes will be gone! @@ -180,7 +155,7 @@ Read more about [AccessModes](https://kubernetes.io/docs/concepts/storage/persis https://rancher.com/docs/k3s/latest/en/storage/ Only supports *AccessMode*: ReadWriteOnce (RWO) -## Longhorn (distributed in local cluster) +## Rancher Longhorn (distributed in local cluster) - MY FAVOURITE :-) * Requirements: https://longhorn.io/docs/0.8.0/install/requirements/ * Debian: `apt install open-iscsi` * Install: https://rancher.com/docs/k3s/latest/en/storage/ @@ -239,11 +214,6 @@ spec: requests: storage: 32Mi ``` -## Seaweedfs -Docs: https://github.com/seaweedfs -Docs: https://github.com/seaweedfs/seaweedfs-csi-driver - -In order to use the CSI driver you need to have a working seaweedfs-cluster. As seaweedfs is really lightweight it can be deployed on a bunch (at least three) of raspberries (min. version 3) as well as on the K3s cluster too. # Ingress controller ## Disable Traefik-ingress @@ -251,7 +221,7 @@ edit /etc/systemd/system/k3s.service: ``` [...] ExecStart=/usr/local/bin/k3s \ - server --disable traefik --resolv-conf /etc/resolv.conf \ + server [...] --disable traefik \ [...] 
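# Hint (not part of the original unit file): other bundled K3s components can be
# disabled the same way with additional flags, e.g. "--disable servicelb" or
# "--disable metrics-server".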
``` Finally `systemctl daemon-reload` and `systemctl restart k3s` @@ -264,36 +234,37 @@ https://kubernetes.github.io/ingress-nginx/deploy/#using-helm ``` kubectl create ns ingress-nginx helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx -helm install my-release ingress-nginx/ingress-nginx -n ingress-nginx +helm install nginx-ingress ingress-nginx/ingress-nginx -n ingress-nginx ``` `kubectl -n ingress-nginx get all`: ``` NAME READY STATUS RESTARTS AGE -pod/svclb-my-release-ingress-nginx-controller-m6gxl 2/2 Running 0 110s -pod/my-release-ingress-nginx-controller-695774d99c-t794f 1/1 Running 0 110s +pod/svclb-nginx-ingress-controller-m6gxl 2/2 Running 0 110s +pod/nginx-ingress-controller-695774d99c-t794f 1/1 Running 0 110s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE -service/my-release-ingress-nginx-controller-admission ClusterIP 10.43.116.191 443/TCP 110s -service/my-release-ingress-nginx-controller LoadBalancer 10.43.55.41 192.168.178.116 80:31110/TCP,443:31476/TCP 110s +service/nginx-ingress-controller-admission ClusterIP 10.43.116.191 443/TCP 110s +service/nginx-ingress-controller LoadBalancer 10.43.55.41 192.168.178.116 80:31110/TCP,443:31476/TCP 110s NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE -daemonset.apps/svclb-my-release-ingress-nginx-controller 1 1 1 1 1 110s +daemonset.apps/svclb-nginx-ingress-controller 1 1 1 1 1 110s NAME READY UP-TO-DATE AVAILABLE AGE -deployment.apps/my-release-ingress-nginx-controller 1/1 1 1 110s +deployment.apps/nginx-ingress-controller 1/1 1 1 110s NAME DESIRED CURRENT READY AGE -replicaset.apps/my-release-ingress-nginx-controller-695774d99c 1 1 1 110s +replicaset.apps/nginx-ingress-controller-695774d99c 1 1 1 110s ``` -As nginx ingress is hungry for memory, let´s reduce the number of workers to 1: +The nginx ingress global configuration can be modified as follows: ``` -kubectl -n ingress-nginx edit configmap my-release-ingress-nginx-controller +kubectl -n ingress-nginx edit configmap ingress-nginx-controller apiVersion: v1 <<>> data: enable-ocsp: "true" + use-gzip: "true" worker-processes: "1" <<>> kind: ConfigMap @@ -301,13 +272,13 @@ kind: ConfigMap ``` Finally the deployment needs to be restarted: -`kubectl -n ingress-nginx rollout restart deployment my-release-ingress-nginx-controller` +`kubectl -n ingress-nginx rollout restart deployment ingress-nginx-controller` **If you are facing deployment problems like the following one** ``` -Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://my-release-ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded +Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://nginx-ingress-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded ``` -A possible fix: `kubectl -n ingress-nginx delete ValidatingWebhookConfiguration my-release-ingress-nginx-admission` +A possible fix: `kubectl -n ingress-nginx delete ValidatingWebhookConfiguration ingress-nginx-admission` # Cert-Manager (references ingress controller) ## Installation @@ -325,7 +296,7 @@ kubectl -n cert-manager get all ## Cluster-internal CA Issuer Docs: https://cert-manager.io/docs/configuration/ca/ -## Let´s Encrypt issuer +## Let´s Encrypt 
HTTP-01 issuer Docs: https://cert-manager.io/docs/tutorials/acme/ingress/#step-6-configure-let-s-encrypt-issuer ``` ClusterIssuers are a resource type similar to Issuers. They are specified in exactly the same way, @@ -338,7 +309,7 @@ lets-encrypt-cluster-issuers.yaml: apiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: - name: letsencrypt-staging-issuer + name: letsencrypt-http01-staging-issuer spec: acme: # You must replace this email address with your own. @@ -358,7 +329,7 @@ spec: apiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: - name: letsencrypt-prod-issuer + name: letsencrypt-http01-prod-issuer spec: acme: # The ACME server URL @@ -386,7 +357,7 @@ data: apiVersion: cert-manager.io/v1 kind: ClusterIssuer metadata: - name: letsencrypt-dns01-issuer + name: letsencrypt-dns01-prod-issuer spec: acme: email: user@example.com @@ -411,9 +382,9 @@ spec: `kubectl apply -f lets-encrypt-cluster-issuers.yaml` ## Deploying a LE-certificate with ingress -All you need is an `Ingress` resource of class `nginx` which references a ClusterIssuer (`letsencrypt-prod-issuer`) resource. +All you need is an `Ingress` resource of class `nginx` which references a ClusterIssuer (`letsencrypt-http01-prod-issuer`) resource. -HTTP-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"`): +HTTP-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-http01-prod-issuer"`): ``` apiVersion: networking.k8s.io/v1beta1 kind: Ingress @@ -423,7 +394,7 @@ metadata: annotations: # use the shared ingress-nginx kubernetes.io/ingress.class: "nginx" - cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer" + cert-manager.io/cluster-issuer: "letsencrypt-http01-prod-issuer" spec: tls: - hosts: @@ -438,7 +409,7 @@ spec: serviceName: some-target-service servicePort: some-target-service-port ``` -DNS-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-dns01-issuer"`): +DNS-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-dns01-prod-issuer"`): ``` apiVersion: networking.k8s.io/v1beta1 kind: Ingress @@ -448,7 +419,7 @@ metadata: annotations: # use the shared ingress-nginx kubernetes.io/ingress.class: "nginx" - cert-manager.io/cluster-issuer: "letsencrypt-dns01-issuer" + cert-manager.io/cluster-issuer: "letsencrypt-dns01-prod-issuer" spec: tls: - hosts: @@ -538,7 +509,7 @@ spec: ## Troubleshooting Docs: https://cert-manager.io/docs/faq/acme/ -ClusterIssuer runs in default namespace: +ClusterIssuers are *visible* any namespaces: ``` kubectl get clusterissuer kubectl describe clusterissuer @@ -555,7 +526,9 @@ kubectl -n get challenge kubectl -n describe challenge ``` -After successfull setup perform a TLS-test: `https://www.ssllabs.com/ssltest/index.html` +After successfull setup perform a TLS-test: +* https://testssl.sh/ (`apt install testssl.sh`) +* https://www.ssllabs.com/ssltest/index.html # HELM charts Docs: @@ -690,13 +663,12 @@ NOTES: ``` # Kubernetes in action -## Running DaemonSets on `hostPort` -* Docs: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/ -* Good article: https://medium.com/stakater/k8s-deployments-vs-statefulsets-vs-daemonsets-60582f0c62d4 +## Running DaemonSets with `hostNetwork: true` +* [Docs: DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) +* [(Security) hints on using `hostNetwork`](https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces) +* [Pod´s DNS policy](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy) -In this case configuration of 
networking in context of services is not needed. - -This setup is suitable for legacy scenarios where static IP-address are required and a NodePort service is not an alternative: +This setup is suitable for scenarios where kubernetes nodes are already running with dual-IP-stack (IPv4 and IPv6) and the Pod needs IPv6 too, but k3s was deployed in ipv4-only mode. In this case the Pod can be deployed in the network namespace of the kubernetes node. ``` kind: DaemonSet apiVersion: apps/v1 @@ -704,54 +676,30 @@ metadata: name: netcat-daemonset labels: app: netcat-daemonset -spec: - selector: - matchLabels: - app: netcat-daemonset - template: - metadata: - labels: - app: netcat-daemonset - spec: - containers: - - command: - - nc - - -lk - - -p - - "23456" - - -v - - -e - - /bin/true - env: - - name: DEMO_GREETING - value: Hello from the environment - image: dockreg-zdf.int.zwackl.de/alpine/latest/amd64:prod - imagePullPolicy: IfNotPresent - name: netcat-daemonset - ports: - - containerPort: 23456 - hostPort: 23456 - protocol: TCP - resources: - limits: - cpu: 500m - memory: 64Mi - requests: - cpu: 50m - memory: 32Mi - restartPolicy: Always - securityContext: {} - terminationGracePeriodSeconds: 30 - updateStrategy: - rollingUpdate: - maxUnavailable: 1 - type: RollingUpdate +spec: + selector: + matchLabels: + app: netcat-daemonset + template: + metadata: + labels: + app: netcat-daemonset + spec: + hostNetwork: true + dnsPolicy: ClusterFirstWithHostNet + restartPolicy: Always + terminationGracePeriodSeconds: 10 + containers: + - name: alpine-netcat-daemonset + image: alpine + imagePullPolicy: IfNotPresent + command: ["nc", "-lk", "-p", "23456", "-v", "-e", "/bin/true"] ``` ## Running StatefulSet with NFS storage -* https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/ +* [Docs: StatefulSet](https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/) * [NFS dynamic volume provisioning deployed](#pv-nfs) -**Be careful:** *StatefulSets* are designed for stateful applications (like databases). To avoid split-brain scenarios StatefulSets behave as static as possible. If a node goes down, the StatefulSet controller will reschedule the pods to another nodes, which can meet the requirements! If you want to force a re-scheduling: +StatefulSets are designed for stateful applications (like databases). To avoid split-brain scenarios StatefulSets behave as static as possible. If a node goes down, the StatefulSet controller will reschedule the pods to another node, that meets the required conditions! If you want to force a re-scheduling: `kubectl delete pod web-1 --grace-period=0 --force` More details on this can be found [here](https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/) @@ -824,7 +772,6 @@ spec: `externalTrafficPolicy: Local` preserves the original client ip-address which is visible to the PODs. In any case (`DaemonSet` or `StatefulSet`) traffic remains on the Node which gets the traffic. In case of `StatefulSet` if more than one POD of a `ReplicaSet` is scheduled on the same Node, the workload gets balanced over all PODs on the same Node. - ### Session affinity/persistence ``` apiVersion: v1 @@ -841,7 +788,7 @@ spec: Session persistence (`None` by default) is only supported on client ip-address. Cookie-stickiness or stickiness on any other/higher layer is not supported yet. ## What happens if a node goes down? 
-If a node goes down kubernetes marks this node as *NotReady*, but nothing else:
+If a node goes down, Kubernetes marks this node as *NotReady*, but nothing else happens until [Pod tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration) take effect, which by default are configured with a *timeout* of `300s` (5 minutes!).
```
$ kubectl get node
NAME STATUS ROLES AGE VERSION
kmaster1 Ready control-plane,etcd,master 28h v1.20.0+k3s2
kworker1 Ready 28h v1.20.0+k3s2
kworker2 NotReady 28h v1.20.0+k3s2

$ kubectl get pod -o wide
web-0 1/1 Running 0 26
web-2 1/1 Running 0 26
ds-test-c6xx8 1/1 Running 0 18m
ds-test-w45dv 1/1 Running 5 28h
```
-Kubernetes supports [Pod-tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration), which are per default configured with a *timeout* of `300s` (5 minutes!). This means, that affected Pods will *remain* for a timespan of 300s on a *broken* node before eviction takes place
+Pod tolerations can be inspected like this:
```
$ kubectl -n <namespace> describe pod <pod-name>

@@ -868,7 +815,7 @@
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists fo
[...]
```
-To be more reactive Pod-tolerations can be configured as follows:
+To be more reactive to node failures, Pod tolerations can be adjusted within the Pod template as follows:
```
kind: Deployment or StatefulSet
apiVersion: apps/v1
[...]
spec:
[...]
@@ -891,10 +838,10 @@ spec:
[...]

-## Keep your cluster balanced
-Kubernetes, in first place, takes care of high availability, but not of well balance of pod/node.
+## Keeping the cluster balanced
+First and foremost, Kubernetes takes care of high availability, not of balancing pods evenly across the nodes.

-In case of `Deployment` or `StatefulSet` a `topologySpreadConstraint` needs to be specified:
+In case of `Deployment` or `StatefulSet`, a [`topologySpreadConstraint`](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/) needs to be specified:
```
kind: Deployment or StatefulSet
apiVersion: apps/v1
@@ -909,7 +856,7 @@ spec:
[...]
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
-            app: {{ .Chart.Name }}-{{ .Values.stage }}
+            app: the-app
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
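        # Explanatory comments, not part of the original example:
        # maxSkew: 1 allows the number of matching pods to differ by at most 1
        #   between any two topology domains.
        # topologyKey: kubernetes.io/hostname makes every node its own topology domain,
        #   i.e. pods are spread across individual nodes.
        # whenUnsatisfiable: DoNotSchedule leaves a pod Pending rather than violating
        #   the constraint; ScheduleAnyway would turn the spreading into a soft preference.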