Notes on Docker and Kubernetes

Docker

  • It uses cgroups (control groups) to limit resource access per process.

  • It uses namespaces (isolating resources per process).

  • A container represents an application, which may include one or more services.

  • Docker Swarm and Kubernetes can be compared. Kubernetes is more sophisticated: it provides built-in tools for logging and monitoring, supports auto-scaling, and does better rolling updates, but needs manual load-balancing configuration (Docker Swarm does automatic load balancing). Docker Swarm relies on 3rd-party tools like ELK for monitoring, whereas k8s has better built-in tools.

  • Docker does not run init, cron, or syslog processes. You may want to use phusion/baseimage or phusion/passenger-docker if you want to avoid accumulating zombie processes. If you want to run multiple processes in the background, you can use the phusion images or a supervisord process.

  • Example Dockerfile :

    # syntax=docker/dockerfile:1
    FROM node:12-alpine
    RUN apk add --no-cache python2 g++ make
    WORKDIR /app
    COPY . .
    RUN yarn install --production
    CMD ["node", "src/index.js"]
    
  • Build and run docker image:

    docker build -t getting-started .             # Tag the image.
    docker run -dp 3000:3000 getting-started  <my_optional_cmd>
    
  • You can have multiple RUN commands executed while building the image. You can have at most one CMD, which is the default command; you can override it at invocation time.

  • docker-compose is useful:

    - To bring up an application that depends on multiple containers in a specific order.
    - To create the proper networking artifacts among them; port-forwarding settings are easy to specify.
    - Each container in the docker-compose.yaml file is called a service.
    

    The service name can be used as a hostname from other containers.

  • To wait for one docker container before starting another container, you can make use of the health-check feature in docker-compose. That enforces a dependency wait before starting the other containers. See https://stackoverflow.com/questions/31746182/docker-compose-wait-for-container-x-before-starting-y
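
    For example, a minimal docker-compose.yml sketch (service names, images, and credentials are placeholders), assuming Docker Compose v2 / the Compose Specification where depends_on conditions are supported:

    services:
      db:
        image: mysql:8.0
        environment:
          MYSQL_ROOT_PASSWORD: example            # placeholder credential
        healthcheck:
          test: ["CMD", "mysqladmin", "ping", "-h", "localhost"]
          interval: 5s
          timeout: 3s
          retries: 10
      web:
        build: .
        ports:
          - "3000:3000"                           # host:container port forwarding
        depends_on:
          db:
            condition: service_healthy            # wait for db healthcheck before starting web
        # The service name "db" is usable as a hostname from this container.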

  • You can use Docker volumes to share a directory between the host and containers. This breaks isolation, but is a practical way to share resources.
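
    A quick illustration (paths and image names are placeholders):

    docker run -d -p 3000:3000 -v "$(pwd)":/app getting-started    # bind-mount current dir into /app
    docker volume create mydata                                     # named volume managed by Docker
    docker run -d -e MYSQL_ROOT_PASSWORD=example \
               -v mydata:/var/lib/mysql mysql:8.0                   # persist MySQL data in the named volume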

  • Commands:

    docker version
    docker ps
    docker images
    docker run hello-world
    docker run -it ubuntu bash  
    docker run busybox echo hi there # we override the default command
    docker run busybox ping google.com; docker ps --all
    docker login
    docker create hello-world
    docker start
    ## docker run === docker create + docker start
    docker-compose -v
    docker system prune   ## Delete unused images
    docker logs  <container-id>
    docker stop <container-id>   ## Sends SIGTERM
    docker kill <container-id>   ## Sends SIGKILL
    docker run redis
    docker exec -it <container-id> redis-cli  ## Run additional command on running container.
                    ## -it - interactive, allows stdin to be forwarded.
    docker exec -it <container-id> bash       ## Most common use
    
    ## Create a new image from a running container's current state (a snapshot), optionally setting a new default CMD.
    docker commit -c 'CMD ["redis-server"]' <container-id>
    
    docker run -p 8080:80 <image-id>  ## Forward localhost:8080 to container's port 80.
    
    ## Dockerfile vs docker-compose.yml
    docker-compose up            ## Similar to:  docker run my-image
    docker-compose up --build    ## Rebuild and run
    
    docker-compose ps   ## Looks for ./docker-compose.yml to list containers belonging to this.
    

Kubernetes

  • See https://kubernetes.io/docs/concepts/overview/components/

  • Cluster nodes - Master Node and Worker nodes

  • Self-healing, i.e. reschedules and replaces containers.

  • Gives Pods independent IP addresses and DNS resolution.

  • Can auto-mount storage volumes from a public cloud or local storage.

  • Master Node Components:

    - API Server:

      • REST API server.
      • Reads/writes into etcd.
      • Can scale horizontally.
      • Can configure secondary API servers so that the primary acts as a proxy load balancer.
      • Clients can interact directly with worker nodes for the main application workload; traffic need not be routed through the master. The API Server only provides the configuration entry point (service endpoint lookup is handled by the cluster DNS add-on).

    - Scheduler:

      • Highly configurable using policies, plugins and profiles.
      • Assigns new pods to nodes.
      • The API Server provides the new object's requirements and global node resource usage (from etcd).
      • Considers requirements for QoS, data locality, affinity, cluster topology, etc.
      • The decision is communicated to the API Server, which later delegates work to other control plane agents.

    - etcd: distributed key/value store:

      • Only the API Server reads from and writes to etcd.
      • Writes are always appended; the store is periodically compacted.
      • etcdctl provides backup, snapshot, and restore for a single-node cluster (non-production).
      • For production, we need etcd in HA mode.
      • Stacked topology if etcd lives with the master node(s).
      • By default the kubeadm cluster bootstrapping tool provisions the stacked configuration.
      • Both stacked and external etcd configurations support HA.
      • etcd is based on the Raft consensus algorithm.
      • At any given time, one node in the group is the leader and the rest are followers; etcd gracefully handles leader elections and can tolerate node failures, including leader failures. Any node can become the leader.
      • Stores cluster state and configuration details such as subnets, ConfigMaps, Secrets, etc.

    - Controller managers:

      • Watch loops that compare the cluster's desired state with its current state.
      • On a mismatch, the operation is delegated to the relevant controller manager.

    - kube-controller-manager:

      • Fixes pod counts when pods go down or new ones are needed.
      • Creates endpoints, service accounts, and API access tokens.

    - cloud-controller-manager:

      • Interfaces with cloud provider services for nodes, storage, load balancing and routing.

    There can be many master nodes for HA. etcd is arguably a better distributed key/value store than ZooKeeper: https://news.ycombinator.com/item?id=18687516
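
    As a side note on the etcdctl backup mentioned above, a minimal sketch (endpoints and certificate/backup paths are assumptions; adjust for your cluster):

    ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
        --cacert=/etc/kubernetes/pki/etcd/ca.crt \
        --cert=/etc/kubernetes/pki/etcd/server.crt \
        --key=/etc/kubernetes/pki/etcd/server.key \
        snapshot save /backup/etcd-snapshot.db
    # Restore into a fresh data directory:
    ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-snapshot.db --data-dir=/var/lib/etcd-restored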

  • Worker node components: kubelet, kube-proxy, Pods. The kubelet is the node agent that runs containers as directed by the scheduler. kube-proxy is the networking proxy agent that maintains subnets and routing and helps with load balancing. Each Pod can contain many container (e.g. Docker) instances.

  • A Node is like an EC2 instance, a Pod is like a VM, and a container is a Docker instance. Kubernetes does not handle containers directly; the smallest unit of management is a Pod, which is an abstraction layer. Note that a container could be Docker, rkt, or even a VM. A Pod is a logical host: containers in a Pod share the same IP and port space, and containers in the same Pod can share logical volumes. Multi-container Pods may use one container as a helper, proxy, or bridge for another (e.g. an nginx reverse proxy in front of a Node.js application). We avoid bundling many processes in a single container, since one process per container is easier to troubleshoot and reuse; hence a Pod as a collection of containers makes more sense.

  • You can deploy brokers to Kubernetes as StatefulSets. This ensures every broker has a unique identity and that the persistent volumes (data) of each broker remain attached to the same pod (and are never interchanged between pods). See the sketch below.
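
    A minimal StatefulSet sketch illustrating the stable identity and per-replica volume claim (the broker image, service name, and storage size are placeholders):

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: broker
    spec:
      serviceName: broker-headless     # headless Service that gives each pod a stable DNS name
      replicas: 3                      # pods get ordinal identities: broker-0, broker-1, broker-2
      selector:
        matchLabels:
          app: broker
      template:
        metadata:
          labels:
            app: broker
        spec:
          containers:
          - name: broker
            image: my-broker:latest    # placeholder image
            volumeMounts:
            - name: data
              mountPath: /var/lib/broker
      volumeClaimTemplates:            # one PVC per replica, re-attached to the same pod identity
      - metadata:
          name: data
        spec:
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 10Gi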

  • Start minikube with a VM driver:

    minikube start --driver=hyperv
    # By default docker driver will be used instead of VM driver e.g. hyperv
    # The docker driver may not work properly.
    
  • kubeadm tool: used for provisioning a cluster. Mainly used to automate and test clusters, not so much in production:

    kubeadm init --apiserver-advertise-address=192.168.3.19
      --pod-network-cidr=192.168.0.0/16
    # It prepares the host as masternode
    # If you want to uninstall master node config: kubeadm reset
    # For worker nodes to join the cluster, run following as root there:
    kubeadm join 192.168.3.19:6443 --token asdfghjklll  ...
    
  • File locations and files:

    - clientpod.yaml : Specifies the Pod configuration:

      apiVersion: v1            # apiVersion limits the valid "kind" values.
      kind: Pod
      metadata:
        name: client-pod
      spec:
        containers:
        - name: client
          image: myuser/mysql
          ports: <exposed-ports>

    • client-node-port.yaml : Specifies node port mappings:

      kind: Service                 # In the Kubernetes context, a Service is a network service.
      metadata:
        name: client-node-port      # The Service name creates a pseudo hostname with an IP:
                                    #   curl http://client-node-port:8080   (inter-pod communication)
      spec:
        type: NodePort              # NodePort | ClusterIP | LoadBalancer
        ports:
        - port: 8080
          targetPort: 80
          nodePort: 32100           # Applied to the top-level VM IP:
                                    #   http://VM-IP:32100 is auto load balanced to http://pod-IP:80
                                    # (every Pod has a unique IP).
                                    # Better to leave nodePort out; it is auto-assigned from 30000-32767.
        selector:
          component: web            # containerPort => Node-Port => Host-Port mapping.
                                    # Host:port is load balanced across all matching Pods.

    • Config files are used to create objects of type: Pod | Service | StatefulSet | ReplicationController

    • /var/lib/kubelet/config.yaml

    • /etc/kubernetes/pki/ (certificates for API server and for healthchecks)

    • /etc/kubernetes/admin.conf, kubelet.conf, controller-manager.conf, scheduler.conf, manifests/kube-scheduler.yaml, etc

    • The control plane components run as static Pods, as defined in the manifests/*.yaml files.

  • To create a password (Secret):

    # kubectl create secret ...
    # kubectl get secrets
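
    For instance (the secret name and literal value are placeholders):

    kubectl create secret generic mysql-pass --from-literal=password='S0meP@ss'
    kubectl get secrets
    kubectl describe secret mysql-pass     # shows the keys but not the values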
    
  • To deploy a MySQL container on your cluster, you just need a .yaml file:

    # kubectl create -f  \
       https://k8s.io/examples/application/wordpress/mysql-deployment.yaml
    # // The yaml file specifies ports and a pvc (persistent volume claim for 20GB).
    
    # kubectl get pvc
    
    # kubectl get pods -o wide
       NAME          READY   STATUS    RESTARTS   AGE   IP           NODE        
       hello-bzrzk   1/1     Running   0          22s   10.244.1.2   multinode-demo-m02
       hello-frcvw   1/1     Running   0          22s   10.244.0.3   multinode-demo  
    
    # kubectl get services  # Lists kubernetes, mysql, wordpress etc
          # Service type could be: ClusterIP, NodePort, or LoadBalancer
          # (Ingress is a separate object, not a Service type).
          # i.e. how the service endpoint is exposed: internal / node port / load balanced.
          # Exposure: ClusterIP < NodePort < LoadBalancer
    
    # minikube service wordpress --url
        http://192.168.99.100:31536  # Gives you service URL 
    
    # kubectl get deployments
    
    # kubectl get services
    
       NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
       kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   59m  // Built-in Service
    
    # minikube ip
      172.31.235.139
    
    # You can do: docker ps and docker kill to kill a container
    # but it will be auto-restarted by kubernetes.
    
    # Every Pod gets its own IP. You never directly access a Pod by its IP;
    # you always connect to a Pod using the node's IP and the forwarded port.
    
    # kubectl describe pods     ## Longer form details.
    # minikube docker-env       ## Displays docker env variables
    # eval $(minikube docker-env)  ## Makes the local docker client point to the docker daemon in the node VM.
    # docker system prune -a    ## Will delete all unused images, networks, etc.
    
    # // Rollout deployment
    
    # kubectl create deployment mynginx --image=nginx:1.15-alpine
    # kubectl get deploy,rs,po      ## deployments,replicasets,pods
    #                               ## -l app=mynginx to filter deployments.
    # // scale the deployment up to three replicas.
    # kubectl scale deploy mynginx --replicas=3
    # kubectl describe deployment
    # // The image is nginx:1.15-alpine.
    # // So far we only have a single revision, called revision 1.
    # kubectl rollout history deploy mynginx  [ --revision=1 ]
    
    # // Let's upgrade our image.
    # kubectl set image deployment mynginx nginx=nginx:1.16-alpine
    # // Implicit roll out is kicked off after setting new image.
    
    # kubectl rollout history deploy mynginx --revision=2
    
    # ## it shows the 1.16-alpine image, the updated image.
    # Now, let's take a quick look at our objects.
    
    # kubectl get deploy,rs,po      ## deployments,replicasets,pods
    #
    # // It shows the replicasets are migrated from old ones to new ones.
    # // To rollback to previous version ...
    # kubectl rollout undo deployment mynginx --to-revision=1
    #
    # Revision 3 is created, which represents a rollback to revision 1.
    # By default, up to 10 revisions are kept for rolling updates and rollbacks.
    # Rolling updates and rollbacks are not specific to Deployments only;
    # they are supported by other controllers as well,
    # such as DaemonSets and StatefulSets.
    
  • To Clean up:

    # kubectl delete secret mysql-pass
    # kubectl delete deployment -l app=wordpress
      deployment "wordpress" deleted
      deployment "wordpress-mysql" deleted  (Note: dependents auto deleted)
    
    # kubectl delete service -l app=wordpress
      service "wordpress" deleted
      service "wordpress-mysql" deleted 
    
    # kubectl delete pvc -l app=wordpress
    
    # kubectl delete -f  mypod.yaml  # Delete pod that was created using this.
    
    # Note: Deployment is responsible to run set of pods.
    #       Service is responsible for network access for pods.
    
  • Install minikube on Windows:

    # minikube start --driver=hyperv
    * minikube v1.25.1 on Microsoft Windows 10 Pro 10.0.19042 Build 19042                                     
    * Using the hyperv driver based on user configuration                                                     
    * Downloading VM boot image ...                                                                           
        > minikube-v1.25.0.iso.sha256: 65 B / 65 B [-------------] 100.00% ? p/s 0s                           
        > minikube-v1.25.0.iso: 226.25 MiB / 226.25 MiB  100.00% 29.49 MiB p/s 7.9s                           
    * Starting control plane node minikube in cluster minikube                                                
    * Downloading Kubernetes v1.23.1 preload ...                                                              
        > preloaded-images-k8s-v16-v1...: 504.42 MiB / 504.42 MiB  100.00% 29.30 Mi                           
    * Creating hyperv VM (CPUs=2, Memory=6000MB, Disk=20000MB) ...                                            
    * Preparing Kubernetes v1.23.1 on Docker 20.10.12 ...                                                     
      - kubelet.housekeeping-interval=5m                                                                      
      - Generating certificates and keys ...                                                                  
      - Booting up control plane ...                                                                          
      - Configuring RBAC rules ...                                                                            
    * Verifying Kubernetes components...                                                                      
      - Using image gcr.io/k8s-minikube/storage-provisioner:v5                                                
    * Enabled addons: storage-provisioner, default-storageclass                                               
    * kubectl not found. If you need it, try: 'minikube kubectl -- get pods -A'                               
    * Done! kubectl is now configured to use "minikube" cluster and "default" namespace by default                         
    
    # minikube kubectl -- get pods -A
        > kubectl.exe.sha256: 64 B / 64 B [----------------------] 100.00% ? p/s 0s
        > kubectl.exe: 45.62 MiB / 45.62 MiB [---------] 100.00% 12.56 MiB p/s 3.8s
    
    NAMESPACE     NAME                               READY   STATUS    RESTARTS        AGE
    kube-system   coredns-64897985d-fdbdj            1/1     Running   0               3m26s
    kube-system   etcd-minikube                      1/1     Running   0               3m39s
    kube-system   kube-apiserver-minikube            1/1     Running   0               3m39s
    kube-system   kube-controller-manager-minikube   1/1     Running   0               3m39s
    kube-system   kube-proxy-qv5jp                   1/1     Running   0               3m25s
    kube-system   kube-scheduler-minikube            1/1     Running   0               3m39s
    kube-system   storage-provisioner                1/1     Running   1 (2m55s ago)   3m37s
    
    #  minikube status
    
    minikube
    type: Control Plane
    host: Running
    kubelet: Running
    apiserver: Running
    kubeconfig: Configured
    
    # minikube kubectl -- version
    Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", ...}
    Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.1", ...}
    
    $ kubectl cluster-info
    Kubernetes control plane is running at https://172.31.235.139:8443
    CoreDNS is running at https://172.31.235.139:8443/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
    
  • minikube IP vs ClusterIP:

    - The minikube IP is the top-level VM IP address (typically something like 192.168.x.y or 172.21.131.216).
    - The ClusterIP is a virtual IP within that VM (typically something like 10.96.0.1).

  • Kubernetes Limits:

    - No more than 110 pods per node
    - No more than 5000 nodes
    - No more than 150000 total pods
    - No more than 300000 total containers
    
  • You can execute commands on specific pod like below:

    kubectl exec mypod1 -c container1 -- /bin/cat /usr/share/nginx/html/index.html
    # See for more info: https://www.mirantis.com/blog/multi-container-pods-and-container-communication-in-kubernetes/
    
  • You can build a cluster imperatively or declaratively, as sketched below.
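
    A minimal illustration (the file name is a placeholder):

    # Imperative: state the action directly
    kubectl create deployment mynginx --image=nginx:1.15-alpine
    kubectl scale deployment mynginx --replicas=3

    # Declarative: describe the desired state in a file and apply it
    kubectl apply -f nginx-deployment.yaml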

  • Non-minikube installation order:

    kubeadm init --config=configfile.yaml
    kubeadm init --apiserver-advertise-address=192.168.3.19   # By default, the default network interface is used.
                 --pod-network-cidr=192.168.0.0/16
    
    # Now an empty cluster is available; worker nodes can join.
    kubeadm join 192.168.3.19:6443 --token asdfghjklll  ...   # Execute this from worker node.
    sudo kubeadm init phase control-plane all --config=configfile.yaml
    sudo kubeadm init phase etcd local --config=configfile.yaml
    # you can now modify the control plane and etcd manifest files
    sudo kubeadm init --skip-phases=control-plane,etcd --config=configfile.yaml
    
  • VirtualBox and HyperV Networking:

    Within VirtualBox, a NAT guest always gets the default ethernet interface enp0s3 with inet 10.0.2.15, netmask 255.255.255.0, broadcast 10.0.2.255. The host-side NAT IP is also fixed: 10.0.2.2; however, it is not visible on the host when you run ipconfig or ifconfig.

  • If you want to network the host and the VM, you should create a VirtualBox Host-Only network adapter using the VirtualBox UI. See https://www.techrepublic.com/article/how-to-create-virtualbox-networks-with-the-host-network-manager/ Then, for the guest, choose the networking adapter "Host-Only Adapter" instead of NAT.

  • Common Host-Only network CIDR ranges start with 192.168.*.*/24 (VirtualBox prefers this range) or 172.*.*.*/24 (Hyper-V Kubernetes prefers this range). You should not use the loopback range 127.*.*.*, the link-local range (169.254.0.0/16), or the multicast range (224.0.0.0/24).

  • By default, the minikube HostOnlyCIDR is in the 172.*.*.*/24 range.

  • You can start minikube like below to specify the HostOnlyCIDR:

    minikube start --cpus 2 \
                --memory 2048 \
                --disk-size 20g \
                --vm-driver virtualbox \
                --network-plugin flannel \
                --kubernetes-version v1.12.2 \
                --host-only-cidr 192.168.77.1/24
    
  • When you want MySQL running on the host to be made available to all Pods, here is the content of the mysql-service.yaml file (a Service without a selector, plus a manual Endpoints object):

    ---
    apiVersion: v1
    kind: Service
    metadata:
       name: mysql-service
    spec:
       ports:
       - protocol: TCP
         port: 3306
         targetPort: 3306
    ---
    apiVersion: v1
    kind: Endpoints
    metadata:
      name: mysql-service
    subsets:
      - addresses:
          - ip: 192.168.77.1
        ports:
          - port: 3306
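
    To sanity-check it from inside the cluster, something like the following should work (the client image and credentials are placeholders); "mysql-service" resolves via cluster DNS to the host's 192.168.77.1:3306:

    kubectl apply -f mysql-service.yaml
    kubectl run -it --rm mysql-client --image=mysql:8.0 -- \
        mysql -h mysql-service -u root -p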
    

Pod YAML config File

Example Pod config file:

apiVersion: v1             # Hardcoded as v1 for Pod objects.
kind: Pod                  # Object type is Pod
metadata:
  name: nginx-pod          # Uniquely identifies the Pod.
  labels:
    app: nginx             # Belongs to a class of app=nginx. Used to filter.
spec:                      # Pod config
  containers:              # List of containers
  - name: nginx            # Container name
    image: nginx:1.15.11   # DockerHub image
    ports:
    - containerPort: 80    # Internal container port

Note: a Pod does not express its desired replica count. For that, it is used with one of the controllers: Deployment, ReplicaSet, or ReplicationController. The Deployment .yaml file includes the ReplicaSet specification as well.

Deployment File

  • A Deployment object describes Pods along with the total number of instances (i.e. replicas).

  • It has its own label and config (spec), and also assigns labels and config to the Pods.

  • It can also choose Pods described in another .yaml file using selectors.

  • Example Deployment file:

    apiVersion: apps/v1           # API endpoint of the API Server.
                                  # Also matches the existing version of the Deployment object.
                                  # If the version changes, a new Deployment object will be created.

    kind: Deployment              # Object type is "Deployment"
    metadata:
      name: nginx-deployment
      labels:
        app: nginx
    spec:                          # Configuration.
      replicas: 3                  # Total number of Pods desired
      selector:
        matchLabels:
          app: nginx               # Select Pods carrying this label (possibly defined in another .yaml file)
      template:                    # Pod template starts here.
        metadata:                  # Metadata for the Pod
          labels:                  # Assign labels to the Pod.
            app: nginx             # Must match spec.selector.matchLabels above.

        spec:                      # spec.template.spec is the Pod config.
          containers:              # Containers of the Pod.
          - name: nginx
            image: nginx:1.15.11
            ports:
            - containerPort: 80    # Internal TCP port that the container binds to.

    # Note: The 4 required fields are: apiVersion, kind, metadata and spec.
    #       The extra field "status" is populated by Kubernetes and tracks the
    #       current state vs the desired state.
    

ReplicaSet

ReplicaSet is the next-generation "Replication Controller"; it keeps a desired number of identical Pod replicas running. A ReplicaSet with replica count 3 for a specific Pod template creates identical Pods Pod-1, Pod-2 and Pod-3. Replicas are created or destroyed by continuously reconciling against the "desired state config".
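
A bare ReplicaSet sketch (normally you let a Deployment create and manage it for you):

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-rs
spec:
  replicas: 3                 # desired state: keep 3 identical Pods running
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.15.11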

Labels

Labels are key-value pairs attached to Kubernetes objects such as:

  • Pods (e.g. env=qa and app=frontend)
  • ReplicaSets
  • Nodes
  • Namespaces
  • Persistent Volumes
  • Deployments

You can select a subset of objects using labels.
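
For example (pod name and label values are placeholders):

kubectl get pods -l app=frontend              # equality-based selector
kubectl get pods -l 'env in (qa,staging)'     # set-based selector
kubectl label pod mypod1 tier=backend         # attach a label to an existing object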

Kubernetes Namespaces

If multiple users and teams use the same Kubernetes cluster, we can partition the cluster into virtual sub-clusters using Namespaces. Object names need to be unique only within the same Namespace:

$ kubectl get namespaces             # These 4 created out-of-the-box.
NAME              STATUS       AGE
default           Active       11h   # Default one for new objects.
kube-node-lease   Active       11h   # node heartbeat data.
kube-public       Active       11h   # Public info about cluster.
kube-system       Active       11h   # Control plane agents

Provides multi-tenancy.
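
For example (the namespace name is a placeholder):

kubectl create namespace team-a
kubectl get pods -n team-a                                 # scope a query to one namespace
kubectl get pods --all-namespaces                          # or -A
kubectl config set-context --current --namespace=team-a   # change the default namespace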

Kubernetes Service

  • An abstract way to expose an application running on a set of Pods as a network service.

  • Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods, and can load-balance across them.

  • Service definition uses a selector to filter the Pods for which the service should apply:

    apiVersion: v1
    kind: Service
    metadata:
      name: my-service
    spec:
      selector:
        app: MyApp
      ports:
        - protocol: TCP
          port: 80
          targetPort: 9376
    
  • This specification creates a new Service object named "my-service", which targets TCP port 9376 on any Pod with the app=MyApp label.

  • Kubernetes assigns this Service an IP address (sometimes called the "cluster IP"), which is used by the Service proxies.

  • The controller for the Service selector continuously scans for Pods that match its selector, and then POSTs any updates to an Endpoints object (via the API server), also named "my-service".
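
    You can observe this with the following (assuming the my-service example above has matching Pods; the output below is illustrative, IPs will differ):

    kubectl get endpoints my-service
    # NAME         ENDPOINTS                          AGE
    # my-service   10.244.1.5:9376,10.244.2.7:9376    1m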

Kubernetes Dashboard

# Install Kubernetes Dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.2.0/aio/deploy/recommended.yaml

# Patch the dashboard to allow skipping login
kubectl patch deployment kubernetes-dashboard -n kubernetes-dashboard --type 'json' \
     -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--enable-skip-login"}]'

# Install Metrics Server
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.2/components.yaml

# Patch the metrics server to work with insecure TLS
kubectl patch deployment metrics-server -n kube-system --type 'json' \
  -p '[{"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--kubelet-insecure-tls"}]'

# Run the Kubectl proxy to allow accessing the dashboard
kubectl proxy

Kubernetes Provisioning For testing - KIND, minikube, k3s

  • kubeadm - Does not provision hosts, but does everything else.
  • kubespray - Install on AWS, Azure, GCE, OpenStack, vSphere, or bare metal. Ansible based.
  • kops, kube-aws, etc - Kubernetes incubator projects. Install on cloud using cli.
  • https://github.com/kelseyhightower/kubernetes-the-hard-way
  • minikube notes:
    • minikube --driver=none (on Linux): Runs Kubernetes components directly on the host OS without a VM. Docker is required.
    • minikube --driver=docker: Does not need a VM or virtualization enabled on the host; uses an existing Docker install.
  • Worker nodes support the Windows Server 2019 operating system.
  • Type-2 hypervisor examples: VirtualBox and KVM2 (Docker and Podman are container-based drivers, not hypervisors).
  • Hyper-V is a Type-1 hypervisor-based architecture available on Windows.
  • Deployments vs StatefulSets vs Daemonsets: See https://medium.com/stakater/k8s-deployments-vs-statefulsets-vs-daemonsets-60582f0c62d4

Containerd, Docker, Kubernetes, Lxc Linux Containers

  • See https://www.cloudsavvyit.com/10075/what-is-containerd-and-how-does-it-relate-to-docker-and-kubernetes/
  • Since Docker was developer-oriented, using the Docker API from Kubernetes was heavyweight, so dockershim was created (now deprecated). The Docker runtime was extracted into a separate component called containerd and donated to the CNCF.
  • containerd is the container runtime used by Docker and is compliant with the OCI spec. Alternative runtimes include CRI-O and runC.
  • LXC Linux Containers are like mini virtual machines, whereas Docker containers are designed to run a single application only.

Learn Kubernetes the Hard Way

  • See https://github.com/kelseyhightower/kubernetes-the-hard-way 30K stars!
  • See Videos: https://www.youtube.com/watch?v=2bVK-e-GuYI Just Me and OpenSource channel by Venkat.
  • lxc to create containers.
  • netstat -nltp # Display listening ports along with the program name and PID.
  • CRI is Container Runtime Interface. CRI-containerd is CRI compatible containerd, modern alternative to dockershim.
  • CRI-O enables the use of any Open Container Initiative (OCI) compatible runtime with Kubernetes such as runC and clear Containers.
  • frakti enables CRI implementation through hardware virtualization. It supports Kata Containers.
  • You typically create a container image of your application and push it to a registry before referring to it in a Pod.
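
    A typical flow sketch (registry, username, and tag are placeholders):

    docker build -t myuser/myapp:1.0 .
    docker push myuser/myapp:1.0                          # push to Docker Hub (or your private registry)
    kubectl create deployment myapp --image=myuser/myapp:1.0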

Node Agent

  • Node agent is also known as kubelet.
  • The kubelet connects to container runtimes using plugin based interface called CRI.
  • The CRI consists of:
    • protocol buffers - A serialization format like JSON, but binary. Uses an IDL (Interface Description Language) to specify data structures and to generate code that parses or creates the data stream. Similar to Thrift, except that Thrift also generates RPC client/server code, which protobuf by itself does not (gRPC fills that role).
    • gRPC - Google RPC. Modern RPC system designed in Google in 2015:
      • Uses protobuf for data format.
      • Uses bidirectional streaming and flow control.
      • Uses HTTP/2 and TLS.
    • libraries and other tools.
    • The CRI implements two services: ImageService and RuntimeService.
    • The ImageService is responsible for all the image-related operations,
    • The RuntimeService is responsible for all the Pod and container-related operations.

Worker Node Proxy - kube-proxy

  • The kube-proxy is the network agent which runs on each node.

  • The kube-proxy is responsible for TCP, UDP, and SCTP stream forwarding or round-robin forwarding across a set of Pod backends, and it implements forwarding rules defined by users through Service API objects.

  • Kubernetes Networking model:

    • Treats Pods as VMs so every Pod gets their own IP.
    • Containers within a Pod communicate with each other using localhost. (No IP per container.)
  • Container Network Interface:

    * It is a CNCF (Cloud Native Computing Foundation) project.
    * See https://github.com/containernetworking/cni/blob/master/SPEC.md
    * It consists of:
    
    • A specification and libraries for writing plugins to configure network interfaces in Linux containers.
    • Also includes a number of supported plugins.
    • The spec enables development of 3rd-party networking plugins compatible with the Kubernetes networking model.
    • Some 3rd-party networking plugin implementations compatible with the Kubernetes networking model are:
      • Project Calico (Layer 3 virtual network)
      • Amazon ECS CNI Plugins
      • Knitter - Supports multiple networks
      • Azure CNI - Extends Azure Virtual Networks to containers
    • Some built-in core reference implementation supporting CNI:
      • See https://github.com/containernetworking/plugins
      • bridge - Creates a bridge network between host and container.
      • ipvlan - Adds ipvlan interface in container.
      • loopback - Sets the state of the loopback interface to up.
      • vlan - Allocates a vlan device.
  • Networking challenges:

    • Container-to-container communication inside Pods (using the network namespace).
    • Pod-to-Pod communication on the same node or across cluster nodes.
    • Pod-to-Service communication within the same namespace and across the cluster.
    • External-to-Service communication for clients to access applications.
  • Kubernetes enables external accessibility through Services:

    • Services are complex encapsulations of network routing rule definitions, stored in iptables on cluster nodes and implemented by kube-proxy agents.
    • Applications become accessible from outside the cluster over a virtual IP address.

Worker Node Addons

  • Addons are cluster features implemented through 3rd-party pods and services:
    • DNS: Cluster DNS
    • Dashboard: Web interface for cluster management.
    • Monitoring: Collects cluster-level container metrics to central data store
    • Logging: Collects Logs to central log store.
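
  • On minikube, several of these addons can be listed and enabled directly (addon names vary by minikube version):

    minikube addons list
    minikube addons enable dashboard
    minikube addons enable metrics-server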