If you have any questions or run into problems, please ping in the #infra-k8s-learning Slack channel.
Each day is intended to be doable in about an hour.
Day 1: Minikube and Pod basics
Complete Hello Minikube. Also note that you can install minikube via a homebrew cask: `brew install minikube`. If you want to hop on the VPN, use good old virtualbox instead of xhyve and omit the `--vm-driver=xhyve` flag. Make sure you are off the VPN before creating the minikube machine and don’t have any virtualbox host-only interfaces hanging around (`big machine delete-host-only-networks` to clear those up). If using xhyve, make sure to chown/setuid the docker-machine-xhyve stuff as in the doc. Here’s what I typically use to start minikube: `minikube start --insecure-registry registry.banno-internal.com --cpus 4 --memory 4096`.
If you’re having connection issues with minikube, try restarting your computer. This has worked for people using Docker for Mac.
Go through Kubernetes By Example - Pods.
kubectl get gets used a lot… play with different variations of it while running a pod from before:
- `kubectl get pod`
- `kubectl get pod/$POD_NAME`
- `kubectl get pod $POD_NAME`
- `kubectl get pod/$POD_NAME -o yaml`
- `kubectl get node`
- `kubectl get node -o wide`
There’s also a kubectl describe which is a detailed report on a resource. Try out a kubectl describe on a pod and node.
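For reference, here’s roughly the manifest behind the pod from the Kubernetes By Example tutorial (the image and port follow that tutorial; treat this as a sketch):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sise
  labels:
    app: sise
spec:
  containers:
  - name: sise
    image: mhausenblas/simpleservice:0.5.0  # the tutorial's sample service
    ports:
    - containerPort: 9876
```

Save it as `pod.yaml`, apply it with `kubectl apply -f pod.yaml`, and compare it against what `kubectl get pod/sise -o yaml` shows you — the apiserver fills in a lot of defaults.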
Questions:
- Do pods share an IP address? How about containers in a pod?
- What does the `--port` flag on `kubectl run` do?
- How do you view logs for a pod?
- How do you exec into a container? How is the connection to the container done?
- How do you update a pod? A deployment?
Day 2: Learning more about kubectl
kubectl is your bread and butter for interacting with the kubernetes control plane. Today we’re going to investigate some of the things you can do with it. Launch a nginx deployment with kubectl run nginx --image=nginx.
Here’s some commands to try out:
- `kubectl get pods`
- `kubectl exec -it $nginx_pod -- bash` where `$nginx_pod` is the pod from earlier. This launches an executable within the container of the pod. Note that the executable must already exist in the container. Since this container is Debian based, you can `apt-get update && apt-get install -y curl` to `curl http://localhost`.
- When we ran our nginx pod earlier, we didn’t tell the run command to expose any service (the `--expose` flag). However, we can still force a port forward locally with `kubectl port-forward $nginx_pod 8080:80`. Hit http://localhost:8080 in your browser. This TCP traffic is multiplexed over the HTTP connection to the Kubernetes apiserver.
- `kubectl get pods $nginx_pod -o yaml > nginx-pod.yaml` to get the current pod as a YAML file.
- `kubectl logs -f $nginx_pod` will follow the stdout/stderr of the pod.
- `kubectl explain pod.spec` will explain any part of any Kubernetes resource object. It’s very helpful for finding quick documentation.
- `kubectl scale deployments/nginx --replicas=2` to scale out the deployment of nginx to two pods. `kubectl get pods` to see. We’ll go over deployments in the next few days.
- `kubectl rollout status deployment/nginx` to see the current deployment’s progress. It will also wait on the current deployment to finish. Try scaling up the nginx to 10 in one terminal and run the status in another.
- `kubectl edit deployment/nginx` to edit the current deployment. Try adding a new `foo: bar` label under `spec.template.metadata.labels`.
- `kubectl rollout history deployment/nginx` to see the revisions of our change.
- `kubectl run --rm -i --tty --restart=Never busybox --image=busybox -- sh` to run a quick pod that will be removed after you exit out.
- `kubectl top pod $pod_name` after enabling heapster via `minikube addons enable heapster`. heapster is a lightweight metrics collector that Kubernetes uses for auto-scaling.
Overview of kubectl and the kubectl user guide have all the different things you can do.
- install the kubectl completion (might be done already if you’re using homebrew)
It keeps the base configuration in $HOME/.kube/config, which keeps track of the server certificate and the ID token for authentication to a cluster.
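A trimmed-down sketch of what that file looks like for minikube (names, paths, and the IP are placeholders — your generated config will differ):

```yaml
apiVersion: v1
kind: Config
clusters:
- name: minikube
  cluster:
    server: https://192.168.99.100:8443        # the apiserver endpoint
    certificate-authority: ~/.minikube/ca.crt  # used to verify the server cert
users:
- name: minikube
  user:
    client-certificate: ~/.minikube/client.crt
    client-key: ~/.minikube/client.key
contexts:
- name: minikube
  context:          # a context ties a cluster to a user (and optionally a namespace)
    cluster: minikube
    user: minikube
current-context: minikube
```

`kubectl config view` will print your real config (with the key material redacted), and `kubectl config use-context` switches between clusters.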
Questions:
- What does `kubectl apply -f nginx-pod.yaml` do? What if there’s an existing pod with the same name?
- What does `kubectl delete -f nginx-pod.yaml` do?
- For a different UX, try `minikube dashboard`
- How do you pronounce `kubectl`?
Day 3: Labels and Services
So each pod gets its own IP address. But how does one pod find another pod to connect to? We don’t want to have to re-invent service discovery all over again. This is where Kubernetes Services come in. Pods are all allocated IPs out of a network CIDR. But there is a whole other CIDR that Kubernetes uses for what’s called a Service. A Service is an allocated virtual IP address that proxies to one of a group of pods matched by Labels. Every Kubernetes resource can have plain key/value labels in the metadata of the object.
Read Kubernetes By Example - Labels for more information on labels.
When a service is created (in basic terms):
- kubernetes allocates an IP address from its service CIDR pool and remembers it for all time
- kubernetes finds the pods that match the label selector of the service and stores those as service endpoints.
- a `kube-proxy` process runs on all nodes. It gets updates from the Kubernetes API whenever a service is created or its endpoints change (you’ll learn that everything goes through the Kubernetes apiserver). When a service IP is added or modified, `kube-proxy` changes the iptables rules of the running system so that whenever that service IP is requested, the kernel sends the TCP traffic to one of the backing pod IPs.
A service isn’t limited to being a virtual IP of type ClusterIP (as Kubernetes calls it); it can also be of type NodePort, which allocates a free port on every node (from the 30000–32767 range by default) that forwards traffic to the cluster IP. All NodePort services also have a ClusterIP.
If you’re running within a cloud provider (the best way to run Kubernetes), there’s an additional service type called LoadBalancer. It is a NodePort service type, but Kubernetes will also allocate a load balancer in whatever cloud you’re in and point it at the nodes’ allocated node port. If a load balancer has already been allocated, Kubernetes will reuse it and just add listeners.
Read Services for more on services.
Not only does kubernetes do nice things with ClusterIPs, it integrates them with DNS. Applications can then just refer to simple names like nginx instead. Kubernetes sets the DNS search domains to the current namespace (we’ll learn more about namespaces later) and to a level above that, i.e. nginx.infrastructure for the nginx service in the infrastructure namespace.
It also does environment variables like the old docker-compose used to do for services in the same namespace, but it’s not used that often.
Read DNS Pods and Services to learn all the gory details about DNS.
Here’s some things to try out:
- deploy our nginx from before and set a label on it with `kubectl label pod $nginx_pod app=nginx`, then create a `ClusterIP` service matching the `app=nginx` label on port 80 using YAML, applying it with `kubectl apply -f service.yaml`. The `app` label is typically used to match a service against pods.
- minikube adds DNS support out of the box. Start a quick shell pod with ubuntu or busybox and `curl http://nginx`. What is the virtual IP?
- try `kubectl get endpoints` to get all the backing pod IPs of services. It’s helpful for finding mismatched labels or seeing whether a service is fully down.
- change the nginx service to be a `NodePort` and connect to the node port of the minikube machine.
- add a named port to your nginx service. Look up the SRV records for the service.
- There’s yet another type of service: `ExternalName`. It can set up a service as an alias to an external host outside the cluster. It’s very helpful for migrating workloads like an ingress (https://github.com/kubernetes/ingress-nginx/pull/629). Create an `ExternalName` service for `api.banno.com` as just `api`. Then try to curl it from within the cluster with `curl -kv https://api/ping`. To be able to curl it, you’ll need to start an ubuntu pod with `kubectl run --rm -i --tty --restart=Never ubuntu --image=ubuntu -- bash` and then `apt-get update && apt-get install -y curl`.
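For the first bullet, a minimal `service.yaml` might look like this (the `app=nginx` label and port 80 come from the exercise; the service name is an assumption):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: ClusterIP
  selector:
    app: nginx      # matches pods labeled app=nginx
  ports:
  - port: 80        # the service's virtual port
    targetPort: 80  # the container port on the matched pods
```

After `kubectl apply -f service.yaml`, check `kubectl get svc nginx` for the allocated cluster IP and `kubectl get endpoints nginx` for the backing pod IPs.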
Questions:
- What happens when you try to connect to a service that’s not backed by any pods?
- What do you think happens to the source IP of a connection from a service? See Source IP for more information.
- What’s the difference between using kube-proxy as a network proxy vs iptables? Can kube-proxy be a network proxy?
- In the crowded JHA network, can the pod CIDR and service CIDR overlap with any JHA services?
Day 4: Pod Lifecycles & Container Probes
Pod lifecycles are important for both application resiliency and smooth rollouts.
There’s some good reading on health checks:
- Utilizing Kubernetes Liveness and Readiness Probes to Automatically Recover from Failure
- Kubernetes By Example - Health Checks
- Pod Lifecycle
Do the examples on Liveness and Readiness Probes
Go through Configure Pod Initialization and learn how a pod can have init containers to do some work before starting other containers.
Read Attach Handlers to Container Lifecycle Events and configure handlers on nginx.
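Probes and lifecycle hooks are configured per container in the pod spec. Here’s a sketch for nginx (the paths, timings, and preStop command are illustrative, not prescribed by the docs above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
    ports:
    - containerPort: 80
    readinessProbe:        # gates service traffic until the container can serve
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:         # kubelet restarts the container if this keeps failing
      httpGet:
        path: /
        port: 80
      periodSeconds: 15
    lifecycle:
      preStop:             # runs before the container is sent SIGTERM
        exec:
          command: ["nginx", "-s", "quit"]
```

Watch `kubectl describe pod nginx` to see probe failures and restarts show up as events.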
Questions:
- What’s the difference between a readiness probe and a liveness probe?
- Where can you find out why a pod has been restarted?
- When a container OOMs, where is it logged and what happens?
Day 5: Deployments
Coming from Marathon, Deployments will look pretty familiar. They’re pretty analogous to a Marathon app. They’re declarative, in that you declare what and how many you need and Kubernetes takes care of the rest. It handles crashed or deleted Pods to make sure the number of desired instances is always up.
Read Deployments and follow the examples to get the low down on Deployments. There’s a lot of little things to try in there.
If you want to learn a bit about Horizontal Pod Autoscaling, or HPA for short, go through this quick walkthrough. It uses an underlying Deployment as the container, but changes the number of replicas based on CPU load. NOTE: in minikube, you must enable the heapster addon before trying out HPA: `minikube addons enable heapster`
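The walkthrough drives this with `kubectl autoscale`, but the resulting object can also be written as YAML. A sketch targeting the nginx deployment from earlier (the thresholds are arbitrary; the apiVersion may differ by cluster version):

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:        # which object to scale
    apiVersion: apps/v1beta1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50  # scale out when average CPU exceeds 50%
```

This is roughly equivalent to `kubectl autoscale deployment nginx --min=1 --max=10 --cpu-percent=50`; check `kubectl get hpa` to see current vs. target utilization.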
There’s a nice little tool for helping tail multiple pods called kubetail. It’s handy for quick debugging and a lightweight alternative to Kibana. For the nginx deployment from before, tail all the pods with `kubetail nginx`. Try to do a label selector as well.
Questions:
- Can you get all pods for a certain deployment with a filter passed to `kubectl get pods`? How about using a label selector?
- What command can you run to wait for a deployment to finish?
- How do you force a restart of all pods in a deployment?
- How do you detect a stalled or failed deployment? How do you rollback?
- How can health probes be used to ensure a graceful deployment rollout?
- How would you do a canary deployment using deployments?
Day 6: Namespaces, ConfigMaps, & Secrets
Namespaces provide a scope for Kubernetes objects. They provide good buckets for a team or a bunch of things that are closely related.
Follow the namespaces tutorial for some more information on namespaces.
The pods that run within a namespace get their DNS search path set to look in their current namespace first. For instance, if I have a foo service in the same namespace a pod is running in, that pod can just use the foo A record to connect to it. But if the foo service is in a different namespace, say bar, then the pod has to use the foo.bar A record to refer to it. In either case, foo.bar will work, and the full name will work as well: foo.bar.svc.cluster.local.
Namespaces also can be used to put a resource quota on teams: Resource Quotas
Role-based access control (which we’ll go over on a later day) can constrain permissions within a namespace or cluster-wide.
ConfigMaps provide a benefit over plain environment variables in that they’re treated as key/value configs that can be shared between pods, either as environment variables or as files that can be mounted in.
Read:
- Configure Containers Using a ConfigMap
- learn how to use them in a pod with Use ConfigMap Data in Pods
- standup redis using Configuring Redis using a ConfigMap
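The docs above cover both consumption styles; here’s a compact sketch showing a ConfigMap surfaced as an environment variable and as a mounted file in one pod (all names and values are made up):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: debug          # consumed below as an env var
  config.properties: |      # consumed below as a mounted file
    greeting=hello
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "env; cat /etc/config/config.properties; sleep 3600"]
    env:
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: LOG_LEVEL
    volumeMounts:
    - name: config
      mountPath: /etc/config
  volumes:
  - name: config
    configMap:
      name: app-config
```

`kubectl logs app` should show the env var and the file contents; try editing the ConfigMap afterward to test the last question below.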
Alongside ConfigMaps are something called Secrets. They work very similarly to a ConfigMap; however, their usage is more restricted and they’re a little harder to work with.
Read: Secrets for the overall design of secrets. Currently, they are stored in plaintext in etcd. That said, with Tectonic, etcd uses SSL between peers and clients and is only accessible from the Kubernetes masters. All other control plane traffic is over SSL. There’s work in Kubernetes 1.8 to encrypt and rotate the secrets in etcd.
Go through Distribute Credentials Securely Using Secrets
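One of the “harder to work with” bits is that Secret values in the `data` field must be base64-encoded (e.g. `echo -n s3cr3t | base64`). A sketch with made-up credentials:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  username: YWRtaW4=    # base64 of "admin"
  password: czNjcjN0    # base64 of "s3cr3t"
```

Pods consume Secrets the same two ways as ConfigMaps: via `secretKeyRef` in `env`, or mounted as files through a `secret` volume.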
Questions:
- When would you use a `Secret` over a `ConfigMap`? How about vice versa?
- How does a pod see an updated `ConfigMap`? Does the pod need to be restarted? Or is the `ConfigMap` file just updated automatically? Try it out
- How can you see Banno using namespaces?
Day 7: Ingress
So far, we’ve been mostly concerned with some of the lower abstractions of Kubernetes like Pods and Services. Now we’re going to start to learn some of the higher abstractions that Kubernetes has that build on those Pods and Services.
The first we’ll cover is Ingress. Services are called a Layer 4 (TCP & UDP over IP) construct. See OSI model - Wikipedia for more information on what layers represent. Each higher layer builds on the layers below it.
Ingress is a Layer 7 construct used to represent HTTP services. Read Kubernetes Ingress - Jay Gorrell to get a grasp of ingress. The article is a little outdated, but still very relevant.
When using minikube, you must minikube addons enable ingress to enable the ingress controller. It’s normally enabled within Kubernetes clusters, but for some reason, minikube leaves it disabled.
Follow along with Ingress Resources - Kubernetes for more in depth explanation of Ingress.
Using the echoserver deployment in the nginx ingress controller repository: create an Ingress that sends /foo HTTP requests to the service. Minikube (really Kubernetes) will deploy and start an nginx ingress controller deployment and service by default when an Ingress is created. You’ll need to find the NodePort on which it’s running. By default, Kubernetes will put this nginx-ingress-controller in the kube-system namespace.
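A sketch of what that Ingress could look like (the service name `echoserver` and port 8080 are assumptions — match them to the echoserver service you create; the apiVersion is from the Ingress era this doc describes):

```yaml
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: echoserver
spec:
  rules:
  - http:
      paths:
      - path: /foo               # requests to /foo...
        backend:
          serviceName: echoserver  # ...route to this service
          servicePort: 8080
```

Then hit `http://$(minikube ip):$INGRESS_NODE_PORT/foo` to see the echoserver respond.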
There are several more docs and examples in the ingress-nginx repository.
Kubernetes uses annotations, which are sort of like labels but not as ad-hoc. They’re sort of a leaky abstraction, able to add specific things for a certain implementation that others might not support. Look through the nginx supported annotations for some of the things they can be used for.
Questions:
- One thing we didn’t talk about was TLS and the ingress support of it. How does it work? If you want to spend a lot of time on it: try generating your own cert and getting your echoserver ingress above to use it. Also see kube-lego
- How does an `Ingress` relate to a `Service` of type `LoadBalancer`? What would a `LoadBalancer` point to?
- What other ingress controllers are there besides nginx?
Day 8: DaemonSets & Volumes & StatefulSets, oh my!
Today, we’re going to finish workload abstractions with DaemonSets and StatefulSets. And we’ll cover volumes as an aside as they’re usually pertinent to StatefulSet.
First, DaemonSets. They’re pretty simple. A DaemonSet is a way to ensure that a Pod is scheduled on every node, or on a given subset of nodes that match a selector. As nodes get created and restarted, the kubernetes controller ensures that each node has a Pod for that DaemonSet. Read Daemon Sets for more information and examples. Their spec is pretty similar to a Deployment’s in that you provide a pod template and it will create the pods. They’ll be used in our setup for things like log aggregation and host metrics collection. Tectonic also runs a few kubernetes components inside DaemonSets: kube-apiserver, kube-flannel, kube-proxy, as well as the container-linux-update-agent, a special daemon that interacts with the Container Linux OS to coordinate evicting workloads for restarts from releases. Container Linux releases fairly often for security fixes and upgrades. One of the main tenets of Container Linux is that updates are the most effective way to improve server security and those updates should be automatic. We’ll learn about pod eviction and disruption budgets in the next day as ways of making sure that updates don’t bring down a service fully.
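A minimal DaemonSet sketch for the log-aggregation use case mentioned above (the name, image, and hostPath are illustrative; the apiVersion may differ by cluster version):

```yaml
apiVersion: apps/v1beta2
kind: DaemonSet
metadata:
  name: log-agent
spec:
  selector:
    matchLabels:
      app: log-agent
  template:                 # a pod template, just like a Deployment's
    metadata:
      labels:
        app: log-agent
    spec:
      containers:
      - name: agent
        image: fluent/fluentd:v0.12
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:           # read the node's own logs
          path: /var/log
```

`kubectl get pods -o wide` will show one agent pod per node.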
You’ve already seen a little of volumes, as ConfigMaps and Secrets are volume types that can be mounted into a Pod. But those volumes only live as long as the Pod is alive. There’s also an emptyDir volume type along with a ton of other types. If you want a volume to live longer than a Pod, you need to use a PersistentVolume. They’re used via a binding between volume creation and a volume claim. Read Persistent Volumes for how this works. There’s also something called dynamic volume provisioning using a StorageClass Kubernetes resource. Instead of creating PersistentVolumes manually, you can set up a StorageClass when you’re running within a cloud provider. When that is used and a PersistentVolumeClaim is created, a new PersistentVolume is created automatically.
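The claim side of that binding is small; a sketch (name and size are arbitrary):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  accessModes:
  - ReadWriteOnce        # mountable read-write by a single node
  resources:
    requests:
      storage: 1Gi       # bind to (or dynamically provision) a PV of at least 1Gi
```

A pod then references it with a `persistentVolumeClaim` volume; `kubectl get pvc` shows whether the claim is Pending or Bound.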
Follow the example of deploying a wordpress site and mysql with persistent volumes in minikube.
Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec. Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods. These pods are created from the same spec, but are not interchangeable: each has a persistent identifier that it maintains across any rescheduling. Read StatefulSets for how StatefulSets fit in with everything else.
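The sticky identity shows up in the manifest as a headless service plus per-pod volume claims. A sketch (names and sizes are made up; the apiVersion may differ by cluster version):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  clusterIP: None        # "headless": per-pod DNS records, no virtual IP
  selector:
    app: web
  ports:
  - port: 80
---
apiVersion: apps/v1beta2
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web       # the headless service that owns the pod DNS names
  replicas: 2
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx
  volumeClaimTemplates:  # each member gets its own PersistentVolumeClaim
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi
```

The pods come up in order as web-0 and web-1, each reachable at a stable name like web-0.web in the same namespace.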
There’s a lot of examples of StatefulSets that you can work through:
- Example: Deploying Cassandra. Here’s an early incarnation of `StatefulSet`, back when it was called `PetSet`, when they stress tested it using 1000 instances of Cassandra.
- Running a MySQL master slave setup
- Kafka
Read k8s StatefulSets and DaemonSets for some recent updates to StatefulSet and DaemonSet and how they can deploy updates.
Questions:
- What happens to a `PersistentVolume` when the claim for it is deleted?
- What happens to a `PersistentVolume` that was used by a `StatefulSet` member?
- What does a `StatefulSet` need a “headless” service for?
- From another pod, how would you refer to a member of a `StatefulSet` in DNS?