More Fun with Kubeadm & Fedora

I recently wrote about getting up and running with kubeadm and Fedora CoreOS, which I got working, but which sent me into a miniature funk of uncertainly over various little integration issues.

First, I was getting around the lack of support in rpm-ostree for rpms that place stuff in /opt, which isn’t a traditional place for package managers to put stuff, but which is where kubeadm puts its cni binaries, for historical reasons. I got the Fedora package that provides the cni binaries, containernetworking-plugins, and that doesn’t stick things into /opt, modified to say it provides kubernetes-cni, which is what the upstream kubernetes rpm maintainers call it, but I had to transgress against rpmlint by leaving out the version number. The upstream packagers call explicitly for cni version 0.6.0, while Fedora is shipping version 0.7.4.

As far as I could tell, the later version worked just fine, but I wasn’t sure I’d get my package change merged while telling that lie of omission. That led me to wonder about whether I should try to convince the upstream packagers to move the cni binaries — kubernetes is hard coded to look for them in opt, but you can specify a different location when you’re setting things up, so getting the binaries moved to /usr/libexec/cni, where Fedora keeps them, could be an option. Or, I’ve played with some symlink-type trickery in the past to make cni binaries appear under /opt while actually installed elsewhere, so maybe I could convince the project to accept something like that.

However, Fedora’s cri-o package depends on at least cni 0.7.x, so achieving nicer (or, failing that, trickier) kubernetes-cni packaging to allow for installation on rpm-ostree hosts would mean incompatibility with the runtime I was interested in using, so what’s the point of anything, anyway, even?

Well, my change did get merged to the Fedora package (you can test the package and give it karma) and those newer cni binaries do appear to work just fine with the upstream kubelet, so I’m feeling somewhat better about that part. Also, I think I agreed to co-maintain the containernetworking-plugins package moving forward, so that’s fun.

Elsewhere on the problematic networking front, I had to insert this puzzled passage into my last post:

…I found during my tests that my Fedora CoreOS host was configuring a address on the cni0 interface, which was conflicting with my flannel networking. I found that if I deleted that address from the device, the cni0 interface would get a new, address that worked for my cluster:

sudo ip addr del dev cni0

I wasn’t sure where that was coming from, though I had some vague sense that I once knew the answer. I relearned / remembered that this is part of the config for cri-o, and lives in the file /etc/cni/net.d/100-crio-bridge.conf, but then disappears, seemingly following the next reboot after I’ve configured a cluster. I thought that perhaps I should be passing as the cidr argument when running kubeadm init, instead of the conventional, but maybe not. I need to do more poking here.

Speaking of reboots, I ran into a problem in which cri-o wasn’t dealing very well with reboots — as expected, all the containers running under cri-o were going away during a reboot, but this should be no big deal, because kubernetes can restart its control plane containers from kubelet manifest files, and start up anything else from its records in etcd. I found, though, that following a reboot, the kubelet was complaining about how a sandbox for the pod it was trying to run already existed, even though cri-o wasn’t running any pods. I found a reported issue that looked similar and contained a workaround, but ended up solving the issue by figuring out how to update cri-o to a later version…

One of qualities that sets cri-o apart from other kubernetes container runtime options is that each cri-o version is pegged to a particular kubernetes version — there’s a cri-o 1.12 for kubernetes 1.12, a 1.13 for 1.13, and so on. As I write this, the latest upstream version of kubernetes is 1.13.4, and that’s the version of the kubelet, kubectl and kubeadm I’ve been running on my Fedora test VMs. However, the current version of cri-o shipping with Fedora is 1.12.0. It seemed to be working (that reboot issue notwithstanding), but I wanted to be running 1.13.

Poking around in Fedora’s updates system, I found that cri-o 1.13 was available as a test package, but was now being packaged as a Fedora module. Fedora Modularity is a big topic, but the bottom line is that it allows for more flexibility in packaging and maintaining software in the Fedora family. In this case, it offers a way out of the one stable version per Fedora release that doesn’t fit well with packages like cri-o, which will have multiple stable versions out in the world at once. The trouble, potentially, was that I wasn’t sure whether module-based rpms would play nicely with rpm-ostree and package layering.

As it turned out, I simply had to enable the repo by changing enabled=0 to enabled=1 in /etc/yum.repos.d/fedora-updates-testing-modular.repo (moving forward, I’d like to see rpm-ostree grow support for enabling repos per-operation, a la yum and dnf), and then install the package using rpm-ostree install cri-o. On the test host I was already using, where I’d already installed cri-o, I had to take another step — I should have been able to run rpm-ostree upgrade -r to fetch any available image updates alongside any updates to my layered packages, but since Fedora CoreOS is still in experimental preview mode, its configured ostree remote doesn’t point anywhere, which leads rpm-ostree upgrade to error out before grabbing any layered package upgrades. Instead, I had to run rpm-ostree uninstall cri-o && rpm-ostree install cri-o to uninstall and then reinstall the newer version.

I tested out a kubeadm cluster with the updated cri-o, and it worked, so I left some karma for the package. When I started typing up my notes about the reboot issue, I figured I’d reboot again to see if the new version was behaving the same way, and the problem disappeared. I don’t know if something about the mismatch between kube 1.13 and cri-o 1.12 was to blame, but I was happy not to see the issue any more.

Next up, I want to play with the HA kubeam docs — I’m thinking I’ll set up a three master/etcd node, three worker node cluster on one of my oVirt clusters, team it up with the oVirt volume provisioner and flexvolume driver for persistent volume support, and then install KubeVirt with nested kvm for some tests of that. I’m interested to see how this Fedora CoreOS / kubeadm cluster fares over time through some package and image upgrades while running some longish-lived workloads. Since the ostree repo isn’t up yet during the experimental preview, I suppose I’ll be sprucing up these BYOAtomic docs to compose and host my own repo.

Stay tuned.

Installing Kubeadm on Fedora CoreOS

There are many different ways to bring up a Kubernetes cluster, but the simplest option I’ve found for getting up and running with a single or multi-node cluster involves a tool called kubeadm, for which the Kubernetes project maintains good installation and configuration docs.

These docs include directions for hosts running Debian/Ubuntu, RHEL/CentOS and Container Linux, but the host I’m interested in is Fedora CoreOS — the successor project to Container Linux and Fedora Atomic, which is currently available as an experimental preview.

Now, the Container Linux directions do work for Fedora CoreOS. In fact, since these steps simply involve copying binaries and systemd unit files onto the host, they’d likely work for any sort of Linux host.

The Debian/Ubuntu and RHEL/CentOS directions involve deb and rpm software packages, which are maintained by the Kubernetes project. As with software packages more generally, these debs and rpms cut out some of the installation steps, handle dependencies, and offer a mechanism for future updates, so I prefer to use them.

As with Container Linux and Fedora Atomic Host, Fedora CoreOS ships system and library dependencies in a (more or less) immutable image, and is meant to host applications running in containers. However, Fedora CoreOS images are assembled using the tool rpm-ostree, which does allow for additional rpms to be layered atop the the base image.

That’s why, with a little bit of modification, the RHEL/CentOS kubeadm installation steps can be made to work with Fedora CoreOS, too.

The upstream kubeadm installation directions for RHEL/CentOS begin by configuring a yum repository:

sudo tee /etc/yum.repos.d/kubernetes.repo <<EOF

One of the dependencies for the upstream kubelet package is a set of container networking plugins, which the kubernetes project also packages, under the name kubernetes-cni. Unfortunately, their package places these binaries under /opt, which rpm-ostree will not abide. Fedora CoreOS already includes these cni binaries in its base image, but under the name containernetworking-plugins.

I’ve made an alternate version of this package that’s modified to report that it satisfies the kubernetes-cni requirement. I’ve submitted a pull request to get this change included in Fedora’s containernetworking-plugins package — if it gets merged I’ll be able to delete this step. Until then, let’s view this as an opportunity to see rpm-ostree’s facility for replacing specific packages in the base image with alternatives:

sudo rpm-ostree override replace

Next, we’ll use package layering to install the kubelet, kubeadm and kubectl binaries. I’m also installing cri-o here, because that’s the runtime I’m interested in using with kubernetes. I’m tacking an -r onto the end of this command to reboot my host, which is necessary for the replace and install layering operations to take effect.

sudo rpm-ostree install cri-o kubelet kubectl kubeadm -r

Once we’ve installed our layered packages, they’ll be updated alongside the regular image updates for our Fedora CoreOS host. If you’d rather update these packages manually, you need to edit their repo files under /etc/yum.repos.d/ to change enabled=1 to enabled=0.

Since we’re using cri-o as our runtime interface, we need to manually set the correct cgroup driver:

echo "KUBELET_EXTRA_ARGS=--cgroup-driver=systemd" | sudo tee /etc/sysconfig/kubelet

SELinux can be troublesome to configure correctly, and upstream kubeadm docs deal with the issue by throwing SELinux into permissive mode. I’ve found that kubeadm runs quite happily in SELinux enforcing mode, if you pre-create a few directories and set their contexts appropriately:

for i in {/var/lib/etcd,/etc/kubernetes/pki,/etc/kubernetes/pki/etcd,/etc/cni/net.d}; do sudo mkdir -p $i && sudo chcon -Rt svirt_sandbox_file_t $i; done

Also required for cri-o are the following sysctl parameters and kernel modules, which we’ll set and configure to persist across reboots:

sudo modprobe overlay && sudo modprobe br_netfilter

sudo tee /etc/modules-load.d/crio-net.conf <<EOF

sudo tee /etc/sysctl.d/99-kubernetes-cri.conf <<EOF
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1

sudo sysctl --system

Next, we’ll enable and start cri-o and the kubelet:

sudo systemctl enable --now cri-o && sudo systemctl enable --now kubelet

Finally, we’re ready to initialize our cluster, using kubeadm init. Since I’m using cri-o, I need to add --cri-socket=/var/run/crio/crio.sock and since I’m using flannel for networking, I need to include the --pod-network-cidr argument:

sudo kubeadm init --pod-network-cidr= --cri-socket=/var/run/crio/crio.sock

Once the kubeadm init command completes, we need to follow the directions on the screen to create and populate a .kube config directory:

mkdir -p $HOME/.kube

sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config

sudo chown $(id -u):$(id -g) $HOME/.kube/config

As I mentioned earlier, I’m using flannel for networking, which requires a kubectl command to set up:

kubectl apply -f

Also on the networking front, I found during my tests that my Fedora CoreOS host was configuring a address on the cni0 interface, which was conflicting with my flannel networking. I found that if I deleted that address from the device, the cni0 interface would get a new, address that worked for my cluster:

sudo ip addr del dev cni0

If you’re going to run a single all-in-one node, you need to un-taint the master node so it can run pods. If you’re setting up additional nodes, you’ll need to re-run all the steps we’ve gone through above, substituting the final kubeadm init step for the kubeadm join command that’s printed on screen at the end of the init operation. If you’re using cri-o, the additional nodes also need a tacked-on --cri-socket=/var/run/crio/crio.sock argument on the join command.

kubectl taint nodes --all

To make sure that everything is working properly, we can run a “hello world” deployment on our new cluster and expose the resulting pod via a NodePort service:

kubectl create deployment hello --image=nginx

kubectl expose deployment hello --type NodePort --port=80

Finally, we can find out which NodePort was assigned, and use curl to see that the server is up:

$ kubectl get svc

NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
hello        NodePort   <none>        80:31967/TCP   3s
kubernetes   ClusterIP       <none>        443/TCP        2m14s

$ curl http://$(hostname):31967

<!DOCTYPE html>
<title>Welcome to nginx!</title>
body {
width: 35em;
margin: 0 auto;
font-family: Tahoma, Verdana, Arial, sans-serif;
<h1>Welcome to nginx!</h1>
<p>If you see this page, the nginx web server is successfully installed and
working. Further configuration is required.</p>

<p>For online documentation and support please refer to
<a href=""></a>.<br/>
Commercial support is available at
<a href=""></a>.</p>

<p><em>Thank you for using nginx.</em></p>