
Go 1.20 Coverage Profiling Support for Kubernetes Apps

During the pre-release phase of Go 1.20, one particular feature caught my attention, and I immediately tried it out: coverage profiles can now be generated for programs (binaries), not just for unit tests. This is useful for debugging and, more importantly, for integration and end-to-end tests during local development and in CI pipelines.

The Go team also created a landing page with FAQs, explaining how to configure and enable this new thingy. In a nutshell, a Go binary compiled with GOFLAGS=-cover and executed with the environment variable GOCOVERDIR=<output_dir> will create a coverage metafile during execution (helpful with restarts) and individual coverage profiles when the application terminates, irrespective of the exit code.
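Locally, this looks roughly like the following — a minimal sketch, assuming a hypothetical main package at ./cmd/app (adjust paths and names to your project; go build -cover is equivalent to setting GOFLAGS=-cover):

# build the binary with coverage instrumentation (Go 1.20+)
go build -cover -o app ./cmd/app

# create the output directory for the coverage files before running the binary
mkdir -p covdata

# run the binary; covmeta.* and covcounters.* files are written to GOCOVERDIR
GOCOVERDIR=$PWD/covdata ./app
ls covdata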

With the documentation provided by the Go team, I was quickly able to create coverage profiles on my local machine. Success 🥳

However, most of my work these days still happens in (on) Kubernetes, such as writing controllers for the AWS ACK project. And this is where using the new Go feature wasn’t as straightforward.

In a Kubernetes application (Pod), state is ephemeral unless it is written to an external location. I thought about using persistent volumes, but that would have complicated matters: I would have had to know when the application had terminated successfully before extracting the data from the persistent volume with a helper application. Alternatively, I could have used a sidecar container running in the application Pod, but again I would have had to write coordination logic to know when the application (test) had finished successfully and then copy the coverage data to an external location, e.g. NFS.

There must be an easier solution… 🧐

My Development Flow

My typical setup to develop on Kubernetes involves kind (Kubernetes in Docker) to create local Kubernetes clusters and ko to build and deploy container images without having to create and worry about a Dockerfile 🤷

You might be familiar with kind as it’s also often used in CI, such as GitHub Actions. ko’s adoption is growing slowly but steadily, and I can only encourage you to take a look at it; I can’t live without it anymore.

For integration and end-to-end tests I use the excellent and minimalistic E2E Framework from Kubernetes SIG Testing (Special Interest Group), created by my friend Vladimir Vivien. Vladimir wrote a nice article on how easy it is to get started here.

Tip
The E2E framework can be used for all sorts of integration and end-to-end tests in Kubernetes, not just for building controllers and operators. You might find it useful to verify that your containerized application behaves as expected, write tests for complex Kubernetes microservice deployments, assert that a new library you wrote can be imported and deployed in a container, etc.

But regardless of the framework you use for your integration and end-to-end tests in Kubernetes, so far it wasn’t possible1 to generate coverage reports for compiled Go applications. With the new 1.20 release of Go and a little bit of Docker and Kubernetes trickery, we now have all the pieces in place to create coverage reports for integration and end-to-end tests 🤩

Creating Coverage Reports in Kubernetes

The following steps describe how to create coverage profiles for Go binaries packaged as containers and deployed to a Kubernetes cluster using kind. To keep things simple and tangible, I’ll use a small Go package I created to interact with the VMware vSphere APIs as an example. You don’t have to be an expert in VMware technology or the package itself, as the focus is on creating coverage reports. The full code, including running end-to-end tests with coverage in GitHub Actions, is available in github.com/embano1/vsphere.

Tip
If you want to execute the steps outlined below, you need the following tools installed: git, Go (1.20+), Docker, kind, and ko.
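A quick sanity check that everything is in place (the output will differ on your machine):

git version
go version        # must report go1.20 or newer
docker version
kind version
ko version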

You might be wondering why I wrote E2E tests for this package since I already have good unit test coverage. Well, unit tests with client/server API mocks can only get us so far. Deploying and verifying an end-to-end setup in a container environment, such as Kubernetes, makes me more confident that users of my package won’t face the typical “works on my machine” issues, such as network permissions, authentication (secrets), keep-alives, etc. - while also avoiding having to write and maintain brittle mocks.

The simple E2E test used for our coverage example creates a vSphere server (simulator) and a client application deployed as a Kubernetes Job to perform a login using the vsphere package. Once the client has successfully connected, it will exit with code 0 and the Job moves to the completed state (which the test suite asserts). The full code is in the test folder.

As described earlier, the tricky part is getting the coverage data out of the Kubernetes test application. Luckily, with kind we can use Docker volumes to mount a folder from the host, such as my MacBook or a GitHub Actions runner, into a Kubernetes worker node, i.e. a container created by kind. Then we can mount that volume into a Kubernetes Job, i.e. our test application, using the Kubernetes HostPathVolumeSource. If you feel like being right inside the movie “Inception”, you are not alone…

Note
Yes, this only works in environments where you can use local volumes, such as kind. For remote Kubernetes clusters, you would have to use a persistent volume and some coordination logic. I’ll leave this one up for you, smart reader 😜

Step by Step

First we need to check out the example source code and create a folder where the final coverage data will be stored:

git clone https://github.com/embano1/vsphere.git && cd vsphere
mkdir coverdata

Next, we’ll create a Kubernetes cluster with a kind configuration file mapping our local coverage folder to the worker node:

cat > kind.yaml <<EOF
apiVersion: kind.x-k8s.io/v1alpha4
kind: Cluster
nodes:
- role: control-plane
- role: worker
  extraMounts:
  - hostPath: $PWD/coverdata # path on local machine
    containerPath: /coverdata # worker node folder
    readOnly: false
EOF

export KIND_CLUSTER_NAME=e2e
kind create cluster --config kind.yaml --wait 3m --name ${KIND_CLUSTER_NAME}
Tip
Make sure your Docker engine has access and privileges to mount the hostPath folder
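To double-check the mount before running any tests, you can peek into the worker node container that kind created (the container name below assumes the cluster name e2e set above and kind’s default node naming):

# the kind worker node runs as a Docker container named <cluster-name>-worker
docker exec ${KIND_CLUSTER_NAME}-worker ls -ld /coverdata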

With a Kubernetes cluster running, we can compile the test application binary with coverage instrumentation enabled, create a container image, and upload it into the Kubernetes cluster. Sounds complicated? Meet ko!

# tell ko to upload to the active kind Kubernetes context
export KO_DOCKER_REPO=kind.local

# sanity check ;)
go version
go version go1.20.0 darwin/arm64

# compile binary for ARM (M1 Mac, use amd64 alternatively), create image and upload to kind
GOFLAGS=-cover ko build -B --platform=linux/arm64 ./test/images/client
2023/02/03 21:46:19 Using base distroless.dev/static:latest@sha256:a218b8525e4db35a0ce8fb5b13e2a980cc3ceef78b6bf88aabbb700373c1c2e2 for github.com/embano1/vsphere/test/images/client
2023/02/03 21:46:20 Building github.com/embano1/vsphere/test/images/client for linux/arm64
2023/02/03 21:46:22 Loading kind.local/client:3c4f716fcbdcf71aaf7f1e25dfe0101b860eccb3d51a1f910fa57e955840799e
2023/02/03 21:46:23 Loaded kind.local/client:3c4f716fcbdcf71aaf7f1e25dfe0101b860eccb3d51a1f910fa57e955840799e
2023/02/03 21:46:23 Adding tag latest
2023/02/03 21:46:24 Added tag latest
kind.local/client:3c4f716fcbdcf71aaf7f1e25dfe0101b860eccb3d51a1f910fa57e955840799e

Next, we need to instruct the Kubernetes E2E test suite to create a Job for the test client which uses the container image created above, mounts the coverage volume, and sets GOCOVERDIR accordingly.

Expand the code block below to see the specific lines in the E2E function which creates the test client Job.

func newClient(namespace, secret string) *batchv1.Job {
	var e envConfig
	if err := envconfig.Process("", &e); err != nil {
		panic("process environment variables: " + err.Error())
	}

	l := map[string]string{
		"app":  job,
		"test": "e2e",
	}

	const coverDirPath = "/coverdata"

	k8senv := []v1.EnvVar{
		{Name: "VCENTER_URL", Value: fmt.Sprintf("https://%s.%s", vcsim, namespace)},
		{Name: "VCENTER_INSECURE", Value: "true"},
		{Name: "GOCOVERDIR", Value: coverDirPath},
	}

	client := batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			Name:      job,
			Namespace: namespace,
			Labels:    l,
		},
		Spec: batchv1.JobSpec{
			Parallelism: pointer.Int32(1),
			Template: v1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: l,
				},
				Spec: v1.PodSpec{
					Containers: []v1.Container{{
						Name:            job,
						Image:           fmt.Sprintf("%s/client", e.DockerRepo),
						Env:             k8senv,
						ImagePullPolicy: v1.PullIfNotPresent,
						// TODO (@embano1): investigate why this is required in Github Actions to solve "permission
						// denied" error writing to volume (w/ Docker on OSX this is not needed)
						SecurityContext: &v1.SecurityContext{
							RunAsUser: pointer.Int64(0),
						},
						VolumeMounts: []v1.VolumeMount{
							{
								Name:      "credentials",
								ReadOnly:  true,
								MountPath: mountPath,
							},
							{
								Name:      "coverdir",
								ReadOnly:  false,
								MountPath: coverDirPath,
							},
						},
					}},
					Volumes: []v1.Volume{
						{
							Name: "credentials",
							VolumeSource: v1.VolumeSource{
								Secret: &v1.SecretVolumeSource{
									SecretName: secret,
								},
							},
						},
						{
							Name: "coverdir",
							VolumeSource: v1.VolumeSource{
								HostPath: &v1.HostPathVolumeSource{
									Path: coverDirPath,
								},
							},
						},
					},
					RestartPolicy:                 v1.RestartPolicyOnFailure,
					TerminationGracePeriodSeconds: pointer.Int64Ptr(5),
				},
			},
		},
	}

	return &client
}

Now we can run our E2E tests as usual:

go test -race -count=1 -v -tags=e2e ./test
=== RUN   TestWaitForClientJob
=== RUN   TestWaitForClientJob/appsv1/deployment
    client_test.go:66: vcsim ready replicas 1
=== RUN   TestWaitForClientJob/appsv1/deployment/client_job_completes
    client_test.go:86: client job complete
--- PASS: TestWaitForClientJob (15.05s)
    --- PASS: TestWaitForClientJob/appsv1/deployment (15.05s)
        --- PASS: TestWaitForClientJob/appsv1/deployment/client_job_completes (5.02s)
PASS
ok      github.com/embano1/vsphere/test 26.840s

Let’s check if we got some coverage data…

cd coverdata
ls
covcounters.1846795f27c6071dca49109ab71c15e6.1.1675457440336943752
covmeta.1846795f27c6071dca49109ab71c15e6

Eureka! But wait, we’re not done yet. These are binary files, so we need to turn them into a human-readable coverage report.

# print a summary
go tool covdata percent -i=.
        github.com/embano1/vsphere/client       coverage: 61.2% of statements
        github.com/embano1/vsphere/logger       coverage: 57.1% of statements
        github.com/embano1/vsphere/test/images/client   coverage: 86.7% of statements

# generate HTML report
go tool covdata textfmt -i=. -o profile.txt
go tool cover -html=profile.txt -o coverage.html

# open the file
open coverage.html

# cleanup and delete the cluster
kind delete cluster
Deleting cluster "e2e" ...
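Side note: if you end up with coverage data from several runs (or several binaries) in separate GOCOVERDIR directories, go tool covdata can merge them before you generate the report. A small sketch with hypothetical directory names run1 and run2:

# merge multiple coverage data directories into one data set
mkdir merged
go tool covdata merge -i=run1,run2 -o=merged

# then create the summary/report from the merged directory as shown above
go tool covdata percent -i=merged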

This is a damn cool new feature if you ask me! In fact, it spotted an error in one of the AWS ACK controllers I’m currently writing: the E2E tests showed green, but a critical code path was never executed. I wouldn’t have caught this without being able to inspect the HTML coverage report for the controller binary, which is now possible with Go 1.20.

Bonus: since we’re using tools available in many CI environments, such as GitHub Actions, it’s super easy to create these coverage reports on pull requests and upload the coverage files to the CI check summary page. Here’s an example from the vsphere repository.

If you enjoyed this post, share it with your friends, and hit me up on Twitter.

Credits

Photo by ShareGrid on Unsplash


  1. Well, of course Filippo Valsorda got it working even before 1.20…