Afaan Ashiq

Software Engineering

A bad day at the office with Helm

October 1, 2022 5 min read

This post will highlight a not-so-desirable feature of Helm that has stung me before. I hope you don’t repeat my mistakes after reading this!



What is Helm?

Helm is a package manager for Kubernetes.
It also provides a useful templating engine with Golang-like syntax for Kubernetes manifests.

Helm projects require a Chart.yaml file, which defines the chart’s metadata.
This includes a dependencies section, which lists the charts that the current chart depends on.

At the time of writing (October 2022), this dependencies section can be presented as follows:

dependencies: # A list of the chart requirements (optional)
  - name: The name of the chart (nginx)
    version: The version of the chart ("1.2.3")
    repository: (optional) The repository URL ("https://example.com/charts") or alias ("@repo-name")
    condition: (optional) A yaml path that resolves to a boolean, used for enabling/disabling charts (e.g. subchart1.enabled )
    tags: # (optional)
      - Tags can be used to group charts for enabling/disabling together
    import-values: # (optional)
      - ImportValues holds the mapping of source values to parent key to be imported. Each item can be a string or pair of child/parent sublist items.
    alias: (optional) Alias to be used for the chart. Useful when you have to add the same chart multiple times

Applying Helm to a simple application

For our purposes, let’s take the following simple, somewhat contrived example system.

[Figure: sketch of the simple system]

In our superbly simple system, requests come in to our API service.
The API then fetches some data from our postgres database, performs some business logic and returns a response.

So in Helm chart terms, we can sketch this out to be something like the following:

[Figure: sketch of the simple system as Helm charts]

The Chart.yaml file for our api_service might look like this:

apiVersion: v2  # dependencies declared in Chart.yaml require apiVersion v2
name: api-service
version: 0.1.0
dependencies:
  - name: postgres
    version: 10.2.5

Deploying our Helm chart

Now let’s say we are committing the cardinal sin of avoiding any Infrastructure as Code (IaC) tooling. We can deploy our application to our namespace, which we have named simple-api-application, with the following command:

helm install api-service . -n simple-api-application

Here we are telling Helm to install the chart in the current directory (.) as a release named api-service, into our namespace (-n simple-api-application).

To achieve this, Helm looks at all of our declared dependencies and downloads the packaged chart for each one. Strictly speaking, this is the job of helm dependency update, which can be run beforehand or triggered by passing the --dependency-update flag to helm install.

In our case, we depend on just one chart, postgres. Helm will drop the packaged files into a reserved folder called charts/ in our current directory.

charts/
    postgres-10.2.5.tgz
Chart.lock
Chart.yaml
values.yaml

After Helm has finished fetching the dependent packaged charts, it will take the new files alongside the chart in the current directory, and it will deploy our application as we asked.
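
The sequence described above can be spelled out step by step. The following is a minimal sketch, not a definitive deployment script: deploy_chart and the HELM override are hypothetical names introduced here for illustration, and it assumes the download step is run explicitly with helm dependency update.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Helm) showing the order of operations.
# Set HELM=echo to print the commands instead of running them.
deploy_chart() {
  local release="$1" chart_dir="$2" namespace="$3"
  local helm="${HELM:-helm}"

  # 1. Resolve the dependencies in Chart.yaml and download them into charts/.
  "$helm" dependency update "$chart_dir" || return 1

  # 2. Install the release, or upgrade it if it already exists.
  "$helm" upgrade --install "$release" "$chart_dir" --namespace "$namespace"
}
```

Calling deploy_chart api-service . "$NAMESPACE" with HELM=echo only prints the two commands, which is a cheap way to check the order before touching a cluster.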


Changing the chart version

If we want to change, say, the version of our postgres dependency, we make the following change to our main Chart.yaml:

apiVersion: v2
name: api-service
version: 0.1.0 
dependencies:
  - name: postgres
    version: 10.3.0 # <- minor version bump from 10.2.5 to 10.3.0

Helm also offers the ability to upgrade an existing deployment. It compares the current deployment with what we have defined in our chart and replaces any components which have since been updated. Note that the release name comes first, followed by the chart path:

helm upgrade api-service .

Both of these commands are very useful for deploying bundled applications which would otherwise have lots of moving parts. But they come with a sneaky caveat that has bitten me pretty badly in the past!


A tough lesson

Remember when I said that Helm looked at our Chart.yaml and downloaded the packaged charts that we needed? Well, in the scenario in which we are updating the version of a service, we have nothing to worry about. Helm will replace the existing postgres-10.2.5.tgz file with the new postgres-10.3.0.tgz, as we wanted. The updated application will mirror what we have specified in our Chart.yaml.

However, Helm makes no real consideration for when we might want to change the name of a service. Let’s say we wanted to change the name of our postgres dependency to postgresql. To do this, we would need to make the following change to our main Chart.yaml:

apiVersion: v2
name: api-service
version: 0.1.0 
dependencies:
  - name: postgresql  # <- we changed the name from `postgres` to `postgresql`
    version: 10.3.0

Now we will need to redeploy our updated application.
Helm will do what it always does and look at our Chart.yaml. Helm will compare that to our current deployment and say: hey, we don’t have a postgresql service running.
So Helm will go and grab the packaged chart and save it to our reserved charts/ folder. But Helm will not remove the now outdated postgres chart.

So our folder directory will look like the following:

charts/
    postgres-10.2.5.tgz
    postgresql-10.3.0.tgz
Chart.lock
Chart.yaml
values.yaml

See what happened there?
Helm completely ignored the original packaged chart because it was no longer defined in our Chart.yaml. Can you guess what happens next?
Helm will take everything in the charts/ folder and deploy to the namespace that we gave it…
Yikes, this means we could have outdated services continuing to run in our namespace, despite the fact that we had removed them from our Chart.yaml!
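
One way to catch this before deploying is to compare the archives in charts/ against the names declared in Chart.yaml. The sketch below is only an illustration: declared_dependencies and find_stale_charts are hypothetical helpers, not Helm commands, and they assume a Chart.yaml laid out like the ones in this post, with archives named <chart>-<version>.tgz.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Helm) for spotting leftover archives in charts/.

# Print the names declared as "- name: <chart>" entries in a Chart.yaml.
# Assumption: dependencies are the only list in the file using "- name:" items
# (a maintainers list would also match) and names are unquoted.
declared_dependencies() {
  sed -n 's/^[[:space:]]*-[[:space:]]*name:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' "$1"
}

# Print each charts/*.tgz archive whose chart name is not declared in Chart.yaml.
find_stale_charts() {
  local chart_dir="$1" archive name
  for archive in "$chart_dir"/charts/*.tgz; do
    [ -e "$archive" ] || continue      # the glob matched nothing
    name="$(basename "$archive" .tgz)" # e.g. postgres-10.2.5
    name="${name%-[0-9]*}"             # drop the "-<version>" suffix
    if ! declared_dependencies "$chart_dir/Chart.yaml" | grep -qxF -- "$name"; then
      echo "$archive"
    fi
  done
}
```

Running find_stale_charts . in the chart directory from the example above would flag the leftover postgres archive, since only postgresql is still declared.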


Summary

Even worse, Helm kinda acknowledges this somewhat undesirable feature.
At the time of writing (Oct 2022), see the following snippet from the official Helm docs:

Dependencies are not required to be represented in ‘Chart.yaml’. For that reason, an update command will not remove charts unless they are (a) present in the Chart.yaml file, but (b) at the wrong version.

The simple solution to this problem is to remove the charts/ folder prior to the Helm command.

rm -rf charts/ && helm dependency update . && helm upgrade api-service .

This does however mean that for every deployment, Helm will need to pull fresh charts even if nothing has changed. So there is a tradeoff to be made here regarding the additional time and network latency for each command.
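
One possible way to soften this tradeoff is to wipe charts/ only when Chart.yaml has actually changed since the folder was last rebuilt. This is a rough sketch under stated assumptions rather than a recommended practice: charts_need_refresh, reset_charts_dir and the .charts-stamp marker file are all hypothetical names, and the check looks only at Chart.yaml, so it would miss changes made elsewhere.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of Helm): rebuild charts/ only when Chart.yaml changed.

# Succeeds (exit 0) when there is no recorded checksum or it no longer matches.
charts_need_refresh() {
  local chart_dir="$1"
  local stamp="$chart_dir/.charts-stamp"  # hypothetical marker file
  local current
  current="$(cksum < "$chart_dir/Chart.yaml")"
  [ ! -f "$stamp" ] || [ "$(cat "$stamp")" != "$current" ]
}

# Empty charts/ and record the checksum of the Chart.yaml it will be rebuilt from.
reset_charts_dir() {
  local chart_dir="$1"
  rm -rf "$chart_dir/charts"
  mkdir -p "$chart_dir/charts"
  cksum < "$chart_dir/Chart.yaml" > "$chart_dir/.charts-stamp"
}

# Possible usage before an upgrade:
#   if charts_need_refresh .; then reset_charts_dir . && helm dependency update .; fi
#   helm upgrade api-service .
```

When nothing in Chart.yaml has changed, the existing charts/ folder is reused and no charts are pulled again. The price is one more file to keep track of.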