...
To upgrade major versions, first follow the https://stfc.atlassian.net/wiki/spaces/CLOUDKB/pages/285704256/Cluster+API+Upgrade#Upgrade-Clusterctl-and-CAPI-components section, then https://stfc.atlassian.net/wiki/spaces/CLOUDKB/pages/285704256/Cluster+API+Upgrade#Upgrading-Kubernetes-Major-Version for each version hop.
Overview
This process assumes the administrator is doing a full upgrade of all components. Components can be upgraded independently, with the caveat that the infrastructure layer must support the planned Kubernetes version: https://cluster-api.sigs.k8s.io/reference/versions
Infrastructure
Components which interact with OpenStack infrastructure
OpenStackCluster and Addons charts
These provide details for our OpenStack cloud components (i.e. they allow Cluster API + Cluster API OpenStack to create VMs) and fulfill the “contract” requirements from the cluster CRD
Clusterctl and cluster.x-k8s.io
These represent the generic CAPI components; they expect an infrastructure provider (e.g. OpenStack) to fulfill a contract to “adapt” them to each cloud provider
Kubernetes
Kubernetes components excluding those which handle OpenStack infrastructure. These are generic, and all CAPI documentation online applies
Kubernetes Version
This is set by the cluster CRD and can be found with kubectl describe kubeadmcontrolplane -A
The current value can be seen under spec.version
Can be set to the next (n+1) minor version from the current version
Can be set to any patch of the same minor version
The upper bound is set by the CAPI images you are running
CAPI Image
These are pre-generated Ubuntu images with kubeadm, containerd, etc. packages pre-installed
Generated by the Cloud Team to ensure that they meet the combined UKRI and STFC Cloud security policies - see Terms Of Service
OS and package patches can be upgraded independently of the Kubernetes version
I.e. a K8s cluster set to v5.10.2 with a CAPI image running v5.10.6 is allowed
However, a K8s cluster set to v5.10.8 on a CAPI image running v5.10.6 is not
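The version rules above can be sketched as a small check (illustrative only, not part of any chart; the version strings follow the document's own examples):

```python
def parse(version):
    """Split a version string such as "v5.10.2" into (major, minor, patch)."""
    major, minor, patch = version.lstrip("v").split(".")
    return int(major), int(minor), int(patch)

def valid_upgrade_target(current, target):
    """A cluster may move to any patch of its current minor version,
    or to the next (n+1) minor version, but no further."""
    cur_major, cur_minor, _ = parse(current)
    tgt_major, tgt_minor, _ = parse(target)
    return tgt_major == cur_major and tgt_minor in (cur_minor, cur_minor + 1)

def image_supports(cluster_version, image_version):
    """The CAPI image sets the upper bound: the cluster's requested
    version may trail the image's version, but never lead it."""
    return parse(cluster_version) <= parse(image_version)

# The examples from the text above:
print(image_supports("v5.10.2", "v5.10.6"))  # True  (allowed)
print(image_supports("v5.10.8", "v5.10.6"))  # False (not allowed)
```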
Infrastructure Upgrades
Upgrading OpenStackCluster Charts
Info: This is required to bring in any annotations required by the latest charts.
Update the helm Cluster API charts:
helm repo update capi
helm repo update capi-addons
helm upgrade cluster-api-addon-provider capi-addons/cluster-api-addon-provider -n clusters --wait
cd <folder_with_values>
Ensure the latest helm chart works without upgrading the K8s Major version:
helm upgrade <cluster_name> capi/openstack-cluster -f values.yaml -f clouds.yaml -f user-values.yaml -f flavors.yaml -n clusters
Update user-values.yaml by either running git pull to get the latest image from the cloud team, or manually editing the machineImage and kubernetesVersion fields
Re-run the helm upgrade to upgrade the cluster version:
helm upgrade <cluster_name> capi/openstack-cluster --install -f values.yaml -f clouds.yaml -f user-values.yaml -f flavors.yaml -n clusters
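For reference, the two fields in user-values.yaml look roughly like this (the image name and version below are hypothetical; use the values supplied by the cloud team):

```yaml
# Hypothetical values - take the real ones from the cloud team's repo
machineImage: ubuntu-jammy-kube-v5.10.6   # hypothetical image name
kubernetesVersion: v5.10.6                # must match the image's version
```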
Monitor the upgrade using clusterctl describe cluster <cluster_name> -n clusters
Upgrade Clusterctl and CAPI components
Info: We need to upgrade clusterctl so that it is aware of the latest CAPI and CAPO components. These handle the infrastructure integration.
Download the latest version which supports your cluster version.
...
Validate the proposed upgrade and apply the command provided by clusterctl
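As a sketch, the validation and apply steps typically look like the following (the --contract value is an example; use whatever clusterctl upgrade plan reports for your release):

```shell
# Show which providers can be upgraded and to which versions
clusterctl upgrade plan

# Apply the upgrade command printed by the plan output, e.g.:
clusterctl upgrade apply --contract v1beta1
```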
Upgrading Kubernetes Major Version
Update the helm Cluster API charts:
helm repo update capi
helm repo update capi-addons
helm upgrade cluster-api-addon-provider capi-addons/cluster-api-addon-provider -n clusters --wait
cd <folder_with_values>
Ensure the latest helm chart works without upgrading the K8s Major version:
...
Kubernetes Upgrades
This section assumes production clusters and upgrades components individually.
For development / low-risk clusters, both steps can be combined into a single roll-out.
Without minor version upgrade
Upgrade VM images and the Kubernetes version to the latest available patch version
This ensures any bug-fixes are applied which could otherwise block the minor version upgrade
# Edit the machineImage in user-values.yaml to use the latest patch release
# Edit the kubernetesVersion in user-values.yaml to match the image name
helm upgrade <cluster_name> capi/openstack-cluster -f values.yaml -f clouds.yaml -f user-values.yaml -f flavors.yaml -n clusters
Update user-values.yaml by either running git pull to get the latest image from the cloud team, or manually editing the machineImage and kubernetesVersion fields
Re-run the helm upgrade to upgrade the cluster version:
helm upgrade <cluster_name> capi/openstack-cluster --install -f values.yaml -f clouds.yaml -f user-values.yaml -f flavors.yaml -n clusters
...
Wait for the rollout of the new infrastructure to complete
The rollout can be monitored with kubectl get kcp -A and kubectl get md -A
Machine details can be found with kubectl get machines -A and kubectl get openstackmachines -A
With minor version upgrade
Upgrade to the
Troubleshooting
On the management cluster
Check the machines and openstackmachines CRDs match the VMs in the web interface: kubectl get machines -A and kubectl get openstackmachines -A
Check the control plane node’s status: kubectl describe machine <name> -n clusters
Logs are available if nothing is happening / the process is stuck:
OpenStack logs: kubectl logs deploy/capo-controller-manager -n capo-system -f
CAPI logs: kubectl logs deploy/capi-controller-manager -n capi-system -f
Check the control plane status: kubectl describe kcp/<name>-control-plane -n clusters
Check for events on the management cluster: kubectl get events -n clusters --watch
On the target cluster
Check you have access via kubectl
If you do not, this could indicate an OpenStack networking configuration problem
Check the LBs and networks exist - if not check the CAPO logs on the management cluster
Check etcd is healthy with kubectl get pods -n kube-system:
If they’re failing to start: kubectl describe pod/etcd-<name> -n kube-system
If they’re running, check they’re healthy with kubectl logs pod/etcd-<name> -n kube-system
In the event etcd is unhealthy, contact the cloud team to assist with recovery
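If needed, etcd health can also be probed directly from inside an etcd pod (a sketch assuming the default kubeadm certificate layout; substitute the real pod name):

```shell
kubectl exec -n kube-system etcd-<name> -- etcdctl \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/server.crt \
  --key=/etc/kubernetes/pki/etcd/server.key \
  endpoint health
```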
Check the kube-apiserver pod is starting on each machine:
If they’re failing to start: kubectl describe pod/kube-apiserver-<name> -n kube-system
If they’re running, check they’re healthy with kubectl logs pod/kube-apiserver-<name> -n kube-system
In the event the kubelet is failing to start or is unhealthy, contact the cloud team to assist with recovery