Skip to content

Upgrade Run:ai

Important

Run:ai data is stored in Kubernetes persistent volumes (PVs). Prior to Run:ai 2.12, PVs are owned by the Run:ai installation. Thus, uninstalling the runai-backend helm chart may delete all of your data.

From version 2.12 forward, PVs are owned the customer and are independent of the Run:ai installation. As such, they are subject to storage class reclaim policy.

Preparations

No preparation required.

  • Ask for a tar file runai-air-gapped-<new-version>.tar from Run:ai customer support. The file contains the new version you want to upgrade to. new-version is the updated version of the Run:ai control plane.
  • Upload the images as described here.

Specific version instructions

Upgrade from Version 2.7 or 2.8

Before upgrading the control plane, run:

POSTGRES_PV=$(kubectl get pvc pvc-postgresql -n runai-backend -o jsonpath='{.spec.volumeName}')
THANOS_PV=$(kubectl get pvc pvc-thanos-receive -n runai-backend -o jsonpath='{.spec.volumeName}')
kubectl patch pv $POSTGRES_PV $THANOS_PV -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}'

kubectl delete secret -n runai-backend runai-backend-postgresql
kubectl delete sts -n runai-backend keycloak runai-backend-postgresql

Before version 2.9, the Run:ai installation, by default, included NGINX. It was possible to disable this installation. If NGINX is enabled in your current installation, as per the default, run the following 2 lines:

kubectl delete ValidatingWebhookConfiguration runai-backend-nginx-ingress-admission
kubectl delete ingressclass nginx 
(If Run:ai configuration has previously disabled NGINX installation then these lines should not be run).

Next, install NGINX as described here

Then create a TLS secret and upgrade the control plane as described in the control plane installation. Before upgrading, find customizations and merge them as discussed below.

Upgrade from version 2.9, 2.10, 2.11

Two significant changes to the control-plane installation have happened with version 2.12: PVC ownership and installation customization.

PVC Ownership

Run:ai will no longer directly create the PVCs that store Run:ai data (metrics and database). Instead, going forward,

  • Run:ai requires a Kubernetes storage class to be installed.
  • The PVCs are created by the Kubernetes StatefulSets.

The storage class, as per Kubernetes standards, controls the reclaim behavior: whether the data is saved or deleted when the Run:ai control plane is deleted.

To remove the ownership in an older installation, run:

kubectl patch pvc -n runai-backend pvc-thanos-receive  -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'
kubectl patch pvc -n runai-backend pvc-postgresql  -p '{"metadata": {"annotations":{"helm.sh/resource-policy": "keep"}}}'

Ingress

Delete the ingress object which will be recreated by the control plane upgrade

kubectl delete ing -n runai-backend runai-backend-ingress

Installation Customization

The Run:ai control-plane installation has been rewritten and is no longer using a backend values file. Instead, to customize the installation use standard --set flags. If you have previously customized the installation, you must now extract these customizations and add them as --set flag to the helm installation:

  • Find previous customizations to the control plane if such exist. Run:ai provides a utility for that here https://raw.githubusercontent.com/run-ai/docs/v2.13/install/backend/cp-helm-vals-diff.sh. For information on how to use this utility please contact Run:ai customer support.
  • Search for the customizations you found in the optional configurations table and add them in the new format.

Thanos Querier service account

If you are running Kubernetes 1.25 or later, and are upgrading to version 2.13, you will need to change the thanos querier service account name:

kubectl patch deployment -n runai-backend runai-backend-thanos-query  -p '{"spec": {"template": {"spec": {"serviceAccount": "runai-backend-thanos-querier", "serviceAccountName": "runai-backend-thanos-querier"}}}}'

This is not relevant if you are upgrading to Run:ai version 1.25

Upgrade Control Plane

helm upgrade -i runai-backend -n runai-backend runai-backend/control-plane \
    --set global.domain=<DOMAIN> \
    --set postgresql.primary.persistence.existingClaim=pvc-postgresql \ 
    --set thanos.receive.persistence.existingClaim=pvc-thanos-receive 

Note

The helm repository name has changed from runai-backend/runai-backend to runai-backend/control-plane.

Upgrade Cluster

To upgrade the cluster follow the instructions here.

helm get values runai-cluster -n runai > values.yaml
helm upgrade runai-cluster -n runai runai-cluster-<version>.tgz -f values.yaml
(replace <version> with the cluster version)