
Install a Cluster

Customize Installation

  • Follow the cluster installation instructions explained here.
  • (Optional) Make any of the following changes to the configuration file you downloaded:
Key | Default | Description
runai-operator.config.project-controller.createNamespaces | true | Set to false if you do not want Run:ai to create namespaces, or if you prefer to create namespaces manually rather than use the Run:ai convention of runai-<PROJECT-NAME>. When set to false, creating a new Run:ai Project requires an additional manual step.
runai-operator.config.project-controller.clusterWideSecret | true | Set to false if you do not want Run:ai to create Kubernetes Secrets. When set to false, automatic secret propagation is not available.
runai-operator.config.mps-server.enabled | false | Allows the use of NVIDIA MPS, which is useful with Inference workloads. Requires extra cluster permissions.
runai-operator.config.runai-container-toolkit.enabled | true | Controls the usage of Fractions. Requires extra cluster permissions.
runai-operator.config.global.runtime | docker | Defines the container runtime of the cluster (docker and containerd are supported). Set to containerd when using Tanzu.
runai-operator.config.runaiBackend.password | Default password already set | The admin@run.ai password. Change it only if you have changed the password here.
runai-operator.config.global.prometheusService.address | The address of the default Prometheus Service | If you installed your own custom Prometheus Service, add this field with its address.
kube-prometheus-stack.enabled | true | Installs Prometheus. Set to false if Prometheus is already installed in the cluster.
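
As a rough sketch of what such edits might look like, the snippet below writes a small override file. It assumes that each dotted key in the table corresponds to nested YAML mappings, which is the usual Helm values convention but is not confirmed on this page. The file name runai-overrides.yaml and the chosen values are purely illustrative.

```shell
# Illustrative only: runai-overrides.yaml is a hypothetical file name, and the
# nesting assumes each dotted key in the table maps to nested YAML mappings.
cat > runai-overrides.yaml <<'EOF'
runai-operator:
  config:
    project-controller:
      createNamespaces: false   # namespaces will be created manually
    global:
      runtime: containerd       # for example, when using Tanzu
EOF

# Show the result for review.
cat runai-overrides.yaml
```

The same structure could instead be edited directly in the downloaded configuration file. Helm also accepts several -f flags, with later files taking precedence, so a separate override file is one possible way to keep local changes apart from the downloaded file.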

Prometheus

The Run:ai Cluster installation installs Prometheus by default. If your Kubernetes cluster already has Prometheus installed, set the flag kube-prometheus-stack.enabled to false.

When using an existing Prometheus installation, you need to add rules to your Prometheus configuration. The rules can be found under deploy/runai-prometheus-rules.yaml.
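
For illustration, the flag could also be passed on the command line rather than edited into the configuration file. The sketch below wraps the install command from the Install Cluster section in a hypothetical helper function, install_with_existing_prometheus, and uses Helm's standard --set option. Whether the chart accepts the value this way is an assumption.

```shell
# Hypothetical helper: installs the cluster chart without the bundled Prometheus.
# $1 is the path to the downloaded configuration file.
install_with_existing_prometheus() {
  helm install runai-cluster runai/runai-cluster -n runai \
    -f "$1" --create-namespace \
    --set kube-prometheus-stack.enabled=false
}
```

Calling the function with the path to runai-<cluster-name>.yaml would run the installation with the bundled Prometheus disabled. The rules mentioned above still have to be added to the existing Prometheus separately.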

NVIDIA Prerequisites

See the NVIDIA Prerequisites section.

Install Cluster

Run:

helm repo add runai https://run-ai-charts.storage.googleapis.com
helm repo update

helm install runai-cluster runai/runai-cluster -n runai \
    -f runai-<cluster-name>.yaml --create-namespace

Info

To install a specific version, add --version <version> to the install command.

helm install runai-cluster -n runai \
  runai-cluster-<version>.tgz -f runai-<cluster-name>.yaml --create-namespace
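
A minimal sketch of pinning a version from the Helm repository: helm search repo with --versions lists the chart versions a repository offers, and --version selects one at install time. The function names and the argument order are hypothetical.

```shell
# List the chart versions available in the runai repository added earlier.
list_runai_cluster_versions() {
  helm search repo runai/runai-cluster --versions
}

# Install a specific chart version.
# $1: chart version to install, $2: path to the downloaded configuration file.
install_runai_cluster_version() {
  helm install runai-cluster runai/runai-cluster -n runai \
    -f "$2" --create-namespace --version "$1"
}
```

After choosing a version from the list, pass it as the first argument and the path to runai-<cluster-name>.yaml as the second.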

Tip

Use the --dry-run flag to see what will be installed before performing the actual installation. For more details, see Understanding cluster access roles.
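
One way to use the flag, as a sketch: save the dry-run output to a file and summarize the kinds of Kubernetes objects it contains. The helper names and the preview file are hypothetical, and the summary assumes the output lists each object's kind on a line beginning with kind:.

```shell
# Render the installation without applying it and save the output.
# $1: path to the downloaded configuration file, $2: file to write the preview to.
preview_runai_cluster_install() {
  helm install runai-cluster runai/runai-cluster -n runai \
    -f "$1" --create-namespace --dry-run > "$2"
}

# Count how many objects of each kind appear in a saved preview.
summarize_kinds() {
  grep '^kind:' "$1" | sort | uniq -c
}
```

Running preview_runai_cluster_install followed by summarize_kinds on the saved file gives a quick overview of the object kinds, such as roles and role bindings, that the installation would create.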


Last update: 2022-11-20
Created: 2021-08-03