
Install a Cluster

Customize Installation

  • Follow the cluster installation instructions explained here.
  • (Optional) Make the following changes to the configuration file you have downloaded:
Key | Default | Description
--- | --- | ---
pspEnabled | false | Set to true when using PodSecurityPolicy.
ingress-nginx.podSecurityPolicy.enabled | | Set to true when using PodSecurityPolicy.
runai-operator.config.project-controller.createNamespaces | true | Set to false if you are unwilling to give Run:ai the ability to create namespaces, or prefer to create namespaces manually rather than use the Run:ai convention of runai-<PROJECT-NAME>. When set to false, an additional manual step is required when creating new Run:ai Projects.
runai-operator.config.project-controller.createRoleBindings | true | Automatically assign Users to Projects. Set to false if you are unwilling to give Run:ai the ability to set RoleBinding. When set to false, an additional manual step is required when adding or removing users from Projects.
runai-operator.config.project-controller.clusterWideSecret | true | Set to false if you are unwilling to give Run:ai the ability to create Kubernetes Secrets. When set to false, automatic secret propagation is not available.
runai-operator.config.mps-server.enabled | false | Allow the use of NVIDIA MPS. MPS is useful with Inference workloads. Requires extra cluster permissions.
runai-operator.config.runai-container-toolkit.enabled | true | Controls the usage of Fractions. Requires extra cluster permissions.
runai-operator.config.global.runtime | docker | Defines the container runtime of the cluster (supports docker and containerd). Set to containerd when using Tanzu.
runai-operator.config.runaiBackend.password | Default password already set | Password of the admin@run.ai user. Needs to be changed only if you have changed the password here.
runai-operator.config.global.prometheusService.address | The address of the default Prometheus Service | If you installed your own custom Prometheus Service, add this field with its address.
kube-prometheus-stack.enabled | true | Install Prometheus. Set to false if Prometheus is already installed in the cluster.
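
As an alternative to hand-editing the downloaded file, the keys listed above can be kept in a second values file and passed to Helm together with the first; when several -f files are given, Helm gives precedence to the rightmost one. The following is a minimal sketch, assuming the dotted keys map to nested YAML in the usual Helm values layout. The file name runai-overrides.yaml and the chosen values are illustrative assumptions, not part of the Run:ai instructions.

```shell
# Hypothetical overrides file; the name is an assumption for illustration.
OVERRIDES="${OVERRIDES:-runai-overrides.yaml}"

# Dotted keys from the list above, written as nested YAML.
cat > "$OVERRIDES" <<'EOF'
kube-prometheus-stack:
  enabled: false            # Prometheus is already installed in the cluster
runai-operator:
  config:
    global:
      runtime: containerd   # cluster uses containerd rather than docker
    project-controller:
      createNamespaces: false   # namespaces will be created manually
EOF

# Pass it after the downloaded file so that its values take precedence:
#   helm install runai-cluster runai/runai-cluster -n runai \
#       -f runai-<cluster-name>.yaml -f runai-overrides.yaml --create-namespace
echo "wrote $OVERRIDES"
```

Keeping overrides separate leaves the downloaded file untouched, which can make it easier to see which defaults were changed.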

Prometheus

The Run:ai Cluster installation installs Prometheus by default. If your Kubernetes cluster already has Prometheus installed, set the flag kube-prometheus-stack.enabled to false.

When using an existing Prometheus installation, you will need to add additional rules to your Prometheus configuration. The rules can be found under deploy/runai-prometheus-rules.yaml.

NVIDIA Prerequisites

See the NVIDIA Prerequisites section.

Install Cluster

Run:

helm repo add runai https://run-ai-charts.storage.googleapis.com
helm repo update

helm install runai-cluster runai/runai-cluster -n runai \
    -f runai-<cluster-name>.yaml --create-namespace

Info

To install a specific version, add --version <version> to the install command, or install from a local chart archive of that version:

helm install runai-cluster -n runai \
  runai-cluster-<version>.tgz -f runai-<cluster-name>.yaml --create-namespace
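
The --version form mentioned above can be sketched as follows. The cluster name my-cluster and the version X.Y.Z are placeholders, not real values; substitute your own. The script only prints the composed command so that it can be reviewed before being run.

```shell
# Placeholder values; both are assumptions for illustration only.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
CHART_VERSION="${CHART_VERSION:-X.Y.Z}"

# Compose the install command with a pinned chart version.
install_cmd() {
    printf 'helm install runai-cluster runai/runai-cluster -n runai -f runai-%s.yaml --create-namespace --version %s\n' \
        "$CLUSTER_NAME" "$CHART_VERSION"
}

# To list the chart versions the repository offers (after "helm repo add"):
#   helm search repo runai/runai-cluster --versions

install_cmd   # review the printed command, then run it in your shell
```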

Tip

Add the --dry-run flag to the install command to preview what will be installed without performing the actual installation. For more details see Understanding cluster access roles.
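
As an illustration of this tip, the dry-run output can be saved to a file and summarized by resource kind before anything is applied. This is a sketch, not a Run:ai-provided script: the file name dry-run.txt and the helper summarize_kinds are assumptions, and the helper simply counts top-level kind: lines in whatever manifest text it receives.

```shell
# Count resources per kind in rendered manifest text read from stdin.
summarize_kinds() {
    grep -E '^kind: ' | sort | uniq -c | sort -rn
}

# Intended use (needs cluster access and the chart repository, so it is left commented):
#   helm install runai-cluster runai/runai-cluster -n runai \
#       -f runai-<cluster-name>.yaml --create-namespace --dry-run > dry-run.txt
#   summarize_kinds < dry-run.txt

# Self-contained demonstration on sample manifest text:
printf 'kind: ClusterRole\n---\nkind: ClusterRole\n---\nkind: Deployment\n' | summarize_kinds
```

A summary like this gives a quick view of cluster-wide objects such as ClusterRole and ClusterRoleBinding, which are usually the ones worth reviewing for permissions.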

Next Steps

Continue to configure web interfaces.


Last update: May 16, 2022