Below are instructions on how to install a Run:AI cluster. Before installing, please review the installation prerequisites: Run AI GPU Cluster Prerequisites.
We strongly recommend running the Run:AI pre-install script to verify that all prerequisites are met.
Step 1: Kubernetes
Step 2: NVIDIA
Step 3: Install Run:AI
Log in to the Run:AI Admin UI at app.run.ai using the credentials provided by Run:AI Customer Support:
- If no clusters are currently configured, you will see a Cluster installation wizard.
- If a cluster has already been configured, use the menu on the top left and select "Clusters". On the top right, click "Add New Cluster".
Using the Wizard:
- Choose a target Kubernetes platform (see table above)
- Download a Helm values YAML file
- (Optional) customize the values file. See Customize Cluster Installation
- Install Helm
- For RKE only, perform the steps here
- Run the helm commands as provided in the wizard.
To install a specific version, add --version <version> to the install command.
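As a rough illustration of where the version flag goes, the sketch below assembles a helm install command and prints it instead of running it. The release name, chart name, namespace, and values file name are hypothetical placeholders, not actual Run:AI names; the commands shown in the wizard remain the authoritative ones for your cluster.

```shell
#!/bin/sh
# Sketch only: the wizard shows the authoritative commands for your cluster.
# All four values below are hypothetical placeholders, not real Run:AI names.
RELEASE="my-release"
CHART="my-repo/my-chart"
NAMESPACE="my-namespace"
VALUES_FILE="values.yaml"

# Print a helm install command, appending --version only when one is given.
build_install_cmd() {
    version="$1"
    cmd="helm install $RELEASE $CHART -n $NAMESPACE -f $VALUES_FILE"
    if [ -n "$version" ]; then
        cmd="$cmd --version $version"
    fi
    printf '%s\n' "$cmd"
}

build_install_cmd ""        # no pin: helm picks the latest chart version
build_install_cmd "1.2.3"   # pinned chart version (example number only)
```

Printing the command first makes it easy to compare against what the wizard shows before anything is applied to the cluster.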
Step 4: Verify your Installation
- Go to app.run.ai/dashboards/now.
- Verify that the number of GPUs on the top right matches the GPU resources in your cluster, and that the list of machines with GPU resources appears on the bottom line.
For a more extensive verification of cluster health, see Determining the health of a cluster.
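To cross-check the GPU count shown in the dashboard, one option is to ask Kubernetes directly how many GPUs each node reports. The sketch below assumes that GPUs are advertised under the `nvidia.com/gpu` resource name (as the NVIDIA device plugin does) and that `kubectl` is configured for the cluster; the helper name `sum_gpus` is made up for this example.

```shell
#!/bin/sh
# Sum the GPU column of "<node> <gpu-count>" lines read from stdin.
# Nodes without GPUs show "<none>" in kubectl output and are skipped.
sum_gpus() {
    awk '$2 ~ /^[0-9]+$/ { total += $2 } END { print total + 0 }'
}

# Query the cluster only when kubectl is available on this machine.
# Assumption: GPUs are exposed as the nvidia.com/gpu allocatable resource.
if command -v kubectl >/dev/null 2>&1; then
    kubectl get nodes --no-headers \
        -o 'custom-columns=NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
        | sum_gpus
fi
```

The printed total should match the GPU number in the dashboard; a mismatch suggests that some nodes are not reporting their GPUs.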
Step 5: (Optional) Set Node Roles
When installing a production cluster you may want to:
- Set one or more Run:AI system nodes. These are nodes dedicated to Run:AI software.
- Direct CPU-only jobs to dedicated nodes that do not have GPUs. Machine learning frequently involves jobs that need CPU but not GPU, and running them on dedicated CPU nodes avoids overloading the GPU machines.
- Limit Run:AI to specific nodes in the cluster.
To perform these tasks, see Set Node Roles.
Next Steps
- Set up Admin UI Users. See Working with Admin UI Users.
- Set up Projects for Researchers. See Working with Projects.
- Set up Researchers to work with the Run:AI Command-line interface (CLI). See Installing the Run AI Command-line Interface on how to install the CLI for users.
- Set up Project-based Researcher Access Control.
- Review advanced setup and maintenance scenarios.