Prerequisites

Before proceeding with this document, please review the installation types documentation to understand the difference between air-gapped and connected installations.

Hardware Requirements

(Production only) Run:AI System Nodes: To reduce downtime and save CPU cycles on expensive GPU machines, we recommend that production deployments include two or more worker machines designated for Run:AI software. These nodes do not have to be dedicated to Run:AI, but the following resources must be available for Run:AI:

  • 4 CPUs
  • 8GB of RAM
  • 120GB of Disk space
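
As a rough illustration, the thresholds above could be checked on a candidate node with a short Python script. This is a sketch, not part of the Run:AI installer: the function names are hypothetical, the thresholds are assumed to apply to each node, GB is read as 10^9 bytes, and the RAM probe relies on os.sysconf, which is available on Linux.

```python
import os
import shutil

# Thresholds from the list above (assumed here to apply to each node).
MIN_CPUS = 4
MIN_RAM_GB = 8
MIN_DISK_GB = 120


def system_node_shortfalls(cpus, ram_gb, disk_gb):
    """Return a list of unmet requirements; an empty list means the node qualifies."""
    shortfalls = []
    if cpus < MIN_CPUS:
        shortfalls.append(f"CPUs: found {cpus}, need {MIN_CPUS}")
    if ram_gb < MIN_RAM_GB:
        shortfalls.append(f"RAM: found {ram_gb:.1f}GB, need {MIN_RAM_GB}GB")
    if disk_gb < MIN_DISK_GB:
        shortfalls.append(f"Disk: found {disk_gb:.1f}GB, need {MIN_DISK_GB}GB")
    return shortfalls


def probe_local_node(path="/"):
    """Measure this machine (Linux): CPU count, physical RAM, disk size at path."""
    cpus = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    disk_bytes = shutil.disk_usage(path).total
    return cpus, ram_bytes / 10**9, disk_bytes / 10**9


if __name__ == "__main__":
    found = system_node_shortfalls(*probe_local_node())
    for line in found or ["Node meets the listed requirements."]:
        print(line)
```

Keeping the comparison separate from the probing makes it possible to feed in figures collected some other way, for example from an inventory system.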

The Run:AI backend installation requires Kubernetes Persistent Volumes with a total size of 110GB.
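
To estimate whether the capacity you plan to provide adds up to 110GB, the sizes can be summed from Kubernetes quantity strings such as "50Gi". The sketch below assumes the capacities have already been collected (for example from the spec.capacity.storage field of each Persistent Volume); the helper names are hypothetical, only the common suffixes are handled, and GB is read as 10^9 bytes.

```python
import re

REQUIRED_GB = 110  # total size stated above

# Common Kubernetes quantity suffixes mapped to byte multipliers.
_SUFFIXES = {
    "": 1,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}


def quantity_to_bytes(quantity):
    """Convert a quantity string such as '50Gi' or '100G' to bytes."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([A-Za-z]*)", quantity.strip())
    if not match or match.group(2) not in _SUFFIXES:
        raise ValueError(f"unsupported quantity: {quantity!r}")
    return float(match.group(1)) * _SUFFIXES[match.group(2)]


def total_capacity_gb(capacities):
    """Sum a list of quantity strings and return the total in GB (10^9 bytes)."""
    return sum(quantity_to_bytes(q) for q in capacities) / 10**9


def has_enough_pv_capacity(capacities):
    """True when the summed capacity reaches the required total."""
    return total_capacity_gb(capacities) >= REQUIRED_GB


if __name__ == "__main__":
    planned = ["50Gi", "60Gi"]  # illustrative values only
    print(f"{total_capacity_gb(planned):.1f}GB", has_enough_pv_capacity(planned))
```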

Run:AI Software Prerequisites

You should receive a single file, runai-<version>.tar, from Run:AI Customer Support.

You should receive a file, runai-gcr-secret.yaml, from Run:AI Customer Support. The file provides access to the Run:AI container registry.

OpenShift

Run:AI requires OpenShift 4.6 or later.
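
When checking a cluster version against this minimum in a script, compare the numeric components rather than the raw strings, because "4.10" sorts before "4.6" as text. A minimal sketch, assuming the version string has already been obtained from the cluster; the function names are hypothetical.

```python
MIN_OPENSHIFT = (4, 6)  # minimum version stated above


def parse_version(version):
    """Turn '4.10.3' into (4, 10, 3), ignoring any '-' or '+' suffix."""
    core = version.strip().lstrip("v").split("-")[0].split("+")[0]
    return tuple(int(part) for part in core.split("."))


def is_supported_openshift(version):
    """Compare major and minor numerically against the minimum."""
    return parse_version(version)[:2] >= MIN_OPENSHIFT


if __name__ == "__main__":
    for candidate in ("4.5.40", "4.6.0", "4.10.3"):
        print(candidate, is_supported_openshift(candidate))
```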

Important

  • Entitlement is the Red Hat OpenShift licensing mechanism. Without entitlement, you will not be able to install the NVIDIA drivers used by the GPU Operator. For further information see: here.
  • If you are planning to use NVIDIA A100 with CoreOS, you will need the latest GPU Operator (version 1.8).

Download Third-Party Dependencies

An OpenShift installation of Run:AI has third-party dependencies that must be pre-downloaded to an air-gapped environment. These are the NVIDIA GPU Operator and the Kubernetes Node Feature Discovery Operator.

For an air-gapped installation: download the NVIDIA GPU Operator pre-requisites. These instructions also include the download of the Kubernetes Node Feature Discovery Operator.

For a connected installation: no additional work needs to be performed. We will use the Red Hat Certified Operator Catalog (Operator Hub) during the installation.

Installer Machine

The machine running the installation script (typically the Kubernetes master) must have:

  • At least 50GB of free space.
  • Docker installed.
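
Both conditions can be verified before starting the installation. The sketch below is an illustration only, with hypothetical function names: it reads GB as 10^9 bytes, measures free space on the filesystem holding the current directory, and only confirms that a docker executable is on the PATH, not that the Docker daemon is running.

```python
import shutil

MIN_FREE_GB = 50  # free space stated above


def installer_problems(free_gb, docker_found):
    """Return a list of unmet installer-machine requirements."""
    problems = []
    if free_gb < MIN_FREE_GB:
        problems.append(f"Free space: found {free_gb:.1f}GB, need {MIN_FREE_GB}GB")
    if not docker_found:
        problems.append("Docker: no 'docker' executable found on PATH")
    return problems


def probe_installer_machine(path="."):
    """Measure free space at path and look for the docker executable."""
    free_gb = shutil.disk_usage(path).free / 10**9
    docker_found = shutil.which("docker") is not None
    return free_gb, docker_found


if __name__ == "__main__":
    found = installer_problems(*probe_installer_machine())
    for line in found or ["Installer machine meets the listed requirements."]:
        print(line)
```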

Other

  • (Air-gapped installation only) Private Docker Registry. Run:AI assumes that a Docker registry for images already exists, most likely one installed within the organization. The installation requires the network address and port of the registry (referenced below as <REGISTRY_URL>).
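
Because the installation needs both the address and the port, it can help to validate the registry value and test that it accepts connections beforehand. A minimal sketch follows. The function names and the example hostname are hypothetical, IPv6 literals are not handled, and a successful TCP connection does not prove that the endpoint is a working Docker registry.

```python
import socket


def split_registry_address(address):
    """Split 'host:port' (optional scheme and path are ignored) into (host, port)."""
    without_scheme = address.split("://", 1)[-1]
    host_port = without_scheme.split("/", 1)[0]
    host, sep, port = host_port.rpartition(":")
    if not sep or not host:
        raise ValueError(f"expected host:port, got {address!r}")
    return host, int(port)


def registry_reachable(address, timeout=3.0):
    """Return True if a TCP connection to the registry address succeeds."""
    host, port = split_registry_address(address)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Hypothetical address; substitute the value used for <REGISTRY_URL>.
    print(split_registry_address("registry.example.local:5000"))
```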

Last update: October 10, 2021