

Below are the prerequisites for a cluster on which Run:AI is installed.

Kubernetes Software

Run:AI requires Kubernetes 1.16 or above. Kubernetes 1.20 is recommended (as of April 2021).

If you are using Red Hat OpenShift, the minimum version is OpenShift 4.3, which runs Kubernetes 1.16.
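The minimum-version rule above can be checked mechanically. The following is a minimal sketch, not part of Run:AI: the function names `parse_version` and `meets_minimum` are hypothetical, and the version string is assumed to come from the cluster itself (for example, from the output of `kubectl version`).

```python
import re

# Minimum Kubernetes version stated in this document.
MIN_KUBERNETES = (1, 16)


def parse_version(text):
    """Extract (major, minor) from strings such as 'v1.20.4' or '1.16+'."""
    match = re.search(r"(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    return int(match.group(1)), int(match.group(2))


def meets_minimum(version_text, minimum=MIN_KUBERNETES):
    """Return True if the reported version is at or above the minimum."""
    return parse_version(version_text) >= minimum
```

Comparing integer tuples rather than raw strings avoids mistakes such as treating "1.9" as newer than "1.16".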


Run:AI requires all GPU nodes to be installed with NVIDIA driver version 410.104 or later and CUDA 9.0 or later.
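A GPU node can be validated against these minimums with a short check. This is a sketch under assumptions: the helper names `version_tuple` and `gpu_node_ok` are hypothetical, and the driver and CUDA version strings are assumed to have been collected from each node beforehand (for example, from `nvidia-smi`).

```python
# Minimum versions stated in this document.
MIN_DRIVER = (410, 104)
MIN_CUDA = (9, 0)


def version_tuple(text):
    """Turn a dotted version such as '450.80.02' into (450, 80, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def gpu_node_ok(driver_version, cuda_version):
    """Return True if both the driver and CUDA versions meet the minimums."""
    return (
        version_tuple(driver_version) >= MIN_DRIVER
        and version_tuple(cuda_version) >= MIN_CUDA
    )
```

Note that driver 410.79 is older than 410.104, which a string or floating-point comparison would get wrong; the tuple comparison handles it correctly.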

Hardware Requirements

(see picture below)

  • (Production only) Dedicated Run:AI System Nodes: To reduce downtime and save CPU cycles on expensive GPU machines, we recommend that production deployments include at least one dedicated worker machine, designated for Run:AI software, with:

    • 4 CPUs
    • 8GB of RAM
    • 50GB of Disk space
  • Shared data volume: Run:AI uses Kubernetes to abstract away the machine on which a container is running:

    • Researcher containers: The Researcher's containers need to be able to access data from any machine in a uniform way, to access training data and code as well as save checkpoints, weights, and other machine-learning-related artifacts.
    • The Run:AI system needs to save data on a storage device that is not dependent on a specific node.

    Typically, this is achieved via Network File System (NFS) or network-attached storage (NAS). NFS is usually the preferred method for Researchers, who may require multi-read/write capabilities.

  • Docker Registry: With Run:AI, Workloads are based on Docker images. For container images to run on any machine, they must be pulled from a Docker registry rather than reside on the local machine (though the latter is also possible). You can use a public registry such as Docker Hub or set up a local on-premises registry (preferably on a dedicated machine). Run:AI can assist with setting up the registry.

  • Kubernetes: Though out of scope for this document, a production Kubernetes installation requires separate nodes for the Kubernetes master.
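The sizing recommended above for a dedicated Run:AI system node (4 CPUs, 8GB of RAM, 50GB of disk space) can be expressed as a simple comparison. This is an illustrative sketch only: `NodeResources` and `shortfalls` are hypothetical names, and how the node's actual resources are gathered is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeResources:
    cpus: int
    ram_gb: float
    disk_gb: float


# Recommended minimum for a dedicated Run:AI system node, per this document.
SYSTEM_NODE_MINIMUM = NodeResources(cpus=4, ram_gb=8, disk_gb=50)


def shortfalls(node, minimum=SYSTEM_NODE_MINIMUM):
    """Return the names of the resources that fall below the minimum."""
    fields = ("cpus", "ram_gb", "disk_gb")
    return [name for name in fields if getattr(node, name) < getattr(minimum, name)]
```

An empty list means the candidate machine satisfies all three recommendations.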


User requirements

Usage of containers and images: The individual Researcher's work should be based on container images.

Network Requirements

The Run:AI user interface runs from the cloud. All container nodes must be able to connect to the Run:AI cloud. Inbound connectivity (connecting from the cloud into nodes) is not required. If outbound connectivity is proxied or limited, the following exceptions should be applied:

During Installation

Run:AI is installed on top of the Kubernetes cluster. The installation accesses the internet to download packages and images from several repositories and registries. Some organizations place limitations on what can be pulled from the internet. The following list shows the various solution components and their origin:

Name Description URLs Ports

Run:AI Repository

The Run:AI Package Repository is hosted on Run:AI’s account on Google Cloud


Docker Images Repository

Various Run:AI images


Docker Images Repository

Various third party Images


Post Installation

In addition, once running, Run:AI will send metrics to two sources:

Name Description URLs Ports


Grafana Metrics Server



Run:AI Cloud instance



Authentication Provider


Last update: April 7, 2021