What's New

September 6th, 2020

We released a module that helps the Researcher perform Hyperparameter Optimization (HPO). HPO is about running many smaller experiments with varying parameters to help determine the optimal parameter set. For further information, see the Hyperparameter Optimization Walk-through.
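To illustrate the idea of "many smaller experiments with varying parameters", here is a minimal sketch that enumerates a grid of parameter combinations. It does not use the Run:AI HPO module; the parameter names, values, and the `grid` helper are hypothetical, and a real search might sample randomly rather than exhaustively.

```python
from itertools import product

# Hypothetical search space; names and values are illustrative only.
search_space = {
    "learning_rate": [0.01, 0.001],
    "batch_size": [32, 64],
}


def grid(space):
    """Return one dict per combination of parameter values."""
    names = list(space)
    return [
        dict(zip(names, values))
        for values in product(*(space[name] for name in names))
    ]


experiments = grid(search_space)
for index, params in enumerate(experiments):
    # Each entry would be handed to one small training run.
    print(f"experiment {index}: {params}")
```

Each resulting dictionary corresponds to one experiment; comparing the results of those runs is what identifies the best parameter set.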

September 3rd, 2020

GPU Fractions now work for training jobs, not only interactive ones. Training jobs that use GPU Fractions can be preempted, bin-packed, and consolidated like any integer job. See Run:AI Scheduler Fraction for more.

August 10th, 2020

Run:AI now supports Distributed Training and Gang Scheduling. For further information, see the Launch Distributed Training Workloads Walkthrough.

August 4th, 2020

There is now an optional second level of Project hierarchy called Departments. For further information on how to configure and use Departments, see Working with Departments.

July 28th, 2020

You can now enforce a cluster-wide setting that requires all containers started with the Run:AI CLI to run as non-root. For further information, see Enforce non-root Containers.

July 21st, 2020

It is now possible to mount a Persistent Volume Claim using the Run:AI CLI. See the --pvc flag of the runai submit CLI command.

June 13th, 2020

New Settings for the Allocation of CPU and Memory

It is now possible to set limits for CPU and memory as well as to establish defaults based on the ratio of GPU to CPU and GPU to memory.

For further information see: Allocation of CPU and Memory
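As a rough illustration of ratio-based defaults, the sketch below derives a job's default CPU and memory from the number of GPUs it requests. The function name, the ratio values, and the linear formula are assumptions made for this example, not Run:AI's actual implementation or configuration.

```python
def default_cpu_and_memory(gpus, cpus_per_gpu, memory_gb_per_gpu):
    """Scale default CPU and memory linearly with the GPUs requested.

    All parameter names are hypothetical; the ratios would be set by an
    administrator.
    """
    return gpus * cpus_per_gpu, gpus * memory_gb_per_gpu


# Assumed ratios: 4 CPUs and 16 GB of memory per GPU.
cpu, memory_gb = default_cpu_and_memory(gpus=2, cpus_per_gpu=4, memory_gb_per_gpu=16)
print(cpu, memory_gb)
```

Under these assumed ratios, a job that asks for more GPUs receives proportionally larger CPU and memory defaults unless it requests explicit values.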

June 3rd, 2020

Node Group Affinity

Projects now support Node Affinity. This feature allows the administrator to assign specific projects to run only on specific nodes (machines). Example use cases:

  • The project team needs specialized hardware (e.g. with enough memory)
  • The project team is the owner of specific hardware which was acquired with a specialized budget
  • We want to direct build/interactive workloads to work on weaker hardware and direct longer training/unattended workloads to faster nodes

For further information see: Working with Projects

Limit Duration of Interactive Jobs

Researchers frequently forget to close Interactive jobs. This may lead to a waste of resources. Some organizations prefer to limit the duration of interactive jobs and close them automatically.

For further information on how to set up duration limits see: Working with Projects

May 24th, 2020

Kubernetes Operators

Cluster installation now works with Kubernetes Operators. Operators make it easy to install, update, and delete a Run:AI cluster.

For further information see: Upgrading a Run:AI Cluster Installation and Deleting a Run:AI Cluster Installation

March 3rd, 2020

Admin Overview Dashboard

A new admin overview dashboard shows a more holistic view of multiple clusters. It is applicable to customers with more than one cluster.


Last update: September 15, 2020