Run:AI System Components

Components

  • Run:AI is installed on top of a Kubernetes cluster.

  • Researchers submit Machine Learning workloads via the Run:AI Command-Line Interface (CLI), or directly by sending YAML files to Kubernetes.

  • Administrators monitor workloads and set priorities via the Administrator User Interface.
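
To make the "sending YAML files to Kubernetes" path more concrete, the following is a minimal sketch of a workload manifest built in Python. It is illustrative only: the scheduler name `runai-scheduler`, the namespace `team-a`, the pod name, and the image are assumptions that do not come from this page, so check the values used by your installation. The manifest is printed as JSON, which `kubectl` accepts in the same way as YAML.

```python
import json


def build_training_pod(name: str, project_namespace: str, image: str, gpus: int) -> dict:
    """Return a plain Kubernetes Pod manifest as a dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": project_namespace},
        "spec": {
            # ASSUMPTION: the scheduler name is a placeholder for whatever
            # name the Run:AI Scheduler is registered under in your cluster.
            "schedulerName": "runai-scheduler",
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Standard resource name exposed by the NVIDIA device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical workload name, namespace, and image.
    manifest = build_training_pod("train-1", "team-a", "python:3.11", 1)
    print(json.dumps(manifest, indent=2))
```

If the script is saved under a name such as `build_pod.py` (hypothetical), its output can be piped to `kubectl apply -f -` to submit the workload.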

[Figure: Run:AI architecture]

The Run:AI Cluster

The Run:AI Cluster contains:

  • The Run:AI Scheduler, which extends the Kubernetes scheduler. It uses business rules to schedule workloads sent by Researchers.
  • Fractional GPU management, which is responsible for the Run:AI virtualization technology that lets Researchers allocate part of a GPU rather than a whole GPU.
  • The Run:AI agent, which is responsible for sending monitoring data to the Run:AI Cloud.
  • Clusters require outbound network connectivity to the Run:AI Cloud.
  • The Run:AI Cluster is installed as a Kubernetes Operator.
  • Run:AI is installed in its own namespace, runai.
  • Workloads run in the context of Projects. Each Project is a Kubernetes namespace with its own settings and access control.
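
The idea behind fractional GPU allocation can be illustrated with a toy bookkeeping model. This sketch is not how Run:AI implements its virtualization; the `Gpu` class and the workload names are hypothetical and only show that several workloads can share one physical GPU as long as their requested fractions sum to at most one.

```python
from fractions import Fraction


class Gpu:
    """Toy model of one physical GPU shared by several workloads."""

    def __init__(self):
        # Maps a workload name to the fraction of the GPU it holds.
        self.allocations = {}

    @property
    def free(self):
        """Fraction of the GPU that is still unallocated."""
        return Fraction(1) - sum(self.allocations.values(), Fraction(0))

    def allocate(self, workload, fraction):
        """Reserve part of the GPU; return False if the request does not fit."""
        if workload in self.allocations:
            return False
        if fraction <= 0 or fraction > self.free:
            return False
        self.allocations[workload] = fraction
        return True


gpu = Gpu()
gpu.allocate("notebook", Fraction(1, 2))
gpu.allocate("inference", Fraction(1, 4))
# Only a quarter of the GPU is left, so a request for half is rejected.
fits = gpu.allocate("training", Fraction(1, 2))
```

Exact fractions are used instead of floats so that repeated allocations do not accumulate rounding error.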

The Run:AI Cloud

The Run:AI Cloud is the basis of the Administrator User Interface.

  • The Run:AI Cloud aggregates monitoring information from multiple tenants (customers).
  • Each customer may manage multiple Run:AI clusters.
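
The tenant-to-cluster relationship described above can be sketched as a small aggregation. The `Report` record, its field names, and the sample values are assumptions made for illustration; the actual monitoring data sent by the Run:AI agent is not described on this page. The sketch assumes one report per cluster.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical monitoring report sent by one cluster."""
    tenant: str
    cluster: str
    gpus_total: int
    gpus_allocated: int


def aggregate_by_tenant(reports):
    """Sum GPU counts per tenant across all of that tenant's clusters."""
    totals = defaultdict(
        lambda: {"clusters": set(), "gpus_total": 0, "gpus_allocated": 0}
    )
    for report in reports:
        entry = totals[report.tenant]
        entry["clusters"].add(report.cluster)
        entry["gpus_total"] += report.gpus_total
        entry["gpus_allocated"] += report.gpus_allocated
    return dict(totals)


# Hypothetical tenants and clusters.
reports = [
    Report("acme", "on-prem", 8, 6),
    Report("acme", "cloud", 4, 1),
    Report("globex", "lab", 16, 16),
]
totals = aggregate_by_tenant(reports)
```

Here the tenant `acme` manages two clusters, so its totals combine both reports, while `globex` has a single cluster.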

Last update: August 9, 2020