# Submitting Workloads via YAML
You can use YAML to submit Workloads directly to Run:ai. Below are examples of how to create training, interactive and inference workloads via YAML.
## Submit Workload Example
Create a file named `training1.yaml` with the following text:
```yaml
apiVersion: run.ai/v2alpha1
kind: TrainingWorkload # (1)
metadata:
  name: job-1 # (2)
  namespace: runai-team-a # (3)
spec:
  gpu:
    value: "1"
  image:
    value: gcr.io/run-ai-demo/quickstart
  name:
    value: job-1 # (4)
```
1. This is a Training workload.
2. Kubernetes object name. Mandatory, but has no functional significance.
3. Namespace: replace `runai-team-a` with the name of the Run:ai namespace for the specific Project (typically `runai-<Project-Name>`).
4. Job name as it appears in Run:ai. Provide a name explicitly, or have one created automatically if a name prefix is configured.
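If you are not sure which namespace maps to your Project, you can list the cluster namespaces and look for the conventional `runai-` prefix (this prefix is the common default, not a guarantee):

```bash
# List namespaces that follow the conventional Run:ai Project prefix.
kubectl get namespaces | grep runai-
```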
Change the namespace, then run:

```bash
kubectl apply -f training1.yaml
```

Run the following to see the new Workload:

```bash
runai list jobs
```
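You can also verify the object at the Kubernetes level. The resource name below assumes the TrainingWorkload CRD is registered under the plural `trainingworkloads`; adjust it if your cluster registers the CRD differently:

```bash
# Inspect the TrainingWorkload object directly (plural resource name is an assumption).
kubectl get trainingworkloads -n runai-team-a
```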
## Delete Workload Example
Run the following to delete the Workload:

```bash
kubectl delete -f training1.yaml
```
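If you no longer have the YAML file, you can delete the object by resource kind and name instead (again assuming the `trainingworkloads` resource name):

```bash
# Delete by resource kind and name rather than by file (resource name is an assumption).
kubectl delete trainingworkloads job-1 -n runai-team-a
```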
## Creating YAML Syntax from a CLI Command

An easy way to create YAML for a workload is to generate it from the `runai submit` command by using the `--dry-run` flag. For example, run a command along the following lines.
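A minimal sketch, assuming the common `runai submit` flags (`-i` for the image, `-g` for the GPU count, `--interactive`, and `--command` for the container command); verify the exact flags against `runai submit --help` for your CLI version:

```bash
# Illustrative sketch -- verify flags with `runai submit --help` for your CLI version.
runai submit -i ubuntu -g 1 --interactive --dry-run --command -- sleep infinity
```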
The result will be the following Kubernetes object declaration:
```yaml
apiVersion: run.ai/v2alpha1
kind: InteractiveWorkload # (1)
metadata:
  creationTimestamp: null
  labels:
    PreviousJob: "true"
  name: job-0-2022-05-02t08-50-57
  namespace: runai-team-a
spec:
  command:
    value: sleep infinity
  gpu:
    value: "1"
  image:
    value: ubuntu
  imagePullPolicy:
    value: Always
  name:
    value: job-0
  # ... Additional internal and status properties ...
```
1. This is an Interactive workload.
## Inference Workload Example
Creating an inference workload is similar to the above two examples.
```yaml
apiVersion: run.ai/v2alpha1
kind: InferenceWorkload
metadata:
  name: inference1
  namespace: runai-team-a
spec:
  name:
    value: inference1
  gpu:
    value: "0.5"
  image:
    value: "gcr.io/run-ai-demo/example-triton-server"
  minScale:
    value: 1
  maxScale:
    value: 2
  metric:
    value: concurrency # (1)
  target:
    value: 80 # (2)
  ports:
    items:
      port1:
        value:
          container: 8000
```
1. Possible metrics are `cpu-utilization`, `latency`, `throughput`, `concurrency`, `gpu-utilization`, and `custom`. Different metrics may require additional installations at the cluster level.
2. Inference requires a port on which to receive requests.
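Once the inference workload is running, requests are sent to the container port declared above (8000 in this example). As a quick smoke test against a Triton server you could hit its standard HTTP v2 readiness endpoint; the hostname below is a placeholder for whatever endpoint your cluster exposes for the workload:

```bash
# Placeholder endpoint -- substitute the actual URL exposed for the inference workload.
curl http://<inference-endpoint>/v2/health/ready
```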
## Suspend/Resume Interactive/Training Workload
To suspend training:
```yaml
apiVersion: run.ai/v2alpha1
kind: TrainingWorkload
metadata:
  name: job-1
  namespace: runai-team-a
spec:
  gpu:
    value: "1"
  active:
    value: false
  image:
    value: gcr.io/run-ai-demo/quickstart
  name:
    value: job-1
```
To suspend the workload, set `active` to `false`. To resume it, either set `active` to `true` or remove the field entirely.
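Instead of editing and re-applying the file, you can toggle the field in place with a merge patch. The plural resource name `trainingworkloads` is again an assumption:

```bash
# Suspend the workload in place (plural resource name is an assumption).
kubectl patch trainingworkloads job-1 -n runai-team-a \
  --type merge -p '{"spec":{"active":{"value":false}}}'
```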
## See Also
- To understand how to connect to the inference workload, see Inference Quickstart.
- To learn more about Inference and Run:ai see Inference overview.