Integrate Run:ai with Seldon Core
Seldon Core is an open-source platform for deploying machine learning models to production on Kubernetes. This document explains how to use Seldon Core together with Run:ai.

Of special importance is the use of Seldon Core together with the Run:ai fractions technology: production inference workloads typically require less GPU memory than training workloads, so allocating a fraction of a GPU per job allows for better GPU utilization.
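For illustration, a fractional GPU can also be requested directly through the Run:ai CLI. A minimal sketch, assuming an existing Run:ai project; the job and project names below are placeholders:

```bash
# Submit a job to the Run:ai scheduler, requesting half a GPU.
# "inference-test" and "my-project" are hypothetical names.
runai submit inference-test \
  --project my-project \
  --image seldonio/mock_classifier:1.5.0-dev \
  --gpu 0.5
```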
Prerequisites

Install Seldon Core as described in the Seldon Core installation documentation. We recommend the Helm-based installation of both Seldon Core and Istio, as sketched below.
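A minimal installation sketch, assuming Istio is used as the ingress and the cluster has no prior Seldon installation. Chart locations follow the public Istio and Seldon Core Helm repositories; versions and values may differ in your environment:

```bash
# Install Istio: base CRDs first, then the istiod control plane.
helm repo add istio https://istio-release.storage.googleapis.com/charts
helm install istio-base istio/base -n istio-system --create-namespace
helm install istiod istio/istiod -n istio-system --wait

# Install the Seldon Core operator with Istio integration enabled.
kubectl create namespace seldon-system
helm install seldon-core seldon-core-operator \
  --repo https://storage.googleapis.com/seldon-charts \
  --namespace seldon-system \
  --set istio.enabled=true
```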
Create a Seldon deployment

The instructions below use a sample machine learning model to test the integration of Run:ai and Seldon Core. Save the following in a file named <FILE-NAME>.yaml:
```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
  namespace: runai-<PROJECT-NAME>
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - name: classifier
          image: seldonio/mock_classifier:1.5.0-dev
          resources:
            limits:
              nvidia.com/gpu: <GPUs>
        schedulerName: runai-scheduler
    graph:
      children: []
      endpoint:
        type: REST
      name: classifier
      type: MODEL
    name: example
    replicas: 1
```
Replace <PROJECT-NAME> with the name of your Run:ai project and <GPUs> with the number of GPUs you want to allocate (for example, 0.5 for half a GPU).
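Then apply the manifest to create the deployment:

```bash
kubectl apply -f <FILE-NAME>.yaml
```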
Verification

Run `runai list jobs` and verify that the job is running.
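Once the job is running, you can send the model a test prediction through the Istio ingress gateway. A sketch, assuming the gateway is reachable at <INGRESS-HOST> (a placeholder); the request path follows Seldon's /seldon/<namespace>/<deployment-name> convention, and the input vector is arbitrary:

```bash
# Send a test request to the deployed mock classifier.
curl -X POST \
  http://<INGRESS-HOST>/seldon/runai-<PROJECT-NAME>/seldon-model/api/v1.0/predictions \
  -H 'Content-Type: application/json' \
  -d '{"data": {"ndarray": [[1.0, 2.0, 5.0]]}}'
```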
Delete a deployment
Run:
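```bash
# Deletes the SeldonDeployment created above, along with its pods.
kubectl delete seldondeployment seldon-model -n runai-<PROJECT-NAME>
```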