The Trainings interface provides a wizard to make submitting jobs easy.
You must have:
- Workspaces enabled.
- At least one Project configured.
See your system administrator to ensure the prerequisites are enabled and configured.
Where there is a card gallery, use the search bar to find specific cards based on title or field values.
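As a rough illustration of what searching cards "based on title or field values" could mean, the sketch below filters a list of cards with a case-insensitive substring match. The `Card` class, its `fields` dictionary, and the `search_cards` helper are hypothetical names invented for this example, and substring matching is an assumption; the product's actual matching rules may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Card:
    """Hypothetical stand-in for one card in a gallery (not a product API)."""
    title: str
    fields: Dict[str, str] = field(default_factory=dict)


def search_cards(cards: List[Card], query: str) -> List[Card]:
    """Return cards whose title or any field value contains the query.

    Matching is case-insensitive substring matching, which is an assumption.
    """
    needle = query.strip().lower()
    if not needle:
        return list(cards)  # an empty query matches every card
    return [
        card for card in cards
        if needle in card.title.lower()
        or any(needle in value.lower() for value in card.fields.values())
    ]


# Example data is made up for illustration.
cards = [
    Card("team-a", {"department": "research"}),
    Card("team-b", {"department": "production"}),
]
matches = search_cards(cards, "research")
```

Here `matches` holds only the `team-a` card, because the query appears in one of its field values rather than in its title.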
To add a training:
- Press Trainings in the menu.
- In the Projects pane, select the destination project. Use the search box to find projects that are not listed. If you can't find the project, see your system administrator.
- In the Templates pane, select a template from the list. Use the search box to find templates that are not listed. If you can't find the specific template you need, see your system administrator.
- In the Training name pane, enter a name for the training, then press Continue.
- In the Environment pane, select an environment or create a new one. Use the search box to find environments that are not listed.
- In the Compute resource pane, select resources for your training or create a new compute resource. Use the search box to find resources that are not listed. Press More settings to use Node Affinity to limit the resources to a specific node.
- In the Data sources pane, press add a new data source. For more information, see Creating a new data source. When complete, press Create Data Source.
- When complete, press Create training.
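The choices collected by the steps above can be pictured as a single record, filled in pane by pane. The sketch below is only a mental model: `TrainingRequest` and `missing_fields` are hypothetical names, not part of any product API, and treating Node Affinity and data sources as optional is an assumption based on how the steps are worded.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TrainingRequest:
    """Hypothetical record of the wizard's selections, in pane order."""
    project: str
    template: str
    name: str
    environment: str
    compute_resource: str
    node_affinity: Optional[str] = None  # set under More settings (assumed optional)
    data_sources: List[str] = field(default_factory=list)  # assumed optional

    def missing_fields(self) -> List[str]:
        """List the required selections that are still empty."""
        required = {
            "project": self.project,
            "template": self.template,
            "name": self.name,
            "environment": self.environment,
            "compute_resource": self.compute_resource,
        }
        return [key for key, value in required.items() if not value.strip()]


# All values below are made-up placeholders.
request = TrainingRequest(
    project="team-a",
    template="example-template",
    name="my-training",
    environment="",  # the Environment pane has not been completed yet
    compute_resource="one-gpu",
)
incomplete = request.missing_fields()
```

In this example `incomplete` contains only `"environment"`, which mirrors the wizard not being ready to create the training until every pane has a selection.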
The Trainings list shows the training jobs that you have created or have access to.
To manage your trainings:
- Press Trainings in the menu.
- Select a Training from the list.
- Choose from the following actions:
- Activate—activates the selected training job.
- Stop—stops the selected training job.
- Connect—connects to the training job's configured environment.
- Copy & edit—copies the details of the selected training job to a new training job.
- Delete—deletes the selected training job.
- Show details—displays details about the training job.
Training details are displayed using the Show details action. The details available per training job include:
- Event history—a graph of the job's status over time along with a list of events found in the log.
- Metrics—a graph of available metrics for the job. Use the drop-down to select a date and a time slice. Metrics include:
  - GPU utilization
  - GPU memory usage
  - CPU usage
  - CPU memory usage
- Logs—a log file of the current status. Use the download button to save the logs.
To hide the training details, press Hide details.
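The date and time-slice selection in the Metrics view can be thought of as a window filter over timestamped samples. The sketch below assumes a slice is a start time plus a duration. That interpretation, the `Sample` class, and the `select_slice` helper are all hypothetical and are not taken from the product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Sample:
    """Hypothetical metric reading; the field names are illustrative only."""
    timestamp: datetime
    gpu_utilization: float  # percent


def select_slice(samples: List[Sample], start: datetime, length: timedelta) -> List[Sample]:
    """Keep samples with start <= timestamp < start + length."""
    end = start + length
    return [s for s in samples if start <= s.timestamp < end]


# Made-up readings for illustration.
samples = [
    Sample(datetime(2024, 1, 1, 9, 0), 35.0),
    Sample(datetime(2024, 1, 1, 9, 30), 80.0),
    Sample(datetime(2024, 1, 1, 11, 0), 60.0),
]
window = select_slice(samples, datetime(2024, 1, 1, 9, 0), timedelta(hours=1))
average = sum(s.gpu_utilization for s in window) / len(window)
```

With a one-hour slice starting at 09:00, `window` keeps the first two readings and drops the 11:00 one.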