To set up the infrastructure for ML workloads, create a virtual machine (VM) with eight GPUs and a shared filesystem for training and a VM with one GPU for inference. In this guide, we will use the Nebius AI Cloud CLI to create VMs in a project in the eu-north1 region.
Before you start
Install the Nebius AI Cloud CLI
The Nebius AI Cloud CLI manages all Nebius AI Cloud resources. For more details, see the Nebius AI Cloud CLI documentation. To install and initialize the Nebius AI Cloud CLI, run the following commands one by one:
The nebius profile create command guides you through several prompts. After you complete the prompts, your browser opens the Nebius AI Cloud web console sign-in screen. Sign in to the web console to complete the initialization. If you have access to multiple tenants, the CLI prompts you to choose a tenant ID.
If the project ID was not configured during the nebius profile create flow, get the project ID and save it in the CLI configuration:
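A minimal sketch of saving a project ID follows. It assumes that the CLI stores the project ID through a config set parent-id subcommand; that subcommand name is an assumption recalled from the CLI and should be checked against the CLI help output. The helper name save_project_id is hypothetical.

```shell
# Sketch: store a project ID in the active CLI profile.
# Assumption: "nebius config set parent-id <ID>" is the subcommand that saves it.
# save_project_id is a hypothetical helper name used only in this example.
save_project_id() {
  project_id="$1"

  # Refuse to run without an ID so an empty value is never written.
  if [ -z "$project_id" ]; then
    echo "usage: save_project_id <project-id>" >&2
    return 2
  fi

  # Fail early with a clear message if the CLI is not installed.
  if ! command -v nebius >/dev/null 2>&1; then
    echo "nebius CLI not found in PATH" >&2
    return 127
  fi

  nebius config set parent-id "$project_id"
}
```

Call the helper with the project ID shown in the web console, for example save_project_id followed by your own ID.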
Install jq
In this guide, we will use jq to extract IDs and tokens from JSON data returned by the Nebius AI Cloud CLI. For more details, see the jq documentation.
Generate keys for SSH access to the VM
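One way to generate a key pair is with ssh-keygen, sketched below. The key file name and the comment string are hypothetical choices for this example, and the empty passphrase is used only so that the sketch runs without prompts.

```shell
# Sketch: generate an Ed25519 SSH key pair without interactive prompts.
# KEY_PATH is a hypothetical file name chosen for this example.
KEY_PATH="${KEY_PATH:-$HOME/.ssh/id_ed25519_nebius_guide}"

# Make sure the target directory exists and is private.
mkdir -p "$(dirname "$KEY_PATH")"
chmod 700 "$(dirname "$KEY_PATH")"

# Do not overwrite a key that already exists at this path.
if [ ! -f "$KEY_PATH" ]; then
  # -N "" sets an empty passphrase; omit -N to be prompted for one instead.
  ssh-keygen -q -t ed25519 -N "" -C "nebius-vm-access" -f "$KEY_PATH"
fi

# The .pub file is the public half, which is the part a VM needs to know.
cat "$KEY_PATH.pub"
```

The private key stays on your machine; only the contents of the .pub file are shared.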
Generate an SSH key pair.
Create a VM with eight GPUs with InfiniBand™ and a shared filesystem for training
- Create a boot disk and save its ID to an environment variable:
  The command creates a 200 GiB SSD disk with a 4 KiB block size that uses an Ubuntu boot image with pre-installed NVIDIA GPU drivers. For details about boot disk images (--source-image-family-image-family), see Boot disk images for Compute virtual machines.
- Create a shared filesystem and save its ID to an environment variable:
  The command creates a 1 TiB SSD shared filesystem with a 4 KiB block size.
- Get the subnet ID and save it to an environment variable:
  Example of a subnet ID: vpcsubnet-e0dcbaa76x2024xyz8.
- For high-speed networking and efficient training, consider interconnecting the GPUs of multiple VMs in a GPU cluster using InfiniBand. To do this, before creating the VM, create a GPU cluster to connect the VM to and get its ID:
- Create a VM with eight GPUs for training:
  This example assumes that the VMs have public IP addresses, so that you can later connect to them over SSH. If you need isolated VMs without public addresses, remove the "public_ip_address": {} line from the VM configuration. To access such a VM, you can set up a WireGuard jump server later. This approach enhances security and still provides access to the VM within the same subnet. For more information about creating VMs and managing their network parameters, see How to create a virtual machine in Nebius AI Cloud.
Create a VM with one GPU for inference
- Create a boot disk and save its ID to an environment variable:
  For details about boot disk images (--source-image-family-image-family), see Boot disk images for Compute virtual machines.
- Create a VM with one GPU for inference:
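The steps that save an ID to an environment variable share one pattern: run a command that prints JSON, extract the ID with jq, and export the result. The sketch below shows that pattern on a stand-in function instead of a real CLI call. The function name fake_create_disk, the JSON shape with the ID under metadata.id, and the sample ID value are all assumptions made for illustration.

```shell
# Sketch: capture an ID from JSON output into an environment variable with jq.
# fake_create_disk is a hypothetical stand-in for a CLI command that prints JSON.
# Assumption: the ID is located at .metadata.id in the response.
fake_create_disk() {
  cat <<'EOF'
{
  "metadata": {
    "id": "computedisk-example123",
    "name": "inference-boot-disk"
  }
}
EOF
}

# -r prints the raw string without surrounding JSON quotes.
NB_BOOT_DISK_ID=$(fake_create_disk | jq -r '.metadata.id')
export NB_BOOT_DISK_ID

# jq prints "null" when the path is missing, so check for that as well.
if [ -z "$NB_BOOT_DISK_ID" ] || [ "$NB_BOOT_DISK_ID" = "null" ]; then
  echo "could not read an ID from the response" >&2
else
  echo "$NB_BOOT_DISK_ID"
fi
```

Checking for an empty or null value before using the variable helps catch a failed command early, before the ID is passed to a later step.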
Connect to the VMs
Connect to the VM for training via SSH:
- Get your VM’s public IP address and save it to an environment variable:
- Use the public IP address to connect to the VM:
Connect to the VM for inference via SSH:
- Get your VM’s public IP address and save it to an environment variable:
- Use the public IP address to connect to the VM:
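The two connection steps can be sketched as follows, assuming the public address has already been read from the CLI output. The sketch assumes the address may be reported in CIDR form with a /32 suffix; the sample address (from the documentation range 203.0.113.0/24), the user name, and the key path are hypothetical placeholders.

```shell
# Sketch: turn a reported public address into an SSH command.
# Assumption: the address may carry a CIDR suffix such as "/32".
RAW_ADDRESS="${RAW_ADDRESS:-203.0.113.10/32}"   # placeholder address

# Strip everything from the first "/" onward; a plain address is left unchanged.
NB_VM_IP="${RAW_ADDRESS%%/*}"
export NB_VM_IP

SSH_USER="${SSH_USER:-user}"                     # hypothetical login name
SSH_KEY="${SSH_KEY:-$HOME/.ssh/id_ed25519}"      # hypothetical private key path

# Print the command instead of running it, so the sketch is safe to try.
SSH_COMMAND="ssh -i $SSH_KEY $SSH_USER@$NB_VM_IP"
echo "$SSH_COMMAND"
```

Replace the placeholders with the values for your VM, then run the printed command to open the session.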
What’s next
- Learn about VM and GPU types
- Learn how to create different types of VMs
- Learn more about VM networking
- Learn how to work with GPU clusters