GPU Pods

Pods are the core compute units on the Yotta Platform. They let users deploy, manage, and connect to isolated GPU workloads across a range of hardware through the Console or the API.

💻 Managing Pods via Console

Pod Page Overview

When entering the Pods page:

  • By default, the page displays Pods in the In Progress state, including:

    • Initialize – Resources are being allocated; the Pod is deploying.

    • Running – The Pod is running normally.

    • Stopping – The Pod is pausing; resources are being reclaimed.

    • Stopped – The Pod has been paused.

    • Terminating – The Pod is being terminated; resources are being reclaimed.

  • Click the History tab to view Pods that ended within the last 24 hours, including:

    • Terminated – The Pod has been deleted.

    • Failed – Deployment failed. Common causes:

      • Insufficient system resources

      • Invalid image configuration

  • Use the search bar to find a Pod by name (fuzzy search is supported). You can also filter Pods by Pod Status or GPU Type.

⚙️ Deploying a Pod

Step-by-Step Guide

  • Navigate to: Compute → Pods

  • Click Deploy (top right). You’ll enter the GPU Selection page.

  • Select GPU Type

    • Choose a GPU model suitable for your workload.

  • Configure Pod

    • Fill in required parameters (fields marked with * are mandatory).

Image Requirements

Click Edit next to the image name to further configure your image. We provide a list of official images compiled by Yotta Labs. You can also select custom images, both Public Images and Private Images.

Here are a few requirements if you want to build your own custom image:

  • Must be compiled for x86 architecture

  • Must be Debian/Ubuntu
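A custom image meeting both requirements can be as small as the sketch below. This is illustrative only (the base image and installed packages are our choices, not a Yotta requirement beyond x86 and Debian/Ubuntu); the `--platform=linux/amd64` flag pins the build to x86.

```docker
# Minimal custom-image sketch: x86 architecture, Ubuntu base.
FROM --platform=linux/amd64 ubuntu:22.04

# Install whatever your workload needs (example packages only).
RUN apt-get update && apt-get install -y --no-install-recommends \
        python3 python3-pip \
    && rm -rf /var/lib/apt/lists/*

# Keep the container alive so the Pod stays connectable.
CMD ["sleep", "infinity"]
```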

  • Deploy

    • Click Deploy to complete the process.

💾 System Volume

The System Volume automatically mounts a list of system directories on the created Pod. This ensures that software, configurations, and data stored in these directories persist even if the Pod is edited or restarted.

Supported Directories

Read and write operations to the following directories will be persistent:

| Directory | Brief Description |
| --- | --- |
| /home | User home directories; user-level configs and data. |
| /root | Root user's home directory; scripts and temp data. |
| /var | Variable files (logs, caches, runtime data). |
| /run | Runtime status files (PIDs, sockets). |
| /etc | System and service configuration files. |
| /usr | System-level apps, libraries, and runtime components. |

Size Requirements

To ensure that the Pod can launch and run smoothly, we recommend sizing your system volume with the following rule:

System volume size ≥ Image size × 3

Example:

  • Image: PyTorch base image (10 GiB)

  • Recommended System Volume Size: At least 30 GiB
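The sizing rule above can be expressed as a one-line helper (a sketch; the function name is ours and not part of any Yotta SDK):

```python
def recommended_system_volume_gib(image_size_gib: float, factor: int = 3) -> float:
    """Minimum recommended system volume size per the documented rule:
    system volume size >= image size x 3."""
    return image_size_gib * factor

# PyTorch base image of 10 GiB -> at least 30 GiB of system volume.
print(recommended_system_volume_gib(10))
```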

Recommended Uses

  • Preserving Environments: Retaining toolchains or dependencies (e.g., pip packages) after a Pod rebuild.

  • Persisting Configurations: Saving changes made in /etc.

  • Retaining Logs: Keeping logs in /var for a specific period.

Do not use the System Volume for general data storage. If you have large data to store, use dedicated storage (Volume/PVC). Examples:

  • Database data files.

  • User-uploaded files.

  • Long-term archival data.

How to Enable

Click Dev & Debug Pod to enable the system volume.

🔌 Connecting to Your Pod

Once the Pod is launched:

  • Click the Connect button on the Pod card to view exposed services.

  • Availability depends on the port configuration defined at deployment.

  • When the container port is Ready, the status will update automatically.

📜 Viewing Logs

  • Click Logs on the Pod card to view both:

    • System Logs (platform-level)

    • Container Logs (application-level)

This helps with debugging deployment or runtime issues.

🧊 Pausing or Terminating Pods

🔸 Pause

If you only need to suspend temporarily:

  • Click Pause on the Pod card.

  • Only Volume storage will continue to incur charges.

  • Click Run to restart anytime.

  • Pods can be edited while paused.

🔸 Terminate

If you want to remove the Pod completely:

  • Click the “...” on the Pod card → choose Terminate.

  • The Pod will be permanently deleted and no longer billed.

  • Terminated Pods cannot be edited or restarted.

✏️ Editing a Pod

  1. Go to Compute → Pods and locate the Pod.

  2. Click Pause and wait until the Pod enters Stopped state.

  3. Click “...” → Edit, modify configurations, and save.

  4. Click Run to restart the Pod with the new settings.

📈 Pod Status Reference

| Status | Description |
| --- | --- |
| Initialize | Resource allocation in progress; Pod deploying |
| Running | Pod is running |
| Stopping | Pausing in progress; resources being reclaimed |
| Stopped | Pod is paused |
| Terminating | Termination in progress; resources being reclaimed |
| Terminated | Pod fully terminated |
| Failed | Deployment failed (insufficient resources / invalid image) |

💰 Pricing & Billing

Formula

Pod hourly cost = (GPU unit price × number of GPUs)
                + (Disk hourly rate × GB size)
                + (Volume hourly rate × GB size)
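The formula translates directly into a small cost helper. The rates in the usage line are placeholders for illustration, not actual Yotta pricing:

```python
def pod_hourly_cost(gpu_unit_price: float, gpu_count: int,
                    disk_rate_per_gb: float, disk_gb: float,
                    volume_rate_per_gb: float, volume_gb: float) -> float:
    """Hourly cost per the documented formula:
    (GPU unit price x GPUs) + (disk rate x GB) + (volume rate x GB)."""
    return (gpu_unit_price * gpu_count
            + disk_rate_per_gb * disk_gb
            + volume_rate_per_gb * volume_gb)

# Placeholder rates: 2 GPUs at $2.50/h, 100 GB disk, 500 GB volume.
print(pod_hourly_cost(2.50, 2, 0.0001, 100, 0.0002, 500))
```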

Deduction Rules

  • Billing starts once the Pod is Running.

  • When balance nears $0, all active Pods will be terminated automatically.

  • To avoid charges:

    • Use Pause to temporarily suspend (still charges for persistent volumes).

    • Use Terminate to completely stop billing.

| Action | Billing Behavior |
| --- | --- |
| Pause | Charges continue for Volumes (Stopped state) |
| Terminate | No charges (Terminated state) |


🧾 Viewing Your Bill

Go to Billing in the left sidebar to view:

  • Pod usage breakdown

  • GPU, Disk, and Volume hourly costs

  • Historical billing data


🧩 Managing Pods via OpenAPI

You can also manage Pods programmatically via Yotta Labs’ OpenAPI.

API Reference

Tip: Always review the API documentation before calling endpoints to avoid common request errors (invalid parameters, insufficient balance, etc.).
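As a sketch of what a programmatic deployment call might look like, the helper below assembles a request without sending it. The endpoint URL and every field name here are hypothetical; consult the API Reference for the real schema before use:

```python
import json

def build_deploy_request(name: str, gpu_type: str, gpu_count: int,
                         image: str, system_volume_gib: int,
                         api_key: str) -> dict:
    """Assemble a hypothetical Pod-deployment request.

    The URL and body fields are illustrative guesses, not the real
    Yotta Labs OpenAPI schema; check the API Reference for actual names.
    """
    return {
        "url": "https://api.example.com/v1/pods",  # placeholder endpoint
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        "body": json.dumps({
            "name": name,
            "gpuType": gpu_type,
            "gpuCount": gpu_count,
            "image": image,
            "systemVolumeGiB": system_volume_gib,
        }),
    }

req = build_deploy_request("train-1", "A100", 2,
                           "pytorch/pytorch:latest", 30, "MY_KEY")
print(req["url"])
```

Separating request construction from sending makes the payload easy to inspect and validate (e.g., against the size rule above) before spending balance on a deployment.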

🧱 Example Use Cases

  • Automated Pod Deployment via Python SDK

  • Monitoring Pod Logs using API polling

  • Scaling Workloads across multiple GPU types

  • Integrating with CI/CD to trigger training jobs automatically
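For the log-polling use case above, one common pattern is to fetch the full log text periodically and emit only new lines. The fetcher is injected as a callable (a wrapper around whatever logs endpoint the API exposes), which keeps this sketch testable without network access:

```python
import time

def poll_logs(fetch_logs, interval_s: float = 5.0,
              max_polls=None, sleep=time.sleep):
    """Repeatedly call fetch_logs() (which returns the full log text so
    far) and yield only lines not seen in earlier polls."""
    seen = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        lines = fetch_logs().splitlines()
        yield from lines[seen:]       # emit only the new tail
        seen = len(lines)
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_s)
```

In production you would pass a fetcher that calls the logs endpoint for a given Pod ID and a longer interval; `max_polls` and the injectable `sleep` exist mainly so the loop can be bounded and tested.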


🪄 Best Practices

  • Use Pause instead of Terminate for short-term downtime.

  • Monitor balance regularly to prevent auto-termination.

  • Always verify image compatibility (x86 / Ubuntu-based).

  • For debugging, prefer checking container logs first.

