# Queue-based

### Key Features

<figure><img src="/files/wR04Etn8rIFxWb6lCNQj" alt=""><figcaption></figcaption></figure>

#### 🚀 Elastic Scaling

Queue Mode automatically adjusts the number of active workers based on incoming request volume. When queue load increases, additional workers are provisioned; when demand decreases, excess workers are automatically terminated to save costs.
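The scaling behavior described above can be pictured with a small sketch. This is not Yotta Labs' actual scaling algorithm, which this page does not specify; the function name `desired_workers`, the per-worker capacity, and the minimum and maximum bounds are all assumptions made for illustration.

```python
import math


def desired_workers(queue_depth: int, per_worker_capacity: int,
                    min_workers: int = 0, max_workers: int = 10) -> int:
    """Return the worker count a simple queue-depth policy would request.

    Hypothetical policy: one worker per `per_worker_capacity` queued
    requests, clamped to the [min_workers, max_workers] range.
    """
    if per_worker_capacity <= 0:
        raise ValueError("per_worker_capacity must be positive")
    needed = math.ceil(queue_depth / per_worker_capacity)
    return max(min_workers, min(max_workers, needed))


# As the queue grows, more workers are requested; as it drains, fewer are.
for depth in (0, 3, 25, 500):
    print(depth, desired_workers(depth, per_worker_capacity=5,
                                 min_workers=1, max_workers=8))
```

A real autoscaler would likely also smooth its decisions over time to avoid rapid scale-up and scale-down cycles; the sketch only shows the basic relationship between queue load and worker count.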

#### 💰 Transparent Pricing

* **Per-second billing**: Pay only for the compute time you actually use.
* **No idle charges**: Workers incur costs only while actively processing requests.

<figure><img src="/files/61pfhCpyScaVlm9j2kHb" alt=""><figcaption></figcaption></figure>

For more details on pricing and billing, see [Pricing & Billing | Yotta Labs](https://docs.yottalabs.ai/products/serverless/pricing-and-billing).
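As a rough illustration of how per-second billing works out, the sketch below converts an hourly rate into a per-second rate and charges only for active processing time. The function name `billed_cost` and the $1.80/hour rate are made up for the example; actual rates are listed on the pricing page.

```python
def billed_cost(active_seconds: int, price_per_hour: float,
                workers: int = 1) -> float:
    """Cost when each worker is billed per second of active processing only.

    Hypothetical helper: idle time contributes nothing because only
    `active_seconds` enters the calculation.
    """
    price_per_second = price_per_hour / 3600
    return active_seconds * price_per_second * workers


# Made-up example: 2 workers, each busy for 90 seconds, at $1.80/hour.
cost = billed_cost(90, 1.80, workers=2)
print(f"${cost:.4f}")
```

Because the cost is proportional to the number of workers, scaling a deployment up or down changes the total accordingly.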

### Architecture

#### Queue System

The intelligent queue sits at the center of the architecture, managing:

* Request buffering during traffic spikes
* Load distribution across available workers
* Health checks and automatic failover
* Priority-based request handling
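The buffering, distribution, and priority handling listed above can be sketched with a small in-memory priority queue. This is a conceptual model only, not the platform's implementation: the class names, the convention that lower numbers mean higher priority, and the one-request-per-worker dispatch rule are all assumptions.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class Request:
    priority: int                        # lower number is served first (assumption)
    seq: int                             # tie-breaker: FIFO within a priority level
    payload: str = field(compare=False)  # not used for ordering


class RequestQueue:
    """Hypothetical buffer that hands requests to healthy workers."""

    def __init__(self) -> None:
        self._heap = []
        self._counter = itertools.count()

    def submit(self, payload: str, priority: int = 1) -> None:
        # Buffer the request until a worker is available.
        heapq.heappush(self._heap, Request(priority, next(self._counter), payload))

    def dispatch(self, healthy_workers: list[str]) -> list[tuple[str, str]]:
        # Give each healthy worker one request, highest priority first.
        assignments = []
        for worker in healthy_workers:
            if not self._heap:
                break
            assignments.append((worker, heapq.heappop(self._heap).payload))
        return assignments

    def __len__(self) -> int:
        return len(self._heap)


queue = RequestQueue()
queue.submit("batch-job", priority=2)
queue.submit("interactive-a", priority=0)
queue.submit("interactive-b", priority=0)

# Only workers that pass health checks are passed in; unhealthy ones are skipped.
print(queue.dispatch(["worker-1", "worker-2"]))
print(len(queue))  # requests still buffered
```

In this model, failover amounts to leaving an unhealthy worker out of the `healthy_workers` list, so buffered requests wait for the remaining workers instead of being lost.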

#### Worker Management

* **On-demand provisioning**: Workers spin up in seconds when needed
* **Resource isolation**: Each worker operates in a dedicated, secure environment
* **Status monitoring**: Real-time visibility into worker health and performance

### Getting Started

**1. Configure Your Container**

See [Launching a Deployment | Yotta Labs](https://docs.yottalabs.ai/products/serverless/launching-a-deployment)

**2. View/Edit/Clone Configurations**

<figure><img src="/files/Df5zJ2a5BekLPRUBGJsu" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/J0cT2AUkoli1qkJgdEJr" alt=""><figcaption></figcaption></figure>

This takes you back to the configuration settings page, where you can clone or edit the configuration using the buttons at the bottom. Editing is available only when the deployment is paused or terminated.

<figure><img src="/files/t84JcX962y7wXKnB17nk" alt=""><figcaption></figcaption></figure>

**3. Scale Workers**

<figure><img src="/files/Fpbcuczry7OfDe3eC3Uy" alt=""><figcaption></figcaption></figure>

<figure><img src="/files/x9iIMbFgHo6Hw39zl1fJ" alt=""><figcaption></figcaption></figure>

❗️After you scale workers, the price changes accordingly.

**4. Terminate/Run Deployment**

* Terminate

<figure><img src="/files/0FJ61HpvLBqlun9AeS1a" alt=""><figcaption></figcaption></figure>

❗️Each time you click `pause` to terminate the deployment, the running service stops. After a restart, new worker IDs are assigned and uptime resets to zero. If no volume is mounted, all temporary files and caches are lost. Resuming the deployment requires reloading the container image and re-downloading the model.

* Run

<figure><img src="/files/xG2t1NBQQt2NMccexnbm" alt=""><figcaption></figcaption></figure>

**5. Terminate Worker/See Logs**

<figure><img src="/files/zUah8T4DouQi7cNP8GGx" alt=""><figcaption></figcaption></figure>

❗️When your configuration has only one worker, you cannot stop it with the terminate button shown above. To stop it, pause the deployment.

