# Pricing

### 1. Overview

AI Gateway aggregates multiple types of models, including LLMs (large language models), text-to-image models, and text-to-video models. The billing logic differs by model type.

* LLM models are billed by the number of tokens consumed, split into two dimensions: input and output. Some LLM models (such as the GLM series) support context caching; cached tokens are billed at a lower cached unit price.

<figure><img src="https://4009603828-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2ezFC70sdvT4ACdioCrw%2Fuploads%2FtPJUdLjmixfrhI8bItbX%2Fimage.png?alt=media&#x26;token=a2cf0a3e-1761-4cbc-bd7a-f5ff8c24c54c" alt="" width="411"><figcaption></figcaption></figure>
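As a minimal sketch, token-based billing with a cached rate can be computed like this (the function and parameter names are illustrative, not the gateway's actual schema; prices are USD per million tokens as described above):

```python
def token_cost(input_tokens, output_tokens, cached_tokens,
               input_price, output_price, cached_price):
    """Compute the USD cost of one LLM request.

    Prices are USD per million tokens. Cached tokens are treated as a
    subset of the input tokens, billed at the lower cached unit price.
    """
    billable_input = input_tokens - cached_tokens
    return (billable_input * input_price
            + cached_tokens * cached_price
            + output_tokens * output_price) / 1_000_000
```

For a model without context caching, pass `cached_tokens=0` and the cached price is never applied.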

* Text-to-image models are generally billed by the number of images generated; some models have tiered pricing based on resolution, such as 1K, 2K, or 4K.

<figure><img src="https://4009603828-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2ezFC70sdvT4ACdioCrw%2Fuploads%2FruC7OF1bXwjjTcBSFCou%2Fimage.png?alt=media&#x26;token=06472e99-117f-4586-a3b2-6da8aac64526" alt="" width="329"><figcaption></figcaption></figure>
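A sketch of per-image billing under both schemes described above (`per_image` and `resolution_tiers` are the field names this page uses; the function shape is an assumption for illustration):

```python
def image_cost(count, resolution=None, per_image=None, resolution_tiers=None):
    """Compute the USD cost of an image generation request.

    Exactly one of per_image (flat USD price per image) or
    resolution_tiers (mapping of resolution label -> USD price per image)
    should be set; the two schemes are mutually exclusive.
    """
    if per_image is not None:
        return count * per_image
    return count * resolution_tiers[resolution]
```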

* Image/text-to-video models are billed by the duration (in seconds) of the generated video. Some also have tiered pricing based on resolution and on whether audio is generated.

<figure><img src="https://4009603828-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F2ezFC70sdvT4ACdioCrw%2Fuploads%2FSTug9Lyey3mHuRcvOHyp%2Fimage.png?alt=media&#x26;token=ef930f95-af94-4833-848e-c0299905e5b5" alt="" width="500"><figcaption></figcaption></figure>
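Duration-based video billing can be sketched in the same style (the per-second rate and audio surcharge are illustrative assumptions; in practice the rate may come from a resolution tier):

```python
def video_cost(seconds, per_second, audio_surcharge=0.0, with_audio=False):
    """Compute the USD cost of a generated video.

    Billing is per second of output; models with audio-dependent pricing
    add a per-second surcharge when audio is generated.
    """
    rate = per_second + (audio_surcharge if with_audio else 0.0)
    return seconds * rate
```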

### 2. Model Pricing Examples

Below are reference values for the pricing fields of currently integrated models (all prices in USD; token-based models are priced per million tokens):

| Model             | Type  | Input | Output | Cached | Remarks                  |
| ----------------- | ----- | ----- | ------ | ------ | ------------------------ |
| Claude Sonnet 4.6 | token | $3.00 | $15.00 |        |                          |
| GLM 5             | token | $0.95 | $3.04  | $0.19  | Supports context caching |
| Seedream 4.5      | image |       | $0.38  |        |                          |
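As a worked example using the GLM 5 prices from the table (the request sizes are illustrative): a request with 100,000 input tokens, 40,000 of which are cache hits, and 20,000 output tokens costs about $0.1254.

```python
# GLM 5 pricing from the table: $0.95 input, $3.04 output,
# $0.19 cached (all USD per million tokens).
input_tokens, cached_tokens, output_tokens = 100_000, 40_000, 20_000

cost = ((input_tokens - cached_tokens) * 0.95   # uncached input
        + cached_tokens * 0.19                  # cache hits at the cached rate
        + output_tokens * 3.04) / 1_000_000     # output tokens
```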

### 3. Common Questions

**Q: Why don't image generation models use token billing?**

The computational resource consumption of image generation models depends mainly on image resolution and the number of images generated, with little correlation to the number of prompt tokens. Upstream providers therefore all charge per image. Some models (such as DALL·E 3) price different resolutions differently, expressed through the resolution\_tiers array.

**Q: Can per\_image and resolution\_tiers coexist?**

No. They are mutually exclusive: use per\_image if all model resolutions have the same price; use resolution\_tiers if different resolutions have different prices. If both exist simultaneously, the data layer should validate and report an error.
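The validation described above could be sketched as follows (the field names `per_image` and `resolution_tiers` come from this page; the function itself and the requirement that at least one field be present are assumptions for illustration):

```python
def validate_image_pricing(pricing: dict) -> None:
    """Reject pricing configs where per_image and resolution_tiers
    both appear, since the two schemes are mutually exclusive."""
    has_flat = pricing.get("per_image") is not None
    has_tiers = pricing.get("resolution_tiers") is not None
    if has_flat and has_tiers:
        raise ValueError("per_image and resolution_tiers are mutually exclusive")
    if not (has_flat or has_tiers):
        raise ValueError("image pricing needs per_image or resolution_tiers")
```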


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.yottalabs.ai/products/ai-gateway/pricing.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
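A minimal sketch of building such a query URL, using Python's standard library (the base URL is taken from this page; only URL construction is shown, not the request itself):

```python
from urllib.parse import urlencode

def ask_url(question: str) -> str:
    """Build the documentation-query URL with the question
    passed as the url-encoded `ask` query parameter."""
    base = "https://docs.yottalabs.ai/products/ai-gateway/pricing.md"
    return f"{base}?{urlencode({'ask': question})}"
```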
