Commit 8c806dc

content list
1 parent 5711fdc commit 8c806dc

1 file changed

Lines changed: 8 additions & 0 deletions

0_Azure/3_AzureAI/9_AzureOpenAI/demos/11_ModelAvailability.md

@@ -21,6 +21,7 @@ Last updated: 2025-03-03
 ## Content
 
 - [Deployment Options](#deployment-options)
+- [Pricing models](#pricing-models)
 - [When Azure OpenAI Model Availability PTU is Not Available](#when-azure-openai-model-availability-ptu-is-not-available)
 - [Complete a Capacity Request](#complete-a-capacity-request)
 - [Use a Different Model](#use-a-different-model)
@@ -36,6 +37,13 @@ Last updated: 2025-03-03
 | **Global Standard** | Global Standard deployments leverage Azure's global infrastructure to dynamically route customer traffic to the data center with the best availability for the customer’s inference requests. | Highest initial throughput limits, best model availability, low latency. | Potential latency variation for high volume workloads. | Ideal for applications needing high availability and low latency. Uses Azure's global load balancing and routing capabilities. |
 | **Provisioned Throughput Units (PTUs)** | PTUs provide guaranteed throughput by allocating specific processing capacity for your deployment. This ensures stable performance and predictable latency. | Predictable performance, allocated processing capacity, potential cost savings for high throughput workloads. | Requires accurate forecasting of capacity needs, may involve higher upfront costs. | Best for applications with consistent and high throughput requirements. Requires careful planning and capacity management. |
 
+## Pricing models
+
+| **Pricing Model** | **Description** | **Ideal For** | **Billing**|
+|----------------------------------|---------------------------------------------------------------------------------|----------------------------------------|-----------------------------------------------------------------------------|
+| **Standard (On-Demand)** | Charges based on the number of input and output tokens used. | Applications with variable or unpredictable usage. | Pay-as-you-go.|
+| **Provisioned Throughput Units (PTUs)** | Allocates specific throughput capacity for predictable costs. | Applications with predictable and consistent usage. | Monthly or annual basis, often at a discounted rate compared to on-demand. |
+
 ## When Azure OpenAI Model Availability PTU is Not Available
 
 <img width="550" alt="image" src="https://github.com/user-attachments/assets/f1da9940-a809-4902-95ba-f524e490edbe" />
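
The trade-off described in the Pricing models table added by this commit can be illustrated with a small cost comparison. The following is a minimal sketch, not an Azure pricing calculator: every name (`TokenPrices`, `on_demand_monthly_cost`, `provisioned_monthly_cost`, `cheaper_model`) is hypothetical, all prices are made-up placeholders, and provisioned capacity is assumed to be a flat monthly charge per PTU for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPrices:
    """Hypothetical pay-as-you-go prices, in currency units per 1,000 tokens."""
    input_per_1k: float
    output_per_1k: float


def on_demand_monthly_cost(input_tokens: int, output_tokens: int,
                           prices: TokenPrices) -> float:
    """Standard (On-Demand): cost scales with the tokens actually used."""
    return ((input_tokens / 1000) * prices.input_per_1k
            + (output_tokens / 1000) * prices.output_per_1k)


def provisioned_monthly_cost(ptu_count: int, price_per_ptu_month: float) -> float:
    """PTUs: assumed flat monthly charge, independent of token volume."""
    return ptu_count * price_per_ptu_month


def cheaper_model(on_demand: float, provisioned: float) -> str:
    """Return which pricing model costs less for the month (ties go to standard)."""
    return "standard" if on_demand <= provisioned else "provisioned"


if __name__ == "__main__":
    # Placeholder numbers only; substitute real prices from the Azure pricing page.
    prices = TokenPrices(input_per_1k=0.01, output_per_1k=0.03)
    on_demand = on_demand_monthly_cost(1_000_000, 500_000, prices)
    provisioned = provisioned_monthly_cost(ptu_count=2, price_per_ptu_month=10.0)
    print(f"on-demand: {on_demand:.2f}, provisioned: {provisioned:.2f}, "
          f"cheaper: {cheaper_model(on_demand, provisioned)}")
```

With steady, high token volume the flat provisioned charge tends to win. With variable or low volume the pay-as-you-go total stays below it, which matches the "Ideal For" column of the table.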
