content list

brown9804 · web-flow · commit 8c806dc49ea4 · 2025-03-03T16:54:02.000-06:00
diff --git a/0_Azure/3_AzureAI/9_AzureOpenAI/demos/11_ModelAvailability.md b/0_Azure/3_AzureAI/9_AzureOpenAI/demos/11_ModelAvailability.md
@@ -21,6 +21,7 @@ Last updated: 2025-03-03
 ## Content
 
 - [Deployment Options](#deployment-options)
+- [Pricing models](#pricing-models)
 - [When Azure OpenAI Model Availability PTU is Not Available](#when-azure-openai-model-availability-ptu-is-not-available)
     - [Complete a Capacity Request](#complete-a-capacity-request)
     - [Use a Different Model](#use-a-different-model)
@@ -36,6 +37,13 @@ Last updated: 2025-03-03
 | **Global Standard**   | Global Standard deployments leverage Azure's global infrastructure to dynamically route customer traffic to the data center with the best availability for the customer’s inference requests. | Highest initial throughput limits, best model availability, low latency. | Potential latency variation for high volume workloads. | Ideal for applications needing high availability and low latency. Uses Azure's global load balancing and routing capabilities. |
 | **Provisioned Throughput Units (PTUs)** | PTUs provide guaranteed throughput by allocating specific processing capacity for your deployment. This ensures stable performance and predictable latency. | Predictable performance, allocated processing capacity, potential cost savings for high throughput workloads. | Requires accurate forecasting of capacity needs, may involve higher upfront costs. | Best for applications with consistent and high throughput requirements. Requires careful planning and capacity management. |
 
+## Pricing models
+
+| **Pricing Model**                | **Description**                                                                 | **Ideal For**                          | **Billing**|
+|----------------------------------|---------------------------------------------------------------------------------|----------------------------------------|-----------------------------------------------------------------------------|
+| **Standard (On-Demand)**         | Charges based on the number of input and output tokens used.                     | Applications with variable or unpredictable usage. | Pay-as-you-go.|
+| **Provisioned Throughput Units (PTUs)** | Allocates specific throughput capacity for predictable costs.                   | Applications with predictable and consistent usage. | Monthly or annual basis, often at a discounted rate compared to on-demand. |
+
 ## When Azure OpenAI Model Availability PTU is Not Available
 
 <img width="550" alt="image" src="https://github.com/user-attachments/assets/f1da9940-a809-4902-95ba-f524e490edbe" />