Enhance documentation on Azure Load Balancer and Gateway
Updated notes on Azure Load Balancer and Application Gateway features, including their core functionalities, health checks, security features, and developer impacts. Added clarifications on routing models and best fit for Foundry workloads.
0_Azure/3_AzureAI/AIFoundry/demos/13_APIM_LoadBalancer_AI.md (14 additions, 1 deletion)
@@ -41,6 +41,9 @@ Last updated: 2026-01-22
|**Hub-to-Hub**| Multiple hubs interconnected | High resiliency, regional autonomy | More complex networking, higher cost |
|**Hub-and-Spoke**| One central hub, multiple spokes | Easier to manage, centralized policies | Hub becomes a critical dependency |
> [!NOTE]
> For **Microsoft Foundry APIs**, you can use **Application Gateway** because it’s HTTP/S‑aware, integrates with APIM, and provides advanced routing and WAF protection. Azure Load Balancer is useful for **internal, low‑level traffic distribution**, but it is not sufficient on its own for developer‑facing Foundry workloads.

## Unified Gateway with APIM
`Applications only call APIM endpoints, not individual Foundry instances. This simplifies SDKs and client logic.`
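As a minimal sketch of what "only call APIM endpoints" can look like on the client side, the snippet below builds a request against a single gateway URL. The base URL, the `/foundry/chat` path, the key value, and the helper name `build_apim_request` are hypothetical placeholders; `Ocp-Apim-Subscription-Key` is the default header name APIM uses for subscription keys.

```python
import json
import urllib.request

# Hypothetical values for illustration; substitute your own APIM gateway URL and key.
APIM_BASE_URL = "https://example-apim.azure-api.net"
SUBSCRIPTION_KEY = "demo-key"


def build_apim_request(path: str, payload: dict) -> urllib.request.Request:
    """Build a POST request aimed at the APIM gateway, never at a Foundry instance."""
    url = f"{APIM_BASE_URL}/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Default header name APIM uses for subscription keys.
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        },
        method="POST",
    )


# The path below is an assumed API route, not one defined by the original notes.
request = build_apim_request("/foundry/chat", {"input": "hello"})
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

Because the client only knows the gateway URL, moving or adding Foundry instances behind APIM should not require a client change.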
@@ -61,10 +64,12 @@ From [What is Azure API Management?](https://learn.microsoft.com/en-us/azure/api
From [What is Azure API Management?](https://learn.microsoft.com/en-us/azure/api-management/api-management-key-concepts)
From [GPT-RAG Solution Accelerator](https://github.com/Azure/GPT-RAG)
## Frontend Layer
`The frontend layer ensures that user traffic is secure and optimized before it reaches backend workloads. You don’t need to hardcode region logic in the client; Front Door handles it.`
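To illustrate why the client needs no region logic, here is a simplified model of the kind of decision a global frontend makes on the client's behalf. It is a sketch under stated assumptions, not Front Door's actual algorithm: the `Origin` fields, the region names, and the "lowest priority number, then lowest latency" rule are all illustrative.

```python
from dataclasses import dataclass


@dataclass
class Origin:
    region: str
    priority: int      # lower number = preferred (assumed convention)
    latency_ms: float  # latency as measured from the edge (made-up values below)
    healthy: bool


def select_origin(origins: list[Origin]) -> Origin:
    """Pick the healthy origin with the best priority, breaking ties by latency."""
    healthy = [o for o in origins if o.healthy]
    if not healthy:
        raise RuntimeError("no healthy origin available")
    best_priority = min(o.priority for o in healthy)
    candidates = [o for o in healthy if o.priority == best_priority]
    return min(candidates, key=lambda o: o.latency_ms)


origins = [
    Origin("eastus", priority=1, latency_ms=80.0, healthy=False),
    Origin("westeurope", priority=1, latency_ms=35.0, healthy=True),
    Origin("southeastasia", priority=1, latency_ms=120.0, healthy=True),
    Origin("brazilsouth", priority=2, latency_ms=10.0, healthy=True),
]
chosen = select_origin(origins)
```

The client sends every request to one global hostname; the selection above happens at the edge, so a region outage changes the chosen origin without any client change.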
@@ -102,6 +107,14 @@ From [Comparison between Azure Front Door and Azure CDN services](https://learn.
- Load Balancers:
  - Azure Load Balancer or Application Gateway distributes traffic across multiple Foundry instances in a region.
  - Health probes detect unhealthy instances and remove them from rotation.

| Aspect | Azure Load Balancer | Application Gateway |
|---|---|---|
|**Core Functionality**| Operates at **Layer 4 (TCP/UDP)**. Distributes raw network traffic across backend pools (VMs, containers, or services). No awareness of HTTP/S protocols. Best for simple, high‑throughput scenarios. | Operates at **Layer 7 (HTTP/S)**. Fully protocol‑aware, designed for web/API workloads. Supports SSL/TLS termination, URL/path‑based routing, and advanced traffic rules. |
|**Health & Routing**| Uses **TCP, HTTP, or HTTPS probes** to check whether instances respond; UDP probes are not supported. Distribution is hash‑based (five‑tuple by default, with optional source‑IP affinity). Probes check reachability or an HTTP 200 status, but the load balancer never routes on request content. | Uses **HTTP/S probes** that can validate Foundry endpoints directly. Supports routing by path, hostname, headers, and cookies. Enables intelligent failover and sticky sessions. |
|**Security & Features**| Provides basic distribution only. Security is handled externally (NSGs, firewalls). No SSL/TLS offload and no WAF. | Includes **Web Application Firewall (WAF)**, SSL/TLS termination, request inspection, and session affinity. Directly protects Foundry APIs from malicious traffic. |
|**Developer Impact**| Lightweight and fast, but requires APIM or another Layer 7 service for API‑aware routing, logging, and quota enforcement. Developers see it as “plumbing.” | Rich features directly usable by developers: routing rules, SSL/TLS offload, WAF, cookie affinity. Integrates naturally with APIM for policy enforcement and observability. |
|**Best Fit for Foundry**| Internal traffic distribution where simplicity and raw throughput matter (e.g., VM/container clusters hosting Foundry). | External/API traffic distribution where **security, routing intelligence, and observability** are critical; this is the recommended choice for Microsoft Foundry workloads. |
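The Layer 4 versus Layer 7 distinction in the table can be sketched in a few lines. This is a toy model, not how either Azure service is implemented: the function names, backend names, and pool names are hypothetical, SHA-256 stands in for whatever hash the platform uses, and longest-prefix matching is an assumption of this sketch rather than a description of Application Gateway's rule evaluation.

```python
import hashlib


def in_rotation(backends: list[str], probe_results: dict[str, bool]) -> list[str]:
    """Keep only backends whose last health probe succeeded."""
    return [b for b in backends if probe_results.get(b, False)]


def l4_pick_backend(flow: tuple, backends: list[str]) -> str:
    """Layer 4 style: hash the connection tuple; nothing about the HTTP request is seen."""
    digest = hashlib.sha256(repr(flow).encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


def l7_pick_pool(path: str, rules: dict[str, str], default_pool: str) -> str:
    """Layer 7 style: choose a backend pool from the URL path (longest prefix wins here)."""
    matches = [prefix for prefix in rules if path.startswith(prefix)]
    return rules[max(matches, key=len)] if matches else default_pool


backends = ["foundry-a", "foundry-b", "foundry-c"]
healthy = in_rotation(backends, {"foundry-a": True, "foundry-b": False, "foundry-c": True})

# Five-tuple: source IP, source port, destination IP, destination port, protocol.
flow = ("10.0.0.4", 50123, "10.0.1.10", 443, "tcp")
chosen = l4_pick_backend(flow, healthy)

rules = {"/chat": "chat-pool", "/chat/admin": "admin-pool"}
pool = l7_pick_pool("/chat/admin/users", rules, "default-pool")
```

The Layer 4 function can only spread connections evenly, while the Layer 7 function can send different API paths to different pools, which is the capability the table attributes to Application Gateway.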
- Routing Models:
  - Hub‑and‑Spoke: A central hub routes traffic to spokes (regional APIM + Foundry). Easier to manage, but the hub is a dependency.
  - Hub‑to‑Hub: Each hub can route to the others, providing regional autonomy. More resilient, but the networking is more complex.
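The trade-off between the two routing models can be made concrete with a small sketch. The function names, region names, and health dictionaries are hypothetical, and the "first healthy alternative" fallback is an assumed policy chosen to keep the example short.

```python
from typing import Optional


def route_hub_and_spoke(
    hub_healthy: bool, spoke_health: dict[str, bool], preferred: str
) -> Optional[str]:
    """All traffic passes through one hub, so a hub outage blocks every spoke."""
    if not hub_healthy:
        return None
    if spoke_health.get(preferred, False):
        return preferred
    # Fall back to the first healthy spoke, if any.
    return next((name for name, ok in spoke_health.items() if ok), None)


def route_hub_to_hub(hub_health: dict[str, bool], preferred: str) -> Optional[str]:
    """Each region has its own hub, so traffic can move to a peer hub."""
    if hub_health.get(preferred, False):
        return preferred
    # Fall back to the first healthy peer hub, if any.
    return next((name for name, ok in hub_health.items() if ok), None)


spokes = {"eastus": True, "westeurope": True}
blocked = route_hub_and_spoke(False, spokes, "eastus")
rerouted = route_hub_to_hub({"eastus": False, "westeurope": True}, "eastus")
```

With the hub down, the hub-and-spoke call returns `None` even though both spokes are healthy, while the hub-to-hub call moves traffic to the peer hub.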