Commit d7d22e5 ("how to resolve it"), 1 parent: 9747df3

1 file changed: 0_Azure/3_AzureAI/AIFoundry/demos/4_TruncationHandling.md
Lines changed: 92 additions & 3 deletions
@@ -13,15 +13,20 @@ Last updated: 2025-03-03

<details>
<summary><b>List of References</b> (Click to expand)</summary>

- [Chunk large documents for vector search solutions in Azure AI Search](https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-chunk-documents)
- [What is Azure OpenAI in Azure AI Foundry Models?](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/overview)
- [Troubleshooting and best practices for Azure OpenAI On Your Data](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/how-to/on-your-data-best-practices)

</details>

<details>
<summary><b>Table of Contents</b> (Click to expand)</summary>

- [Overview](#overview)
- [How to resolve truncation issues](#how-to-resolve-truncation-issues)

</details>
@@ -106,6 +111,90 @@ Last updated: 2025-03-03

</details>
## How to resolve truncation issues

| **Solution Area** | **Action** | **Why It Helps** |
|-------------------|------------|------------------|
| Token Budgeting | Use Azure Functions with `tiktoken` to pre-calculate token usage before inference | Prevents exceeding token limits and enables chunk-aware document processing |
| Semantic Chunking | Use Azure AI Search's Document Layout or Text Split skills for structure-aware chunking | Preserves logical boundaries and improves embedding and retrieval quality |
| Temperature Control | Configure `temperature` and `top_p` in Azure OpenAI deployment settings | Reduces verbosity and keeps completions within the token budget |
| Output Constraints | Use `max_tokens`, `stop` sequences, and `top_p` in Azure OpenAI API calls | Ensures clean, bounded outputs and avoids mid-sentence truncation |
| Monitoring & Scaling | Use Azure Monitor, Log Analytics, and PTUs for throughput and cost control | Enables observability and resilience at enterprise scale |
<details>
<summary><b>Token Budgeting in Azure</b> (Click to expand)</summary>

> Azure OpenAI models such as GPT-4 Turbo (128K context) enforce strict token limits. Complex documents with nested logic or rare terms can tokenize inefficiently, leading to unexpected truncation. Use an Azure Function or Logic App with the `tiktoken` library to analyze and split documents into token-aware chunks before sending them to Azure OpenAI.

**How to Apply in Azure:**

- Deploy a lightweight Azure Function that:
  - Accepts document input
  - Uses `tiktoken` to count tokens
  - Splits content into ≤3000-token chunks
  - Returns chunks to Power Automate or Azure OpenAI for inference

**Monitoring:**

- Use Azure Monitor and Log Analytics to track:
  - `tokens_used`
  - `flowRunId`
  - `request_uri`
- Visualize trends in Power BI to detect spikes or anomalies

</details>
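The Azure Function steps above can be sketched as a small greedy chunker. This is a minimal sketch with a pluggable token counter: in a real deployment you would pass `tiktoken`'s encoder (e.g. `len(tiktoken.encoding_for_model("gpt-4").encode(s))`), while a whitespace word count stands in here so the example has no external dependencies. The function name and defaults are illustrative, not from any Azure SDK.

```python
# Sketch of the token-budgeting step an Azure Function might perform.
# Swap the stand-in counter for tiktoken in production:
#   enc = tiktoken.encoding_for_model("gpt-4")
#   count_tokens = lambda s: len(enc.encode(s))
from typing import Callable, List


def chunk_by_token_budget(
    text: str,
    max_tokens: int = 3000,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),
) -> List[str]:
    """Greedily pack paragraphs into chunks of at most max_tokens tokens.

    A single paragraph larger than the budget still becomes its own chunk;
    a production version would split it further (e.g. by sentence).
    """
    chunks: List[str] = []
    current: List[str] = []
    current_tokens = 0
    for para in text.split("\n\n"):
        cost = count_tokens(para)
        if current and current_tokens + cost > max_tokens:
            chunks.append("\n\n".join(current))  # flush the full chunk
            current, current_tokens = [], 0
        current.append(para)
        current_tokens += cost
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# 100 paragraphs of 7 words each, packed under a 50-token budget.
doc = "\n\n".join("paragraph %d with a few extra words" % i for i in range(100))
chunks = chunk_by_token_budget(doc, max_tokens=50)
print(len(chunks))  # 7 paragraphs fit per chunk, so 15 chunks
```

Rejoining the chunks reproduces the original paragraph sequence, so nothing is lost at chunk boundaries; only the grouping changes.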
<details>
<summary><b>Semantic Chunking with Azure AI Search</b> (Click to expand)</summary>

> Azure AI Search supports semantic chunking via built-in skills such as Document Layout and Text Split. These skills preserve logical structure and improve retrieval quality for RAG pipelines. Chunking is not just about staying under token limits; it also improves embedding quality and relevance scoring. See [Chunk large documents for vector search solutions in Azure AI Search](https://learn.microsoft.com/en-us/azure/search/vector-search-how-to-chunk-documents) for details.

**How to Apply in Azure:**

- Use the **Document Layout skill** to chunk by:
  - Paragraphs
  - Headings (e.g., Markdown or HTML)
  - Tables or sections
- Use the **Text Split skill** to:
  - Split by sentence or character count
  - Add 10–15% overlap between chunks

**Example Configuration:**

```json
{
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
      "textSplitMode": "pages",
      "maximumPageLength": 800,
      "pageOverlapLength": 100
    }
  ]
}
```

</details>
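For intuition, the SplitSkill settings above can be mirrored client-side. This is a minimal sketch assuming page length is measured in characters; the function and parameter names echo the skill's settings but are illustrative, not part of any Azure SDK.

```python
# Client-side sketch of what the SplitSkill configuration does:
# fixed-size "pages" of at most 800 characters, where the last 100
# characters of each page are repeated at the start of the next one
# (roughly the 10-15% overlap the text recommends).
from typing import List


def split_pages(text: str, maximum_page_length: int = 800,
                page_overlap_length: int = 100) -> List[str]:
    if page_overlap_length >= maximum_page_length:
        raise ValueError("overlap must be smaller than the page length")
    step = maximum_page_length - page_overlap_length  # advance per page
    pages: List[str] = []
    for start in range(0, len(text), step):
        pages.append(text[start:start + maximum_page_length])
        if start + maximum_page_length >= len(text):
            break  # this page already reached the end of the text
    return pages


pages = split_pages("x" * 2000, maximum_page_length=800,
                    page_overlap_length=100)
print([len(p) for p in pages])  # adjacent pages share 100 characters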
176+
177+
<details>
178+
<summary><b> Temperature & Output Control in Azure OpenAI </b> (Click to expand)</summary>
179+
180+
> High temperature values (e.g., 0.8–1.0) increase creativity but also verbosity, which can lead to token overflow. Lower values (e.g., 0.2–0.4) yield more concise, deterministic outputs. Combine temperature control with `top_p`, `stop` sequences, and `max_tokens` in your Azure OpenAI deployment or API call. Click here to read more about [What is Azure OpenAI in Azure AI Foundry Models?](https://learn.microsoft.com/en-us/azure/ai-foundry/openai/overview)
181+
182+
**How to Apply in Azure:**
183+
- In Azure OpenAI Studio or API:
184+
```json
185+
{
186+
"temperature": 0.3,
187+
"top_p": 0.9,
188+
"max_tokens": 1500,
189+
"stop": ["\n\n", "###", "END"]
190+
}
191+
```
192+
193+
- For stateless, high-throughput scenarios:
194+
- Use Provisioned Throughput Units (PTUs) for predictable performance
195+
- Monitor latency and token usage with Azure Monitor.
196+
197+
</details>
109198

110199

111200

0 commit comments

Comments
 (0)