Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,7 +11,7 @@ User story

This solution accelerator enables customers to programmatically extract data and apply schemas to unstructured documents across text-based and multi-modal content. During processing, extraction and data schema transformation - these steps are scored for accuracy to automate processing and identify as-needed human validation. This allows for improved accuracy and greater speed for data integration into downstream systems.

It leverages Azure AI Foundry, Azure AI Content Understanding, Azure OpenAI Service, Azure blob storage, and Cosmos DB to transform large volumes of unstructured content through event-driven processing pipelines for integration into downstream applications and post-processing activities.
It leverages Azure AI Foundry, Azure AI Content Understanding, Azure OpenAI Service, Azure blob storage, and Azure Cosmos DB to transform large volumes of unstructured content through event-driven processing pipelines for integration into downstream applications and post-processing activities.


### Technical key features
Expand Down
14 changes: 14 additions & 0 deletions docs/AzureAccountSetup.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
## Azure account setup

1. Sign up for a [free Azure account](https://azure.microsoft.com/free/) and create an Azure Subscription.
2. Check that you have the necessary permissions:
* Your Azure account must have `Microsoft.Authorization/roleAssignments/write` permissions, such as [Role Based Access Control Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#role-based-access-control-administrator-preview), [User Access Administrator](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#user-access-administrator), or [Owner](https://learn.microsoft.com/azure/role-based-access-control/built-in-roles#owner).
* Your Azure account also needs `Microsoft.Resources/deployments/write` permissions on the subscription level.

You can view the permissions for your account and subscription by following the steps below:
- Navigate to the [Azure Portal](https://portal.azure.com/) and click on `Subscriptions` under 'Navigation'
- Select the subscription you are using for this accelerator from the list.
- If you try to search for your subscription and it does not come up, make sure no filters are selected.
- Select `Access control (IAM)` and you can see the roles that are assigned to your account for this subscription.
- If you want to see more information about the roles, you can go to the `Role assignments`
tab and search by your account name and then click the role you want to view more information about.
17 changes: 12 additions & 5 deletions docs/DeploymentGuide.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

## **Pre-requisites**

To deploy this solution accelerator, ensure you have access to an [Azure subscription](https://azure.microsoft.com/free/) with the necessary permissions to create **resource groups, resources, app registrations, and assign roles at the resource group level**. This should include Contributor role at the subscription level and Role Based Access Control role on the subscription and/or resource group level. Follow the steps in [Azure Account Set Up](./docs/AzureAccountSetUp.md).
To deploy this solution accelerator, ensure you have access to an [Azure subscription](https://azure.microsoft.com/free/) with the necessary permissions to create **resource groups, resources, app registrations, and assign roles at the resource group level**. This should include Contributor role at the subscription level and Role Based Access Control role on the subscription and/or resource group level. Follow the steps in [Azure Account Set Up](./AzureAccountSetUp.md).

Check the [Azure Products by Region](https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/?products=all&regions=all) page and select a **region** where the following services are available:

Expand Down Expand Up @@ -38,7 +38,7 @@ This will allow the scripts to run for the current session without permanently c

## Deployment Options & Steps

Pick from the options below to see step-by-step instructions for GitHub Codespaces, VS Code Dev Containers, Local Environments, and Bicep deployments.
Pick from the options below to see step-by-step instructions for GitHub Codespaces, VS Code Dev Containers, and Local Environments.

| [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/microsoft/content-processing-solution-accelerator) | [![Open in Dev Containers](https://img.shields.io/static/v1?style=for-the-badge&label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/microsoft/content-processing-solution-accelerator) |
|---|---|
Expand Down Expand Up @@ -117,16 +117,16 @@ When you start the deployment, most parameters will have **default values**, but
| **Azure AI Content Understanding Location** | Select from a drop-down list of values. | Sweden Central |
| **Secondary Location** | A **less busy** region for **Azure Cosmos DB**, useful in case of availability constraints. | eastus2 |
| **Deployment Type** | Select from a drop-down list. | GlobalStandard |
| **GPT Model** | Choose from **gpt-4, gpt-4o, gpt-4o-mini**. | gpt-4o |
| **GPT Model Deployment Capacity** | Configure capacity for **GPT models**. | 100k |
| **GPT Model** | Choose from **gpt-4o**. | gpt-4o |
| **GPT Model Deployment Capacity** | Configure capacity for **GPT models**. | 300k |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@toherman-msft @brittneek I think the model capacity should be 30K instead of 300K.


</details>

<details>
<summary><b>[Optional] Quota Recommendations</b></summary>

By default, the **GPT model capacity** in deployment is set to **30k tokens**.
> **We recommend increasing the capacity to 100k tokens for optimal performance.**
> **We recommend increasing the capacity to 100k tokens, if available, for optimal performance.**

To adjust quota settings, follow these [steps](./AzureGPTQuotaSettings.md).

Expand Down Expand Up @@ -261,3 +261,10 @@ This will rebuild the source code, package it into a container, and push it to t
4. **Deleting Resources After a Failed Deployment**

- Follow steps in [Delete Resource Group](./DeleteResourceGroup.md) if your deployment fails and/or you need to clean up the resources.

## Next Steps

Now that you've completed your deployment, you can start using the solution. Try out these things to start getting familiar with the capabilities:
* Open the web container app URL in your browser and explore the web user interface and upload your own invoices.
* [Create your own schema definition](./CustomizeSchemaData.md), so you can upload and process your own types of documents.
* [Ingest the API](API.md) for processing documents programmatically.
Loading