
Commit 474127f

Merge branch 'main' into dev
2 parents e617edf + 55a554a commit 474127f

15 files changed

Lines changed: 180 additions & 31 deletions
Lines changed: 45 additions & 0 deletions
```markdown
@@ -0,0 +1,45 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: bug
assignees: ''

---

# Describe the bug

A clear and concise description of what the bug is.

# Expected behavior

A clear and concise description of what you expected to happen.

# How does this bug make you feel?

_Share a gif from [giphy](https://giphy.com/) to tell us how you'd feel_

---

# Debugging information

## Steps to reproduce

Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error

## Screenshots

If applicable, add screenshots to help explain your problem.

## Logs

If applicable, add logs to help the engineer debug the problem.

---

# Tasks

_To be filled in by the engineer picking up the issue_

- [ ] Task 1
- [ ] Task 2
- [ ] ...
```
Lines changed: 32 additions & 0 deletions
```markdown
@@ -0,0 +1,32 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: enhancement
assignees: ''

---

# Motivation

A clear and concise description of why this feature would be useful and the value it would bring.
Explain any alternatives considered and why they are not sufficient.

# How would you feel if this feature request was implemented?

_Share a gif from [giphy](https://giphy.com/) to tell us how you'd feel. Format: ![alt_text](https://media.giphy.com/media/xxx/giphy.gif)_

# Requirements

A list of requirements to consider this feature delivered:

- Requirement 1
- Requirement 2
- ...

# Tasks

_To be filled in by the engineer picking up the issue_

- [ ] Task 1
- [ ] Task 2
- [ ] ...
```

.github/ISSUE_TEMPLATE/subtask.md

Lines changed: 22 additions & 0 deletions
```markdown
@@ -0,0 +1,22 @@
---
name: Sub task
about: A sub task
title: ''
labels: subtask
assignees: ''

---

Required by <link to parent issue>

# Description

A clear and concise description of what this subtask is.

# Tasks

_To be filled in by the engineer picking up the subtask_

- [ ] Task 1
- [ ] Task 2
- [ ] ...
```

.github/PULL_REQUEST_TEMPLATE.md

Lines changed: 39 additions & 0 deletions
```markdown
@@ -0,0 +1,39 @@
## Purpose
<!-- Describe the intention of the changes being proposed. What problem does it solve, or what functionality does it add? -->
* ...

## Does this introduce a breaking change?
<!-- Mark one with an "x". -->

- [ ] Yes
- [ ] No

<!-- Please prefix your PR title with one of the following:
* `feat`: A new feature
* `fix`: A bug fix
* `docs`: Documentation only changes
* `style`: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc.)
* `refactor`: A code change that neither fixes a bug nor adds a feature
* `perf`: A code change that improves performance
* `test`: Adding missing tests or correcting existing tests
* `build`: Changes that affect the build system or external dependencies (example scopes: gulp, broccoli, npm)
* `ci`: Changes to our CI configuration files and scripts (example scopes: Travis, Circle, BrowserStack, SauceLabs)
* `chore`: Other changes that don't modify src or test files
* `revert`: Reverts a previous commit
* `!`: A breaking change is indicated with a `!` after the listed prefixes above, e.g. `feat!`, `fix!`, `refactor!`, etc.
-->

## Golden Path Validation
- [ ] I have tested the primary workflows (the "golden path") to ensure they function correctly without errors.

## Deployment Validation
- [ ] I have validated the deployment process successfully and all services are running as expected with this change.

## What to Check
Verify that the following are valid:
* ...

## Other Information
<!-- Add any other helpful information that may be needed here. -->
```
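The title-prefix convention in this template can be checked locally before opening a PR. A minimal sketch, assuming POSIX shell; `check_pr_title` is a hypothetical helper, not part of this repository:

```shell
# Hypothetical helper (not part of this repo): checks a PR title against the
# prefix convention above - one of the listed prefixes, an optional "!" for a
# breaking change, then ": " and a summary.
check_pr_title() {
  echo "$1" | grep -Eq '^(feat|fix|docs|style|refactor|perf|test|build|ci|chore|revert)!?: .+' \
    && echo "ok" || echo "invalid"
}

check_pr_title 'feat!: drop legacy schema'   # prints "ok"
check_pr_title 'Fixed a bug'                 # prints "invalid"
```

The same regex could back a CI step that rejects non-conforming titles.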

CONTRIBUTING.md

Lines changed: 14 additions & 0 deletions
```markdown
@@ -0,0 +1,14 @@
# Contributing

This project welcomes contributions and suggestions. Most contributions require you to
agree to a Contributor License Agreement (CLA) declaring that you have the right to,
and actually do, grant us the rights to use your contribution. For details, visit
https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need
to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the
instructions provided by the bot. You will only need to do this once across all repositories using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
```

TRANSPARENCY_FAQ.md

Lines changed: 8 additions & 4 deletions
```diff
@@ -2,15 +2,15 @@
 ### What is the Content Processing Solution Accelerator?

-This solution accelerator is an open-source GitHub Repository to extract data from unstructured documents and transform the data into defined schemas with validation to enhance the speed of downstream data ingestion and improve quality. It enables the ability to efficiently automate extraction, validation, and structuring of information for event driven system-to-system workflows. The solution is built using Azure OpenAI, Azure AI Services, Content Understanding Services, CosmosDB, and Azure Containers.
+This solution accelerator is an open-source GitHub Repository to extract data from unstructured documents and transform the data into defined schemas with validation to enhance the speed of downstream data ingestion and improve quality. It enables the ability to efficiently automate extraction, validation, and structuring of information for event driven system-to-system workflows. The solution is built using Azure OpenAI Service, Azure AI Services, Azure AI Content Understanding Service, Azure Cosmos DB, and Azure Container Apps.

 ### What can the Content Processing Solution Accelerator do?

-The sample solution is tailored for a Data Analyst at a property insurance company, who analyzes large amounts of claim-related data including forms, reports, invoices, and property loss documentation. The sample data is synthetically generated utilizing Azure OpenAI and saved into related templates and files, which are unstructured documents that can be used to show the processing pipeline. Any names and other personally identifiable information in the sample data is fictitious.
+The sample solution is tailored for a Data Analyst at a property insurance company, who analyzes large amounts of claim-related data including forms, reports, invoices, and property loss documentation. The sample data is synthetically generated utilizing Azure OpenAI Service and saved into related templates and files, which are unstructured documents that can be used to show the processing pipeline. Any names and other personally identifiable information in the sample data is fictitious.

-The sample solution processes the uploaded documents by exposing an API endpoint that utilizes Azure OpenAI and Content Understanding Service for extraction. The extracted data is then transformed into a specific schema output based on the content type (ex: invoice), and validates the extraction and schema mapping through accuracy scoring. The scoring enables thresholds to dictate a human-in-the-loop review of the output if needed, allowing a user to review, update, and add comments.
+The sample solution processes the uploaded documents by exposing an API endpoint that utilizes Azure OpenAI Service and Azure AI Content Understanding Service for extraction. The extracted data is then transformed into a specific schema output based on the content type (ex: invoice), and validates the extraction and schema mapping through accuracy scoring. The scoring enables thresholds to dictate a human-in-the-loop review of the output if needed, allowing a user to review, update, and add comments.

 ### What is/are the Content Processing Solution Accelerator’s intended use(s)?

@@ -23,7 +23,11 @@
 ### What are the limitations of the Content Processing Solution Accelerator? How can users minimize the Content Processing Solution Accelerator’s limitations when using the system?

-This solution accelerator can only be used as a sample to accelerate the creation of content processing solutions. The repository showcases a sample scenario of a Data Analyst at a property insurance company, analyzing large amounts of claim-related data, but a human must still be responsible to validate the accuracy and correctness of data extracted for their documents, schema definitions related to business specific documents to be extracted, quality and validation scoring logic and thresholds for human-in-the-loop review, ingesting transformed data into subsequent systems, and their relevancy for using with customers. Users of the accelerator should review the system prompts provided and update as per their organizational guidance. AI generated content in the solution may be inaccurate and should be manually reviewed by the user. Currently, the sample repository is available in English only and is only tested to support PDF, PNG, and JPEG files.
+This solution accelerator can only be used as a sample to accelerate the creation of content processing solutions. The repository showcases a sample scenario of a Data Analyst at a property insurance company, analyzing large amounts of claim-related data, but a human must still be responsible to validate the accuracy and correctness of data extracted for their documents, schema definitions related to business specific documents to be extracted, quality and validation scoring logic and thresholds for human-in-the-loop review, ingesting transformed data into subsequent systems, and their relevancy for using with customers. Users of the accelerator should review the system prompts provided and update as per their organizational guidance.
+
+AI generated content in the solution may be inaccurate and the outputs and integrated solutions derived from the output data are not robustly trustworthy and should be manually reviewed by the user. You can find more information on AI generated content accuracy at https://aka.ms/overreliance-framework.
+
+Currently, the sample repository is available in English only and is only tested to support PDF, PNG, and JPEG files up to 20MB in size.

 ### What operational factors and settings allow for effective and responsible use of the Content Processing Solution Accelerator?
```
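The tested-format limits stated in the FAQ (PDF, PNG, and JPEG files up to 20MB) can be screened before submitting documents. A minimal POSIX-shell sketch; `is_supported` is a hypothetical helper, not part of the repository:

```shell
# Hypothetical pre-check (not part of the repo): reports whether a file matches
# the formats and size limit the sample is tested with (PDF/PNG/JPEG, <= 20MB).
# Uppercase extensions are deliberately not handled in this sketch.
is_supported() {
  case "$1" in
    *.pdf|*.png|*.jpg|*.jpeg) ;;      # tested formats fall through to the size check
    *) echo "no"; return ;;           # anything else is untested
  esac
  size=$(wc -c < "$1" | tr -d ' ')    # file size in bytes
  [ "$size" -le $((20 * 1024 * 1024)) ] && echo "yes" || echo "no"
}
```

Files that print "no" would need conversion or splitting before they match what the sample has been tested against.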

docs/CustomizingAzdParameters.md

Lines changed: 1 addition & 1 deletion
````diff
@@ -11,7 +11,7 @@ Set the Environment Name Prefix
 azd env set AZURE_ENV_NAME 'cps'
 ```

-Change the Content Understanding Service Location (example: eastus2, westus2, etc.)
+Change the Azure Content Understanding Service Location (example: eastus2, westus2, etc.)
 ```shell
 azd env set AZURE_ENV_CU_LOCATION 'West US'
 ```
````

docs/DeploymentGuide.md

Lines changed: 1 addition & 10 deletions
````diff
@@ -8,8 +8,7 @@ Check the [Azure Products by Region](https://azure.microsoft.com/en-us/explore/g
 - [Azure AI Foundry](https://learn.microsoft.com/en-us/azure/ai-foundry/)
 - [Azure OpenAI Service](https://learn.microsoft.com/en-us/azure/ai-services/openai/)
-- [Azure AI Document Intelligence](https://learn.microsoft.com/en-us/azure/ai-services/document-intelligence/)
-- [Azure AI Content Understanding](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/)
+- [Azure AI Content Understanding Service](https://learn.microsoft.com/en-us/azure/ai-services/content-understanding/)
 - [Azure Blob Storage](https://learn.microsoft.com/en-us/azure/storage/blobs/)
 - [Azure Container Apps](https://learn.microsoft.com/en-us/azure/container-apps/)
 - [Azure Container Registry](https://learn.microsoft.com/en-us/azure/container-registry/)
@@ -225,29 +224,21 @@ This will rebuild the source code, package it into a container, and push it to t
 Bash

 ```bash
-
 ./upload_files.sh https://<< API Service Endpoint >>/contentprocessor/submit ./invoices <<Invoice Schema Id>>
-
 ```

 ```bash
-
 ./upload_files.sh https://<< API Service Endpoint >>/contentprocessor/submit ./propertyclaims <<Property Loss Damage Claim Form Schema Id>>
-
 ```

 Windows

 ```powershell
-
 ./upload_files.ps1 https://<< API Service Endpoint >>/contentprocessor/submit .\invoices <<Invoice Schema Id>>
-
 ```

 ```powershell
-
 ./upload_files.ps1 https://<< API Service Endpoint >>/contentprocessor/submit .\propertyclaims <<Property Loss Damage Claim Form Schema Id>>
-
 ```

 3. **Add Authentication Provider**
````
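The per-folder upload commands above can be wrapped in a small loop. A minimal sketch, assuming `upload_files.sh` is in the current directory; the schema ids below are hypothetical placeholders, and `<< API Service Endpoint >>` must be replaced as in the commands above:

```shell
# Sketch: loops the upload_files.sh invocations shown above. The endpoint
# placeholder and the schema ids are NOT real values - substitute the ones
# from your own deployment.
API_URL='https://<< API Service Endpoint >>/contentprocessor/submit'

submit_dir() {
  # $1 = local folder of documents, $2 = schema id for that content type.
  # Echoes the command as a dry run; remove the leading echo to execute.
  echo "./upload_files.sh $API_URL $1 $2"
}

submit_dir ./invoices 'invoice-schema-id'
submit_dir ./propertyclaims 'property-claim-schema-id'
```

Keeping the folder-to-schema mapping in one place makes it easy to add further content types later.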

docs/Images/ReadMe/approach.png

Binary image changed: 37.8 KB (replaces the previous 51.4 KB version)
