From 6d25f4df142c0c3757653de6bc4bd6b2b6c48093 Mon Sep 17 00:00:00 2001
From: Timna Brown <24630902+brown9804@users.noreply.github.com>
Date: Thu, 27 Mar 2025 12:43:50 -0600
Subject: [PATCH 1/3] AI Vision - Refresher and Whats New
---
.../3_AzureAI/WhatsNew/1_AIvisionMarch2025.md | 181 ++++++++++++++++++
1 file changed, 181 insertions(+)
create mode 100644 0_Azure/3_AzureAI/WhatsNew/1_AIvisionMarch2025.md
diff --git a/0_Azure/3_AzureAI/WhatsNew/1_AIvisionMarch2025.md b/0_Azure/3_AzureAI/WhatsNew/1_AIvisionMarch2025.md
new file mode 100644
index 000000000..7854d1c4d
--- /dev/null
+++ b/0_Azure/3_AzureAI/WhatsNew/1_AIvisionMarch2025.md
@@ -0,0 +1,181 @@
+# What's new in Azure AI Vision? - March 2025 Overview
+
+Costa Rica
+
+[brown9804](https://github.com/brown9804)
+
+Last updated: 2024-12-24
+
+----------
+
+List of References
+
+- [What is Azure AI Vision?](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview)
+- [What is Image Analysis?](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0)
+- [What is the Azure AI Face service?](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-identity)
+- [What is Video Analysis?](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/intro-to-spatial-analysis-public-preview?tabs=sa)
+
+Optical Character Recognition (OCR)
+
+> Extracts text from images. You can use the Read API to extract printed and handwritten text from photos and documents, including text on varied surfaces and backgrounds such as business documents, invoices, receipts, posters, business cards, letters, and whiteboards. The OCR APIs support extracting printed text in several languages.
+
+- **API Endpoint**: The request to the endpoint should include the image data either as a URL or as binary data in the request body. Endpoint format: `/vision/v3.2/read/analyze`
+- **Input Formats**: JPEG, PNG, BMP, PDF (for multi-page documents)
+- **Output**: JSON with detected text and bounding boxes
+- **Languages Supported**: Multiple languages including English, Spanish, French, German, Chinese, Japanese, and more.
+- **Response Structure**: The JSON response includes an array of `readResults`, each containing:
+ - **Page**: The page number (for multi-page documents).
+ - **Language**: The detected language of the text.
+ - **Angle**: The rotation angle of the text.
+ - **Width and Height**: Dimensions of the image.
+ - **Lines**: An array of detected lines of text, each with:
+ - **BoundingBox**: Coordinates of the text line.
+ - **Text**: The recognized text.
+ - **Words**: An array of words within the line, each with its own bounding box and text.
+- **Error Handling**: The API provides detailed error messages for issues such as unsupported file formats, corrupted images, or exceeding size limits.
+- **Performance**: The OCR service is optimized for speed and accuracy, capable of processing large volumes of images quickly.
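
The request and response shapes listed above can be sketched in Python using only the standard library. This is a minimal sketch, not the official client: the `Ocp-Apim-Subscription-Key` header, the `{"url": ...}` body, and the lowercase JSON keys (`page`, `lines`, `text`) are assumptions about the REST contract, and `build_read_request` and `collect_lines` are hypothetical helper names. The request is only built, never sent.

```python
import json
from urllib.request import Request

READ_PATH = "/vision/v3.2/read/analyze"  # path quoted in the list above


def build_read_request(endpoint: str, key: str, image_url: str) -> Request:
    """Build (but do not send) a POST that submits an image by URL."""
    body = json.dumps({"url": image_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # assumed auth header
        "Content-Type": "application/json",
    }
    return Request(endpoint.rstrip("/") + READ_PATH, data=body,
                   headers=headers, method="POST")


def collect_lines(read_results: list) -> list:
    """Flatten a `readResults` array into (page, text) pairs."""
    return [
        (page.get("page"), line.get("text", ""))
        for page in read_results
        for line in page.get("lines", [])
    ]


# Hand-written sample that mirrors the structure described above.
sample = [
    {
        "page": 1,
        "angle": 0.0,
        "width": 800,
        "height": 600,
        "lines": [
            {"boundingBox": [10, 10, 90, 10, 90, 30, 10, 30], "text": "Hello",
             "words": [{"boundingBox": [10, 10, 90, 10, 90, 30, 10, 30], "text": "Hello"}]},
            {"boundingBox": [10, 40, 90, 40, 90, 60, 10, 60], "text": "World",
             "words": [{"boundingBox": [10, 40, 90, 40, 90, 60, 10, 60], "text": "World"}]},
        ],
    }
]

print(collect_lines(sample))
```

Keeping request construction separate from response parsing lets the parsing logic be checked against saved JSON without calling the service.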
+
+> Follow the [OCR quickstart](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/quickstarts-sdk/client-library?tabs=windows%2Cvisual-studio&pivots=programming-language-python) to get started.
+
+
+
+From [OCR official documentation](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-ocr)
+
+
+Image Analysis
+
+> Extracts many visual features from images, such as objects, faces, adult content detection, and auto-generated tags for identification.
+
+- **API Request**: The request to the endpoint should include the image data either as a URL or as binary data in the request body. Endpoint format: `/vision/v3.2/analyze`
+- **Response Structure**: The JSON response includes the fields `categories`, `tags`, `objects`, `faces`, `color`, and `imageType`:
+ - **Categories**: Classifies images into predefined categories. An array of categories with `name` and `score` indicating the confidence level.
+ - **Tags**: Identifies objects, living beings, scenery, and actions. An array of tags with `name` and `confidence` score.
+ - **Objects**: Provides coordinates for objects within an image. An array of detected objects with `rectangle` coordinates and `object` name.
+ - **Faces**: Detects faces and provides attributes like age and gender. An array of detected faces with `age`, `gender`, and `faceRectangle`.
+ - **Color**: Reports the dominant colors, the accent color, and whether the image is black and white.
+ - **ImageType**: Indicates whether the image is a photograph, clipart, or a line drawing.
+- **Text Extraction (OCR)**: Extracts printed and handwritten text.
+- **Output**: JSON with detected features and their details.
+- **Error Handling**: The API provides detailed error messages for issues such as unsupported file formats, corrupted images, or exceeding size limits.
+- **Performance**: The Image Analysis service is optimized for speed and accuracy, capable of processing large volumes of images quickly.
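
As a rough illustration of consuming the response fields listed above, the sketch below keeps only high-confidence tags and extracts object bounding boxes. The sample payload is hand-written, the `x`, `y`, `w`, `h` keys inside `rectangle` are an assumption about the coordinate layout, and `confident_tags` and `object_boxes` are hypothetical helper names.

```python
def confident_tags(analysis: dict, threshold: float = 0.8) -> list:
    """Return tag names with confidence >= threshold, most confident first."""
    kept = [t for t in analysis.get("tags", [])
            if t.get("confidence", 0.0) >= threshold]
    kept.sort(key=lambda t: t["confidence"], reverse=True)
    return [t["name"] for t in kept]


def object_boxes(analysis: dict) -> list:
    """Return (object name, (x, y, w, h)) for each detected object."""
    boxes = []
    for obj in analysis.get("objects", []):
        rect = obj.get("rectangle", {})  # assumed keys: x, y, w, h
        boxes.append((obj.get("object"),
                      (rect.get("x"), rect.get("y"), rect.get("w"), rect.get("h"))))
    return boxes


# Hand-written sample that mirrors the fields described above.
sample_analysis = {
    "tags": [
        {"name": "tree", "confidence": 0.85},
        {"name": "outdoor", "confidence": 0.99},
        {"name": "car", "confidence": 0.42},
    ],
    "objects": [
        {"rectangle": {"x": 10, "y": 20, "w": 100, "h": 50}, "object": "dog"},
    ],
}

print(confident_tags(sample_analysis))
print(object_boxes(sample_analysis))
```

The threshold of 0.8 is arbitrary; a suitable value depends on how costly false positives are for the application.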
+
+> Follow the [Image Analysis quickstart](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/quickstarts-sdk/image-analysis-client-library-40?tabs=visual-studio%2Cwindows&pivots=programming-language-python) to get started.
+
+
+From [official documentation](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0)
+
+
+Face
+
+> Provides AI algorithms that detect, recognize, and analyze human faces in images. Facial recognition is useful in scenarios such as identifying people in photos or videos, or finding a particular individual in a large group of people.
+
+- **API Endpoint**: `/face/v1.0/detect`
+- **API Request**: The request to the endpoint should include the image data either as a URL or as binary data in the request body. The request can also specify which features to analyze using query parameters.
+ - **Parameters**:
+ - `returnFaceId`: Boolean to specify if face IDs should be returned.
+ - `returnFaceLandmarks`: Boolean to specify if face landmarks should be returned.
+ - `returnFaceAttributes`: Comma-separated string to specify which face attributes to return (e.g., `age,gender,emotion`).
+- **Features**:
+ - **Face Detection**: Identifies human faces and provides bounding boxes.
+ - **Face Recognition**: Matches detected faces against a database of known faces.
+ - **Face Verification**: Compares two faces to determine if they belong to the same person.
+ - **Face Grouping**: Groups similar faces together.
+ - **Face Attributes**: Provides detailed attributes like age, gender, emotion, head pose, facial hair, and accessories.
+- **Output**: JSON with detected faces and their attributes.
+- **Response Structure**: The JSON response includes an array of detected faces, each containing:
+ - **FaceId**: Unique identifier for the detected face.
+ - **FaceRectangle**: Coordinates of the bounding box around the face.
+ - **FaceLandmarks**: (Optional) Detailed coordinates of facial landmarks.
+ - **FaceAttributes**: (Optional) Detailed attributes of the face, including:
+ - `age`: Estimated age of the person.
+ - `gender`: Gender of the person.
+ - `emotion`: Detected emotions with confidence scores.
+ - `headPose`: Orientation of the head.
+ - `facialHair`: Presence of facial hair.
+ - `glasses`: Type of glasses worn.
+ - `accessories`: Detected accessories like headwear.
+ - `blur`: Blur level of the image.
+ - `exposure`: Exposure level of the image.
+ - `noise`: Noise level of the image.
+- **Error Handling**: The API provides detailed error messages for issues such as unsupported file formats, corrupted images, or exceeding size limits. Common error codes include:
+ - `InvalidImageFormat`: The provided image format is not supported.
+ - `InvalidImageSize`: The provided image exceeds the size limit.
+ - `FaceNotFound`: No faces were detected in the image.
+- **Performance**: The Face API is optimized for speed and accuracy, capable of processing large volumes of images quickly. It supports high concurrency and provides low-latency responses.
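
The query parameters listed above can be assembled into a request URL as follows. This is a minimal sketch using only the standard library: `build_detect_url` and `face_boxes` are hypothetical helper names, the `top`, `left`, `width`, `height` keys inside `faceRectangle` are an assumption about the response layout, and the example host is a placeholder.

```python
from urllib.parse import urlencode

DETECT_PATH = "/face/v1.0/detect"  # path quoted in the list above


def build_detect_url(endpoint: str, return_face_id: bool = False,
                     return_landmarks: bool = False, attributes=()) -> str:
    """Compose the detect URL from the query parameters described above."""
    query = {
        "returnFaceId": str(return_face_id).lower(),
        "returnFaceLandmarks": str(return_landmarks).lower(),
    }
    if attributes:
        # Attributes are sent as one comma-separated string.
        query["returnFaceAttributes"] = ",".join(attributes)
    return endpoint.rstrip("/") + DETECT_PATH + "?" + urlencode(query)


def face_boxes(faces: list) -> list:
    """Return (left, top, width, height) for each detected face."""
    return [
        (r["left"], r["top"], r["width"], r["height"])
        for r in (face["faceRectangle"] for face in faces)
    ]


url = build_detect_url("https://example.cognitiveservices.azure.com",
                       attributes=("headPose", "glasses"))
print(url)
```

Which attributes a given subscription may request is subject to service policy, so treat the attribute names passed here as illustrative.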
+
+> Follow the [Face quickstart](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/quickstarts-sdk/identity-client-library?tabs=windows%2Cvisual-studio&pivots=programming-language-csharp) to get started.
+
+
+
+From [official documentation](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-identity)
+
+
+Video Analysis
+
+> Includes video-related features like [Spatial Analysis](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/spatial-analysis-container?tabs=azure-stack-edge) and [Video Retrieval](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/how-to/video-retrieval). Spatial Analysis processes real-time camera footage against user-defined areas of interest or rules, and emits events when those rules are met.
+
+- **API Endpoint**: `/video/v1.0/analyze`
+- **API Request**: The request to the endpoint should include the video data either as a URL or as binary data in the request body. The request can also specify which features to analyze using query parameters.
+ - **Parameters**:
+ - `spatialAnalysis`: Boolean to specify if spatial analysis should be performed.
+ - `peopleCounting`: Boolean to specify if people counting should be performed.
+ - `movementTracking`: Boolean to specify if people movement tracking should be performed.
+ - `zoneMonitoring`: Boolean to specify if zone monitoring should be performed.
+ - `lineCrossingDetection`: Boolean to specify if line crossing detection should be performed.
+ - `customEvents`: Boolean to specify if custom events should be defined and detected.
+- **Features**:
+ - **Spatial Analysis**: Monitors and analyzes movements and interactions within physical spaces.
+ - **People Counting**: Counts the number of people in a designated area.
+ - **People Movement Tracking**: Tracks the movement of individuals within a space.
+ - **Zone Monitoring**: Detects when people enter or exit specific zones.
+ - **Line Crossing Detection**: Detects when people cross predefined lines.
+ - **Custom Events**: Defines custom events based on spatial criteria.
+- **Output**: JSON with detected events and their details.
+- **Response Structure**: The JSON response includes an array of detected events, each containing:
+ - **EventId**: Unique identifier for the detected event.
+ - **EventType**: Type of the detected event (e.g., `PeopleCounting`, `ZoneMonitoring`).
+ - **Timestamp**: Time at which the event was detected.
+ - **Details**: Detailed information about the event, including:
+ - `count`: Number of people detected (for people counting).
+ - `coordinates`: Coordinates of detected movements or zone entries/exits.
+ - `lineCrossed`: Information about the line crossed (for line crossing detection).
+ - `customEventDetails`: Details about custom events defined by the user.
+- **Error Handling**: The API provides detailed error messages for issues such as unsupported file formats, corrupted videos, or exceeding size limits. Common error codes include:
+ - `InvalidVideoFormat`: The provided video format is not supported.
+ - `InvalidVideoSize`: The provided video exceeds the size limit.
+ - `EventNotDetected`: No events were detected in the video.
+- **Performance**: The Video Analysis service is optimized for speed and accuracy, capable of processing large volumes of video data quickly. It supports high concurrency and provides low-latency responses.
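
The event list described above could be post-processed along these lines. This is a sketch over hand-written sample data only: the key spellings (`eventId`, `eventType`, `timestamp`, `details`, `count`) are guesses derived from the field names in the list, not a confirmed schema, and `group_by_type` and `peak_people_count` are hypothetical helper names.

```python
from collections import defaultdict


def group_by_type(events: list) -> dict:
    """Group events by their type, preserving arrival order within each group."""
    grouped = defaultdict(list)
    for event in events:
        grouped[event.get("eventType", "unknown")].append(event)
    return dict(grouped)


def peak_people_count(events: list):
    """Return the largest `count` among people-counting events, or None."""
    counts = [
        event["details"]["count"]
        for event in events
        if event.get("eventType") == "PeopleCounting"
        and "count" in event.get("details", {})
    ]
    return max(counts) if counts else None


# Hand-written sample; key names are assumptions, see the note above.
sample_events = [
    {"eventId": "e1", "eventType": "PeopleCounting",
     "timestamp": "2025-03-27T12:00:00Z", "details": {"count": 3}},
    {"eventId": "e2", "eventType": "ZoneMonitoring",
     "timestamp": "2025-03-27T12:00:05Z", "details": {"coordinates": [[0.1, 0.2]]}},
    {"eventId": "e3", "eventType": "PeopleCounting",
     "timestamp": "2025-03-27T12:00:10Z", "details": {"count": 5}},
]

print(sorted(group_by_type(sample_events)))
print(peak_people_count(sample_events))
```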
+
+