Commit 9897d2b

authored
overview
1 parent 21c2a8f commit 9897d2b

1 file changed

Lines changed: 153 additions & 1 deletion

0_Azure/2_AzureAnalytics/0_Fabric/demos/28_MirroringERPperfAnalysis.md

@@ -11,6 +11,10 @@ Last updated: 2025-02-28

> Quick technical overview of Mirroring an ERP System Database into Microsoft Fabric

> [!NOTE]
> `Shortcuts` in Microsoft Fabric are objects that point to other storage locations, either within OneLake or in external sources such as Azure Data Lake, Amazon S3, and Dataverse. They provide `real-time access` to data without physically copying it, which reduces storage overhead and keeps the latest data available. <br/>
> `Mirroring` in Microsoft Fabric creates a synchronized copy of an external database within OneLake. The data is `physically stored` in Fabric, which enables faster query performance and keeps the copy consistent with the source through continuous updates.

<details>
<summary><b>List of References</b> (Click to expand)</summary>

@@ -21,7 +25,10 @@ Last updated: 2025-02-28
- [Permission model](https://learn.microsoft.com/en-us/fabric/security/permission-model)
- [Secure data access in Microsoft Fabric](https://learn.microsoft.com/en-us/training/modules/secure-data-access-in-fabric/)
- [Authorization in SQL database in Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/authorization)
- [What is the SQL analytics endpoint for a lakehouse? - Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-sql-analytics-endpoint)
- [Query the SQL analytics endpoint of your SQL database in Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/query-sql-analytics-endpoint)
- [SQL database Overview (Preview) - Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/overview)

</details>

@@ -38,6 +45,7 @@ Last updated: 2025-02-28
| **Continuous Replication** | Microsoft Fabric continuously replicates data from your ERP system database into Fabric's OneLake. |
| **Low Latency** | Replication is designed for low latency, so data is available for analytics in near real time. |
| **Read-Only Access** | The mirrored data is exposed in a read-only format. Because analytics queries run against this copy rather than the source, they do not degrade the performance of the operational ERP database. |
| **Use Cases** | - **Analytics and Reporting:** Perform analytics and generate reports without impacting the operational ERP database. <br> - **Data Engineering:** Use the mirrored data for data engineering tasks, such as data transformation and integration. <br> - **Data Science:** Use the mirrored data for data science projects, including machine learning and AI. |

### Continuous Replication

@@ -106,6 +114,150 @@ graph TD
    E -->|Prevents modifications| J[Enhanced Security]
```

### Use Cases

```mermaid
mindmap
  root((Use Cases))
    Analytics and Reporting
      Benefits
        Uninterrupted Operations
        Timely Insights
        Scalability
      Examples
        Financial Reporting
        Sales Analysis
        Operational Metrics
    Data Engineering
      Benefits
        Data Transformation
        Data Integration
        Data Quality
      Examples
        ETL Processes
        Data Enrichment
        Data Aggregation
    Data Science
      Benefits
        Predictive Analytics
        Machine Learning
        AI Applications
      Examples
        Customer Segmentation
        Demand Forecasting
        Anomaly Detection
```
#### Analytics and Reporting

> To enable business users and analysts to analyze data and generate reports without affecting the performance of the operational ERP database.

> Benefits:
- **Uninterrupted Operations:** Analytics queries run against the read-only mirrored copy rather than the ERP database, so they do not interfere with the day-to-day operations of the ERP system.
- **Timely Insights:** Near real-time replication keeps the data used for analytics up to date, which supports timely and accurate decision-making.
- **Scalability:** Fabric can handle large volumes of mirrored data, making it suitable for enterprises with extensive reporting needs.

> Examples:

| **Example** | **Description** |
|-------------|-----------------|
| **Financial Reporting** | Generate financial statements, balance sheets, and profit & loss reports. |
| **Sales Analysis** | Analyze sales trends, customer behavior, and product performance. |
| **Operational Metrics** | Monitor key performance indicators (KPIs) such as inventory levels, order fulfillment rates, and production efficiency. |
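
As a rough illustration of the sales analysis and KPI examples above, the following Python sketch computes an order fulfillment rate and sales totals per product. The rows, field names, and function names are hypothetical; they stand in for a result set read from mirrored ERP order tables and are not part of any Fabric API.

```python
from collections import defaultdict

# Hypothetical rows, standing in for a query result over mirrored ERP order data.
ORDERS = [
    {"order_id": 1, "product": "Widget", "amount": 120.0, "fulfilled": True},
    {"order_id": 2, "product": "Widget", "amount": 80.0, "fulfilled": False},
    {"order_id": 3, "product": "Gadget", "amount": 200.0, "fulfilled": True},
    {"order_id": 4, "product": "Gadget", "amount": 50.0, "fulfilled": True},
]


def fulfillment_rate(orders):
    """Return the share of orders marked fulfilled (0.0 for an empty list)."""
    if not orders:
        return 0.0
    fulfilled = sum(1 for order in orders if order["fulfilled"])
    return fulfilled / len(orders)


def sales_by_product(orders):
    """Sum order amounts per product."""
    totals = defaultdict(float)
    for order in orders:
        totals[order["product"]] += order["amount"]
    return dict(totals)


if __name__ == "__main__":
    print(f"Fulfillment rate: {fulfillment_rate(ORDERS):.0%}")
    print(f"Sales by product: {sales_by_product(ORDERS)}")
```

In practice these aggregations would more likely be expressed in T-SQL or a Power BI measure over the mirrored tables; the Python version only makes the calculation explicit.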

#### Data Engineering

> To facilitate data transformation, integration, and preparation tasks that are essential for building data pipelines and workflows.

> Benefits:
- **Data Transformation:** Convert raw data into a structured format suitable for analysis and reporting.
- **Data Integration:** Combine data from multiple sources, including the ERP system, to create a unified data repository.
- **Data Quality:** Implement data cleansing and validation processes to ensure the accuracy and reliability of the data.

> Examples:

| **Example** | **Description** |
|-------------|-----------------|
| **ETL Processes** | Extract, transform, and load data from the ERP system into a data warehouse or data lake. |
| **Data Enrichment** | Enhance the ERP data with additional information from external sources, such as market data or customer feedback. |
| **Data Aggregation** | Summarize and aggregate data to create high-level views and dashboards. |
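
The data quality and aggregation steps above can be sketched in a few lines of Python. This is a minimal example under stated assumptions: the raw rows, the `region` and `quantity` fields, and the validation rules (non-empty region, non-negative integer quantity) are all hypothetical and chosen only to show the pattern of cleansing before aggregating.

```python
# Hypothetical raw rows, standing in for records extracted from mirrored ERP tables.
RAW_ROWS = [
    {"region": " north ", "quantity": "10"},
    {"region": "North", "quantity": "5"},
    {"region": "south", "quantity": ""},    # missing quantity: rejected
    {"region": "South", "quantity": "-3"},  # negative quantity: rejected
    {"region": "south", "quantity": "7"},
]


def clean(rows):
    """Normalize region names and split rows into (valid, rejected)."""
    valid, rejected = [], []
    for row in rows:
        region = row["region"].strip().title()
        try:
            quantity = int(row["quantity"])
        except ValueError:
            rejected.append(row)
            continue
        if not region or quantity < 0:
            rejected.append(row)
            continue
        valid.append({"region": region, "quantity": quantity})
    return valid, rejected


def aggregate(rows):
    """Sum quantities per region over already-cleaned rows."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["quantity"]
    return totals


if __name__ == "__main__":
    valid_rows, rejected_rows = clean(RAW_ROWS)
    print(aggregate(valid_rows))
    print(f"{len(rejected_rows)} rows rejected")
```

Keeping the rejected rows, rather than silently dropping them, makes it possible to report data quality issues back to the owners of the source system.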

#### Data Science

> To leverage the mirrored data for advanced analytics, machine learning, and artificial intelligence projects.

> Benefits:
- **Predictive Analytics:** Use historical data to build models that predict future trends and outcomes.
- **Machine Learning:** Train machine learning models on the mirrored data to automate decision-making processes and improve business operations.
- **AI Applications:** Develop AI-driven applications that can provide insights, recommendations, and automation.

> Examples:

| **Example** | **Description** |
|-------------|-----------------|
| **Customer Segmentation** | Use clustering algorithms to segment customers based on their behavior and preferences. |
| **Demand Forecasting** | Predict future demand for products and services using time series analysis. |
| **Anomaly Detection** | Identify unusual patterns or outliers in the data that may indicate fraud, errors, or opportunities. |
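
To make the anomaly detection example concrete, here is a small sketch that flags outliers with a z-score test. The choice of technique is an assumption made for illustration (the text above does not prescribe one), and the daily totals and the threshold of 2.0 are hypothetical values.

```python
from statistics import mean, pstdev


def find_anomalies(values, threshold=2.0):
    """Return (index, value) pairs whose z-score exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [
        (index, value)
        for index, value in enumerate(values)
        if abs(value - mu) / sigma > threshold
    ]


# Hypothetical daily order totals, standing in for data read from mirrored tables.
DAILY_ORDER_TOTALS = [100, 102, 98, 101, 99, 103, 97, 500]

if __name__ == "__main__":
    print(find_anomalies(DAILY_ORDER_TOTALS))
```

A z-score test is only a starting point: a single extreme value inflates the standard deviation, so production pipelines often prefer more robust statistics or a trained model.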

## Mirroring Process

```mermaid
mindmap
  root((Mirroring Process))
    Querying Mirrored Data
    Data Storage
```

### Data Storage

```mermaid
graph TD
    subgraph Data Storage
        A[SQL Database in Fabric] -->|Stores Data| B[.mdf Files]
        A -->|Mirrors Data| C[OneLake]
        C -->|Stores Data| D[Delta Parquet Files]
        D -->|Optimized for| E[Analytical Queries]
    end
```

- **Mirrored Data Storage:** The mirrored data is stored as Delta Parquet files in OneLake. Parquet is a columnar file format optimized for analytical queries, with efficient compression and encoding schemes.
- **Delta Format:** The Delta format supports ACID transactions and efficient updates and deletes, which keeps the mirrored data consistent and reliable.
- **Storage Management:** OneLake provides unified storage that integrates with Fabric services such as Spark, Power BI, and data engineering tools, so the mirrored data can be managed and analyzed across those services.
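
The way a Delta table supports atomic updates and deletes over immutable Parquet files can be sketched with a toy model of its transaction log. This is a deliberately simplified illustration, not the actual Delta protocol: the real log stores JSON actions with metadata and statistics, and the file names and `live_files` helper below are hypothetical.

```python
def live_files(log):
    """Replay add/remove actions in commit order and return the current data files."""
    files = set()
    for commit in log:
        # Every action in a commit becomes visible together, which is what
        # makes an update (remove old file + add new file) atomic for readers.
        for action, path in commit:
            if action == "add":
                files.add(path)
            elif action == "remove":
                files.discard(path)
            else:
                raise ValueError(f"unknown action: {action}")
    return files


# Hypothetical log: two inserts, then an update that rewrites part-000.
LOG = [
    [("add", "part-000.parquet")],
    [("add", "part-001.parquet")],
    [("remove", "part-000.parquet"), ("add", "part-002.parquet")],
]

if __name__ == "__main__":
    print(sorted(live_files(LOG)))      # table state after all commits
    print(sorted(live_files(LOG[:2])))  # table state before the update
```

Because Parquet files are never modified in place, a reader that replays the log up to a given commit always sees a consistent set of files, even while new changes are being written.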

### Querying Mirrored Data

```mermaid
mindmap
  root((Querying Mirrored Data))
    SQL Analytics Endpoint
      Queries Data
        Delta Parquet Files in OneLake
      Provides Data
        Reporting and Analytics
    Read-Only Mode
      Protects Operational Data
    Creates Views
      Data Presentation
    Joins Tables
      Cross-Database Queries
    Access Tools
      SSMS
      VS Code
      Fabric Portal
```

- **SQL Analytics Endpoint:** The SQL analytics endpoint provides a SQL-based experience for querying mirrored data stored as Delta Parquet files in OneLake. It lets you run reporting and analytics queries without impacting the primary workload of the operational ERP database.
- **Read-Only Mode:** The endpoint operates in read-only mode, so analytics queries cannot modify the mirrored data. This protects the integrity of the data and keeps analytics load off the operational database.
- **Creating Views:** You can create views in your SQL analytics endpoint to shape how the data is presented. Views can include complex joins, aggregations, and calculations.
- **Joining Tables:** The SQL analytics endpoint supports joining mirrored tables with tables in other warehouses or lakehouses in the same workspace. This enables cross-database queries that integrate data from multiple sources for comprehensive analysis.
- **Access and Querying Tools:** You can access the SQL analytics endpoint through tools such as SQL Server Management Studio, Visual Studio Code, and the Fabric portal. It supports T-SQL, so you can create views, save functions, and apply SQL security.
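
The idea of shaping mirrored data with a view that joins and aggregates tables can be tried locally. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in: it does not connect to Fabric, the SQL analytics endpoint speaks T-SQL rather than SQLite's dialect, and the `customers`, `orders`, and `customer_sales` names are hypothetical.

```python
import sqlite3

# In-memory database standing in for mirrored tables; this is NOT a Fabric connection.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);

    INSERT INTO customers VALUES (1, 'Contoso'), (2, 'Fabrikam');
    INSERT INTO orders VALUES (10, 1, 120.0), (11, 1, 80.0), (12, 2, 50.0);

    -- A view that joins and aggregates, so report authors query one object.
    CREATE VIEW customer_sales AS
    SELECT c.name AS customer,
           COUNT(*) AS order_count,
           SUM(o.amount) AS total_amount
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.name;
    """
)

rows = conn.execute(
    "SELECT customer, order_count, total_amount FROM customer_sales ORDER BY customer"
).fetchall()

if __name__ == "__main__":
    for customer, order_count, total_amount in rows:
        print(customer, order_count, total_amount)
```

Defining the join once in a view keeps report queries short and gives a single place to adjust the logic if the underlying mirrored tables change.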