Commit 9d66ec8

Merge 9897d2b into 3c834a7
2 parents 3c834a7 + 9897d2b commit 9d66ec8

2 files changed

Lines changed: 274 additions & 2 deletions

Lines changed: 269 additions & 0 deletions
@@ -0,0 +1,269 @@
1+
# Mirroring ERP into Fabric: Performance Analysis
2+
3+
Costa Rica
4+
5+
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
6+
[brown9804](https://github.com/brown9804)
7+
8+
Last updated: 2025-02-28
9+
10+
----------
11+
12+
> Quick technical overview of mirroring an ERP system database into Microsoft Fabric
13+
14+
> [!NOTE]
15+
> `Shortcuts` in Microsoft Fabric are objects that point to other storage locations, either within OneLake or in external sources such as Azure Data Lake Storage, Amazon S3, and Dataverse. They provide `real-time access` to data without physically copying it, which reduces storage overhead and means queries always read the latest data. <br/>
16+
> `Mirroring` in Microsoft Fabric creates a synchronized copy of an external database within OneLake. This process `physically stores` the data in Fabric, which enables faster analytical queries and keeps the copy consistent with the source through continuous updates.
17+
18+
<details>
19+
<summary><b>List of References</b> (Click to expand)</summary>
20+
21+
- [What is Mirroring in Fabric?](https://learn.microsoft.com/en-us/fabric/database/mirrored-database/overview)
22+
- [Mirroring Fabric SQL database in Microsoft Fabric (preview)](https://learn.microsoft.com/en-us/fabric/database/sql/mirroring-overview)
23+
- [Open mirroring in Microsoft Fabric (Preview)](https://learn.microsoft.com/en-us/fabric/database/mirrored-database/open-mirroring)
24+
- [Latency in Activator](https://learn.microsoft.com/en-us/fabric/real-time-intelligence/data-activator/activator-latency)
25+
- [Permission model](https://learn.microsoft.com/en-us/fabric/security/permission-model)
26+
- [Secure data access in Microsoft Fabric](https://learn.microsoft.com/en-us/training/modules/secure-data-access-in-fabric/)
27+
- [Authorization in SQL database in Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/authorization)
28+
- [What is the SQL analytics endpoint for a lakehouse? - Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-sql-analytics-endpoint)
29+
- [Query the SQL analytics endpoint of your SQL database in Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/query-sql-analytics-endpoint)
30+
- [SQL database Overview (Preview) - Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/overview)
31+
32+
</details>
33+
34+
35+
<details>
36+
<summary><b>Table of Contents</b> (Click to expand)</summary>
37+
38+
39+
</details>
40+
41+
## Overview
42+
43+
| **Purpose and Benefits** | **Description** |
44+
|--------------------------|-----------------|
45+
| **Continuous Replication** | Microsoft Fabric continuously replicates data from your ERP system database into Fabric's OneLake. |
46+
| **Low Latency** | The replication process is designed to be low-latency, ensuring near real-time data availability for analytics. |
47+
| **Read-Only Access** | The mirrored data is read-only, and analytics queries run against the copy in OneLake rather than the source, which protects the operational ERP database from performance degradation. |
48+
| **Use Cases** | - **Analytics and Reporting:** Perform analytics and generate reports without impacting the operational ERP database. <br> - **Data Engineering:** Use the mirrored data for data engineering tasks, such as data transformation and integration. <br> - **Data Science:** Leverage the mirrored data for data science projects, including machine learning and AI. |
49+
50+
### Continuous Replication
51+
52+
- **Replication Mechanism:** Microsoft Fabric uses a continuous data replication mechanism to mirror data from your ERP system database into Fabric's OneLake. This process involves capturing changes in the source database and applying them to the mirrored database in near real-time.
53+
- **Change Data Capture (CDC):** The replication process relies on change data capture (CDC) to track changes (inserts, updates, deletes) in the source ERP database; the exact capture mechanism depends on the source database type. These changes are then propagated to the mirrored database in Fabric.
54+
- **Data Consistency:** Replication keeps the mirrored database in Fabric consistent with the source ERP database. Changes made in the source are reflected in the mirror in near real time, after a short replication delay.
55+
- **Scalability:** The replication process is designed to handle large volumes of data, making it suitable for enterprise-scale ERP systems.
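
The capture-and-apply loop described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming a simple keyed table: the `ChangeEvent` fields and the `apply_changes` helper are hypothetical names, not part of any Fabric API, and the real replication engine is a managed service whose internals are not shown here.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change. Field names are illustrative, not a Fabric schema."""
    sequence: int                          # position in the change stream
    operation: str                         # "insert", "update", or "delete"
    key: int                               # primary key of the affected row
    row: Optional[Dict[str, Any]] = None   # new row image (None for deletes)


def apply_changes(mirror: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> int:
    """Apply events in sequence order; return the last applied sequence number."""
    last = 0
    for event in sorted(events, key=lambda e: e.sequence):
        if event.operation in ("insert", "update"):
            mirror[event.key] = dict(event.row or {})
        elif event.operation == "delete":
            mirror.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.operation}")
        last = event.sequence
    return last


# Events may arrive out of order; sorting by sequence restores source order.
mirror: Dict[int, Dict[str, Any]] = {}
events = [
    ChangeEvent(2, "update", 1, {"item": "bolt", "qty": 7}),
    ChangeEvent(1, "insert", 1, {"item": "bolt", "qty": 5}),
    ChangeEvent(3, "insert", 2, {"item": "nut", "qty": 9}),
    ChangeEvent(4, "delete", 2),
]
last_applied = apply_changes(mirror, events)
print(mirror, last_applied)
```

Applying changes strictly in source order is what keeps the mirror consistent: the update to row 1 must land after its insert, and the delete of row 2 after its insert.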
56+
57+
```mermaid
58+
graph LR
59+
A[ERP System Database] -->|Captures Changes| B[Change Data Capture]
60+
B -->|Propagates Changes| C[Replication Engine]
61+
C -->|Applies Changes| D[Fabric OneLake]
62+
D -->|Stores Data| E[Read-Only Mirrored Data]
63+
64+
subgraph Continuous Replication Process
65+
direction LR
66+
A --> B
67+
B --> C
68+
C --> D
69+
D --> E
70+
end
71+
```
72+
73+
### Low Latency
74+
75+
- **Near Real-Time Replication:** The replication process is optimized for low latency, ensuring that data changes in the source ERP database are mirrored in Fabric's OneLake with minimal delay.
76+
- **Efficient Data Transfer:** The replication engine uses efficient data transfer protocols to minimize latency. This includes batching changes and using optimized network paths to transfer data quickly.
77+
- **Latency Metrics:** Microsoft Fabric provides tools to monitor replication latency, allowing administrators to ensure that the replication process meets the required performance standards.
78+
- **Real-Time Analytics:** Low-latency replication enables near real-time analytics on the mirrored data, so businesses can make timely decisions based on current data.
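
To make the latency metric concrete, the sketch below computes replication lag as the gap between when a change was committed at the source and when it was applied to the mirror. The timestamps, the 30-second threshold, and the helper names are assumptions for illustration; they do not represent Fabric's monitoring interface.

```python
from datetime import datetime, timedelta
from statistics import median


def replication_lags(pairs):
    """Return per-change lag in seconds from (committed_at, applied_at) pairs."""
    return [(applied - committed).total_seconds() for committed, applied in pairs]


def summarize(lags, threshold_seconds):
    """Summarize lags and count how many exceed the agreed threshold."""
    return {
        "max": max(lags),
        "median": median(lags),
        "breaches": sum(1 for lag in lags if lag > threshold_seconds),
    }


# Hypothetical samples: source commit time vs. time applied to the mirror.
base = datetime(2025, 2, 28, 12, 0, 0)
samples = [
    (base, base + timedelta(seconds=4)),
    (base + timedelta(seconds=10), base + timedelta(seconds=16)),
    (base + timedelta(seconds=20), base + timedelta(seconds=65)),
]
lags = replication_lags(samples)
summary = summarize(lags, threshold_seconds=30)
print(lags, summary)
```

Tracking the maximum and the number of threshold breaches, not only the median, helps surface occasional slow batches that an average would hide.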
79+
80+
```mermaid
81+
graph LR
82+
A[ERP System Database] -->|Efficient Data Transfer| B[Replication Engine]
83+
B -->|Near Real-Time Replication| C[Fabric OneLake]
84+
C -->|Latency Observed via| D[Latency Metrics]
85+
D -->|Enables| E[Real-Time Analytics]
86+
87+
subgraph Low Latency in MS Fabric
88+
direction LR
89+
A --> B
90+
B --> C
91+
C --> D
92+
D --> E
93+
end
94+
```
95+
96+
### Read-Only Access
97+
98+
- **Read-Only Format:** The mirrored data in Fabric's OneLake is stored in a read-only format, which means that it cannot be modified directly by users or applications.
99+
- **Protection of Source Database:** Because analytics queries run against the read-only mirror rather than the source, the operational ERP database is protected from the performance degradation those queries would otherwise cause.
100+
- **Access Controls:** Microsoft Fabric applies access controls so that only authorized users can reach the mirrored data. These include role-based access control (RBAC) and item-level permissions.
101+
- **Data Security:** The read-only nature of the mirrored data also enhances security by preventing accidental or malicious modifications to the data.
102+
103+
```mermaid
104+
graph TD
105+
A[Read-Only Access] --> B[Read-Only Format]
106+
A --> C[Protection of Source DB]
107+
A --> D[Access Controls]
108+
A --> E[Data Security]
109+
110+
B -->|Stored in read-only format| F[Fabric OneLake]
111+
C -->|Prevents perf degradation| G[Operational ERP DB]
112+
D -->|Ensures authorized access| H[Role-Based Access Control]
113+
D -->|Ensures authorized access| I[Item-Level Permissions]
114+
E -->|Prevents modifications| J[Enhanced Security]
115+
```
116+
117+
### Use Cases
118+
```mermaid
119+
mindmap
120+
root((Use Cases))
121+
Analytics and Reporting
122+
Benefits
123+
Uninterrupted Operations
124+
Timely Insights
125+
Scalability
126+
Examples
127+
Financial Reporting
128+
Sales Analysis
129+
Operational Metrics
130+
Data Engineering
131+
Benefits
132+
Data Transformation
133+
Data Integration
134+
Data Quality
135+
Examples
136+
ETL Processes
137+
Data Enrichment
138+
Data Aggregation
139+
Data Science
140+
Benefits
141+
Predictive Analytics
142+
Machine Learning
143+
AI Applications
144+
Examples
145+
Customer Segmentation
146+
Demand Forecasting
147+
Anomaly Detection
148+
```
149+
#### Analytics and Reporting
150+
151+
> The primary goal is to enable business users and analysts to perform data analysis and generate reports without affecting the performance of the operational ERP database.
152+
153+
> Benefits:
154+
- **Uninterrupted Operations:** Since the mirrored data is read-only, analytics queries do not interfere with the day-to-day operations of the ERP system.
155+
- **Timely Insights:** Near real-time replication ensures that the data used for analytics is up-to-date, allowing for timely and accurate decision-making.
156+
- **Scalability:** The system can handle large volumes of data, making it suitable for enterprises with extensive reporting needs.
157+
158+
> Examples:
159+
160+
| **Example** | **Description** |
161+
|-------------|-----------------|
162+
| **Financial Reporting** | Generate financial statements, balance sheets, and profit & loss reports. |
163+
| **Sales Analysis** | Analyze sales trends, customer behavior, and product performance. |
164+
| **Operational Metrics** | Monitor key performance indicators (KPIs) such as inventory levels, order fulfillment rates, and production efficiency. |
165+
166+
#### Data Engineering
167+
168+
> To facilitate data transformation, integration, and preparation tasks that are essential for building data pipelines and workflows.
169+
170+
> Benefits:
171+
- **Data Transformation:** Convert raw data into a structured format suitable for analysis and reporting.
172+
- **Data Integration:** Combine data from multiple sources, including the ERP system, to create a unified data repository.
173+
- **Data Quality:** Implement data cleansing and validation processes to ensure the accuracy and reliability of the data.
174+
175+
> Examples:
176+
177+
| **Example** | **Description** |
178+
|-------------|-----------------|
179+
| **ETL Processes** | Extract, transform, and load data from the ERP system into a data warehouse or data lake. |
180+
| **Data Enrichment** | Enhance the ERP data with additional information from external sources, such as market data or customer feedback. |
181+
| **Data Aggregation** | Summarize and aggregate data to create high-level views and dashboards. |
182+
183+
184+
#### Data Science
185+
- **Purpose:** To leverage the mirrored data for advanced analytics, machine learning, and artificial intelligence projects.
186+
> Benefits:
187+
- **Predictive Analytics:** Use historical data to build models that predict future trends and outcomes.
188+
- **Machine Learning:** Train machine learning models on the mirrored data to automate decision-making processes and improve business operations.
189+
- **AI Applications:** Develop AI-driven applications that can provide insights, recommendations, and automation.
190+
191+
> Examples:
192+
193+
| **Example** | **Description** |
194+
|-------------|-----------------|
195+
| **Customer Segmentation** | Use clustering algorithms to segment customers based on their behavior and preferences. |
196+
| **Demand Forecasting** | Predict future demand for products and services using time series analysis. |
197+
| **Anomaly Detection** | Identify unusual patterns or outliers in the data that may indicate fraud, errors, or opportunities. |
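
As a small illustration of the anomaly detection example, the sketch below flags outliers with a z-score test. The daily totals are made-up values and the threshold of 2.0 is an arbitrary assumption; a production model would likely be more robust than this.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=2.0):
    """Return (index, value) pairs whose z-score magnitude exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [(i, v) for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical daily order totals read from a mirrored ERP table.
daily_totals = [100, 102, 98, 101, 99, 100, 250, 97]
outliers = zscore_outliers(daily_totals)
print(outliers)
```

Because the mirror is read-only and separate from the source, this kind of exploratory scan can run as often as needed without touching the ERP system.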
198+
199+
## Mirroring Process
200+
201+
```mermaid
202+
mindmap
203+
root((Mirroring Process))
204+
Querying Mirrored Data
205+
Data Storage
206+
```
207+
208+
### Data Storage
209+
210+
```mermaid
211+
graph TD
212+
A[SQL Database in Fabric] -->|Stores Data| B[.mdf Files]
213+
A -->|Mirrors Data| C[OneLake]
214+
C -->|Stores Data| D[Delta Parquet Files]
215+
D -->|Optimized for| E[Analytical Queries]
216+
217+
subgraph Data Storage
218+
A --> B
219+
A --> C
220+
C --> D
221+
D --> E
222+
end
223+
```
224+
225+
- **Mirrored Data Storage:** The mirrored data is stored as delta parquet files in OneLake. Parquet is a columnar storage file format optimized for analytical queries, providing efficient data compression and encoding schemes.
226+
- **Delta Format:** The delta format supports ACID transactions and allows for efficient data updates and deletions. This ensures that the mirrored data remains consistent and reliable.
227+
- **Storage Management:** OneLake provides a unified storage solution that integrates with various Fabric services, such as Spark, Power BI, and data engineering tools. This integration allows for seamless data management and analytics.
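
The way a Delta table supports updates and deletes on top of immutable Parquet files can be sketched by replaying its transaction log: each commit records `add` and `remove` actions, and the current snapshot is the set of files added and not yet removed. The example below is a simplified model under that assumption. The file names are hypothetical, and real log entries carry more fields (sizes, statistics, checkpoints) than shown here.

```python
import json


def live_files(commits):
    """Replay commits in order and return the data files in the current snapshot."""
    files = set()
    for commit in commits:
        # Each commit holds one JSON action per line.
        for line in commit.splitlines():
            action = json.loads(line)
            if "add" in action:
                files.add(action["add"]["path"])
            elif "remove" in action:
                files.discard(action["remove"]["path"])
    return files


# Hypothetical log: the third commit rewrites part-000 as part-002 (an update).
commits = [
    '{"add": {"path": "part-000.parquet"}}',
    '{"add": {"path": "part-001.parquet"}}',
    '{"remove": {"path": "part-000.parquet"}}\n{"add": {"path": "part-002.parquet"}}',
]
snapshot = live_files(commits)
print(sorted(snapshot))
```

Since a commit either appears in the log in full or not at all, readers always see a consistent set of files, which is the basis of the ACID behavior mentioned above.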
228+
229+
230+
### Querying Mirrored Data
231+
232+
```mermaid
233+
mindmap
234+
root((Querying Mirrored Data))
235+
SQL Analytics Endpoint
236+
Queries Data
237+
Delta Parquet Files in OneLake
238+
Provides Data
239+
Reporting and Analytics
240+
Read-Only Mode
241+
Protects Operational Data
242+
Creates Views
243+
Data Presentation
244+
Joins Tables
245+
Cross-Database Queries
246+
Access Tools
247+
SSMS
248+
VS Code
249+
Fabric Portal
250+
```
251+
252+
- **SQL Analytics Endpoint:** The SQL analytics endpoint provides a SQL-based experience for querying mirrored data stored as delta parquet files in OneLake. It allows you to run reporting and analytics queries without impacting the primary workload of the operational ERP database.
253+
- **Read-Only Mode:** The endpoint operates in read-only mode, so analytics queries cannot modify the mirrored data. This preserves the integrity of the mirrored copy and keeps the analytics load off the operational ERP database.
254+
- **Creating Views:** You can create views in your SQL analytics endpoint to shape the data presentation. Views allow you to define how the data should be displayed and can include complex joins, aggregations, and calculations.
255+
- **Joining Tables:** The SQL analytics endpoint supports joining mirrored tables with other tables in different warehouses or lakehouses in the workspace. This enables you to perform cross-database queries and integrate data from multiple sources for comprehensive analysis.
256+
- **Access and Querying Tools:** You can access the SQL analytics endpoint through tools such as SQL Server Management Studio, Visual Studio Code, and the Fabric portal. It supports T-SQL, so you can create views, save functions, and apply SQL security.
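
The cross-database join pattern can be tried locally with Python's built-in `sqlite3` module. SQLite is only a stand-in here, not the SQL analytics endpoint: the attached `crm` database plays the role of a second warehouse or lakehouse, and all table and database names are hypothetical. Against Fabric, the comparable T-SQL would use three-part names such as `[ErpMirror].[dbo].[orders]` (again, an assumed name).

```python
import sqlite3

# "main" stands in for the mirrored ERP database, "crm" for another item
# in the same workspace. Both are in-memory and exist only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 50.0)],
)
conn.executemany(
    "INSERT INTO crm.customers VALUES (?, ?)",
    [(1, "Contoso"), (2, "Fabrikam")],
)

# Join a table in one database with a table in another by qualifying names.
query = """
SELECT c.name, SUM(o.amount) AS total
FROM main.orders AS o
JOIN crm.customers AS c ON c.id = o.customer_id
GROUP BY c.name
ORDER BY c.name
"""
totals = conn.execute(query).fetchall()
conn.close()
print(totals)
```

The same query text, wrapped in `CREATE VIEW`, is the kind of shaping logic the views bullet describes; it is left out here because view rules differ between SQLite and T-SQL.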
257+
258+
259+
260+
261+
262+
263+
264+
<div align="center">
265+
<h3 style="color: #4CAF50;">Total Visitors</h3>
266+
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
267+
</div>
268+
269+

0_Azure/2_AzureAnalytics/0_Fabric/demos/README.md

Lines changed: 5 additions & 2 deletions
@@ -5,7 +5,7 @@ Costa Rica
55
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
66
[brown9804](https://github.com/brown9804)
77

8-
Last updated: 2024-08-06
8+
Last updated: 2025-02-28
99

1010
------------------------------------------
1111

@@ -23,4 +23,7 @@ Before you begin, ensure you have met the following requirements:
2323
> Each demo includes detailed instructions on setup and requirements.
2424
2525

26-
26+
<div align="center">
27+
<h3 style="color: #4CAF50;">Total Visitors</h3>
28+
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
29+
</div>
