|
| 1 | +# Mirroring ERP into Fabric: Performance Analysis |
| 2 | + |
| 3 | +Costa Rica |
| 4 | + |
| 5 | +[](https://github.com/) |
| 6 | +[brown9804](https://github.com/brown9804) |
| 7 | + |
| 8 | +Last updated: 2025-02-28 |
| 9 | + |
| 10 | +---------- |
| 11 | + |
| 12 | +> Quick technical overview of Mirroring an ERP System Database into Microsoft Fabric |
| 13 | +
|
| 14 | +> [!NOTE] |
| 15 | +> `Shortcuts` in Microsoft Fabric are objects that point to other storage locations, either within OneLake or external sources like Azure Data Lake, Amazon S3, and Dataverse. They provide `real-time access` to data without physically copying it, reducing storage overhead and ensuring the latest data is always available. <br/> |
| 16 | +> `Mirroring` in Microsoft Fabric involves creating a synchronized copy of an external database within OneLake. This process `physically stores` the data in Fabric, enabling faster query performance and ensuring data consistency through continuous updates. |
| 17 | +
|
| 18 | +<details> |
| 19 | +<summary><b>List of References</b> (Click to expand)</summary> |
| 20 | + |
| 21 | +- [What is Mirroring in Fabric?](https://learn.microsoft.com/en-us/fabric/database/mirrored-database/overview) |
| 22 | +- [Mirroring Fabric SQL database in Microsoft Fabric (preview)](https://learn.microsoft.com/en-us/fabric/database/sql/mirroring-overview) |
| 23 | +- [Open mirroring in Microsoft Fabric (Preview)](https://learn.microsoft.com/en-us/fabric/database/mirrored-database/open-mirroring) |
| 24 | +- [Latency in Activator](https://learn.microsoft.com/en-us/fabric/real-time-intelligence/data-activator/activator-latency) |
| 25 | +- [Permission model](https://learn.microsoft.com/en-us/fabric/security/permission-model) |
| 26 | +- [Secure data access in Microsoft Fabric](https://learn.microsoft.com/en-us/training/modules/secure-data-access-in-fabric/) |
| 27 | +- [Authorization in SQL database in Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/authorization) |
| 28 | +- [What is the SQL analytics endpoint for a lakehouse? - Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-sql-analytics-endpoint) |
| 29 | +- [Query the SQL analytics endpoint of your SQL database in Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/query-sql-analytics-endpoint) |
| 30 | +- [SQL database Overview (Preview) - Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/database/sql/overview) |
| 31 | + |
| 32 | +</details> |
| 33 | + |
| 34 | + |
| 35 | +<details> |
| 36 | +<summary><b>Table of Contents</b> (Click to expand)</summary> |
| 37 | + |
| 38 | + |
| 39 | +</details> |
| 40 | + |
| 41 | +## Overview |
| 42 | + |
| 43 | +| **Purpose and Benefits** | **Description** | |
| 44 | +|--------------------------|-----------------| |
| 45 | +| **Continuous Replication** | Microsoft Fabric continuously replicates data from your ERP system database into Fabric's OneLake. | |
| 46 | +| **Low Latency** | The replication process is designed to be low-latency, ensuring near real-time data availability for analytics. | |
| 47 | +| **Read-Only Access** | The mirrored data is read-only. Analytics queries run against the mirrored copy in OneLake rather than the operational ERP database, so they do not degrade its performance. | |
| 48 | +| **Use Cases** | - **Analytics and Reporting:** Perform analytics and generate reports without impacting the operational ERP database. <br> - **Data Engineering:** Use the mirrored data for data engineering tasks, such as data transformation and integration. <br> - **Data Science:** Leverage the mirrored data for data science projects, including machine learning and AI. | |
| 49 | + |
| 50 | +### Continuous Replication |
| 51 | + |
| 52 | +- **Replication Mechanism:** Microsoft Fabric uses a continuous data replication mechanism to mirror data from your ERP system database into Fabric's OneLake. This process involves capturing changes in the source database and applying them to the mirrored database in near real-time. |
| 53 | +- **Change Data Capture (CDC):** The replication process leverages Change Data Capture (CDC) technology to track changes (inserts, updates, deletes) in the source ERP database. These changes are then propagated to the mirrored database in Fabric. |
| 54 | +- **Data Consistency:** The replication ensures data consistency between the source ERP database and the mirrored database in Fabric. Any changes made in the source database are reflected in the mirrored database almost immediately. |
| 55 | +- **Scalability:** The replication process is designed to handle large volumes of data, making it suitable for enterprise-scale ERP systems. |
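
The change-propagation step described above can be sketched in a few lines of Python. This is only an illustrative model, not how Fabric implements mirroring: the `ChangeEvent` shape, the `apply_changes` helper, and the sample rows are all hypothetical, and a plain dictionary stands in for the mirrored table.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change from a source table (hypothetical shape)."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new column values; None for deletes


def apply_changes(mirror: Dict[int, Dict[str, Any]],
                  events: Iterable[ChangeEvent]) -> None:
    """Apply change events, in commit order, to an in-memory stand-in for the mirror."""
    for event in events:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key} has no row data")
            mirror[event.key] = dict(event.row)  # upsert the latest row image
        elif event.op == "delete":
            mirror.pop(event.key, None)          # tolerate rows that are already gone
        else:
            raise ValueError(f"unknown operation: {event.op!r}")


# Sample data is invented for illustration.
mirror: Dict[int, Dict[str, Any]] = {}
apply_changes(mirror, [
    ChangeEvent("insert", 1, {"customer": "Contoso", "amount": 120.0}),
    ChangeEvent("insert", 2, {"customer": "Fabrikam", "amount": 75.5}),
    ChangeEvent("update", 1, {"customer": "Contoso", "amount": 150.0}),
    ChangeEvent("delete", 2),
])
print(mirror)
```

Applying events strictly in commit order is what keeps the copy consistent with the source: the update to row 1 overwrites the earlier insert, and the delete removes row 2.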
| 56 | + |
| 57 | +```mermaid |
| 58 | +graph LR |
| 59 | + A[ERP System Database] -->|Captures Changes| B[Change Data Capture] |
| 60 | + B -->|Propagates Changes| C[Replication Engine] |
| 61 | + C -->|Applies Changes| D[Fabric OneLake] |
| 62 | + D -->|Stores Data| E[Read-Only Mirrored Data] |
| 63 | +
|
| 64 | + subgraph Continuous Replication Process |
| 65 | + direction LR |
| 66 | + A --> B |
| 67 | + B --> C |
| 68 | + C --> D |
| 69 | + D --> E |
| 70 | + end |
| 71 | +``` |
| 72 | + |
| 73 | +### Low Latency |
| 74 | + |
| 75 | +- **Near Real-Time Replication:** The replication process is optimized for low latency, ensuring that data changes in the source ERP database are mirrored in Fabric's OneLake with minimal delay. |
| 76 | +- **Efficient Data Transfer:** The replication engine uses efficient data transfer protocols to minimize latency. This includes batching changes and using optimized network paths to transfer data quickly. |
| 77 | +- **Latency Metrics:** Microsoft Fabric provides tools to monitor replication latency, allowing administrators to ensure that the replication process meets the required performance standards. |
| 78 | +- **Near Real-Time Analytics:** The low-latency replication enables near real-time analytics on the mirrored data, allowing businesses to make timely decisions based on current data. |
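
To make the idea of a latency metric concrete, the sketch below computes replication lag from pairs of timestamps. It assumes you can obtain, for a sample of changes, the time the change was committed at the source and the time it became visible in the mirror; how those timestamps are collected is not covered here, and the function names and the 30-second threshold are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def replication_lags(samples):
    """Return the lag in seconds for each (committed_at, applied_at) pair."""
    return [(applied - committed).total_seconds() for committed, applied in samples]


def summarize(lags, threshold_seconds):
    """Summarize lag samples and count how many exceed the chosen threshold."""
    return {
        "avg_seconds": mean(lags),
        "max_seconds": max(lags),
        "breaches": sum(1 for lag in lags if lag > threshold_seconds),
    }


# Invented sample timestamps: (commit time at source, time visible in mirror).
base = datetime(2025, 2, 28, 12, 0, 0, tzinfo=timezone.utc)
samples = [
    (base, base + timedelta(seconds=4)),
    (base + timedelta(seconds=10), base + timedelta(seconds=16)),
    (base + timedelta(seconds=20), base + timedelta(seconds=55)),
]

lags = replication_lags(samples)
summary = summarize(lags, threshold_seconds=30)
print(lags, summary)
```

Tracking the maximum and the breach count alongside the average helps, because a single slow batch can hide behind a healthy mean.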
| 79 | + |
| 80 | +```mermaid |
| 81 | +graph LR |
| 82 | + A[ERP System Database] -->|Efficient Data Transfer| B[Replication Engine] |
| 83 | + B -->|Near Real-Time Replication| C[Fabric OneLake] |
| 84 | + C -->|Monitors Latency| D[Latency Metrics] |
| 85 | + D -->|Enables| E[Real-Time Analytics] |
| 86 | +
|
| 87 | + subgraph Low Latency in MS Fabric |
| 88 | + direction LR |
| 89 | + A --> B |
| 90 | + B --> C |
| 91 | + C --> D |
| 92 | + D --> E |
| 93 | + end |
| 94 | +``` |
| 95 | + |
| 96 | +### Read-Only Access |
| 97 | + |
| 98 | +- **Read-Only Format:** The mirrored data in Fabric's OneLake is stored in a read-only format, which means that it cannot be modified directly by users or applications. |
| 99 | +- **Protection of Source Database:** Analytics queries run against the mirrored copy instead of the operational ERP database, so the ERP system is protected from performance degradation caused by analytics workloads. |
| 100 | +- **Access Controls:** Microsoft Fabric implements robust access controls to ensure that only authorized users can access the mirrored data. This includes role-based access control (RBAC) and item-level permissions. |
| 101 | +- **Data Security:** The read-only nature of the mirrored data also enhances security by preventing accidental or malicious modifications to the data. |
| 102 | + |
| 103 | +```mermaid |
| 104 | +graph TD |
| 105 | + A[Read-Only Access] --> B[Read-Only Format] |
| 106 | + A --> C[Protection of Source DB] |
| 107 | + A --> D[Access Controls] |
| 108 | + A --> E[Data Security] |
| 109 | +
|
| 110 | + B -->|Stored in read-only format| F[Fabric OneLake] |
| 111 | + C -->|Prevents perf degradation| G[Operational ERP DB] |
| 112 | + D -->|Ensures authorized access| H[Role-Based Access Control] |
| 113 | + D -->|Ensures authorized access| I[Item-Level Permissions] |
| 114 | + E -->|Prevents modifications| J[Enhanced Security] |
| 115 | +``` |
| 116 | + |
| 117 | +### Use Cases |
| 118 | +```mermaid |
| 119 | +mindmap |
| 120 | + root((Use Cases)) |
| 121 | + Analytics and Reporting |
| 122 | + Benefits |
| 123 | + Uninterrupted Operations |
| 124 | + Timely Insights |
| 125 | + Scalability |
| 126 | + Examples |
| 127 | + Financial Reporting |
| 128 | + Sales Analysis |
| 129 | + Operational Metrics |
| 130 | + Data Engineering |
| 131 | + Benefits |
| 132 | + Data Transformation |
| 133 | + Data Integration |
| 134 | + Data Quality |
| 135 | + Examples |
| 136 | + ETL Processes |
| 137 | + Data Enrichment |
| 138 | + Data Aggregation |
| 139 | + Data Science |
| 140 | + Benefits |
| 141 | + Predictive Analytics |
| 142 | + Machine Learning |
| 143 | + AI Applications |
| 144 | + Examples |
| 145 | + Customer Segmentation |
| 146 | + Demand Forecasting |
| 147 | + Anomaly Detection |
| 148 | +``` |
| 149 | +#### Analytics and Reporting |
| 150 | + |
| 151 | +> The primary goal is to enable business users and analysts to perform data analysis and generate reports without affecting the performance of the operational ERP database. |
| 152 | +
|
| 153 | +**Benefits:** |
| 154 | +- **Uninterrupted Operations:** Because analytics queries run against the read-only mirrored copy rather than the source, they do not interfere with the day-to-day operations of the ERP system. |
| 155 | +- **Timely Insights:** Near real-time replication ensures that the data used for analytics is up-to-date, allowing for timely and accurate decision-making. |
| 156 | +- **Scalability:** The system can handle large volumes of data, making it suitable for enterprises with extensive reporting needs. |
| 157 | + |
| 158 | +> Examples: |
| 159 | +
|
| 160 | +| **Example** | **Description** | |
| 161 | +|-------------|-----------------| |
| 162 | +| **Financial Reporting** | Generate financial statements, balance sheets, and profit & loss reports. | |
| 163 | +| **Sales Analysis** | Analyze sales trends, customer behavior, and product performance. | |
| 164 | +| **Operational Metrics** | Monitor key performance indicators (KPIs) such as inventory levels, order fulfillment rates, and production efficiency. | |
| 165 | + |
| 166 | +#### Data Engineering |
| 167 | + |
| 168 | +> To facilitate data transformation, integration, and preparation tasks that are essential for building data pipelines and workflows. |
| 169 | +
|
| 170 | +> Benefits: |
| 171 | + - **Data Transformation:** Convert raw data into a structured format suitable for analysis and reporting. |
| 172 | + - **Data Integration:** Combine data from multiple sources, including the ERP system, to create a unified data repository. |
| 173 | + - **Data Quality:** Implement data cleansing and validation processes to ensure the accuracy and reliability of the data. |
| 174 | + |
| 175 | +> Examples: |
| 176 | +
|
| 177 | +| **Example** | **Description** | |
| 178 | +|-------------|-----------------| |
| 179 | +| **ETL Processes** | Extract, transform, and load data from the ERP system into a data warehouse or data lake. | |
| 180 | +| **Data Enrichment** | Enhance the ERP data with additional information from external sources, such as market data or customer feedback. | |
| 181 | +| **Data Aggregation** | Summarize and aggregate data to create high-level views and dashboards. | |
| 182 | + |
| 183 | + |
| 184 | +#### Data Science |
| 185 | +- **Purpose:** To leverage the mirrored data for advanced analytics, machine learning, and artificial intelligence projects. |
| 186 | +> Benefits: |
| 187 | + - **Predictive Analytics:** Use historical data to build models that predict future trends and outcomes. |
| 188 | + - **Machine Learning:** Train machine learning models on the mirrored data to automate decision-making processes and improve business operations. |
| 189 | + - **AI Applications:** Develop AI-driven applications that can provide insights, recommendations, and automation. |
| 190 | + |
| 191 | +> Examples: |
| 192 | +
|
| 193 | +| **Example** | **Description** | |
| 194 | +|-------------|-----------------| |
| 195 | +| **Customer Segmentation** | Use clustering algorithms to segment customers based on their behavior and preferences. | |
| 196 | +| **Demand Forecasting** | Predict future demand for products and services using time series analysis. | |
| 197 | +| **Anomaly Detection** | Identify unusual patterns or outliers in the data that may indicate fraud, errors, or opportunities. | |
| 198 | + |
| 199 | +## Mirroring Process |
| 200 | + |
| 201 | +```mermaid |
| 202 | +mindmap |
| 203 | + root((Mirroring Process)) |
| 204 | + Querying Mirrored Data |
| 205 | + Data Storage |
| 206 | +``` |
| 207 | + |
| 208 | +### Data Storage |
| 209 | + |
| 210 | +```mermaid |
| 211 | +graph TD |
| 212 | + A[SQL Database in Fabric] -->|Stores Data| B[.mdf Files] |
| 213 | + A -->|Mirrors Data| C[OneLake] |
| 214 | + C -->|Stores Data| D[Delta Parquet Files] |
| 215 | + D -->|Optimized for| E[Analytical Queries] |
| 216 | +
|
| 217 | + subgraph Data Storage |
| 218 | + A --> B |
| 219 | + A --> C |
| 220 | + C --> D |
| 221 | + D --> E |
| 222 | + end |
| 223 | +``` |
| 224 | + |
| 225 | +- **Mirrored Data Storage:** The mirrored data is stored as delta parquet files in OneLake. Parquet is a columnar storage file format optimized for analytical queries, providing efficient data compression and encoding schemes. |
| 226 | +- **Delta Format:** The delta format supports ACID transactions and allows for efficient data updates and deletions. This ensures that the mirrored data remains consistent and reliable. |
| 227 | +- **Storage Management:** OneLake provides a unified storage solution that integrates with various Fabric services, such as Spark, Power BI, and data engineering tools. This integration allows for seamless data management and analytics. |
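
The way the Delta format supports updates and deletes on top of immutable Parquet files can be illustrated with a toy model of its transaction log. The sketch below is a deliberate simplification: it assumes each commit is just an ordered list of `add` and `remove` file actions, and the `live_files` helper and the file names are hypothetical. The real log carries far more metadata.

```python
def live_files(commits):
    """Replay commits in order and return the data files in the resulting table version."""
    files = set()
    for commit in commits:
        for action, path in commit:
            if action == "add":
                files.add(path)
            elif action == "remove":
                files.discard(path)
            else:
                raise ValueError(f"unknown action: {action!r}")
    return files


# Invented log: two appends, then an update that rewrites the first file.
commits = [
    [("add", "part-000.parquet")],
    [("add", "part-001.parquet")],
    [("remove", "part-000.parquet"), ("add", "part-002.parquet")],
]

current = live_files(commits)        # latest version of the table
previous = live_files(commits[:2])   # table as of the second commit
print(sorted(current), sorted(previous))
```

Because an update is recorded as one commit that removes the old file and adds its replacement, a reader sees either the version before the commit or the version after it, never a mix of the two.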
| 228 | + |
| 229 | + |
| 230 | +### Querying Mirrored Data |
| 231 | + |
| 232 | +```mermaid |
| 233 | +mindmap |
| 234 | + root((Querying Mirrored Data)) |
| 235 | + SQL Analytics Endpoint |
| 236 | + Queries Data |
| 237 | + Delta Parquet Files in OneLake |
| 238 | + Provides Data |
| 239 | + Reporting and Analytics |
| 240 | + Read-Only Mode |
| 241 | + Protects Operational Data |
| 242 | + Creates Views |
| 243 | + Data Presentation |
| 244 | + Joins Tables |
| 245 | + Cross-Database Queries |
| 246 | + Access Tools |
| 247 | + SSMS |
| 248 | + VS Code |
| 249 | + Fabric Portal |
| 250 | +``` |
| 251 | + |
| 252 | +- **SQL Analytics Endpoint:** The SQL analytics endpoint provides a SQL-based experience for querying mirrored data stored as delta parquet files in OneLake. It allows you to run reporting and analytics queries without impacting the primary workload of the operational ERP database. |
| 253 | +- **Read-Only Mode:** The endpoint exposes the mirrored tables in read-only mode, so analytics queries cannot modify the mirrored data. Because those queries run against the copy in OneLake, they also add no load to the operational database. |
| 254 | +- **Creating Views:** You can create views in your SQL analytics endpoint to shape the data presentation. Views allow you to define how the data should be displayed and can include complex joins, aggregations, and calculations. |
| 255 | +- **Joining Tables:** The SQL analytics endpoint supports joining mirrored tables with tables in other warehouses or lakehouses in the same workspace. This enables you to perform cross-database queries and integrate data from multiple sources for comprehensive analysis. |
| 256 | +- **Access and Querying Tools:** You can access the SQL analytics endpoint through tools such as SQL Server Management Studio, Visual Studio Code, and the Fabric portal. It supports T-SQL, so you can create views and functions and apply SQL security. |
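
As a rough illustration of the view and cross-database join points above, the sketch below assembles a T-SQL `CREATE VIEW` statement that joins a mirrored table with a lakehouse table using three-part names. Every object name here (`ErpMirror`, `SalesLakehouse`, `Orders`, `CustomerSegments`, the columns) is hypothetical, and the helper only builds the statement text. Running it would require a SQL client connected to your own endpoint, which is not shown.

```python
def quote_name(name: str) -> str:
    """Bracket-quote a T-SQL identifier, escaping any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_view_sql(view_name: str, mirrored_db: str, lakehouse_db: str) -> str:
    """Build a CREATE VIEW statement joining a mirrored table to a lakehouse table."""
    orders = f"{quote_name(mirrored_db)}.[dbo].[Orders]"             # hypothetical mirrored table
    segments = f"{quote_name(lakehouse_db)}.[dbo].[CustomerSegments]"  # hypothetical lakehouse table
    return (
        f"CREATE VIEW [dbo].{quote_name(view_name)} AS\n"
        f"SELECT o.OrderId, o.Amount, c.Segment\n"
        f"FROM {orders} AS o\n"
        f"JOIN {segments} AS c\n"
        f"    ON c.CustomerId = o.CustomerId;"
    )


sql = build_view_sql("vw_OrdersBySegment", "ErpMirror", "SalesLakehouse")
print(sql)
```

Keeping the statement builder separate from the connection code makes the query text easy to review and version before it is run from SSMS, Visual Studio Code, or the Fabric portal.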
| 257 | + |
| 258 | + |
| 259 | + |
| 260 | + |
| 261 | + |
| 262 | + |
| 263 | + |
| 264 | +<div align="center"> |
| 265 | + <h3 style="color: #4CAF50;">Total Visitors</h3> |
| 266 | + <img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/> |
| 267 | +</div> |
| 268 | + |
| 269 | + |