 - [How to Configure and Use the SQL Analytics Endpoint](#how-to-configure-and-use-the-sql-analytics-endpoint)
+- [Fabric AI Skill](#fabric-ai-skill)

 </details>

@@ -160,7 +161,7 @@ graph TD
 > - `ACID Transaction`s: Ensures data reliability and consistency, supporting complex data operations without data corruption.
 > - `Schema Enforcement and Evolution`: Allows for schema changes over time, making it easier to manage evolving data structures.
 > - `Time Travel:` Enables querying of historical data, providing the ability to access and revert to previous versions of data.
-> - `Efficient Data Management`: Features like compaction, Z-Order, and V-Order optimize data storage and query performance
+> - `Efficient Data Management`: Features like compaction, [Z-Order](#z-order-and-v-order), and [V-Order](#z-order-and-v-order) optimize data storage and query performance
-|**Purpose**| Improves query performance by co-locating related information in the same set of files. | Enhances read performance by organizing data in a way that leverages Microsoft Verti-Scan technology. |
-|**Key Features**| - Data Co-Location: Organizes data based on one or more columns, storing rows with similar values together. <br/> - Query Efficiency: Reduces the amount of data read during queries, improving performance. <br/> - Compatibility: Works with Delta Lake to enhance data-skipping algorithms. | - Special Sorting: Applies special sorting techniques to Parquet files. <br/> - Row Group Distribution: Optimizes row group distribution for better read performance. <br/> - Dictionary Encoding and Compression: Uses efficient dictionary encoding and compression. <br/> - Performance Boost: Provides fast reads under various compute engines. <br/> - Cost Efficiency: Reduces network, disk, and CPU resources during reads. |
-|**Timing**| Applied during table optimization (for example, with `OPTIMIZE ... ZORDER BY`). | Applied during write time. |
-|**Use Cases**| - When you need to improve query performance by reducing the amount of data read. <br/> - For queries that frequently filter on specific columns. | - When you need to enhance read performance and reduce storage costs. <br/> - For scenarios requiring efficient data access across various compute engines. |
-| **Compatibility** | Requires specific tools like Delta Lake. | Universally compatible with all Parquet engines.
-
| Feature | Parquet | Delta | Available in Parquet? | Available in Delta? |
|**Data Versioning**| Not available, limiting the ability to track changes over time. | Provides data versioning, allowing for auditing and rollback scenarios. | ❌ | ✔️ |
|**Efficient Updates**| Does not support efficient updates, making it less suitable for frequently changing data. | Allows for efficient updates and deletes, ideal for dynamic datasets. | ❌ | ✔️ |
-|**Query Optimization**| Basic query optimization, relying on columnar storage benefits. | Advanced query optimization with features like data skipping and Z-order indexing. | ✔️ | ✔️ |
+|**Query Optimization**| Basic query optimization, relying on columnar storage benefits. | Advanced query optimization with features like data skipping and [Z-order](#z-order-and-v-order) indexing. | ✔️ | ✔️ |
|**Use Case**| Ideal for data warehousing, batch processing, and scenarios where data is primarily read and not frequently updated. | Best suited for data lakes, real-time analytics, and environments requiring strict data integrity and frequent updates. | ✔️ | ✔️ |
|**Additional Context**| Parquet is excellent for read-heavy workloads and large-scale data analytics. It's widely supported and highly efficient for scenarios where data doesn't change frequently. | Delta builds on Parquet by adding features like ACID transactions, data versioning, and efficient updates/deletes. It's designed for environments where data integrity, frequent updates, and complex data operations are crucial. | ✔️ | ✔️ |
+|**Purpose**| Improves query performance by co-locating related information in the same set of files. | Enhances read performance by organizing data in a way that leverages Microsoft Verti-Scan technology. |
+|**Key Features**| - Data Co-Location: Organizes data based on one or more columns, storing rows with similar values together. <br/> - Query Efficiency: Reduces the amount of data read during queries, improving performance. <br/> - Compatibility: Works with Delta Lake to enhance data-skipping algorithms. | - Special Sorting: Applies special sorting techniques to Parquet files. <br/> - Row Group Distribution: Optimizes row group distribution for better read performance. <br/> - Dictionary Encoding and Compression: Uses efficient dictionary encoding and compression. <br/> - Performance Boost: Provides fast reads under various compute engines. <br/> - Cost Efficiency: Reduces network, disk, and CPU resources during reads. |
+|**Timing**| Applied during table optimization (for example, with `OPTIMIZE ... ZORDER BY`). | Applied during write time. |
+|**Use Cases**| - When you need to improve query performance by reducing the amount of data read. <br/> - For queries that frequently filter on specific columns. | - When you need to enhance read performance and reduce storage costs. <br/> - For scenarios requiring efficient data access across various compute engines. |
+| **Compatibility** | Requires specific tools like Delta Lake. | Universally compatible with all Parquet engines.
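
The "Data Co-Location" and "Query Efficiency" rows for Z-Order can be illustrated with a toy sketch in plain Python. It assumes a Morton-style bit interleaving as the ordering and per-file min/max statistics as the skipping mechanism; it is not Delta Lake's or Fabric's implementation, and the names `z_value`, `split_into_files`, and `files_to_read` are hypothetical helpers introduced only for this example.

```python
from typing import List, Tuple

Row = Tuple[int, int]  # a toy two-column row: (x, y)


def z_value(x: int, y: int, bits: int = 8) -> int:
    """Interleave the low `bits` bits of x and y into a single Morton code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # bits of x land on even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # bits of y land on odd positions
    return code


def split_into_files(rows: List[Row], rows_per_file: int) -> List[List[Row]]:
    """Cut an ordered row list into fixed-size chunks standing in for data files."""
    return [rows[i:i + rows_per_file] for i in range(0, len(rows), rows_per_file)]


def files_to_read(files: List[List[Row]], col: int, value: int) -> int:
    """Count files whose min/max range on column `col` could contain `value`."""
    return sum(
        1 for f in files
        if min(r[col] for r in f) <= value <= max(r[col] for r in f)
    )


rows = [(x, y) for x in range(8) for y in range(8)]  # 64 rows, 16 per file

# Ordinary sort: ordered by x first, then y.
linear_files = split_into_files(sorted(rows), 16)
# Z-order sort: ordered by the interleaved code of both columns.
z_files = split_into_files(sorted(rows, key=lambda r: z_value(r[0], r[1])), 16)

print(files_to_read(linear_files, 0, 5), files_to_read(linear_files, 1, 5))  # 1 4
print(files_to_read(z_files, 0, 5), files_to_read(z_files, 1, 5))            # 2 2
```

In this toy data set, the ordinary sort lets a filter on `x` skip three of four files but gives no help for a filter on `y`, while the interleaved ordering lets a filter on either column skip half the files. That trade-off is why Z-Order suits queries that filter on more than one column.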