Commit 0f93896

Merge 5f52a39 into 41a350d
2 parents 41a350d + 5f52a39 commit 0f93896

4 files changed

Lines changed: 141 additions & 30 deletions

0_Azure/1_AzureData/README.md

Lines changed: 2 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -5,25 +5,17 @@ Costa Rica
55
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
66
[brown9804](https://github.com/brown9804)
77

8-
Last updated: 2024-11-19
8+
Last updated: 2025-02-13
99

1010
----------
1111

1212
## Content
1313

1414
- [Data Storage](./0_DataStorage/)
1515
- [Databases](./1_Databases/)
16-
- [Azure Databricks](./2_Databricks)
1716

1817
Azure Data Storage provides scalable, secure, and accessible cloud storage, ideal for big data and analytics, with various storage tiers. It supports a wide range of services and tools. Azure also offers relational and non-relational databases, with built-in management for high availability and performance, catering to different application needs.
1918

20-
21-
| Area | Category | Service | Overview |
22-
| ---- | ---- | ---- | ---- |
23-
| Big Data Analytics | Service | [Azure Databricks](https://azure.microsoft.com/en-us/products/databricks/) | Azure Databricks is a fast, easy, and collaborative Apache Spark-based analytics platform optimized for Azure. It provides an interactive workspace for data engineers, data scientists, and business analysts. <br/> <br/> For more information: <br/> [Azure Databricks Overview](https://azure.microsoft.com/en-us/products/databricks/) <br/> [What is Azure Databricks?](https://learn.microsoft.com/en-us/azure/databricks/scenarios/what-is-azure-databricks) <br/> [Azure Databricks Learning documents](https://learn.microsoft.com/en-us/azure/databricks/). |
24-
| Data Integration | Service | [Azure Data Factory](https://azure.microsoft.com/en-us/products/data-factory/) | Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows for orchestrating data movement and transforming data at scale. <br/> <br/> For more information: <br/> [Azure Data Factory Overview](https://azure.microsoft.com/en-us/products/data-factory/) <br/> [What is Azure Data Factory?](https://learn.microsoft.com/en-us/azure/data-factory/introduction) <br/> [Azure Data Factory Learning documents](https://learn.microsoft.com/en-us/azure/data-factory/). |
25-
26-
2719
## Differences between Azure Data Storage and Databases
2820

2921
Azure Data Storage and Databases both persist data but are optimized for different purposes. Storage provides durable capacity while databases structure data for efficient access. Storage suits long-term file retention while databases enable interactive applications.
@@ -41,8 +33,6 @@ Image from [here](https://www.edureka.co/blog/azure-storage-tutorial/)
4133
| Durable long-term retention | Temporary persistence tier |
4234

4335

44-
45-
4636
## DataFrames types:
4737

4838
Comparative analysis of various types of DataFrames. Each type of DataFrame has its unique features and is suited for different use cases. The table below summarizes the key characteristics and common applications of each type:
@@ -62,4 +52,4 @@ Comparative analysis of various types of DataFrames. Each type of DataFrame has
6252
<div align="center">
6353
<h3 style="color: #4CAF50;">Total Visitors</h3>
6454
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
65-
</div>
55+
</div>
Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
1+
# Azure Databricks
2+
3+
Costa Rica
4+
5+
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
6+
[brown9804](https://github.com/brown9804)
7+
8+
Last updated: 2024-11-15
9+
10+
----------
11+
12+
13+
## Wiki
14+
- [What is Azure Databricks?](https://learn.microsoft.com/en-us/azure/databricks/introduction/)
15+
- [Tutorial: Implement Azure Databricks with an Azure Cosmos DB endpoint](https://learn.microsoft.com/en-us/azure/databricks/scenarios/service-endpoint-cosmosdb)
16+
- [Query databases using JDBC](https://learn.microsoft.com/en-us/azure/databricks/connect/external-systems/jdbc)
17+
- [Query SQL Server with Azure Databricks](https://learn.microsoft.com/en-us/azure/databricks/connect/external-systems/sql-server)
18+
- [How to connect from Azure Databricks to Azure SQL DB using service principal](https://stackoverflow.com/collectives/azure/articles/75189853/how-to-connect-from-azure-databricks-to-azure-sql-db-using-service-principal)
19+
- [Azure DataBricks To Connect SQL DataBase with Pyspark](https://stackoverflow.com/questions/76820391/azure-databricks-to-connect-sql-database-with-pyspark)
20+
- [Use a SQL connector, driver, or API](https://learn.microsoft.com/en-us/azure/databricks/dev-tools/index-driver)
21+
- [Connect to azure sql database from databricks](https://community.databricks.com/t5/data-engineering/connect-to-azure-sql-database-from-databricks-using-service/td-p/36174)
22+
- [Query data in Azure Synapse Analytics](https://learn.microsoft.com/en-us/azure/databricks/connect/external-systems/synapse-analytics)
23+
- [Connection from databricks to azure synapse](https://stackoverflow.com/questions/72873898/connection-from-databricks-to-azure-synapse)
24+
25+
## Introduction to Azure Databricks
26+
27+
> Azure Databricks is a comprehensive analytics platform for big data and AI, built on Apache Spark. It offers a collaborative workspace for data engineers, scientists, and analysts to engage in data processing, machine learning, and real-time analytics.
28+
29+
<figure>
30+
<img
31+
width="800"
32+
src="https://github.com/user-attachments/assets/c6c298f2-aae2-4ae8-b6cc-0407a22a32a2"
33+
alt="Azure Databricks overview image from databricks.com">
34+
<figcaption> <br/> From https://www.databricks.com/product/azure </figcaption>
35+
</figure>
36+
37+
| Aspect | Details |
38+
| ----- | ---- |
39+
| Pricing tier | - **Standard**: Includes core Apache Spark features and Microsoft Entra integration. <br/> - **Premium**: Offers role-based access controls and advanced enterprise features. <br/> - **Trial**: Provides a 14-day free trial of the Premium workspace. |
40+
41+
42+
### **Key Features**
43+
Here are some of the key features of Azure Databricks:
44+
45+
| **Feature** | **Description** |
46+
|------------------------|---------------------------------------------------------------------------------|
47+
| **Data Processing** | Efficiently process large volumes of data using Apache Spark. |
48+
| **Machine Learning** | Build, train, and deploy machine learning models at scale. |
49+
| **Real-Time Analytics**| Perform real-time data analysis and generate insights quickly. |
50+
| **Collaborative Workspace** | Provides a collaborative environment for different roles to work together. |
51+
| **Scalability** | Automatically scales resources to handle varying workloads. |
52+
| **Integration** | Seamlessly integrates with other Azure services and open-source tools. |
53+
54+
This diagram shows how Azure Databricks integrates with various data sources and storage solutions, processes data using Apache Spark, and supports machine learning and real-time analytics.
55+
56+
```mermaid
graph TD
    A[Data Sources] -->|Azure Blob Storage, SQL DB| B[Azure Databricks Workspace]
    B --> C[Machine Learning - MLflow, etc.]

    E[Data Storage] -->|Azure Data Lake, SQL DB| F[Data Processing - Apache Spark]
    F --> D[Real-Time Analytics - Dashboards, etc.]
```
65+
66+
## Architecture and Components
67+
68+
Azure Databricks architecture is divided into two main components:
69+
70+
| **Component** | **Description** |
71+
|--------------------------|---------------------------------------------------------------------------------|
72+
| **Control Plane** | Manages backend services, authentication, job scheduling, and cluster management. Hosts the web application and REST APIs. |
73+
| **Compute Plane** | Where data processing happens, consisting of clusters running Apache Spark jobs. |
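
Because the control plane hosts the REST APIs, cluster management can be scripted against the workspace URL. The following is a minimal sketch, not a definitive client: it only builds (and does not send) a request for the Clusters API list endpoint. The helper name `build_list_clusters_request`, the workspace URL, and the token value are hypothetical placeholders introduced for illustration.

```python
import urllib.request


def build_list_clusters_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for the Clusters API list endpoint.

    Assumption: authentication uses a bearer token (for example, a personal
    access token) passed in the Authorization header.
    """
    base = workspace_url.rstrip("/")  # tolerate a trailing slash in the URL
    return urllib.request.Request(
        url=f"{base}/api/2.0/clusters/list",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


if __name__ == "__main__":
    # Hypothetical workspace URL and token, for illustration only.
    req = build_list_clusters_request(
        "https://adb-0000000000000000.0.azuredatabricks.net/", "EXAMPLE-TOKEN"
    )
    print(req.get_method(), req.full_url)
```

Sending the request (for example with `urllib.request.urlopen(req)`) is left out on purpose, since it needs a real workspace and credentials.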
74+
75+
There are two types of compute planes:
76+
77+
| **Type of Compute Plane** | **Description** |
78+
|--------------------------|---------------------------------------------------------------------------------|
79+
| **Serverless Compute** | Managed by Azure Databricks, with automatic scaling and resource management. Ideal for users preferring a hands-off approach. |
80+
| **Classic Compute** | Managed by the user, with full control over compute resources within their Azure subscription. Allows for more customization and manual scaling. |
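
With classic compute, sizing decisions stay with the user. As a rough illustration, the sketch below assembles a cluster specification with an autoscaling range, in the shape accepted by the Clusters API create call. The function name `classic_cluster_spec` is hypothetical, and the runtime version and VM size in the usage example are placeholder values that would need to match what a given workspace actually offers.

```python
import json


def classic_cluster_spec(name, spark_version, node_type_id, min_workers, max_workers):
    """Return a dict describing a classic cluster with an autoscaling range.

    The keys follow the Clusters API payload; the values are supplied by the
    caller and are not validated against any real workspace.
    """
    if min_workers < 0 or max_workers < min_workers:
        raise ValueError("expected 0 <= min_workers <= max_workers")
    return {
        "cluster_name": name,
        "spark_version": spark_version,
        "node_type_id": node_type_id,
        "autoscale": {"min_workers": min_workers, "max_workers": max_workers},
    }


if __name__ == "__main__":
    # Placeholder runtime version and VM size, for illustration only.
    spec = classic_cluster_spec("demo-cluster", "15.4.x-scala2.12", "Standard_DS3_v2", 1, 4)
    print(json.dumps(spec, indent=2))
```

Serverless compute has no equivalent step from the user's side, since Azure Databricks manages scaling and resources there.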
81+
82+
The diagram below shows how the control plane interacts with both the serverless and classic compute planes.
83+
84+
```mermaid
graph TD
    subgraph Control Plane
        A[Backend Services]
        B[Web Application]
        C[REST APIs]
    end
    subgraph Compute Plane
        D[Serverless Compute]
        E[Classic Compute]
    end
    A --> D
    A --> E
    B --> D
    B --> E
    C --> D
    C --> E
```
102+
103+
## Recommended Training
104+
- [Explore Azure Databricks](https://learn.microsoft.com/en-us/training/modules/explore-azure-databricks/)
105+
- [Perform data analysis with Azure Databricks](https://learn.microsoft.com/en-us/training/modules/perform-data-analysis-azure-databricks/)
106+
- [Use Apache Spark in Azure Databricks](https://learn.microsoft.com/en-us/training/modules/use-apache-spark-azure-databricks/)
107+
108+
<div align="center">
109+
<h3 style="color: #4CAF50;">Total Visitors</h3>
110+
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
111+
</div>

0_Azure/2_AzureAnalytics/README.md

Lines changed: 5 additions & 2 deletions
@@ -6,7 +6,7 @@ Costa Rica
66
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
77
[brown9804](https://github.com/brown9804)
88

9-
Last updated: 2024-11-19
9+
Last updated: 2025-02-13
1010

1111
----------
1212

@@ -15,6 +15,7 @@ Last updated: 2024-11-19
1515
- [Microsoft Fabric](./0_Fabric/README.md)
1616
- [Azure Synapse Analytics](./1_SynapseAnalytics/README.md)
1717
- [Azure Event Hubs](./2_EventHubs)
18+
- [Azure Databricks](./3_Databricks)
1819

1920
## Wiki
2021

@@ -24,8 +25,10 @@ Last updated: 2024-11-19
2425
| Real-time Data Processing | Service | [Azure Stream Analytics](https://azure.microsoft.com/en-us/products/stream-analytics/) | Azure Stream Analytics is a real-time analytics and complex event-processing engine that is designed to analyze and process high volumes of fast streaming data from multiple sources simultaneously. <br/> <br/> For more information: <br/> [Azure Stream Analytics Overview](https://azure.microsoft.com/en-us/products/stream-analytics/) <br/> [What is Azure Stream Analytics?](https://learn.microsoft.com/en-us/azure/stream-analytics/stream-analytics-introduction) <br/> [Azure Stream Analytics Learning documents](https://learn.microsoft.com/en-us/azure/stream-analytics/). |
2526
| Unified Analytics Platform | Service | [Microsoft Fabric](https://azure.microsoft.com/en-us/blog/introducing-microsoft-fabric-data-analytics-for-the-era-of-ai/) | Microsoft Fabric is an end-to-end, unified analytics platform that integrates technologies like Azure Data Factory, Azure Synapse Analytics, and Power BI into a single product. It covers data movement, processing, ingestion, transformation, real-time event routing, and report building, simplifying analytics requirements for enterprises. <br/> <br/> For more information: <br/> [Microsoft Fabric Overview](https://azure.microsoft.com/en-us/blog/introducing-microsoft-fabric-data-analytics-for-the-era-of-ai/) <br/> [What is Microsoft Fabric?](https://learn.microsoft.com/en-us/fabric/get-started/microsoft-fabric-overview) <br/> [Microsoft Fabric Learning documents](https://learn.microsoft.com/en-us/fabric/). |
2627
| Real-time Data Streaming | Service | [Azure Event Hubs](./2_EventHubs/) | Azure Event Hubs is a fully managed, real-time data streaming platform that can ingest and process millions of events per second with low latency. It supports multiple protocols, including Apache Kafka, and integrates with various Azure services for building comprehensive data streaming solutions. <br/> <br/> For more information: <br/> [Azure Event Hubs Overview](https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-about) <br/> [What is Azure Event Hubs?](https://learn.microsoft.com/en-us/azure/event-hubs/event-hubs-about) <br/> Azure Event Hubs Learning documents. |
28+
| Big Data Analytics | Service | [Azure Databricks](https://azure.microsoft.com/en-us/products/databricks/) | Azure Databricks is a fast, easy, and collaborative Apache Spark-based analytics platform optimized for Azure. It provides an interactive workspace for data engineers, data scientists, and business analysts. <br/> <br/> For more information: <br/> [Azure Databricks Overview](https://azure.microsoft.com/en-us/products/databricks/) <br/> [What is Azure Databricks?](https://learn.microsoft.com/en-us/azure/databricks/scenarios/what-is-azure-databricks) <br/> [Azure Databricks Learning documents](https://learn.microsoft.com/en-us/azure/databricks/). |
29+
| Data Integration | Service | [Azure Data Factory](https://azure.microsoft.com/en-us/products/data-factory/) | Azure Data Factory is a cloud-based data integration service that allows you to create data-driven workflows for orchestrating data movement and transforming data at scale. <br/> <br/> For more information: <br/> [Azure Data Factory Overview](https://azure.microsoft.com/en-us/products/data-factory/) <br/> [What is Azure Data Factory?](https://learn.microsoft.com/en-us/azure/data-factory/introduction) <br/> [Azure Data Factory Learning documents](https://learn.microsoft.com/en-us/azure/data-factory/). |
2730

2831
<div align="center">
2932
<h3 style="color: #4CAF50;">Total Visitors</h3>
3033
<img src="https://profile-counter.glitch.me/brown9804/count.svg" alt="Visitor Count" style="border: 2px solid #4CAF50; border-radius: 5px; padding: 5px;"/>
31-
</div>
34+
</div>

0_Azure/3_AzureAI/9_AzureOpenAI/README.md

Lines changed: 23 additions & 16 deletions
@@ -7,7 +7,7 @@ Costa Rica
77
[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
88
[brown9804](https://github.com/brown9804)
99

10-
Last updated: 2024-11-19
10+
Last updated: 2025-02-13
1111

1212
----------
1313

@@ -17,21 +17,28 @@ Key Features:
1717
- **Models Available**: The models available through Azure OpenAI Service include GPT-4, GPT-3.5-Turbo, and Embeddings.
1818
- **Security and Compliance**: Azure OpenAI Service offers the security and enterprise promise of Azure.
1919

20-
## Content
21-
22-
<!-- TOC -->
23-
24-
- [Azure OpenAI](#azure-openai)
25-
- [Content](#content)
26-
- [Example of Use Cases](#example-of-use-cases)
27-
- [Text Generation - Azure OpenAI:](#text-generation---azure-openai)
28-
- [Sentiment Analysis - Azure OpenAI:](#sentiment-analysis---azure-openai)
29-
- [Language Translation - Azure OpenAI:](#language-translation---azure-openai)
30-
- [Content Moderation - Azure OpenAI:](#content-moderation---azure-openai)
31-
- [Speech Recognition - Azure OpenAI:](#speech-recognition---azure-openai)
32-
- [Demos & Tech Talks](./demos/)
33-
- [Optimizing Chatbot Efficiency](./demos/1_OptimizingChatbotEfficiency.md)
34-
<!-- /TOC -->
20+
21+
<details>
22+
<summary><b>List of References </b> (Click to expand)</summary>
23+
24+
- [Azure OpenAI resource - Changelog](https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/openai/Azure.AI.OpenAI/CHANGELOG.md)
25+
26+
</details>
27+
28+
<details>
29+
<summary><b>Table of Contents </b> (Click to expand)</summary>
30+
31+
- [Example of Use Cases](#example-of-use-cases)
32+
- [Text Generation - Azure OpenAI:](#text-generation---azure-openai)
33+
- [Sentiment Analysis - Azure OpenAI:](#sentiment-analysis---azure-openai)
34+
- [Language Translation - Azure OpenAI:](#language-translation---azure-openai)
35+
- [Content Moderation - Azure OpenAI:](#content-moderation---azure-openai)
36+
- [Speech Recognition - Azure OpenAI:](#speech-recognition---azure-openai)
37+
- [Demos & Tech Talks](./demos/)
38+
- [Optimizing Chatbot Efficiency](./demos/1_OptimizingChatbotEfficiency.md)
39+
40+
</details>
41+
3542

3643
## Example of Use Cases
3744
