diff --git a/0_Azure/2_AzureAnalytics/0_Fabric/demos/30_dynamic_pipeline_nbkparameters.md b/0_Azure/2_AzureAnalytics/0_Fabric/demos/30_DynamicPipeline_nbkparametersADF.md
similarity index 99%
rename from 0_Azure/2_AzureAnalytics/0_Fabric/demos/30_dynamic_pipeline_nbkparameters.md
rename to 0_Azure/2_AzureAnalytics/0_Fabric/demos/30_DynamicPipeline_nbkparametersADF.md
index 979a977d7..46318cee2 100644
--- a/0_Azure/2_AzureAnalytics/0_Fabric/demos/30_dynamic_pipeline_nbkparameters.md
+++ b/0_Azure/2_AzureAnalytics/0_Fabric/demos/30_DynamicPipeline_nbkparametersADF.md
@@ -5,7 +5,7 @@ Costa Rica
 
 [![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
 [brown9804](https://github.com/brown9804)
 
-Last updated: 2025-03-03
+Last updated: 2025-03-05
 
 ----------
diff --git a/0_Azure/2_AzureAnalytics/0_Fabric/demos/31_FabricActivatorRulePipeline/GeneratesRandomData.ipynb b/0_Azure/2_AzureAnalytics/0_Fabric/demos/31_FabricActivatorRulePipeline/GeneratesRandomData.ipynb
new file mode 100644
index 000000000..ef4d12893
--- /dev/null
+++ b/0_Azure/2_AzureAnalytics/0_Fabric/demos/31_FabricActivatorRulePipeline/GeneratesRandomData.ipynb
@@ -0,0 +1 @@
+{"cells":[{"cell_type":"code","source":["# Generates dummy data and saves it as a Delta table under Files/\n","\n","# Import necessary libraries\n","from pyspark.sql import SparkSession\n","from pyspark.sql.types import *\n","import random\n","from datetime import datetime, timedelta\n","\n","# Initialize Spark session (if not already initialized)\n","spark = SparkSession.builder.appName(\"GenerateRandomData\").getOrCreate()\n","\n","# Function to generate random data\n","def generate_random_data(num_entries):\n","    data = []\n","    for i in range(1, num_entries + 1):\n","        name = f\"User{i}\"\n","        entry = {\n","            \"id\": i,\n","            \"name\": name,\n","            \"age\": random.randint(18, 65),\n","            \"email\": f\"{name.lower()}@example.com\",\n","            \"created_at\": (datetime.now() - timedelta(days=random.randint(0, 365))).strftime(\"%Y-%m-%d %H:%M:%S\")\n","        }\n","        data.append(entry)\n","    return data\n","\n","# Generate 10 random entries\n","random_data = generate_random_data(10)\n","\n","# Define schema for the DataFrame\n","schema = StructType([\n","    StructField(\"id\", IntegerType(), True),\n","    StructField(\"name\", StringType(), True),\n","    StructField(\"age\", IntegerType(), True),\n","    StructField(\"email\", StringType(), True),\n","    StructField(\"created_at\", StringType(), True)\n","])\n","\n","# Create a DataFrame from the random data\n","df_random_data = spark.createDataFrame(random_data, schema=schema)\n","\n","# Write the DataFrame to the Lakehouse in the specified path\n","output_path = \"abfss://{WORKSPACE-NAME}@onelake.dfs.fabric.microsoft.com/raw_Bronze.Lakehouse/Files/random_data\" # Replace {WORKSPACE-NAME}\n","df_random_data.write.format(\"delta\").mode(\"overwrite\").save(output_path)\n","\n","print(f\"Random data has been saved to the Lakehouse at '{output_path}'.\")"],"outputs":[],"execution_count":null,"metadata":{"microsoft":{"language":"python","language_group":"synapse_pyspark"}},"id":"8d820f25-3c2e-45b3-8a08-af78f0d45e1d"}],"metadata":{"kernel_info":{"name":"synapse_pyspark"},"kernelspec":{"name":"synapse_pyspark","language":"Python","display_name":"Synapse PySpark"},"language_info":{"name":"python"},"microsoft":{"language":"python","language_group":"synapse_pyspark","ms_spell_check":{"ms_spell_check_language":"en"}},"nteract":{"version":"nteract-front-end@1.0.0"},"spark_compute":{"compute_id":"/trident/default","session_options":{"conf":{"spark.synapse.nbs.session.timeout":"1200000"}}},"dependencies":{}},"nbformat":4,"nbformat_minor":5}
\ No newline at end of file
diff --git a/0_Azure/2_AzureAnalytics/0_Fabric/demos/31_FabricActivatorRulePipeline/README.md b/0_Azure/2_AzureAnalytics/0_Fabric/demos/31_FabricActivatorRulePipeline/README.md
new file mode 100644
index 000000000..c3dc87546
--- /dev/null
+++ 
b/0_Azure/2_AzureAnalytics/0_Fabric/demos/31_FabricActivatorRulePipeline/README.md
@@ -0,0 +1,124 @@
+# Microsoft Fabric: Automating Pipeline Execution with Activator
+
+Costa Rica
+
+[![GitHub](https://img.shields.io/badge/--181717?logo=github&logoColor=ffffff)](https://github.com/)
+[brown9804](https://github.com/brown9804)
+
+Last updated: 2025-03-05
+
+----------
+
+> This process shows how to set up Microsoft Fabric Activator to automate workflows by detecting file creation events in a storage system and triggering another pipeline to run.
+
+<details>
+<summary><b>List of Content</b> (Click to expand)</summary>
+
+ - [Set Up the First Pipeline](#set-up-the-first-pipeline)
+ - [Configure Activator to Detect the Event](#configure-activator-to-detect-the-event)
+ - [Set Up the Second Pipeline](#set-up-the-second-pipeline)
+ - [Define the Rule in Activator](#define-the-rule-in-activator)
+ - [Test the Entire Workflow](#test-the-entire-workflow)
+ - [Troubleshooting If Needed](#troubleshooting-if-needed)
+
+</details>
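Before the step-by-step setup, the overall pattern is small enough to sketch: Activator watches for an event, checks it against a rule condition, and runs an action. Below is a minimal, illustrative Python model of that flow; the class, file, and folder names are hypothetical, and the real rule is configured in the Fabric UI, not in code.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class FileEvent:
    """A simplified stand-in for the storage event Activator receives."""
    path: str
    event_type: str  # e.g. "FileCreated"

def make_rule(condition: Callable[[FileEvent], bool],
              action: Callable[[FileEvent], str]) -> Callable[[FileEvent], Optional[str]]:
    """Return a handler that runs `action` only when `condition` matches."""
    def handle(event: FileEvent) -> Optional[str]:
        return action(event) if condition(event) else None
    return handle

# Condition: the predictable trigger file was created
condition = lambda e: e.event_type == "FileCreated" and e.path.endswith("trigger_file.json")
# Action: a stand-in for "run the second pipeline"
action = lambda e: f"trigger second pipeline (cause: {e.path})"

rule = make_rule(condition, action)
print(rule(FileEvent("Files/triggers/trigger_file.json", "FileCreated")))
print(rule(FileEvent("Files/raw/data.parquet", "FileCreated")))  # no match -> None
```

Keeping the condition narrow (exact file name and event type) is the same design choice the walkthrough below makes when it insists on a consistent, predictable trigger file.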
+
+
+## Set Up the First Pipeline
+
+1. **Create the Pipeline**:
+    - In [Microsoft Fabric](https://app.fabric.microsoft.com/), create the first pipeline that performs the required tasks.
+
+> [!NOTE]
+> This code generates random data with fields such as id, name, age, email, and created_at, organizes it into a PySpark DataFrame, and saves it to a specified Lakehouse path using the Delta format. Click here to see the [example script](./GeneratesRandomData.ipynb)
+
+https://github.com/user-attachments/assets/95206bf3-83a7-42c1-b501-4879df22ef7d
+
+    - Add a `Copy Data` activity as the final step in the pipeline.
+2. **Generate the Trigger File**:
+    - Configure the `Copy Data` activity to create a trigger file in a specific location, such as `Azure Data Lake Storage (ADLS)` or `OneLake`.
+    - Ensure the file name and path are consistent and predictable (e.g., `trigger_file.json` in a specific folder).
+3. **Publish and Test**: Publish the pipeline and test it to ensure the trigger file is created successfully.
+
+    https://github.com/user-attachments/assets/798a3b12-c944-459d-9e77-0112b5d82831
+
+## Configure Activator to Detect the Event
+
+> [!TIP]
+> Event options:
+
+https://github.com/user-attachments/assets/282fae9b-e1c6-490d-bd23-9ed9bdf6105d
+
+1. **Set Up an Event**:
+    - Create a new event to monitor the location where the trigger file is created (e.g., ADLS or OneLake). Click on `Real-Time`:
+
+      image
+
+    - Choose the appropriate event type, such as `File Created`.
+
+      image
+
+      image
+
+    - Add a source:
+
+      image
+
+      image
+
+      https://github.com/user-attachments/assets/43a9654b-e8d0-44da-80b9-9f528483fa3b
+
+2. **Test Event Detection**:
+    - Save the event and test it by manually running the first pipeline to ensure Activator detects the file creation.
+    - Check the **Event Details** screen in Activator to confirm the event is logged.
+
+    https://github.com/user-attachments/assets/6b21194c-54b4-49de-9294-1bf78b1e5acd
+
+## Set Up the Second Pipeline
+
+1. **Create the Pipeline**:
+    - In Microsoft Fabric, create the second pipeline that performs the next set of tasks.
+    - Ensure it is configured to accept external triggers.
+2. **Publish the Pipeline**: Publish the second pipeline and ensure it is ready to be triggered.
+
+    https://github.com/user-attachments/assets/5b630579-a0ec-4d5b-b973-d9b4fdd8254c
+
+## Define the Rule in Activator
+
+1. **Set up the Activator**:
+
+    https://github.com/user-attachments/assets/7c88e080-d5aa-4920-acd6-94c2e4ae0568
+
+2. **Create a New Rule**:
+    - In `Activator`, create a rule that responds to the event you just configured.
+    - Set the condition to match the event details (e.g., file name, path, or metadata).
+3. **Set the Action**:
+    - Configure the rule to trigger the second pipeline.
+    - Specify the pipeline name and pass any required parameters.
+4. **Save and Activate**:
+    - Save the rule and activate it.
+    - Ensure the rule is enabled and ready to respond to the event.
+
+    https://github.com/user-attachments/assets/5f139eeb-bab0-4d43-9f22-bbe44503ed75
+
+## Test the Entire Workflow
+
+1. **Run the First Pipeline**: Execute the first pipeline and verify that the trigger file is created.
+2. **Monitor Activator**: Check the `Event Details` and `Rule Activation Details` in Activator to ensure the event is detected and the rule is activated.
+3. **Verify the Second Pipeline**: Confirm that the second pipeline is triggered and runs successfully.
+
+    https://github.com/user-attachments/assets/0a1dab70-2317-4636-b0be-aa0cb301b496
+
+## Troubleshooting (If Needed)
+
+- If the second pipeline does not trigger:
+  1. Double-check the rule configuration in Activator.
+  2. Review the logs in Activator for any errors or warnings.
+
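When the rule does not fire, the mismatch is usually between the file the `Copy Data` activity actually writes and the name/path the Activator rule filters on. A small local sketch for keeping that convention in one testable place; the folder name, file name, and payload fields below are illustrative assumptions, not part of the demo:

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical convention: Activator watches this folder for this exact file name.
TRIGGER_FOLDER = "Files/triggers"
TRIGGER_NAME = re.compile(r"^trigger_file\.json$")

def build_trigger_payload(source_pipeline: str) -> str:
    """JSON body the Copy Data activity could write into the trigger file."""
    payload = {
        "source_pipeline": source_pipeline,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(payload)

def matches_rule(path: str) -> bool:
    """True when a created file's path matches what the rule filters on."""
    folder, _, name = path.rpartition("/")
    return folder == TRIGGER_FOLDER and bool(TRIGGER_NAME.match(name))

print(matches_rule("Files/triggers/trigger_file.json"))  # True
print(matches_rule("Files/triggers/Trigger_File.JSON"))  # False: name must match exactly
print(build_trigger_payload("pl_generate_random_data"))
```

Running the same check against the path your pipeline actually writes is a quick first step before digging into Activator's logs.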
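As a quick way to sanity-check the notebook's data-generation logic without a Spark session, the same `generate_random_data` shape can be reproduced in plain Python. The field names mirror the notebook; everything else here is local-only and carries no Fabric dependency:

```python
import random
from datetime import datetime, timedelta

def generate_random_data(num_entries: int) -> list:
    """Mirror of the notebook's generator: id, name, age, email, created_at."""
    data = []
    for i in range(1, num_entries + 1):
        name = f"User{i}"
        data.append({
            "id": i,
            "name": name,
            "age": random.randint(18, 65),
            "email": f"{name.lower()}@example.com",
            "created_at": (datetime.now() - timedelta(days=random.randint(0, 365))).strftime("%Y-%m-%d %H:%M:%S"),
        })
    return data

rows = generate_random_data(10)
print(len(rows))         # 10
print(rows[0]["email"])  # user1@example.com
```

This makes it easy to unit-test the row shape before wrapping it in the PySpark DataFrame and Delta write shown in the notebook.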