
Part 8 (optional) - Create a Master Pipeline in Synapse

When working with multiple data ingestion or transformation notebooks (such as those in the Landing to Bronze layer), running each one manually is inefficient. A master pipeline solves this problem by:

  • Centralizing configuration and parameters.
  • Automatically distributing notebook execution.
  • Supporting parallelism for improved performance.
  • Scaling dynamically when new notebooks are added.

In this part, you create a master pipeline (PL_LDG_2_BRONZE) that orchestrates the execution of multiple Bronze notebooks in parallel.
The pipeline is dynamic and parameterized:

  • It receives global parameters (storage account, container names, base paths, data source, …).
  • It builds a list of Bronze notebook names (BRONZE_LIST) and iterates over it with a ForEach activity.
  • The ForEach activity runs the notebooks in parallel (controlled by batchCount) and passes the pipeline parameters into each notebook invocation.
  • It is designed for scalability, modular control, and reuse across environments (see the structural sketch after this list).
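
Before looking at the individual pieces, it helps to see how they fit together. The following is a minimal structural sketch of the pipeline JSON, not the full sample file: only STORAGE_ACCOUNT_NAME is shown under parameters for brevity, the activity name Set_BRONZE_LIST is a placeholder (check the downloaded JSON for the exact names), and the description fields are annotations added here for readability.

json
{
    "name": "PL_LDG_2_BRONZE",
    "properties": {
        "parameters": { "STORAGE_ACCOUNT_NAME": { "type": "string" } },
        "variables": { "BRONZE_LIST": { "type": "Array" } },
        "activities": [
            {
                "name": "Set_BRONZE_LIST",
                "type": "SetVariable",
                "description": "Placeholder name: fills BRONZE_LIST with the Bronze notebook names"
            },
            {
                "name": "ForEach_Bronze_Notebooks",
                "type": "ForEach",
                "description": "Runs one SynapseNotebook activity per list entry",
                "dependsOn": [
                    {
                        "activity": "Set_BRONZE_LIST",
                        "dependencyConditions": [ "Succeeded" ]
                    }
                ]
            }
        ]
    }
}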

1. Download sample pipelines

2. Create a New Pipeline

  • In Azure Synapse, create a new pipeline named PL_LDG_2_BRONZE, then click the {} (Code View) icon.

  • Copy the JSON content from the downloaded sample file (PL_LDG_2_BRONZE.json) and paste it into the code editor.

  • Click OK to apply.

3. Pipeline Structure

  • Global Parameters:
    The pipeline defines several global parameters that are passed to every Bronze notebook:
Parameter                            Description                             Example
STORAGE_ACCOUNT_NAME                 Name of the Synapse storage account     adls2synapseintf
LANDING_STORAGE_ACCOUNT_CONTAINER    Container name for landing data         tutorialcontainer
BRONZE_STORAGE_ACCOUNT_CONTAINER     Container name for bronze data          tutorialcontainer
LANDING_BASE_PATH                    Path to the landing zone                test/landing
BRONZE_BASE_PATH                     Path to the bronze zone                 test/bronze
DATA_SOURCE                          Data source type                        CSV

These parameters ensure that all notebooks use consistent paths and configurations.
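
In the pipeline JSON, these are declared under the parameters section. A sketch of that declaration, using the example values from the table above as default values (your sample file may declare them without defaults):

json
"parameters": {
    "STORAGE_ACCOUNT_NAME": { "type": "string", "defaultValue": "adls2synapseintf" },
    "LANDING_STORAGE_ACCOUNT_CONTAINER": { "type": "string", "defaultValue": "tutorialcontainer" },
    "BRONZE_STORAGE_ACCOUNT_CONTAINER": { "type": "string", "defaultValue": "tutorialcontainer" },
    "LANDING_BASE_PATH": { "type": "string", "defaultValue": "test/landing" },
    "BRONZE_BASE_PATH": { "type": "string", "defaultValue": "test/bronze" },
    "DATA_SOURCE": { "type": "string", "defaultValue": "CSV" }
}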

  • Variables:
    A variable named BRONZE_LIST is created to store the names of all Bronze notebooks to execute:
json
"variables": {
    "BRONZE_LIST": {
        "type": "Array"
    }
}

This variable is initialized using a Set Variable activity:

json
"value": [
  "NB_LDG_2_BRONZE_AIRCRAFT_TYPES",
  "NB_LDG_2_BRONZE_AIRLINES",
  "NB_LDG_2_BRONZE_AIRPLANES",
  "NB_LDG_2_BRONZE_AIRPORTS",
  "NB_LDG_2_BRONZE_CITIES",
  "NB_LDG_2_BRONZE_COUNTRIES",
  "NB_LDG_2_BRONZE_ROUTES",
  "NB_LDG_2_BRONZE_TAXES"
]

Each element in the list represents the name of a Bronze notebook in Synapse.
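
Putting the variable declaration and its initial value together, the Set Variable activity looks roughly like the sketch below. The activity name is a placeholder, and the value array is abbreviated; the exact name and the full list of eight notebooks come from the downloaded sample file.

json
{
    "name": "Set_BRONZE_LIST",
    "type": "SetVariable",
    "typeProperties": {
        "variableName": "BRONZE_LIST",
        "value": [
            "NB_LDG_2_BRONZE_AIRCRAFT_TYPES",
            "..."
        ]
    }
}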

  • ForEach Activity: Executing Notebooks in Parallel

    The ForEach_Bronze_Notebooks activity iterates through the list of Bronze notebooks and executes them in parallel.

json
"type": "ForEach",
"typeProperties": {
  "items": {
    "value": "@variables('BRONZE_LIST')",
    "type": "Expression"
  },
  "batchCount": 12
}
  • items: Refers to the BRONZE_LIST variable.
  • batchCount: Defines how many notebooks run in parallel (12 in this example).
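
If you need to troubleshoot a single failing notebook, it can help to run the loop one notebook at a time instead of in parallel. The ForEach activity supports this via its isSequential property; a sketch of that variant (omit batchCount when running sequentially):

json
"type": "ForEach",
"typeProperties": {
  "items": {
    "value": "@variables('BRONZE_LIST')",
    "type": "Expression"
  },
  "isSequential": true
}

Sequential runs are slower, but they make it much easier to attribute an error to a specific notebook.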

Each iteration triggers a Synapse Notebook activity:

json
"type": "SynapseNotebook",
"typeProperties": {
    "notebook": {
        "referenceName": {
            "value": "@item()",
            "type": "Expression"
        },
        "type": "NotebookReference"
    },
    "parameters": {
        "STORAGE_ACCOUNT_NAME": {
            "value": {
                "value": "@pipeline().parameters.STORAGE_ACCOUNT_NAME",
                "type": "Expression"
            },
            "type": "string"
        },
        "LANDING_STORAGE_ACCOUNT_CONTAINER": {
            "value": {
                "value": "@pipeline().parameters.LANDING_STORAGE_ACCOUNT_CONTAINER",
                "type": "Expression"
            },
            "type": "string"
        },
        "BRONZE_STORAGE_ACCOUNT_CONTAINER": {
            "value": {
                "value": "@pipeline().parameters.BRONZE_STORAGE_ACCOUNT_CONTAINER",
                "type": "Expression"
            },
            "type": "string"
        },
        "LANDING_BASE_PATH": {
            "value": {
                "value": "@pipeline().parameters.LANDING_BASE_PATH",
                "type": "Expression"
            },
            "type": "string"
        },
        "BRONZE_BASE_PATH": {
            "value": {
                "value": "@pipeline().parameters.BRONZE_BASE_PATH",
                "type": "Expression"
            },
            "type": "string"
        }
    },
    "snapshot": true,
    "sparkPool": {
        "referenceName": "ApcheSparkPool",
        "type": "BigDataPoolReference"
    },
    "executorSize": "Small",
    "driverSize": "Small"
}

This ensures that each notebook receives the same global parameters for consistent execution.
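
This design is also what makes the pipeline scale: onboarding a new Bronze notebook only requires appending its name to the BRONZE_LIST value in the Set Variable activity; no new activity is needed. For example, with a hypothetical notebook named NB_LDG_2_BRONZE_FLIGHTS, the updated value would be (abbreviated):

json
"value": [
  "NB_LDG_2_BRONZE_AIRCRAFT_TYPES",
  "...",
  "NB_LDG_2_BRONZE_TAXES",
  "NB_LDG_2_BRONZE_FLIGHTS"
]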

4. Test the Master Pipeline

You can Debug or trigger the pipeline to test it. After a debug run of PL_LDG_2_BRONZE, check the activity run results in the Output pane.

All activities should show ✅ Succeeded. This means your entire pipeline logic worked without errors, from setting the variable to running multiple notebooks in a loop.

Next steps:

  • Deploy the pipeline to your production environment.
  • Use “Add Trigger” to automate the pipeline (e.g., schedule it to run daily, or trigger it when new data arrives); see the sketch below.
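
As an illustration of the scheduling option, a daily schedule trigger attached to this pipeline would look roughly like the following sketch. The trigger name, start time, and time zone are placeholder values:

json
{
    "name": "TR_DAILY_LDG_2_BRONZE",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",
                "interval": 1,
                "startTime": "2024-01-01T02:00:00Z",
                "timeZone": "UTC"
            }
        },
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "PL_LDG_2_BRONZE",
                    "type": "PipelineReference"
                }
            }
        ]
    }
}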

In short, a fully successful debug run like this confirms your pipeline is ready for automated, production-grade use.