Execute Any Azure Data Factory Pipeline with an Azure Function

Following on from a previous blog post I wrote a few months ago, where I got an Azure Data Factory pipeline run status with an Azure Function (link below), I recently found the need to create something very similar: a function to execute any pipeline.

https://mrpaulandrew.com/2019/11/21/get-any-azure-data-factory-pipeline-run-status-with-azure-functions/

Happily, this pipeline execution is basically the example provided by Microsoft in the documentation for the Data Factory .NET SDK (link also below). Given that, I’m not taking any credit for the bulk of the function code. However, I did need to extend the body of the request for the function to accept any number of pipeline parameters.

https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-dot-net

The reason for needing such an Azure Function is that currently the Data Factory activity to execute another pipeline is not dynamic. The name of the downstream pipeline being called cannot be driven by metadata, which upsets me greatly; everything should be dynamic 🙂

Replacing this activity with an Azure Function activity is less than ideal as this then presents the following challenges:

  • Making the Azure Function block and wait until the pipeline completes potentially means a long-running Durable Function is required.
  • Calling an Azure Function means paying for additional compute to achieve the same behaviour we already pay for when Data Factory is used directly.
  • Authentication needs to be handled from Data Factory to the Azure Function App, and then from the Azure Function back to the same Data Factory. This should be done via application settings and handled in our release pipelines, rather than passed in the function body. No trolls please, I know.

With an understanding of these important caveats here’s an overview of the solution.

 

Note: I used .NET Core 3.0 for the function below.


Execute Pipeline

For the function itself, hopefully this is fairly intuitive once you’ve created your DataFactoryManagementClient and authenticated.

The only thing to be careful of is not using the CreateOrUpdateWithHttpMessagesAsync method by mistake. Make sure it’s CreateRunWithHttpMessagesAsync. Sounds really obvious, but when you get code drunk the names blur together and the very different method overloads will have you confused for hours… according to a friend 🙂
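To make that concrete, here is a minimal sketch of the create-and-run part of the function, closely following the Microsoft quickstart linked above. The helper name ExecutePipelineAsync and its parameter list are placeholders of mine; the SDK and ADAL calls are the quickstart’s.

using System.Threading.Tasks;
using Microsoft.Azure.Management.DataFactory;
using Microsoft.Azure.Management.DataFactory.Models;
using Microsoft.IdentityModel.Clients.ActiveDirectory;
using Microsoft.Rest;

public static async Task<string> ExecutePipelineAsync(
    string tenantId, string applicationId, string authenticationKey,
    string subscriptionId, string resourceGroup, string factoryName,
    string pipelineName)
{
    // Authenticate as the service principal supplied in the request body.
    var context = new AuthenticationContext("https://login.microsoftonline.com/" + tenantId);
    var cred = new ClientCredential(applicationId, authenticationKey);
    AuthenticationResult token = await context.AcquireTokenAsync(
        "https://management.azure.com/", cred);

    // Create the management client scoped to the target subscription.
    var client = new DataFactoryManagementClient(new TokenCredentials(token.AccessToken))
    {
        SubscriptionId = subscriptionId
    };

    // CreateRun, not CreateOrUpdate! This starts an existing pipeline and
    // returns its run ID; it does not deploy or modify anything.
    CreateRunResponse runResponse = (await client.Pipelines
        .CreateRunWithHttpMessagesAsync(resourceGroup, factoryName, pipelineName)).Body;

    return runResponse.RunId;
}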

Body JSON without Pipeline Parameters


{
  "tenantId": "1234-1234-1234-1234-1234",
  "applicationId": "1234-1234-1234-1234-1234",
  "authenticationKey": "Passw0rd123!",
  "subscriptionId": "1234-1234-1234-1234-1234",
  "resourceGroup": "CommunityDemos",
  "factoryName": "PaulsFunFactoryV2",
  "pipelineName": "WaitingPipeline"
}

Body JSON with Pipeline Parameters

The pipelineParameters attribute can contain as many parameters as you want; the function basically just ingests them and passes them to the overloaded method CreateRunWithHttpMessagesAsync as a Dictionary of string and object (see the sketch after the example below).

Data Factory doesn’t validate the parameter names, so you can send anything. It just assumes the names passed are identical to the names of the actual pipeline parameters. If so, the values are simply mapped across.

{
  "tenantId": "1234-1234-1234-1234-1234",
  "applicationId": "1234-1234-1234-1234-1234",
  "authenticationKey": "Passw0rd123!",
  "subscriptionId": "1234-1234-1234-1234-1234",
  "resourceGroup": "CommunityDemos",
  "factoryName": "PaulsFunFactoryV2",
  "pipelineName": "WaitingPipeline",
  "pipelineParameters":
  {
    "TestParam1": "Frank",
    "TestParam2": "Harry"
  }
}
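As a hedged sketch of how that ingestion might look, assuming Newtonsoft.Json for parsing, and assuming requestBody (the raw request string) plus the client and name variables from the earlier sketch already exist:

using System.Collections.Generic;
using Microsoft.Azure.Management.DataFactory.Models;
using Newtonsoft.Json.Linq;

// Parse the raw request body; property names match the JSON shown above.
JObject body = JObject.Parse(requestBody);

// Convert the optional pipelineParameters attribute into the
// IDictionary<string, object> the SDK method expects.
Dictionary<string, object> parameters = null;
if (body["pipelineParameters"] != null)
{
    parameters = body["pipelineParameters"].ToObject<Dictionary<string, object>>();
}

// Pass the dictionary via the overload's optional 'parameters' argument.
// Data Factory maps the values across purely by parameter name.
CreateRunResponse runResponse = client.Pipelines
    .CreateRunWithHttpMessagesAsync(
        resourceGroup, factoryName, pipelineName, parameters: parameters)
    .Result.Body;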

Output


{
  "PipelineName": "WaitingPipeline",
  "RunIdUsed": "0d069026-bcbc-4356-8fe8-316ce5e07134",
  "Status": "Succeeded"
}
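For completeness, here is a sketch of how a status like this might be produced: after CreateRun, the function polls PipelineRuns.Get until the run leaves the InProgress/Queued states, again following the quickstart pattern. Remember the blocking caveat from earlier; a long pipeline could exceed the function timeout.

using System.Threading;
using Microsoft.Azure.Management.DataFactory;
using Microsoft.Azure.Management.DataFactory.Models;

// Poll the run status until the pipeline finishes. This is the blocking
// behaviour flagged in the caveats above.
PipelineRun pipelineRun;
while (true)
{
    pipelineRun = client.PipelineRuns.Get(resourceGroup, factoryName, runResponse.RunId);
    if (pipelineRun.Status == "InProgress" || pipelineRun.Status == "Queued")
        Thread.Sleep(15000); // check every 15 seconds
    else
        break; // Succeeded, Failed or Cancelled
}

// Shape the output JSON shown above.
var output = new
{
    PipelineName = pipelineName,
    RunIdUsed = runResponse.RunId,
    Status = pipelineRun.Status
};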


Just Give Me The Code!

Ok! Here you go… ExecutePipeline.cs

The full solution is in the same Blog Support Content GitHub repository if you’d like to use the Visual Studio solution.

I hope you found this post helpful.

Many thanks for reading.

 

Comments

  1. Hi Paul. We have a requirement for parallel execution of ADF pipelines. Our Azure Functions run in parallel independently (multiple instances), hence we are thinking of using the function instances to trigger the pipeline instances in parallel. The other option is to use the tumbling window trigger. Which option do you suggest is more cost efficient?


    1. Hi Manish, I think the best answer I can offer is for you to check out my open source code project (Creating a Simple Metadata Driven Framework for Executing Azure Data Factory Pipelines). GitHub link: https://github.com/mrpaulandrew/ADF.procfwk

      This post about Functions executing pipelines was a prerequisite to me creating the framework wrapper. Within the framework, pipelines executed within a processing stage will always run in parallel. Check it out and let me know if this solves your problem.
      Cheers
      Paul

