Following on from a previous blog post I wrote a few months ago, where I got an Azure Data Factory pipeline run status with an Azure Function (link below), I recently found the need to create something very similar: executing any pipeline from an Azure Function.
Happily, this pipeline execution is basically the example provided by Microsoft in the documentation for the Data Factory .NET SDK (link also below), so I'm not taking any credit for the bulk of the function code. However, I did need to extend the body of the request so the function can accept any number of pipeline parameters.
https://docs.microsoft.com/en-us/azure/data-factory/quickstart-create-data-factory-dot-net
The reason for needing such an Azure Function is that the Data Factory activity to execute another pipeline is currently not dynamic. The name of the downstream pipeline being called cannot be driven by metadata, which upsets me greatly; everything should be dynamic 🙂
Replacing this activity with an Azure Function activity is less than ideal as this then presents the following challenges:
- Making the Azure Function block and wait until the pipeline returns potentially means a long-running durable function is required.
- Calling an Azure Function means paying for additional compute to achieve the same behaviour we are already paying for when Data Factory is used directly.
- Authentication needs to be handled from Data Factory to the Azure Function App and then from the Azure Function back to the same Data Factory. This should be done via our application settings and handled in our release pipelines, rather than passed in the function body. No trolls please, I know.
With an understanding of these important caveats here’s an overview of the solution.
Note: I used .NET Core 3.0 for the function below.
Execute Pipeline
For the function itself, hopefully this is fairly intuitive once you’ve created your DataFactoryManagementClient and authenticated.
The only thing to be careful of is not using the CreateOrUpdateWithHttpMessagesAsync method by mistake. Make sure it's the create-run method (CreateRunWithHttpMessagesAsync). Sounds really obvious, but when you get code drunk, names blur together and the very different method overloads will have you confused for hours!… According to a friend 🙂
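Before the client is even created, the function checks the request body for everything it needs. Here's a minimal Python sketch of that validation step (illustrative only; the language and the function names are my own, not the actual C# function code):

```python
import json

# Required attributes of the function's request body.
REQUIRED_KEYS = [
    "tenantId", "applicationId", "authenticationKey",
    "subscriptionId", "resourceGroup", "factoryName", "pipelineName",
]

def validate_body(raw_body: str) -> list:
    """Return the required keys missing from the request body, in order."""
    body = json.loads(raw_body)
    return [key for key in REQUIRED_KEYS if key not in body]
```

An empty list back means the body has everything needed to create the DataFactoryManagementClient and request the pipeline run.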
Body JSON without Pipeline Parameters
{
  "tenantId": "1234-1234-1234-1234-1234",
  "applicationId": "1234-1234-1234-1234-1234",
  "authenticationKey": "Passw0rd123!",
  "subscriptionId": "1234-1234-1234-1234-1234",
  "resourceGroup": "CommunityDemos",
  "factoryName": "PaulsFunFactoryV2",
  "pipelineName": "WaitingPipeline"
}
Body JSON with Pipeline Parameters
The pipelineParameters attribute can contain as many parameters as you want; the function basically just ingests them and passes them to the overloaded method CreateRunWithHttpMessagesAsync as a Dictionary of string and object.
Data Factory doesn't validate the parameter names at this point, so you can send anything. The function just assumes the names passed are identical to the names of the actual pipeline parameters. If so, the values are simply mapped across.
{
  "tenantId": "1234-1234-1234-1234-1234",
  "applicationId": "1234-1234-1234-1234-1234",
  "authenticationKey": "Passw0rd123!",
  "subscriptionId": "1234-1234-1234-1234-1234",
  "resourceGroup": "CommunityDemos",
  "factoryName": "PaulsFunFactoryV2",
  "pipelineName": "WaitingPipeline",
  "pipelineParameters": {
    "TestParam1": "Frank",
    "TestParam2": "Harry"
  }
}
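That ingestion of the optional pipelineParameters attribute can be sketched as follows; a minimal Python illustration (not the actual C# function) of pulling the attribute out of the body, ready to hand over as the parameters dictionary:

```python
import json

def extract_pipeline_parameters(raw_body: str) -> dict:
    """Return pipelineParameters as a plain dict; an absent attribute
    simply means the pipeline is run without any parameters."""
    body = json.loads(raw_body)
    return dict(body.get("pipelineParameters", {}))
```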
Output
{
"PipelineName": "WaitingPipeline",
"RunIdUsed": "0d069026-bcbc-4356-8fe8-316ce5e07134",
"Status": "Succeeded"
}
Just Give Me The Code!
Ok! Here you go… ExecutePipeline.cs
The full solution is in the same Blob Support Content GitHub repository if you’d like to use the Visual Studio Solution.
I hope you found this post helpful.
Many thanks for reading.
Hi Paul.. We have a requirement of parallel execution of ADF pipelines. Our Azure Functions run in parallel independently (multiple instances), hence we are thinking of using the function instances to trigger the pipeline instances in parallel. The other option is to use the tumbling window trigger. Which option do you suggest is more cost efficient?
Hi Manish, I think the best answer I can offer is for you to check out my open source code project (Creating a Simple Metadata Driven Framework for Executing Azure Data Factory Pipelines). GitHub link: https://github.com/mrpaulandrew/ADF.procfwk
This post about Functions executing pipelines was a prerequisite to me creating the framework wrapper. Within the framework pipelines executed within a processing stage will always run in parallel. Check it out and let me know if this solves your problem.
Cheers
Paul
Nice Post. I am getting Subscription Couldn’t be found when I execute. Can you please help me here.
Are you using the correct subscription ID?
Hi Paul, Thanks for another useful post.
Hi Paul, would you know if it’s possible to execute a pipeline run in Azure Functions using managed identity? Instead of Service Principal? Many thanks.
Yes, definitely. Or store the SPN in the app settings.
Thank you Paul! I wonder if using Azure Functions MI means:
– We simply assign the MI “data factory contributor” role in ADF
– No longer need to use SPN and keep SPN details in SQL DB
– Changing the way ADF pipeline (child) works? no longer needed I suppose?
– How do we create adf client and execute the pipeline run in Functions? i.e. authentication key?
Much appreciated for the help!!!
It’ll probably need the owner role.
I don’t do this in my processing framework as this limits you to using a single Data Factory for Worker pipelines and it becomes an extra step to handle at deployment time.
I haven’t looked into how. Let me know
Hi Paul,
I am getting the following error when trying to invoke an ADF pipeline from an Azure PowerShell function: ERROR: Invoke-AzDataFactoryV2Pipeline : Object reference not set to an instance of an object. It looks like the library is missing Az.DataFactory. Any ideas on how to resolve the issue?
Best Regards, Andrew
Are you calling a published pipeline?
Andrew, did you ever get this sorted out? We’re having the same problem.
Hi,
I'm using the namespace Microsoft.Azure.Management.DataFactory in an Azure Function, and the function is failing there. How do I add a reference to this DLL in an Azure Function? I'm using the portal to create the function.
I recommend using Visual Studio or VSCode to develop the function. It makes the adding of NuGet libraries much easier.
Thank you for the quick response. What are the Application Id and Authentication Key?
This is for the SPN the function will use to authenticate against Data Factory
Hi Paul,
I'm getting the error "An unhandled exception of type 'System.AggregateException' occurred in mscorlib.dll" at the AcquireTokenAsync function call. The Authentication Id and Key are correct. Please suggest.
Mmmm, not sure, I don't think I've had that one before. Is your function app configured for .NET Core rather than .NET Framework?
This issue was due to a wrong Tenant Id. God, somehow figured it out. Thank you Paul!
Paul, one more question: .NET or Python, which is the better choice for writing an Azure Function?
I prefer .NET, just because I have more experience with C#.
I'm going to write an error handling module for ADF pipelines. We are planning to use the same module for all pipelines.
You should check out my post on getting the activity error details with an Azure function
Trying to execute the function to run an existing pipeline fails with error:
System.Private.CoreLib: Exception while executing function: ADFAutomationExecutor. System.Private.CoreLib: One or more errors occurred. (Client IP not authorized to access the API.). Microsoft.Azure.Management.DataFactory: Client IP not authorized to access the API.
Please help.
Hi Kavia, this sounds like an issue with the firewall config on your SQLDB. Make sure you enable access to other Azure services, see link below. Or if running the functions locally via Visual Studio, make sure your own external IP is added to the firewall. Let me know. Cheers Paul
https://docs.microsoft.com/en-us/azure/azure-sql/database/firewall-configure
“The reason for needing such an Azure Function is because currently the Data Factory activity to execute another pipeline is not dynamic. The name of the downstream pipeline called can not be driven by metadata which upsets me greatly, everything should be dynamic”
Any reason we can’t use the Web Activity or Web Hook in ADF to run another pipeline in another ADF? https://docs.microsoft.com/en-us/rest/api/datafactory/pipelines/createrun
I am trying it at the moment. I am creating the URL using dynamic content, using the subscription/resource group/data factory name/pipeline name, passing these in as they are stored in a table.
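The dynamic URL described above can be assembled from the stored metadata values. Here's a Python sketch of building the documented createRun endpoint (the function name is my own, purely illustrative):

```python
def create_run_url(subscription_id: str, resource_group: str,
                   factory_name: str, pipeline_name: str,
                   api_version: str = "2018-06-01") -> str:
    """Assemble the Data Factory createRun REST endpoint from metadata."""
    return (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        "/providers/Microsoft.DataFactory"
        f"/factories/{factory_name}"
        f"/pipelines/{pipeline_name}"
        f"/createRun?api-version={api_version}"
    )
```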
No reason you can’t. Functions just gives you more control over pipeline parameters being passed and the ability to use Key Vault.
Logic Apps is the third option.
Ahhh OK, cool thanks Paul. I guess you just have to set the Body up correctly to pass in parameters.
I had a wee test and got an error saying I was missing the Authentication header. I see we can add headers, so I will set one up tomorrow when back at work. I take it the best idea with authentication though is to set up a “service account” (service principal), then use that as the authentication mechanism? If I do this, I can’t seem to find how I would set up the authentication header correctly. Do you know, or have a sample you can give me? Information was a little sparse on this!
If I can’t get that working, I will swap to a function app or logic app. Thanks for your help! Brent
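On the Authentication header question above: with a service principal you request a bearer token from the AAD token endpoint using the client-credentials grant, then send it as "Authorization: Bearer <token>". A Python sketch of the request pieces (names illustrative; the HTTP call itself is omitted):

```python
def token_url(tenant_id: str) -> str:
    """AAD (v1) token endpoint for the given tenant."""
    return f"https://login.microsoftonline.com/{tenant_id}/oauth2/token"

def client_credentials_payload(client_id: str, client_secret: str) -> dict:
    """Form-encoded body for the client-credentials grant; the access_token
    in the response goes into the 'Authorization: Bearer <token>' header."""
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "resource": "https://management.azure.com/",
    }
```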
How do you run a failed pipeline automatically? Is there any such functionality in ADF?
Check out my ADF framework, it includes restarting pipelines…. procfwk.com
Hi Paul, thanks for the article.
What is the AuthenticationKey, is this using an AuthenticationKey from an Integration runtime, or is it something that’s created through AAD?
Hi, this is the Service Principal secret created in AAD. Cheers
Thanks!!