Code Project Overview
This open source code project delivers a simple metadata-driven processing framework for Azure Data Factory (ADF). The framework couples ADF with an Azure SQL Database that houses execution stage and pipeline metadata, which is retrieved at runtime via an Azure Functions App. The parent/child metadata structure firstly allows stages of dependencies to be executed in sequence, and secondly allows all pipelines within a stage to be executed in parallel, offering scaled-out control flows where no inter-dependencies exist.
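To illustrate the parent/child idea, the stage and pipeline relationship could be sketched with two tables along these lines. This is a deliberately simplified, hypothetical schema for explanation only; the framework's actual metadata database contains more attributes and supporting tables.

```sql
-- Simplified sketch of the parent/child metadata structure (illustrative only).
CREATE TABLE [Stages]
    (
    [StageId]   INT IDENTITY PRIMARY KEY,
    [StageName] VARCHAR(255) NOT NULL
    );

CREATE TABLE [Pipelines]
    (
    [PipelineId]   INT IDENTITY PRIMARY KEY,
    [StageId]      INT NOT NULL REFERENCES [Stages] ([StageId]),
    [PipelineName] VARCHAR(255) NOT NULL
    );

-- Stages are executed in sequence (ordered by StageId); all pipelines
-- sharing a StageId are executed in parallel within that stage.
```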
The framework is designed to integrate with any existing Data Factory solution by making the lowest-level executor a stand-alone Worker pipeline that is wrapped in a higher level of controlled (sequential) dependencies. This level of abstraction means that, operationally, nothing about the orchestration process is hidden in multiple levels of dynamic activity calls. Instead, everything from the processing pipeline doing the work (the Worker) can be inspected using out-of-the-box ADF features.
The framework can also be used in any Azure Tenant and allows the creation of complex control flows across multiple Data Factory resources, by connecting Service Principal details through metadata to targeted Subscriptions > Resource Groups > Data Factories and Pipelines. This offers very granular administration over data processing components in a given environment.
Framework Key Features
- Granular metadata control.
- Metadata integrity checking.
- Global properties.
- Dependency handling.
- Execution restart-ability.
- Parallel execution.
- Full execution and error logs.
- Operational dashboards.
- Low cost orchestration.
- Disconnection between framework and Worker pipelines.
- Cross Data Factory control flows.
- Pipeline parameter support.
- Simple troubleshooting.
- Easy deployment.
- Email alerting.
Thank you for visiting, details on the latest framework release can be found below.
Version 1.7.1 of ADF.procfwk is ready!
Just a small release to address a bug found in v1.7 and action a feature request from the community.
The stored procedure
[procfwk].[CheckForEmailAlerts] was missing an ELSE condition in its IF statement logic, meaning that new pipelines without any alerting links would result in the procedure returning a NULL value. This logic has now been corrected.
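The shape of the fix is the classic IF/ELSE pattern below. This is a hedged sketch of the pattern only, not the actual procedure body, and the table and column names are illustrative.

```sql
-- Illustrative pattern only; not the real [procfwk].[CheckForEmailAlerts] body.
IF EXISTS
    (
    SELECT 1
    FROM [procfwk].[PipelineAlertLink] -- hypothetical link table name
    WHERE [PipelineId] = @PipelineId
    )
    BEGIN
        SELECT 1 AS [EmailAlertsRequired];
    END
ELSE
    BEGIN
        -- Previously missing: without this branch a pipeline with no
        -- alerting links caused the procedure to return NULL.
        SELECT 0 AS [EmailAlertsRequired];
    END;
```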
Pipeline parameter values passed to Workers via the metadata previously had a limitation enforced by the database, because the attribute
[ParameterValue] had a data type of VARCHAR(128). According to the Microsoft documentation on Azure Functions limitations, the request can actually be up to 10MB. Setting a 128 character limit on the data type in the metadata therefore doesn't make sense, and the data type has been changed to NVARCHAR(MAX).
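A schema change of this kind would typically be applied with an ALTER COLUMN statement like the one below. This is an assumed migration sketch (the parameters table name is a guess); the actual deployment scripts in the release may differ.

```sql
-- Illustrative migration only; the framework's deployment scripts may differ.
-- Widens the parameter value column from VARCHAR(128) to NVARCHAR(MAX)
-- so values are no longer artificially truncated by the database.
ALTER TABLE [procfwk].[PipelineParameters] -- assumed table name
ALTER COLUMN [ParameterValue] NVARCHAR(MAX) NULL;
```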
In addition, the stored procedure
[procfwk].[GetPipelineParameters] has been hardened to use the T-SQL function STRING_ESCAPE to handle any characters that might break the eventual JSON request string when it is used in Data Factory.
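STRING_ESCAPE (available from SQL Server 2016 / Azure SQL Database) escapes JSON special characters such as quotes and backslashes. A minimal sketch of how it could be applied, assuming an illustrative parameters table:

```sql
-- Escape parameter values before they are embedded in the JSON request body.
-- Table and column names are illustrative, not the exact procedure source.
SELECT
    [ParameterName],
    STRING_ESCAPE([ParameterValue], 'json') AS [EscapedParameterValue]
FROM [procfwk].[PipelineParameters]
WHERE [PipelineId] = @PipelineId;
```

For example, a value containing a double quote, such as `say "hello"`, is returned as `say \"hello\"`, keeping the JSON string sent to Data Factory well formed.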
Finally, to handle this size limit, a new check has been added to the pre-execution metadata integrity stored procedure
[procfwk].[CheckMetadataIntegrity]. This uses a query to establish whether the metadata parameter values exceed the Function request limit. Hopefully an unlikely situation, but one better caught before an execution run starts.
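One way such a check could be expressed is sketched below. This is a hypothetical approximation, not the query shipped in the release; the table name and the exact byte accounting are assumptions.

```sql
-- Hypothetical sketch: flag pipelines whose combined parameter payload
-- would exceed the 10MB Azure Functions request limit.
SELECT
    [PipelineId],
    SUM(DATALENGTH([ParameterValue])) AS [TotalParameterBytes]
FROM [procfwk].[PipelineParameters] -- assumed table name
GROUP BY [PipelineId]
HAVING SUM(DATALENGTH([ParameterValue])) > 10485760; -- 10MB in bytes
```

Any rows returned would represent pipelines whose parameters are too large, allowing the integrity check to fail fast before the execution run begins.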
Thank you James McLaughlin for the feature request.
That concludes the release notes for this version of ADF.procfwk.
Please reach out if you have any questions or want help updating your implementation from the previous release.