Hello friends, I’m writing this post to raise awareness among my followers of the service limitations for Azure Data Factory. Like most resources in the Microsoft cloud platform, Data Factory has limitations at various levels (resource/resource group/subscription/tenant). These are enforced by Microsoft, and most of the time we don’t hit them, especially when developing. That said, all too often I see these limitations bring down production processes because people aren’t aware of them, or aren’t calculating execution concurrency correctly. Sorry if that sounds fairly dramatic, but this post is born out of my own frustrations.
As far as I can tell, Microsoft does an excellent job of managing data centre capacity, so I completely understand the reason for having limitations on resources in place. There is no such thing as a limitless cloud platform.
Note: in a lot of cases (as you’ll see in the table below for Data Factory) the maximum limits are only soft restrictions that can easily be lifted via a support ticket. Please check before raising alerts and project risks.
Data Factory Limitations
I copied this table exactly as it appeared for Data Factory on 22nd January 2019. References are at the bottom.
| Resource | Default limit | Maximum limit |
| --- | --- | --- |
| Data factories in an Azure subscription | 800 (updated) | 800 (updated) |
| Total number of entities, such as pipelines, data sets, triggers, linked services, and integration runtimes, within a data factory | 5,000 | Contact support. |
| Total CPU cores for Azure-SSIS Integration Runtimes under one subscription | 256 | Contact support. |
| Concurrent pipeline runs per data factory, shared among all pipelines in the factory | 10,000 | Contact support. |
| Concurrent External activity runs per subscription per Azure Integration Runtime region. External activities are managed on the integration runtime but execute on linked services, including Databricks, stored procedure, HDInsight, Web, and others. | 3,000 | Contact support. |
| Concurrent Pipeline activity runs per subscription per Azure Integration Runtime region. Pipeline activities execute on the integration runtime, including Lookup, GetMetadata, and Delete. | 1,000 | Contact support. |
| Concurrent authoring operations per subscription per Azure Integration Runtime region, including test connection, browse folder list and table list, and preview data. | 200 | Contact support. |
| Concurrent Data Integration Units¹ consumption per subscription per Azure Integration Runtime region | Region group 1²: 6,000<br>Region group 2²: 3,000<br>Region group 3²: 1,500 | Contact support. |
| Maximum activities per pipeline, which includes inner activities for containers | 40 | 40 |
| Maximum number of linked integration runtimes that can be created against a single self-hosted integration runtime | 100 | Contact support. |
| Maximum parameters per pipeline | 50 | 50 |
| ForEach items | 100,000 | 100,000 |
| ForEach parallelism | 20 | 50 |
| Maximum queued runs per pipeline | 100 | 100 |
| Characters per expression | 8,192 | 8,192 |
| Minimum tumbling window trigger interval | 15 min | 15 min |
| Maximum timeout for pipeline activity runs | 7 days | 7 days |
| Bytes per object for pipeline objects³ | 200 KB | 200 KB |
| Bytes per object for dataset and linked service objects³ | 100 KB | 2,000 KB |
| Data Integration Units¹ per copy activity run | 256 | Contact support. |
| Write API calls | 1,200/h (imposed by Azure Resource Manager, not Azure Data Factory) | Contact support. |
| Read API calls | 12,500/h (imposed by Azure Resource Manager, not Azure Data Factory) | Contact support. |
| Monitoring queries per minute | 1,000 | Contact support. |
| Entity CRUD operations per minute | 50 | Contact support. |
| Maximum time of data flow debug session | 8 hrs | 8 hrs |
| Concurrent number of data flows per factory | 50 | Contact support. |
| Concurrent number of data flow debug sessions per user per factory | 3 | 3 |
| Data Flow Azure IR TTL limit | 4 hrs | Contact support. |
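To make the concurrency point from the introduction concrete, here is a back-of-envelope sketch in Python. The limit is the default of 3,000 concurrent External activity runs per subscription per Azure IR region from the table above; the workload numbers are made up purely for illustration:

```python
# Back-of-envelope check: could our triggered pipelines exceed the default
# limit of 3,000 concurrent External activity runs per subscription per
# Azure Integration Runtime region? Workload numbers are illustrative only.

EXTERNAL_ACTIVITY_LIMIT = 3_000  # default from the table above

# (workload, pipelines triggered together, ForEach parallelism,
#  external activities per ForEach iteration)
workloads = [
    ("nightly_dims", 40, 20, 2),  # 40 pipelines x 20 parallel x 2 activities
    ("hourly_facts", 30, 50, 2),  # 50 is the ForEach parallelism maximum
]

total = 0
for name, pipelines, parallelism, activities in workloads:
    concurrent = pipelines * parallelism * activities
    print(f"{name}: up to {concurrent} concurrent external activity runs")
    total += concurrent

print(f"worst case: {total} versus default limit {EXTERNAL_ACTIVITY_LIMIT}")
if total > EXTERNAL_ACTIVITY_LIMIT:
    print("risk: excess activity runs will queue until capacity frees up")
```

When the limit is hit, excess runs queue rather than fail outright, which is easy to mistake for a hung pipeline.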
You can find this table in the Microsoft docs page that covers Azure subscription and service limits. The page is huge and includes all Azure services, which is why I think people never manage to find it.
Also, I believe the source for that page is the following GitHub link.
https://github.com/MicrosoftDocs/azure-docs/blob/master/includes/azure-data-factory-limits.md
My blog is static, so please refer to these links for the latest numbers.
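On a related note, the ARM read/write API call limits in the table bite hardest in monitoring loops. Here is a minimal sketch, using the azure-mgmt-datafactory Python SDK, of polling pipeline runs with a simple backoff on HTTP 429 throttling. The subscription, resource group, and factory names are placeholders, and note the Azure SDK already applies its own retry policy; this just makes the idea explicit:

```python
# Minimal sketch: poll recent pipeline runs without hammering the ARM
# read API limit (12,500/h) or the monitoring query limit (1,000/min).
# Subscription, resource group, and factory names are placeholders.
import time
from datetime import datetime, timedelta, timezone

from azure.core.exceptions import HttpResponseError
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import RunFilterParameters

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

def query_recent_runs(resource_group: str, factory: str, attempts: int = 5):
    """Query the last hour of pipeline runs, backing off on 429 throttling."""
    filters = RunFilterParameters(
        last_updated_after=datetime.now(timezone.utc) - timedelta(hours=1),
        last_updated_before=datetime.now(timezone.utc),
    )
    delay = 2.0
    for attempt in range(attempts):
        try:
            return client.pipeline_runs.query_by_factory(resource_group, factory, filters)
        except HttpResponseError as err:
            if err.status_code != 429 or attempt == attempts - 1:
                raise
            time.sleep(delay)  # back off before the next attempt
            delay *= 2         # exponential backoff

runs = query_recent_runs("<resource-group>", "<factory-name>")
for run in runs.value:
    print(run.pipeline_name, run.status)
```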
Finally, it is not a competition to see who can hit all of these restrictions! Honest! 😉
Many thanks for reading.
Thank you so much, Paul, for sharing these limitations of ADF.
Can you please share some thoughts on how to improve the performance of ADF?
Thanks, hoping to see that blog soon.
Hi Paul, what are the limitations that you encounter “normally”? The list itself is interesting, but the real-life experience is even more interesting.
Great article. Good to know these limitations in ADF. I agree with Johannes Vink’s question; it would be really good to know the practical limitations we encounter during development in ADF.
Hi Paul,
Thanks for the excellent analysis of Azure Data Factory.
I have sent a request on LinkedIn. I have a question: how do you see ADF (an orchestration tool) from a traditional ETL tool perspective (like Informatica, DataStage, ODI)? Is it right to compare a legacy ETL tool with an orchestration tool?
Regards,
Mangesh
The maximum of 40 activities per pipeline is, to say the least, outrageous. What do we do if we need one trigger to execute 100 pipelines for dimensions, for instance?
Yes, I agree. There are other patterns you can consider, like using a ForEach activity with nested calls to child pipelines.
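For illustration, here is a minimal sketch of that pattern using the azure-mgmt-datafactory Python SDK. All names are placeholder assumptions, including a generic child pipeline called Load_Dimension that takes a dimensionName parameter; the parent stays well under the 40-activity cap because its only activity is the ForEach:

```python
# Sketch: thin parent pipeline whose single ForEach activity fans out to a
# generic child pipeline, one call per item. All names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    ExecutePipelineActivity,
    Expression,
    ForEachActivity,
    ParameterSpecification,
    PipelineReference,
    PipelineResource,
)

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

call_child = ExecutePipelineActivity(
    name="RunOneDimension",
    pipeline=PipelineReference(reference_name="Load_Dimension"),  # generic child
    # "@item()" is the current element of the ForEach collection
    parameters={"dimensionName": {"value": "@item()", "type": "Expression"}},
    wait_on_completion=True,
)

fan_out = ForEachActivity(
    name="ForEachDimension",
    items=Expression(value="@pipeline().parameters.dimensionList"),
    batch_count=20,  # ForEach parallelism: default 20, maximum 50
    activities=[call_child],
)

parent = PipelineResource(
    parameters={"dimensionList": ParameterSpecification(type="Array")},
    activities=[fan_out],
)

client.pipelines.create_or_update(
    "<resource-group>", "<factory-name>", "Load_All_Dimensions", parent
)
```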
The minimum tumbling window trigger interval is now 5 minutes.
That’s an interesting way of doing it. While using ForEach is practical, it might not scale well. You should also be careful with nested calls to child pipelines, as there is a high likelihood that you’ll encounter delays: each pipeline has to wait for the previous one to finish before its turn comes. You can reduce the delay by having each pipeline write to a service bus, then having another pipeline read from the service bus. It still has to wait for its turn, but the delays are shorter.
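A minimal sketch of that enqueue/dequeue idea with the azure-servicebus Python package, purely for illustration. The connection string and queue name are placeholders, and in practice the ADF side would talk to Service Bus via a Web activity or a queue-triggered Azure Function rather than a standalone script:

```python
# Sketch: decouple parent and child work by passing work items through a
# Service Bus queue. Connection string and queue name are placeholders.
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONN_STR = "<service-bus-connection-string>"
QUEUE = "adf-work-items"

# Producer side: enqueue one message per child workload to run.
with ServiceBusClient.from_connection_string(CONN_STR) as sb:
    with sb.get_queue_sender(QUEUE) as sender:
        for dim in ["DimCustomer", "DimProduct", "DimDate"]:
            sender.send_messages(ServiceBusMessage(dim))

# Consumer side: drain the queue and trigger work for each item in turn.
with ServiceBusClient.from_connection_string(CONN_STR) as sb:
    with sb.get_queue_receiver(QUEUE, max_wait_time=5) as receiver:
        for msg in receiver:
            print(f"would trigger the child pipeline for: {msg}")
            receiver.complete_message(msg)
```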
Concurrent number of data flow debug sessions per user per factory: 3…. Has anyone faced this limitation? How do you handle this limit when you have more than 3 ADF developers debugging at the same time?
Is ADF capable of handling a single 200 GB CSV file? I had assumed the ADF tool is for integration with external Azure services and building ETL pipelines, but I am not sure of its capacity to handle the data. I am not clear on the statements below:
Bytes per object for pipeline objects³ | 200 KB | 200 KB
Bytes per object for dataset and linked service objects³ | 100 KB | 2,000 KB