Is Azure Synapse Analytics Dead and Does It Really Matter?

The product vs the capabilities

I think it’s fair to say that Azure Synapse Analytics has had a hard life. It was announced in public preview as a surprise to most of the community, including Microsoft cloud solution architects, which ultimately meant very little private preview testing and feedback on the product happened before it was shown to the world. The result was a lot of frustration in the subsequent year before it could be classified as generally available, and more frustration after that while we battled with the missing production features. Even now, the product is lacking a lot of functionality. Anyway, this is all in the past. Microsoft Fabric is the new kid on the block, and we need to address the unpopular question about the future of Synapse. And considering I’ve been very unpopular with the product teams before, I’ll take this one for the team. Sorry, but it needs to be addressed.


Context

To answer this question fully, we need to break down the technical capabilities of both unified platforms, starting with a focus on:

  • Integration Pipelines
  • Storage
  • Real-time data
  • Delta lake data transformation

The first point we could make is that, regardless of Microsoft Fabric or Synapse Analytics, we could deliver all these capabilities using existing mature cloud offerings, namely:

  • Azure Data Factory
  • Azure Data Lake Gen2
  • Azure Data Explorer
  • Azure Databricks

Given this, we might want to think about the advantages of product maturity vs the advantages of a unified platform (meaning the reduction in platform plumbing). But that can be a blog for another time.

For now, if we are saying Synapse is dead, do we care? After all, we have this alternative set of offerings to deliver those same capabilities, and the migration path for our code is very simple:

  • JSON for integration pipeline definitions, tweaked slightly for linked services.
  • Storage left unchanged and mounted/connected to alternative compute (see the sketch after this list).
  • KQL scripts reused as-is, with logical database entities recreated.
  • Python/Scala/Spark SQL reused in notebooks, with a little refactoring.
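
For the storage point in particular, here’s a minimal sketch of what “left unchanged and connected to alternative compute” looks like in practice, assuming a Spark environment (for example Azure Databricks or a Fabric notebook) that is already authenticated against the lake. The storage account, container, and table names are hypothetical placeholders.

```python
from pyspark.sql import SparkSession

# Assumes Delta Lake support is available in the runtime (built in on
# Databricks and Fabric). All names below are hypothetical.
spark = SparkSession.builder.getOrCreate()

# The Delta tables written by Synapse Spark stay exactly where they are
# in ADLS Gen2; we simply point the new compute at the same files.
lake_path = "abfss://datalake@examplestorage.dfs.core.windows.net/curated/sales"

# Read the existing Delta table directly - no data migration required.
df = spark.read.format("delta").load(lake_path)
df.printSchema()

# Recreate the logical database entity on the new engine as a
# metadata-only reference, so existing Spark SQL scripts can be reused.
spark.sql(
    f"CREATE TABLE IF NOT EXISTS curated_sales USING DELTA LOCATION '{lake_path}'"
)
```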

We still have the irritation of needing to do this work, but the impact is minimal if we set up our technology stack with the future in mind. Either way, reach out and check out the Cloud Formations migration offering if you need support.

Additionally, and more obviously, do we care, given that we have an upgraded set of offerings to deliver those same capabilities in Microsoft Fabric? In my current opinion, Microsoft Fabric is not yet mature and not ready, but the roadmap looks promising.


The Elephant in the Room

[Image: “An elephant in a crowded room of fragile objects” 🙂]

Where we do have a gap in the capabilities above, and perhaps the ‘elephant in the room’, is our warehouse dimensional models.

If our solution used what was once called Azure SQL Data Warehouse, since renamed twice to become the Dedicated SQL Pool within Synapse Analytics, we have a problem.

For this flavour of SQL engine, we have compute and storage (not a Data Lake) coupled together to execute distributed workloads, or massively parallel processing (MPP) if you prefer the older terminology. For some data models, and for niche architecture patterns that load OLAP cubes taking advantage of PolyBase, this capability is/was good.

The problem comes in how the offering is delivered (coupled compute and storage) and what we need to do to our data to ensure optimal performance for our executions: explicitly defining how rows are distributed across the fixed set of 60 distributions (sketched below), as well as scaling out compute to support enough concurrent connections for downstream reporting.
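
To make that design decision concrete, here’s a purely conceptual Python sketch of what HASH distribution means; it is not the engine’s actual hash function, and the column name and values are made up. The takeaway is that a skewed distribution key concentrates rows, and therefore work, on a few of the 60 distributions, which is exactly the performance trap the design effort tries to avoid.

```python
import hashlib
from collections import Counter

# Dedicated SQL Pools always spread a table across a fixed set of 60
# distributions, regardless of the chosen compute scale (DWU level).
DISTRIBUTION_COUNT = 60

def assign_distribution(key: str) -> int:
    """Conceptual stand-in for the engine's deterministic hash: map a
    distribution-column value to one of the 60 distributions."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % DISTRIBUTION_COUNT

# Hypothetical rows keyed on customer_id - a sensible HASH distribution
# column because it has many distinct values.
keys = [f"C{i:05d}" for i in range(10_000)]
placement = Counter(assign_distribution(k) for k in keys)

# A well-chosen key gives a roughly even row count per distribution;
# heavy skew here would mean hot distributions and slow queries.
print("rows per distribution:", min(placement.values()), "to", max(placement.values()))
```

The same fixed 60-way layout applies whether a table is HASH distributed, ROUND_ROBIN distributed, or replicated; only the placement rule changes.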

So, as already stated, we have a problem, and we don’t have an easy migration path.

Therefore, a better question might be: are Azure Synapse Analytics Dedicated SQL Pools now a dead-end technology in our data platform stack? Almost certainly, yes. Hence, in this blog I wanted to break down the product vs the capabilities to offer a full answer.


Conclusion

So, to circle back to my blog title and answer the questions I actually asked, directly and with some additional context based on the considerations covered:

Is Azure Synapse Analytics dead?

  • Yes, if designing a new data platform: it will not feature on the architecture diagram.
  • Yes, in terms of Microsoft product teams continuing to develop Synapse features (based on what I know).
  • No, in terms of short/medium term support from Microsoft for existing Synapse Analytics deployments.

Does it matter?

  • Not really, if using industry standard capabilities: a small code migration project is recommended. Cloud Formations can help here.
  • Yes, if using Dedicated SQL Pools: a larger code and data migration project is going to be required. Cloud Formations can help here too.

Overall, maybe the simple answer then is: yes, Azure Synapse Analytics is dead.

Was Synapse just a circa 3-year test case for a unified data platform and ultimately a stepping stone to Microsoft Fabric? Comments welcome.


Many thanks for reading.

10 thoughts on “Is Azure Synapse Analytics Dead and Does It Really Matter?”

  1. Hi Paul, in terms of Microsoft product teams continuing to develop Synapse features, does this represent an extra useful life for the product?


  2. Nice blog, Paul! I’m product group, and I promise this blog does not make you less popular with my team! I agree with many of your comments. The market moved and our competitors, as well as our customers, prefer a different world, where thinking in terms of 60 partitions is less than ideal. This is why most of our innovation is going to Fabric – we believe we can serve customers better, and really appreciate the help that your company (and others) are offering in migrating customers to the new vision. The only point where I disagree a bit with you is the one about Synapse being dead from the perspective of new features. You are saying “Yes”; I would say “It depends”, based on what I know 🙂 We prefer to focus innovation on Fabric, but may add features to Synapse if really needed by our customers. We prioritize our customers above our preference, particularly if multiple customers are impacted by an issue with no workaround. And absolutely always if we find a security risk.

    I added a link below to a blog post where I am explaining in more detail how we think of Synapse. Thanks a lot for the great blog!

    https://support.fabric.microsoft.com/en-us/blog/microsoft-fabric-explained-for-existing-synapse-users?ft=02-2024:date


  3. To answer the question that you posed at the end:

    “Was Synapse just a circa 3-year test case for a unified data platform and ultimately a stepping stone to Microsoft Fabric?” 

    I think that Synapse was a response to the competition, namely Snowflake. Now, Fabric is a response to Databricks which is a true unified data platform.


  4. Thanks very much for this piece of frank analysis. It’s really helpful. Fortunately for us we did not need the capabilities of Dedicated SQL Pools so don’t have that capability hole to be pondering over.

    However, the gap between Synapse and Fabric GA capabilities is frustrating, particularly as some in ‘the firm’ now think they are using Fabric due to the Power BI internals rebrand and the excellent marketing job that has been done.

    I’m looking forward to migrating to Fabric. The ‘curated’ presentation layer capabilities look really strong and fill a gap. Fortunately due to some excellent tips (thanks again Paul), and our subsequent choices, we are in a good position to segue over.

    But I could do without the speculative recruiter candidates who have delivered industrial-grade Fabric engineering solutions, apparently on unsupported capabilities. It’s like the invention of the data scientist job title on LinkedIn all over again.

    RIP Synapse Analytics: the Young Frankenstein of the modern data platform.


  5. What about the serverless offering? It provides such a great cost model and opportunity for many organizations who still work in the non-real-time realm. While it offers limited visibility into queries for effective optimization, the platform runs relatively efficiently, and the ability to virtualize access through OPENROWSET and external tables, paying only for the compute on scan, seems to be left by the wayside when Fabric comes into the equation.

    Can you offer an opinion on this?

    Also – your suggestion of another blog post would be very interesting, to serve as a comparison for my own thoughts on the topic: “Given this, we might want to think about the advantages of product maturity vs the advantages of a unified platform (meaning the reduction in platform plumbing). But that can be a blog for another time.”

