The SaaSpocalypse Has a Data Problem Nobody Is Talking About
On February 3, 2026, approximately $285 billion in market capitalization evaporated from global SaaS and IT services companies in a single trading session. The Retool 2026 Build vs. Buy Report confirmed the underlying trend: 35% of enterprises have already replaced at least one SaaS tool with a custom build, and 78% plan to build more. Gartner predicts that by 2030, 35% of point-product SaaS tools will be replaced by AI agents.
The trend is real. The economics of building custom software have shifted. As TechCrunch put it: "the barriers to entry for creating software are so low now thanks to coding agents, that the build versus buy decision is shifting toward build."
But the conversation is stuck on the wrong question. Most SaaSpocalypse coverage focuses on what to replace and how fast AI can build it. Nobody is asking where these replacements should live.
This matters because the answer determines whether enterprises escape the SaaS trap or fall into a worse one. Without a shared data foundation, the likely outcome is hundreds of vibe-coded micro-apps, each with its own data store, its own version of "customer," and its own governance gaps. That is not liberation from SaaS silos. That is SaaS silos rebuilt with worse documentation.
Taking the wrong approach carries a real cost. Gartner estimates poor data quality costs organizations an average of $12.9 million per year. When enterprises build without a unified data foundation or proper governance, that cost grows through rework, inconsistent data, and operational complexity.
In some ways, we have seen this before, during the Modern Data Stack era. Starting with the honorable goal of "killing the monolithic data platform," data teams adopted dozens or even hundreds of narrow one-problem-one-solution tools, creating a new problem of data governance and management complexity. Databricks solved much of that by unifying those tools into a single platform. Now it is aiming to solve a similar problem with software in general.
The SaaSpocalypse tells you why to build. This article is about where.
The Promise: One Data Layer to Rule Them All
The idea is straightforward: instead of buying enterprise software from multiple vendors — each with its own database, its own data model, and its own extraction pipeline — build custom applications on top of a common data layer that your organization already owns.
I once audited a company running 5 different BI tools. Each one had its own data pipeline, its own version of "revenue," and its own refresh schedule. The SaaS vendors were not the problem — the lack of a shared data layer was. Every tool created an island of data that had to be reconciled with every other island.
This is the ETL tax. Every SaaS vendor you buy creates an integration cost that never appears on the invoice: the pipeline to extract data, the schema mapping to normalize it, the reconciliation logic to resolve conflicts, and the freshness SLA to keep it current. In eight years of building data platforms, I have never seen an enterprise where this cost was accurately accounted for. It always exceeds the subscription. Worse, the average enterprise runs more than 1,000 applications, so the tax compounds across the entire portfolio.
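To make the ETL tax concrete, here is a back-of-the-envelope cost model for a single SaaS integration. Every figure below is a hypothetical placeholder, not a benchmark; the point is only that recurring maintenance and reconciliation, not the initial build, dominate the cost.

```python
# Illustrative only: a rough model of the "ETL tax" for one SaaS tool.
# All numbers are hypothetical placeholders, not measured benchmarks.

def annual_etl_tax(
    pipeline_dev_hours: float,           # initial extraction pipeline build
    maintenance_hours_per_month: float,  # schema drift, API changes, reruns
    reconciliation_hours_per_month: float,  # resolving conflicting records
    hourly_rate: float,
) -> float:
    """Approximate annual integration cost for a single SaaS subscription."""
    recurring = (maintenance_hours_per_month + reconciliation_hours_per_month) * 12
    # Amortize the initial build over three years
    return (pipeline_dev_hours / 3 + recurring) * hourly_rate

# Hypothetical mid-size tool: 240h build, 10h/mo maintenance,
# 15h/mo reconciliation, at a blended $120/hour rate
tax = annual_etl_tax(240, 10, 15, hourly_rate=120)
print(f"${tax:,.0f} per year")  # $45,600 per year
```

Even with these modest placeholder figures, one integration costs tens of thousands of dollars a year before the subscription fee is counted.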
The signals that this tax may become optional are now concrete. SAP announced zero-copy data sharing with major data platforms, reaching general availability in Q1 2026. Workday introduced Data Connect, enabling two-way zero-copy sharing via Apache Iceberg. The direction is clear: SaaS vendors themselves acknowledge that the data belongs on your platform, not locked inside their product. That is exactly Databricks' positioning with its open-platform mandate.
This convergence is only possible because the lakehouse is built on open formats and open protocols. Delta Lake, Apache Iceberg, and Delta Sharing are open-source, which means the data layer is not controlled by any single vendor. That openness is what makes zero-copy sharing architecturally viable. A proprietary data warehouse cannot serve as a neutral application foundation; an open lakehouse can.
If your data already lives in a lakehouse, and SaaS vendors are piping their data into it via zero-copy, the question becomes: why not build applications directly on that layer?
What a Data Platform Is Missing (and What Lakebase Changes)
The answer to "why not" is that an analytics platform is not an application platform. Having a lakehouse does not mean you can run enterprise software on it. There are five capabilities that a data platform needs before it can serve as an application foundation:
- Transactional writes with ACID guarantees. Analytics platforms are optimized for reads — append-only ingestion, batch transformations, analytical queries. Enterprise applications need to insert, update, and delete individual rows with full transactional consistency. Delta Lake and Iceberg handle analytical writes, but they were not designed for the kind of high-frequency, low-latency transactions that a CRM or ticketing system demands.
- Sub-second query latency for interactive UIs. A dashboard that refreshes in three seconds is acceptable. A form submission that takes three seconds is not. Application workloads require a serving layer with consistently low latency — something batch-oriented warehouses cannot guarantee.
- Row-level and column-level security tied to application roles. Analytics governance (who can query which tables) is not the same as application security (which user can see which records within a table, based on their role in the application). This requires fine-grained access control integrated with application authentication.
- Event-driven architecture for real-time state changes. Applications react to events — a ticket is assigned, a deal moves stages, an approval is granted. Batch-refresh data pipelines that run every 15 minutes or every hour cannot power this. You need change data capture, streaming, or event buses.
- A developer experience that does not require data engineering skills for every change. If adding a field to a form requires a data engineer to modify a Delta table schema, run a migration, and update a transformation pipeline, the development velocity collapses.
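To illustrate the third gap above, here is a minimal stand-in for application-level row security in plain Python: each user sees only the records their role and region entitle them to. In a real deployment this policy would live in the platform's governance layer (for example, row filters tied to application authentication), not in application code; every name and record here is hypothetical.

```python
# Minimal sketch of row-level security tied to application roles.
# A "rep" sees only their own accounts; a "manager" sees a whole region.
# All names, roles, and records are hypothetical.

from dataclasses import dataclass

@dataclass
class User:
    name: str
    role: str    # "rep" or "manager"
    region: str

ACCOUNTS = [
    {"id": 1, "owner": "ana",  "region": "EMEA", "arr": 120_000},
    {"id": 2, "owner": "ben",  "region": "EMEA", "arr": 80_000},
    {"id": 3, "owner": "carl", "region": "AMER", "arr": 200_000},
]

def visible_accounts(user: User) -> list[dict]:
    """Return only the rows this user's role and region entitle them to."""
    if user.role == "manager":
        return [a for a in ACCOUNTS if a["region"] == user.region]
    return [a for a in ACCOUNTS if a["owner"] == user.name]

print(len(visible_accounts(User("ana", "rep", "EMEA"))))       # 1: own rows only
print(len(visible_accounts(User("dana", "manager", "EMEA"))))  # 2: region rows
```

Analytics governance answers "who may query this table"; this kind of predicate answers "which rows within the table this user may see" — and that is the part analytics platforms historically lacked.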
I have watched teams build internal tools on top of their data warehouse only to discover that a batch-refresh analytics layer cannot power an interactive application. The queries were fast enough for dashboards but too slow for a form submission. The gap between "analytics-ready" and "application-ready" is wider than most teams expect.
This is where Databricks Lakebase changes the equation. Described by its creators as "what you would build if you had to redesign OLTP databases today," Lakebase is a third-generation database architecture that runs a fully managed, serverless Postgres compute layer on top of cloud object storage in open formats. It delivers ACID transactional writes on the lake, sub-second compute that scales to zero, git-like branching of petabyte-scale databases, and — critically — unified OLTP and OLAP on the same storage layer.
That last point is the one that matters most for this argument. If transactional and analytical workloads share the same storage, there is no ETL between them. The application writes data; the analytics layer reads it. No pipeline. No schema mapping. No reconciliation. No freshness SLA. Zero-copy by design. In practice, this means your finance team no longer waits three days for reconciled revenue numbers.
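The shape of that argument can be sketched with a local stand-in. The code below uses SQLite purely as a proxy for a lakehouse-backed transactional store (it is not Lakebase, and the table and schema are hypothetical); the point is that the transactional write and the analytical aggregate hit the same storage, so the aggregate reflects the write immediately, with no pipeline in between.

```python
# Sketch of "zero ETL between OLTP and OLAP": one storage layer serves both
# the application write path and the analytics read path. SQLite is a local
# stand-in here; table name and schema are hypothetical.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Application path: transactional writes with ACID guarantees.
with conn:  # commits on success, rolls back on error
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("acme", 500.0))
    conn.execute("INSERT INTO orders (customer, amount) VALUES (?, ?)", ("acme", 250.0))

# Analytics path: reads the very same storage. No pipeline, no schema
# mapping, no freshness SLA -- the aggregate sees the write immediately.
(total,) = conn.execute(
    "SELECT SUM(amount) FROM orders WHERE customer = ?", ("acme",)
).fetchone()
print(total)  # 750.0
```

In a conventional architecture, the two paths above would be separated by an extraction pipeline and a refresh schedule; unified storage removes that seam by construction.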
Lakebase closes the two hardest gaps: transactional writes and serving latency, which were previously architectural deal-breakers. But Lakebase is not the whole story. The broader Databricks Intelligence Platform addresses the remaining gaps through complementary capabilities: Unity Catalog provides row-level and column-level security with attribute-based access controls, extending governance from analytics to application workloads. Structured Streaming and Delta Live Tables enable event-driven architectures with native change data capture. Databricks Apps allows developers to build and deploy applications directly on the platform without requiring data engineering expertise for every change. No single product closes all five gaps overnight. But the platform, taken as a whole, has a credible path to each one, and that path is further along than most enterprises realize.
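The event-driven gap deserves a concrete shape. The toy below models the pattern: application state changes are emitted as change events and consumed by subscribers immediately, rather than discovered by a batch job minutes later. In production this role is played by change data capture and streaming; the `ChangeFeed` class and event names here are hypothetical stand-ins, not a real API.

```python
# Toy illustration of event-driven state changes: subscribers react to a
# change the moment it is emitted. In production this is change data capture
# or a streaming bus; ChangeFeed and all event names are hypothetical.

from collections import defaultdict
from typing import Callable

class ChangeFeed:
    """In-memory stand-in for a change data capture stream."""

    def __init__(self):
        self._subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._subscribers[event_type].append(handler)

    def emit(self, event_type: str, payload: dict) -> None:
        # Deliver the change to every subscriber as soon as it happens
        for handler in self._subscribers[event_type]:
            handler(payload)

feed = ChangeFeed()
assignments: list[int] = []
feed.subscribe("ticket_assigned", lambda e: assignments.append(e["ticket_id"]))

# The application records a state change; subscribers react immediately,
# with no 15-minute batch window in between.
feed.emit("ticket_assigned", {"ticket_id": 42, "assignee": "ana"})
print(assignments)  # [42]
```

Contrast this with a batch-refresh pipeline: the same assignment would sit unseen until the next scheduled run, which is fine for a dashboard and unacceptable for a ticketing UI.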
The Maturity Model: Five Levels from Siloed SaaS to Application Platform
Not every enterprise should attempt this transition, and no enterprise should attempt it all at once. This is a graduated progression, and each level has prerequisites that must be met before advancing.
Level 0 - Siloed SaaS, ETL Everything. Each SaaS tool owns its data. You extract what you can via APIs or flat files, load it into a warehouse, and transform it for reporting. This is where most enterprises still operate. The ETL tax is highest here, but the operational risk is lowest — every application is managed by its vendor.
Level 1 - Zero-Copy Analytics. SaaS vendors share data with your data platform via zero-copy integrations (Iceberg, Delta Sharing). You stop building extraction pipelines for participating vendors. Analytics improves because the data is fresher and more consistent. You are not building applications yet — you are eliminating the ETL tax on the read side.
Level 2 - Custom Read-Heavy Applications. You build lightweight applications that read from the data platform — customer portals, operational dashboards, internal reporting tools, search interfaces. These are read-only or read-mostly. They do not require transactional writes to the lakehouse. This is where most "replaced a SaaS tool" stories from the Retool report actually live.
Level 3 - Custom Read-Write Applications. You build applications that both read from and write to the data platform: a simplified CRM, an internal ticketing system, or a custom approval workflow. This is where you need transactional capabilities (Lakebase or equivalent), application-level security, and a real developer experience. This level replaces specific SaaS functions, not entire platforms.
Level 4 - Full Application Platform. The data platform becomes the foundation for all custom enterprise software. Operational databases, analytical workloads, and AI/ML models share the same storage layer. New applications are built on the platform by default. SaaS is reserved for domains where vendor expertise is irreplaceable (e.g., payroll compliance, tax calculation).
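The progression above is cumulative: each level's prerequisites include everything the previous level required. The sketch below encodes that as a simple readiness check. The level names come from this article; the capability strings are paraphrased prerequisites, and the example inputs are hypothetical.

```python
# The five maturity levels, expressed as a cumulative prerequisite check.
# Level names follow the article; capability strings are paraphrased.

LEVELS = {
    0: "Siloed SaaS, ETL Everything",
    1: "Zero-Copy Analytics",
    2: "Custom Read-Heavy Applications",
    3: "Custom Read-Write Applications",
    4: "Full Application Platform",
}

PREREQS = {
    1: {"zero_copy_sharing_enabled"},
    2: {"zero_copy_sharing_enabled", "web_app_team"},
    3: {"zero_copy_sharing_enabled", "web_app_team",
        "app_security_and_oncall", "app_cicd"},
    4: {"zero_copy_sharing_enabled", "web_app_team",
        "app_security_and_oncall", "app_cicd",
        "platform_engineering_team"},
}

def highest_ready_level(capabilities: set[str]) -> int:
    """Return the highest level whose full prerequisite set is satisfied."""
    level = 0
    for lvl in sorted(PREREQS):
        if PREREQS[lvl] <= capabilities:  # subset check: all prereqs met
            level = lvl
        else:
            break  # levels are cumulative; stop at the first gap
    return level

# A hypothetical team with sharing enabled and app developers,
# but no on-call practice or application CI/CD yet:
print(highest_ready_level({"zero_copy_sharing_enabled", "web_app_team"}))  # 2
```

Note that the `break` is the whole point: a team cannot be "ready for Level 4" while a Level 3 prerequisite is missing, which is exactly the self-assessment error described below.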
Most enterprises I work with think they are ready for Level 3 or 4. When we audit their data platform, they are solidly at Level 1, and that is not a criticism. Level 1 done well is more valuable than Level 4 done poorly.
The prerequisites at each level are not just technical. Level 2 requires a team that can build and maintain web applications, not just data pipelines. Level 3 requires application security expertise, on-call operations, and a CI/CD pipeline for application code. Level 4 requires an internal platform engineering team that treats the data layer as a product with SLAs, documentation, and developer support.
The AI Accelerant (and Its Limits)
AI-assisted development is the reason the build-vs-buy math has shifted. Gartner estimates that 90% of enterprise software engineers will use AI code assistants by 2028, up from less than 14% in 2024. AI coding tools can accelerate application development by 3-5x, which means a custom internal tool that once required six months and a dedicated team can now be prototyped in weeks.
This changes the economics. But it does not change the architecture.
AI coding agents solve the "can we build it fast enough?" problem. They do not solve the "can we operate it reliably?" problem. The risk is specific and predictable: a team uses an AI coding agent to build a CRM replacement in a weekend. It works. It queries the lakehouse, it renders a UI, and it even handles basic CRUD operations. Then month three arrives. A manager asks why they cannot see a customer record that another team can see. Someone discovers there is no audit trail for who changed what. A compliance review reveals that deleted records are not actually deleted; they are just hidden in the UI.
These are not hypothetical scenarios. They are the exact failure modes that enterprise SaaS vendors have spent decades solving and that vibe-coded replacements skip entirely.
The right framing is that AI-assisted development is the economic enabler, not the architectural one. It makes building feasible. It does not make building well automatic. The maturity model still applies: AI just compresses the timeline at each level; it does not let you skip levels.
Who Should Attempt This (and Who Should Not)
Three conditions must be true before an enterprise should seriously consider building applications on its data platform:
You already have a mature data platform. If your lakehouse is still being built, if your governance layer is incomplete, if your data quality is inconsistent — fix those first. Building applications on a shaky data foundation amplifies every existing problem. When governance is solid, with Unity Catalog managing access, lineage tracked, and data quality monitored, the lakehouse becomes the most natural foundation for enterprise applications. The investment in getting the data layer right pays compound returns when applications consume it directly.
You have a specific, painful SaaS integration. The best candidates are SaaS tools where the ETL tax is highest — where you spend more time integrating and reconciling than you spend using the tool. Start with one. Not five. Not "all of them."
Your team can operate what it builds. Building an application is the easy part. Operating it — monitoring, patching, securing, scaling, supporting users, handling incidents at 2 AM — is the hard part. If your data team has never operated a user-facing application, start at Level 2 (read-heavy) and build operational maturity before attempting Level 3.
If all three conditions are met, start at Level 1 if you are not already there. Enable zero-copy sharing with your largest SaaS vendors. Measure the ETL tax reduction. Build one read-heavy application on the data platform. Learn what breaks. Then — and only then — evaluate Level 3.
The enterprises that will succeed in this transition are those that treat it as a progression of maturity, not a migration project. The ones that will fail are the ones that read the SaaSpocalypse headlines, hand an AI coding agent a prompt, and expect enterprise-grade software by Friday.
The data platform can become the application platform. But the real bottleneck is not the technology. It is the governance, the operational maturity, and the discipline to build incrementally. That has always been true — and no amount of AI changes it.
With Lakebase unifying transactional and analytical storage, Unity Catalog governing access across every workload, and Mosaic AI powering the intelligence layer, the architecture is no longer theoretical. It is being built, in production, on open formats that no single vendor controls. For the first time, the platform is ready before most organizations are. That gap is closing, and the enterprises that close it first will define the next era of enterprise software.
About Indicium AI
Indicium AI is trusted by the world's leading enterprises to deliver AI into production at scale. We are a global AI-native consultancy with proven experience across Financial Services, Energy & Utilities, Healthcare & Life Sciences, Retail & CPG, and Manufacturing. From strategy, to build, to business outcomes, we unlock value from AI with unmatched clarity, speed, and capability.
Powered by 600+ AI experts serving 50+ enterprise clients from 5 global locations, we work side-by-side with top partners - including Anthropic, Databricks, AWS, OpenAI, and Microsoft - to deliver modern AI with speed and measurable impact.

