When One Provider Owns Your Uptime: The Hidden Cost of Cloud Concentration
Why your business continuity plan probably isn’t as solid as you think
Last December, a routine database error at an AWS facility in Northern Virginia took down Lyft, Starbucks, ChatGPT, and thousands of other services for hours. The financial toll? Hard to quantify precisely, but analysts estimate the ripple effects cost affected businesses somewhere between $100 million and $150 million in lost revenue and productivity that day alone.
Here’s the uncomfortable truth most CMOs don’t want to hear: if you’re running on AWS, you’re in good company—and that’s exactly the problem. Amazon controls 31% of the global cloud infrastructure market. Microsoft Azure holds 20%. Google Cloud Platform has 11%. Together, these three companies account for more than 60% of the world’s cloud computing capacity. When one stumbles, entire sectors feel it.
The irony is thick. We’ve spent two decades moving away from single points of failure, distributing our infrastructure across regions and availability zones. Yet we’ve created a new vulnerability—one that’s arguably more insidious because it’s wrapped in the veneer of redundancy.
The consolidation paradox nobody’s talking about
As the major cloud providers have grown their market share, the businesses running on them have become more deeply entangled with their platforms. AWS generated $108 billion in revenue in 2024, up 19% year-over-year. Azure isn’t far behind, though Microsoft doesn’t break out exact figures. This growth isn’t just impressive—it’s concerning.
Consider the pharmacy chain that couldn’t process prescriptions during the December outage because their point-of-sale system, inventory management, and electronic health records all lived on AWS. Or the media company that lost ad revenue not just from their streaming service being down, but because their ad server, content delivery network, and analytics platform all failed simultaneously.
This isn’t theoretical risk management. It’s the reality of modern infrastructure dependence.
The multi-cloud approach that consultants love to recommend? It’s expensive, complex, and rarely executed well. According to recent data, only 8% of enterprises have fully integrated multi-cloud strategies despite 67% claiming they use multiple cloud providers. The difference between “using” and “being prepared to fail over to” is where most business continuity plans fall apart.
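Being prepared to fail over is something you can verify with a recurring script rather than a slide. Here is a minimal sketch of that kind of readiness check in Python; the endpoint URLs and provider labels are hypothetical placeholders, and a real drill would exercise the full transaction path, not just a status page.

```python
"""Sketch of a failover readiness check (illustrative only).

The endpoint URLs and provider labels are hypothetical placeholders;
a real drill would exercise the full transaction path, not just a
status page.
"""
import urllib.request

ENDPOINTS = {
    "primary (AWS)": "https://primary.example.com/healthz",        # hypothetical
    "secondary (Azure)": "https://secondary.example.com/healthz",  # hypothetical
}

def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers URLError, connection resets, and timeouts
        return False

def readiness_drill() -> None:
    statuses = {name: is_healthy(url) for name, url in ENDPOINTS.items()}
    for name, ok in statuses.items():
        print(f"{name}: {'OK' if ok else 'UNREACHABLE'}")
    # The drill only counts if the secondary could actually take traffic today.
    if not statuses["secondary (Azure)"]:
        print("WARNING: failover target is not serving; the plan is theoretical.")

if __name__ == "__main__":
    readiness_drill()
```

Run on a schedule, a check like this turns “we have a second provider” into a claim you can falsify before an outage does it for you.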
What the alternatives actually look like
Let’s be practical. Moving entirely off hyperscale cloud providers isn’t realistic for most businesses. But understanding your exposure is.
Oracle Cloud Infrastructure has been quietly growing, targeting enterprises that want an alternative to the Big Three. It holds about 2% market share but saw 45% growth in 2024. Alibaba Cloud, despite geopolitical headwinds, maintains 4% of the global market. Both offer enterprise-grade services that can serve as legitimate backup options for critical workloads.
But here’s what few people consider: the real solution isn’t just technical—it’s architectural. Companies that weathered the AWS outage best weren’t necessarily those with elaborate multi-cloud setups. They were the ones who had architected their applications to degrade gracefully. Their payment processing might slow down, but it didn’t stop. Their customer service might switch to a backup mode, but it remained functional.
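Graceful degradation isn’t a product you buy; it’s a decision you encode about what happens when a dependency stops answering. A minimal sketch of the pattern in Python, with the payment-gateway call and the deferred-charge queue as hypothetical stand-ins:

```python
"""Sketch of degrade-don't-stop handling on a critical path (illustrative).

`charge_via_gateway` and the deferred-charge queue are hypothetical
stand-ins; the point is that when the payment dependency is down, the
order is accepted and the charge deferred rather than the sale lost.
"""
from collections import deque
from dataclasses import dataclass

@dataclass
class Order:
    order_id: str
    amount_cents: int

class GatewayUnavailable(Exception):
    """Raised when the payment dependency cannot be reached."""

# Orders we could not charge immediately; drained once the gateway recovers.
deferred_charges: deque[Order] = deque()

def charge_via_gateway(order: Order) -> None:
    # Placeholder for the real payment call; here it simulates an outage.
    raise GatewayUnavailable(order.order_id)

def take_order(order: Order) -> str:
    """Accept the order even if the charge has to be deferred."""
    try:
        charge_via_gateway(order)
        return "charged"
    except GatewayUnavailable:
        deferred_charges.append(order)  # degrade: slower, but not stopped
        return "accepted, charge deferred"

print(take_order(Order("A-1001", 4999)))  # -> accepted, charge deferred
```

The design choice is the whole point: the customer’s order is accepted and the charge is retried later, instead of the entire checkout failing along with the gateway.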
The financial services firm that kept 15% of their compute on Azure and GCP as a hot standby spent an extra $2.3 million annually on that redundancy. During the outage, they lost maybe 20 minutes of processing time while traffic rerouted. Their competitor, running 100% on AWS, lost eight hours. Do the math on that ROI: the standby pays for itself the moment an hour of downtime costs the business more than roughly $300,000.
The questions your vendor won’t answer
When evaluating cloud dependencies, most companies ask the wrong questions. They focus on uptime SLAs (which are usually 99.95% or higher across major providers, a figure that still allows more than four hours of downtime a year) instead of asking about correlated failure risks.
What percentage of your provider’s customers are in your industry? If a sector-specific vulnerability emerges, you’re all going down together. What’s the blast radius of their availability zones? AWS’s us-east-1 region—the one that failed in December—serves as the default for countless services. It’s not just busy; it’s critical infrastructure for the entire internet.
How diversified is their own infrastructure? Ironically, many cloud providers rely on the same fiber optic networks, power grids, and even hardware suppliers. An Nvidia chip shortage impacts everyone. A cable cut in the Atlantic affects multiple providers simultaneously.
What this means for marketing and advertising tech
For marketing leaders, this isn’t just an IT problem. Your ad serving, your attribution modeling, your customer data platform—these probably all sit on the same cloud provider. When AWS went down, programmatic ad spending essentially paused for affected advertisers. Real-time bidding stopped. Attribution broke. Retargeting campaigns failed.
The impact on ad spend alone was estimated at $40 million for the hours of the outage. But the downstream effects—missed conversion windows, broken customer journeys, corrupted analytics data—persisted for days.
Marketing stacks average 3-6 tools according to 2024 research, and integration capability is now the top consideration when choosing platforms. But integration is only an asset if your integrated tools don’t all share the same single point of failure.
Building resilience that actually works
First, audit your actual dependencies—not what your architecture diagrams say, but what actually happens when you trace a customer transaction from click to confirmation. Most companies discover they’re more dependent on single providers than they realized.
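The audit doesn’t need special tooling to start. Here is a rough sketch in Python of the tallying step; the transaction path and provider assignments are hypothetical examples you would fill in from your own tracing or APM data:

```python
"""Sketch of a dependency concentration audit (illustrative).

The transaction path and provider assignments are hypothetical; in
practice you would fill them in from tracing or APM data. The useful
output is the tally: how much of the click-to-confirmation path sits
on one provider.
"""
from collections import Counter

TRANSACTION_PATH = [
    ("landing page",             "AWS"),
    ("product catalog API",      "AWS"),
    ("payment gateway",          "AWS"),
    ("fraud scoring",            "AWS"),
    ("order confirmation email", "third-party vendor (hosted on AWS)"),
    ("analytics event sink",     "GCP"),
]

def concentration_report(path: list[tuple[str, str]]) -> None:
    counts = Counter(provider for _, provider in path)
    total = len(path)
    for provider, n in counts.most_common():
        print(f"{provider:<36} {n}/{total} steps ({100 * n / total:.0f}%)")
    top_provider, top_n = counts.most_common(1)[0]
    if top_n / total > 0.5:
        print(f"NOTE: more than half the critical path depends on {top_provider}.")

concentration_report(TRANSACTION_PATH)
```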
Second, prioritize which workloads genuinely need redundancy. Not everything does. Your company blog can afford downtime. Your payment processing can’t. Your data warehouse can wait. Your fraud detection can’t.
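One way to make that prioritization concrete is to write the tiers down with explicit tolerances, so redundancy spend follows the tier rather than habit. A hypothetical sketch in Python; the workloads and downtime tolerances are illustrative only:

```python
"""Sketch of a workload tiering exercise (illustrative).

The workloads and downtime tolerances are hypothetical examples; the
point is that redundancy spend follows the tier, not the org chart.
"""
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    max_downtime_minutes: int  # how long the business can tolerate it being down
    needs_standby: bool        # does it justify a second provider?

TIERS = [
    Workload("payment processing", max_downtime_minutes=5,    needs_standby=True),
    Workload("fraud detection",    max_downtime_minutes=15,   needs_standby=True),
    Workload("data warehouse",     max_downtime_minutes=1440, needs_standby=False),
    Workload("company blog",       max_downtime_minutes=4320, needs_standby=False),
]

for w in sorted(TIERS, key=lambda w: w.max_downtime_minutes):
    plan = "hot standby on a second provider" if w.needs_standby else "restore from backup"
    print(f"{w.name:<20} tolerate {w.max_downtime_minutes:>5} min -> {plan}")
```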
Third, think in terms of degraded service, not just failover. Can you process orders manually for a few hours? Can you route customer service calls to a backup system? Can you pause non-essential analytics processing to keep critical paths running?
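In code, degraded service can be as unglamorous as a mode switch that sheds non-essential work. A minimal sketch, assuming a hypothetical flag that an operator or health monitor flips during an incident:

```python
"""Sketch of a degraded-mode switch (illustrative).

`OUTAGE_MODE` would normally come from a feature-flag service or an
operator runbook; the task names are hypothetical placeholders.
"""
OUTAGE_MODE = True  # flipped by an operator or a health monitor during an incident

ESSENTIAL = {"process_order", "route_support_call"}
DEFERRABLE = {"refresh_dashboards", "rebuild_recommendations", "sync_analytics"}

def should_run(task: str) -> bool:
    """Keep the critical path alive; shed everything else during an outage."""
    return task in ESSENTIAL if OUTAGE_MODE else True

for task in sorted(ESSENTIAL | DEFERRABLE):
    print(f"{task:<25} {'run' if should_run(task) else 'pause until recovery'}")
```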
The companies that weather outages best aren’t necessarily those with the fanciest infrastructure. They’re the ones who’ve asked “what breaks first, and can we still serve customers when it does?”
The 2026 forecast
Cloud spending was projected to exceed $678 billion in 2024, growing 22% year-over-year, and that growth isn’t slowing. As AI workloads proliferate, the strain on infrastructure will only increase. AWS, Azure, and GCP are all competing fiercely for AI compute customers, but they’re also acknowledging that the density of GPU clusters creates new failure modes.
Expect more outages, not fewer. Expect them to be more consequential as businesses become more digitally dependent. And expect the companies that survive them best to be the ones who planned for degradation rather than perfection.
The next major cloud outage will happen. Probably this quarter. Possibly this month. The question isn’t whether your provider will fail—they all do eventually. The question is whether your business has planned for it.

