AI's Compute Race Compounds Risk at Scale as Physical Constraints Mount
As AI infrastructure investment surges, technology executive and futurist Nabeel Mahmood explains why energy constraints, labor shortages, and catastrophic downtime risk are the variables most leaders are ignoring.

Key Points
The AI compute race is outpacing the physical constraints that will determine whether it succeeds, including energy supply and a shortage of the skilled tradespeople who build and operate these systems.
Nabeel Mahmood, CEO of an eponymous consulting firm, says the industry's fixation on compute density is creating concentrated risk across operational, financial, and human capital dimensions.
He advises leaders to "deconstruct to de-risk" by separating workloads and using a modular approach to eliminate waste and build more resilient systems.
In these high-density environments, the people who keep the physical systems running are the ones who determine whether the infrastructure actually works.

The push toward megawatt-scale racks and extreme compute density has made AI infrastructure a story about engineering ambition. But that focus is missing the practical constraints that will actually determine if these systems can operate at scale. Instead, the next phase of the industry is being shaped by three overlooked factors: operational resilience, immense financial risk, and a growing shortage of skilled labor.
Nabeel Mahmood is a technology executive and futurist with over two decades of experience leading global technology organizations, and the CEO of an eponymous consulting firm specializing in global IT and data centers. His career spans hands-on technology leadership, including CITO roles at Maxco Supply and Instor, through board-level governance, including a directorship at United Security Bank. He also serves as Managing Director of the Nomad Futurist Foundation, a nonprofit focused on building the next generation of digital infrastructure talent. In his view, the industry's engineering-centric mindset is creating an environment of concentrated risk where the value of foundational operational roles is often overlooked.
"Your most important asset is no longer the bits and bytes of compute. It is the plumber. In these high-density environments, the people who keep the physical systems running are the ones who determine whether the infrastructure actually works," says Mahmood. The industry has treated infrastructure as an engineering problem for so long that the operational layer (the technicians, maintenance disciplines, and physical systems expertise) has become an afterthought. As compute density climbs and the stakes of failure rise, that oversight is becoming increasingly expensive.
For many leaders, this financial exposure makes resilient design a core business necessity. It is also prompting a new way of thinking about how service level agreements are tied to business outcomes. In the world of high-density AI, where a single rack can host billions of dollars' worth of data, the conversation is expanding to include the cascading recovery costs of failure, not just performance metrics.
Do the math: "Calculate the true cost of downtime. If you're a $10 billion organization, your cost of downtime for one minute might be $100 million. Then you have the soft costs and the cost of recovery," says Mahmood. "And you are not recovering a high-density asset in just another minute. Assuming it takes twenty minutes with all hands on deck, do the math. We're already at $2 billion. For me, the math does not add up." It is a calculation Mahmood believes most CIOs have never formally run, and the one most likely to change how they approach infrastructure investment.
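As a rough sketch, the back-of-the-envelope calculation Mahmood describes can be written down explicitly. The per-minute cost and recovery window below are the illustrative figures from his example, not measured values, and the soft-cost multiplier is an added assumption:

```python
# A minimal sketch of the downtime calculation described above. The
# per-minute cost and recovery window are illustrative figures from the
# quoted example; the soft-cost multiplier is an assumption layered on top.

def downtime_exposure(cost_per_minute: float, recovery_minutes: float,
                      soft_cost_multiplier: float = 0.0) -> float:
    """Direct loss over a recovery window, plus optional soft costs
    (reputation, SLA penalties) expressed as a fraction of direct loss."""
    direct = cost_per_minute * recovery_minutes
    return direct * (1.0 + soft_cost_multiplier)

# The $10B-organization example: $100M per minute, 20-minute recovery.
exposure = downtime_exposure(100_000_000, recovery_minutes=20)
print(f"${exposure:,.0f}")  # → $2,000,000,000
```

Running the numbers once, even at this level of crudeness, makes the scale of the exposure hard to ignore.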
Mahmood's response to these risks is a fundamentally different architectural strategy, one that prioritizes resilience over hyper-concentration and challenges the industry's default of packing everything together for maximum speed. That rethink extends beyond design choices to the people who build and operate the infrastructure. As data centers and hyperscale campuses continue to be built at scale, the shift from traditional AC to high-voltage DC environments demands an entirely different technical skill set, and the shortage of qualified tradespeople compounds every other infrastructure challenge. Exacerbating it further is the strain on the US electrical grid, which is struggling to keep pace with the power demands of the AI infrastructure build-out.
Deconstruct to de-risk: "We need to separate language models from your general compute. Move from large language models into narrow language models so you are actually doing the task the model is supposed to do. Create a segregation of where data resides: where inferencing happens, where processing happens. A little decentralization, with the right partitioning, can solve challenges that pure concentration cannot," notes Mahmood. The underlying principle is that not all AI workloads carry the same risk profile, and treating them as though they do concentrates both complexity and exposure unnecessarily.
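One way to picture that partitioning principle is as a routing table that sends each workload class to its own segregated pool rather than one shared high-density cluster. The pool and workload names here are invented for illustration, not taken from Mahmood:

```python
# An illustrative sketch of "deconstruct to de-risk": classify workloads
# by risk profile and route them to segregated pools instead of a single
# shared high-density cluster. All names here are hypothetical.

WORKLOAD_POOLS = {
    "narrow-model-inference": "edge-pool",       # task-specific models, small blast radius
    "large-model-training":   "dedicated-pool",  # high density, isolated failure domain
    "general-compute":        "shared-pool",     # everyday business applications
}

def route(workload_type: str) -> str:
    """Return the segregated pool for a workload; anything unclassified
    falls back to general shared compute."""
    return WORKLOAD_POOLS.get(workload_type, "shared-pool")

print(route("large-model-training"))  # → dedicated-pool
```

The point of the sketch is that a failure in one pool stays in that pool: a training cluster going down does not take general business compute with it.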
Building with blocks: This modular approach also addresses one of infrastructure's oldest and most expensive problems: waste. Historically, the industry has "overengineered, over-provisioned, and underutilized," producing a global efficiency ratio Mahmood puts at "sub-47%," a figure that represents massive amounts of stranded capital. Leaders can build more efficient systems by embracing modularity and new data resilience models, which directly counters the overbuilding challenges often intensified by unpredictable AI workloads. "The spiky nature of AI compute demand is easily solvable if you build modular, Lego-style infrastructure. Add what you need when you need it instead of overbuilding expensive capacity that sits unused," adds Mahmood.
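The economics of that Lego-style approach can be sketched with a toy comparison: provision for the worst case on day one versus adding fixed-size blocks as the running peak demands them. The module size and demand figures below are invented for illustration:

```python
import math

# A toy comparison of monolithic overbuild versus modular, Lego-style
# growth against a spiky demand curve. Module size and demand values
# are invented for illustration.

MODULE_MW = 10                       # capacity added per modular block
demand = [12, 35, 18, 60, 22, 41]    # spiky monthly peak demand, in MW

# Monolithic: provision for the worst case on day one.
mono_capacity = 100
mono_utilization = sum(demand) / (mono_capacity * len(demand))

# Modular: add blocks only as the running peak requires them.
modular_caps, peak = [], 0
for d in demand:
    peak = max(peak, d)
    modular_caps.append(math.ceil(peak / MODULE_MW) * MODULE_MW)
modular_utilization = sum(demand) / sum(modular_caps)

print(f"monolithic utilization: {mono_utilization:.0%}")   # → 31%
print(f"modular utilization:    {modular_utilization:.0%}")  # → 67%
```

Even in this crude model, the overbuilt system lands in the sub-47% utilization territory Mahmood describes, while the incremental build roughly doubles the return on deployed capacity.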
The scale of these financial risks is also prompting external parties to take a more active role. As the cost and operational difficulty of AI infrastructure begin to consolidate the market, Mahmood suggests the concentration of risk will likely attract greater attention from both insurance companies and government regulators.
The new watchdogs: "I believe when we encounter one major catastrophe within the next eighteen months to two years, there will be some significant shifts in how we deploy these applications. On the same note, I've started to see insurance companies taking an active role in trying to figure out how they are going to insure these environments on a go-forward basis," says Mahmood. It is less a prediction than an observation of a market already in motion, one where the financial stakes of failure are becoming too large for the industry to self-govern indefinitely.
Mahmood's guidance for CIOs is straightforward: look past the immediate engineering hype and reverse engineer your infrastructure strategy with a long-term view. Meticulously plan for every dependency, and take seriously what recent history has already demonstrated to be fragile. "What happened with the FAA? That was just a simple patch update that did not have the right sequence of operation," says Mahmood. The incident, the January 2023 failure of the FAA's NOTAM system that triggered a nationwide ground stop, was a reminder that in highly concentrated, interdependent systems, a single misstep cascades rapidly.
For infrastructure leaders, the lesson is not to move slower, but to think further ahead: stress-test every contingency and build systems that can absorb failure before it becomes a crisis. "Think like a Formula One car. You know that you've got the ability and the capability to drive at 260 miles an hour. But when that hairpin turn comes in at 35 degrees, you're not going to be pushing through at those speed limits. You've got to slow it down to go fast," concludes Mahmood.




