The conversation around AI infrastructure often begins with model size and ends with graphics processing units. In between sits a quieter layer that determines cost, speed and resilience. That layer is the placement and management of AI data centre workloads.
Callosum, a UK-based startup founded by Danyal Akarca and Jascha Achterberg, enters this conversation with a specific challenge. Why should most AI data centre workloads continue to run on uniform clusters of Nvidia GPUs when the tasks themselves vary widely in complexity and urgency?
The company recently raised $10.25 million in venture funding. The round was led by Plural, the European early-stage fund co-founded by Taavet Hinrikus and Ian Hogarth. Angel investors include Charlie Songhurst, Stan Boland of FiveAI and John Lazar of the Royal Academy of Engineering. The UK government-backed Advanced Research and Invention Agency is supporting research through grant funding.
The funding size is modest compared with multi-billion-dollar AI rounds. Yet the story is not the scale of the capital. It is control of AI data centre workloads.
The Global Concentration of AI Data Centre Workloads
Across North America, Europe, Asia Pacific and the Middle East, the structure of AI data centre workloads follows a similar path. Enterprises train large models on Nvidia GPUs. Cloud providers prioritise Nvidia-backed instances. Developers build around CUDA software.
This pattern created a concentration of AI data centre workloads within a single hardware ecosystem.
For years, this alignment made commercial sense. Training large language models required dense GPU clusters. Nvidia delivered the performance needed at that stage of AI development.
Yet the market is shifting.
Deloitte estimates that inference will account for roughly two-thirds of AI compute demand by 2026, up from about one-third in 2023. Inference is the live side of AI data centre workloads: running trained models against real-time requests rather than training them. Customer queries, fraud checks, logistics routing and content recommendations all fall into this category.
The inference hardware market is projected to exceed $50 billion in 2026. This figure reflects global demand, not a single region.
When inference becomes the dominant share of AI data centre workloads, the economics change. The focus moves from peak training power to cost per interaction and latency per request.
Callosum positions itself inside that transition.
From Neuroscience to AI Data Centre Workloads
Akarca and Achterberg met during their PhDs in neuroscience at Cambridge. Their academic work examined how the human brain coordinates specialised regions to produce coherent output. The brain does not rely on identical units performing identical tasks.
Callosum applies that observation to AI data centre workloads.
Instead of routing all tasks to one category of hardware, its orchestration software distributes AI data centre workloads across multiple chip types. Nvidia GPUs remain part of the stack. So do AMD processors, AWS Trainium, AWS Inferentia, Cerebras systems and SambaNova hardware.
The premise is straightforward. Different chips may suit different AI data centre workloads depending on scale, latency and computational intensity.
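The core of such matching can be illustrated with a deliberately simplified sketch. The Python below is not Callosum's software; the chip names, latencies and costs are invented, and the policy shown, the cheapest chip that still meets a latency budget, is just one plausible heuristic.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    latency_budget_ms: float  # how quickly a response is needed

@dataclass
class Chip:
    name: str
    latency_ms: float            # typical per-request latency (invented)
    cost_per_1k_requests: float  # serving cost (invented)

def pick_chip(workload: Workload, chips: list[Chip]) -> Chip:
    """Pick the cheapest chip that still meets the latency budget."""
    eligible = [c for c in chips if c.latency_ms <= workload.latency_budget_ms]
    if not eligible:
        # No chip meets the budget: fall back to the fastest one.
        return min(chips, key=lambda c: c.latency_ms)
    return min(eligible, key=lambda c: c.cost_per_1k_requests)

chips = [
    Chip("general-gpu", latency_ms=12.0, cost_per_1k_requests=0.90),
    Chip("inference-asic", latency_ms=25.0, cost_per_1k_requests=0.20),
]
fraud = Workload("fraud-check", latency_budget_ms=20.0)
recs = Workload("recommendations", latency_budget_ms=150.0)

print(pick_chip(fraud, chips).name)  # general-gpu: only it meets 20 ms
print(pick_chip(recs, chips).name)   # inference-asic: slower but far cheaper
```

Under this policy, a latency-critical fraud check stays on the fast, expensive chip while a tolerant recommendation workload moves to cheaper silicon. Real orchestration would weigh many more variables, but the shape of the decision is the same.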
The company reports that, compared with uniform hardware setups, its system can deliver two times better accuracy, seven times faster performance and four times lower cost on complex tasks. These are company-stated figures and have not yet been supported by independent benchmarks in public reporting.
Even without third-party benchmarks, the framing matters. It suggests that AI data centre workloads should be matched to hardware characteristics rather than assigned by default.
Inference and the Economics of AI Data Centre Workloads
The rise of inference-heavy AI data centre workloads is visible across industries.
A bank running real-time fraud detection handles millions of micro-decisions each day.
A global retailer serving personalised recommendations across continents processes continuous inference requests.
A telecommunications provider managing network optimisation relies on live AI models.
These AI data centre workloads differ from training runs that occur periodically. They are persistent. They operate across time zones. They generate recurring costs.
When Deloitte projects that inference will represent roughly two-thirds of AI compute demand by 2026, it signals a structural shift. The projected $50 billion inference hardware market reinforces the scale of this opportunity.
For enterprises managing global AI data centre workloads, even marginal improvements in cost per inference request can influence annual budgets.
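The sensitivity is easy to see with back-of-the-envelope arithmetic; every figure below is invented purely to show the shape of the calculation.

```python
# Illustrative only: volumes and per-request prices are invented.
requests_per_day = 50_000_000        # a large global service
baseline_cost = 0.00040              # $ per inference request on default GPUs
optimised_cost = 0.00036             # 10% cheaper after rerouting some workloads

annual_baseline = requests_per_day * 365 * baseline_cost
annual_optimised = requests_per_day * 365 * optimised_cost
print(f"Baseline:  ${annual_baseline:,.0f}/year")                     # $7,300,000
print(f"Optimised: ${annual_optimised:,.0f}/year")                    # $6,570,000
print(f"Saving:    ${annual_baseline - annual_optimised:,.0f}/year")  # $730,000
```

A 10 per cent reduction in cost per request, applied to billions of annual requests, compounds into a line item a finance team will notice.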
Callosum’s strategy targets this layer of decision-making.
Nvidia Dominance Under Pressure
Nvidia’s influence on AI data centre workloads is built on more than hardware performance. Its CUDA software ecosystem shaped developer training and enterprise deployment. Many cloud services default to Nvidia instances because customer demand followed that standard.
Callosum does not manufacture competing chips. It builds orchestration software that abstracts hardware selection. In practical terms, this means AI data centre workloads can be distributed across different processors without rewriting entire applications.
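In software terms, that abstraction usually means application code targets a hardware-neutral interface while concrete backends are swapped in underneath. A minimal sketch of the pattern, with entirely hypothetical class and method names:

```python
from abc import ABC, abstractmethod

class Accelerator(ABC):
    """Hardware-neutral serving interface (hypothetical, not a real API)."""
    @abstractmethod
    def run_inference(self, model_id: str, payload: bytes) -> bytes: ...

class GpuBackend(Accelerator):
    def run_inference(self, model_id: str, payload: bytes) -> bytes:
        # In a real system this would call a GPU-backed serving endpoint.
        return b"gpu-result"

class AsicBackend(Accelerator):
    def run_inference(self, model_id: str, payload: bytes) -> bytes:
        # Same contract, different silicon underneath.
        return b"asic-result"

def serve(request: bytes, backend: Accelerator) -> bytes:
    # Application code depends only on the interface, so swapping
    # hardware does not require rewriting the application.
    return backend.run_inference("fraud-model-v3", request)

print(serve(b"txn-123", GpuBackend()))   # b'gpu-result'
print(serve(b"txn-123", AsicBackend()))  # b'asic-result'
```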
If this abstraction works at scale, it introduces optionality into global AI infrastructure. Optionality does not eliminate Nvidia’s dominance. It alters bargaining dynamics.
When enterprises can demonstrate that portions of their AI data centre workloads run effectively on alternative silicon, supplier negotiations may shift.
The effect would be gradual rather than immediate.
The Multi-Agent Expansion of AI Data Centre Workloads
Many enterprises are moving beyond single-model deployments toward multi-agent systems. One agent may manage customer service. Another may handle compliance checks. A third may support internal decision-making.
Each agent produces distinct AI data centre workloads with different latency and compute needs.
Uniform GPU clusters treat these workloads as similar. Heterogeneous computing treats them as specialised.
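What "different latency and compute needs" means can be made concrete with hypothetical requirement profiles; the numbers below are invented for illustration.

```python
# Hypothetical per-agent requirements; all numbers are invented.
agent_requirements = {
    "customer-service": {"latency_budget_ms": 300,   "requests_per_sec": 2_000},
    "compliance-check": {"latency_budget_ms": 50,    "requests_per_sec": 12_000},
    "decision-support": {"latency_budget_ms": 5_000, "requests_per_sec": 10},
}

for agent, req in agent_requirements.items():
    # A uniform cluster ignores these differences; a heterogeneous
    # scheduler can use them to pick the right class of hardware.
    print(f"{agent}: {req['latency_budget_ms']} ms budget, "
          f"{req['requests_per_sec']} req/s")
```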
Callosum’s platform is designed for this environment. The company focuses on enterprises deploying complex, multi-step AI systems where AI data centre workloads vary in profile and priority.
In global operations, these variations multiply. A payment system in Southeast Asia may face different transaction volumes and regulatory conditions from a European branch. AI data centre workloads must adapt to these realities.
Practical Considerations for Global Enterprises
A global organisation can begin by mapping its AI data centre workloads across regions. Training, fine-tuning and inference should be measured separately. Hardware dependency should be tracked by geography.
Cost per workload should be analysed in each market. Latency metrics should be compared across continents. Energy consumption data should be reviewed alongside financial reporting.
Enterprises can ask cloud providers for a breakdown of what share of their AI data centre workloads relies on Nvidia GPUs versus alternative chips. Controlled pilots can test whether specific inference workloads perform comparably on non-Nvidia hardware.
These steps do not require immediate structural change. They require visibility.
Visibility supports informed negotiation and long-term planning.
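A simple workload inventory can turn that visibility into numbers. The records below are invented; in practice they would come from cloud billing exports and monitoring systems.

```python
from collections import defaultdict

# (region, workload_type, hardware, monthly_cost_usd, monthly_requests)
records = [
    ("eu-west",  "inference", "nvidia-gpu", 120_000, 400_000_000),
    ("eu-west",  "training",  "nvidia-gpu",  80_000,           0),
    ("ap-south", "inference", "alt-chip",    30_000, 150_000_000),
]

inference = defaultdict(lambda: {"cost": 0, "requests": 0})
for region, wtype, hardware, cost, requests in records:
    if wtype == "inference":  # measure inference separately from training
        inference[region]["cost"] += cost
        inference[region]["requests"] += requests

for region, agg in inference.items():
    per_request = agg["cost"] / agg["requests"]
    print(f"{region}: ${per_request:.6f} per inference request")
# eu-west:  $0.000300 per inference request
# ap-south: $0.000200 per inference request
```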
Data Centre Architecture and Strategic Control
Public reporting indicates that Callosum is also exploring new hardware approaches, including photonic interconnect technologies, to support its expansion into the United States. Public disclosures do not detail specific products or commercial timelines. Still, the ambition signals that AI data centre workloads sit within a larger architectural framework.
Governments worldwide are funding semiconductor manufacturing and data centre capacity expansion. Enterprises are responding by diversifying supply chains and reviewing infrastructure exposure.
AI data centre workloads intersect with these developments. They shape cloud contracts, energy demand and procurement strategies.
For multinational brands, this is no longer a background engineering issue. It influences service reliability, compliance and cost predictability.
A Scenario Across Continents
Consider a multinational e-commerce platform operating in North America, Europe and Asia Pacific. Its recommendation engine runs continuously. Its fraud detection model analyses each transaction in real time. Its AI agents support customer queries in multiple languages.
These represent varied AI data centre workloads.
If all are routed to identical GPU instances, costs may remain predictable but inflexible. If selected inference workloads are shifted to alternative chips where appropriate, cost structures may change. Latency may vary by region. Performance must be measured.
A contained regional pilot could evaluate these variables. Cost per query can be tracked. Latency can be benchmarked. Reliability metrics can be monitored.
Measured results should guide any broader reallocation of AI data centre workloads.
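A pilot of this kind reduces to a few comparisons. The sketch below uses invented measurements and a simple decision gate: reallocate only if the alternative chip is both within the latency budget and cheaper per query.

```python
import statistics

# Invented pilot measurements in milliseconds for an alternative chip.
candidate_latencies = [18, 19, 20, 21, 22, 23, 24, 26, 28, 29]
latency_budget_ms = 30.0
baseline_cost_per_query = 0.00040   # default GPU instances (invented)
candidate_cost_per_query = 0.00022  # alternative chip (invented)

def p95(samples: list[float]) -> float:
    """95th-percentile latency via the inclusive quantile method."""
    return statistics.quantiles(samples, n=20, method="inclusive")[-1]

within_budget = p95(candidate_latencies) <= latency_budget_ms
cheaper = candidate_cost_per_query < baseline_cost_per_query
print(f"p95 latency: {p95(candidate_latencies):.1f} ms")
print(f"reallocate: {within_budget and cheaper}")  # True only if both hold
```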
Capital, Discipline and Execution
Callosum’s $10.25 million raise positions it as a focused infrastructure player rather than a headline-driven unicorn. The next stage will depend on enterprise adoption and technical validation.
Performance claims of two times better accuracy, seven times faster performance and four times lower cost will require scrutiny from potential clients. Large organisations managing critical AI data centre workloads will demand proof.
If validation follows, the orchestration layer could influence how global AI infrastructure evolves.
If not, Nvidia’s integrated ecosystem will remain the default.
The Strategic Question Ahead
AI data centre workloads are expanding as inference grows. Deloitte’s projection that inference will represent roughly two-thirds of AI compute demand by 2026 underscores that shift. The projected $50 billion inference hardware market reinforces the commercial stakes.
Callosum’s proposition rests on a simple challenge to the status quo. Should all AI data centre workloads be treated as identical when their functional requirements differ?
For global enterprises, the answer will not come from theory. It will come from audits, pilots and negotiated contracts.
AI data centre workloads now sit at the intersection of technology architecture, finance and competitive positioning. Decisions made at this layer will shape margins and resilience over the coming decade.
Callosum has entered that debate with $10.25 million and a multi-chip thesis. The global market will determine whether orchestration becomes a new standard for AI data centre workloads or remains an alternative path within an Nvidia-led ecosystem.