28th May 2026 AI Startup Update: Inference Provider Baseten in Talks for $1B at $11B Valuation

The AI gold rush is shifting from dazzling tools to industrial infrastructure. As inference becomes the engine room of finance, cyber security and enterprise decision-making, the real winners will be those that make intelligence fast, reliable, scalable and affordable, not merely impressive.

Photo by Steve A Johnson

AI Is Moving From Market Tool To Market Infrastructure

The AI gold rush is no longer just about training giant models. The real contest is shifting to inference: the expensive, high-volume business of running those models continuously for real users, real companies and real-time decisions.

Founded in 2019, Baseten focuses exclusively on AI inference infrastructure—the plumbing required to deploy and run AI models in production without the latency, reliability, and cost issues that plague large-scale deployments. The platform supports open-source, custom, and fine-tuned models across cloud, hybrid, and self-hosted environments.

That is why Baseten's reported talks to raise US$1 billion at a US$11 billion valuation matter. If completed, the deal would more than double the San Francisco startup’s valuation in less than 90 days, following its US$300 million Series E round at a US$5 billion valuation, backed by investors including Nvidia, IVP and CapitalG.
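
For readers who want to check the "more than double" claim, the arithmetic can be sketched in a few lines of Python. This is a minimal illustration using only the two valuation figures reported above; the function name `valuation_multiple` is a hypothetical helper, and the US$11 billion figure reflects reported talks, not a closed deal.

```python
def valuation_multiple(previous_usd_bn: float, current_usd_bn: float) -> float:
    """Return how many times larger the current valuation is than the previous one."""
    if previous_usd_bn <= 0:
        raise ValueError("previous valuation must be positive")
    return current_usd_bn / previous_usd_bn


# Figures in US$ billion, taken from the article.
series_e_valuation = 5.0    # Series E round
reported_valuation = 11.0   # reported talks, not a completed deal

multiple = valuation_multiple(series_e_valuation, reported_valuation)
print(f"Valuation multiple: {multiple:.1f}x")
```

On these figures the step-up works out to roughly 2.2 times the Series E valuation.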

Baseten is not selling another chatbot. It is building the infrastructure layer that allows companies to deploy and operate AI models in production with lower latency, stronger reliability and better cost control. In simple terms, it wants to become the “AWS for inference”.

That ambition is important because inference is quickly becoming the true operating cost of artificial intelligence. Every AI search, coding request, customer service response, market summary, portfolio query, compliance check, cyber alert or legal review is an inference event. At scale, those events become infrastructure. They become margin. They become power.

The urgency surrounding inference infrastructure is driven by sheer volume. Industry analysts forecast that inference will represent two-thirds of all AI compute demand by the end of 2026, up from just one-third in 2023. This shift was recently accelerated by the release of advanced reasoning models, which demand significantly higher performance efficiency to keep serving costs viable.

"Baseten's valuation trajectory from $5B to $11B in three months signals that inference infrastructure is now treated as platform-class, not a commodity layer beneath model APIs," noted recent industry analysis of the impending deal. "For AI founders and technical leaders, this reframes the build-versus-buy calculus on inference: specialized providers are commanding premium multiples, which changes the risk profile of rolling custom solutions."

This is where the Reflexivity story connects. The New York-based AI investment platform recently raised US$30 million in Series B funding led by Greycroft and Interactive Brokers. Reflexivity represents the front-end intelligence layer: AI tools that help investors analyse portfolios, test scenarios, read market signals and compress research workflows.

Baseten represents the engine room underneath that future. Reflexivity shows how AI is entering financial decision-making. Baseten shows what must exist behind the curtain to make that intelligence fast, reliable and commercially viable.

That is the bigger story. AI is moving from feature to foundation.

For financial services, the implications are serious. Investors, brokers, wealth platforms and analysts are drowning in data: earnings calls, corporate filings, broker notes, economic releases, geopolitical shocks, social sentiment and live market moves. AI can turn that noise into usable intelligence, but only if the infrastructure can support millions of high-quality responses at speed.

This is why inference has become a platform business. The companies that control inference will influence the economics of AI adoption across finance, cyber security, law, healthcare, media and enterprise software. They will decide how cheaply intelligence can be delivered, how quickly decisions can be supported and how deeply AI can be embedded into everyday work.

Nvidia’s role is also strategic. By backing infrastructure players such as Baseten, Nvidia is not simply selling chips into the AI boom. It is extending its influence across the full stack, from model training to real-world deployment. That strengthens its position at a time when every serious AI company is fighting for compute, speed and efficiency.

The next pressure point will be the hyperscalers. AWS, Microsoft Azure and Google Cloud will not stand aside if inference becomes one of the most valuable profit pools in AI. They have the capital to bundle, discount and defend their platforms. Specialist players like Baseten will need to prove that better performance, developer loyalty and purpose-built infrastructure can withstand a price war.

Why does it matter?

Because the AI boom is growing up.

The first phase was defined by wonder: chatbots, copilots, image generators and the race to show what large models could do. The next phase is harder, more expensive and far more consequential. It is about cost, scale, reliability and control.

That is the defining test for specialist players such as Baseten. The cloud giants have the capital to bundle inference, discount it and pull customers deeper into their ecosystems; the open question is whether purpose-built infrastructure, speed and developer loyalty can withstand that gravitational force.

This is the industrialisation of AI. Intelligence is no longer just a feature on a screen. It is becoming the machinery behind finance, cyber security, law, media, software development and enterprise decision-making. The winners will be those that make intelligence reliable, repeatable, fast and affordable. That is the shift Cyber News Centre is watching closely: AI is moving from tools people use to infrastructure economies depend on.

