
Hey data people— are you seeing what I’m seeing?
Building with LLMs daily, I’ve noticed a widening gap between how AI infrastructure actually works and how the broader world thinks it works. The fact that the infrastructure continues to transform and fracture, literally daily, doesn’t help. The unacknowledged truth is that we are in uncharted territory: the more commentators try to crystallize an understanding of AI's deepest underpinnings, the clearer it becomes that surface-layer applications are proliferating without anyone stopping to consider the infrastructure beneath them.

Our Unity data platform tracks the migration of talent and capital in near real-time, allowing us to instantly generate market maps like the one above (which I’ll walk through below). So while AI market maps are obsolete almost as soon as they're created, this one should at least be accurate today.
Along with building AI-driven tools for both our portfolio companies and the internal Ensemble team, I also get to study where capital is flowing across the tech stack, as investors race to figure out which layer will drive the next breakout. Here are some disconnects I've noted between the story in the press and the reality in the code:
- The AI stack doesn’t sit still: AI infrastructure is evolving daily — any “snapshot” is outdated the moment you publish it.
- Each element is its own brand new problem to solve: A breakdown of the key infrastructure layers powering LLMs today, from data to deployment.
- Is the money following the toughest problems?: Where capital is flowing across the stack — and what it says about where we’re headed next.
Everyone is racing to build on top of AI, but very few understand what they’re standing on.
As the models get better, the public (and often investors and founders) assume the underlying systems are getting more stable — that the hard part is behind us. To my mind, that’s a dangerous illusion. The infrastructure powering LLMs is not settling. It’s fragmenting, accelerating, and rewriting itself in real time. It’s not “figured out” — it’s actively evolving, often by the models themselves.
Unlocking value in this landscape doesn’t mean riding an obvious wave — it means navigating a fragmented, fast-moving system that no one fully understands, and piecing together a working solution before the standards settle. The companies and builders who figure it out first aren’t just lucky — they’re creating the foundation that others will eventually call obvious. The diversity of approaches and the sheer speed of evolution are themes echoed by founders, investors, and technologists in Google Cloud's recent 'Future of AI: Perspectives for Startups 2025' report, underscoring the strategic complexity involved.
AI adoption continues to accelerate across industries, with the Stanford AI Index Report 2025 noting that private investment in generative AI grew by 18.7% last year and business usage jumped 55% to reach 78% adoption. Reflecting this momentum, Morgan Stanley’s 1Q25 CIO survey found that AI/ML remains the top priority for CIOs and is now seen as the second most defensible area of IT spend, increasingly insulated from broader macro pressures.
With the eyes and dollars of the world squarely focused on building and implementing AI at the everyday user level, will we assume AI to be safe simply because there’s no way to meaningfully audit or explain what’s under the hood? If the infra keeps moving this fast — and the abstraction layers keep getting thicker — how long before we lose the plot entirely?
If you’re working on LLMs today, you’re building on a shape-shifting system.
Each element is its own brand new problem to solve
Since the release of ChatGPT, my work as a data scientist has given me a front-row seat to the chaos unfolding beneath the surface of AI. I’ve lived through the transition from established MLOps workflows to the fast-moving, often unstable world of LLM infrastructure. Building real applications with these tools means confronting a simple truth: every major challenge — from scaling to governance to hallucination control — now demands custom solutions. There is no unified standard. Every company is solving these problems differently, and the assumptions that once guided “best practices” in machine learning are being rewritten on the fly.
- Cost & Compute: Accessing/affording the massive processing power required for these systems
- Complexity & Scalability: Moving models from research to reliable, scalable systems
- Accuracy, Reliability & Hallucinations: Ensuring trustworthy, accurate outputs
- Latency: Delivering responses quickly enough for real-time applications
- Data Dilemmas: Managing data quality, storage, preparation, and governance on a large scale
- Explainability: Understanding and observing the “black box” nature of these systems
- Security & Privacy: Protecting systems from novel attacks, safeguarding sensitive data
- Rapid Evolution: Keeping pace with constant advancements in models, techniques, and tools
This fragmentation makes AI infrastructure both a massive opportunity and a massive risk. Each layer — compute, data, model deployment, security — has become its own frontier, with no agreed-upon blueprint for how to build, audit, or scale reliably. Companies aren’t competing within a shared framework; they’re writing their own playbooks as they go, on each of these fronts. Unlocking real value in this space means understanding that “infrastructure” today is not a stable foundation — it’s a moving target. Every market map you see is a temporary snapshot. Every system built today is a bet against tomorrow’s standards.
Companies building AI infrastructure tools attempt to tackle these challenges head-on. Some focus on point solutions targeting developers, while others, particularly those serving enterprises, aim for comprehensive, full-stack platforms. Segmenting this market is inherently difficult due to high feature overlap and the space's constant flux; companies often expand beyond their initial niche across the entire AI development lifecycle. Many of the early market maps in this space focused directly on LLMOps. There has been a recent shift towards all-encompassing AI infra maps, and now separate maps for Agents are emerging as companies move towards agentic solutions. Consequently, every market map will differ, and there is no single agreed-upon stack. (Note: This analysis focuses primarily on LLM-related infrastructure and excludes specialized solutions for computer vision, voice generation, and robotics.)
Mapping a moving target
For this map, we'll use the following broad segments:
- Data Layer: Infrastructure for managing, processing, storing and accessing data used by AI systems (including vector databases and synthetic data)
- Model Layer: Tools and platforms for training, developing, evaluating, deploying and monitoring AI systems
- Compute & Hardware: Access to and management of the underlying computing resources (GPUs, specialized hardware)
- Security & Privacy: Solutions addressing the unique security and privacy risks of AI
- Agents: Frameworks and infrastructure for building autonomous AI systems capable of planning and tool use
- Full Stack AI Development Platforms: Integrated environments aiming to cover the AI development lifecycle end-to-end
Funding Rounds by Infrastructure Segment
Q1 2015 - Q1 2025, Ensemble Proprietary Data
Investment trends highlight the Model Layer as the current funding leader, driven by the large foundation-model developers OpenAI, xAI, and Anthropic, trailed by the Data Layer and AI Dev Platforms. Looking at funding over time, the Data Layer was foundational early on for traditional ML, and the public release of ChatGPT in November 2022 ignited a broad funding peak in Q2 2023 that significantly elevated the Model Layer. Since then, Security & Privacy has garnered more attention with enterprise adoption, and recently, agentic solutions have seen a funding uptick.
Conclusion: Addressing Complexity in a Crowded, Evolving Market
The complexity of the AI lifecycle fuels strong demand for unified development platforms that streamline workflows. This dynamic space attracts both adapting MLOps leaders (Databricks, Weights & Biases) and newer LLM/Agent-native platforms (Vellum, Thread AI). The core challenge is the constantly evolving definition of "full-stack" AI, requiring vendors to rapidly integrate new paradigms like agentic workflows while maintaining platform stability and abstracting infrastructure complexity.
As AI moves firmly into production across businesses everywhere (Morgan Stanley's 1Q25 CIO survey shows 60% of CIOs expect to have a GenAI-based workload in production by the end of 2025), the challenges of cost, complexity, reliability, security, and the sheer pace of innovation are immense. Successfully bridging the gap between AI's potential and tangible business value hinges on selecting and implementing the right infrastructure components from this fragmented landscape.
This overview represents the first part of my exploration into the AI Infrastructure space. In future articles, I plan to delve deeper into specific segments and share my perspective on navigating this dynamic market. With AI technology advancing at breakneck speed, only those building upon dynamic and resilient infrastructure will maintain momentum and outperform the competition.
____
In the upcoming installments of this recurring series, we'll dissect specific battlegrounds and offer tactical insights for navigating this whirlwind. This market doesn't wait, and neither can our analysis. Stay tuned — the deep dives are coming.
If you or someone you know is building this future, please shoot us a note.
Lilly Vernor (lilly@ensemble.vc)
Gopi Sundaramurthy (gopi@ensemble.vc)
Ian Heinrich (ian@ensemble.vc)
Why Ensemble is backing Potato on their mission to deploy AI scientists
Potato collaborates with leading academic institutions and biotech companies to lower the cost of discovery and expand access to underexplored scientific questions, from rare diseases to materials chemistry. Beyond specific use cases, however, Potato plans to amplify scientific headcount and push a new paradigm of discovery by bringing agentic scientists into the engine room of innovation.
Loti AI expands beyond celebrities, launches free likeness protection for all
Originally built to protect celebrities' likenesses from harmful or contractually prohibited use, Loti is now making headlines with a bold move: launching free likeness protection for everyone.
Portfolio Headlines: ICON + Lennar double down, Saronic earns praise
ICON and Lennar announced expanding partnership, and Saronic continues to garner recognition.
Optimus to Optimist: The Next Generation of Primes Pt. 2
In the second part of a two-part series, Ensemble VC explores how emerging defense tech startups are reshaping the military-industrial landscape. With AI, structural reforms, and a changing geopolitical environment, new defense primes are rising to meet the challenges of modern warfare.
Optimus to Optimist: The Next Generation of Primes Pt. 1
In the first part of a two-part series on innovation in the defense industry, Ensemble VC examines the historical context behind the "military-industrial complex," and the failure of a consolidated set of prime contractors to innovate in line with adversaries despite no shortage of innovation in the larger economy.
The Future of Healthcare Tech Adoption: From Top-Down to Practitioner-First
Danny Freed and Blueprint are redefining healthcare tech adoption with practitioner-first AI tools that streamline workflows and prioritize patient care. By empowering providers instead of replacing them, they’re addressing the inefficiencies of top-down mandates to drive meaningful change in healthcare.