
Hey data people— are you seeing what I’m seeing?
Building with LLMs daily, I’ve noticed a widening gap between how AI infrastructure actually works and how the broader world thinks it works. The fact that the infrastructure continues to transform and fracture, literally daily, doesn’t help. The unacknowledged truth is that we are in uncharted territory: the more commentators try to crystallize an understanding of AI's deepest underpinnings, the clearer it becomes that surface-layer applications are proliferating without anyone stopping to consider the infrastructure beneath them.

Our Unity data platform tracks the migration of talent and capital in near real-time, allowing us to instantly generate market maps like the one above (which I’ll walk through below). So while AI market maps are obsolete almost as soon as they're created, this one should at least be accurate today.
Along with building AI-driven tools for both our portfolio companies and the internal Ensemble team, I also get to study where capital is flowing across the tech stack, as investors race to figure out which layer will drive the next breakout. Here are some disconnects I've noted between the story in the press and the reality in the code:
- The AI stack doesn’t sit still: AI infrastructure is evolving daily — any “snapshot” is outdated the moment you publish it.
- Each element is its own brand new problem to solve: A breakdown of the key infrastructure layers powering LLMs today, from data to deployment.
- Is the money following the toughest problems?: Where capital is flowing across the stack — and what it says about where we’re headed next.
Everyone is racing to build on top of AI, but very few understand what they’re standing on.
As the models get better, the public (and often investors and founders) assume the underlying systems are getting more stable — that the hard part is behind us. To my mind, that’s a dangerous illusion. The infrastructure powering LLMs is not settling. It’s fragmenting, accelerating, and rewriting itself in real time. It’s not “figured out” — it’s actively evolving, often by the models themselves.
Unlocking value in this landscape doesn’t mean riding an obvious wave — it means navigating a fragmented, fast-moving system that no one fully understands, and piecing together a working solution before the standards settle. The companies and builders who figure it out first aren’t just lucky — they’re creating the foundation that others will eventually call obvious. The diversity of approaches and the sheer speed of evolution are themes echoed by founders, investors, and technologists in Google Cloud's recent 'Future of AI: Perspectives for Startups 2025' report, underscoring the strategic complexity involved.
AI adoption continues to accelerate across industries, with the Stanford AI Index Report 2025 noting that private investment in generative AI grew by 18.7% last year and business usage jumped 55% to reach 78% adoption. Reflecting this momentum, Morgan Stanley’s 1Q25 CIO survey found that AI/ML remains the top priority for CIOs and is now seen as the second most defensible area of IT spend, increasingly insulated from broader macro pressures.
With the eyes and dollars of the world squarely focused on building and implementing AI at the everyday user level, will we assume AI to be safe simply because there’s no way to meaningfully audit or explain what’s under the hood? If the infra keeps moving this fast — and the abstraction layers keep getting thicker — how long before we lose the plot entirely?
If you’re working on LLMs today, you’re building on a shape-shifting system.
Each element is its own brand new problem to solve
Since the release of ChatGPT, my work as a data scientist has given me a front-row seat to the chaos unfolding beneath the surface of AI. I’ve lived through the transition from established MLOps workflows to the fast-moving, often unstable world of LLM infrastructure. Building real applications with these tools means confronting a simple truth: every major challenge — from scaling to governance to hallucination control — now demands custom solutions. There is no unified standard. Every company is solving these problems differently, and the assumptions that once guided “best practices” in machine learning are being rewritten on the fly.
- Cost & Compute: Accessing/affording the massive processing power required for these systems
- Complexity & Scalability: Moving models from research to reliable, scalable systems
- Accuracy, Reliability & Hallucinations: Ensuring trustworthy, accurate outputs
- Latency: Delivering responses quickly enough for real-time applications
- Data Dilemmas: Managing data quality, storage, preparation, and governance on a large scale
- Explainability: Understanding and observing the “black box” nature of these systems
- Security & Privacy: Protecting systems from novel attacks, safeguarding sensitive data
- Rapid Evolution: Keeping pace with constant advancements in models, techniques, and tools
This fragmentation makes AI infrastructure both a massive opportunity and a massive risk. Each layer — compute, data, model deployment, security — has become its own frontier, with no agreed-upon blueprint for how to build, audit, or scale reliably. Companies aren’t competing within a shared framework; they’re writing their own playbooks as they go, on each of these fronts. Unlocking real value in this space means understanding that “infrastructure” today is not a stable foundation — it’s a moving target. Every market map you see is a temporary snapshot. Every system built today is a bet against tomorrow’s standards.
Companies building AI infrastructure tools attempt to tackle these challenges head-on. Some focus on point solutions targeting developers, while others, particularly those serving enterprises, aim for comprehensive, full-stack platforms. Segmenting this market is inherently difficult due to high feature overlap and the space's constant flux; companies often expand beyond their initial niche across the entire AI development lifecycle. Many of the early market maps in this space focused directly on LLMOps. There has been a recent shift towards all-encompassing AI infra maps, and now separate maps for Agents are emerging as companies move towards agentic solutions. Consequently, every market map will differ, and there is no single agreed-upon stack. (Note: This analysis focuses primarily on LLM-related infrastructure and excludes specialized solutions for computer vision, voice generation, and robotics.)
Mapping a moving target
For this map, we'll use the following broad segments:
- Data Layer: Infrastructure for managing, processing, storing and accessing data used by AI systems (including vector databases and synthetic data)
- Model Layer: Tools and platforms for training, developing, evaluating, deploying and monitoring AI systems
- Compute & Hardware: Access to and management of the underlying computing resources (GPUs, specialized hardware)
- Security & Privacy: Solutions addressing the unique security and privacy risks of AI
- Agents: Frameworks and infrastructure for building autonomous AI systems capable of planning and tool use
- Full Stack AI Development Platforms: Integrated environments aiming to cover the AI development lifecycle end-to-end
Funding Rounds by Infrastructure Segment
Q1 2015 - Q1 2025, Ensemble Proprietary Data
Investment trends highlight the Model Layer as the current funding leader, driven by the large foundation-model developers OpenAI, xAI, and Anthropic, trailed by the Data Layer and AI Dev Platforms. Looking at funding over time, the Data Layer was foundational early on for traditional ML, and the public release of ChatGPT in November 2022 ignited a broad funding peak in Q2 2023 that significantly elevated the Model Layer. Since then, Security & Privacy has garnered more attention with enterprise adoption, and recently, agentic solutions have seen a funding uptick.
Conclusion: Addressing Complexity in a Crowded, Evolving Market
The complexity of the AI lifecycle fuels strong demand for unified development platforms that streamline workflows. This dynamic space attracts both adapting MLOps leaders (Databricks, Weights & Biases) and newer LLM/Agent-native platforms (Vellum, Thread AI). The core challenge is the constantly evolving definition of "full-stack" AI, requiring vendors to rapidly integrate new paradigms like agentic workflows while maintaining platform stability and abstracting infrastructure complexity.
As AI moves firmly into production across businesses everywhere (Morgan Stanley's 1Q25 CIO survey shows 60% of CIOs expect to have a GenAI-based workload in production by the end of 2025), the challenges of cost, complexity, reliability, security, and the sheer pace of innovation are immense. Successfully bridging the gap between AI's potential and tangible business value hinges on selecting and implementing the right infrastructure components from this fragmented landscape.
This overview represents the first part of my exploration into the AI Infrastructure space. In future articles, I plan to delve deeper into specific segments and share my perspective on navigating this dynamic market. With AI technology advancing at breakneck speed, only those building upon dynamic and resilient infrastructure will maintain momentum and outperform the competition.
____
In the upcoming installments of this recurring series, we'll dissect specific battlegrounds and offer tactical insights for navigating this whirlwind. This market doesn't wait, and neither can our analysis. Stay tuned — the deep dives are coming.
If you or someone you know is building this future, please shoot us a note.
Lilly Vernor (lilly@ensemble.vc)
Gopi Sundaramurthy (gopi@ensemble.vc)
Ian Heinrich (ian@ensemble.vc)
Why Ensemble is backing Potato on their mission to deploy AI scientists
Potato collaborates with leading academic institutions and biotech companies to lower the cost of discovery and expand access to underexplored scientific questions, from rare diseases to materials chemistry. Beyond specific use cases, however, Potato plans to amplify scientific headcount and push a new paradigm of discovery by bringing agentic scientists into the engine room of innovation.
Loti AI expands beyond celebrities, launches free likeness protection for all
Originally built to protect celebrities' likenesses from harmful or contractually prohibited use, Loti is now making headlines with a bold move: launching free likeness protection for everyone.
Portfolio Headlines: ICON + Lennar double down, Saronic earns praise
ICON and Lennar announced expanding partnership, and Saronic continues to garner recognition.
Optimus to Optimist: The Next Generation of Primes Pt. 2
In the second part of a two-part series, Ensemble VC explores how emerging defense tech startups are reshaping the military-industrial landscape. With AI, structural reforms, and a changing geopolitical environment, new defense primes are rising to meet the challenges of modern warfare.
Optimus to Optimist: The Next Generation of Primes Pt. 1
In the first part of a two-part series on innovation in the defense industry, Ensemble VC examines the historical context behind the "military-industrial complex," and the failure of a consolidated set of prime contractors to innovate in line with adversaries despite no shortage of innovation in the larger economy.
The Future of Healthcare Tech Adoption: From Top-Down to Practitioner-First
Danny Freed and Blueprint are redefining healthcare tech adoption with practitioner-first AI tools that streamline workflows and prioritize patient care. By empowering providers instead of replacing them, they’re addressing the inefficiencies of top-down mandates to drive meaningful change in healthcare.