AI agents are moving beyond human assistance to operate independently, executing complex workflows without supervision. This shift has exposed a major infrastructure problem: cloud platforms built for web applications cannot handle the demands of autonomous software. AI agents generate thousands of micro-interactions with unpredictable scaling needs and require sub-second response times that traditional platforms struggle to deliver.
Charles Drappier, Co-founder and lead developer at Blaxel, has spent 15 years building cloud infrastructure, including a successful exit when ForePaaS was acquired by OVHcloud, Europe’s largest cloud provider. Now he’s applying that experience to create specialized infrastructure for AI agents. Blaxel recently raised $7.3 million in seed funding led by First Round Capital after graduating from Y Combinator’s Spring 2025 batch. The platform processes millions of agent requests daily with sub-25 millisecond boot times.
1. What fundamental differences between human-centric applications and AI agents make existing infrastructure inadequate?
Human-centric apps dominate today, but usage is already growing fast, and agents will accelerate that growth. If we keep the current model, cost (and energy consumption) will keep rising. The obvious antidote is serverless: consume resources on demand and densify workloads so more runs happen on less hardware. That’s been our philosophy from day one.
We still hit two gaps with today’s providers.
First, cold starts (~200–300 ms): in micro-service agent architectures, every agent action or tool call can trigger a new start, so that latency compounds quickly.
Second, global efficiency: to cut network time (not just cold starts) and handle bursts, you want the workload replicated close to where it will be called, ready without paying while idle, so you can route to the nearest region or fail over when one region is hot.
Traditional clouds let you do this, but they don’t make it global by default: you have to deploy to multiple regions yourself and build the front-door routing/failover logic.
Most agent teams would rather focus on agents than multi-region plumbing.
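To make that plumbing concrete, here is a minimal sketch of the front-door logic teams otherwise end up building themselves: route each call to the nearest healthy region, and fail over when one runs hot. The region table and probe data are illustrative assumptions, not any provider’s API.

```python
# Illustrative per-region state, e.g. fed by health probes and latency
# measurements from the caller's vantage point.
REGIONS = {
    "eu-west":  {"latency_ms": 12, "hot": False},
    "us-east":  {"latency_ms": 85, "hot": False},
    "ap-south": {"latency_ms": 140, "hot": False},
}

def pick_region(regions: dict) -> str:
    """Route to the nearest region that isn't saturated; fail over otherwise."""
    healthy = {name: r for name, r in regions.items() if not r["hot"]}
    if not healthy:
        # Every region is hot: degrade gracefully instead of failing.
        return min(regions, key=lambda n: regions[n]["latency_ms"])
    return min(healthy, key=lambda n: healthy[n]["latency_ms"])

# Normally the nearest region wins...
assert pick_region(REGIONS) == "eu-west"
# ...but when it runs hot, traffic fails over to the next-closest one.
REGIONS["eu-west"]["hot"] = True
assert pick_region(REGIONS) == "us-east"
```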
Where Blaxel is different: we auto-pause workloads and resume them almost instantly, so agents don’t pay while idle and don’t suffer real cold starts when they wake. The experience is seamless: no keep-warm hacks, no provisioned concurrency knobs, and no special routing logic, just smooth execution across thousands of micro-interactions, even under unpredictable spikes.
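To put rough numbers on both points, here is a back-of-the-envelope comparison using the figures mentioned above (200–300 ms cold starts versus ~25 ms resumes); the number of chained calls per run is an illustrative assumption.

```python
# Back-of-the-envelope: startup latency added to one agent workflow.
COLD_START_MS = 250   # midpoint of the 200-300 ms cold-start range above
RESUME_MS = 25        # snapshot-resume of a paused micro-VM

tool_calls = 40       # illustrative: one agent run chaining 40 micro-interactions

cold_total = tool_calls * COLD_START_MS / 1000
resume_total = tool_calls * RESUME_MS / 1000

print(f"cold starts:     {cold_total:.1f}s of pure startup latency per run")  # 10.0s
print(f"snapshot resume: {resume_total:.1f}s per run")                        #  1.0s
```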
2. Blaxel just raised $7.3 million and graduated from Y Combinator. What traction convinced investors that AI agents need specialized infrastructure, and what specific performance improvements have you demonstrated?
What convinced investors was how we were solving concrete developer needs. While our vision is to provide the full experience of agent infrastructure, three use cases came up again and again: secure sandboxes to execute code safely, infrastructure to run agents directly and expose them as services, and jobs for asynchronous or intensive background workloads.
Beyond the technical side, we also grew quickly in terms of adoption. New customers have come on fast, especially since we shifted to on-demand pricing. Customers who initially come in for a specific use case usually expand their usage across all products as they scale their agents.
Two axes of traction were especially convincing. Jobs have shown that our platform can handle very large, compute-intensive workloads without breaking efficiency. At the same time, sandboxes demonstrated that we can run and scale thousands of isolated environments concurrently, giving developers the confidence to execute untrusted code safely at massive scale. Together, these validated that our architecture isn’t just fast in theory: it scales seamlessly across very different agent workloads.
3. Drawing from your experience building ForePaaS and selling to OVH, how does developing infrastructure for AI agents differ from building traditional cloud platforms? What lessons from your previous exit influenced Blaxel’s architecture?
At ForePaaS we built a data platform where the focus was almost entirely on user experience and building blocks. What we didn’t control was the infrastructure layer underneath, and that taught us a clear lesson: a layer on top of another layer will always be slower and less efficient than owning the foundation. With Blaxel we wanted to avoid that dependency, to go deeper, so we could deliver both speed and efficiency.
From ForePaaS we also carried forward several key principles. A strong focus on developer experience, a clear separation of concerns between running workloads and administering them, true multi-tenancy, and the importance of building security into the architecture from day one. Those lessons shaped how we designed Blaxel’s platform.
4. Your platform processes millions of agent requests daily with sub-25 millisecond boot times. What are the key technical innovations that enable AI agents to operate autonomously at this scale?
For developers and businesses, the impact is simple: agents can respond instantly, scale globally, and operate reliably without developers worrying about infrastructure bottlenecks. That’s what makes autonomous workflows actually usable in production: latency doesn’t pile up, workloads don’t stall under load, and performance is predictable even at very large scale.
Under the hood, the foundation of our performance is micro-VM technology. From the start we focused on Firecracker, and we’ve optimized every aspect of it, from network binding to snapshotting, to make agent workloads boot in just a few milliseconds. That’s not something you get out of the box; it’s the result of continuously tuning the runtime.
But that’s only half the job, because this had to work not just in a lab but across regions worldwide. We designed the system so it can replicate globally and scale to hundreds of thousands of concurrent instances with the same predictability, making it possible to process millions of short-lived workloads every day without sacrificing latency.
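For readers who want a feel for the underlying primitive: Firecracker exposes a REST API over a unix socket, and pause, snapshot, and restore are each a single call against its documented endpoints. The sketch below shows that flow in Python; socket paths and file locations are illustrative, and the production tuning Drappier describes goes far beyond this.

```python
import http.client
import json
import socket

class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTP over a unix domain socket, which is how Firecracker serves its API."""
    def __init__(self, socket_path: str):
        super().__init__("localhost")
        self.socket_path = socket_path

    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)

def api(sock_path: str, method: str, path: str, body: dict) -> None:
    conn = UnixHTTPConnection(sock_path)
    conn.request(method, path, json.dumps(body),
                 {"Content-Type": "application/json"})
    resp = conn.getresponse()
    assert resp.status // 100 == 2, resp.read()

# Pause the running VM, then take a full snapshot (vmstate + guest memory).
api("/tmp/fc1.sock", "PATCH", "/vm", {"state": "Paused"})
api("/tmp/fc1.sock", "PUT", "/snapshot/create", {
    "snapshot_type": "Full",
    "snapshot_path": "/snaps/agent.vmstate",
    "mem_file_path": "/snaps/agent.mem",
})

# Later, a fresh Firecracker process loads the snapshot and resumes execution
# exactly where it left off; this is the operation measured in milliseconds.
api("/tmp/fc2.sock", "PUT", "/snapshot/load", {
    "snapshot_path": "/snaps/agent.vmstate",
    "mem_backend": {"backend_type": "File", "backend_path": "/snaps/agent.mem"},
    "resume_vm": True,
})
```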
5. AI agents often require unpredictable resource scaling – from minimal computing while waiting to intensive processing during active tasks. How does Blaxel’s infrastructure handle these variable demands differently than traditional serverless platforms?
Traditional serverless platforms like AWS Lambda charge per request and tear the runtime down afterward, which works for stateless, human-triggered functions but isn’t ideal for agent workflows that chain many short calls. In our model, you pay for the runtime of a container or micro-VM, which makes it more efficient for continuous or modular agent workloads.
The key difference is that with Blaxel we can snapshot and pause any workload when it’s idle, then resume it in under 25 ms. That means you don’t pay while nothing is happening, but the agent can continue exactly where it left off without an annoying cold start. On top of that, we designed the backend to bootstrap new resources easily as demand grows, so bursts of activity don’t create bottlenecks.
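A rough illustration of why that billing model matters for bursty agents; the activity numbers below are made-up assumptions chosen only to show the shape of the comparison.

```python
# Illustrative billed compute for one agent over a day, under two models.
# Numbers are invented for the sake of the comparison.
active_seconds_per_burst = 3
bursts_per_day = 200                      # sporadic tool calls and replies
always_on_seconds = 24 * 3600             # keep a runtime warm all day

pay_while_running = active_seconds_per_burst * bursts_per_day  # paused when idle
print(f"always-on runtime billed: {always_on_seconds:>6} s/day")   # 86400 s
print(f"pause/resume billed:      {pay_while_running:>6} s/day")   #   600 s
```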
6. Security and compliance are critical for enterprise adoption. How do you balance the autonomous nature of AI agents with enterprise requirements like SOC2, HIPAA compliance, and data residency controls?
As an infrastructure provider we are compliant with SOC 2 and HIPAA. That said, compliance is always a shared responsibility: we ensure the infrastructure is secure and certified, but users still need to apply good practices in how they handle data and design their agents.
Our job is to make sure the infrastructure is a no-brainer when it comes to securing the workload. That’s why we based our core compute engine on micro-VMs rather than containers: VMs provide a stronger level of isolation, while a correctly prompted LLM can escape from a container with relative ease.
We also provide residency controls. For stateless workloads, customers can govern where they run via workspace deployment policies, for example restricting them to a country or a continent. For stateful workloads, deployments are pinned by default to a single region chosen by the user. This way, developers can align with enterprise or regulatory requirements without being blocked by infrastructure limits.
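As a sketch of what such controls can look like in practice, here is a hypothetical policy object; the field names are illustrative, not Blaxel’s actual schema.

```python
# Hypothetical workspace deployment policy, just to illustrate the shape:
# stateless workloads constrained to a continent, stateful ones pinned
# to one region. Field names are illustrative, not an actual schema.
policy = {
    "workspace": "acme-agents",
    "stateless": {"allowed_locations": ["eu"]},   # run anywhere in Europe
    "stateful":  {"pinned_region": "eu-west-1"},  # single region by default
}
```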
7. Your team consists of six co-founders who previously built together. How does this collaborative foundation influence your approach to building developer tools and infrastructure for AI agent builders?
We’ve been working together for years; I personally started working with Paul 13 years ago. Each of us has a clear domain of expertise, so when we started Blaxel it was natural to set up an efficient organization without friction. That’s why we’ve been able to move so quickly: we know each other’s strengths, and we can trust each other’s decisions.
By default we divide responsibilities by role and expertise, but as a startup we also share a lot and support each other across the product. We all want Blaxel to be the best thing we’ve built, and that shared ambition keeps collaboration smooth.
The influence on the product is direct: because we aren’t slowed down by internal alignment, we can spend more time obsessing over developer experience. That means asking at every step: how do we make this easier, faster, and safer for agent developers?
8. With one customer reportedly running over 1 billion seconds of agent runtime monthly, what performance optimizations and cost efficiencies have proven most valuable for AI-first companies?
That customer came to us with a workload that could consume more than a billion seconds of runtime monthly, and fitting it onto another cloud had been a real pain point for them.
We worked closely with them in the early days: the code ran on our platform easily, but they also needed advice on how to make a workload of that size efficient. The real benefit was being able to run everything in one place at scale. Before, they were splitting execution between local setups and remote options, and it wasn’t sustainable in terms of cost or operations.
On the cost side, the key was optimization rather than gimmicks. Too many overlay layers reduce performance, and for CPU and memory-intensive workloads that overhead adds up quickly. By running inside optimized micro-VMs, they got the scale and efficiency they couldn’t reach before.
Another important benefit was visibility. On Blaxel, they gained a clear view of execution, errors, and logs, which made it much easier to operate workloads of this size and complexity.
9. As AI agents become more prevalent across industries, how do you see the infrastructure requirements evolving? What capabilities are you building now to prepare for the next wave of agentic AI?
Right now many agent projects are still monolithic, but they’re starting to grow. We see a clear move toward structured, microservice-oriented systems with efficient deployment and global distribution. That’s already happening, and it’s where Blaxel is focused today.
The next step will be to give agents all the tools they need inside a structured environment. Today most developers rely on large external LLM providers like OpenAI or Anthropic. That works to get started fast, but it adds latency overhead and often provides more capability than a specific use case requires. As enterprise-grade agentic systems roll out at scale, we believe the next wave will bring fine-tuned smaller models running close to the agent itself. That will make agents both faster and more cost-efficient.
Of course, none of this can work at scale without clear observability. Having deep visibility into how agents and their models run in production is critical for reliability, optimization, and enterprise adoption.
10. For developers currently building AI agents on traditional cloud platforms, what are the most critical migration considerations when moving to infrastructure designed specifically for autonomous AI systems?
For most developers the migration is straightforward. We provide a Docker interface, so if your workloads already run in containers the transition is close to effortless. At the same time we keep expanding compatibility so developers don’t have to rethink everything to get started.
Our focus on user experience is making that move effortless while still giving you the strong security guarantees of micro-VM isolation. And for those not working directly with Docker, we also provide SDKs to deploy standard JavaScript, TypeScript, and Python agents easily. The idea is to remove friction so developers trying us out aren’t lost compared to how they usually build.
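To make the migration unit concrete: what you ship is just a containerized HTTP service wrapping your agent loop. Below is a minimal, framework-free sketch of that shape; the route and the handler are placeholders, not a required interface.

```python
# Minimal shape of a deployable agent: a containerized HTTP service.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def run_agent(task: str) -> str:
    """Placeholder for your actual agent loop (LLM calls, tools, etc.)."""
    return f"done: {task}"

class AgentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        result = run_agent(body["task"])
        payload = json.dumps({"result": result}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), AgentHandler).serve_forever()
```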
And then there is everything that agentic systems will do differently, which people need to adapt to. People must be careful not to transfer architectures and biases from the SaaS/web era into agentic development. One simple example is communication protocols: as AIs increasingly operate systems, protocols like MCP should be the standard rather than a bonus feature. This is why Blaxel can be operated via tool calls through an MCP server.
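As an illustration, the official MCP Python SDK makes exposing an operation as an agent-callable tool a few lines; the tool body below is a stub, not Blaxel’s actual MCP surface.

```python
# Sketch: exposing an infrastructure operation as an MCP tool, so agents
# can drive it via tool calls. Uses the official MCP Python SDK; the tool
# body is a stub standing in for a real API call.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("infra-tools")

@mcp.tool()
def create_sandbox(image: str) -> str:
    """Provision an isolated sandbox from a container image and return its id."""
    return f"sandbox-{abs(hash(image)) % 10_000}"  # placeholder id

if __name__ == "__main__":
    mcp.run()  # serves over stdio by default
```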
11. Looking ahead, you’ve mentioned that “hundreds of billions of agents are coming.” How is Blaxel positioning itself for this massive scale, and what role will infrastructure play in enabling truly autonomous AI operations?
We’re building an infrastructure that’s ready to scale to hundreds of billions of agents. On the technical side, we’ve designed the platform so that scaling itself isn’t the bottleneck. The bigger challenge is to keep up with developer needs, making sure that as agent workloads grow, we provide the right services directly in the platform.
With traditional clouds you get a wide catalog of services, but many new providers don’t go that far. Developers then end up stitching multiple platforms together, and that quickly creates pain around security, operations, and efficiency. Our focus at Blaxel is to avoid that trap. We want to give developers an efficient experience that hides a very powerful infrastructure, so they can build and scale agents without fragmentation or missing pieces.