The Questions AI Coding Agents Are Forcing Me to Ask
As I help my teams adopt AI coding agents, I'm grappling with fundamental questions about productivity, team dynamics, architecture, and infrastructure that don't have easy answers yet.

It All Starts with People and Teams
The Productivity Divide and Team Dynamics
AI coding agents are delivering remarkable productivity gains to engineering teams, but how do we ensure these benefits strengthen rather than fracture our teams? As some engineers quickly master AI tools and achieve unprecedented output levels, we're seeing the emergence of new productivity dynamics that challenge traditional team structures. While this newfound individual speed is exciting—who doesn't love engineers who can implement features at speeds previously unthinkable—it raises important questions about team collaboration. How do we maintain effective code reviews and planning sessions when team members are operating at fundamentally different velocities? What happens to our established mentorship patterns if AI expertise becomes more valuable than traditional seniority? How do we prevent the natural human reactions of resentment or imposter syndrome from undermining team cohesion? The opportunity is immense: teams that successfully navigate this transition could see productivity gains that will improve the odds of survival. But the challenge is equally significant—ensuring that as individual engineers become more powerful, the team as a whole becomes stronger rather than more divided.
The Lean Team Paradox: Higher Productivity, Higher Risk
AI coding agents present us with an enticing proposition: maintain the same output with fewer engineers. But I've been wondering if this apparent efficiency is actually setting our teams up for failure. While the immediate financial appeal of reducing headcount is obvious, I keep asking myself deeper questions about what makes teams truly productive over the long term. When we reduce team size while maintaining velocity, each remaining engineer becomes disproportionately critical to our system's function—what happens when our most AI-skilled developer takes a vacation or, worse, joins a competitor? I've spent my entire career working on small teams, trying to produce outsized impact with minimal headcount. I genuinely believe in the power of small, focused teams—there's something magical about the velocity and innovation that emerges when you have the right people working closely together. But here's what I've learned through years of building lean teams: the math is unforgiving. Even in traditional development, a single sick day can crater velocity, and an unexpected departure transforms routine recruitment into crisis management with no time for proper onboarding or knowledge transfer. Now, with AI agents amplifying individual productivity, these dynamics become even more extreme.
What I'm finding most concerning is that AI agents amplify rather than eliminate the need for human guidance and institutional memory. These tools require context, direction, and oversight, all of which become more fragile as teams shrink. The engineers who remain often find themselves under constant pressure, feeling that their absence significantly impacts the entire organization's output, which likely accelerates burnout. The key question I'm wrestling with isn't whether AI can help us build more with fewer people—it clearly can. The question is how big or small my existing team should become. I'm increasingly convinced that AI's productivity gains should be grounded in making more resilient, innovative teams rather than simply cutting costs. What if we used AI to give our teams the breathing room to tackle bigger challenges, invest in quality, and maintain the necessary margin that healthy systems require?
I want to find the right balance. AI should make small teams more capable and sustainable, not more fragile.
From Coder to Conductor
The Great Reversal: Becoming an AI Conductor
The fundamental nature of what I do every day is changing. For my entire career, software engineering has been about writing code—thinking through problems, crafting solutions, and iterating on implementations. Now I find myself spending more time reviewing, guiding, and orchestrating AI-generated code than writing it myself.
My daily routine now involves being a part-time AI conductor, directing and coordinating multiple coding agents rather than doing the hands-on implementation myself. I made a similar transition when I moved from being an individual contributor to a manager, but this feels different. It's forcing me to rethink everything about how I work and manage: my quality gates need to evolve beyond traditional code review and personal mentorship to include validating AI reasoning in much the same way I evaluate human decision-making. I'm constantly asking myself what decisions I should make versus what I can delegate to AI or my team—and that boundary keeps shifting as the models get stronger.
The Context Switching Crisis
I am also seeing and feeling an increase in the disruptiveness of context switching. As we use more agents in an attempt to parallelize work, managing all of that context becomes even more challenging. I'm asking my brain to track multiple simultaneous implementations, each with its own state and progress, while still handling all the other responsibilities I have always had: meetings, architectural decisions, team coordination. The problem is that we humans are fundamentally not great multitaskers. When I am working with an agent that is mid-task on a complex implementation and I need to shift focus to a meeting or another agent's work, the cost of interruption isn't just losing my train of thought—it's potentially losing the agent's context and momentum as well. I find myself trying to maintain mental models of multiple parallel workstreams in a way that feels unsustainable.
Architecture in the Age of AI
The Dependency Paradox
One of the most intriguing shifts I'm observing is how AI coding agents are changing my relationship with external dependencies. For years, the conventional wisdom has been to leverage existing libraries and frameworks rather than reinventing the wheel. But AI agents excel at generating small, tailored pieces of code that do exactly what you need—no more, no less. This raises a fascinating question: if an AI can quickly write a custom 50-line integration that replaces a heavyweight library, should we still default to external dependencies?
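To make that concrete, here's a minimal sketch of the kind of purpose-built code an agent might generate in place of a full HTTP client and retry library; the function, endpoint, and retry policy are illustrative assumptions, not a recommendation.

# A purpose-built fetch-with-retry helper using only the standard library,
# sketching what an agent might generate instead of adding a client + retry dependency.
# The timeout, backoff policy, and example URL below are illustrative assumptions.
import json
import time
import urllib.request


def fetch_json(url: str, retries: int = 3, backoff_seconds: float = 1.0) -> dict:
    """GET a JSON document, retrying transient failures with exponential backoff."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(url, timeout=10) as response:
                return json.loads(response.read().decode("utf-8"))
        except OSError:  # covers URLError and socket timeouts
            if attempt == retries - 1:
                raise  # out of retries; surface the last error to the caller
            time.sleep(backoff_seconds * (2 ** attempt))
    raise RuntimeError("retries must be at least 1")


# Hypothetical usage: status = fetch_json("https://api.example.com/status")

The appeal is obvious: twenty lines you fully understand, no transitive dependencies. The catch, of course, is that you now own those twenty lines forever.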
I find myself increasingly torn between the traditional benefits of established libraries—community support, battle-tested code, ongoing maintenance—and the appeal of lean, purpose-built solutions that AI can generate on demand. There's something compelling about having code that does precisely what your application requires without the bloat of features you'll never use. But this approach also introduces new maintenance burdens and the risk of fragmenting codebases with countless small, custom implementations.
The trade-offs aren't straightforward. While custom code can reduce external dependencies and potential security vulnerabilities, it also means taking on the responsibility for ongoing maintenance and updates that the open-source community would otherwise handle. The question becomes: are we optimizing for the short-term development speed that AI provides, or the long-term sustainability that established dependencies offer?
Repository Strategy in an AI World
The debate between monolithic repositories and distributed codebases takes on new dimensions when AI agents enter the picture. I've been grappling with whether AI tools work better with focused, smaller codebases where they can fully understand the scope and constraints, or comprehensive, larger repositories where they have access to the full context of interconnected systems.
What I'm finding particularly interesting is how this affects team coordination. If we're using multiple AI agents across different repositories, how do we ensure they're making consistent architectural decisions? Conversely, in a monorepo environment, AI agents might be better positioned to spot optimization opportunities and maintain consistency across the entire system. The context window limitations of current AI models add another layer of complexity—there's likely a sweet spot where you have enough context to be useful without overwhelming the system with irrelevant information. But as models get bigger and better, this dynamic shifts.
Code Quality and Consistency
Managing code quality when AI agents are generating significant portions of your codebase presents unique challenges. Traditional code reviews assumed human authorship, but now I'm reviewing code that I didn't write and whose reasoning I may not immediately understand. This requires developing new review strategies that go beyond syntax and style to include validation of the AI's approach and decision-making process.
Maintaining consistency across AI-generated code is particularly tricky, and I am still not sure what to make of it. While AI agents can follow style guides, they may interpret requirements differently or make subtly different architectural choices for similar problems. This is remarkably similar to what happens with human engineers, but I keep asking myself how much is actually changing and whether we need to react.
This is leading me to think we need to commit more than just code to our repositories. I'm increasingly convinced that we should be committing the prompts and key summary points that developers used during AI-assisted development. When a future developer (or AI agent) needs to modify or extend a piece of functionality, understanding the original intent and approach becomes critical. Without the prompts that guided the initial implementation, we lose valuable context about the problem-solving process and the constraints that shaped the solution.
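As a sketch of what this could look like in practice, here's a small helper that writes a prompt-provenance record alongside the code it shaped; the .ai/ directory, field names, and file layout are hypothetical, just one possible convention.

# Hypothetical sketch: record the prompt and key decisions that shaped an
# AI-assisted change, so future readers (human or agent) can recover the intent.
# The .ai/ directory and field names are assumptions, not an established standard.
import json
import pathlib
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass
class PromptRecord:
    feature: str               # what the change was for
    model: str                 # which model/agent produced the code
    prompt: str                # the prompt (or prompt summary) that guided it
    key_decisions: list[str]   # constraints and trade-offs worth remembering


def save_record(record: PromptRecord, repo_root: str = ".") -> pathlib.Path:
    """Write the record as JSON under .ai/, ready to be committed with the code."""
    out_dir = pathlib.Path(repo_root) / ".ai"
    out_dir.mkdir(exist_ok=True)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    path = out_dir / f"{stamp}-{record.feature}.json"
    path.write_text(json.dumps(asdict(record), indent=2))
    return path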
Infrastructure for AI-Native Development
Current Platform Limitations
The platforms we've relied on for years—GitHub, GitLab, and similar tools—were designed with human development rhythms in mind. They assume humans are the bottleneck in code generation, not code review and decision-making. But when AI agents can generate code at machine speed, these assumptions break down quickly. Could it be that existing review workflows will struggle to handle the volume and pace of AI-generated changes? Pull request queues that once moved at a steady human pace will now crumble under AI output, creating review bottlenecks that the platforms weren't designed to handle. The bottlenecks don't stop at code review—they extend deep into our CI/CD pipelines. When the rate of code changes increases dramatically, the speed of your continuous integration becomes a massive constraint. Test suites that seemed reasonably fast for human development velocity suddenly become painfully slow when AI agents are submitting an order of magnitude more changes. Build pipelines that were optimized for a few deployments per day will struggle under the load of constant AI-generated updates, creating queues and delays that may negate much of the productivity gain AI promised to deliver.
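A rough back-of-the-envelope (with made-up numbers) shows why this bites: once changes arrive faster than the pipelines can service them, the backlog doesn't just slow things down, it grows without bound.

# Back-of-the-envelope CI capacity check with assumed numbers: if changes arrive
# faster than runners can service them, the queue grows without bound.
pipeline_minutes = 20        # assumed time for one full CI run
concurrent_runners = 10      # assumed pipeline concurrency
capacity_per_hour = concurrent_runners * (60 / pipeline_minutes)   # 30 runs/hour

human_changes_per_hour = 15  # roughly fine: 50% utilization
ai_changes_per_hour = 150    # 10x the rate: 5x over capacity

backlog_growth_per_hour = ai_changes_per_hour - capacity_per_hour  # +120 runs/hour
print(f"capacity={capacity_per_hour:.0f}/h, backlog grows by {backlog_growth_per_hour:.0f}/h")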
The fundamental issue, I suppose, is that current platforms treat all changes equally, but AI-generated changes might need entirely different urgency levels and review processes. When an AI agent submits a pull request for a straightforward utility function, does it really need the same human oversight as a complex architectural change? And what about compliance requirements—if we must adhere to SOC 2, does every single code change made by AI need human review? We're operating with tools built for a different era of software development, and the friction is becoming increasingly apparent.
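One shape this could take is a policy layer that classifies each change and routes only the risky ones to humans; the categories, paths, and thresholds below are hypothetical and would need to be mapped to real compliance requirements.

# Hypothetical sketch of a review-routing policy for AI-generated changes.
# The categories, thresholds, and paths are illustrative assumptions only;
# real compliance regimes (e.g. SOC 2) would need their own mapping.
from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    author_is_agent: bool
    touched_paths: list[str]
    lines_changed: int
    labels: set[str] = field(default_factory=set)


def review_route(change: ChangeRequest) -> str:
    """Decide how much human attention a change needs."""
    sensitive = any(p.startswith(("auth/", "billing/", "infra/")) for p in change.touched_paths)
    if sensitive or "architecture" in change.labels:
        return "human-review-required"        # always escalate high-risk areas
    if change.author_is_agent and change.lines_changed <= 50:
        return "automated-checks-then-merge"  # small, low-risk agent changes
    return "standard-review"                  # everything else follows the usual flow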
The Economics of AI-Driven Development
The cost structure of feature development and bug fixing is fundamentally shifting. Software development used to be about salaries, fully loaded headcount costs, and annual licenses for tools and IDEs, but now we're in a "pay as you go" world where every AI interaction costs a few pennies and you can put a dollar amount on each line of code produced. The economics become particularly challenging when you're using frontier models day in and day out—those costs add up quickly, especially for smaller teams and startups who might benefit most from AI augmentation. Unfortunately, they might be the ones who can least afford the expense and end up forced onto a sub-par model.
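To give the "pay as you go" framing some shape, here's a toy cost calculation; every number in it, from token prices to tasks per day, is an assumption made up for illustration, and real provider pricing varies and changes often.

# Toy cost model for agent-assisted development. All prices and volumes below
# are made-up assumptions for illustration; real provider pricing varies.
price_per_million_input_tokens = 3.00    # assumed USD
price_per_million_output_tokens = 15.00  # assumed USD

tokens_in_per_task = 40_000    # context the agent reads per task (assumed)
tokens_out_per_task = 8_000    # code + reasoning it produces per task (assumed)
tasks_per_engineer_per_day = 25

cost_per_task = (tokens_in_per_task / 1e6) * price_per_million_input_tokens \
              + (tokens_out_per_task / 1e6) * price_per_million_output_tokens
daily_cost_per_engineer = cost_per_task * tasks_per_engineer_per_day

print(f"~${cost_per_task:.2f} per task, ~${daily_cost_per_engineer:.2f} per engineer per day")
# At these assumptions: ~$0.24 per task, ~$6 per engineer-day -- small per interaction,
# but it scales with team size, agent count, and model choice.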
But it's not just the AI model costs. When the rate of code changes increases dramatically, we also need significantly more compute power for CI/CD pipelines to keep up. My teams are already running hundreds of pipelines a day, and I can see this exploding to thousands or tens of thousands as AI agents become more capable. Growing codebases with AI-generated code require more compute to build, test, and deploy. The infrastructure costs scale alongside the productivity gains.
Emerging Platform Needs
What's becoming clear is that we need infrastructure designed specifically for AI-native development. This means platforms that can orchestrate multiple AI agents working simultaneously, manage context and state across long-running AI tasks, and provide intelligent routing of human attention to the decisions that actually require it. I also suspect we need intelligent routing to different AI models based on task complexity and cost—why use an expensive, powerful model for a simple code review when a lighter, cheaper model can handle it just as well? Current platforms simply don't have the abstractions needed for agent coordination, workflow management, or this kind of cost-aware task distribution.
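Here's a minimal sketch of the kind of cost-aware routing I have in mind, assuming some upstream heuristic can score task complexity; the model tiers, costs, and threshold are placeholders, not real products or prices.

# Minimal sketch of cost-aware model routing. The tier names, relative costs,
# and the complexity threshold are placeholder assumptions, not real models.
from dataclasses import dataclass


@dataclass
class ModelTier:
    name: str
    relative_cost: float   # cost multiplier vs the cheapest tier


CHEAP = ModelTier("small-fast-model", relative_cost=1.0)
FRONTIER = ModelTier("large-frontier-model", relative_cost=20.0)


def route_task(task_description: str, complexity_score: float) -> ModelTier:
    """Send simple, well-scoped work to the cheap tier; escalate the rest.

    complexity_score is assumed to come from an upstream heuristic or
    classifier (0.0 = trivial, 1.0 = highly complex).
    """
    if complexity_score < 0.4 and len(task_description) < 2_000:
        return CHEAP      # e.g. lint fixes, small utility functions, simple reviews
    return FRONTIER       # e.g. cross-cutting refactors, architectural changes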
The question isn't whether existing platforms will adapt or new ones will emerge—it's how quickly this transformation will happen and whether development teams can bridge the gap in the meantime.
Conclusion
These are all questions I'm wrestling with, and I'll be honest—I don't have any concrete answers, at least not yet. What I do know is that despite all the uncertainty and challenges, I continue to love software engineering as a discipline and practice. There's something exciting about being in the middle of such a fundamental shift, even when it's uncomfortable. I'm genuinely excited to see where all of this leads, and I suspect the answers to many of these questions will emerge through experimentation and shared experiences from teams around the world. For now, I'm focused on helping my peers navigate this transition thoughtfully, one token at a time.