Training Myself to Think in Parallel: From 10,000 Hours to 10 Concurrent Contexts

AI agents require a completely different mindset: parallel thinking, externalized context, and orchestration over implementation.

I started my engineering career with a simple belief: if I could just learn faster than everyone else, I'd always have an edge. It wasn't about being naturally smarter or having some innate gift for code – I had neither of these attributes. It was pure arithmetic – more hours in meant more knowledge out.

Consuming Everything I Could Find

In the early days of my career, I was relentless. Every waking moment that wasn't spent coding was spent learning about coding. I'd read programming books cover to cover on weekends. When content moved online, my consumption evolved with it – conference talks, tutorials, documentation, blog posts, anything that could make me a better engineer.

Pairing this individual knowledge quest with smart peers and friends became the real accelerant. I'd pepper fellow engineers with questions during the day, absorbing their war stories and hard-won insights. Then I'd go home and immediately apply what I learned – sometimes improving work projects, sometimes cooking up new personal experiments. Before I had dogs, a wife, and a kid, those 2 AM coding sessions were easy. The apartment was quiet, the distractions were minimal, and I could hold entire architectures in my head for hours.

The process was beautifully simple and brutally effective. Learn constantly, apply immediately, repeat endlessly. While others might have had more balanced lives, I had an ever-growing understanding of software systems. It wasn't sustainable long-term, but in those early years, it was my competitive advantage.

The Compound Interest of Serial Focus

What I didn't fully appreciate at the time was how this serial approach created compound returns. Each hour spent understanding a given software layer meant the next bug took less time to fix. Every evening studying architecture patterns meant I could improve our codebase the next morning.

The hours stacked linearly, but the knowledge compounded.

This wasn't unique or special – it was just math. While a peer might have seen 100 different error patterns, I'd seen 300. While they'd debugged 50 production issues, I'd debugged 150. I'd made more mistakes than others. Each late-night coding session was another opportunity to break something new, and each bug was a lesson learned.

The 10,000-hour rule wasn't about the hours themselves; it was about the compound interest of serial, focused repetition. It was a simple game: make more mistakes faster, learn from them, and get better.

Amplification Without Absorption

When I moved into team leadership, the serial advantage evolved but didn't disappear. Now, instead of diving deep into every problem myself, I could guide others toward solutions based on my accumulated experience. The key insight was that I didn't need to hold my team members' context – they held it themselves.

I didn't need to know the specifics of their implementation. I just needed to pattern-match from my experience and point them in the right direction. They owned their context; I provided some wisdom throughout the process.

This felt like the natural evolution of the 10,000-hour advantage. I'd moved from being the person who knew the most details to being the person who had seen the most patterns. The serial accumulation of experience had transformed into parallel amplification with a team.

The Cracks in the Formula

Looking back, there were warning signs that this model had limits. As systems grew more complex, even my serial deep-dives couldn't cover everything. As technologies evolved faster, my accumulated knowledge had shorter half-lives.

But these felt like scaling issues, not fundamental flaws in the model. I figured I just needed to read more, learn faster, develop better systems for knowledge management. The formula still held: more input, more output.

What I didn't anticipate was a technology that would completely invert the equation. Something that would enable parallel execution by an individual. Something that would require me to hold not just my own context, but the context of multiple artificial agents simultaneously.

The serial success formula had taken me far. It had built my career, shaped my thinking, and defined my value as an engineer. But as I stared at three Claude Code terminals, each working on different features, each requiring me to hold their entire context in my head, I realized:

I needed to change my operating model once again.

When Parallel Broke My Brain

At some point, I noticed something troubling – I was moving slower with AI agents than I had been without them. This made no sense. I had multiple Claude Code terminals open, each working on different features. The promise was multiplicative productivity. The reality? I was drowning.

With three agents working on features, I quickly lost track of what I'd told each one. I was the only thread connecting multiple amnesiacs writing code. The cognitive load should have grown linearly with each terminal session, but it felt exponential.

The Fundamental Break

What made this different from managing a human team was devastatingly simple: humans hold their own context, and teams develop shared context and strong memory over time. I could tell a senior engineer I'd worked with for five years to "add user preferences" and they'd know our patterns, our style, our constraints. They'd fill in the gaps from experience.

With AI agents, the shared context and strong memory don't exist. They work incredibly fast but need exhaustive upfront detail. The planning and context-setting I needed to do for each agent dwarfed what I'd need to tell a trusted colleague. There's no accumulated understanding, no learned patterns, no "you know what I mean" moments.

AI agents are brilliant, but they're amnesiacs. Every interaction starts fresh. Every context switch requires a full reload. And most critically – they don't know what they don't know. A human developer will say "wait, how does this affect the payment system?" An AI agent will happily implement exactly what you asked, creating perfect code that's perfectly incompatible with everything else.

The Serial Brain Meets Parallel Execution

My 10,000 hours had trained me for deep, focused work. I could hold an entire system in my head – but only one system at a time. My brain was optimized for serial processing: load context, work deeply, save progress, move to next task. It's how I'd always worked, how I'd gotten good, how I'd built my career.

But here I was, trying to be the working memory for multiple parallel processes. The issue wasn't just coordination – it was speed. AI agents compress days or weeks of work into minutes. They need constant feedback, clarification, and course correction at a superhuman pace.

What would normally be a daily check-in with a human developer becomes a rapid-fire stream of decisions. I was context switching at a rate that would be impossible with human collaboration – not every few hours or days, but every minute.

The New Game

The serial success formula had been simple: invest more hours, gain more experience, become more valuable. But this new world didn't care about my accumulated hours. It cared about my ability to orchestrate parallel execution, to maintain multiple contexts simultaneously, to be the coherent thread connecting independent agents.

Everything I'd learned in 10,000 hours was still valuable – but it was no longer sufficient. The game had changed from "how deep can you go" to "how many contexts can you juggle." From "how much can you know" to "how well can you orchestrate."

The competitor in me already knew the answer. If parallel execution was the new advantage, then I'd learn it the same way I'd learned everything else – by making more mistakes than anyone else, just faster and in parallel this time.

But first, I needed a completely different approach to thinking about software development. The serial success formula wasn't just obsolete – it was actively holding me back.

Learning to Externalize Everything

The Context Problem Nobody Warned Me About

The first thing I had to accept was that context management with AI agents is fundamentally different from any other engineering challenge I'd faced. It's not about technical complexity or system design. It's about being the persistent memory layer for multiple amnesiacs who work at superhuman speed.

Every AI agent starts each conversation blank. They have no memory of previous sessions, no accumulated understanding of your codebase, no sense of what their "colleague" agents are building. You are their entire context. And when you're running multiple agents in parallel, you're not just their context – you're the only thing preventing chaos.

The mental model I'd developed over 10,000 hours assumed context persistence. Write code today, review it tomorrow, it's still there. Tell a teammate about a decision, they remember it next week. Document an architecture choice, it becomes part of the team's shared understanding. None of this exists with AI agents.

My First Attempts at Parallel Development

My early attempts were disasters of context loss. I'd start an agent on a backend service, get it humming along, then switch to another agent for frontend work. By the time the first agent had questions, I'd forgotten half the context. By the time I returned to the second, I was giving it contradictory information.

The speed made everything worse. These agents could implement a complete feature in the time it took me to context switch. I'd return to find hundreds of lines of code based on assumptions I couldn't remember making. Were those the right assumptions? Had I really said to use that pattern? The code was often excellent – and excellently wrong.

I tried to solve this with documentation. Following the model makers' guidance, I created context files with coding patterns and architectural decisions. I'd start every session reminding the agent to follow these guidelines. But after just a few minutes of work, as the context window filled with our conversation and code, these crucial instructions would get pushed out. The agent would forget our patterns, ignore our conventions, and drift back to its defaults.

The Breakthrough: Externalizing My Brain

The solution came when I stopped trying to hold everything in my head. If I was going to be the context layer for multiple agents, I needed to externalize that context into persistent, searchable, shareable artifacts.

Every piece of context I held in my head was a point of failure. Every assumption I made but didn't document was a future bug. Every decision I communicated verbally to an agent was lost the moment I closed that terminal.

I began developing what I now call "spec documents" – living specifications that capture not just what to build, but all the surrounding context that a human teammate would naturally absorb. These weren't traditional technical specs. They were more like shared memory dumps, containing:

  • Technical research and findings
  • Architecture decisions and patterns
  • Key domain models and their relationships
  • Detailed implementation plans
  • Testing strategies and acceptance criteria
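
To make this concrete, here's a minimal skeleton of the kind of document I mean – the feature name and wording below are illustrative, and the headings matter far more than the format:

```markdown
# Spec: user preferences service   (illustrative example)

## Technical Research
Findings, links, and known API limitations discovered up front.

## Architecture Decisions
The patterns chosen and, just as important, the trade-offs behind them.

## Domain Models
The key entities and how they relate to each other.

## Implementation Plan
Ordered, concrete steps, file by file where possible.

## Testing Strategy
Acceptance criteria an agent can verify without asking me.
```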

Creating these comprehensive specs became the foundation of parallel development. But the process of creating them efficiently would require its own breakthrough.

Voice Dumping: The Unexpected Solution

The solution emerged from pure frustration. After yet another context switch left me confused about what I'd told which agent, I grabbed my phone and took my dogs for a walk. While walking, I just talked out loud. Not structured thoughts, not careful specifications – just a raw stream of consciousness about what I was trying to build.

This rambling contained more useful context than any structured document I'd written. It captured not just the decisions but the thinking process. The uncertainties. The trade-offs. The stuff you'd naturally think through with a human colleague but never write down.

Voice dumping became my secret weapon. Before starting any parallel work, I'd spend 2-3 minutes just talking through the entire system. Stream of consciousness. No editing. No structure. Just getting everything out of my head and into a form I could return to.

The process evolved quickly. I discovered I could feed these voice dumps to Claude Opus 4 and ChatGPT o3, bouncing between them to transform my rambling into comprehensive spec documents. Each model brought different strengths – one might excel at architecture, another at implementation details. Voice input let me transfer my thoughts rapidly, and the models helped structure and expand them into actionable specs. They were also able to research topics deeply, discovering API limitations or capabilities I didn't know existed.
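
As a rough sketch of what that transform step can look like in code – assuming the anthropic Python SDK with an API key in the environment; the file names, prompt, and model ID are illustrative, and the same idea works through any model's chat interface:

```python
# Turn a raw voice-dump transcript into a structured spec document.
# Assumes: `pip install anthropic` and ANTHROPIC_API_KEY set in the
# environment. File names and the model ID are illustrative.
import pathlib

import anthropic

client = anthropic.Anthropic()

transcript = pathlib.Path("voice_dump.txt").read_text()

prompt = (
    "Below is a raw, unedited voice dump about a feature I want to build. "
    "Turn it into a spec document with these sections: Technical Research, "
    "Architecture Decisions, Domain Models, Implementation Plan, and "
    "Testing Strategy. Keep my uncertainties and trade-offs visible as "
    "open questions instead of silently resolving them.\n\n" + transcript
)

message = client.messages.create(
    model="claude-opus-4-20250514",  # illustrative model ID
    max_tokens=4096,
    messages=[{"role": "user", "content": prompt}],
)

# The response arrives as a list of content blocks; the first holds the text.
pathlib.Path("spec.md").write_text(message.content[0].text)
```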

The Rhythm Emerges

Gradually, a sustainable rhythm emerged. Instead of trying to be a real-time context manager for multiple agents, I became a context architect. I focused on building comprehensive specs upfront that would let agents work independently rather than constantly needing my input.

The workflow that emerged looked nothing like my serial process:

  1. Voice dump the entire vision – Get everything out of my head before starting
  2. Transform voice into structured context – But keep the thinking visible
  3. Review and update context documents – Not editing code but crafting context
  4. Let the agent work – With rich context, it could run much longer and produce better results

This wasn't just a new way of working with AI. It was a fundamental shift in how I thought about software development. Instead of being the expert who knew everything, I became the documentarian who captured everything. Instead of holding context, I built context systems.

The irony wasn't lost on me. After 10,000 hours of learning to hold more in my head, the key to parallel AI development was learning to hold nothing in my head. Everything had to be externalized, documented, shareable. My brain's job was no longer storage – it was orchestration.

The New Workflow Taking Shape

My Current Stack for Context Management

After months of iteration, my toolkit has stabilized around a few core pieces:

  • Claude Code for the actual implementation work – it excels at following detailed specs
  • Claude Opus 4 and ChatGPT o3 for spec creation – bouncing between them refines ideas
  • Voice transcription as the primary input method – 2-3 minutes of talking beats 30 minutes of typing
  • Conductor for managing local git worktrees – each agent gets its own isolated workspace (see the sketch after this list)
  • Remote mini PC with Vibetunnel – persistent Claude sessions I can access from anywhere via Tailscale
  • Structured spec documents as the source of truth – everything lives here, not in my head
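
The worktree piece deserves a sketch, because it's what makes parallel agents safe. Conductor automates it, but the underlying pattern is plain git – one worktree and branch per agent. The function and names below are illustrative:

```python
# One isolated git worktree (and branch) per agent, so parallel edits
# never collide in a shared checkout. Branch and path names are made up.
import subprocess

def make_agent_workspace(repo: str, feature: str) -> str:
    """Create a sibling directory with its own branch for one agent."""
    path = f"../agent-{feature}"  # resolved relative to the repo directory
    subprocess.run(
        ["git", "-C", repo, "worktree", "add", "-b", feature, path],
        check=True,
    )
    return path

for feature in ("user-prefs", "billing-export", "search-api"):
    print("agent workspace:", make_agent_workspace(".", feature))
```

Each Claude Code session then runs inside its own directory, and a finished branch merges back like any other.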

But the tools are less important than the mindset shift. I'm no longer relying solely on my 10,000+ hours of accumulated experience to solve every problem. Instead, I'm focused on translating that experience into clear, actionable context that AI agents can execute.

The Three Modes of AI Development

Through trial and error, I've found three distinct modes of working with AI that map to different cognitive states:

1. Multi-Model Research Mode

When I'm exploring architecture or solving complex problems, I'll have Claude and ChatGPT open simultaneously, using each as a sounding board. This isn't parallel execution – it's parallel thinking. Each model offers different perspectives, and the friction between them often reveals the best path forward.

2. Parallel Agent Assault

When I have well-defined, independent features to build, I'll run multiple Claude Code agents in parallel. This only works when I've invested heavily in upfront spec creation. Each agent needs a complete context to work independently. I'll often bundle relevant source code and documentation locally on the machine so agents can quickly find exactly what they need without searching. The payoff is dramatic – what would take a week serially completes in a day.

3. Single Agent Precision

When I have a clear plan and a first pass from an agent, I work with a single agent in tight iteration. This feels most like traditional development – except instead of typing every line, I'm refining the first pass, coaxing precise ideas into existence. The context is simple, the feedback loop is tight.

The Unexpected Benefits

The shift from serial to parallel thinking has changed more than just my output. It's fundamentally altered how I approach software development:

Clarity becomes crucial. When you can't rely on implicit understanding or accumulated context, every decision must be explicit. This has made me a better architect – not because I'm smarter, but because I'm forced to think clearly.

Documentation is development. The spec documents I create for AI agents are better than any documentation I've written throughout my career. They have to be – they're not supplementary, they're foundational.

Mistakes happen faster. With parallel execution, I can explore multiple approaches simultaneously. Bad ideas fail quickly. Good ideas emerge from the contrast. The same principle that made me better through serial mistake-making now operates in parallel.

What This Means for Engineering

The 10,000-hour model isn't dead – it's evolving. Deep expertise still matters, but it manifests differently. Instead of being the person who can hold the most complexity in their head, you become the person who can express complexity most clearly.

Junior developers might actually have an advantage here. They haven't spent decades optimizing for serial depth. Their brains might adapt to parallel orchestration more naturally. The hierarchy of experience could flatten in unexpected ways.

For senior engineers like me, the challenge is letting go. Letting go of being the bottleneck. Letting go of holding all the context. Letting go of serial thinking patterns that have defined our careers.

The New Questions I'm Living With Now

As I continue to refine this workflow, new questions emerge:

How many parallel contexts can one person effectively manage? I've found my limit is around 3-4 active agents. Beyond that, the context switching overhead destroys the benefits. But will better tools raise this ceiling?

How do we train the next generation? My guess is that developers learning with AI assistance will become proficient in a few hundred hours rather than following the traditional 10,000-hour rule. They'll develop different skills – perhaps less about syntax and implementation details, more about system design and context creation. The question is whether this accelerated proficiency comes with hidden gaps we haven't yet identified.

Moving Forward

The game has changed, but the core challenge remains: how do we build better software, faster, with fewer bugs? AI agents offer a new answer, but only if we're willing to change how we think.

For me, that meant abandoning the serial success formula that built my career. It meant learning to think in parallel, to externalize context, to become an orchestrator rather than an implementer. It meant making a whole new set of mistakes - just faster and in parallel.

The irony isn't lost on me. After 10,000 hours learning to be a better programmer, I'm spending my next 10,000 hours learning to program less and communicate more. But that's the beauty of this field - just when you think you've figured it out, everything changes.


If you're exploring this transition yourself, I'd love to hear what's working for you. What patterns have you discovered? What mistakes have you made? How are you adapting your workflow for parallel AI development?

The one thing I know for sure: we're all figuring this out together, one parallel mistake at a time.
