Claude Code: Running Agents in Parallel

· 5 min read

Overview

Sequential agent execution is the default, but many agent tasks are naturally parallelizable. When three analysts each examine a different aspect of the same dataset, there is no reason for the second to wait for the first. Running them concurrently can cut total execution time by 60-70% for teams of three or more agents.

Parallel execution in the Claude Agent SDK works through standard JavaScript concurrency primitives — Promise.all, Promise.allSettled, and async iteration. The SDK does not impose single-threaded constraints on agent runs, so multiple agents can execute simultaneously, each maintaining their own conversation state and tool call history.

The challenge is not running agents in parallel — that part is straightforward. The challenge is handling partial failures, managing rate limits across concurrent agents, collecting and merging results, and knowing when parallel execution helps versus when it adds complexity without meaningful speedup.

When to use it

Parallel execution pays off when:

Parallel execution does not help when:

Getting started

Basic parallel execution with Promise.allSettled

Always use Promise.allSettled over Promise.all for parallel agent runs. One failing agent should not cancel the others.

import { Agent } from "@anthropic-ai/agent-sdk";

interface AgentResult {
  agentName: string;
  status: "success" | "failed";
  output?: string;
  error?: string;
  durationMs: number;
}

async function runAgentsInParallel(
  agents: Agent[],
  input: string,
  maxTurns: number = 10
): Promise<AgentResult[]> {
  const startTime = Date.now();

  const results = await Promise.allSettled(
    agents.map(async (agent) => {
      const agentStart = Date.now();
      const result = await agent.run(input, { maxTurns });
      return {
        agentName: agent.name,
        status: "success" as const,
        output: result.output,
        durationMs: Date.now() - agentStart,
      };
    })
  );

  return results.map((result, index) => {
    if (result.status === "fulfilled") {
      return result.value;
    }
    return {
      agentName: agents[index].name,
      status: "failed" as const,
      error: result.reason?.message ?? "Unknown error",
      durationMs: Date.now() - startTime,
    };
  });
}

// Usage
const analysts = [marketAnalyst, technicalAnalyst, financialAnalyst];
const results = await runAgentsInParallel(analysts, "Analyze Q3 performance for ACME Corp");

const succeeded = results.filter((r) => r.status === "success");
const failed = results.filter((r) => r.status === "failed");

console.log(`${succeeded.length}/${results.length} agents completed successfully`);

Rate-limited parallel execution

When you need to limit concurrency to avoid hitting API rate limits, use a semaphore pattern.

class Semaphore {
  private queue: Array<() => void> = [];
  private active = 0;

  constructor(private readonly maxConcurrent: number) {}

  async acquire(): Promise<void> {
    if (this.active < this.maxConcurrent) {
      this.active++;
      return;
    }
    return new Promise<void>((resolve) => {
      this.queue.push(() => {
        this.active++;
        resolve();
      });
    });
  }

  release(): void {
    this.active--;
    const next = this.queue.shift();
    if (next) next();
  }
}

async function runWithConcurrencyLimit(
  agents: Agent[],
  input: string,
  maxConcurrent: number = 3
): Promise<AgentResult[]> {
  const semaphore = new Semaphore(maxConcurrent);

  const results = await Promise.allSettled(
    agents.map(async (agent) => {
      await semaphore.acquire();
      try {
        const start = Date.now();
        const result = await agent.run(input, { maxTurns: 10 });
        return {
          agentName: agent.name,
          status: "success" as const,
          output: result.output,
          durationMs: Date.now() - start,
        };
      } finally {
        semaphore.release();
      }
    })
  );

  return results.map((r, i) =>
    r.status === "fulfilled"
      ? r.value
      : {
          agentName: agents[i].name,
          status: "failed" as const,
          error: r.reason?.message,
          durationMs: 0,
        }
  );
}

Parallel execution with result merging

After parallel agents complete, a synthesis agent can merge their findings into a unified output.

async function parallelResearchWithSynthesis(topic: string) {
  // Phase 1: Parallel research
  const researchResults = await runAgentsInParallel(
    [academicAgent, industryAgent, trendAgent],
    topic
  );

  const successfulResults = researchResults
    .filter((r) => r.status === "success")
    .map((r) => `## ${r.agentName}\n${r.output}`)
    .join("\n\n");

  if (successfulResults.length === 0) {
    throw new Error("All research agents failed");
  }

  // Phase 2: Sequential synthesis
  const synthesisInput = `Synthesize these research findings into a
    unified analysis:\n\n${successfulResults}`;

  const synthesisAgent = new Agent({
    name: "synthesizer",
    model: "claude-sonnet-4-20250514",
    instructions: `You receive research from multiple analysts. Combine
      their findings into a coherent report. Resolve contradictions by
      noting different perspectives. Highlight consensus findings.`,
  });

  return synthesisAgent.run(synthesisInput, { maxTurns: 5 });
}

Integration with agent teams

Parallel execution maps directly to two coordination patterns:

Parallel Workers is the simplest case — multiple agents perform the same type of work on the same input. The value comes from diverse perspectives. Use this for research, brainstorming, and analysis tasks where coverage matters more than efficiency.

Fork-Join combines parallel execution with synthesis. Agents work independently in the fork phase, then a dedicated synthesis agent joins their results. This pattern requires careful handling of the join step. The synthesis agent must handle cases where some parallel agents failed, produced empty results, or generated contradictory findings.

// Fork-Join with graceful degradation
async function forkJoinAnalysis(data: string) {
  const forkResults = await runAgentsInParallel(
    [quantitativeAgent, qualitativeAgent, comparativeAgent],
    data
  );

  const succeeded = forkResults.filter((r) => r.status === "success");

  // Proceed with synthesis even if some agents failed
  if (succeeded.length === 0) {
    return { error: "All analysis agents failed", details: forkResults };
  }

  const joinPrompt = succeeded.length < forkResults.length
    ? `Some analyses failed. Synthesize available results and note gaps:\n\n${
        succeeded.map((r) => `### ${r.agentName}\n${r.output}`).join("\n\n")
      }`
    : `All analyses complete. Synthesize into a unified report:\n\n${
        succeeded.map((r) => `### ${r.agentName}\n${r.output}`).join("\n\n")
      }`;

  return synthesisAgent.run(joinPrompt, { maxTurns: 5 });
}

Best practices and common pitfalls

  1. Use Promise.allSettled, not Promise.all. With Promise.all, one agent failure rejects the entire batch. Promise.allSettled lets you collect results from agents that succeeded while handling failures individually.

  2. Set per-agent timeouts. A single slow agent can bottleneck the entire parallel batch. Use AbortSignal.timeout() or a wrapper timeout to cap individual agent execution time.

  3. Monitor aggregate token usage. Three agents running in parallel consume three times the tokens per second. If you are near rate limits, use the semaphore pattern to cap concurrency rather than letting all agents start simultaneously.

  4. Define minimum viable results. Before running parallel agents, decide how many must succeed for the overall task to be considered complete. If 2 out of 3 is acceptable, your synthesis logic should handle that gracefully rather than treating it as a failure.

  5. Profile before parallelizing. Measure sequential execution time first. If total latency is already acceptable, parallel execution adds complexity without meaningful benefit. Parallelize only the actual bottleneck.

Skip the setup — generate agent teams instantly →