Claude Agent SDK: Streaming and Real-Time Agents

Overview

Batch-mode agents that process a request and return a complete response after several seconds (or minutes) create a poor user experience. Users stare at a loading spinner with no visibility into whether the agent is working, stuck, or about to finish. Streaming changes this by delivering output incrementally — token by token, tool call by tool call — so users see progress in real time.

The Claude Agent SDK supports streaming at multiple levels: raw token streaming from the model, structured event streaming for tool calls and handoffs, and progress callbacks that let you build responsive UIs on top of agent workflows. Streaming is not just about perceived performance — it also enables new interaction patterns like early user intervention, progressive rendering, and live monitoring of multi-agent coordination.

Building streaming into agent systems requires different architectural patterns than batch processing. You need to handle partial outputs, manage stream lifecycle, and decide how to expose agent-internal events (like tool calls) to end users without overwhelming them.
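
One way to keep partial output and stream lifecycle manageable is to fold every event into a single state object that the UI renders from. The sketch below assumes event shapes modeled on the event names used later in this article; `StreamState`, `initialState`, and `reduceStream` are illustrative names, not part of any SDK.

```typescript
// Hypothetical event shapes, modeled on the event names used in this article.
type StreamEvent =
  | { type: "text_delta"; text: string }
  | { type: "tool_call_start"; toolName: string }
  | { type: "tool_call_complete"; toolName: string }
  | { type: "run_complete" };

interface StreamState {
  status: "streaming" | "done";
  text: string; // partial output accumulated so far
  activeTool: string | null; // tool currently running, if any
}

const initialState: StreamState = { status: "streaming", text: "", activeTool: null };

// Pure reducer: each event yields the next state without mutating the previous one.
function reduceStream(state: StreamState, event: StreamEvent): StreamState {
  switch (event.type) {
    case "text_delta":
      return { ...state, text: state.text + event.text };
    case "tool_call_start":
      return { ...state, activeTool: event.toolName };
    case "tool_call_complete":
      return { ...state, activeTool: null };
    case "run_complete":
      return { ...state, status: "done", activeTool: null };
  }
}
```

Because the reducer is pure, the same function can back a React `useReducer` hook, a test, or a server-side log replay.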

When to use it

Use streaming when:

Batch mode is preferable when:

Getting started

Basic token streaming

Stream agent output token by token as the model generates it.

import { Agent } from "@anthropic-ai/agent-sdk";

const agent = new Agent({
  name: "writing-assistant",
  model: "claude-sonnet-4-20250514",
  instructions: "You are a technical writing assistant. Produce clear, well-structured content.",
});

async function streamAgentResponse(query: string) {
  const stream = await agent.stream(query, { maxTurns: 10 });

  for await (const event of stream) {
    switch (event.type) {
      case "text_delta":
        process.stdout.write(event.text);
        break;
      case "tool_call_start":
        console.log(`\n[Calling tool: ${event.toolName}]`);
        break;
      case "tool_call_complete":
        console.log(`[Tool result received]`);
        break;
      case "turn_complete":
        console.log(`\n[Turn ${event.turnNumber} complete]`);
        break;
      case "run_complete":
        console.log("\n[Agent run complete]");
        break;
    }
  }
}
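
To exercise a loop like this without calling the model, a scripted async generator can stand in for the stream. This is a minimal sketch: `fakeStream` and `renderStream` are hypothetical helpers, and the event shapes simply mirror the names used in the loop above.

```typescript
// Hypothetical event shapes mirroring the names used in the loop above.
type AgentEvent =
  | { type: "text_delta"; text: string }
  | { type: "tool_call_start"; toolName: string }
  | { type: "tool_call_complete"; toolName: string }
  | { type: "run_complete" };

// Stand-in for a real stream: yields a fixed script of events in order.
async function* fakeStream(events: AgentEvent[]): AsyncGenerator<AgentEvent> {
  for (const event of events) {
    yield event;
  }
}

// Collects streamed text and tool notices into one transcript string.
async function renderStream(stream: AsyncIterable<AgentEvent>): Promise<string> {
  let transcript = "";
  for await (const event of stream) {
    if (event.type === "text_delta") transcript += event.text;
    if (event.type === "tool_call_start") transcript += `[tool: ${event.toolName}]`;
  }
  return transcript;
}
```

Typing the consumer against `AsyncIterable` rather than a concrete stream class is what makes the swap possible.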

Server-sent events for web applications

Connect agent streaming to a web frontend using server-sent events (SSE).

import express from "express";
import { Agent } from "@anthropic-ai/agent-sdk";

const app = express();

app.get("/api/agent/stream", async (req, res) => {
  const query = req.query.q as string;
  if (!query) {
    res.status(400).json({ error: "Missing query parameter" });
    return;
  }

  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });

  const agent = new Agent({
    name: "assistant",
    model: "claude-sonnet-4-20250514",
    instructions: "You are a helpful assistant.",
    // searchTool and calculatorTool are assumed to be tool definitions created elsewhere.
    tools: [searchTool, calculatorTool],
  });

  try {
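    const stream = await agent.stream(query, { maxTurns: 10 });

    // Stop forwarding events once the client disconnects mid-stream.
    let clientClosed = false;
    res.on("close", () => {
      clientClosed = true;
    });

    for await (const event of stream) {
      if (clientClosed) break;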
      const sseData = JSON.stringify({
        type: event.type,
        ...(event.type === "text_delta" && { text: event.text }),
        ...(event.type === "tool_call_start" && { toolName: event.toolName }),
        ...(event.type === "run_complete" && { finalOutput: event.output }),
      });

      res.write(`data: ${sseData}\n\n`);
    }

    res.write("data: [DONE]\n\n");
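  } catch (err) {
    // `err` is `unknown` under strict TypeScript, so narrow it before reading `.message`.
    const message = err instanceof Error ? err.message : String(err);
    res.write(`data: ${JSON.stringify({ type: "error", message })}\n\n`);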
  } finally {
    res.end();
  }
});
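
The handler above writes `data:` frames by hand, which is safe there because `JSON.stringify` never emits raw newlines. If you ever send plain text, each line of the payload needs its own `data:` prefix. A small formatting helper, sketched here under the name `formatSseEvent` (a hypothetical function, not part of Express or any SDK), covers that case along with the optional `event:` and `id:` fields.

```typescript
// Serializes one server-sent event. In the SSE wire format, every line of the
// payload carries its own "data:" prefix and a blank line terminates the event.
function formatSseEvent(data: string, eventName?: string, id?: string): string {
  let frame = "";
  if (id !== undefined) frame += `id: ${id}\n`;
  if (eventName !== undefined) frame += `event: ${eventName}\n`;
  for (const line of data.split(/\r\n|\r|\n/)) {
    frame += `data: ${line}\n`;
  }
  return frame + "\n";
}
```

With this in place, a write becomes `res.write(formatSseEvent(JSON.stringify(payload)))`.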

Client-side stream consumption

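Parse the SSE stream on the client to render progressive output. The function below is framework-agnostic: in a React frontend, the callbacks would typically wrap state setters.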

async function consumeAgentStream(
  query: string,
  onToken: (text: string) => void,
  onToolCall: (toolName: string) => void,
  onComplete: (output: string) => void,
  onError: (error: string) => void
) {
  const response = await fetch(
    `/api/agent/stream?q=${encodeURIComponent(query)}`
  );

  if (!response.ok || !response.body) {
    onError("Failed to start stream");
    return;
  }

  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;

    buffer += decoder.decode(value, { stream: true });

    const lines = buffer.split("\n");
    buffer = lines.pop() ?? "";

    for (const line of lines) {
      if (!line.startsWith("data: ")) continue;
      const data = line.slice(6);
      if (data === "[DONE]") return;

      try {
        const event = JSON.parse(data);
        switch (event.type) {
          case "text_delta":
            onToken(event.text);
            break;
          case "tool_call_start":
            onToolCall(event.toolName);
            break;
          case "run_complete":
            onComplete(event.finalOutput);
            break;
          case "error":
            onError(event.message);
            break;
        }
      } catch {
        // Skip malformed events
      }
    }
  }
}
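
The loop above splits on single newlines, which is enough for the one-line JSON frames the server sends. A parser that splits on the blank line between events also handles multi-line payloads and CRLF line endings. The following is a sketch only: `SseParser` is a hypothetical class, it reads only `data:` fields, and it does not handle a lone carriage return as a line terminator.

```typescript
// Incremental parser: feed it decoded text chunks, get back complete event payloads.
class SseParser {
  private buffer = "";

  // Returns the data payload of every event completed by this chunk.
  feed(chunk: string): string[] {
    // Normalize CRLF so that events always end in "\n\n". A trailing lone "\r"
    // stays in the buffer and is normalized once its "\n" arrives.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, "\n");
    const payloads: string[] = [];

    let boundary = this.buffer.indexOf("\n\n");
    while (boundary !== -1) {
      const rawEvent = this.buffer.slice(0, boundary);
      this.buffer = this.buffer.slice(boundary + 2);

      const dataLines: string[] = [];
      for (const line of rawEvent.split("\n")) {
        if (!line.startsWith("data:")) continue; // ignore other fields and comments
        const value = line.slice(5);
        // One optional space after the colon is not part of the value.
        dataLines.push(value.startsWith(" ") ? value.slice(1) : value);
      }
      if (dataLines.length > 0) payloads.push(dataLines.join("\n"));

      boundary = this.buffer.indexOf("\n\n");
    }
    return payloads;
  }
}
```

In the read loop, `parser.feed(decoder.decode(value, { stream: true }))` would replace the manual `split` and `buffer` bookkeeping.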

Progress tracking for long-running agents

For agents that make multiple tool calls, provide progress updates to users.

interface AgentProgress {
  currentStep: string;
  stepsCompleted: number;
  estimatedTotalSteps: number;
  toolCallHistory: Array<{ tool: string; durationMs: number }>;
  elapsedMs: number;
}

async function streamWithProgress(
  agent: Agent,
  query: string,
  onProgress: (progress: AgentProgress) => void
) {
  const startTime = Date.now();
  const toolHistory: Array<{ tool: string; durationMs: number }> = [];
  let toolStartTime = 0;

  const stream = await agent.stream(query, { maxTurns: 15 });

  for await (const event of stream) {
    if (event.type === "tool_call_start") {
      toolStartTime = Date.now();
      onProgress({
        currentStep: `Calling ${event.toolName}...`,
        stepsCompleted: toolHistory.length,
        estimatedTotalSteps: Math.max(toolHistory.length + 2, 5),
        toolCallHistory: toolHistory,
        elapsedMs: Date.now() - startTime,
      });
    }

    if (event.type === "tool_call_complete") {
      toolHistory.push({
        tool: event.toolName,
        durationMs: Date.now() - toolStartTime,
      });
      onProgress({
        currentStep: "Processing results...",
        stepsCompleted: toolHistory.length,
        estimatedTotalSteps: Math.max(toolHistory.length + 1, 5),
        toolCallHistory: toolHistory,
        elapsedMs: Date.now() - startTime,
      });
    }
  }
}
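
The single `toolStartTime` variable above works when tool calls run one at a time. If calls can overlap, durations need to be tracked per call. The sketch below assumes each event carries a unique call identifier; `callId` is a hypothetical field (the events in this article only show `toolName`), and `ToolTimer` is an illustrative name.

```typescript
// Tracks overlapping tool calls by a per-call identifier.
// `callId` is an assumed field; adapt it to whatever identifier your events expose.
class ToolTimer {
  private started = new Map<string, { tool: string; startedAt: number }>();
  readonly history: Array<{ tool: string; durationMs: number }> = [];

  // `now` is injectable so the timer can be tested without real clocks.
  start(callId: string, tool: string, now: number = Date.now()): void {
    this.started.set(callId, { tool, startedAt: now });
  }

  // Returns the call's duration, or undefined if no matching start was seen.
  complete(callId: string, now: number = Date.now()): number | undefined {
    const entry = this.started.get(callId);
    if (!entry) return undefined;
    this.started.delete(callId);
    const durationMs = now - entry.startedAt;
    this.history.push({ tool: entry.tool, durationMs });
    return durationMs;
  }

  // Number of calls that have started but not yet completed.
  get inFlight(): number {
    return this.started.size;
  }
}
```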

Integration with agent teams

Streaming in multi-agent teams requires deciding which events to surface to users. In a Sequential Pipeline, you might stream output only from the final report agent while showing progress indicators for earlier stages.

async function streamPipelineToUser(
  query: string,
  onProgress: (step: string) => void,
  onToken: (text: string) => void
) {
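  // researchAgent, analysisAgent, and reportAgent are assumed to be Agent instances defined elsewhere.
  // Stage 1: Research (progress only)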
  onProgress("Researching...");
  const research = await researchAgent.run(query, { maxTurns: 10 });

  // Stage 2: Analysis (progress only)
  onProgress("Analyzing findings...");
  const analysis = await analysisAgent.run(research.output, { maxTurns: 10 });

  // Stage 3: Report (streamed to user)
  onProgress("Writing report...");
  const stream = await reportAgent.stream(analysis.output, { maxTurns: 5 });

  for await (const event of stream) {
    if (event.type === "text_delta") {
      onToken(event.text);
    }
  }
}

For Parallel Workers, you can stream progress from all agents simultaneously, showing users which agents have completed and which are still working. This builds trust by demonstrating that the system is doing substantive work, not just spinning.
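
As a rough sketch of that idea, the helper below runs named tasks concurrently and emits a status snapshot whenever one of them finishes or fails. It is deliberately SDK-independent: `runWorkersWithProgress` is a hypothetical name, and each worker is just a function returning a promise (for example, a wrapper around an agent run).

```typescript
type WorkerStatus = "running" | "done" | "failed";

// Runs named tasks concurrently and reports a status snapshot on every change.
async function runWorkersWithProgress<T>(
  workers: Record<string, () => Promise<T>>,
  onUpdate: (statuses: Record<string, WorkerStatus>) => void
): Promise<Record<string, T | undefined>> {
  const statuses: Record<string, WorkerStatus> = {};
  const results: Record<string, T | undefined> = {};

  for (const name of Object.keys(workers)) statuses[name] = "running";
  onUpdate({ ...statuses }); // initial snapshot: everything running

  await Promise.all(
    Object.entries(workers).map(async ([name, work]) => {
      try {
        results[name] = await work();
        statuses[name] = "done";
      } catch {
        // A failed worker is recorded but does not cancel the others.
        results[name] = undefined;
        statuses[name] = "failed";
      }
      onUpdate({ ...statuses }); // copy so callers cannot mutate internal state
    })
  );

  return results;
}
```

Catching failures per worker is a design choice: one failed agent shows up as "failed" in the UI while the rest keep running.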

Best practices and common pitfalls

  1. Stream the final agent's output, show progress for intermediaries. Users care about the final result. Streaming raw output from intermediate agents is confusing. Show progress indicators for early stages and stream tokens only from the agent producing user-facing content.

  2. Handle stream interruptions gracefully. Network disconnections during streaming are common. Build reconnection logic on the client and ensure the server cleans up resources when a client disconnects mid-stream.

  3. Buffer tool call events. Sending a "calling search tool" event for every internal tool call clutters the UI. Batch tool call events or only surface them for long-running tools (over 2 seconds).

  4. Set stream timeouts. A stream that stops producing events without completing looks like a hang to users. Set a maximum inactivity timeout and surface it as an error rather than leaving the stream open indefinitely.

  5. Test with slow connections. Streaming behavior changes dramatically on slow networks. Test with throttled connections to ensure your buffering, parsing, and error handling work under realistic conditions.
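
The inactivity timeout from item 4 can be sketched as a wrapper around any async iterable. This is one possible approach rather than an SDK feature: `withInactivityTimeout` and `idleMs` are hypothetical names, and the wrapper only uses standard promises and timers.

```typescript
// Wraps any async iterable and throws if no event arrives within `idleMs`.
async function* withInactivityTimeout<T>(
  source: AsyncIterable<T>,
  idleMs: number
): AsyncGenerator<T> {
  const iterator = source[Symbol.asyncIterator]();
  try {
    while (true) {
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(() => reject(new Error(`No events for ${idleMs} ms`)), idleMs);
      });

      let result: IteratorResult<T>;
      try {
        // Whichever settles first wins: the next event or the idle timer.
        result = await Promise.race([iterator.next(), timeout]);
      } finally {
        clearTimeout(timer); // the timer must not outlive this iteration
      }

      if (result.done) return;
      yield result.value;
    }
  } finally {
    // Ask the source to clean up, but do not wait: a stalled source may never answer.
    const closing = iterator.return?.();
    if (closing) closing.catch(() => undefined);
  }
}
```

A consumer would write `for await (const event of withInactivityTimeout(stream, 30_000))` and treat the thrown error like any other stream failure.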
