Claude Agent SDK Complete Tutorial: Tool System, Agent Loop, and Controllable Execution in Action

The Claude Agent SDK aims to upgrade “model inference” to “controllable execution”. Beyond generating text, it lets an Agent complete tasks inside an auditable, constrainable tool system, so Agents can take a real part in engineering workflows. This article walks you through the SDK from environment setup to production practice.

What is Claude Agent SDK?

Claude Agent SDK is an enterprise-grade Agent development framework launched by Anthropic. It deeply integrates Claude’s language capabilities with a tool execution system, enabling AI to:

  • Autonomously invoke external tools like file systems, databases, APIs
  • Maintain context and plan complex tasks across multi-turn conversations
  • Resume long-running tasks from saved checkpoints
  • Execute operations within strict permission boundaries to ensure safety and control

Differences from calling the Claude API directly:

| Feature | Claude API | Claude Agent SDK |
| --- | --- | --- |
| Execution Mode | Single-turn Q&A | Multi-turn Agent Loop |
| Tool Invocation | Manual parsing of function calling | Automated tool orchestration |
| Context Management | Manual concatenation required | Built-in context manager |
| Permission Control | None | Tool policies + Path whitelists |
| Auditability | Dependent on application layer | Built-in tool invocation logs |
| Applicable Scenarios | Chat, Content Generation | Code Refactoring, Data Processing, Automated Ops |

Core Concepts at a Glance

Agent Loop

The built-in multi-turn execution engine responsible for the closed loop of “Understand Goal → Plan → Call Tool → Verify Result → Continue Iteration”.

User Input
  ↓
Claude analyzes task and plans steps
  ↓
Calls Tool (Read File/Execute Command/Search)
  ↓
Gets Tool Output and updates context
  ↓
Check: Task Complete?
  └─ No → Continue next loop
  └─ Yes → Return final result
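
The loop above can be sketched without any network calls. The following is a minimal illustration, not the SDK's implementation: ModelStep, runLoop, and the scripted fakeModel are hypothetical names introduced here, and a real agent would replace fakeModel with a call to Claude. The maxTurns budget is an added assumption so that a confused model cannot loop forever.

```typescript
// Hypothetical types for illustration; not part of any SDK.
type ToolCall = { tool: string; input: string };
type ModelStep =
  | { kind: 'tool_call'; call: ToolCall }
  | { kind: 'final'; text: string };

type Model = (history: string[]) => ModelStep;
type Tools = Record<string, (input: string) => string>;

// Runs "plan -> call tool -> observe -> repeat" until the model returns a
// final answer or the turn budget is exhausted.
function runLoop(model: Model, tools: Tools, goal: string, maxTurns = 10): string {
  const history: string[] = [`user: ${goal}`];

  for (let turn = 0; turn < maxTurns; turn++) {
    const step = model(history);

    if (step.kind === 'final') {
      return step.text; // Task complete: return the final result
    }

    const tool = tools[step.call.tool];
    // Feed failures back as observations so the model can react to them
    const output = tool
      ? tool(step.call.input)
      : `error: unknown tool ${step.call.tool}`;
    history.push(`tool ${step.call.tool}: ${output}`);
  }

  throw new Error(`No final answer after ${maxTurns} turns`);
}

// A scripted stand-in for Claude: requests one file read, then answers.
const fakeModel: Model = history =>
  history.length === 1
    ? { kind: 'tool_call', call: { tool: 'read_file', input: 'README.md' } }
    : { kind: 'final', text: `Done after ${history.length - 1} tool call(s)` };

const demoTools: Tools = { read_file: name => `contents of ${name}` };

console.log(runLoop(fakeModel, demoTools, 'Summarize the README'));
// prints "Done after 1 tool call(s)"
```

Keeping the model behind a plain function type makes the loop easy to unit test, because the same runLoop can be driven by a script in tests and by a real API call in production.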

Tool

Capabilities callable by the Agent, usually including file system, command execution, retrieval, network requests, etc. Each tool requires a clear input schema and permission boundaries.

Standard Tool Example:

{
  name: 'read_file',
  description: 'Reads text file content',
  inputSchema: {
    type: 'object',
    properties: {
      path: { type: 'string', description: 'File path' }
    },
    required: ['path']
  },
  execute: async ({ path }) => {
    return { content: fs.readFileSync(path, 'utf-8') };
  }
}

Tool Policy

Restrictions on which tools the Agent can use, which paths it can access, and which commands it can execute. It is the first gate for risk control.

Sub-Agent

Parallel/isolation mechanism for complex tasks. The main agent can delegate tasks to sub-agents to reduce context interference and improve throughput.

Checkpoint

A save-and-rollback mechanism for long-running tasks that makes recovery and replay easier.

Environment Preparation and Quick Start

Prerequisites

  • Node.js 18+ or Bun Runtime
  • Git (For workspace management and traceable changes)
  • Anthropic API Key (created in the Anthropic Console)

Install SDK

# Using npm
npm install @anthropic-ai/sdk

# Or using bun (faster)
bun add @anthropic-ai/sdk

# Verify installation (lists the installed package version)
npm ls @anthropic-ai/sdk

Configure API Key

# Create .env file (Node does not read it automatically; load it with a package such as dotenv)
echo "ANTHROPIC_API_KEY=your_api_key_here" > .env

# Or set environment variable
export ANTHROPIC_API_KEY="your_api_key_here"

Complete Hands-on Case 1: File Summary Agent

Let’s start with a minimal runnable example to understand how the Agent works.

Scenario Requirement

Create an Agent capable of reading multiple Markdown files in a project and generating a content summary.

Step 1: Define Tools

First, define the tools needed by the Agent:

// tools.ts
import fs from 'fs';
import path from 'path';

export const readFileTool = {
  name: 'read_file',
  description: 'Reads text file content. Only supports .md, .txt, .json formats.',
  inputSchema: {
    type: 'object',
    properties: {
      path: { 
        type: 'string', 
        description: 'File path relative to project root' 
      }
    },
    required: ['path']
  },
  execute: async ({ path: filePath }: { path: string }) => {
    const allowedExtensions = ['.md', '.txt', '.json'];
    const ext = path.extname(filePath);
    
    if (!allowedExtensions.includes(ext)) {
      throw new Error(`Unsupported file type: ${ext}`);
    }
    
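    // Resolve against the project root and refuse paths that escape it
    const root = process.cwd();
    const fullPath = path.resolve(root, filePath);
    if (fullPath !== root && !fullPath.startsWith(root + path.sep)) {
      throw new Error(`Path is outside the project root: ${filePath}`);
    }
    const content = fs.readFileSync(fullPath, 'utf-8');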
    
    return { 
      content, 
      size: content.length,
      lines: content.split('\n').length
    };
  }
};

export const listFilesTool = {
  name: 'list_files',
  description: 'Lists all files in a specified directory',
  inputSchema: {
    type: 'object',
    properties: {
      directory: { 
        type: 'string', 
        description: 'Directory path' 
      },
      extension: {
        type: 'string',
        description: 'Filter by file extension (e.g., .md)'
      }
    },
    required: ['directory']
  },
  execute: async ({ directory, extension }: { directory: string; extension?: string }) => {
    const dirPath = path.resolve(process.cwd(), directory);
    const files = fs.readdirSync(dirPath);
    
    const filtered = extension
      ? files.filter(f => f.endsWith(extension))
      : files;
    
    return { files: filtered, count: filtered.length };
  }
};

Step 2: Create Agent

// agent.ts
import { Anthropic } from '@anthropic-ai/sdk';
import { readFileTool, listFilesTool } from './tools';

const client = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY
});

async function summarizeMarkdownFiles(directory: string) {
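  // Typed explicitly because later turns push array content (tool results)
  const messages: Anthropic.MessageParam[] = [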
    {
      role: 'user' as const,
      content: `Please analyze all Markdown files in the ${directory} directory and generate a summary report. The report should include:
      1. File list
      2. Core content of each file (1-2 sentences)
      3. Overall theme analysis`
    }
  ];

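  let continueLoop = true;
  const tools = [readFileTool, listFilesTool];

  // The Messages API expects each schema under `input_schema`, so map the
  // local tool objects to the request shape
  const apiTools = tools.map(({ name, description, inputSchema }) => ({
    name,
    description,
    input_schema: inputSchema as { type: 'object'; properties: Record<string, unknown> }
  }));
  
  while (continueLoop) {
    const response = await client.messages.create({
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 4096,
      tools: apiTools,
      messages
    });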

    console.log('Claude Response:', response.content);

    // Collect every tool call: the API expects one tool_result per tool_use block
    const toolUseBlocks = response.content.filter(
      block => block.type === 'tool_use'
    );

    if (toolUseBlocks.length > 0) {
      // Record the assistant turn before answering its tool calls
      messages.push({
        role: 'assistant',
        content: response.content
      });

      const toolResults = [];
      for (const toolUseBlock of toolUseBlocks) {
        if (toolUseBlock.type !== 'tool_use') continue;

        const tool = tools.find(t => t.name === toolUseBlock.name);
        if (!tool) {
          throw new Error(`Tool not found: ${toolUseBlock.name}`);
        }

        console.log(`\nExecuting Tool: ${tool.name}`, toolUseBlock.input);

        const result = await tool.execute(toolUseBlock.input as any);

        console.log('Tool Output:', JSON.stringify(result, null, 2));

        toolResults.push({
          type: 'tool_result' as const,
          tool_use_id: toolUseBlock.id,
          content: JSON.stringify(result)
        });
      }

      // Return all tool results to the model in a single user message
      messages.push({
        role: 'user' as const,
        content: toolResults
      });
    } else {
      // No tool call, task complete
      continueLoop = false;
      
      const textBlock = response.content.find(block => block.type === 'text');
      if (textBlock && textBlock.type === 'text') {
        console.log('\n\n=== Final Summary ===\n');
        console.log(textBlock.text);
      }
    }
  }
}

// Run
summarizeMarkdownFiles('./docs')
  .then(() => console.log('\nTask Completed'))
  .catch(err => console.error('Error:', err));

Step 3: Run and Test

# Create test documents (printf handles \n consistently across shells, unlike echo)
mkdir -p docs
printf '# Document 1\nThis is test content.' > docs/file1.md
printf '# Document 2\nAnother test.' > docs/file2.md

# Run Agent
npx ts-node agent.ts

Expected Output (abridged; the model's wording varies between runs):

Executing Tool: list_files { directory: './docs', extension: '.md' }
Tool Output: { files: ['file1.md', 'file2.md'], count: 2 }

Executing Tool: read_file { path: 'docs/file1.md' }
Tool Output: { content: '# Document 1\nThis is test content.', size: 34, lines: 2 }

Executing Tool: read_file { path: 'docs/file2.md' }
Tool Output: { content: '# Document 2\nAnother test.', size: 26, lines: 2 }

=== Final Summary ===

## File Summary Report

### File List
1. file1.md (34 chars, 2 lines)
2. file2.md (26 chars, 2 lines)

### Content Overview
- **file1.md**: Title document containing basic test content
- **file2.md**: Short test document

### Overall Analysis
These documents appear to be simple example files used for testing purposes.

Task Completed

Complete Hands-on Case 2: Code Review Agent

Scenario Requirement

Build an automated code review tool that checks Python code for:

  • Code style issues
  • Potential bugs
  • Performance optimization suggestions
  • Security vulnerabilities

Implementation Code

// code-review-agent.ts
import { Anthropic } from '@anthropic-ai/sdk';
import fs from 'fs';
import { exec } from 'child_process';
import { promisify } from 'util';

const execAsync = promisify(exec);

const tools = [
  {
    name: 'read_python_file',
    description: 'Reads Python file content for analysis',
    inputSchema: {
      type: 'object',
      properties: {
        path: { type: 'string' }
      },
      required: ['path']
    },
    execute: async ({ path }: { path: string }) => {
      const content = fs.readFileSync(path, 'utf-8');
      const lines = content.split('\n').length;
      return { content, lines };
    }
  },
  {
    name: 'run_pylint',
    description: 'Runs pylint static code analysis tool',
    inputSchema: {
      type: 'object',
      properties: {
        path: { type: 'string' }
      },
      required: ['path']
    },
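    execute: async ({ path }: { path: string }) => {
      // The path comes from the model and is interpolated into a shell command,
      // so accept only plain .py paths without spaces or shell metacharacters
      if (!/^[\w./][\w./-]*\.py$/.test(path)) {
        throw new Error(`Refusing to lint unexpected path: ${path}`);
      }
      try {
        const { stdout } = await execAsync(`pylint ${path} --output-format=json`);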
        return { result: JSON.parse(stdout) };
      } catch (error: any) {
        // Pylint returns non-zero exit code on issues
        return { result: JSON.parse(error.stdout || '[]') };
      }
    }
  }
];

async function reviewPythonCode(filePath: string) {
  const client = new Anthropic({
    apiKey: process.env.ANTHROPIC_API_KEY!
  });

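  // Typed explicitly because later turns push array content (tool results)
  const messages: Anthropic.MessageParam[] = [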
    {
      role: 'user' as const,
      content: `Please review the Python file ${filePath}, focusing on:
      1. Code style and readability
      2. Potential bugs and logical errors
      3. Performance optimization suggestions
      4. Security vulnerabilities (SQL injection, XSS, etc.)
      
      Please run the pylint tool first, then combine its output with the code content for your own review.
      Finally, generate a structured review report.`
    }
  ];

  let continueLoop = true;

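  // The Messages API expects each schema under `input_schema`, so map the
  // local tool objects to the request shape
  const apiTools = tools.map(({ name, description, inputSchema }) => ({
    name,
    description,
    input_schema: inputSchema as { type: 'object'; properties: Record<string, unknown> }
  }));

  while (continueLoop) {
    const response = await client.messages.create({
      model: 'claude-3-5-sonnet-20241022',
      max_tokens: 8192,
      tools: apiTools,
      messages
    });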

    const toolUseBlocks = response.content.filter(block => block.type === 'tool_use');

    if (toolUseBlocks.length > 0) {
      messages.push({ role: 'assistant', content: response.content });

      const toolResults = [];
      for (const toolUse of toolUseBlocks) {
        if (toolUse.type !== 'tool_use') continue;

        const tool = tools.find(t => t.name === toolUse.name);
        const result = await tool!.execute(toolUse.input as any);
        
        toolResults.push({
          type: 'tool_result' as const,
          tool_use_id: toolUse.id,
          content: JSON.stringify(result)
        });
      }

      messages.push({ role: 'user' as const, content: toolResults });
    } else {
      continueLoop = false;
      const textBlock = response.content.find(block => block.type === 'text');
      if (textBlock && textBlock.type === 'text') {
        console.log(textBlock.text);
      }
    }
  }
}

// Usage Example
reviewPythonCode('./example.py');
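
Pylint reports can be long, and run_pylint sends the whole report back to the model as a tool result. One option is to condense the report first. The sketch below assumes the message objects pylint emits with --output-format=json (only the type, symbol, message, and line fields are used); PylintMessage, summarizePylint, and the severity ordering are names and choices introduced here for illustration.

```typescript
// Subset of the fields pylint emits with --output-format=json.
interface PylintMessage {
  type: string;    // e.g. 'error', 'warning', 'convention', 'refactor'
  symbol: string;  // e.g. 'unused-import'
  message: string;
  line: number;
}

interface PylintSummary {
  total: number;
  byType: Record<string, number>;
  top: string[];   // most severe findings, formatted for the model
}

// Lower number = more severe; unknown types sort last.
const SEVERITY: Record<string, number> = {
  fatal: 0, error: 1, warning: 2, refactor: 3, convention: 4, info: 5
};

function summarizePylint(messages: PylintMessage[], limit = 20): PylintSummary {
  // Count findings per type
  const byType: Record<string, number> = {};
  for (const m of messages) {
    byType[m.type] = (byType[m.type] ?? 0) + 1;
  }

  // Sort a copy by severity, then by line number, and keep the first `limit`
  const top = [...messages]
    .sort((a, b) =>
      (SEVERITY[a.type] ?? 99) - (SEVERITY[b.type] ?? 99) || a.line - b.line)
    .slice(0, limit)
    .map(m => `line ${m.line} [${m.type}/${m.symbol}] ${m.message}`);

  return { total: messages.length, byType, top };
}

const sampleReport: PylintMessage[] = [
  { type: 'convention', symbol: 'missing-module-docstring', message: 'Missing module docstring', line: 1 },
  { type: 'error', symbol: 'undefined-variable', message: "Undefined variable 'x'", line: 7 }
];

console.log(summarizePylint(sampleReport));
```

Returning summarizePylint(JSON.parse(stdout)) from the tool instead of the raw report keeps the most serious findings visible while bounding the tool result size.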

Tool Policy and Security Configuration

Define Tool Policy

interface ToolPolicy {
  allowedTools: string[];        // Whitelist of allowed tool names
  allowedPaths: string[];        // Whitelist of allowed paths
  allowedCommands: string[];     // Whitelist of allowed commands
  confirmRequired: string[];     // Tools requiring user confirmation
  maxFileSize: number;           // File size limit (bytes)
  timeout: number;               // Tool execution timeout (ms)
}

const productionPolicy: ToolPolicy = {
  allowedTools: ['read_file', 'list_files', 'grep'],
  allowedPaths: ['./src', './docs', './tests'],
  allowedCommands: ['git', 'npm', 'node'],
  confirmRequired: ['write_file', 'delete_file', 'exec'],
  maxFileSize: 1024 * 1024 * 5,  // 5MB
  timeout: 30000  // 30 seconds
};

Apply Policy

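import fs from 'fs';
import path from 'path';

// Note: allowedCommands and confirmRequired are not enforced by this wrapper;
// tools that run commands or write files need their own checks.
function applyToolPolicy(tool: any, policy: ToolPolicy) {
  return {
    ...tool,
    execute: async (input: any) => {
      // 1. Check if tool is in whitelist
      if (!policy.allowedTools.includes(tool.name)) {
        throw new Error(`Tool ${tool.name} is not whitelisted`);
      }

      // Tools in this article pass a location as either `path` or `directory`
      const requested = input.path ?? input.directory;
      if (requested) {
        // 2. Path check: compare resolved paths so that '../' segments
        //    cannot escape an allowed directory
        const target = path.resolve(requested);
        const isAllowed = policy.allowedPaths.some(allowed => {
          const rel = path.relative(path.resolve(allowed), target);
          return rel !== '..' && !rel.startsWith('..' + path.sep) && !path.isAbsolute(rel);
        });
        if (!isAllowed) {
          throw new Error(`Path ${requested} is not in allowed range`);
        }

        // 3. File size check
        if (fs.existsSync(target)) {
          const stats = fs.statSync(target);
          if (stats.size > policy.maxFileSize) {
            throw new Error(`File too large: ${stats.size} bytes`);
          }
        }
      }

      // 4. Timeout control (the timer is cleared so it cannot keep the process alive)
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeoutPromise = new Promise((_, reject) => {
        timer = setTimeout(() => reject(new Error('Tool execution timeout')), policy.timeout);
      });

      try {
        return await Promise.race([tool.execute(input), timeoutPromise]);
      } finally {
        clearTimeout(timer);
      }
    }
  };
}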

// Usage
const safeTools = tools.map(t => applyToolPolicy(t, productionPolicy));
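
The policy above also declares allowedCommands and confirmRequired. A minimal sketch of how those two fields might be checked follows; isCommandAllowed, decide, and the Decision type are hypothetical helpers introduced here, and treating confirmRequired as taking precedence over allowedTools is an assumption rather than documented SDK behavior.

```typescript
// Returns true when the executable named in the command line is allowlisted
// and the line contains no shell metacharacters that could chain commands.
function isCommandAllowed(commandLine: string, allowedCommands: string[]): boolean {
  if (/[;&|`$<>\n]/.test(commandLine)) return false;
  const executable = commandLine.trim().split(/\s+/)[0];
  return executable !== undefined && executable !== '' && allowedCommands.includes(executable);
}

type Decision = 'run' | 'ask_user' | 'deny';

// Decides what to do with a tool call before it is executed.
function decide(
  toolName: string,
  policy: { allowedTools: string[]; confirmRequired: string[] }
): Decision {
  if (policy.confirmRequired.includes(toolName)) return 'ask_user';
  if (policy.allowedTools.includes(toolName)) return 'run';
  return 'deny'; // anything not listed is rejected by default
}

const demoPolicy = {
  allowedTools: ['read_file', 'list_files', 'grep'],
  confirmRequired: ['write_file', 'delete_file', 'exec']
};

console.log(isCommandAllowed('git status', ['git', 'npm', 'node']));           // true
console.log(isCommandAllowed('git status; rm -rf /', ['git', 'npm', 'node'])); // false
console.log(decide('write_file', demoPolicy));                                 // ask_user
console.log(decide('curl', demoPolicy));                                       // deny
```

Rejecting metacharacters outright is deliberately strict. It blocks legitimate pipelines too, which is usually an acceptable trade for an agent that only needs simple commands.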

Troubleshooting and FAQ

Issue 1: Agent doesn’t call tools

Symptom: Agent returns text directly instead of calling tools.

Possible Causes:

  1. Tool description is unclear
  2. inputSchema is too complex
  3. Task description doesn’t match tool capabilities

Solution:

// ❌ Poor tool description
{
  name: 'file_tool',
  description: 'File operations'  // Too vague
}

// ✅ Excellent tool description
{
  name: 'read_file',
  description: 'Reads the complete content of a text file. Supports .txt, .md, .json, .yaml formats. Returns file content string and metadata (lines, size).'
}

Debugging Tip:

// Explicitly request tool usage in prompt
const messages = [{
  role: 'user',
  content: `Please use the read_file tool to read ${filePath}, do not guess the content.`
}];
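
Besides prompting, the Messages API accepts a tool_choice request field that can require a tool call. The sketch below only builds the request object and makes no network call; buildRequest and the ToolDefinition and ToolChoice types are names introduced here, covering a subset of the API's request shape.

```typescript
// Subset of a Messages API tool definition.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: { type: 'object'; properties: Record<string, unknown>; required?: string[] };
}

type ToolChoice =
  | { type: 'auto' }                // the model decides (default behavior)
  | { type: 'any' }                 // the model must call some tool
  | { type: 'tool'; name: string }; // the model must call this specific tool

// Builds request parameters, optionally forcing a specific tool.
function buildRequest(userText: string, tools: ToolDefinition[], forceTool?: string) {
  if (forceTool && !tools.some(t => t.name === forceTool)) {
    throw new Error(`Cannot force unknown tool: ${forceTool}`);
  }
  const tool_choice: ToolChoice = forceTool
    ? { type: 'tool', name: forceTool }
    : { type: 'auto' };

  return {
    model: 'claude-3-5-sonnet-20241022',
    max_tokens: 1024,
    tools,
    tool_choice,
    messages: [{ role: 'user' as const, content: userText }]
  };
}

const readFileDefinition: ToolDefinition = {
  name: 'read_file',
  description: 'Reads the complete content of a text file.',
  input_schema: {
    type: 'object',
    properties: { path: { type: 'string' } },
    required: ['path']
  }
};

console.log(buildRequest('Read docs/file1.md', [readFileDefinition], 'read_file').tool_choice);
// { type: 'tool', name: 'read_file' }
```

The returned object can be passed to client.messages.create. Forcing a tool is best reserved for the first turn or for debugging, since a forced choice on every turn would prevent the model from ever returning a final text answer.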

Issue 2: Long task fails midway and cannot recover

Symptom: When processing many files, an error in one step causes total failure.

Solution: Implement a checkpoint mechanism (processFile below stands for your own per-file processing function)

interface Checkpoint {
  step: number;
  processedFiles: string[];
  results: any[];
  timestamp: number;
}

async function processWithCheckpoint(files: string[]) {
  const checkpointFile = './checkpoint.json';
  
  // Restore checkpoint
  let checkpoint: Checkpoint = fs.existsSync(checkpointFile)
    ? JSON.parse(fs.readFileSync(checkpointFile, 'utf-8'))
    : { step: 0, processedFiles: [], results: [], timestamp: Date.now() };

  for (let i = checkpoint.step; i < files.length; i++) {
    try {
      const result = await processFile(files[i]);
      checkpoint.processedFiles.push(files[i]);
      checkpoint.results.push(result);
      checkpoint.step = i + 1;
      
      // Save after every 5th completed file
      if ((i + 1) % 5 === 0) {
        fs.writeFileSync(checkpointFile, JSON.stringify(checkpoint));
      }
    } catch (error) {
      console.error(`Failed to process ${files[i]}:`, error);
      fs.writeFileSync(checkpointFile, JSON.stringify(checkpoint));
      throw error;
    }
  }

  // Delete checkpoint upon completion (force: no error if it was never written)
  fs.rmSync(checkpointFile, { force: true });
  return checkpoint.results;
}
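
A checkpoint is only useful if it survives a crash in the middle of a write. One common approach, sketched here under the assumption of a local POSIX-style file system, is to write to a temporary file and rename it over the real one. saveCheckpointAtomic and loadCheckpoint are hypothetical helper names, not SDK functions.

```typescript
import * as fs from 'node:fs';
import * as os from 'node:os';
import * as path from 'node:path';

// Writes JSON to a temporary file, then renames it over the target.
// A rename within one directory replaces the target in a single step, so a
// reader sees either the old checkpoint or the new one, never a partial file.
function saveCheckpointAtomic(file: string, data: unknown): void {
  const tmp = `${file}.${process.pid}.tmp`;
  fs.writeFileSync(tmp, JSON.stringify(data));
  fs.renameSync(tmp, file);
}

// Returns the parsed checkpoint, or the fallback if the file is missing or corrupt.
function loadCheckpoint<T>(file: string, fallback: T): T {
  try {
    return JSON.parse(fs.readFileSync(file, 'utf-8')) as T;
  } catch {
    return fallback;
  }
}

// Usage: round-trip through a temporary directory
const checkpointDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ckpt-'));
const checkpointPath = path.join(checkpointDir, 'checkpoint.json');
const emptyCheckpoint = { step: 0, processedFiles: [] as string[] };

saveCheckpointAtomic(checkpointPath, { step: 3, processedFiles: ['a.md', 'b.md', 'c.md'] });
console.log(loadCheckpoint(checkpointPath, emptyCheckpoint).step); // 3
```

Falling back to an empty checkpoint on a parse error means a damaged file restarts the task from the beginning instead of crashing on startup.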

Issue 3: Unstable Agent output quality

Symptom: Same task, widely different results across runs.

Solution: Add verification steps (runAgent below stands for your own function that runs the agent loop and returns the final text)

async function runWithVerification(task: string) {
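  const messages: { role: 'user' | 'assistant'; content: string }[] = [
    { role: 'user', content: task }
  ];

  let result = '';
  let verified = false;
  let attempts = 0;

  while (!verified && attempts < 3) {
    result = await runAgent(messages);
    
    // Verify output
    if (result.includes('TODO') || result.length < 100) {
      // Keep the rejected answer in the history so the feedback has context
      // and user/assistant turns keep alternating
      messages.push({ role: 'assistant', content: result });
      messages.push({
        role: 'user',
        content: 'Output incomplete, please provide a more detailed analysis.'
      });
      attempts++;
    } else {
      verified = true;
    }
  }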

  return result;
}
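
The length and TODO checks above can be generalized into a list of named checks, which also makes the retry feedback more specific than a generic "output incomplete". This is a sketch under assumed requirements: the Check type, findProblems, buildFeedback, and the section names in reportChecks are illustrative and should be replaced with whatever your report format requires.

```typescript
interface Check {
  name: string;                         // shown to the model when the check fails
  passes: (output: string) => boolean;
}

// Returns the names of all checks that the output fails.
function findProblems(output: string, checks: Check[]): string[] {
  return checks.filter(c => !c.passes(output)).map(c => c.name);
}

// Turns failed checks into a follow-up message for the next attempt.
function buildFeedback(problems: string[]): string {
  const items = problems.map(p => `- ${p}`).join('\n');
  return `The previous answer failed these checks:\n${items}\nPlease fix them and resend the full report.`;
}

// Example checks for the summary report from Case 1 (assumed section names).
const reportChecks: Check[] = [
  { name: 'contains a "File List" section', passes: o => o.includes('File List') },
  { name: 'contains an "Overall Analysis" section', passes: o => o.includes('Overall Analysis') },
  { name: 'has no TODO placeholders', passes: o => !o.includes('TODO') }
];

const draftReport = '### File List\n1. file1.md\n\nTODO: analysis';
const draftProblems = findProblems(draftReport, reportChecks);
console.log(buildFeedback(draftProblems));
```

Deterministic checks like these catch structural gaps cheaply. Judging whether the analysis itself is good still needs a human or a second model pass.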

Issue 4: High API costs

Symptom: Token consumption far exceeds expectations.

Optimization Tips:

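// 1. Limit context length
function trimContext(messages: any[], maxMessages = 10) {
  if (messages.length <= maxMessages) return messages;
  // Keep the original task plus the most recent turns. Start the tail on an
  // assistant message so no tool_result is separated from its tool_use.
  let start = messages.length - (maxMessages - 1);
  while (start < messages.length && messages[start].role !== 'assistant') start++;
  return [messages[0], ...messages.slice(start)];
}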

// 2. Compress tool output
function compressToolResult(result: any) {
  if (typeof result === 'string' && result.length > 2000) {
    return result.slice(0, 1000) + '\n...[omitted ' + (result.length - 1000) + ' chars]...';
  }
  return result;
}

// 3. Use smaller models for simple tasks
const model = task.complexity === 'high' 
  ? 'claude-3-5-sonnet-20241022'
  : 'claude-3-haiku-20240307';
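
Before optimizing, it helps to measure. Each Messages API response carries a usage object with input_tokens and output_tokens, and because every loop iteration resends the conversation so far, input tokens tend to grow with each turn. The UsageTracker class below is a hypothetical helper that accumulates those counts and stops a run once it exceeds a budget chosen by the caller.

```typescript
// Matches the `usage` object on a Messages API response (subset).
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

class UsageTracker {
  private inputTokens = 0;
  private outputTokens = 0;
  private calls = 0;
  private readonly budget: number;

  constructor(budget: number) {
    this.budget = budget;
  }

  // Adds one response's usage and throws once the run exceeds its token budget.
  record(usage: Usage): void {
    this.inputTokens += usage.input_tokens;
    this.outputTokens += usage.output_tokens;
    this.calls += 1;
    if (this.total() > this.budget) {
      throw new Error(`Token budget exceeded: ${this.total()} > ${this.budget}`);
    }
  }

  total(): number {
    return this.inputTokens + this.outputTokens;
  }

  summary() {
    return {
      calls: this.calls,
      inputTokens: this.inputTokens,
      outputTokens: this.outputTokens,
      total: this.total()
    };
  }
}

const tracker = new UsageTracker(10000);
tracker.record({ input_tokens: 1200, output_tokens: 300 });
tracker.record({ input_tokens: 2500, output_tokens: 400 });
console.log(tracker.summary());
// { calls: 2, inputTokens: 3700, outputTokens: 700, total: 4400 }
```

In the agent loops above, calling tracker.record(response.usage) right after each client.messages.create call would turn a runaway loop into an explicit error instead of a surprise on the bill.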

FAQ

Q1: Difference between Agent SDK and LangChain/AutoGPT?

A: Main differences lie in design philosophy and applicable scenarios:

| Feature | Claude Agent SDK | LangChain | AutoGPT |
| --- | --- | --- | --- |
| Goal | Controllable, audit-ready enterprise execution | General LLM app development framework | Fully autonomous AGI exploration |
| Control | Fine-grained tool policies | Medium | Coarse |
| Scenarios | Code refactoring, data processing, ops | Chatbots, RAG | Open-ended task exploration |
| Learning Curve | Medium | Steep | Simple |

Q2: How to handle scenarios requiring user interaction?

A: Implement bi-directional communication mechanisms:

async function interactiveAgent() {
  const readline = require('readline').createInterface({
    input: process.stdin,
    output: process.stdout
  });

  const askUser = (question: string): Promise<string> => {
    return new Promise(resolve => {
      readline.question(question, resolve);
    });
  };

  // Called within tool
  const confirmTool = {
    name: 'confirm_action',
    description: 'Requests user confirmation for sensitive operations',
    inputSchema: {
      type: 'object',
      properties: {
        action: { type: 'string' },
        details: { type: 'string' }
      },
      required: ['action', 'details']
    },
    execute: async ({ action, details }: any) => {
      const answer = await askUser(
        `\nConfirm execution: ${action}?\nDetails: ${details}\n(y/n): `
      );
      return { confirmed: answer.trim().toLowerCase() === 'y' };
    }
  };

  // Return the tool so it can be registered with the agent, plus a way to
  // release stdin once the session ends
  return { confirmTool, close: () => readline.close() };
}

Q3: How to monitor Agent execution?

A: Implement structured logging:

class AgentLogger {
  private logs: any[] = [];

  logToolCall(toolName: string, input: any, output: any, duration: number) {
    this.logs.push({
      type: 'tool_call',
      toolName,
      input,
      output,
      duration,
      timestamp: Date.now()
    });
  }

  logError(error: Error, context: any) {
    this.logs.push({
      type: 'error',
      message: error.message,
      stack: error.stack,
      context,
      timestamp: Date.now()
    });
  }

  exportMetrics() {
    const toolCalls = this.logs.filter(l => l.type === 'tool_call');
    const totalDuration = toolCalls.reduce((sum, l) => sum + l.duration, 0);
    return {
      totalToolCalls: toolCalls.length,
      // Average over tool calls only; error entries carry no duration
      averageDuration: toolCalls.length > 0 ? totalDuration / toolCalls.length : 0,
      errors: this.logs.filter(l => l.type === 'error').length
    };
  }
}
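
Calling the logger by hand inside every tool is easy to forget. A wrapper can time each call and record it whether the tool succeeds or throws. In this sketch, withLogging, ToolLike, and CallLog are hypothetical names, and the plain sink array stands in for a logger such as the AgentLogger class above.

```typescript
interface ToolLike<I, O> {
  name: string;
  execute: (input: I) => Promise<O>;
}

interface CallLog {
  toolName: string;
  ok: boolean;
  durationMs: number;
}

// Wraps a tool so that every call is timed and appended to `sink`,
// including calls that end in an exception.
function withLogging<I, O>(tool: ToolLike<I, O>, sink: CallLog[]): ToolLike<I, O> {
  return {
    ...tool,
    execute: async (input: I) => {
      const start = Date.now();
      let ok = false;
      try {
        const output = await tool.execute(input);
        ok = true;
        return output;
      } finally {
        // Runs on success and on failure; the original error still propagates
        sink.push({ toolName: tool.name, ok, durationMs: Date.now() - start });
      }
    }
  };
}

// Usage with a trivial tool
const callLog: CallLog[] = [];
const upperTool = withLogging(
  { name: 'upper', execute: async (text: string) => text.toUpperCase() },
  callLog
);

upperTool.execute('hello').then(output => {
  console.log(output, callLog.length); // HELLO 1
});
```

Because the wrapper returns the same shape it receives, it composes with other wrappers such as applyToolPolicy.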

Q4: Which programming languages are supported?

A: The examples in this article use TypeScript/JavaScript (Anthropic also publishes a Python SDK), but the Agent can handle code in any language:

  • Direct Support: TypeScript, JavaScript, Python, Go, Rust
  • Via Tool Support: Any language with CLI tools (Java/javac, C++/gcc)

Q5: How to implement concurrent task processing?

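A: Use the sub-agent pattern. In the sketch below, createSubAgent and aggregateResults are placeholders for your own implementation, not SDK functions: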

async function parallelProcessing(tasks: string[]) {
  const subAgents = tasks.map(task => 
    createSubAgent({ 
      task, 
      maxTokens: 2000,
      isolatedContext: true 
    })
  );

  const results = await Promise.all(
    subAgents.map(agent => agent.run())
  );

  return aggregateResults(results);
}
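
Promise.all starts every sub-agent at once and rejects as soon as one fails, which can trip API rate limits and discard finished work. A bounded worker pool is one alternative. The following sketch is generic and makes no SDK calls: mapWithConcurrency and the Settled type are names introduced here, and the worker function stands in for whatever runs a single sub-agent.

```typescript
type Settled<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Runs `worker` over `items` with at most `limit` calls in flight.
// Failures are captured per item instead of aborting the whole batch.
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<Settled<R>[]> {
  const results: Settled<R>[] = new Array(items.length);
  let next = 0; // index of the next unclaimed item, shared by all runners

  async function runner(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claimed synchronously, before any await
      try {
        results[i] = { ok: true, value: await worker(items[i], i) };
      } catch (error) {
        results[i] = { ok: false, error };
      }
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage: summarize three files, two at a time (the worker is a stub)
mapWithConcurrency(['a.md', 'b.md', 'c.md'], 2, async file => `summary of ${file}`)
  .then(results => console.log(results));
```

Results keep the order of the input array, so they can be matched back to their tasks without extra bookkeeping.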

Summary

The value of the Claude Agent SDK lies not in “writing more like a human”, but in “executing more like engineering”. Starting from a minimal runnable version, gradually adding tools, constraints, and verification mechanisms, you can build an auditable, controllable, and scalable intelligent agent system.

Key Takeaways:

  • ✅ Tool descriptions must be clear and specific, including input/output examples
  • ✅ Use tool policies to control permission boundaries (paths, commands, size)
  • ✅ Implement checkpoint mechanisms to handle long-running tasks
  • ✅ Add verification steps to ensure output quality
  • ✅ Structured logs facilitate monitoring and debugging
  • ✅ Verify with read-only tools first, then gradually open up write tools

Remember: An Agent is not a silver bullet, it is an amplifier. Good tool design + clear task boundaries + appropriate human supervision = reliable automation systems.

Start building your first Agent now!