What is agent-code?

agent-code is an open-source, AI-powered coding agent for the terminal. You describe what you want in natural language, and the agent reads your codebase, runs commands, edits files, and iterates until the task is done.

It's built in pure Rust for speed, safety, and a single static binary with zero runtime dependencies.

What can it do?

  • Searches your codebase, reads files, traces dependencies, and explains how things work
  • Creates files, makes targeted edits, refactors code, adds features, and fixes bugs
  • Executes shell commands, runs tests, manages git operations, and handles build tools
  • Chains reading, writing, and running together to complete complex engineering tasks autonomously

How it works

The core loop is simple:

  1. You type a request
  2. The agent sends your request + conversation history to an LLM
  3. The LLM responds with text and tool calls
  4. The agent executes the tools (file reads, edits, shell commands, etc.)
  5. Tool results are fed back to the LLM
  6. Repeat until the task is done

Every tool call passes through a permission system before executing. You stay in control.

You: "add input validation to the signup endpoint"

Agent:
  → FileRead src/routes/signup.rs
  → Grep "validate" src/
  → FileEdit src/routes/signup.rs (add validation logic)
  → Bash "cargo test"
  → FileEdit src/routes/signup.rs (fix test failure)
  → Bash "cargo test"
  ✓ All tests pass. Validation added.

Key features

  • 31 built-in tools for file operations, shell commands, code search, web access, language servers, and more
  • 42 slash commands for git, session management, diagnostics, sharing, and agent control
  • 12 bundled skills including security review, architecture advisor, bug hunter, and implementation planner
  • 12 LLM providers — Anthropic, OpenAI, xAI, Google, DeepSeek, Groq, Mistral, Together, Zhipu, Ollama, AWS Bedrock, Google Vertex
  • Permission system with configurable rules per tool, pattern matching, and protected directories
  • MCP support for connecting external tool servers
  • Memory system for persistent context across sessions
  • Plugin system for bundling skills, hooks, and configuration
  • Session persistence with save, resume, share, fork, and rewind
  • Plan mode for safe read-only exploration
  • Extended thinking for complex reasoning tasks
  • Cross-platform — Linux, macOS, and Windows

Architecture

                    ┌──────────────┐
                    │  CLI / REPL  │
                    └───────┬──────┘
                            │
          ┌─────────────────▼────────────────┐
          │           Query Engine           │
          │  stream → tools → loop → compact │
          └────┬───────────┬───────────┬─────┘
               │           │           │
        ┌──────▼────┐ ┌────▼─────┐ ┌───▼──────┐
        │   Tools   │ │  Perms   │ │  Hooks   │
        │  31 built │ │  allow   │ │ pre/post │
        │   + MCP   │ │ deny/ask │ │  shell   │
        └───────────┘ └──────────┘ └──────────┘

Next steps

Get up and running in 2 minutes. All the ways to install agent-code.

Install

# Cargo (recommended)
cargo install agent-code

# Homebrew
brew install avala-ai/tap/agent-code

# From source
git clone https://github.com/avala-ai/agent-code.git
cd agent-code
cargo build --release
# Binary: target/release/agent

Set your API key

agent-code works with any LLM provider. Set the key for the one you use:

# Anthropic (Claude)
export ANTHROPIC_API_KEY="sk-ant-..."

# OpenAI
export OPENAI_API_KEY="sk-..."

# Any OpenAI-compatible provider
export AGENT_CODE_API_KEY="your-key"
export AGENT_CODE_API_BASE_URL="https://api.your-provider.com/v1"

Start the agent

agent

You'll see:

 agent  session a1b2c3d
Type your message, or /help for commands. Ctrl+C to cancel, Ctrl+D to exit.

>

Try it out

Type a natural language request:

> what files are in this project?

The agent will use the Glob and FileRead tools to explore and answer.

Try something more complex:

> add a health check endpoint to the API server that returns the git commit hash

The agent will:

  1. Read the existing code to understand the project structure
  2. Find how other endpoints are defined
  3. Write the new endpoint
  4. Run tests if they exist

Slash commands

Type /help to see all available commands:

> /help

Available commands:

  /help           Show this help message
  /clear          Clear conversation history
  /cost           Show session cost and token usage
  /model          Show or change the current model
  /commit         Commit current changes
  /review         Review current diff for issues
  /plan           Toggle plan mode (read-only)
  /doctor         Check environment health
  ...

One-shot mode

For scripting and CI, use --prompt to run a single task and exit:

agent --prompt "fix the failing tests" --dangerously-skip-permissions

Next steps

Configure models, permissions, and behavior. See all 31 built-in tools. Create custom reusable workflows. Connect external tool servers.

Requirements

  • A supported LLM API key (Anthropic, OpenAI, or any compatible provider)
  • git and rg (ripgrep) for full functionality

Install methods

If you have Rust installed:

cargo install agent-code

This installs the agent binary to ~/.cargo/bin/.
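
If the agent command isn't found afterwards, ~/.cargo/bin is probably missing from your PATH. A portable check you can run in any POSIX shell:

```shell
# Add ~/.cargo/bin to PATH for this session if it isn't already there
case ":$PATH:" in
  *":$HOME/.cargo/bin:"*) ;;                        # already present, nothing to do
  *) PATH="$HOME/.cargo/bin:$PATH"; export PATH ;;  # prepend it
esac
```

Add the same lines to your shell profile to make the change permanent.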

Homebrew

On macOS or Linux:

brew install avala-ai/tap/agent-code

Prebuilt binaries

Download from GitHub Releases:

Platform   Architecture    Download
Linux      x86_64          agent-linux-x86_64.tar.gz
Linux      aarch64         agent-linux-aarch64.tar.gz
macOS      x86_64          agent-macos-x86_64.tar.gz
macOS      Apple Silicon   agent-macos-aarch64.tar.gz
Windows    x86_64          agent-windows-x86_64.zip

# Example: macOS Apple Silicon
curl -L https://github.com/avala-ai/agent-code/releases/latest/download/agent-macos-aarch64.tar.gz | tar xz
sudo mv agent /usr/local/bin/

From source

git clone https://github.com/avala-ai/agent-code.git
cd agent-code
cargo build --release
sudo cp target/release/agent /usr/local/bin/

Verify installation

agent --version
# agent 0.1.1

Confirm the agent can build its system prompt:

agent --dump-system-prompt | head -5
# You are an AI coding agent...

Uninstall

# Cargo
cargo uninstall agent-code

# Homebrew
brew uninstall agent-code

# Manual
rm $(which agent)

Data locations

What                Path
User config         ~/.config/agent-code/config.toml
Session data        ~/.config/agent-code/sessions/
Memory              ~/.config/agent-code/memory/
Skills              ~/.config/agent-code/skills/
Plugins             ~/.config/agent-code/plugins/
Keybindings         ~/.config/agent-code/keybindings.json
History             ~/.local/share/agent-code/history.txt
Tool output cache   ~/.cache/agent-code/tool-results/
Task output         ~/.cache/agent-code/tasks/

This tutorial walks through using agent-code on a real project for the first time.

Prerequisites

  • agent-code installed (agent --version works)
  • An API key configured (any provider)
  • A project directory with code in it

Step 1: Navigate to your project

cd /path/to/your/project

agent-code uses your current directory as context. It can read files, run commands, and make edits here.

Step 2: Start the agent

agent

You'll see the welcome banner with your session ID. The agent is ready.

Step 3: Explore the codebase

Ask the agent to understand your project:

> what is this project and how is it structured?

The agent will use Glob to find files, FileRead to read key files (README, package.json, Cargo.toml, etc.), and explain the structure.

Step 4: Make a change

Try something concrete:

> add a health check endpoint that returns {"status": "ok"}

The agent will:

  1. Read existing code to understand patterns
  2. Find where endpoints are defined
  3. Write the new endpoint
  4. Run tests if they exist

Watch the tool calls — you'll see FileRead, Grep, FileWrite, and Bash in action.

Step 5: Review what changed

> /diff

This shows the git diff of everything the agent modified.

Step 6: Commit if you're happy

> /commit

The agent reviews the diff and creates a commit with a descriptive message.

Step 7: Save project context

Create an AGENTS.md file so the agent remembers your project in future sessions:

> /init

Or ask the agent to create one:

> create an AGENTS.md with our project's tech stack, conventions, and test commands

What's next

  • Use /plan to explore code safely (read-only mode)
  • Use /review to review your changes before committing
  • Use /model to switch to a faster or more capable model
  • See Custom Skills to create reusable workflows

Skills turn multi-step workflows into single commands. This tutorial creates a skill from scratch.

What we'll build

A /deploy-check skill that verifies a project is ready for deployment: tests pass, no uncommitted changes, and the build succeeds.

Step 1: Create the skill file

mkdir -p .agent/skills

Create .agent/skills/deploy-check.md:

---
description: Verify the project is ready for deployment
userInvocable: true
---

Run a pre-deployment checklist:

1. Check for uncommitted changes with `git status`. If there are
   uncommitted changes, warn the user and stop.

2. Run the project's test suite. If any tests fail, report the
   failures and stop.

3. Run the build command. If it fails, report the error and stop.

4. If everything passes, report "Ready to deploy" with a summary
   of what was checked.

Do not proceed past a failing step.

Step 2: Verify it loaded

Start agent-code and check:

> /skills

You should see deploy-check [invocable] in the list.

Step 3: Run it

> /deploy-check

The agent follows the steps in order, stopping at the first failure.

Adding arguments

Skills support {{arg}} substitution. Create .agent/skills/review-file.md:

---
description: Deep review of a specific file
userInvocable: true
---

Review {{arg}} thoroughly:

1. Read the file and understand its purpose
2. Check for bugs, edge cases, and error handling gaps
3. Check for security issues (injection, XSS, auth bypass)
4. Suggest specific improvements with line references

Use it:

> /review-file src/auth.rs

Directory skills

For complex skills with supporting context, use a directory:

.agent/skills/
  deploy-check/
    SKILL.md          ← the skill definition
    checklist.md      ← referenced by the skill
    known-issues.md   ← context the agent can read

The skill file must be named SKILL.md in a directory skill.

Sharing skills

Skills are just markdown files. Share them by:

  • Committing .agent/skills/ to your repo (team-wide)
  • Copying to ~/.config/agent-code/skills/ (personal, all projects)
  • Publishing as a plugin (see Plugins)

Tips

  • Keep skill prompts specific — vague instructions produce vague results
  • Number the steps — the agent follows numbered lists reliably
  • Include stop conditions ("if X fails, stop and report")
  • Test with /skills to verify loading before running
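
Because skills are plain files, creating one is easy to script. A minimal sketch (the hello skill here is illustrative, not bundled):

```shell
# Create a tiny user-invocable skill from the shell
mkdir -p .agent/skills
cat > .agent/skills/hello.md <<'EOF'
---
description: Say hello and summarize the current directory
userInvocable: true
---

1. Run `ls` to see the current directory contents.
2. Summarize what you see in one sentence.
EOF
```

Restart agent-code (or run /skills) and the new skill should appear in the list.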

MCP lets you extend agent-code with tools from external servers — databases, APIs, file systems, and more.

What we'll do

Connect the official GitHub MCP server so the agent can create issues, read PRs, and manage repositories.

Step 1: Create project config

agent   # start agent-code
> /init  # creates .agent/settings.toml

Or create it manually:

mkdir -p .agent
touch .agent/settings.toml

Step 2: Add the MCP server

Edit .agent/settings.toml:

[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_your_token_here" }

Step 3: Restart and verify

Restart agent-code and check the connection:

> /mcp
1 MCP server(s) configured:
  github (stdio)

Step 4: Use it

The agent now has access to GitHub tools:

> create a GitHub issue titled "Add input validation" with a description of what needs to be done

The agent calls the MCP server's create_issue tool, which creates the issue via the GitHub API.

SSE transport (remote servers)

For servers that expose an HTTP endpoint:

[mcp_servers.my-api]
url = "http://localhost:8080"

The agent connects via Server-Sent Events instead of stdio.

Multiple servers

Add as many as you need:

[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_..." }

[mcp_servers.postgres]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]

[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/docs"]

Security

MCP servers have access to whatever their underlying service provides. Restrict access with:

[security]
mcp_server_allowlist = ["github", "filesystem"]  # Only these servers can connect

Or block specific ones:

[security]
mcp_server_denylist = ["untrusted-server"]

Browsing MCP resources

Some MCP servers expose resources (like database schemas or file listings). The agent can browse them with the ListMcpResources and ReadMcpResource tools automatically.

Troubleshooting

  • Server stuck connecting: verify the command works manually — npx -y @modelcontextprotocol/server-github
  • Tools not appearing: restart agent-code after config changes
  • Permission errors: check the env vars (tokens, credentials) are correct
  • See the full MCP configuration guide for details

agent-code works with 12+ LLM providers. This tutorial shows how to switch between them and set up your preferred workflow.

Quick switch: one env var

Each provider is activated by setting its API key:

# Anthropic (Claude)
export ANTHROPIC_API_KEY="sk-ant-..."
agent

# OpenAI (GPT)
export OPENAI_API_KEY="sk-..."
agent

# Google (Gemini)
export GOOGLE_API_KEY="AIza..."
agent

The agent auto-detects the provider from which key is set and configures the correct API endpoint.

Switch mid-session

Use the /model command to open the interactive model picker:

> /model

This shows models available for your current provider. Or specify directly:

> /model gpt-4.1-mini

Use a specific provider via config

For permanent setup, edit your config file:

# ~/.config/agent-code/config.toml

[api]
model = "claude-sonnet-4-20250514"

Local models with Ollama

Run models locally with zero API cost:

# Install and start Ollama
ollama serve

# Pull a model
ollama pull llama3

# Run agent-code with it
agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key unused

The --api-key flag must be present even though Ollama ignores its value; any placeholder string works.

Any OpenAI-compatible endpoint

agent-code works with any service that speaks the OpenAI Chat Completions API:

# OpenRouter (access any model via one key)
agent --api-base-url https://openrouter.ai/api/v1 --api-key sk-or-... --model anthropic/claude-sonnet-4

# Together AI
agent --api-base-url https://api.together.xyz/v1 --api-key ... --model meta-llama/Llama-3-70b-chat-hf

# Groq (fast inference)
agent --api-base-url https://api.groq.com/openai/v1 --api-key gsk_... --model llama-3.3-70b-versatile

# Your own endpoint
agent --api-base-url http://localhost:8080/v1 --api-key ... --model my-model

AWS Bedrock

Access Claude models through your AWS account:

export AGENT_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1
# Uses your default AWS credential chain (env vars, ~/.aws/credentials, IAM role)
agent

Google Vertex AI

Access Claude models through Google Cloud:

export AGENT_CODE_USE_VERTEX=1
export GOOGLE_CLOUD_PROJECT=my-project
export GOOGLE_CLOUD_LOCATION=us-central1
agent

Model recommendations by task

Task                       Recommended                         Why
Quick fixes, small edits   GPT-4.1-mini, Haiku, Gemini Flash   Fast, cheap
Feature implementation     Sonnet, GPT-4.1                     Good balance
Complex architecture       Opus, GPT-5.4                       Maximum reasoning
Local/private code         Ollama + Llama 3                    No data leaves your machine

Cost tracking

Check what you're spending:

> /cost

Set a session limit:

[api]
max_cost_usd = 5.0  # Stop after $5

The /cost command shows per-model breakdown when you've used multiple models in one session.

The agent loop is the core execution engine. It handles the cycle of calling the LLM, executing tools, and managing context.

Turn lifecycle

Each turn follows this sequence:

1. Budget check        → stop if cost/token limit exceeded
2. Message normalize   → pair orphaned tool results, merge consecutive user messages
3. Auto-compact        → if context nears window limit:
                           microcompact → LLM summary → context collapse → aggressive trim
4. Build request       → system prompt + history + tool schemas
5. Stream response     → display text in real-time, collect content blocks
6. Error recovery      → rate limit retry, prompt-too-long compact, max-output continue
7. Extract tool calls  → parse tool_use blocks from response
8. Permission check    → allow/deny/ask per tool and pattern
9. Execute tools       → concurrent batch (read-only) or serial (mutations)
10. Inject results     → add tool results to history
11. Loop               → back to step 1 until no tool calls

Compaction strategies

Long sessions exceed the context window. The agent uses three strategies, tried in order:

  1. Microcompact: clears the content of old tool results, replacing each with `[Old tool result cleared]`. Keeps tool_use/tool_result pairing intact. Cheapest — no API call needed.
  2. LLM summary: calls the LLM to generate a concise summary of older messages and replaces them with a compact boundary marker plus the summary. Costs one API call but frees significant context.
  3. Context collapse: removes message groups from the middle of the conversation, keeping the first (context/summary) and last (recent) groups. The full history stays in memory for session persistence — only the API-facing view is collapsed.

Error recovery

The agent handles these error conditions automatically:

Error                   Recovery
Rate limited (429)      Wait retry_after ms, retry up to 5 times
Overloaded (529)        5s backoff, retry up to 5 times, then fall back to smaller model
Prompt too long (413)   Reactive microcompact, then context collapse
Max output tokens       Inject continuation message, retry up to 3 times
Stream interrupted      Exponential backoff with retry

Token budget

The agent tracks token usage and estimated cost:

  • Auto-compact fires at context_window - 20K reserved - 13K buffer
  • Budget enforcement stops execution when cost or token limits are reached
  • Diminishing progress detection stops after 3 turns with minimal output
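
As a worked example, assuming a model with a 200K-token context window (window sizes vary by model):

```shell
# Auto-compact threshold = context window - 20K reserved - 13K buffer
echo $((200000 - 20000 - 13000))   # prints 167000
```

So with a 200K window, compaction kicks in once the conversation approaches 167K tokens.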

Configure limits:

[api]
max_cost_usd = 5.0  # Stop after $5 spent this session

Extended thinking

When using models that support it, the agent sends a thinking budget with each request. The budget scales by model:

Model    Thinking budget
Opus     32,000 tokens
Sonnet   16,000 tokens
Haiku    8,000 tokens

Thinking content is displayed briefly in the terminal but not stored in conversation history.

Tools are the bridge between the LLM's intentions and your local environment. Each tool defines what it can do, what inputs it accepts, and whether it's safe to run in parallel.

How tools work

  1. The LLM decides which tool to call and with what arguments
  2. The agent validates the input against the tool's schema
  3. Permission checks run (allow/deny/ask based on your rules)
  4. The tool executes and returns a result
  5. The result is sent back to the LLM for the next step

Tool categories

File operations

Tool           What it does
FileRead       Read files with line numbers. Handles text, PDF (via pdftotext), Jupyter notebooks, and images (base64 for vision).
FileWrite      Create or overwrite files. Auto-creates parent directories.
FileEdit       Search-and-replace within files. Requires unique match unless replace_all is set.
NotebookEdit   Edit Jupyter notebook cells (replace, insert, delete).

Search

Tool         What it does
Grep         Regex content search powered by ripgrep. Supports context lines, glob filtering, case sensitivity.
Glob         Find files by pattern, sorted by modification time.
ToolSearch   Discover available tools by keyword or direct name.

Execution

Tool   What it does
Bash   Run shell commands with timeout, background execution, destructive command detection, and output truncation.
REPL   Execute Python or Node.js code snippets in an interpreter.

Agent coordination

Tool          What it does
Agent         Spawn subagents for parallel tasks with optional git worktree isolation.
SendMessage   Send messages between running agents.
Skill         Invoke user-defined skills programmatically.

Planning and tracking

Tool                           What it does
EnterPlanMode / ExitPlanMode   Toggle read-only mode for safe exploration.
TaskCreate / TaskUpdate / TaskGet / TaskList / TaskStop / TaskOutput
                               Full task lifecycle management.
TodoWrite                      Structured todo list management.

Web and external

Tool        What it does
WebFetch    HTTP GET with HTML-to-text conversion.
WebSearch   Web search with result extraction.
LSP         Language server diagnostics with linter fallbacks.

MCP integration

Tool               What it does
McpProxy           Calls tools on connected MCP servers.
ListMcpResources   Browse MCP server resources.
ReadMcpResource    Read a specific MCP resource by URI.

Workspace

Tool                           What it does
EnterWorktree / ExitWorktree   Create and manage isolated git worktrees.
AskUserQuestion                Prompt the user with structured multi-choice questions.
Sleep                          Async pause with cancellation.

Concurrency

Tools declare whether they're safe to run in parallel:

  • Read-only tools (FileRead, Grep, Glob, etc.) run concurrently
  • Mutation tools (Bash, FileWrite, FileEdit) run serially

This maximizes throughput while preventing race conditions on file writes.

Result handling

Tool results larger than 64KB are automatically persisted to disk (~/.cache/agent-code/tool-results/). The conversation receives a truncated preview with a file path reference so the full output isn't lost.
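
The spill-to-disk rule can be sketched in shell (the 64KB threshold comes from the text above; the file name and sizes are illustrative):

```shell
limit=$((64 * 1024))                                # 64KB threshold
result=$(head -c 70000 /dev/zero | tr '\0' 'x')     # a fake 70,000-byte tool result
if [ "${#result}" -gt "$limit" ]; then
  printf '%s' "$result" > /tmp/tool-result-001.txt  # persist the full output
  preview="$(printf '%s' "$result" | head -c 100)... [full output: /tmp/tool-result-001.txt]"
else
  preview=$result                                   # small results pass through unchanged
fi
```

The conversation only ever sees the preview string; the full result remains readable on disk.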

Writing custom tools

See Custom Tools for how to implement the Tool trait and register new tools.

Every tool call passes through the permission system before executing. This gives you fine-grained control over what the agent is allowed to do.

Permission modes

Set the mode via CLI flag or config:

Mode            Behavior
ask (default)   Prompt before mutations, auto-allow reads
allow           Auto-approve everything
deny            Block all mutations
plan            Read-only tools only (strictest)
accept_edits    Auto-approve file edits, ask for shell commands

# CLI
agent --permission-mode plan

# Skip all checks (CI/scripting only)
agent --dangerously-skip-permissions

Permission rules

Configure per-tool rules in your settings file:

# .agent/settings.toml or ~/.config/agent-code/config.toml

[permissions]
default_mode = "ask"

# Auto-approve git commands
[[permissions.rules]]
tool = "Bash"
pattern = "git *"
action = "allow"

# Block destructive commands
[[permissions.rules]]
tool = "Bash"
pattern = "rm -rf *"
action = "deny"

# Allow writes only to /tmp
[[permissions.rules]]
tool = "FileWrite"
pattern = "/tmp/*"
action = "allow"

Rules are evaluated in order. The first matching rule wins. If no rule matches, the default mode applies.
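
First-match-wins evaluation behaves like a shell case statement. A sketch of how the example rules above would classify Bash commands (the decide helper is illustrative, not part of agent-code):

```shell
decide() {
  case "$1" in
    "git "*)    echo allow ;;   # rule 1: git * -> allow
    "rm -rf "*) echo deny  ;;   # rule 2: rm -rf * -> deny
    *)          echo ask   ;;   # no rule matched: default mode applies
  esac
}
decide "git push origin main"   # prints allow
decide "rm -rf /tmp/build"      # prints deny
decide "cargo test"             # prints ask
```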

Built-in safety

Protected directories

Write tools (FileWrite, FileEdit, MultiEdit, NotebookEdit) are blocked from writing to these directories regardless of your permission rules:

  • .git/ — prevent repository corruption
  • .husky/ — prevent hook tampering
  • node_modules/ — prevent dependency modification

Read access to these directories is unaffected.

Shell safety

The Bash tool includes additional safety checks beyond the permission system:

  • Destructive command detection: warns before rm -rf, git reset --hard, DROP TABLE, chmod -R 777, and other dangerous patterns
  • System path blocking: prevents writes to /etc, /usr, /bin, /sbin, /boot, /sys, /proc
  • Output truncation: large outputs are persisted to disk instead of flooding context

Plan mode

Plan mode restricts the agent to read-only operations. Use it when you want the agent to analyze and plan without making changes:

> /plan
Plan mode enabled. Only read-only tools available.

> analyze the architecture and suggest improvements
(agent reads files, searches code, but cannot edit or run commands)

> /plan
Plan mode disabled. All tools available.

Denial tracking

Permission denials are recorded with the tool name, reason, and input summary. View them with /permissions:

> /permissions
Permission mode: Ask
Rules:
  Bash git * -> Allow
  Bash rm * -> Deny

Memory gives the agent context that persists across sessions. There are two layers:

Project memory

Place an AGENTS.md file in your project root or .agent/AGENTS.md:

# Project Context

This is a Rust web API using Axum and SQLx.
The database is PostgreSQL, migrations are in db/migrations/.
Run tests with `cargo test`. The CI pipeline is in .github/workflows/ci.yml.
Always run `cargo fmt` before committing.

This is loaded automatically at the start of every session in that project directory. Use it for project-specific instructions, conventions, and context that every session needs.

User memory

User-level memory lives in ~/.config/agent-code/memory/:

  • MEMORY.md — the index file, loaded automatically
  • Individual memory files linked from the index

<!-- ~/.config/agent-code/memory/MEMORY.md -->
- [Preferences](preferences.md) — coding style and response preferences
- [Work context](work.md) — current projects and priorities

<!-- ~/.config/agent-code/memory/preferences.md -->
---
name: preferences
description: User coding style preferences
type: user
---

- I prefer explicit error handling over unwrap/expect
- Use descriptive variable names, not single letters
- Always include tests for new functions

How memory is used

Memory files are injected into the system prompt at session start:

  1. Project AGENTS.md → appears under "# Project Context"
  2. User MEMORY.md index → appears under "# User Memory"
  3. Individual memory files linked from the index → loaded and appended

The agent sees this context on every turn, so it can follow your conventions and understand your project without being told every time.

Size limits

Limit             Value
Max file size     25KB per memory file
Max index lines   200 lines

Files exceeding these limits are truncated with a (truncated) marker.

Commands

> /memory
Project context: loaded
User memory: loaded (2 files)

Every interactive session is automatically saved when you exit. You can resume any previous session to continue where you left off.

Auto-save

Sessions save automatically on exit (Ctrl+D or /exit). The session file includes:

  • Full conversation history (messages, tool calls, results)
  • Turn count and model used
  • Working directory at session start
  • Session ID

Sessions are stored as JSON in ~/.config/agent-code/sessions/.

Resume a session

List recent sessions:

> /sessions
Recent sessions:

  a1b2c3d — /home/user/project (5 turns, 23 msgs, 2026-03-31T20:15:00Z)
  e4f5g6h — /home/user/other (12 turns, 67 msgs, 2026-03-31T18:30:00Z)

Use /resume <id> to restore a session.

Resume by ID:

> /resume a1b2c3d
Resumed session a1b2c3d (23 messages, 5 turns)

The agent now has the full conversation context from the previous session and can continue the work.

Session ID

Each session gets a short unique ID shown in the welcome banner:

 agent  session a1b2c3d

Use this ID to resume later.

Export

Export the current conversation as markdown:

> /export
Exported to conversation-export-20260331-201500.md

The export includes user messages and assistant responses as readable markdown.

Configuration loads from three layers (highest priority first):

  1. CLI flags and environment variables
  2. Project config: .agent/settings.toml in your repo
  3. User config: ~/.config/agent-code/config.toml

Full config reference

# ~/.config/agent-code/config.toml

[api]
base_url = "https://api.anthropic.com/v1"
model = "claude-sonnet-4-20250514"
# api_key is resolved from env: AGENT_CODE_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY
max_output_tokens = 16384
thinking = "enabled"          # "enabled", "disabled", or omit for default
effort = "high"               # "low", "medium", "high"
max_cost_usd = 10.0           # Stop session after this spend
timeout_secs = 120
max_retries = 3

[permissions]
default_mode = "ask"          # "ask", "allow", "deny", "plan", "accept_edits"

[[permissions.rules]]
tool = "Bash"
pattern = "git *"
action = "allow"

[[permissions.rules]]
tool = "Bash"
pattern = "rm *"
action = "deny"

[ui]
markdown = true
syntax_highlight = true
theme = "dark"

# MCP servers (see MCP Servers page)
[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/path"]

# Lifecycle hooks (see Hooks page)
[[hooks]]
event = "post_tool_use"
tool_name = "FileWrite"
[hooks.action]
type = "shell"
command = "cargo fmt"

Project config

Create .agent/settings.toml in your repo root for project-specific settings. These override user config but are overridden by CLI flags.

Initialize with:

agent
> /init
Created .agent/settings.toml

Environment variables

Variable                  Purpose
AGENT_CODE_API_KEY        API key (highest priority)
ANTHROPIC_API_KEY         Anthropic API key
OPENAI_API_KEY            OpenAI API key
AGENT_CODE_API_BASE_URL   API endpoint override
AGENT_CODE_MODEL          Model override
EDITOR                    Determines vi/emacs REPL mode
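
Key resolution follows the priority order shown above; as a sketch, it is equivalent to taking the first non-empty variable:

```shell
# First non-empty key wins: AGENT_CODE_API_KEY, then ANTHROPIC_API_KEY, then OPENAI_API_KEY
api_key="${AGENT_CODE_API_KEY:-${ANTHROPIC_API_KEY:-${OPENAI_API_KEY:-}}}"
```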

agent-code works with any LLM that speaks the Anthropic Messages API or OpenAI Chat Completions API. The provider is auto-detected from your model name and base URL.

Anthropic (Claude)

export ANTHROPIC_API_KEY="sk-ant-..."
agent

Supported models: Claude Opus, Sonnet, Haiku (all versions).

Features enabled: prompt caching, extended thinking, cache_control breakpoints.

OpenAI (GPT)

export OPENAI_API_KEY="sk-..."
agent --model gpt-4o

Supported models: GPT-4o, GPT-4, o1, o3, and others.

Ollama (local)

agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key unused

No API key needed (pass any string). Start Ollama first: ollama serve.

Groq

agent --api-base-url https://api.groq.com/openai/v1 --api-key gsk_... --model llama-3.3-70b-versatile

Together AI

agent --api-base-url https://api.together.xyz/v1 --api-key ... --model meta-llama/Llama-3-70b-chat-hf

DeepSeek

agent --api-base-url https://api.deepseek.com/v1 --api-key ... --model deepseek-chat

OpenRouter

agent --api-base-url https://openrouter.ai/api/v1 --api-key ... --model anthropic/claude-sonnet-4

OpenRouter lets you access any model through a single API key.

Explicit provider selection

If auto-detection doesn't work for your setup, force it:

agent --provider anthropic  # Use Anthropic wire format
agent --provider openai     # Use OpenAI wire format

Auto-detection logic

The provider is chosen by checking (in order):

  1. --provider flag (if set)
  2. Base URL contains anthropic.com → Anthropic
  3. Base URL contains openai.com → OpenAI
  4. Base URL is localhost → OpenAI-compatible
  5. Model name starts with claude/opus/sonnet/haiku → Anthropic
  6. Model name starts with gpt/o1/o3 → OpenAI
  7. Default → OpenAI-compatible (most common API shape)
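
Steps 2 through 7 can be sketched as a shell function (simplified; the --provider override from step 1 is omitted, and the real logic lives in the binary):

```shell
detect_provider() {
  base=$1; model=$2
  # URL checks take priority over model-name checks
  case "$base" in
    *anthropic.com*) echo anthropic; return ;;
    *openai.com*)    echo openai; return ;;
    *localhost*)     echo openai-compatible; return ;;
  esac
  case "$model" in
    claude*|opus*|sonnet*|haiku*) echo anthropic ;;
    gpt*|o1*|o3*)                 echo openai ;;
    *)                            echo openai-compatible ;;
  esac
}
detect_provider "https://api.anthropic.com/v1" "claude-sonnet-4"   # prints anthropic
detect_provider "http://localhost:11434/v1"    "llama3"            # prints openai-compatible
```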

MCP (Model Context Protocol) lets you extend agent-code with tools and resources from external servers. Any MCP-compatible server can be connected.

Configuration

Add servers to your config file:

# .agent/settings.toml or ~/.config/agent-code/config.toml

[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/docs"]

[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_..." }

[mcp_servers.postgres]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]

Transports

Stdio (default)

The server runs as a subprocess, communicating via stdin/stdout JSON-RPC:

[mcp_servers.myserver]
command = "path/to/server"
args = ["--flag", "value"]
env = { API_KEY = "..." }

SSE (HTTP)

For servers that expose an HTTP endpoint:

[mcp_servers.remote]
url = "http://localhost:8080"

How it works

  1. At startup, agent-code connects to each configured server
  2. The initialize handshake negotiates capabilities
  3. Tools are discovered via tools/list and registered as mcp__server__tool in the agent's tool pool
  4. When the LLM calls an MCP tool, the request is proxied to the server via tools/call
  5. Resources can be browsed with ListMcpResources and read with ReadMcpResource
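
The mcp__server__tool naming from step 3 is plain string concatenation. For example, the GitHub server's create_issue tool is registered as:

```shell
server="github"; tool="create_issue"
echo "mcp__${server}__${tool}"   # prints mcp__github__create_issue
```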

Commands

> /mcp
2 MCP server(s) configured:
  filesystem (stdio)
  github (stdio)

Popular servers

Server                                    What it provides
@modelcontextprotocol/server-filesystem   File system access with path restrictions
@modelcontextprotocol/server-github       GitHub API (issues, PRs, repos)
@modelcontextprotocol/server-postgres     PostgreSQL query execution
@modelcontextprotocol/server-sqlite       SQLite database access
@modelcontextprotocol/server-slack        Slack messaging

Find more at modelcontextprotocol.io.

Hooks let you run shell commands or HTTP requests at specific points in the agent's lifecycle. Use them for auto-formatting, linting, notifications, or custom validation.

Configuration

# .agent/settings.toml

# Auto-format Rust files after any write
[[hooks]]
event = "post_tool_use"
tool_name = "FileWrite"
[hooks.action]
type = "shell"
command = "cargo fmt"

# Lint after edits
[[hooks]]
event = "post_tool_use"
tool_name = "FileEdit"
[hooks.action]
type = "shell"
command = "cargo clippy --quiet"

# Notify on session start
[[hooks]]
event = "session_start"
[hooks.action]
type = "http"
url = "https://hooks.slack.com/services/T.../B.../..."
method = "POST"

Hook events

| Event | When it fires |
| --- | --- |
| session_start | Session begins |
| session_stop | Session ends |
| pre_tool_use | Before a tool executes |
| post_tool_use | After a tool completes |
| user_prompt_submit | User submits input |

Hook actions

Shell

Run a command in the project directory:

[hooks.action]
type = "shell"
command = "make lint"

HTTP

Send a request to a URL:

[hooks.action]
type = "http"
url = "https://example.com/webhook"
method = "POST"

Filtering by tool

Use tool_name to run hooks only for specific tools:

[[hooks]]
event = "pre_tool_use"
tool_name = "Bash"
[hooks.action]
type = "shell"
command = "echo 'Bash command about to run'"

Without tool_name, the hook fires for all tools.

Commands

> /hooks
Hook system active. Configure hooks in .agent/settings.toml:
  [[hooks]]
  event = "pre_tool_use"
  action = { type = "shell", command = "./check.sh" }

Skills are reusable prompt templates that define multi-step workflows. They're markdown files with YAML frontmatter, loaded from .agent/skills/ or ~/.config/agent-code/skills/.

Creating a skill

Create a file in .agent/skills/test-and-fix.md:

---
description: Run tests and fix failures
whenToUse: When the user asks to test or fix failing tests
userInvocable: true
---

Run the test suite with the project's test command. If any tests fail:

1. Read the failing test file
2. Read the source code being tested
3. Identify the root cause
4. Fix the issue
5. Re-run tests to verify

Repeat until all tests pass. Do not skip or delete failing tests.

Frontmatter options

| Field | Type | Description |
| --- | --- | --- |
| description | string | What this skill does |
| whenToUse | string | Hints for the LLM about when to suggest this skill |
| userInvocable | boolean | Whether users can invoke it via /skill-name |
| disableNonInteractive | boolean | Disable in one-shot mode |
| paths | string[] | File patterns that trigger this skill suggestion |

Invoking skills

As a slash command

If userInvocable: true, invoke with the filename (minus .md):

> /test-and-fix

With arguments

Use {{arg}} in the template for argument substitution:

---
description: Review a specific file
userInvocable: true
---

Review {{arg}} for bugs, security issues, and code quality problems.
Focus on edge cases and error handling.

Then invoke it with an argument:

> /review src/auth.rs
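The substitution itself is straightforward string templating; a minimal sketch (illustrative, not the actual implementation):

```rust
// Hypothetical sketch of {{arg}} substitution in a skill template.
fn render_skill(template: &str, arg: Option<&str>) -> String {
    template.replace("{{arg}}", arg.unwrap_or(""))
}

fn main() {
    let template = "Review {{arg}} for bugs.";
    assert_eq!(
        render_skill(template, Some("src/auth.rs")),
        "Review src/auth.rs for bugs."
    );
}
```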

Programmatically via the Skill tool

The LLM can invoke skills when it determines one is appropriate:

{
  "name": "Skill",
  "input": {
    "skill": "test-and-fix",
    "args": null
  }
}

Directory skills

For complex skills with supporting files, use a directory:

.agent/skills/
  deploy/
    SKILL.md      ← the skill definition
    checklist.md  ← referenced by the skill

Skill locations

| Location | Scope |
| --- | --- |
| .agent/skills/ | Project-specific |
| ~/.config/agent-code/skills/ | Available in all projects |

Bundled skills

agent-code ships with 12 built-in skills. These are always available and can be overridden by placing a skill with the same name in your project or user skills directory.

| Skill | Purpose |
| --- | --- |
| /commit | Create well-crafted git commits |
| /review | Review diff for bugs and security issues |
| /test | Run tests and fix failures |
| /explain | Explain how code works |
| /debug | Debug errors with root cause analysis |
| /pr | Create pull requests |
| /refactor | Refactor code for quality |
| /init | Initialize project configuration |
| /security-review | OWASP-oriented vulnerability scan |
| /advisor | Architecture and dependency health analysis |
| /bughunter | Systematic bug search |
| /plan | Structured implementation planning |

Commands

> /skills
Loaded 15 skills:
  commit [invocable] — Create a well-crafted git commit
  review [invocable] — Review code changes for bugs and issues
  test-and-fix [invocable] — Run tests and fix failures
  deploy — Production deployment checklist

Plugins package skills, hooks, and configuration together as installable units. A plugin is a directory with a plugin.toml manifest.

Plugin structure

my-plugin/
  plugin.toml        ← manifest
  skills/
    deploy.md        ← bundled skills
    rollback.md

Manifest

# plugin.toml

name = "my-deploy-plugin"
version = "1.0.0"
description = "Deployment workflows for our stack"
author = "team@company.com"

skills = ["deploy", "rollback"]

[[hooks]]
event = "post_tool_use"
tool_name = "Bash"
command = "notify-deploy-status"

Installing plugins

Place plugin directories in:

| Location | Scope |
| --- | --- |
| .agent/plugins/ | Project-specific |
| ~/.config/agent-code/plugins/ | Available in all projects |

Commands

> /plugins
Loaded 1 plugins:
  my-deploy-plugin v1.0.0 — Deployment workflows for our stack

Skills from plugins are automatically registered and appear in /skills output.

Tools implement the Tool trait. Each tool defines its input schema, permission behavior, and execution logic.

The Tool trait

#[async_trait]
pub trait Tool: Send + Sync {
    /// Unique name used in API tool_use blocks.
    fn name(&self) -> &'static str;

    /// Description sent to the LLM.
    fn description(&self) -> &'static str;

    /// JSON Schema for input parameters.
    fn input_schema(&self) -> serde_json::Value;

    /// Execute the tool.
    async fn call(
        &self,
        input: serde_json::Value,
        ctx: &ToolContext,
    ) -> Result<ToolResult, ToolError>;

    /// Whether this tool only reads (no mutations).
    fn is_read_only(&self) -> bool { false }

    /// Whether it's safe to run in parallel with other tools.
    fn is_concurrency_safe(&self) -> bool { self.is_read_only() }
}

Example: a simple tool

pub struct TimeTool;

#[async_trait]
impl Tool for TimeTool {
    fn name(&self) -> &'static str { "Time" }

    fn description(&self) -> &'static str {
        "Returns the current date and time."
    }

    fn input_schema(&self) -> serde_json::Value {
        json!({
            "type": "object",
            "properties": {}
        })
    }

    fn is_read_only(&self) -> bool { true }

    async fn call(
        &self,
        _input: serde_json::Value,
        _ctx: &ToolContext,
    ) -> Result<ToolResult, ToolError> {
        let now = chrono::Utc::now().to_rfc3339();
        Ok(ToolResult::success(now))
    }
}

Registering the tool

In src/tools/registry.rs:

pub fn default_tools() -> Self {
    let mut registry = Self::new();
    // ... existing tools ...
    registry.register(Arc::new(TimeTool));
    registry
}

ToolContext

Every tool receives a ToolContext with:

| Field | Type | Description |
| --- | --- | --- |
| cwd | PathBuf | Current working directory |
| cancel | CancellationToken | Check for Ctrl+C |
| permission_checker | Arc<PermissionChecker> | Check permissions |
| verbose | bool | Verbose output mode |
| plan_mode | bool | Read-only mode active |
| file_cache | Option<Arc<Mutex<FileCache>>> | Shared file cache |
| denial_tracker | Option<Arc<Mutex<DenialTracker>>> | Permission denial log |

ToolResult

// Success
Ok(ToolResult::success("output text"))

// Error (sent back to LLM as an error result)
Ok(ToolResult::error("what went wrong"))

// Fatal error (stops the tool, not sent to LLM)
Err(ToolError::ExecutionFailed("crash details".into()))

The IDE bridge allows editor extensions (VS Code, JetBrains, etc.) to communicate with a running agent-code instance over HTTP.

How it works

When agent-code starts, it writes a lock file to ~/.cache/agent-code/bridge/. IDE extensions scan for lock files to discover running sessions.

The lock file contains:

{
  "port": 8432,
  "pid": 12345,
  "cwd": "/home/user/project",
  "started_at": "2026-03-31T20:00:00Z"
}

Protocol

| Endpoint | Method | Description |
| --- | --- | --- |
| /status | GET | Agent status (idle/active, model, session info) |
| /messages | GET | Conversation history |
| /message | POST | Send a user message |
| /events | GET | SSE stream of real-time events |

Discovery

Stale lock files (where the PID is no longer running) are automatically cleaned up when any client scans for bridges.

use rs_code::services::bridge::discover_bridges;

let bridges = discover_bridges();
for b in &bridges {
    println!("Found: pid={} port={} cwd={}", b.pid, b.port, b.cwd);
}

Building an extension

  1. Scan ~/.cache/agent-code/bridge/ for .lock files
  2. Parse the JSON to get the port
  3. Connect to http://localhost:{port}
  4. Use the HTTP endpoints to interact with the session

The bridge is read-only by default — the IDE can observe but not control the agent without explicit user action.

Long sessions exceed the LLM's context window. The compaction system keeps conversations running indefinitely by strategically reducing history size.

Context Window Layout

|<--- context window (e.g., 200K tokens) ------------------------------>|
|<--- effective window (context - 20K reserved for output) ------------>|
|<--- auto-compact threshold (effective - 13K buffer) ----------------->|
|                                                      ↑ compaction fires

The 20K reservation ensures the model always has room to respond. The 13K buffer prevents compaction from firing on every turn.

Three Strategies

Strategies are tried in order. Each is progressively more aggressive.

1. Microcompact

Cost: zero (no API call). Savings: moderate.

Clears the content field of old tool results, replacing them with [Old tool result cleared]. Keeps the tool_use/tool_result pairing intact so the conversation structure remains valid.

Before: ToolResult { content: "500 lines of file content..." }
After:  ToolResult { content: "[Old tool result cleared]" }

The keep_recent parameter (default: 2) preserves the most recent N turns untouched.
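The pass can be sketched as follows, with tool results simplified to a flat list and keep_recent approximated as a count of trailing results (the real implementation works on turns and preserves tool_use/tool_result pairing):

```rust
// Simplified sketch of microcompact; not the actual agent-code types.
struct ToolResult {
    content: String,
}

fn microcompact(results: &mut [ToolResult], keep_recent: usize) {
    // Clear everything except the most recent `keep_recent` results.
    let clear_up_to = results.len().saturating_sub(keep_recent);
    for r in &mut results[..clear_up_to] {
        r.content = "[Old tool result cleared]".to_string();
    }
}

fn main() {
    let mut results: Vec<ToolResult> = (0..4)
        .map(|i| ToolResult { content: format!("output {i}") })
        .collect();
    microcompact(&mut results, 2); // keep_recent defaults to 2
    assert_eq!(results[0].content, "[Old tool result cleared]");
    assert_eq!(results[3].content, "output 3");
}
```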

Source: microcompact() in services/compact.rs

2. LLM Summary

Cost: one API call. Savings: large.

Sends older messages to the LLM with a prompt asking for a concise summary. Replaces those messages with a single compact boundary message containing the summary.

The summary preserves: key decisions, file paths discussed, errors encountered, and the current task state.

Source: build_compact_summary_prompt() in services/compact.rs

3. Context Collapse

Cost: zero. Savings: maximum.

Removes message groups from the middle of the conversation, keeping only the first group (initial context/summary) and the last group (recent messages). The full history remains in memory for session persistence — only the API-facing view is collapsed.

Source: services/context_collapse.rs

When Compaction Fires

Auto-compact

Checked before every LLM call. Fires when estimated tokens exceed the threshold:

threshold = context_window - 20K (reserved) - 13K (buffer)

For a 200K context window, this fires at ~167K tokens.

Reactive compact

Triggered by API prompt_too_long (413) errors. Parses the gap from the error message and runs microcompact + context collapse aggressively.

Manual

Users can trigger compaction with /compact to proactively free context.

Token Estimation

Token counts are estimated using a character-based heuristic: 4 bytes per token. This is conservative for English text and intentionally overestimates to prevent context overflow.

Images use a fixed estimate of 2,000 tokens. Tool use blocks estimate from the serialized JSON input.
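Both the heuristic and the auto-compact threshold are simple arithmetic. A sketch using the constants stated above (function names are illustrative):

```rust
// Constants as described in the text; names are hypothetical.
const RESERVED_OUTPUT: usize = 20_000; // room for the model to respond
const COMPACT_BUFFER: usize = 13_000;  // hysteresis so compaction doesn't fire every turn
const IMAGE_TOKENS: usize = 2_000;     // fixed estimate per image

/// ~4 bytes per token, rounded up so the estimate errs high rather than overflowing.
fn estimate_tokens(text: &str) -> usize {
    (text.len() + 3) / 4
}

fn auto_compact_threshold(context_window: usize) -> usize {
    context_window - RESERVED_OUTPUT - COMPACT_BUFFER
}

fn main() {
    // A 200K context window fires compaction at ~167K tokens.
    assert_eq!(auto_compact_threshold(200_000), 167_000);
    assert_eq!(estimate_tokens("12345678"), 2);
    assert_eq!(IMAGE_TOKENS, 2_000);
}
```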

Source: services/tokens.rs

Execution Flow

When the LLM responds with tool calls, the executor handles them in this order:

LLM response with tool_use blocks
    │
    ▼
Parse pending tool calls
    │
    ▼
For each tool call:
    ├── Permission check (allow/deny/ask)
    ├── Input validation (JSON schema)
    ├── Plan mode check (block mutations)
    └── Protected directory check (block .git/, .husky/, node_modules/)
    │
    ▼
Partition into batches:
    ├── Read-only tools → parallel batch (tokio::join!)
    └── Mutation tools → serial execution
    │
    ▼
Execute tools, collect results
    │
    ▼
Fire post-tool-use hooks
    │
    ▼
Inject tool results into conversation
    │
    ▼
Back to LLM for next turn

Batching Strategy

Tools declare two properties:

| Property | Meaning |
| --- | --- |
| is_read_only() | Tool only reads, never mutates |
| is_concurrency_safe() | Safe to run alongside other tools (defaults to is_read_only()) |

The executor uses these to partition:

  • Parallel batch: all concurrency-safe tools run simultaneously via tokio::join!
  • Serial queue: mutation tools run one at a time, in order

This maximizes throughput for read-heavy turns (common when the agent explores code) while ensuring mutation ordering is preserved.
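The partition step itself is a one-liner over the parsed calls; a sketch (Call is a stand-in for a parsed tool_use block, not the real type):

```rust
// Illustrative sketch of partitioning tool calls into batches.
struct Call {
    name: &'static str,
    concurrency_safe: bool,
}

/// Split calls into (parallel batch, serial queue), preserving order within each.
fn partition(calls: Vec<Call>) -> (Vec<Call>, Vec<Call>) {
    calls.into_iter().partition(|c| c.concurrency_safe)
}

fn main() {
    let calls = vec![
        Call { name: "FileRead", concurrency_safe: true },
        Call { name: "FileEdit", concurrency_safe: false },
        Call { name: "Grep", concurrency_safe: true },
    ];
    let (parallel, serial) = partition(calls);
    assert_eq!(parallel.len(), 2);       // FileRead + Grep run together
    assert_eq!(serial[0].name, "FileEdit"); // mutations run one at a time
}
```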

Permission Checks

Every tool call passes through PermissionChecker::check() before execution:

  1. Protected directories: write tools blocked from .git/, .husky/, node_modules/ (hardcoded, not overridable)
  2. Explicit rules: user-configured per-tool/pattern rules evaluated in order, first match wins
  3. Default mode: ask, allow, deny, plan, or accept_edits

Read-only tools use a relaxed check (check_read()) that only blocks explicit deny rules.
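The ordered, first-match-wins rule evaluation can be sketched like this (pattern matching is reduced to a plain prefix check here; the real checker uses richer patterns):

```rust
// Hypothetical sketch of first-match-wins permission rules.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Action { Allow, Deny, Ask }

struct Rule {
    tool: &'static str, // exact name or trailing-* prefix pattern
    action: Action,
}

fn check(rules: &[Rule], tool: &str, default_mode: Action) -> Action {
    for r in rules {
        let matches = tool == r.tool
            || (r.tool.ends_with('*') && tool.starts_with(&r.tool[..r.tool.len() - 1]));
        if matches {
            return r.action; // first match wins
        }
    }
    default_mode // fall through to the configured default mode
}

fn main() {
    let rules = [
        Rule { tool: "Bash", action: Action::Ask },
        Rule { tool: "mcp__github__*", action: Action::Allow },
    ];
    assert_eq!(check(&rules, "mcp__github__create_issue", Action::Ask), Action::Allow);
    assert_eq!(check(&rules, "FileRead", Action::Allow), Action::Allow);
}
```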

Streaming Executor

Tools begin execution as soon as their input is fully parsed from the SSE stream — they don't wait for the entire response to finish. This overlaps tool execution with LLM generation for faster turns.

The streaming executor watches for complete tool_use content blocks in the accumulating response and dispatches them immediately.

Error Handling

| Error | Recovery |
| --- | --- |
| Permission denied | Tool result reports denial, LLM adjusts approach |
| Tool execution error | Error message returned as tool result |
| Timeout | Tool cancelled via CancellationToken, error result injected |
| Invalid input | Validation error returned before execution |

The Tool Trait

Every tool implements:

#[async_trait]
pub trait Tool: Send + Sync {
    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;
    fn input_schema(&self) -> serde_json::Value;
    async fn call(&self, input: Value, ctx: &ToolContext) -> Result<ToolResult, ToolError>;
    fn is_read_only(&self) -> bool { false }
    fn is_concurrency_safe(&self) -> bool { self.is_read_only() }
}

Adding a new tool means implementing this trait and registering it in the ToolRegistry. No central enum to modify.

Source: tools/mod.rs (trait), tools/executor.rs (dispatch), tools/registry.rs (registration)

The Problem

Different LLM providers use different APIs:

  • Anthropic: Messages API with content blocks, tool_use/tool_result types, prompt caching
  • OpenAI: Chat Completions API with messages, tool_calls, function format

agent-code needs to work identically regardless of which provider is configured.

Architecture

User prompt
    │
    ▼
Query Engine (provider-agnostic)
    │
    ▼
Provider Detection (auto from model name + base URL)
    │
    ├── Anthropic wire format → Anthropic Messages API
    └── OpenAI wire format → OpenAI Chat Completions API
    │
    ▼
SSE Stream → Normalize → Unified ContentBlock types
    │
    ▼
Tool execution (same code path regardless of provider)

Provider Detection

detect_provider() in llm/provider.rs determines the provider from:

  1. Model name: claude-* → Anthropic, gpt-* → OpenAI, grok-* → xAI, etc.
  2. Base URL: api.anthropic.com → Anthropic, api.openai.com → OpenAI, etc.
  3. Environment: AGENT_CODE_USE_BEDROCK → Bedrock, AGENT_CODE_USE_VERTEX → Vertex

Each provider maps to a WireFormat:

| Wire Format | Providers |
| --- | --- |
| Anthropic | Anthropic, Bedrock, Vertex |
| OpenAi | OpenAI, xAI, Google, DeepSeek, Groq, Mistral, Together, Zhipu, Ollama, any compatible |

Wire Formats

Anthropic (llm/anthropic.rs)

  • Sends messages with content as array of typed blocks
  • Tool calls appear as tool_use content blocks in assistant messages
  • Tool results are tool_result content blocks in user messages
  • Supports cache_control breakpoints for prompt caching
  • Extended thinking via thinking content blocks

OpenAI (llm/openai.rs)

  • Sends messages with content as string or array
  • Tool calls appear in tool_calls array on assistant messages
  • Tool results are separate messages with role: "tool"
  • Supports streaming via SSE with [DONE] sentinel

Message Normalization

llm/normalize.rs ensures messages are valid before sending:

  • Tool pairing: every tool_use block must have a matching tool_result in the next user message
  • Alternation: user and assistant messages must alternate (APIs reject consecutive same-role messages)
  • Empty handling: empty content arrays are removed or filled with placeholder text

This runs after every turn, before the next API call.
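The alternation repair, for example, amounts to merging consecutive same-role messages; a simplified sketch (not the actual normalize.rs code):

```rust
// Illustrative sketch: restore user/assistant alternation by merging
// consecutive same-role messages into one.
#[derive(Clone, Copy, PartialEq)]
enum Role { User, Assistant }

struct Msg { role: Role, content: String }

fn merge_consecutive(msgs: Vec<Msg>) -> Vec<Msg> {
    let mut out: Vec<Msg> = Vec::new();
    for m in msgs {
        match out.last_mut() {
            // Same role as the previous message: append rather than push.
            Some(last) if last.role == m.role => {
                last.content.push('\n');
                last.content.push_str(&m.content);
            }
            _ => out.push(m),
        }
    }
    out
}

fn main() {
    let msgs = vec![
        Msg { role: Role::User, content: "a".into() },
        Msg { role: Role::User, content: "b".into() },
        Msg { role: Role::Assistant, content: "c".into() },
    ];
    let merged = merge_consecutive(msgs);
    assert_eq!(merged.len(), 2);
    assert_eq!(merged[0].content, "a\nb");
}
```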

Stream Parsing

llm/stream.rs handles SSE (Server-Sent Events) parsing:

  1. Read data: lines from the HTTP response stream
  2. Parse JSON deltas (content block starts, text deltas, tool input deltas)
  3. Accumulate into complete ContentBlock instances
  4. Emit blocks to the UI (real-time text display) and executor (tool dispatch)

The stream parser handles both Anthropic's content_block_delta events and OpenAI's choices[0].delta format through the wire format abstraction.

Error Recovery

| Error | Recovery |
| --- | --- |
| Rate limited (429) | Wait retry_after ms, retry up to 5 times |
| Overloaded (529) | 5s exponential backoff, fall back to smaller model after 3 attempts |
| Prompt too long (413) | Reactive compaction, then retry |
| Max output tokens | Inject continuation message, retry up to 3 times |
| Stream interrupted | Reconnect with exponential backoff |

The retry state machine in llm/retry.rs tracks attempts per error type and supports model fallback (e.g., Opus → Sonnet on overload).
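Under the numbers in the table above, the overload schedule can be sketched as a doubling backoff with a hard attempt cap (names and exact schedule are illustrative, not the real retry.rs state machine):

```rust
// Hypothetical sketch of the overloaded-error backoff schedule.
const MAX_OVERLOAD_ATTEMPTS: u32 = 3;

/// Returns the delay before the next retry, or None once it's time
/// to give up and fall back to a smaller model.
fn overload_backoff_ms(attempt: u32) -> Option<u64> {
    if attempt >= MAX_OVERLOAD_ATTEMPTS {
        return None;
    }
    Some(5_000u64 << attempt) // 5s, 10s, 20s
}

fn main() {
    assert_eq!(overload_backoff_ms(0), Some(5_000));
    assert_eq!(overload_backoff_ms(2), Some(20_000));
    assert_eq!(overload_backoff_ms(3), None); // fall back (e.g. Opus → Sonnet)
}
```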

Source: llm/provider.rs (detection), llm/anthropic.rs (Anthropic format), llm/openai.rs (OpenAI format), llm/normalize.rs (validation), llm/stream.rs (SSE parsing), llm/retry.rs (error recovery)

Overview

MCP (Model Context Protocol) extends agent-code with tools and resources from external servers. agent-code acts as an MCP client, connecting to one or more MCP servers.

Connection Lifecycle

1. Startup: read mcp_servers from config
2. For each server:
   a. Spawn subprocess (stdio) or connect HTTP (SSE)
   b. Send initialize request (protocol version, capabilities)
   c. Receive initialize response (server capabilities)
   d. Send tools/list request
   e. Register discovered tools as mcp__<server>__<tool> in the tool pool
   f. Send resources/list request (if supported)
   g. Register discovered resources
3. Ready: MCP tools available to the LLM alongside built-in tools

Transports

Stdio

The default transport. agent-code spawns the server as a subprocess and communicates via JSON-RPC over stdin/stdout.

[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_..." }

The subprocess inherits the configured env variables. Stderr is captured for debugging.

SSE (Server-Sent Events)

For servers that expose an HTTP endpoint:

[mcp_servers.remote]
url = "http://localhost:8080"

Uses HTTP POST for requests and SSE for responses/notifications.

Tool Proxying

When the LLM calls an MCP tool, the request flows through McpProxy:

LLM: tool_use { name: "mcp__github__create_issue", input: {...} }
    │
    ▼
McpProxy.call()
    ├── Parse server name and tool name from the namespaced name
    ├── Find the MCP client for "github"
    ├── Send tools/call JSON-RPC request
    ├── Wait for response
    └── Return tool result to conversation

Tool names are namespaced as mcp__<server>__<tool> to prevent collisions between servers and with built-in tools.
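Splitting the namespaced name back into its parts is the first step of the proxy; a minimal sketch:

```rust
// Sketch of parsing mcp__<server>__<tool> into (server, tool).
fn parse_namespaced(name: &str) -> Option<(&str, &str)> {
    // strip_prefix rejects non-MCP names; split_once separates server from tool.
    name.strip_prefix("mcp__")?.split_once("__")
}

fn main() {
    assert_eq!(
        parse_namespaced("mcp__github__create_issue"),
        Some(("github", "create_issue"))
    );
    assert_eq!(parse_namespaced("Bash"), None); // built-in tools have no prefix
}
```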

Resource Access

MCP servers can expose resources (database schemas, file listings, documentation). Two tools handle this:

| Tool | Purpose |
| --- | --- |
| ListMcpResources | Browse available resources across all connected servers |
| ReadMcpResource | Read a specific resource by URI |

The LLM uses these tools when it needs context from external systems.

Security

Allowlist / Denylist

[security]
mcp_server_allowlist = ["github", "filesystem"]   # Only these can connect
mcp_server_denylist = ["untrusted"]                # These are blocked

If allowlist is non-empty, only listed servers are connected. denylist is checked regardless.

Permission Integration

MCP tool calls go through the same permission system as built-in tools. The namespaced tool name (mcp__github__create_issue) can be matched by permission rules:

[[permissions.rules]]
tool = "mcp__github__*"
action = "allow"

Trust Boundary

MCP servers run with the user's permissions. They can access the filesystem, network, and any service the user can. The security boundary is the same as running a shell command — use allowlists to restrict which servers connect.

Error Handling

| Error | Behavior |
| --- | --- |
| Server fails to start | Warning logged, server skipped, agent continues without it |
| Connection lost | Tool calls to that server return error results |
| Tool call fails | Error message returned as tool result, LLM can retry or adjust |
| Timeout | Transport-level timeout, error result returned |

JSON-RPC Protocol

MCP uses JSON-RPC 2.0 over the chosen transport:

// Request
{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "create_issue", "arguments": {...}}}

// Response
{"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "Issue created: #42"}]}}

Source: services/mcp/client.rs (client), services/mcp/transport.rs (stdio/SSE), services/mcp/types.rs (JSON-RPC types), tools/mcp_proxy.rs (tool proxy), tools/mcp_resources.rs (resource access)

Model Selection

The single biggest performance lever is model choice.

| Priority | Model Type | Examples | Tradeoff |
| --- | --- | --- | --- |
| Speed | Small/fast | GPT-4.1-mini, Haiku, Gemini Flash | Faster, cheaper, less capable |
| Balance | Mid-tier | Sonnet, GPT-4.1 | Good for most tasks |
| Quality | Large | Opus, GPT-5.4 | Slower, expensive, best reasoning |

Switch mid-session with /model — use a fast model for exploration, switch to a capable model for complex changes.

Context Management

Monitor usage

> /context
Context: ~45000 tokens (23% of 200000 window)
Auto-compact at: 167000 tokens
Messages: 24

Manual compaction

If context is getting large, compact proactively:

> /compact
Freed ~12000 estimated tokens.

Start fresh when stuck

If the agent seems confused or repetitive, clear context:

> /clear

Or start a new session — previous sessions are auto-saved and can be resumed with /resume.

Cost Control

Set a budget

[api]
max_cost_usd = 5.0

The agent stops when the budget is reached. Check spending with /cost.

Reduce cost per turn

  • Shorter prompts: be specific, avoid repeating instructions the agent already has
  • Use AGENTS.md: persistent context loads once instead of being re-explained each session
  • Smaller models for simple tasks: /model gpt-4.1-mini for quick edits, switch back for complex work
  • Plan mode: /plan for exploration uses only read tools (cheaper turns)

Token Budget

Enable budget tracking to get warnings before hitting limits:

[features]
token_budget = true

The agent shows a warning when approaching the auto-compact threshold.

Compaction Tuning

The default thresholds work well for most use cases. For very long sessions:

  • Microcompact fires first and is free — it clears old tool results
  • LLM summary costs one API call but frees significant context
  • Context collapse is a last resort — removes middle messages entirely

If sessions are compacting too aggressively, consider using a model with a larger context window (200K+).

Tool Execution Speed

Streaming execution

Tools execute as soon as their input is parsed from the LLM response stream — they don't wait for the full response. This is automatic and requires no configuration.

Parallel reads

Read-only tools (FileRead, Grep, Glob, WebFetch) run in parallel. The agent naturally batches reads when exploring code. No tuning needed.

Bash timeout

Long-running shell commands can be given explicit timeouts:

{"command": "npm test", "timeout": 60000}

The default timeout is 120 seconds. Background mode (run_in_background: true) has no timeout.

Session Persistence

Sessions auto-save on exit. For long-running work:

  • Use /fork to create a checkpoint before risky changes
  • Use /resume <id> to return to a previous state
  • Use /export or /share to save a readable copy

Benchmarking

Run the built-in benchmarks to measure performance on your machine:

cargo bench                          # All benchmarks
cargo bench -- microcompact          # Compaction only
cargo bench -- estimate_tokens       # Token estimation only

Results with HTML reports are generated in target/criterion/.

File Operations

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| FileRead | Read files with line numbers. Handles text, PDF, notebooks, images. | Yes | Yes |
| FileWrite | Create or overwrite files. Auto-creates parent dirs. | No | No |
| FileEdit | Search-and-replace. Requires unique match or replace_all. | No | No |
| MultiEdit | Batch multiple edits in a single atomic operation. | No | No |
| NotebookEdit | Edit Jupyter cells (replace, insert, delete). | No | No |

Search

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| Grep | Regex search via ripgrep. Context lines, glob filter, case control. | Yes | Yes |
| Glob | Find files by pattern, sorted by modification time. | Yes | Yes |
| ToolSearch | Discover tools by keyword or select:Name. | Yes | Yes |

Execution

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| Bash | Shell commands. Background mode, destructive detection, sandbox, timeout. | No | No |
| PowerShell | PowerShell commands (Windows). | No | No |
| REPL | Python or Node.js code execution. | No | No |

Agent Coordination

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| Agent | Spawn subagents. Optional worktree isolation. | No | No |
| SendMessage | Inter-agent communication. | No | Yes |
| Skill | Invoke skills programmatically. | Yes | No |

Planning and Tracking

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| EnterPlanMode | Switch to read-only mode. | Yes | Yes |
| ExitPlanMode | Re-enable all tools. | Yes | Yes |
| TaskCreate | Create a progress tracking task. | Yes | Yes |
| TaskUpdate | Update task status. | Yes | Yes |
| TaskGet | Get task details by ID. | Yes | Yes |
| TaskList | List all session tasks. | Yes | Yes |
| TaskStop | Stop a running background task. | No | Yes |
| TaskOutput | Read output of a completed task. | Yes | Yes |
| TodoWrite | Structured todo list management. | Yes | Yes |

Web and External

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| WebFetch | HTTP GET with HTML-to-text conversion. | Yes | Yes |
| WebSearch | Web search with result extraction. | Yes | Yes |
| LSP | Language server diagnostics with linter fallbacks. | Yes | Yes |

MCP Integration

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| McpProxy | Call tools on connected MCP servers. | No | No |
| ListMcpResources | Browse MCP server resources. | Yes | Yes |
| ReadMcpResource | Read an MCP resource by URI. | Yes | Yes |

Workspace

| Tool | Description | Read-only | Concurrent |
| --- | --- | --- | --- |
| EnterWorktree | Create an isolated git worktree. | No | No |
| ExitWorktree | Clean up a worktree. | No | No |
| AskUserQuestion | Interactive multi-choice prompts. | Yes | No |
| Sleep | Async pause (max 5 min). | Yes | Yes |

Type / followed by a command name in the REPL. Unknown commands are passed to the agent as prompts. Skill names work as commands too.

Session

| Command | Description |
| --- | --- |
| /help | Show all commands and loaded skills |
| /exit | Exit the REPL |
| /clear | Reset conversation history |
| /resume <id> | Restore a previous session by ID |
| /sessions | List recent saved sessions |
| /export | Export conversation to markdown file |
| /share | Export session as shareable markdown with metadata |
| /summary | Ask the agent to summarize the session |
| /version | Show agent-code version |

Context

| Command | Description |
| --- | --- |
| /cost | Token usage and estimated session cost |
| /context | Context window usage and auto-compact threshold |
| /compact | Free context by clearing stale tool results |
| /model [name] | Show or switch the active model (interactive picker) |
| /verbose | Toggle verbose output (shows token counts) |

Git

| Command | Description |
| --- | --- |
| /diff | Show current git changes |
| /status | Show git status |
| /commit [msg] | Review diff and create a commit |
| /review | Analyze diff for bugs and issues |
| /branch [name] | Show or switch git branch |
| /log | Show recent git history |

Agent Control

| Command | Description |
| --- | --- |
| /plan | Toggle plan mode (read-only) |
| /permissions | Show permission mode and rules |
| /agents | List available agent types |
| /tasks | List background tasks |

Configuration

| Command | Description |
| --- | --- |
| /init | Create .agent/settings.toml for this project |
| /doctor | Check environment health (tools, config, git) |
| /config | Show current configuration |
| /mcp | List connected MCP servers |
| /hooks | Show hook configuration |
| /plugins | List loaded plugins |
| /memory | Show loaded memory context |
| /skills | List available skills |
| /theme | Show and configure color theme |
| /color [name] | Switch color theme mid-session |
| /features | Show enabled feature flags |

History

| Command | Description |
| --- | --- |
| /scroll | Scrollable view of conversation history |
| /transcript | Show conversation transcript with message indices |
| /rewind | Undo the last assistant turn |
| /snip <range> | Remove messages by index (e.g., /snip 3-7) |
| /fork | Branch the conversation from this point |

Diagnostics

| Command | Description |
| --- | --- |
| /stats | Session statistics (turns, tools used, cost) |
| /files | List working directory contents |
| /release-notes | Show release notes for the current version |
| /feedback <msg> | Submit feedback or suggestions |
| /bug | Report a bug (opens GitHub issues link) |

Editing

| Command | Description |
| --- | --- |
| /vim | Switch to vi editing mode |
| /emacs | Switch to emacs editing mode |

agent [OPTIONS]

Options

| Flag | Default | Description |
| --- | --- | --- |
| -p, --prompt <TEXT> | — | Execute a single prompt and exit (non-interactive) |
| -m, --model <MODEL> | claude-sonnet-4-20250514 | Model to use |
| --api-base-url <URL> | auto-detected | API endpoint URL |
| --api-key <KEY> | from env | API key (prefer env var) |
| --provider <NAME> | auto | LLM provider: anthropic, openai, or auto |
| --permission-mode <MODE> | ask | Permission mode: ask, allow, deny, plan, accept_edits |
| --dangerously-skip-permissions | false | Skip all permission checks |
| -C, --cwd <DIR> | current dir | Working directory |
| --max-turns <N> | 50 | Maximum agent turns per request |
| -v, --verbose | false | Enable verbose output |
| --dump-system-prompt | false | Print the system prompt and exit |
| -h, --help | — | Show help |
| --version | — | Show version |

Environment variables

| Variable | Equivalent flag | Description |
| --- | --- | --- |
| AGENT_CODE_API_KEY | --api-key | API key (highest priority) |
| ANTHROPIC_API_KEY | --api-key | Anthropic API key |
| OPENAI_API_KEY | --api-key | OpenAI API key |
| AGENT_CODE_API_BASE_URL | --api-base-url | API endpoint URL |
| AGENT_CODE_MODEL | --model | Model name |

Examples

# Interactive mode with Anthropic
ANTHROPIC_API_KEY=sk-ant-... agent

# One-shot with OpenAI
OPENAI_API_KEY=sk-... agent --model gpt-4o --prompt "explain main.rs"

# Local Ollama
agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key x

# CI: fix tests without asking
agent --dangerously-skip-permissions --prompt "fix the failing tests"

# Read-only exploration
agent --permission-mode plan

# Debug: see what the LLM receives
agent --dump-system-prompt

API Configuration

| Variable | Description |
| --- | --- |
| AGENT_CODE_API_KEY | API key (highest priority, works with any provider) |
| ANTHROPIC_API_KEY | Anthropic API key (auto-selects Anthropic provider) |
| OPENAI_API_KEY | OpenAI API key (auto-selects OpenAI provider) |
| AGENT_CODE_API_BASE_URL | API endpoint URL override |
| AGENT_CODE_MODEL | Model name override |

Behavior

| Variable | Description |
| --- | --- |
| EDITOR | Determines REPL editing mode (vi if contains "vi", else emacs) |
| SHELL | Reported in the system prompt environment section |

Resolution order

API key is resolved from the first available:

  1. --api-key CLI flag
  2. Config file (api.api_key)
  3. AGENT_CODE_API_KEY env var
  4. ANTHROPIC_API_KEY env var
  5. OPENAI_API_KEY env var
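The fallback chain above maps directly onto Option combinators; a sketch with the environment lookup injected for testability (names are illustrative):

```rust
// Hypothetical sketch of the API key resolution order.
fn resolve_api_key(
    cli_flag: Option<String>,
    config_value: Option<String>,
    env: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    cli_flag
        .or(config_value)
        .or_else(|| env("AGENT_CODE_API_KEY"))
        .or_else(|| env("ANTHROPIC_API_KEY"))
        .or_else(|| env("OPENAI_API_KEY"))
}

fn main() {
    // Only ANTHROPIC_API_KEY is set in this fake environment.
    let env = |k: &str| (k == "ANTHROPIC_API_KEY").then(|| "sk-ant-env".to_string());
    assert_eq!(resolve_api_key(None, None, env), Some("sk-ant-env".to_string()));
    // The CLI flag takes priority over everything else.
    assert_eq!(
        resolve_api_key(Some("flag-key".into()), None, env),
        Some("flag-key".to_string())
    );
}
```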

Base URL auto-detection:

  • If only OPENAI_API_KEY is set → defaults to https://api.openai.com/v1
  • Otherwise → defaults to https://api.anthropic.com/v1

This can always be overridden with --api-base-url or the config file.

API Connection

"API key required"

The agent could not find an API key. Set one of:

export ANTHROPIC_API_KEY="sk-ant-..."   # Anthropic
export OPENAI_API_KEY="sk-..."          # OpenAI
export AGENT_CODE_API_KEY="..."         # Any provider

Or set it in your config file (environment variables are preferred; never commit a key to version control):

# ~/.config/agent-code/config.toml
[api]
api_key = "sk-ant-..."

"Connection refused" or timeout

  • Check your internet connection
  • Verify the API base URL: agent --dump-system-prompt 2>&1 | head -1
  • For local models (Ollama): ensure the server is running (ollama serve)
  • For corporate proxies: set HTTPS_PROXY environment variable

Rate limited (429)

The agent retries automatically up to 5 times with backoff. If it persists:

  • Switch to a less busy model: /model
  • Wait a few minutes and retry
  • Check your API plan's rate limits
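
The retry behavior can be pictured as standard exponential backoff (a sketch; the attempt count matches the docs, but the delay schedule here is an assumption, not the agent's actual timing):

```shell
# Retry a command up to 5 times, doubling the delay after each failure.
retry_with_backoff() {
  attempt=1; delay=1
  while [ "$attempt" -le 5 ]; do
    if "$@"; then return 0; fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
  return 1
}
```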

Permission Issues

"Denied by rule" on a command you want to run

Check your permission rules:

> /permissions

Add an allow rule:

# .agent/settings.toml
[[permissions.rules]]
tool = "Bash"
pattern = "your-command *"
action = "allow"

"Write to .git/ is blocked"

This is a built-in safety measure. The agent cannot write to .git/, .husky/, or node_modules/ regardless of permission settings. This prevents repository corruption and dependency tampering.

If you need to modify git config, run the command yourself with !:

> !git config user.name "Your Name"

Context Window

"Prompt too long" or context exceeded

The agent auto-compacts when approaching the limit. If it still fails:

  • Run /compact to manually free context
  • Start a new session for a fresh context window
  • Use /snip 0-10 to remove old messages
  • Switch to a model with a larger context window

Agent seems to forget earlier instructions

Long conversations get compacted automatically. Important context may be summarized. To preserve critical instructions:

  • Put them in AGENTS.md (loaded every session)
  • Use /memory to check what context is loaded
  • Start a fresh session with /clear if context is corrupted

Tool Errors

Bash command fails silently

  • Check the command works manually: !your-command
  • The agent may not have the right PATH. Set it in your shell profile.
  • On Windows, some commands need PowerShell syntax

"File content has changed" on edit

Another process modified the file between the agent reading and editing it. The agent will re-read and retry. If it persists:

  • Close other editors or watchers on the file
  • Disable auto-formatting hooks temporarily

Grep/Glob returns no results

  • Verify ripgrep is installed: !rg --version
  • Check the working directory: /files
  • The pattern may need escaping — try a simpler pattern first

MCP Servers

MCP server stuck "connecting"

  • Check the server command works: !npx -y @modelcontextprotocol/server-name
  • Verify the config in /mcp
  • Check server logs in the terminal where agent-code runs

MCP tools not appearing

  • Run /mcp to verify the server is connected
  • The server may not have registered tools yet — restart the agent
  • Check mcp_server_allowlist in security config isn't blocking it

Installation

"command not found: agent"

The binary isn't in your PATH.

# Cargo install location
export PATH="$HOME/.cargo/bin:$PATH"

# Or find it
which agent || find / -name agent -type f 2>/dev/null | head -5

Build fails from source

# Ensure Rust is up to date
rustup update stable

# Clean and rebuild
cargo clean
cargo build --release

# Check dependencies
cargo check --all-targets

ripgrep not found

Grep and some other tools require rg (ripgrep):

# Debian/Ubuntu
sudo apt-get install ripgrep

# macOS
brew install ripgrep

# Windows
choco install ripgrep

Sessions

Can't resume a session

  • List available sessions: /sessions
  • Session files are in ~/.config/agent-code/sessions/
  • Old sessions may have been cleaned up
  • Session format may be incompatible after an upgrade — start fresh

Still Stuck?

  • Run /doctor for a full environment health check
  • Check GitHub Issues for known problems
  • Open a new issue with: agent version (agent --version), OS, and steps to reproduce

General

What is agent-code?

An open-source AI coding agent for the terminal, built in Rust. You describe tasks in natural language and the agent reads your code, runs commands, edits files, and iterates until the task is done.

How is it different from ChatGPT or Claude in a browser?

agent-code runs locally in your terminal with direct access to your filesystem, shell, git, and development tools. It can read files, make edits, run tests, and fix errors in a loop — not just generate text.

Is it free?

agent-code itself is free and open source (MIT license). You pay for the LLM API you choose — Anthropic, OpenAI, or any other provider. Local models via Ollama are completely free.

Which LLM should I use?

For coding tasks, we recommend Claude Sonnet 4 (Anthropic) or GPT-4.1 (OpenAI) as a good balance of quality and cost. For complex architecture work, use Claude Opus or GPT-5.4. For quick tasks, Haiku or GPT-4.1-mini are fast and cheap.

Installation

What are the system requirements?

  • Any modern Linux, macOS, or Windows machine
  • git and rg (ripgrep) for full functionality
  • An API key from any supported LLM provider (or Ollama for local models)

Can I use it on Windows?

Yes. Install via cargo install agent-code or download the prebuilt binary from GitHub Releases. Windows builds are tested in CI.

Can I run it in Docker?

Yes. See the Dockerfile or pull the image:

docker run -it -e ANTHROPIC_API_KEY="sk-ant-..." ghcr.io/avala-ai/agent-code

Usage

How do I switch between models mid-session?

Use the /model command. It opens an interactive picker based on your configured provider:

> /model

Or specify directly: /model gpt-4.1-mini

How do I use it with a local model?

# Start Ollama
ollama serve

# Run agent-code with local model
agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key unused

What does plan mode do?

Plan mode (/plan) restricts the agent to read-only operations. It can search, read files, and analyze code, but cannot edit files or run shell commands. Useful for exploring unfamiliar codebases safely.

How do I give the agent project context?

Create an AGENTS.md file in your project root with instructions, conventions, and architecture notes. This is loaded automatically at the start of every session.

Can it access the internet?

Yes. The WebFetch and WebSearch tools allow the agent to fetch URLs and search the web. These go through the normal permission system.

Cost

How much does it cost per session?

It depends on the model, task complexity, and conversation length. Typical sessions:

Task                     Model            Approximate Cost
Quick fix                Sonnet/GPT-4.1   $0.02 - $0.10
Feature implementation   Sonnet/GPT-4.1   $0.10 - $0.50
Complex refactor         Opus/GPT-5.4     $0.50 - $2.00

How do I set a spending limit?

# ~/.config/agent-code/config.toml
[api]
max_cost_usd = 5.0  # Stop after $5 spent

Or check usage anytime with /cost.

Security

Can the agent delete my files?

The agent asks for permission before destructive operations (default mode). You can also:

  • Use plan mode for read-only: agent --permission-mode plan
  • Block specific commands: add deny rules in config
  • Protected directories (.git/, .husky/, node_modules/) are always blocked from writes

Does it send my code to third parties?

Your code is sent to whichever LLM provider you configure (Anthropic, OpenAI, etc.) as part of the conversation context. It is not sent anywhere else. For maximum privacy, use a local model via Ollama.

Can I restrict which tools the agent uses?

Yes, via permission rules:

[permissions]
default_mode = "ask"

[[permissions.rules]]
tool = "Bash"
pattern = "rm *"
action = "deny"

Extensibility

How do I create a custom skill?

Create a markdown file in .agent/skills/:

---
description: My custom workflow
userInvocable: true
---

Do the thing step by step...

Then invoke it with /my-skill. See the Skills guide for details.

Can I connect external tools via MCP?

Yes. Add MCP servers to your config:

[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]

See the MCP guide for details.

Can I use it as a library in my own project?

Yes. The engine is published as agent-code-lib on crates.io:

[dependencies]
agent-code-lib = "0.9"

The binary is a thin wrapper around this library.

agent-code executes shell commands and modifies files on your behalf. The security model ensures the agent only takes actions you've approved.

Permission System

Every tool call passes through a permission check:

Mode           Behavior
ask (default)  Prompts before mutations, auto-allows reads
allow          Auto-approves everything
deny           Blocks all mutations
plan           Read-only tools only
accept_edits   Auto-approves file edits, asks for shell commands

Configure per-tool rules:

[permissions]
default_mode = "ask"

[[permissions.rules]]
tool = "Bash"
pattern = "git *"
action = "allow"

[[permissions.rules]]
tool = "Bash"
pattern = "rm *"
action = "deny"
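
With rules like these, matching is easy to picture: the first matching rule decides, and the default mode applies otherwise. An illustrative (not the real) matcher for the two example rules:

```shell
# Decide the outcome for a Bash command under the example rules above.
check_bash() {
  case "$1" in
    git\ *) echo allow ;;  # rule: pattern "git *" -> allow
    rm\ *)  echo deny ;;   # rule: pattern "rm *"  -> deny
    *)      echo ask ;;    # no rule matched -> default_mode
  esac
}
```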

Protected Directories

These directories are blocked from writes regardless of permission config:

  • .git/ — prevents repository corruption
  • .husky/ — prevents hook tampering
  • node_modules/ — prevents dependency modification

Read access is unaffected.
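
A sketch of the check (illustrative; the real implementation is internal to agent-code and also normalizes absolute paths):

```shell
# Reject writes whose path falls under a protected directory.
is_protected_write() {
  case "$1" in
    .git/*|*/.git/*|.husky/*|*/.husky/*|node_modules/*|*/node_modules/*) return 0 ;;
    *) return 1 ;;
  esac
}
```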

Bash Safety

The Bash tool detects destructive commands and warns before execution:

  • rm -rf, git reset --hard, DROP TABLE
  • chmod -R 777, mkfs, dd
  • System paths (/etc, /usr, /bin, /sbin, /boot)
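
A rough sketch of this kind of substring check (the real detector is internal to agent-code and considerably more thorough):

```shell
# Flag commands containing well-known destructive patterns.
looks_destructive() {
  case "$1" in
    *'rm -rf'*|*'git reset --hard'*|*'DROP TABLE'*|*'chmod -R 777'*|*mkfs*) return 0 ;;
    *) return 1 ;;
  esac
}
```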

Large outputs are truncated and persisted to disk.

Skill Safety

Skills from untrusted sources may contain embedded shell blocks. Disable them:

[security]
disable_skill_shell_execution = true

Shell blocks in skill templates are stripped. Non-shell code blocks are preserved.

MCP Server Security

  • Servers run as local subprocesses (your permissions)
  • Tools are namespaced per server
  • Restrict with allowlist/denylist:

[security]
mcp_server_allowlist = ["github", "filesystem"]

API Key Safety

  • Prefer environment variables for keys; never commit keys to config files under version control
  • Never logged or included in error messages
  • Passed to subagents via environment only

Data Privacy

  • No telemetry collected or transmitted
  • Sessions stored locally (~/.config/agent-code/sessions/)
  • Code sent only to your configured LLM provider
  • Use Ollama for fully local, air-gapped operation

Bypass Prevention

The --dangerously-skip-permissions flag disables all checks. To block it:

[security]
disable_bypass_permissions = true

Full Enterprise Config

[security]
disable_bypass_permissions = true
disable_skill_shell_execution = true
mcp_server_allowlist = ["github", "filesystem"]
env_allowlist = ["PATH", "HOME", "SHELL"]
additional_directories = ["/shared/docs"]

Reporting Vulnerabilities

Email security@avala.ai — do not open public issues. See SECURITY.md for the full policy.

API Reference (rustdoc)