What is agent-code?
agent-code is an open-source, AI-powered coding agent for the terminal. You describe what you want in natural language, and the agent reads your codebase, runs commands, edits files, and iterates until the task is done.
It's built in pure Rust for speed, safety, and a single static binary with zero runtime dependencies.
How it works
The core loop is simple:
- You type a request
- The agent sends your request + conversation history to an LLM
- The LLM responds with text and tool calls
- The agent executes the tools (file reads, edits, shell commands, etc.)
- Tool results are fed back to the LLM
- Repeat until the task is done
Every tool call passes through a permission system before executing. You stay in control.
You: "add input validation to the signup endpoint"
Agent:
→ FileRead src/routes/signup.rs
→ Grep "validate" src/
→ FileEdit src/routes/signup.rs (add validation logic)
→ Bash "cargo test"
→ FileEdit src/routes/signup.rs (fix test failure)
→ Bash "cargo test"
✓ All tests pass. Validation added.
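The transcript above is what the loop looks like from the outside. As a minimal sketch in Python (every name here, `llm`, `tools`, `check_permission`, is hypothetical, not agent-code's actual API):

```python
def agent_loop(llm, tools, check_permission, user_request, max_turns=20):
    """Minimal sketch of an LLM tool-use loop (hypothetical API)."""
    history = [{"role": "user", "content": user_request}]
    for _ in range(max_turns):
        reply = llm(history)                      # text plus requested tool calls
        history.append({"role": "assistant", "content": reply})
        calls = reply.get("tool_calls", [])
        if not calls:                             # no tool calls: task is done
            return reply["text"]
        results = []
        for call in calls:
            if not check_permission(call):        # every call is gated first
                results.append({"id": call["id"], "output": "permission denied"})
                continue
            output = tools[call["name"]](**call["args"])
            results.append({"id": call["id"], "output": output})
        history.append({"role": "tool", "content": results})  # feed results back
    raise RuntimeError("max turns exceeded")
```

The real engine layers streaming, retries, and compaction on top, but the shape is the same: call the model, run the approved tools, append the results, repeat.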
Key features
- 32 built-in tools for file operations, shell commands, code search, web access, language servers, and more
- 42 slash commands for git, session management, diagnostics, sharing, and agent control
- 12 bundled skills including security review, architecture advisor, bug hunter, and implementation planner
- 12 LLM providers — Anthropic, OpenAI, xAI, Google, DeepSeek, Groq, Mistral, Together, Zhipu, Ollama, AWS Bedrock, Google Vertex
- Permission system with configurable rules per tool, pattern matching, and protected directories
- MCP support for connecting external tool servers
- Memory system for persistent context across sessions
- Plugin system for bundling skills, hooks, and configuration
- Session persistence with save, resume, share, fork, and rewind
- Plan mode for safe read-only exploration
- Extended thinking for complex reasoning tasks
- Cross-platform — Linux, macOS, and Windows
Architecture
        ┌────────────┐
        │ CLI / REPL │
        └─────┬──────┘
              │
┌─────────────▼───────────────────┐
│           Query Engine          │
│ stream → tools → loop → compact │
└────┬───────────┬───────────┬────┘
     │           │           │
┌────▼─────┐ ┌───▼──────┐ ┌──▼───────┐
│  Tools   │ │  Perms   │ │  Hooks   │
│ 31 built │ │ allow    │ │ pre/post │
│ + MCP    │ │ deny/ask │ │ shell    │
└──────────┘ └──────────┘ └──────────┘
Next steps
Install
# Homebrew
brew install avala-ai/tap/agent-code

# Or build from source
git clone https://github.com/avala-ai/agent-code.git
cd agent-code
cargo build --release
# Binary: target/release/agent
Set your API key
agent-code works with any LLM provider. Set the key for the one you use:
# e.g. OpenAI
export OPENAI_API_KEY="sk-..."

# Or a generic key and endpoint for any OpenAI-compatible provider
export AGENT_CODE_API_KEY="your-key"
export AGENT_CODE_API_BASE_URL="https://api.your-provider.com/v1"
Start the agent
agent
You'll see:
agent session a1b2c3d
Type your message, or /help for commands. Ctrl+C to cancel, Ctrl+D to exit.
>
Try it out
Type a natural language request:
> what files are in this project?
The agent will use the Glob and FileRead tools to explore and answer.
Try something more complex:
> add a health check endpoint to the API server that returns the git commit hash
The agent will:
- Read the existing code to understand the project structure
- Find how other endpoints are defined
- Write the new endpoint
- Run tests if they exist
Slash commands
Type /help to see all available commands:
> /help
Available commands:
/help Show this help message
/clear Clear conversation history
/cost Show session cost and token usage
/model Show or change the current model
/commit Commit current changes
/review Review current diff for issues
/plan Toggle plan mode (read-only)
/doctor Check environment health
...
One-shot mode
For scripting and CI, use --prompt to run a single task and exit:
agent --prompt "fix the failing tests" --dangerously-skip-permissions
Next steps
Requirements
- A supported LLM API key (Anthropic, OpenAI, or any compatible provider)
- `git` and `rg` (ripgrep) for full functionality
Install methods
Cargo (recommended)
If you have Rust installed:
cargo install agent-code
This installs the agent binary to ~/.cargo/bin/.
Homebrew
On macOS or Linux:
brew install avala-ai/tap/agent-code
Prebuilt binaries
Download from GitHub Releases:
| Platform | Architecture | Download |
|---|---|---|
| Linux | x86_64 | agent-linux-x86_64.tar.gz |
| Linux | aarch64 | agent-linux-aarch64.tar.gz |
| macOS | x86_64 | agent-macos-x86_64.tar.gz |
| macOS | Apple Silicon | agent-macos-aarch64.tar.gz |
| Windows | x86_64 | agent-windows-x86_64.zip |
# Example: macOS Apple Silicon
curl -L https://github.com/avala-ai/agent-code/releases/latest/download/agent-macos-aarch64.tar.gz | tar xz
sudo mv agent /usr/local/bin/
From source
git clone https://github.com/avala-ai/agent-code.git
cd agent-code
cargo build --release
sudo cp target/release/agent /usr/local/bin/
Verify installation
agent --version
# agent 0.1.1
Run the environment check:
agent --dump-system-prompt | head -5
# You are an AI coding agent...
Uninstall
# Cargo
cargo uninstall agent-code
# Homebrew
brew uninstall agent-code
# Manual
rm $(which agent)
Data locations
| What | Path |
|---|---|
| User config | ~/.config/agent-code/config.toml |
| Session data | ~/.config/agent-code/sessions/ |
| Memory | ~/.config/agent-code/memory/ |
| Skills | ~/.config/agent-code/skills/ |
| Plugins | ~/.config/agent-code/plugins/ |
| Keybindings | ~/.config/agent-code/keybindings.json |
| History | ~/.local/share/agent-code/history.txt |
| Tool output cache | ~/.cache/agent-code/tool-results/ |
| Task output | ~/.cache/agent-code/tasks/ |
This tutorial walks through using agent-code on a real project for the first time.
Prerequisites
- agent-code installed (`agent --version` works)
- An API key configured (any provider)
- A project directory with code in it
Step 1: Navigate to your project
cd /path/to/your/project
agent-code uses your current directory as context. It can read files, run commands, and make edits here.
Step 2: Start the agent
agent
You'll see the welcome banner with your session ID. The agent is ready.
Step 3: Explore the codebase
Ask the agent to understand your project:
> what is this project and how is it structured?
The agent will use Glob to find files, FileRead to read key files (README, package.json, Cargo.toml, etc.), and explain the structure.
Step 4: Make a change
Try something concrete:
> add a health check endpoint that returns {"status": "ok"}
The agent will:
- Read existing code to understand patterns
- Find where endpoints are defined
- Write the new endpoint
- Run tests if they exist
Watch the tool calls — you'll see FileRead, Grep, FileWrite, and Bash in action.
Step 5: Review what changed
> /diff
This shows the git diff of everything the agent modified.
Step 6: Commit if you're happy
> /commit
The agent reviews the diff and creates a commit with a descriptive message.
Step 7: Save project context
Create an AGENTS.md file so the agent remembers your project in future sessions:
> /init
Or ask the agent to create one:
> create an AGENTS.md with our project's tech stack, conventions, and test commands
What's next
- Use `/plan` to explore code safely (read-only mode)
- Use `/review` to review your changes before committing
- Use `/model` to switch to a faster or more capable model
- See Custom Skills to create reusable workflows
Skills turn multi-step workflows into single commands. This tutorial creates a skill from scratch.
What we'll build
A /deploy-check skill that verifies a project is ready for deployment: tests pass, no uncommitted changes, and the build succeeds.
Step 1: Create the skill file
mkdir -p .agent/skills
Create .agent/skills/deploy-check.md:
---
description: Verify the project is ready for deployment
userInvocable: true
---
Run a pre-deployment checklist:
1. Check for uncommitted changes with `git status`. If there are
uncommitted changes, warn the user and stop.
2. Run the project's test suite. If any tests fail, report the
failures and stop.
3. Run the build command. If it fails, report the error and stop.
4. If everything passes, report "Ready to deploy" with a summary
of what was checked.
Do not proceed past a failing step.
Step 2: Verify it loaded
Start agent-code and check:
> /skills
You should see deploy-check [invocable] in the list.
Step 3: Run it
> /deploy-check
The agent follows the steps in order, stopping at the first failure.
Adding arguments
Skills support {{arg}} substitution. Create .agent/skills/review-file.md:
---
description: Deep review of a specific file
userInvocable: true
---
Review {{arg}} thoroughly:
1. Read the file and understand its purpose
2. Check for bugs, edge cases, and error handling gaps
3. Check for security issues (injection, XSS, auth bypass)
4. Suggest specific improvements with line references
Use it:
> /review-file src/auth.rs
Directory skills
For complex skills with supporting context, use a directory:
.agent/skills/
deploy-check/
SKILL.md ← the skill definition
checklist.md ← referenced by the skill
known-issues.md ← context the agent can read
The skill file must be named SKILL.md in a directory skill.
Sharing skills
Skills are just markdown files. Share them by:
- Committing `.agent/skills/` to your repo (team-wide)
- Copying to `~/.config/agent-code/skills/` (personal, all projects)
- Publishing as a plugin (see Plugins)
Tips
- Keep skill prompts specific — vague instructions produce vague results
- Number the steps — the agent follows numbered lists reliably
- Include stop conditions ("if X fails, stop and report")
- Test with `/skills` to verify loading before running
MCP lets you extend agent-code with tools from external servers — databases, APIs, file systems, and more.
What we'll do
Connect the official GitHub MCP server so the agent can create issues, read PRs, and manage repositories.
Step 1: Create project config
agent # start agent-code
> /init # creates .agent/settings.toml
Or create it manually:
mkdir -p .agent
touch .agent/settings.toml
Step 2: Add the MCP server
Edit .agent/settings.toml:
[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_your_token_here" }
Step 3: Restart and verify
Restart agent-code and check the connection:
> /mcp
1 MCP server(s) configured:
github (stdio)
Step 4: Use it
The agent now has access to GitHub tools:
> create a GitHub issue titled "Add input validation" with a description of what needs to be done
The agent calls the MCP server's create_issue tool, which creates the issue via the GitHub API.
SSE transport (remote servers)
For servers that expose an HTTP endpoint:
[mcp_servers.my-api]
url = "http://localhost:8080"
The agent connects via Server-Sent Events instead of stdio.
Multiple servers
Add as many as you need:
[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_..." }
[mcp_servers.postgres]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/docs"]
Security
MCP servers have access to whatever their underlying service provides. Restrict access with:
[security]
mcp_server_allowlist = ["github", "filesystem"] # Only these servers can connect
Or block specific ones:
[security]
mcp_server_denylist = ["untrusted-server"]
Browsing MCP resources
Some MCP servers expose resources (like database schemas or file listings). The agent can browse them with the ListMcpResources and ReadMcpResource tools automatically.
Troubleshooting
- Server stuck connecting: verify the command works manually — `npx -y @modelcontextprotocol/server-github`
- Tools not appearing: restart agent-code after config changes
- Permission errors: check the env vars (tokens, credentials) are correct
- See the full MCP configuration guide for details
agent-code works with 12+ LLM providers. This tutorial shows how to switch between them and set up your preferred workflow.
Quick switch: one env var
Each provider is activated by setting its API key:
# Anthropic (Claude)
export ANTHROPIC_API_KEY="sk-ant-..."
agent
# OpenAI (GPT)
export OPENAI_API_KEY="sk-..."
agent
# Google (Gemini)
export GOOGLE_API_KEY="AIza..."
agent
The agent auto-detects the provider from which key is set and configures the correct API endpoint.
Switch mid-session
Use the /model command to open the interactive model picker:
> /model
This shows models available for your current provider. Or specify directly:
> /model gpt-4.1-mini
Use a specific provider via config
For permanent setup, edit your config file:
# ~/.config/agent-code/config.toml
[api]
model = "claude-sonnet-4-20250514"
Local models with Ollama
Run models locally with zero API cost:
# Install and start Ollama
ollama serve
# Pull a model
ollama pull llama3
# Run agent-code with it
agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key unused
Passing `--api-key` with a dummy value is required: Ollama ignores the key, but the flag itself must be present.
Any OpenAI-compatible endpoint
agent-code works with any service that speaks the OpenAI Chat Completions API:
# OpenRouter (access any model via one key)
agent --api-base-url https://openrouter.ai/api/v1 --api-key sk-or-... --model anthropic/claude-sonnet-4
# Together AI
agent --api-base-url https://api.together.xyz/v1 --api-key ... --model meta-llama/Llama-3-70b-chat-hf
# Groq (fast inference)
agent --api-base-url https://api.groq.com/openai/v1 --api-key gsk_... --model llama-3.3-70b-versatile
# Your own endpoint
agent --api-base-url http://localhost:8080/v1 --api-key ... --model my-model
AWS Bedrock
Access Claude models through your AWS account:
export AGENT_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1
# Uses your default AWS credential chain (env vars, ~/.aws/credentials, IAM role)
agent
Google Vertex AI
Access Claude models through Google Cloud:
export AGENT_CODE_USE_VERTEX=1
export GOOGLE_CLOUD_PROJECT=my-project
export GOOGLE_CLOUD_LOCATION=us-central1
agent
Model recommendations by task
| Task | Recommended | Why |
|---|---|---|
| Quick fixes, small edits | GPT-4.1-mini, Haiku, Gemini Flash | Fast, cheap |
| Feature implementation | Sonnet, GPT-4.1 | Good balance |
| Complex architecture | Opus, GPT-5.4 | Maximum reasoning |
| Local/private code | Ollama + Llama 3 | No data leaves your machine |
Cost tracking
Check what you're spending:
> /cost
Set a session limit:
[api]
max_cost_usd = 5.0 # Stop after $5
The /cost command shows per-model breakdown when you've used multiple models in one session.
The agent loop is the core execution engine. It handles the cycle of calling the LLM, executing tools, and managing context.
Turn lifecycle
Each turn follows this sequence:
1. Budget check → stop if cost/token limit exceeded
2. Message normalize → pair orphaned tool results, merge consecutive user messages
3. Auto-compact → if context nears window limit:
microcompact → LLM summary → context collapse → aggressive trim
4. Build request → system prompt + history + tool schemas
5. Stream response → display text in real-time, collect content blocks
6. Error recovery → rate limit retry, prompt-too-long compact, max-output continue
7. Extract tool calls → parse tool_use blocks from response
8. Permission check → allow/deny/ask per tool and pattern
9. Execute tools → concurrent batch (read-only) or serial (mutations)
10. Inject results → add tool results to history
11. Loop → back to step 1 until no tool calls
Compaction strategies
Long sessions eventually exceed the context window. When that happens, the agent escalates through the compaction steps listed in the turn lifecycle: microcompact first, then an LLM-generated summary, then context collapse, with aggressive trim as a last resort.
Error recovery
The agent handles these error conditions automatically:
| Error | Recovery |
|---|---|
| Rate limited (429) | Wait retry_after ms, retry up to 5 times |
| Overloaded (529) | 5s backoff, retry up to 5 times, then fall back to smaller model |
| Prompt too long (413) | Reactive microcompact, then context collapse |
| Max output tokens | Inject continuation message, retry up to 3 times |
| Stream interrupted | Exponential backoff with retry |
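The retry policies in the table can be sketched as a generic wrapper. This is an illustration of the pattern, not agent-code's internals; `RateLimited` and the delay values are stand-ins:

```python
import time

class RateLimited(Exception):
    """Stand-in for a 429 response carrying a server-suggested wait."""
    def __init__(self, retry_after_ms):
        self.retry_after_ms = retry_after_ms

def with_retries(request, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Retry a request, honoring server-provided retry_after when present."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimited as e:
            if attempt == max_retries:
                raise
            sleep(e.retry_after_ms / 1000)        # server told us how long to wait
        except ConnectionError:
            if attempt == max_retries:
                raise
            sleep(base_delay * (2 ** attempt))    # exponential backoff otherwise
```

The key distinction is the one the table draws: rate limits come with an explicit wait hint, while interrupted streams fall back to exponential backoff.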
Token budget
The agent tracks token usage and estimated cost:
- Auto-compact fires at `context_window - 20K reserved - 13K buffer`
- Budget enforcement stops execution when cost or token limits are reached
- Diminishing progress detection stops after 3 turns with minimal output
Configure limits:
[api]
max_cost_usd = 5.0 # Stop after $5 spent this session
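Worked out, the auto-compact formula from the bullet above gives a concrete trigger point (the 20K/13K figures come from that bullet; the function name is illustrative):

```python
def autocompact_threshold(context_window, reserved=20_000, buffer=13_000):
    """Token count at which auto-compact fires, per the formula above."""
    return context_window - reserved - buffer

# With a 200K-token context window, compaction fires at 167K tokens
assert autocompact_threshold(200_000) == 167_000
```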
Extended thinking
When using models that support it, the agent sends a thinking budget with each request. The budget scales by model:
| Model | Thinking budget |
|---|---|
| Opus | 32,000 tokens |
| Sonnet | 16,000 tokens |
| Haiku | 8,000 tokens |
Thinking content is displayed briefly in the terminal but not stored in conversation history.
Tools are the bridge between the LLM's intentions and your local environment. Each tool defines what it can do, what inputs it accepts, and whether it's safe to run in parallel.
How tools work
- The LLM decides which tool to call and with what arguments
- The agent validates the input against the tool's schema
- Permission checks run (allow/deny/ask based on your rules)
- The tool executes and returns a result
- The result is sent back to the LLM for the next step
Tool categories
File operations
| Tool | What it does |
|---|---|
| FileRead | Read files with line numbers. Handles text, PDF (via pdftotext), Jupyter notebooks, and images (base64 for vision). |
| FileWrite | Create or overwrite files. Auto-creates parent directories. |
| FileEdit | Search-and-replace within files. Requires unique match unless replace_all is set. |
| NotebookEdit | Edit Jupyter notebook cells (replace, insert, delete). |
Search
| Tool | What it does |
|---|---|
| Grep | Regex content search powered by ripgrep. Supports context lines, glob filtering, case sensitivity. |
| Glob | Find files by pattern, sorted by modification time. |
| ToolSearch | Discover available tools by keyword or direct name. |
Execution
| Tool | What it does |
|---|---|
| Bash | Run shell commands with timeout, background execution, destructive command detection, and output truncation. |
| REPL | Execute Python or Node.js code snippets in an interpreter. |
Agent coordination
| Tool | What it does |
|---|---|
| Agent | Spawn subagents for parallel tasks with optional git worktree isolation. |
| SendMessage | Send messages between running agents. |
| Skill | Invoke user-defined skills programmatically. |
Planning and tracking
| Tool | What it does |
|---|---|
| EnterPlanMode / ExitPlanMode | Toggle read-only mode for safe exploration. |
| TaskCreate / TaskUpdate / TaskGet / TaskList / TaskStop / TaskOutput | Full task lifecycle management. |
| TodoWrite | Structured todo list management. |
Web and external
| Tool | What it does |
|---|---|
| WebFetch | HTTP GET with HTML-to-text conversion. |
| WebSearch | Web search with result extraction. |
| LSP | Language server diagnostics with linter fallbacks. |
MCP integration
| Tool | What it does |
|---|---|
| McpProxy | Calls tools on connected MCP servers. |
| ListMcpResources | Browse MCP server resources. |
| ReadMcpResource | Read a specific MCP resource by URI. |
Workspace
| Tool | What it does |
|---|---|
| EnterWorktree / ExitWorktree | Create and manage isolated git worktrees. |
| AskUserQuestion | Prompt the user with structured multi-choice questions. |
| Sleep | Async pause with cancellation. |
Concurrency
Tools declare whether they're safe to run in parallel:
- Read-only tools (FileRead, Grep, Glob, etc.) run concurrently
- Mutation tools (Bash, FileWrite, FileEdit) run serially
This maximizes throughput while preventing race conditions on file writes.
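The scheduling rule can be sketched like this (hypothetical tool-call dicts with a `read_only` flag; the real implementation is Rust, not asyncio):

```python
import asyncio

async def run_batch(calls):
    """Run read-only calls concurrently, then mutations one at a time."""
    reads = [c for c in calls if c["read_only"]]
    writes = [c for c in calls if not c["read_only"]]
    results = list(await asyncio.gather(*(c["run"]() for c in reads)))
    for c in writes:                      # serial: no racing file writes
        results.append(await c["run"]())
    return results
```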
Result handling
Tool results larger than 64KB are automatically persisted to disk (~/.cache/agent-code/tool-results/). The conversation receives a truncated preview with a file path reference so the full output isn't lost.
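That spill-to-disk behavior can be sketched as follows. The 64KB limit and cache location come from the text above; the function name and preview size are illustrative:

```python
import hashlib
import pathlib

LIMIT = 64 * 1024          # results above this are written to disk
PREVIEW = 2 * 1024         # how much stays inline (illustrative figure)

def store_result(output: str, cache_dir: pathlib.Path) -> str:
    """Return the result inline, or a preview plus a path to the full output."""
    if len(output.encode()) <= LIMIT:
        return output
    cache_dir.mkdir(parents=True, exist_ok=True)
    name = hashlib.sha256(output.encode()).hexdigest()[:16] + ".txt"
    path = cache_dir / name
    path.write_text(output)
    return f"{output[:PREVIEW]}\n... [truncated; full output: {path}]"
```

The LLM sees the preview and the path, so it can ask to read specific parts of the full output later instead of losing it.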
Writing custom tools
See Custom Tools for how to implement the Tool trait and register new tools.
Every tool call passes through the permission system before executing. This gives you fine-grained control over what the agent is allowed to do.
Permission modes
Set the mode via CLI flag or config:
| Mode | Behavior |
|---|---|
| `ask` (default) | Prompt before mutations, auto-allow reads |
| `allow` | Auto-approve everything |
| `deny` | Block all mutations |
| `plan` | Read-only tools only (strictest) |
| `accept_edits` | Auto-approve file edits, ask for shell commands |
# CLI
agent --permission-mode plan
# Skip all checks (CI/scripting only)
agent --dangerously-skip-permissions
Permission rules
Configure per-tool rules in your settings file:
# .agent/settings.toml or ~/.config/agent-code/config.toml
[permissions]
default_mode = "ask"
# Auto-approve git commands
[[permissions.rules]]
tool = "Bash"
pattern = "git *"
action = "allow"
# Block destructive commands
[[permissions.rules]]
tool = "Bash"
pattern = "rm -rf *"
action = "deny"
# Allow writes only to /tmp
[[permissions.rules]]
tool = "FileWrite"
pattern = "/tmp/*"
action = "allow"
Rules are evaluated in order. The first matching rule wins. If no rule matches, the default mode applies.
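First-match-wins evaluation can be sketched with shell-style glob matching (a sketch using Python's `fnmatch` for illustration; agent-code's actual matcher may differ):

```python
from fnmatch import fnmatch

# Mirrors the TOML rules above
RULES = [
    {"tool": "Bash", "pattern": "git *", "action": "allow"},
    {"tool": "Bash", "pattern": "rm -rf *", "action": "deny"},
    {"tool": "FileWrite", "pattern": "/tmp/*", "action": "allow"},
]

def decide(tool: str, target: str, default_mode: str = "ask") -> str:
    """Walk the rules in order; the first match wins, else fall back."""
    for rule in RULES:
        if rule["tool"] == tool and fnmatch(target, rule["pattern"]):
            return rule["action"]
    return default_mode
```

Because order matters, put narrow `deny` rules before broad `allow` rules when the patterns overlap.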
Built-in safety
Protected directories
Write tools (FileWrite, FileEdit, MultiEdit, NotebookEdit) are blocked from writing to these directories regardless of your permission rules:
- `.git/` — prevent repository corruption
- `.husky/` — prevent hook tampering
- `node_modules/` — prevent dependency modification
Read access to these directories is unaffected.
Shell safety
The Bash tool includes additional safety checks beyond the permission system:
- Destructive command detection: warns before `rm -rf`, `git reset --hard`, `DROP TABLE`, `chmod -R 777`, and other dangerous patterns
- System path blocking: prevents writes to `/etc`, `/usr`, `/bin`, `/sbin`, `/boot`, `/sys`, `/proc`
- Output truncation: large outputs are persisted to disk instead of flooding context
Plan mode
Plan mode restricts the agent to read-only operations. Use it when you want the agent to analyze and plan without making changes:
> /plan
Plan mode enabled. Only read-only tools available.
> analyze the architecture and suggest improvements
(agent reads files, searches code, but cannot edit or run commands)
> /plan
Plan mode disabled. All tools available.
Denial tracking
Permission denials are recorded with the tool name, reason, and input summary. View them with /permissions:
> /permissions
Permission mode: Ask
Rules:
Bash git * -> Allow
Bash rm * -> Deny
Memory gives the agent context that persists across sessions. There are two layers:
Project memory
Place an `AGENTS.md` file in your project root or at `.agent/AGENTS.md`:
# Project Context
This is a Rust web API using Axum and SQLx.
The database is PostgreSQL, migrations are in db/migrations/.
Run tests with `cargo test`. The CI pipeline is in .github/workflows/ci.yml.
Always run `cargo fmt` before committing.
This is loaded automatically at the start of every session in that project directory. Use it for project-specific instructions, conventions, and context that every session needs.
User memory
User-level memory lives in ~/.config/agent-code/memory/:
- `MEMORY.md` — the index file, loaded automatically
- Individual memory files linked from the index
<!-- ~/.config/agent-code/memory/MEMORY.md -->
- [Preferences](preferences.md) — coding style and response preferences
- [Work context](work.md) — current projects and priorities
<!-- ~/.config/agent-code/memory/preferences.md -->
---
name: preferences
description: User coding style preferences
type: user
---
- I prefer explicit error handling over unwrap/expect
- Use descriptive variable names, not single letters
- Always include tests for new functions
How memory is used
Memory files are injected into the system prompt at session start:
- Project `AGENTS.md` → appears under "# Project Context"
- User `MEMORY.md` index → appears under "# User Memory"
- Individual memory files linked from the index → loaded and appended
The agent sees this context on every turn, so it can follow your conventions and understand your project without being told every time.
Size limits
| Limit | Value |
|---|---|
| Max file size | 25KB per memory file |
| Max index lines | 200 lines |
Files exceeding these limits are truncated with a (truncated) marker.
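The truncation behavior can be sketched as (limit value from the table above; the function name is illustrative):

```python
def load_memory(text: str, max_bytes: int = 25 * 1024) -> str:
    """Enforce the per-file size limit, appending the truncation marker."""
    data = text.encode()
    if len(data) <= max_bytes:
        return text
    return data[:max_bytes].decode(errors="ignore") + "\n(truncated)"
```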
Commands
> /memory
Project context: loaded
User memory: loaded (2 files)
Every interactive session is automatically saved when you exit. You can resume any previous session to continue where you left off.
Auto-save
Sessions save automatically on exit (Ctrl+D or /exit). The session file includes:
- Full conversation history (messages, tool calls, results)
- Turn count and model used
- Working directory at session start
- Session ID
Sessions are stored as JSON in ~/.config/agent-code/sessions/.
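A session file of that shape can be read and written like so (field names here are inferred from the list above for illustration, not the exact on-disk schema):

```python
import json
import pathlib

def save_session(session: dict, sessions_dir: pathlib.Path) -> pathlib.Path:
    """Write one session as JSON, keyed by its short ID."""
    sessions_dir.mkdir(parents=True, exist_ok=True)
    path = sessions_dir / f"{session['id']}.json"
    path.write_text(json.dumps(session, indent=2))
    return path

def resume_session(session_id: str, sessions_dir: pathlib.Path) -> dict:
    """Load a session's full history so the conversation can continue."""
    return json.loads((sessions_dir / f"{session_id}.json").read_text())
```

Because sessions are plain JSON, you can also inspect or grep them directly from the shell.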
Resume a session
List recent sessions:
> /sessions
Recent sessions:
a1b2c3d — /home/user/project (5 turns, 23 msgs, 2026-03-31T20:15:00Z)
e4f5g6h — /home/user/other (12 turns, 67 msgs, 2026-03-31T18:30:00Z)
Use /resume <id> to restore a session.
Resume by ID:
> /resume a1b2c3d
Resumed session a1b2c3d (23 messages, 5 turns)
The agent now has the full conversation context from the previous session and can continue the work.
Session ID
Each session gets a short unique ID shown in the welcome banner:
agent session a1b2c3d
Use this ID to resume later.
Export
Export the current conversation as markdown:
> /export
Exported to conversation-export-20260331-201500.md
The export includes user messages and assistant responses as readable markdown.
Configuration loads from three layers (highest priority first):
- CLI flags and environment variables
- Project config — `.agent/settings.toml` in your repo
- User config — `~/.config/agent-code/config.toml`
Full config reference
# ~/.config/agent-code/config.toml
[api]
base_url = "https://api.anthropic.com/v1"
model = "claude-sonnet-4-20250514"
# api_key is resolved from env: AGENT_CODE_API_KEY, ANTHROPIC_API_KEY, OPENAI_API_KEY
max_output_tokens = 16384
thinking = "enabled" # "enabled", "disabled", or omit for default
effort = "high" # "low", "medium", "high"
max_cost_usd = 10.0 # Stop session after this spend
timeout_secs = 120
max_retries = 3
[permissions]
default_mode = "ask" # "ask", "allow", "deny", "plan", "accept_edits"
[[permissions.rules]]
tool = "Bash"
pattern = "git *"
action = "allow"
[[permissions.rules]]
tool = "Bash"
pattern = "rm *"
action = "deny"
[ui]
markdown = true
syntax_highlight = true
theme = "dark"
# MCP servers (see MCP Servers page)
[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/path"]
# Lifecycle hooks (see Hooks page)
[[hooks]]
event = "post_tool_use"
tool_name = "FileWrite"
[hooks.action]
type = "shell"
command = "cargo fmt"
Project config
Create .agent/settings.toml in your repo root for project-specific settings. These override user config but are overridden by CLI flags.
Initialize with:
agent
> /init
Created .agent/settings.toml
Environment variables
| Variable | Purpose |
|---|---|
| `AGENT_CODE_API_KEY` | API key (highest priority) |
| `ANTHROPIC_API_KEY` | Anthropic API key |
| `OPENAI_API_KEY` | OpenAI API key |
| `AGENT_CODE_API_BASE_URL` | API endpoint override |
| `AGENT_CODE_MODEL` | Model override |
| `EDITOR` | Determines vi/emacs REPL mode |
agent-code works with any LLM that speaks the Anthropic Messages API or OpenAI Chat Completions API. The provider is auto-detected from your model name and base URL.
Anthropic (Claude)
export ANTHROPIC_API_KEY="sk-ant-..."
agent
Supported models: Claude Opus, Sonnet, Haiku (all versions).
Features enabled: prompt caching, extended thinking, cache_control breakpoints.
OpenAI (GPT)
export OPENAI_API_KEY="sk-..."
agent --model gpt-4o
Supported models: GPT-4o, GPT-4, o1, o3, and others.
Ollama (local)
agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key unused
No API key needed (pass any string). Start Ollama first: ollama serve.
Groq
agent --api-base-url https://api.groq.com/openai/v1 --api-key gsk_... --model llama-3.3-70b-versatile
Together AI
agent --api-base-url https://api.together.xyz/v1 --api-key ... --model meta-llama/Llama-3-70b-chat-hf
DeepSeek
agent --api-base-url https://api.deepseek.com/v1 --api-key ... --model deepseek-chat
OpenRouter
agent --api-base-url https://openrouter.ai/api/v1 --api-key ... --model anthropic/claude-sonnet-4
OpenRouter lets you access any model through a single API key.
Explicit provider selection
If auto-detection doesn't work for your setup, force it:
agent --provider anthropic # Use Anthropic wire format
agent --provider openai # Use OpenAI wire format
Auto-detection logic
The provider is chosen by checking (in order):
- `--provider` flag (if set)
- Base URL contains `anthropic.com` → Anthropic
- Base URL contains `openai.com` → OpenAI
- Base URL is `localhost` → OpenAI-compatible
- Model name starts with `claude`/`opus`/`sonnet`/`haiku` → Anthropic
- Model name starts with `gpt`/`o1`/`o3` → OpenAI
- Default → OpenAI-compatible (most common API shape)
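The decision list above translates almost directly into code (a sketch of the documented order, not the actual Rust implementation):

```python
def detect_provider(model: str = "", base_url: str = "", provider: str = "") -> str:
    """Pick a wire format using the documented checks, in order."""
    if provider:                                   # explicit flag wins
        return provider
    if "anthropic.com" in base_url:
        return "anthropic"
    if "openai.com" in base_url:
        return "openai"
    if "localhost" in base_url:
        return "openai-compatible"
    if model.startswith(("claude", "opus", "sonnet", "haiku")):
        return "anthropic"
    if model.startswith(("gpt", "o1", "o3")):
        return "openai"
    return "openai-compatible"                     # safest default shape
```

Note that the base URL checks run before the model-name checks, which is why a `claude`-named model served from `localhost` is still treated as OpenAI-compatible.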
MCP (Model Context Protocol) lets you extend agent-code with tools and resources from external servers. Any MCP-compatible server can be connected.
Configuration
Add servers to your config file:
# .agent/settings.toml or ~/.config/agent-code/config.toml
[mcp_servers.filesystem]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-filesystem", "/home/user/docs"]
[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_..." }
[mcp_servers.postgres]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
Transports
Stdio (default)
The server runs as a subprocess, communicating via stdin/stdout JSON-RPC:
[mcp_servers.myserver]
command = "path/to/server"
args = ["--flag", "value"]
env = { API_KEY = "..." }
SSE (HTTP)
For servers that expose an HTTP endpoint:
[mcp_servers.remote]
url = "http://localhost:8080"
How it works
- At startup, agent-code connects to each configured server
- The `initialize` handshake negotiates capabilities
- Tools are discovered via `tools/list` and registered as `mcp__server__tool` in the agent's tool pool
- When the LLM calls an MCP tool, the request is proxied to the server via `tools/call`
- Resources can be browsed with `ListMcpResources` and read with `ReadMcpResource`
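The `mcp__server__tool` naming convention keeps tools from different servers from colliding. A sketch of that registration step (the data shapes are illustrative):

```python
def register_mcp_tools(servers: dict) -> dict:
    """Prefix each discovered tool so names can't collide across servers."""
    pool = {}
    for server_name, tool_names in servers.items():
        for tool in tool_names:                # per-server tools/list result
            pool[f"mcp__{server_name}__{tool}"] = (server_name, tool)
    return pool
```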
Commands
> /mcp
2 MCP server(s) configured:
filesystem (stdio)
github (stdio)
Popular MCP servers
| Server | What it provides |
|---|---|
| `@modelcontextprotocol/server-filesystem` | File system access with path restrictions |
| `@modelcontextprotocol/server-github` | GitHub API (issues, PRs, repos) |
| `@modelcontextprotocol/server-postgres` | PostgreSQL query execution |
| `@modelcontextprotocol/server-sqlite` | SQLite database access |
| `@modelcontextprotocol/server-slack` | Slack messaging |
Find more at modelcontextprotocol.io.
Hooks let you run shell commands or HTTP requests at specific points in the agent's lifecycle. Use them for auto-formatting, linting, notifications, or custom validation.
Configuration
# .agent/settings.toml
# Auto-format Rust files after any write
[[hooks]]
event = "post_tool_use"
tool_name = "FileWrite"
[hooks.action]
type = "shell"
command = "cargo fmt"
# Lint after edits
[[hooks]]
event = "post_tool_use"
tool_name = "FileEdit"
[hooks.action]
type = "shell"
command = "cargo clippy --quiet"
# Notify on session start
[[hooks]]
event = "session_start"
[hooks.action]
type = "http"
url = "https://hooks.slack.com/services/T.../B.../..."
method = "POST"
Hook events
| Event | When it fires |
|---|---|
| `session_start` | Session begins |
| `session_stop` | Session ends |
| `pre_tool_use` | Before a tool executes |
| `post_tool_use` | After a tool completes |
| `user_prompt_submit` | User submits input |
Hook actions
Shell
Run a command in the project directory:
[hooks.action]
type = "shell"
command = "make lint"
HTTP
Send a request to a URL:
[hooks.action]
type = "http"
url = "https://example.com/webhook"
method = "POST"
Filtering by tool
Use tool_name to run hooks only for specific tools:
[[hooks]]
event = "pre_tool_use"
tool_name = "Bash"
[hooks.action]
type = "shell"
command = "echo 'Bash command about to run'"
Without tool_name, the hook fires for all tools.
Commands
> /hooks
Hook system active. Configure hooks in .agent/settings.toml:
[[hooks]]
event = "pre_tool_use"
action = { type = "shell", command = "./check.sh" }
Skills are reusable prompt templates that define multi-step workflows. They're markdown files with YAML frontmatter, loaded from .agent/skills/ or ~/.config/agent-code/skills/.
Creating a skill
Create a file in .agent/skills/test-and-fix.md:
---
description: Run tests and fix failures
whenToUse: When the user asks to test or fix failing tests
userInvocable: true
---
Run the test suite with the project's test command. If any tests fail:
1. Read the failing test file
2. Read the source code being tested
3. Identify the root cause
4. Fix the issue
5. Re-run tests to verify
Repeat until all tests pass. Do not skip or delete failing tests.
Frontmatter options
| Field | Type | Description |
|---|---|---|
description | string | What this skill does |
whenToUse | string | Hints for the LLM about when to suggest this skill |
userInvocable | boolean | Whether users can invoke it via /skill-name |
disableNonInteractive | boolean | Disable in one-shot mode |
paths | string[] | File patterns that trigger this skill suggestion |
Invoking skills
As a slash command
If userInvocable: true, invoke with the filename (minus .md):
> /test-and-fix
With arguments
Use {{arg}} in the template for argument substitution:
---
description: Review a specific file
userInvocable: true
---
Review {{arg}} for bugs, security issues, and code quality problems.
Focus on edge cases and error handling.
> /review src/auth.rs
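Under the hood, argument substitution is plain string templating: the text after the slash command replaces every `{{arg}}` placeholder. A minimal sketch (illustrative, not the actual skill loader):

```rust
/// Substitute the user's argument into a skill template.
/// Every occurrence of {{arg}} is replaced.
fn render_skill(template: &str, arg: &str) -> String {
    template.replace("{{arg}}", arg)
}

fn main() {
    let template = "Review {{arg}} for bugs, security issues, and code quality problems.";
    // "/review src/auth.rs" passes "src/auth.rs" as the argument.
    println!("{}", render_skill(template, "src/auth.rs"));
}
```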
Programmatically via the Skill tool
The LLM can invoke skills when it determines one is appropriate:
{
"name": "Skill",
"input": {
"skill": "test-and-fix",
"args": null
}
}
Directory skills
For complex skills with supporting files, use a directory:
.agent/skills/
deploy/
SKILL.md ← the skill definition
checklist.md ← referenced by the skill
Skill locations
| Location | Scope |
|---|---|
.agent/skills/ | Project-specific |
~/.config/agent-code/skills/ | Available in all projects |
Bundled skills
agent-code ships with 12 built-in skills. These are always available and can be overridden by placing a skill with the same name in your project or user skills directory.
| Skill | Purpose |
|---|---|
/commit | Create well-crafted git commits |
/review | Review diff for bugs and security issues |
/test | Run tests and fix failures |
/explain | Explain how code works |
/debug | Debug errors with root cause analysis |
/pr | Create pull requests |
/refactor | Refactor code for quality |
/init | Initialize project configuration |
/security-review | OWASP-oriented vulnerability scan |
/advisor | Architecture and dependency health analysis |
/bughunter | Systematic bug search |
/plan | Structured implementation planning |
Commands
> /skills
Loaded 15 skills:
commit [invocable] — Create a well-crafted git commit
review [invocable] — Review code changes for bugs and issues
test-and-fix [invocable] — Run tests and fix failures
deploy — Production deployment checklist
Plugins package skills, hooks, and configuration together as installable units. A plugin is a directory with a plugin.toml manifest.
Plugin structure
my-plugin/
plugin.toml ← manifest
skills/
deploy.md ← bundled skills
rollback.md
Manifest
# plugin.toml
name = "my-deploy-plugin"
version = "1.0.0"
description = "Deployment workflows for our stack"
author = "team@company.com"
skills = ["deploy", "rollback"]
[[hooks]]
event = "post_tool_use"
tool_name = "Bash"
command = "notify-deploy-status"
Installing plugins
Place plugin directories in:
| Location | Scope |
|---|---|
.agent/plugins/ | Project-specific |
~/.config/agent-code/plugins/ | Available in all projects |
Commands
> /plugins
Loaded 1 plugins:
my-deploy-plugin v1.0.0 — Deployment workflows for our stack
Skills from plugins are automatically registered and appear in /skills output.
Tools implement the Tool trait. Each tool defines its input schema, permission behavior, and execution logic.
The Tool trait
```rust
#[async_trait]
pub trait Tool: Send + Sync {
    /// Unique name used in API tool_use blocks.
    fn name(&self) -> &'static str;

    /// Description sent to the LLM.
    fn description(&self) -> &'static str;

    /// JSON Schema for input parameters.
    fn input_schema(&self) -> serde_json::Value;

    /// Execute the tool.
    async fn call(
        &self,
        input: serde_json::Value,
        ctx: &ToolContext,
    ) -> Result<ToolResult, ToolError>;

    /// Whether this tool only reads (no mutations).
    fn is_read_only(&self) -> bool {
        false
    }

    /// Whether it's safe to run in parallel with other tools.
    fn is_concurrency_safe(&self) -> bool {
        self.is_read_only()
    }
}
```
Example: a simple tool
```rust
pub struct TimeTool;

#[async_trait]
impl Tool for TimeTool {
    fn name(&self) -> &'static str {
        "Time"
    }

    fn description(&self) -> &'static str {
        "Returns the current date and time."
    }

    fn input_schema(&self) -> serde_json::Value {
        json!({ "type": "object", "properties": {} })
    }

    fn is_read_only(&self) -> bool {
        true
    }

    async fn call(
        &self,
        _input: serde_json::Value,
        _ctx: &ToolContext,
    ) -> Result<ToolResult, ToolError> {
        let now = chrono::Utc::now().to_rfc3339();
        Ok(ToolResult::success(now))
    }
}
```
Registering the tool
In src/tools/registry.rs:
```rust
pub fn default_tools() -> Self {
    let mut registry = Self::new();
    // ... existing tools ...
    registry.register(Arc::new(TimeTool));
    registry
}
```
ToolContext
Every tool receives a ToolContext with:
| Field | Type | Description |
|---|---|---|
cwd | PathBuf | Current working directory |
cancel | CancellationToken | Check for Ctrl+C |
permission_checker | Arc<PermissionChecker> | Check permissions |
verbose | bool | Verbose output mode |
plan_mode | bool | Read-only mode active |
file_cache | Option<Arc<Mutex<FileCache>>> | Shared file cache |
denial_tracker | Option<Arc<Mutex<DenialTracker>>> | Permission denial log |
ToolResult
```rust
// Success
Ok(ToolResult::success("output text"))

// Error (sent back to LLM as an error result)
Ok(ToolResult::error("what went wrong"))

// Fatal error (stops the tool, not sent to LLM)
Err(ToolError::ExecutionFailed("crash details".into()))
```
The IDE bridge allows editor extensions (VS Code, JetBrains, etc.) to communicate with a running agent-code instance over HTTP.
How it works
When agent-code starts, it writes a lock file to ~/.cache/agent-code/bridge/. IDE extensions scan for lock files to discover running sessions.
The lock file contains:
{
"port": 8432,
"pid": 12345,
"cwd": "/home/user/project",
"started_at": "2026-03-31T20:00:00Z"
}
Protocol
| Endpoint | Method | Description |
|---|---|---|
/status | GET | Agent status (idle/active, model, session info) |
/messages | GET | Conversation history |
/message | POST | Send a user message |
/events | GET | SSE stream of real-time events |
Discovery
Stale lock files (where the PID is no longer running) are automatically cleaned up when any client scans for bridges.
```rust
use rs_code::services::bridge::discover_bridges;

let bridges = discover_bridges();
for b in &bridges {
    println!("Found: pid={} port={} cwd={}", b.pid, b.port, b.cwd);
}
```
Building an extension
- Scan `~/.cache/agent-code/bridge/` for `.lock` files
- Parse the JSON to get the port
- Connect to `http://localhost:{port}`
- Use the HTTP endpoints to interact with the session
The bridge is read-only by default — the IDE can observe but not control the agent without explicit user action.
Long sessions exceed the LLM's context window. The compaction system keeps conversations running indefinitely by strategically reducing history size.
Context Window Layout
|<--- context window (e.g., 200K tokens) ------------------------------>|
|<--- effective window (context - 20K reserved for output) ------------>|
|<--- auto-compact threshold (effective - 13K buffer) ----------------->|
| ↑ compaction fires
The 20K reservation ensures the model always has room to respond. The 13K buffer prevents compaction from firing on every turn.
Three Strategies
Strategies are tried in order. Each is progressively more aggressive.
1. Microcompact
Cost: zero (no API call). Savings: moderate.
Clears the content field of old tool results, replacing them with [Old tool result cleared]. Keeps the tool_use/tool_result pairing intact so the conversation structure remains valid.
Before: ToolResult { content: "500 lines of file content..." }
After: ToolResult { content: "[Old tool result cleared]" }
The keep_recent parameter (default: 2) preserves the most recent N turns untouched.
Source: services/compact.rs → microcompact()
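The idea can be sketched with simplified types (the real `microcompact()` in `services/compact.rs` operates on full conversation messages; the names here are illustrative):

```rust
const CLEARED: &str = "[Old tool result cleared]";

#[derive(Debug, Clone, PartialEq)]
struct ToolResult {
    content: String,
}

/// Clear the content of every tool result except the `keep_recent` newest.
fn microcompact(results: &mut [ToolResult], keep_recent: usize) {
    let cutoff = results.len().saturating_sub(keep_recent);
    for r in &mut results[..cutoff] {
        r.content = CLEARED.to_string();
    }
}

fn main() {
    let mut history = vec![
        ToolResult { content: "500 lines of file content...".into() },
        ToolResult { content: "grep output...".into() },
        ToolResult { content: "recent test run".into() },
    ];
    // keep_recent = 2: only the oldest result is cleared.
    microcompact(&mut history, 2);
    for r in &history {
        println!("{}", r.content);
    }
}
```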
2. LLM Summary
Cost: one API call. Savings: large.
Sends older messages to the LLM with a prompt asking for a concise summary. Replaces those messages with a single compact boundary message containing the summary.
The summary preserves: key decisions, file paths discussed, errors encountered, and the current task state.
Source: services/compact.rs → build_compact_summary_prompt()
3. Context Collapse
Cost: zero. Savings: maximum.
Removes message groups from the middle of the conversation, keeping only the first group (initial context/summary) and the last group (recent messages). The full history remains in memory for session persistence — only the API-facing view is collapsed.
Source: services/context_collapse.rs
When Compaction Fires
Auto-compact
Checked before every LLM call. Fires when estimated tokens exceed the threshold:
threshold = context_window - 20K (reserved) - 13K (buffer)
For a 200K context window, this fires at ~167K tokens.
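In code, the threshold arithmetic is just a subtraction (constants match the documented defaults; the function name is illustrative):

```rust
// Documented defaults: 20K tokens reserved for output, 13K buffer.
const RESERVED_OUTPUT_TOKENS: usize = 20_000;
const COMPACT_BUFFER_TOKENS: usize = 13_000;

fn auto_compact_threshold(context_window: usize) -> usize {
    context_window - RESERVED_OUTPUT_TOKENS - COMPACT_BUFFER_TOKENS
}

fn main() {
    // A 200K window compacts at 167K estimated tokens.
    println!("{}", auto_compact_threshold(200_000));
}
```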
Reactive compact
Triggered by API prompt_too_long (413) errors. Parses the gap from the error message and runs microcompact + context collapse aggressively.
Manual
Users can trigger compaction with /compact to proactively free context.
Token Estimation
Token counts are estimated using a character-based heuristic: 4 bytes per token. This is conservative for English text and intentionally overestimates to prevent context overflow.
Images use a fixed estimate of 2,000 tokens. Tool use blocks estimate from the serialized JSON input.
Source: services/tokens.rs
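A sketch of the heuristic (illustrative names; the actual implementation lives in `services/tokens.rs`):

```rust
// 4 bytes per token, rounding up; images get a flat 2,000-token estimate.
const BYTES_PER_TOKEN: usize = 4;
const IMAGE_TOKEN_ESTIMATE: usize = 2_000;

fn estimate_text_tokens(text: &str) -> usize {
    // Ceiling division so partial chunks still count.
    (text.len() + BYTES_PER_TOKEN - 1) / BYTES_PER_TOKEN
}

fn main() {
    // 11 bytes -> 3 estimated tokens.
    println!("{}", estimate_text_tokens("hello world"));
    println!("image: {IMAGE_TOKEN_ESTIMATE}");
}
```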
Execution Flow
When the LLM responds with tool calls, the executor handles them in this order:
LLM response with tool_use blocks
│
▼
Parse pending tool calls
│
▼
For each tool call:
├── Permission check (allow/deny/ask)
├── Input validation (JSON schema)
├── Plan mode check (block mutations)
└── Protected directory check (block .git/, .husky/, node_modules/)
│
▼
Partition into batches:
├── Read-only tools → parallel batch (tokio::join!)
└── Mutation tools → serial execution
│
▼
Execute tools, collect results
│
▼
Fire post-tool-use hooks
│
▼
Inject tool results into conversation
│
▼
Back to LLM for next turn
Batching Strategy
Tools declare two properties:
| Property | Meaning |
|---|---|
is_read_only() | Tool only reads, never mutates |
is_concurrency_safe() | Safe to run alongside other tools (defaults to is_read_only()) |
The executor uses these to partition:
- Parallel batch: all concurrency-safe tools run simultaneously via `tokio::join!`
- Serial queue: mutation tools run one at a time, in order
This maximizes throughput for read-heavy turns (common when the agent explores code) while ensuring mutation ordering is preserved.
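The partition step can be sketched like this, with simplified stand-ins for the real trait objects in `tools/executor.rs`:

```rust
// Simplified stand-in for a pending tool call; the real executor works
// with Tool trait objects and their is_concurrency_safe() method.
#[derive(Debug, PartialEq)]
struct Call {
    name: &'static str,
    concurrency_safe: bool,
}

/// Split pending calls into (parallel batch, serial queue).
fn partition(calls: Vec<Call>) -> (Vec<Call>, Vec<Call>) {
    calls.into_iter().partition(|c| c.concurrency_safe)
}

fn main() {
    let calls = vec![
        Call { name: "FileRead", concurrency_safe: true },
        Call { name: "FileEdit", concurrency_safe: false },
        Call { name: "Grep", concurrency_safe: true },
    ];
    let (parallel, serial) = partition(calls);
    // Reads run together; the edit runs alone, after.
    println!("parallel: {}, serial: {}", parallel.len(), serial.len());
}
```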
Permission Checks
Every tool call passes through PermissionChecker::check() before execution:
- Protected directories: write tools are blocked from `.git/`, `.husky/`, and `node_modules/` (hardcoded, not overridable)
- Explicit rules: user-configured per-tool/pattern rules, evaluated in order; first match wins
- Default mode: `ask`, `allow`, `deny`, `plan`, or `accept_edits`
Read-only tools use a relaxed check (check_read()) that only blocks explicit deny rules.
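First-match-wins evaluation can be sketched as follows (a simplification: the real rules also support glob patterns and per-command matching):

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Action {
    Allow,
    Deny,
    Ask,
}

// Simplified rule: exact tool name or "*" wildcard only.
struct Rule {
    tool: &'static str,
    action: Action,
}

/// Return the action of the first matching rule, or the default mode.
fn check(rules: &[Rule], tool: &str, default: Action) -> Action {
    rules
        .iter()
        .find(|r| r.tool == tool || r.tool == "*")
        .map(|r| r.action)
        .unwrap_or(default)
}

fn main() {
    let rules = [
        Rule { tool: "Bash", action: Action::Ask },
        Rule { tool: "*", action: Action::Allow },
    ];
    // Bash hits the first rule; everything else falls through to the wildcard.
    println!("{:?}", check(&rules, "Bash", Action::Ask));
    println!("{:?}", check(&rules, "FileEdit", Action::Ask));
}
```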
Streaming Executor
Tools begin execution as soon as their input is fully parsed from the SSE stream — they don't wait for the entire response to finish. This overlaps tool execution with LLM generation for faster turns.
The streaming executor watches for complete tool_use content blocks in the accumulating response and dispatches them immediately.
Error Handling
| Error | Recovery |
|---|---|
| Permission denied | Tool result reports denial, LLM adjusts approach |
| Tool execution error | Error message returned as tool result |
| Timeout | Tool cancelled via CancellationToken, error result injected |
| Invalid input | Validation error returned before execution |
The Tool Trait
Every tool implements:
```rust
#[async_trait]
pub trait Tool: Send + Sync {
    fn name(&self) -> &'static str;
    fn description(&self) -> &'static str;
    fn input_schema(&self) -> serde_json::Value;
    async fn call(&self, input: Value, ctx: &ToolContext) -> Result<ToolResult, ToolError>;
    fn is_read_only(&self) -> bool { false }
    fn is_concurrency_safe(&self) -> bool { self.is_read_only() }
}
```
Adding a new tool means implementing this trait and registering it in the ToolRegistry. No central enum to modify.
Source: tools/mod.rs (trait), tools/executor.rs (dispatch), tools/registry.rs (registration)
The Problem
Different LLM providers use different APIs:
- Anthropic: Messages API with `content` blocks, `tool_use`/`tool_result` types, and prompt caching
- OpenAI: Chat Completions API with `messages`, `tool_calls`, and the `function` format
agent-code needs to work identically regardless of which provider is configured.
Architecture
User prompt
│
▼
Query Engine (provider-agnostic)
│
▼
Provider Detection (auto from model name + base URL)
│
├── Anthropic wire format → Anthropic Messages API
└── OpenAI wire format → OpenAI Chat Completions API
│
▼
SSE Stream → Normalize → Unified ContentBlock types
│
▼
Tool execution (same code path regardless of provider)
Provider Detection
detect_provider() in llm/provider.rs determines the provider from:
- Model name: `claude-*` → Anthropic, `gpt-*` → OpenAI, `grok-*` → xAI, etc.
- Base URL: `api.anthropic.com` → Anthropic, `api.openai.com` → OpenAI, etc.
- Environment: `AGENT_CODE_USE_BEDROCK` → Bedrock, `AGENT_CODE_USE_VERTEX` → Vertex
Each provider maps to a WireFormat:
| Wire Format | Providers |
|---|---|
Anthropic | Anthropic, Bedrock, Vertex |
OpenAi | OpenAI, xAI, Google, DeepSeek, Groq, Mistral, Together, Zhipu, Ollama, any compatible |
Wire Formats
Anthropic (llm/anthropic.rs)
- Sends `messages` with `content` as an array of typed blocks
- Tool calls appear as `tool_use` content blocks in assistant messages
- Tool results are `tool_result` content blocks in user messages
- Supports `cache_control` breakpoints for prompt caching
- Extended thinking via `thinking` content blocks
OpenAI (llm/openai.rs)
- Sends `messages` with `content` as a string or array
- Tool calls appear in a `tool_calls` array on assistant messages
- Tool results are separate messages with `role: "tool"`
- Supports streaming via SSE with a `[DONE]` sentinel
Message Normalization
llm/normalize.rs ensures messages are valid before sending:
- Tool pairing: every `tool_use` block must have a matching `tool_result` in the next user message
- Alternation: user and assistant messages must alternate (APIs reject consecutive same-role messages)
- Empty handling: empty content arrays are removed or filled with placeholder text
This runs after every turn, before the next API call.
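The tool-pairing rule can be sketched as a small check (simplified types; the real logic in llm/normalize.rs works on full message structures and injects placeholder results):

```rust
use std::collections::HashSet;

/// Return the tool_use ids that have no matching tool_result.
/// A normalizer would inject placeholder results for these.
fn missing_results<'a>(
    tool_use_ids: &'a [&'a str],
    tool_result_ids: &[&str],
) -> Vec<&'a str> {
    let answered: HashSet<&str> = tool_result_ids.iter().copied().collect();
    tool_use_ids
        .iter()
        .copied()
        .filter(|id| !answered.contains(id))
        .collect()
}

fn main() {
    let uses = ["toolu_1", "toolu_2"];
    let results = ["toolu_1"];
    // toolu_2 is unanswered, so the message pair is invalid as-is.
    println!("{:?}", missing_results(&uses, &results));
}
```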
Stream Parsing
llm/stream.rs handles SSE (Server-Sent Events) parsing:
- Read `data:` lines from the HTTP response stream
- Parse JSON deltas (content block starts, text deltas, tool input deltas)
- Accumulate into complete `ContentBlock` instances
- Emit blocks to the UI (real-time text display) and to the executor (tool dispatch)
The stream parser handles both Anthropic's content_block_delta events and OpenAI's choices[0].delta format through the wire format abstraction.
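The `data:`-line framing step alone can be sketched like this (illustrative; the real parser in llm/stream.rs works incrementally and accumulates deltas into `ContentBlock` values):

```rust
/// Extract the JSON payloads from SSE `data:` lines,
/// stopping at OpenAI's `[DONE]` sentinel.
fn sse_payloads(raw: &str) -> Vec<&str> {
    raw.lines()
        .filter_map(|line| line.strip_prefix("data:").map(str::trim))
        .take_while(|payload| *payload != "[DONE]")
        .collect()
}

fn main() {
    let stream = "data: {\"delta\":\"Hel\"}\n\ndata: {\"delta\":\"lo\"}\n\ndata: [DONE]\n";
    // Two payloads survive; the sentinel terminates the stream.
    println!("{:?}", sse_payloads(stream));
}
```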
Error Recovery
| Error | Recovery |
|---|---|
| Rate limited (429) | Wait retry_after ms, retry up to 5 times |
| Overloaded (529) | 5s exponential backoff, fall back to smaller model after 3 attempts |
| Prompt too long (413) | Reactive compaction, then retry |
| Max output tokens | Inject continuation message, retry up to 3 times |
| Stream interrupted | Reconnect with exponential backoff |
The retry state machine in llm/retry.rs tracks attempts per error type and supports model fallback (e.g., Opus → Sonnet on overload).
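An illustrative backoff schedule for the overload (529) case described above, with a 5s base that doubles per attempt and fallback after 3 attempts. The constants and names are assumptions, not the exact llm/retry.rs values:

```rust
use std::time::Duration;

/// Exponential backoff: attempt 0 -> 5s, 1 -> 10s, 2 -> 20s, ...
fn overload_backoff(attempt: u32) -> Duration {
    Duration::from_secs(5 * 2u64.pow(attempt))
}

/// After 3 failed attempts, fall back to a smaller model.
fn should_fall_back(attempt: u32) -> bool {
    attempt >= 3
}

fn main() {
    for attempt in 0..4 {
        println!(
            "attempt {attempt}: wait {:?}, fallback: {}",
            overload_backoff(attempt),
            should_fall_back(attempt)
        );
    }
}
```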
Source: llm/provider.rs (detection), llm/anthropic.rs (Anthropic format), llm/openai.rs (OpenAI format), llm/normalize.rs (validation), llm/stream.rs (SSE parsing), llm/retry.rs (error recovery)
Overview
MCP (Model Context Protocol) extends agent-code with tools and resources from external servers. agent-code acts as an MCP client, connecting to one or more MCP servers.
Connection Lifecycle
1. Startup: read mcp_servers from config
2. For each server:
a. Spawn subprocess (stdio) or connect HTTP (SSE)
b. Send initialize request (protocol version, capabilities)
c. Receive initialize response (server capabilities)
d. Send tools/list request
e. Register discovered tools as mcp__<server>__<tool> in the tool pool
f. Send resources/list request (if supported)
g. Register discovered resources
3. Ready: MCP tools available to the LLM alongside built-in tools
Transports
Stdio
The default transport. agent-code spawns the server as a subprocess and communicates via JSON-RPC over stdin/stdout.
[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
env = { GITHUB_TOKEN = "ghp_..." }
The subprocess inherits the configured env variables. Stderr is captured for debugging.
SSE (Server-Sent Events)
For servers that expose an HTTP endpoint:
[mcp_servers.remote]
url = "http://localhost:8080"
Uses HTTP POST for requests and SSE for responses/notifications.
Tool Proxying
When the LLM calls an MCP tool, the request flows through McpProxy:
LLM: tool_use { name: "mcp__github__create_issue", input: {...} }
│
▼
McpProxy.call()
├── Parse server name and tool name from the namespaced name
├── Find the MCP client for "github"
├── Send tools/call JSON-RPC request
├── Wait for response
└── Return tool result to conversation
Tool names are namespaced as mcp__<server>__<tool> to prevent collisions between servers and with built-in tools.
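Splitting a namespaced name back into its parts is straightforward; a sketch of the parsing step (this helper is illustrative, not the actual McpProxy code):

```rust
/// Split `mcp__<server>__<tool>` into (server, tool).
/// Returns None for built-in tool names without the prefix.
fn parse_mcp_name(name: &str) -> Option<(&str, &str)> {
    let rest = name.strip_prefix("mcp__")?;
    rest.split_once("__")
}

fn main() {
    println!("{:?}", parse_mcp_name("mcp__github__create_issue"));
    // Built-in tools don't match the prefix.
    println!("{:?}", parse_mcp_name("FileRead"));
}
```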
Resource Access
MCP servers can expose resources (database schemas, file listings, documentation). Two tools handle this:
| Tool | Purpose |
|---|---|
ListMcpResources | Browse available resources across all connected servers |
ReadMcpResource | Read a specific resource by URI |
The LLM uses these tools when it needs context from external systems.
Security
Allowlist / Denylist
[security]
mcp_server_allowlist = ["github", "filesystem"] # Only these can connect
mcp_server_denylist = ["untrusted"] # These are blocked
If `mcp_server_allowlist` is non-empty, only the listed servers are connected. `mcp_server_denylist` is checked regardless.
Permission Integration
MCP tool calls go through the same permission system as built-in tools. The namespaced tool name (mcp__github__create_issue) can be matched by permission rules:
[[permissions.rules]]
tool = "mcp__github__*"
action = "allow"
Trust Boundary
MCP servers run with the user's permissions. They can access the filesystem, network, and any service the user can. The security boundary is the same as running a shell command — use allowlists to restrict which servers connect.
Error Handling
| Error | Behavior |
|---|---|
| Server fails to start | Warning logged, server skipped, agent continues without it |
| Connection lost | Tool calls to that server return error results |
| Tool call fails | Error message returned as tool result, LLM can retry or adjust |
| Timeout | Transport-level timeout, error result returned |
JSON-RPC Protocol
MCP uses JSON-RPC 2.0 over the chosen transport:
// Request
{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {"name": "create_issue", "arguments": {...}}}
// Response
{"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "Issue created: #42"}]}}
Source: services/mcp/client.rs (client), services/mcp/transport.rs (stdio/SSE), services/mcp/types.rs (JSON-RPC types), tools/mcp_proxy.rs (tool proxy), tools/mcp_resources.rs (resource access)
Model Selection
The single biggest performance lever is model choice.
| Priority | Model Type | Examples | Tradeoff |
|---|---|---|---|
| Speed | Small/fast | GPT-4.1-mini, Haiku, Gemini Flash | Faster, cheaper, less capable |
| Balance | Mid-tier | Sonnet, GPT-4.1 | Good for most tasks |
| Quality | Large | Opus, GPT-5.4 | Slower, expensive, best reasoning |
Switch mid-session with /model — use a fast model for exploration, switch to a capable model for complex changes.
Context Management
Monitor usage
> /context
Context: ~45000 tokens (23% of 200000 window)
Auto-compact at: 167000 tokens
Messages: 24
Manual compaction
If context is getting large, compact proactively:
> /compact
Freed ~12000 estimated tokens.
Start fresh when stuck
If the agent seems confused or repetitive, clear context:
> /clear
Or start a new session — previous sessions are auto-saved and can be resumed with /resume.
Cost Control
Set a budget
[api]
max_cost_usd = 5.0
The agent stops when the budget is reached. Check spending with /cost.
Reduce cost per turn
- Shorter prompts: be specific, avoid repeating instructions the agent already has
- Use AGENTS.md: persistent context loads once instead of being re-explained each session
- Smaller models for simple tasks: `/model gpt-4.1-mini` for quick edits, then switch back for complex work
- Plan mode: `/plan` restricts exploration to read-only tools (cheaper turns)
Token Budget
Enable budget tracking to get warnings before hitting limits:
[features]
token_budget = true
The agent shows a warning when approaching the auto-compact threshold.
Compaction Tuning
The default thresholds work well for most use cases. For very long sessions:
- Microcompact fires first and is free — it clears old tool results
- LLM summary costs one API call but frees significant context
- Context collapse is a last resort — removes middle messages entirely
If sessions are compacting too aggressively, consider using a model with a larger context window (200K+).
Tool Execution Speed
Streaming execution
Tools execute as soon as their input is parsed from the LLM response stream — they don't wait for the full response. This is automatic and requires no configuration.
Parallel reads
Read-only tools (FileRead, Grep, Glob, WebFetch) run in parallel. The agent naturally batches reads when exploring code. No tuning needed.
Bash timeout
Long-running shell commands can be given explicit timeouts:
{"command": "npm test", "timeout": 60000}
The default timeout is 120 seconds. Background mode (run_in_background: true) has no timeout.
Session Persistence
Sessions auto-save on exit. For long-running work:
- Use `/fork` to create a checkpoint before risky changes
- Use `/resume <id>` to return to a previous state
- Use `/export` or `/share` to save a readable copy
Benchmarking
Run the built-in benchmarks to measure performance on your machine:
cargo bench # All benchmarks
cargo bench -- microcompact # Compaction only
cargo bench -- estimate_tokens # Token estimation only
Results with HTML reports are generated in target/criterion/.
File Operations
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
FileRead | Read files with line numbers. Handles text, PDF, notebooks, images. | Yes | Yes |
FileWrite | Create or overwrite files. Auto-creates parent dirs. | No | No |
FileEdit | Search-and-replace. Requires unique match or replace_all. | No | No |
MultiEdit | Batch multiple edits in a single atomic operation. | No | No |
NotebookEdit | Edit Jupyter cells (replace, insert, delete). | No | No |
Search
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
Grep | Regex search via ripgrep. Context lines, glob filter, case control. | Yes | Yes |
Glob | Find files by pattern, sorted by modification time. | Yes | Yes |
ToolSearch | Discover tools by keyword or select:Name. | Yes | Yes |
Execution
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
Bash | Shell commands. Background mode, destructive detection, sandbox, timeout. | No | No |
PowerShell | PowerShell commands (Windows). | No | No |
REPL | Python or Node.js code execution. | No | No |
Agent Coordination
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
Agent | Spawn subagents. Optional worktree isolation. | No | No |
SendMessage | Inter-agent communication. | No | Yes |
Skill | Invoke skills programmatically. | Yes | No |
Planning and Tracking
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
EnterPlanMode | Switch to read-only mode. | Yes | Yes |
ExitPlanMode | Re-enable all tools. | Yes | Yes |
TaskCreate | Create a progress tracking task. | Yes | Yes |
TaskUpdate | Update task status. | Yes | Yes |
TaskGet | Get task details by ID. | Yes | Yes |
TaskList | List all session tasks. | Yes | Yes |
TaskStop | Stop a running background task. | No | Yes |
TaskOutput | Read output of a completed task. | Yes | Yes |
TodoWrite | Structured todo list management. | Yes | Yes |
Web and External
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
WebFetch | HTTP GET with HTML-to-text conversion. | Yes | Yes |
WebSearch | Web search with result extraction. | Yes | Yes |
LSP | Language server diagnostics with linter fallbacks. | Yes | Yes |
MCP Integration
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
McpProxy | Call tools on connected MCP servers. | No | No |
ListMcpResources | Browse MCP server resources. | Yes | Yes |
ReadMcpResource | Read an MCP resource by URI. | Yes | Yes |
Workspace
| Tool | Description | Read-only | Concurrent |
|---|---|---|---|
EnterWorktree | Create an isolated git worktree. | No | No |
ExitWorktree | Clean up a worktree. | No | No |
AskUserQuestion | Interactive multi-choice prompts. | Yes | No |
Sleep | Async pause (max 5 min). | Yes | Yes |
Type / followed by a command name in the REPL. Unknown commands are passed to the agent as prompts. Skill names work as commands too.
Session
| Command | Description |
|---|---|
/help | Show all commands and loaded skills |
/exit | Exit the REPL |
/clear | Reset conversation history |
/resume <id> | Restore a previous session by ID |
/sessions | List recent saved sessions |
/export | Export conversation to markdown file |
/share | Export session as shareable markdown with metadata |
/summary | Ask the agent to summarize the session |
/version | Show agent-code version |
Context
| Command | Description |
|---|---|
/cost | Token usage and estimated session cost |
/context | Context window usage and auto-compact threshold |
/compact | Free context by clearing stale tool results |
/model [name] | Show or switch the active model (interactive picker) |
/verbose | Toggle verbose output (shows token counts) |
Git
| Command | Description |
|---|---|
/diff | Show current git changes |
/status | Show git status |
/commit [msg] | Review diff and create a commit |
/review | Analyze diff for bugs and issues |
/branch [name] | Show or switch git branch |
/log | Show recent git history |
Agent Control
| Command | Description |
|---|---|
/plan | Toggle plan mode (read-only) |
/permissions | Show permission mode and rules |
/agents | List available agent types |
/tasks | List background tasks |
Configuration
| Command | Description |
|---|---|
/init | Create .agent/settings.toml for this project |
/doctor | Check environment health (tools, config, git) |
/config | Show current configuration |
/mcp | List connected MCP servers |
/hooks | Show hook configuration |
/plugins | List loaded plugins |
/memory | Show loaded memory context |
/skills | List available skills |
/theme | Show and configure color theme |
/color [name] | Switch color theme mid-session |
/features | Show enabled feature flags |
History
| Command | Description |
|---|---|
/scroll | Scrollable view of conversation history |
/transcript | Show conversation transcript with message indices |
/rewind | Undo the last assistant turn |
/snip <range> | Remove messages by index (e.g., /snip 3-7) |
/fork | Branch the conversation from this point |
Diagnostics
| Command | Description |
|---|---|
/stats | Session statistics (turns, tools used, cost) |
/files | List working directory contents |
/release-notes | Show release notes for the current version |
/feedback <msg> | Submit feedback or suggestions |
/bug | Report a bug (opens GitHub issues link) |
Editing
| Command | Description |
|---|---|
/vim | Switch to vi editing mode |
/emacs | Switch to emacs editing mode |
agent [OPTIONS]
Options
| Flag | Default | Description |
|---|---|---|
-p, --prompt <TEXT> | — | Execute a single prompt and exit (non-interactive) |
-m, --model <MODEL> | claude-sonnet-4-20250514 | Model to use |
--api-base-url <URL> | auto-detected | API endpoint URL |
--api-key <KEY> | from env | API key (prefer env var) |
--provider <NAME> | auto | LLM provider: anthropic, openai, or auto |
--permission-mode <MODE> | ask | Permission mode: ask, allow, deny, plan, accept_edits |
--dangerously-skip-permissions | false | Skip all permission checks |
-C, --cwd <DIR> | current dir | Working directory |
--max-turns <N> | 50 | Maximum agent turns per request |
-v, --verbose | false | Enable verbose output |
--dump-system-prompt | false | Print the system prompt and exit |
-h, --help | — | Show help |
--version | — | Show version |
Environment variables
| Variable | Equivalent flag | Description |
|---|---|---|
AGENT_CODE_API_KEY | --api-key | API key (highest priority) |
ANTHROPIC_API_KEY | --api-key | Anthropic API key |
OPENAI_API_KEY | --api-key | OpenAI API key |
AGENT_CODE_API_BASE_URL | --api-base-url | API endpoint URL |
AGENT_CODE_MODEL | --model | Model name |
Examples
# Interactive mode with Anthropic
ANTHROPIC_API_KEY=sk-ant-... agent
# One-shot with OpenAI
OPENAI_API_KEY=sk-... agent --model gpt-4o --prompt "explain main.rs"
# Local Ollama
agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key x
# CI: fix tests without asking
agent --dangerously-skip-permissions --prompt "fix the failing tests"
# Read-only exploration
agent --permission-mode plan
# Debug: see what the LLM receives
agent --dump-system-prompt
API Configuration
| Variable | Description |
|---|---|
AGENT_CODE_API_KEY | API key (highest priority, works with any provider) |
ANTHROPIC_API_KEY | Anthropic API key (auto-selects Anthropic provider) |
OPENAI_API_KEY | OpenAI API key (auto-selects OpenAI provider) |
AGENT_CODE_API_BASE_URL | API endpoint URL override |
AGENT_CODE_MODEL | Model name override |
Behavior
| Variable | Description |
|---|---|
EDITOR | Determines REPL editing mode (vi if contains "vi", else emacs) |
SHELL | Reported in the system prompt environment section |
Resolution order
API key is resolved from the first available:
1. `--api-key` CLI flag
2. Config file (`api.api_key`)
3. `AGENT_CODE_API_KEY` env var
4. `ANTHROPIC_API_KEY` env var
5. `OPENAI_API_KEY` env var
Base URL auto-detection:
- If only `OPENAI_API_KEY` is set → defaults to `https://api.openai.com/v1`
- Otherwise → defaults to `https://api.anthropic.com/v1`
This can always be overridden with --api-base-url or the config file.
API Connection
"API key required"
The agent could not find an API key. Set one of:
export ANTHROPIC_API_KEY="sk-ant-..." # Anthropic
export OPENAI_API_KEY="sk-..." # OpenAI
export AGENT_CODE_API_KEY="..." # Any provider
Or add it to your config file (environment variables are preferred on shared machines):
# ~/.config/agent-code/config.toml
[api]
api_key = "sk-..."  # avoid committing this file to version control
"Connection refused" or timeout
- Check your internet connection
- Verify the API base URL: `agent --dump-system-prompt 2>&1 | head -1`
- For local models (Ollama): ensure the server is running (`ollama serve`)
- For corporate proxies: set the `HTTPS_PROXY` environment variable
Rate limited (429)
The agent retries automatically up to 5 times with backoff. If it persists:
- Switch to a less busy model: `/model`
- Wait a few minutes and retry
- Check your API plan's rate limits
Permission Issues
"Denied by rule" on a command you want to run
Check your permission rules:
> /permissions
Add an allow rule:
# .agent/settings.toml
[[permissions.rules]]
tool = "Bash"
pattern = "your-command *"
action = "allow"
"Write to .git/ is blocked"
This is a built-in safety measure. The agent cannot write to .git/, .husky/, or node_modules/ regardless of permission settings. This prevents repository corruption and dependency tampering.
If you need to modify git config, run the command yourself with !:
> !git config user.name "Your Name"
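The protected-directory check amounts to rejecting any write whose path contains a protected component. A rough Python sketch (illustrative only):

```python
from pathlib import PurePosixPath

PROTECTED = {".git", ".husky", "node_modules"}

def write_blocked(path):
    """True when any component of the path is a protected directory."""
    return any(part in PROTECTED for part in PurePosixPath(path).parts)
```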
Context Window
"Prompt too long" or context exceeded
The agent auto-compacts when approaching the limit. If it still fails:
- Run `/compact` to manually free context
- Start a new session for a fresh context window
- Use `/snip 0-10` to remove old messages
- Switch to a model with a larger context window
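Auto-compaction can be pictured as shedding the oldest messages until the estimated token total fits the budget. A rough sketch, assuming a chars/4 token estimate (the real compactor summarizes history rather than simply dropping it):

```python
def compact(messages, budget, estimate=lambda m: len(m) // 4):
    """Drop oldest messages until the estimated token total fits `budget`.

    Keeps the most recent context; `estimate` is a crude chars/4 heuristic."""
    kept = list(messages)
    while kept and sum(estimate(m) for m in kept) > budget:
        kept.pop(0)
    return kept
```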
Agent seems to forget earlier instructions
Long conversations get compacted automatically. Important context may be summarized. To preserve critical instructions:
- Put them in `AGENTS.md` (loaded every session)
- Use `/memory` to check what context is loaded
- Start a fresh session with `/clear` if context is corrupted
Tool Errors
Bash command fails silently
- Check the command works manually: `!your-command`
- The agent may not have the right PATH; set it in your shell profile
- On Windows, some commands need PowerShell syntax
"File content has changed" on edit
Another process modified the file between the agent reading and editing it. The agent will re-read and retry. If it persists:
- Close other editors or watchers on the file
- Disable auto-formatting hooks temporarily
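The staleness check behind this error can be pictured as comparing a hash recorded at read time against the file's current contents. An illustrative sketch:

```python
import hashlib

def content_changed(path, expected_sha256):
    """Compare the file's current hash with the hash recorded at read time."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest() != expected_sha256
```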
Grep/Glob returns no results
- Verify ripgrep is installed: `!rg --version`
- Check the working directory: `/files`
- The pattern may need escaping; try a simpler pattern first
MCP Servers
MCP server stuck "connecting"
- Check the server command works: `!npx -y @modelcontextprotocol/server-name`
- Verify the config in `/mcp`
- Check server logs in the terminal where agent-code runs
MCP tools not appearing
- Run `/mcp` to verify the server is connected
- The server may not have registered tools yet; restart the agent
- Check that `mcp_server_allowlist` in the security config isn't blocking it
Installation
"command not found: agent"
The binary isn't in your PATH.
# Cargo install location
export PATH="$HOME/.cargo/bin:$PATH"
# Or find it
which agent || find / -name agent -type f 2>/dev/null | head -5
Build fails from source
# Ensure Rust is up to date
rustup update stable
# Clean and rebuild
cargo clean
cargo build --release
# Check dependencies
cargo check --all-targets
ripgrep not found
Grep and some other tools require rg (ripgrep):
# Linux
sudo apt-get install ripgrep
# macOS
brew install ripgrep
# Windows
choco install ripgrep
Sessions
Can't resume a session
- List available sessions: `/sessions`
- Session files are in `~/.config/agent-code/sessions/`
- Old sessions may have been cleaned up
- The session format may be incompatible after an upgrade; start fresh
Still Stuck?
- Run `/doctor` for a full environment health check
- Check GitHub Issues for known problems
- Open a new issue with: agent version (`agent --version`), OS, and steps to reproduce
General
What is agent-code?
An open-source AI coding agent for the terminal, built in Rust. You describe tasks in natural language and the agent reads your code, runs commands, edits files, and iterates until the task is done.
How is it different from ChatGPT or Claude in a browser?
agent-code runs locally in your terminal with direct access to your filesystem, shell, git, and development tools. It can read files, make edits, run tests, and fix errors in a loop — not just generate text.
Is it free?
agent-code itself is free and open source (MIT license). You pay for the LLM API you choose — Anthropic, OpenAI, or any other provider. Local models via Ollama are completely free.
Which LLM should I use?
For coding tasks, we recommend Claude Sonnet 4 (Anthropic) or GPT-4.1 (OpenAI) as a good balance of quality and cost. For complex architecture work, use Claude Opus or GPT-5.4. For quick tasks, Haiku or GPT-4.1-mini are fast and cheap.
Installation
What are the system requirements?
- Any modern Linux, macOS, or Windows machine
- `git` and `rg` (ripgrep) for full functionality
- An API key from any supported LLM provider (or Ollama for local models)
Can I use it on Windows?
Yes. Install via cargo install agent-code or download the prebuilt binary from GitHub Releases. Windows builds are tested in CI.
Can I run it in Docker?
Yes. See the Dockerfile or pull the image:
docker run -it -e ANTHROPIC_API_KEY="sk-ant-..." ghcr.io/avala-ai/agent-code
Usage
How do I switch between models mid-session?
Use the /model command. It opens an interactive picker based on your configured provider:
> /model
Or specify directly: /model gpt-4.1-mini
How do I use it with a local model?
# Start Ollama
ollama serve
# Run agent-code with local model
agent --api-base-url http://localhost:11434/v1 --model llama3 --api-key unused
What does plan mode do?
Plan mode (/plan) restricts the agent to read-only operations. It can search, read files, and analyze code, but cannot edit files or run shell commands. Useful for exploring unfamiliar codebases safely.
How do I give the agent project context?
Create an AGENTS.md file in your project root with instructions, conventions, and architecture notes. This is loaded automatically at the start of every session.
Can it access the internet?
Yes. The WebFetch and WebSearch tools allow the agent to fetch URLs and search the web. These go through the normal permission system.
Cost
How much does it cost per session?
It depends on the model, task complexity, and conversation length. Typical sessions:
| Task | Model | Approximate Cost |
|---|---|---|
| Quick fix | Sonnet/GPT-4.1 | $0.02 - $0.10 |
| Feature implementation | Sonnet/GPT-4.1 | $0.10 - $0.50 |
| Complex refactor | Opus/GPT-5.4 | $0.50 - $2.00 |
How do I set a spending limit?
# ~/.config/agent-code/config.toml
[api]
max_cost_usd = 5.0 # Stop after $5 spent
Or check usage anytime with /cost.
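The `max_cost_usd` cutoff can be pictured as a running total checked after each API call. A minimal sketch (class and method names are illustrative, not the actual API):

```python
class CostTracker:
    """Hypothetical sketch of a max_cost_usd spending cutoff."""

    def __init__(self, max_cost_usd):
        self.max_cost_usd = max_cost_usd
        self.spent = 0.0

    def record(self, usd):
        """Add the cost of one API call to the running total."""
        self.spent += usd

    def exceeded(self):
        """True once the session should stop making API calls."""
        return self.spent >= self.max_cost_usd
```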
Security
Can the agent delete my files?
The agent asks for permission before destructive operations (default mode). You can also:
- Use plan mode for read-only exploration: `agent --permission-mode plan`
- Block specific commands: add deny rules in config
- Protected directories (`.git/`, `.husky/`, `node_modules/`) are always blocked from writes
Does it send my code to third parties?
Your code is sent to whichever LLM provider you configure (Anthropic, OpenAI, etc.) as part of the conversation context. It is not sent anywhere else. For maximum privacy, use a local model via Ollama.
Can I restrict which tools the agent uses?
Yes, via permission rules:
[permissions]
default_mode = "ask"
[[permissions.rules]]
tool = "Bash"
pattern = "rm *"
action = "deny"
Extensibility
How do I create a custom skill?
Create a markdown file in .agent/skills/:
---
description: My custom workflow
userInvocable: true
---
Do the thing step by step...
Then invoke it with /my-skill. See the Skills guide for details.
Can I connect external tools via MCP?
Yes. Add MCP servers to your config:
[mcp_servers.github]
command = "npx"
args = ["-y", "@modelcontextprotocol/server-github"]
See the MCP guide for details.
Can I use it as a library in my own project?
Yes. The engine is published as agent-code-lib on crates.io:
[dependencies]
agent-code-lib = "0.9"
The binary is a thin wrapper around this library.
agent-code executes shell commands and modifies files on your behalf. The security model ensures the agent only takes actions you've approved.
Permission System
Every tool call passes through a permission check:
| Mode | Behavior |
|---|---|
| `ask` (default) | Prompts before mutations, auto-allows reads |
| `allow` | Auto-approves everything |
| `deny` | Blocks all mutations |
| `plan` | Read-only tools only |
| `accept_edits` | Auto-approves file edits, asks for shell commands |
Configure per-tool rules:
[permissions]
default_mode = "ask"
[[permissions.rules]]
tool = "Bash"
pattern = "git *"
action = "allow"
[[permissions.rules]]
tool = "Bash"
pattern = "rm *"
action = "deny"
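One plausible way to picture how rules and modes interact: explicit rules are checked first, and the session's default mode decides anything unmatched. A Python sketch under that assumption (the actual precedence is defined by the implementation, and the glob matching here uses Python's `fnmatch` as a stand-in):

```python
from fnmatch import fnmatch

# Fallback behavior per mode, following the table above (sketch).
MODE_DEFAULTS = {
    "ask":          lambda tool, mutating: "ask" if mutating else "allow",
    "allow":        lambda tool, mutating: "allow",
    "deny":         lambda tool, mutating: "deny" if mutating else "allow",
    "plan":         lambda tool, mutating: "deny" if mutating else "allow",
    "accept_edits": lambda tool, mutating: (
        ("allow" if tool == "FileEdit" else "ask") if mutating else "allow"),
}

def decide(tool, arg, mutating, mode="ask", rules=()):
    """First matching rule wins; otherwise the mode's default applies."""
    for rule in rules:
        if rule["tool"] == tool and fnmatch(arg, rule["pattern"]):
            return rule["action"]
    return MODE_DEFAULTS[mode](tool, mutating)
```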
Protected Directories
These directories are blocked from writes regardless of permission config:
- `.git/`: prevents repository corruption
- `.husky/`: prevents hook tampering
- `node_modules/`: prevents dependency modification
Read access is unaffected.
Bash Safety
The Bash tool detects destructive commands and warns before execution:
- `rm -rf`, `git reset --hard`, `DROP TABLE`
- `chmod -R 777`, `mkfs`, `dd`
- System paths (`/etc`, `/usr`, `/bin`, `/sbin`, `/boot`)
Large outputs are truncated and persisted to disk.
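Destructive-command detection can be pictured as substring and token checks against a known-risky list. A deliberately naive sketch (the real detector is more thorough):

```python
DESTRUCTIVE = ("rm -rf", "git reset --hard", "drop table", "chmod -r 777")
DESTRUCTIVE_PROGRAMS = ("dd", "mkfs")
SYSTEM_PATHS = ("/etc", "/usr", "/bin", "/sbin", "/boot")

def warn_reason(cmd):
    """Return a warning string for a risky command, or None (naive sketch)."""
    low = cmd.lower()
    if any(pat in low for pat in DESTRUCTIVE):
        return "destructive command"
    tokens = low.split()
    if tokens and tokens[0].split(".")[0] in DESTRUCTIVE_PROGRAMS:
        return "destructive command"  # catches dd, mkfs, mkfs.ext4, ...
    for tok in cmd.split():
        for p in SYSTEM_PATHS:
            if tok == p or tok.startswith(p + "/"):
                return f"touches system path {p}"
    return None
```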
Skill Safety
Skills from untrusted sources may contain embedded shell blocks. Disable them:
[security]
disable_skill_shell_execution = true
Shell blocks in skill templates are stripped. Non-shell code blocks are preserved.
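Stripping shell blocks while preserving other code can be sketched as a fence-aware regex pass, assuming triple-backtick fences with a language tag (a simplification of whatever parsing the tool actually does):

```python
import re

SHELL_LANGS = {"sh", "bash", "shell", "zsh"}
FENCE = "`" * 3  # triple-backtick fence, built to avoid nesting issues here

def strip_shell_blocks(markdown):
    """Remove fenced shell blocks from skill text; keep other languages."""
    pattern = re.compile(FENCE + r"(\w*)\n.*?" + FENCE + r"\n?", re.S)

    def repl(match):
        return "" if match.group(1).lower() in SHELL_LANGS else match.group(0)

    return pattern.sub(repl, markdown)
```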
MCP Server Security
- Servers run as local subprocesses (your permissions)
- Tools are namespaced per server
- Restrict with allowlist/denylist:
[security]
mcp_server_allowlist = ["github", "filesystem"]
API Key Safety
- Keys resolved from environment variables only (never config files)
- Never logged or included in error messages
- Passed to subagents via environment only
Data Privacy
- No telemetry collected or transmitted
- Sessions stored locally (`~/.config/agent-code/sessions/`)
- Code sent only to your configured LLM provider
- Use Ollama for fully local, air-gapped operation
Bypass Prevention
The --dangerously-skip-permissions flag disables all checks. To block it:
[security]
disable_bypass_permissions = true
Full Enterprise Config
[security]
disable_bypass_permissions = true
disable_skill_shell_execution = true
mcp_server_allowlist = ["github", "filesystem"]
env_allowlist = ["PATH", "HOME", "SHELL"]
additional_directories = ["/shared/docs"]
Reporting Vulnerabilities
Email security@avala.ai — do not open public issues. See SECURITY.md for the full policy.