AGENTS.md vs CLAUDE.md: How to Actually Configure a Coding Agent

Your config files are the agent's control plane — get the three-tier permission model and context budget right, or watch instruction adherence rot as sessions grow.

June 10, 2026 · 8 min read · AGENTS.md · CLAUDE.md · .cursor/rules · three-tier permission model

In March 2026, HumanLayer published a result that should change how you think about agent configuration: running Claude Code with a 1M-token context window, they found instruction adherence "dramatically degraded" at full context — and reverted to a smaller-context model on purpose. Their earlier guidance explains why: frontier LLMs reliably follow roughly 150–200 instructions, and Claude Code's system prompt already uses about 50 of them. Every line you add to your config files spends against that budget.

This is the core insight most teams miss. AGENTS.md, CLAUDE.md, and .cursor/rules/*.mdc aren't documentation — they're the control plane for your agent. They decide what loads into context every single turn, what the agent can do without asking, and what it must never touch. Configure them carelessly and you get an agent that pesters you for permission to run tests while silently curl-ing your .env to a debug endpoint.

TL;DR: Author AGENTS.md as your canonical, tool-agnostic instruction file. Add a thin CLAUDE.md adapter for Claude Code and thin .cursor/rules/*.mdc adapters for Cursor. Enforce the three-tier permission model (allow / ask / deny) in .claude/settings.json — machine-parsed files, not prose. Keep the always-loaded surface under ~60 lines, and push durable state into files like feature_list.json instead of chat history.

Key takeaways

  • Three standards coexist because they solve different problems: AGENTS.md is portable, CLAUDE.md is deep, .mdc rules are precise.
  • Permissions belong in settings.json, not prose. "Please don't run sudo" in Markdown is a suggestion; "deny": ["Bash(sudo:*)"] is enforced.
  • Context is a budget, not a bucket. Chroma's Context Rot study (194,480 API calls across 18 models) showed every model tested degrading as input grows.
  • Files beat chat history for state: version-controlled, queryable, and shared across parallel agents.
  • The winning pattern is canonical-plus-adapters: one AGENTS.md, thin re-exports per tool.

The three standards, and why all three exist

Figure 1: AGENTS.md vs CLAUDE.md vs Cursor Rules: Agent Config Done Right

As of mid-2026, three project-level config formats dominate, and they have meaningfully different owners and shapes:

| Dimension | AGENTS.md | CLAUDE.md | .cursor/rules/*.mdc |
| --- | --- | --- | --- |
| Owner | Agentic AI Foundation (Linux Foundation) | Anthropic | Cursor |
| Format | Plain Markdown | Markdown + @import | Markdown + YAML frontmatter |
| Scoping | Nested files, closest-wins | managed > user > project > local | globs: + 4 activation modes |
| Permissions | None built-in | allow/ask/deny in settings.json | Per-rule activation + ask/never |
| Read by | Codex, Copilot, Cursor, Devin, Factory, Roo Code | Claude Code only | Cursor only |

AGENTS.md — donated by OpenAI to the Linux Foundation–hosted Agentic AI Foundation in December 2025, alongside MCP — is deliberately minimal: "a README for agents." It defines almost nothing beyond closest-file-wins precedence, which is exactly why Codex, GitLab Duo, Devin, and Cursor all read it.

CLAUDE.md is Anthropic's auto-loaded memory file, with four scope levels, modular .claude/rules/*.md, and @<path> imports. It's vendor-specific, but it's the gateway to Claude Code's deepest features: sub-agents with isolated context windows, skills, and hooks.

Cursor's .mdc rules take a third shape: each rule is a separate file with YAML frontmatter controlling when it loads — alwaysApply: true for every conversation, globs: for file-triggered injection, manual attachment, or agent-requested loading.

These aren't competitors so much as layers. The pattern that emerged across 2026 — sometimes called the AgentLint pattern — is: author AGENTS.md canonically, then write thin adapters. Your CLAUDE.md opens with "Read ./AGENTS.md first. It is the canonical source," then adds only Claude-specific wiring. Your .cursor/rules/stack.mdc re-exports the same stack and boundary rules with Cursor's glob targeting. One source of truth, three consumers, no drift.
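One way to keep the adapters thin is to check them mechanically. The following is a minimal Python sketch, not part of any tool mentioned here: the function name audit_adapter and its return shape are hypothetical, and it only catches lines copied verbatim from the canonical file.

```python
def audit_adapter(adapter_text, canonical_text, canonical_name="AGENTS.md"):
    """Check that an adapter points at the canonical file and does not copy it.

    Hypothetical helper: returns a dict with two findings.
    """
    adapter_lines = [ln.strip() for ln in adapter_text.splitlines() if ln.strip()]
    canonical_lines = {ln.strip() for ln in canonical_text.splitlines() if ln.strip()}

    # The adapter should mention the canonical file near the top.
    points_to_canonical = any(canonical_name in ln for ln in adapter_lines[:5])

    # Verbatim copies of canonical rules are where drift starts.
    # Headings are ignored because sharing a section title is harmless.
    duplicated = [
        ln for ln in adapter_lines
        if ln in canonical_lines and not ln.startswith("#")
    ]
    return {"points_to_canonical": points_to_canonical, "duplicated_lines": duplicated}


canonical = "# Project\nUse pnpm, not npm.\nNever edit ./migrations by hand.\n"
adapter = "Read ./AGENTS.md first. It is the canonical source.\n\nUse pnpm, not npm.\n"
report = audit_adapter(adapter, canonical)
```

Here the adapter passes the pointer check but repeats one rule from the canonical file, which is the kind of duplication worth deleting before the two copies diverge.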

The three-tier permission model: prose is not policy

Figure 2: AGENTS.md vs CLAUDE.md vs Cursor Rules: Agent Config Done Right

Every mature agent setup converges on the same shape: Always (safe, reversible, pre-approved), Ask First (risky, needs a human), Never (blacklisted, period). The three standards implement this with wildly different rigor.

AGENTS.md has no permission fields at all. You write "never run rm -rf" as prose and hope the consuming tool's enforcement layer cares. Maximum portability, minimum enforcement — the spec itself leaves obedience to each tool.

Claude Code is the most expressive. Permissions live in .claude/settings.json, not in CLAUDE.md, with prefix-matching patterns — Bash(npm run:*) matches npm run test but not npm install:

{
  "permissions": {
    "allow": ["Bash(pnpm test:*)", "Edit(./src/**)", "Read(./src/**)"],
    "ask":   ["Bash(git push:*)", "Bash(rm:*)"],
    "deny":  ["Bash(sudo:*)", "Bash(curl:*)", "Read(./.env)", "Read(./secrets/**)"]
  }
}

Note what's in deny: not just destructive commands but reads of secrets. An agent that can Read(./.env) can leak it into context, logs, or a commit. Deny rules are your only hard boundary; everything else is negotiable at runtime via the /permissions command.
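To reason about how a given command would be classified, the prefix matching and tier precedence can be approximated in a few lines of Python. This is a simplified model, not Claude Code's actual matcher: it handles only Bash(<prefix>:*) rules, assumes the prefix must end on a word boundary, checks tiers in deny, ask, allow order, and falls back to ask when nothing matches. The names bash_rule_matches and decide are hypothetical.

```python
import re

# Bash rules taken from the settings.json example in this article.
RULES = {
    "allow": ["Bash(pnpm test:*)"],
    "ask": ["Bash(git push:*)", "Bash(rm:*)"],
    "deny": ["Bash(sudo:*)", "Bash(curl:*)"],
}

_BASH_PREFIX_RULE = re.compile(r"^Bash\((?P<prefix>.+):\*\)$")


def bash_rule_matches(rule, command):
    """Return True if a Bash(<prefix>:*) rule matches the command."""
    match = _BASH_PREFIX_RULE.match(rule)
    if match is None:
        return False  # Read/Edit and exact-match rules are out of scope here
    prefix = match.group("prefix")
    # Assumption: the prefix must end on a word boundary,
    # so "rm" matches "rm -rf build" but not "rmdir build".
    return command == prefix or command.startswith(prefix + " ")


def decide(command, rules=RULES):
    """Classify a command as 'deny', 'ask', or 'allow' (assumed precedence)."""
    for tier in ("deny", "ask", "allow"):
        if any(bash_rule_matches(rule, command) for rule in rules.get(tier, [])):
            return tier
    return "ask"  # assumed default: anything unlisted needs a human
```

Under these assumptions, decide("pnpm test --watch") comes back as allow, decide("curl https://example.com") as deny, and an unlisted command such as ls falls through to ask.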

Cursor inverts the model. Instead of one global allow/ask/deny list, each .mdc rule activates independently — always, glob-triggered, manual, or agent-requested — with per-rule ask/never toggles in the editor UI. This is the most precise targeting (your testing.mdc only loads when test files are touched, costing zero tokens otherwise) but it fragments policy across many files. A 30-rule project means 30 files to review.
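The token-saving effect of glob-triggered rules is easier to see in code. The sketch below models only two of the activation modes (always and glob-triggered) and uses Python's fnmatchcase as a stand-in for Cursor's glob matching, which may differ in edge cases. The rule names come from this article; the glob values and the function rules_for are hypothetical.

```python
from fnmatch import fnmatchcase

# Each dict mirrors the frontmatter fields discussed above.
RULES = [
    {"name": "stack.mdc", "alwaysApply": True, "globs": []},
    {"name": "testing.mdc", "alwaysApply": False,
     "globs": ["**/*.test.ts", "tests/**"]},  # hypothetical globs
]


def rules_for(touched_files, rules=RULES):
    """Return the names of rules that would load for the touched files."""
    loaded = []
    for rule in rules:
        glob_hit = any(
            fnmatchcase(path, pattern)
            for path in touched_files
            for pattern in rule["globs"]
        )
        if rule["alwaysApply"] or glob_hit:
            loaded.append(rule["name"])
    return loaded
```

Editing src/auth/login.ts loads only stack.mdc, while touching src/auth/login.test.ts pulls in testing.mdc as well, so the testing rules cost nothing on turns that never touch a test file.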

The practical synthesis: declare intent twice. A prose rule in AGENTS.md ("run the test suite before marking work complete") for every tool to read, plus a mechanical rule in settings.json (Bash(pnpm test:*) → allow) so Claude Code never interrupts you to run tests. The prose communicates; the JSON enforces.

Context budgeting: why your CLAUDE.md should be under 60 lines

The empirical case against fat config files is now solid. The "Lost in the Middle" finding (Liu et al., 2023) showed U-shaped recall over long contexts. Chroma's 2025 Context Rot study demonstrated that all 18 frontier models tested degrade as input length grows — "LLMs are typically presumed to process context uniformly… in practice, this assumption does not hold." And HumanLayer's production experience closed the loop: more context window made their agent worse at following instructions.

So budget ruthlessly. Anthropic's own best-practices guidance says keep CLAUDE.md focused and point to detail files via @import rather than inlining. HumanLayer's root file is under 60 lines — roughly 1k tokens, loaded every turn, essentially free. The math is simple: if the model follows ~150–200 instructions and the harness eats ~50, your entire config stack gets maybe 100–150 before adherence becomes probabilistic. Spend them on boundaries and verification gates, not code-style trivia your formatter already enforces.
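The arithmetic is small enough to put in a pre-commit check. The sketch below uses the figures quoted above (150 to 200 instructions, about 50 consumed by the harness, a root file under 60 lines) as constants. Counting non-blank lines is a crude proxy for instruction count, and the function names are hypothetical.

```python
MODEL_BUDGET = (150, 200)  # instructions a model reliably follows, per the text
HARNESS_COST = 50          # instructions already used by the system prompt
MAX_ROOT_LINES = 60        # target ceiling for the always-loaded root file


def remaining_budget(model_budget=MODEL_BUDGET, harness_cost=HARNESS_COST):
    """Return the (low, high) instruction budget left for config files."""
    low, high = model_budget
    return (low - harness_cost, high - harness_cost)


def non_blank_lines(text):
    """Count lines that contain something other than whitespace."""
    return sum(1 for line in text.splitlines() if line.strip())


def within_line_budget(text, max_lines=MAX_ROOT_LINES):
    """True if the file stays under the line ceiling."""
    return non_blank_lines(text) < max_lines
```

With the default constants, remaining_budget() returns (100, 150), which is the range the whole config stack has to share.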

Files beat chat history: the state layer

Long-running agents need memory that survives the session. Chat history fails every test:

| Property | Chat history | File on disk |
| --- | --- | --- |
| Survives restart / model switch | No | Yes |
| Version-controlled | No | Yes |
| Shareable across parallel agents | No | Yes |
| Queryable / structured | No | Yes (JSON) |
| Token cost | Grows unbounded | Bounded |

Two artifacts from Anthropic's autonomous-coding harness patterns solve this. feature_list.json is a durable task graph — features with id, priority, status, and acceptance criteria. The agent reads it at session start, picks the highest-priority pending item, and flips status on completion. claude-progress.txt is an append-only log, one line per work session:

2026-06-10 11:30 | auth-002 | IN_PROGRESS | rate limiter logic complete; blocked on INFRA-417

A fresh session reads the last 50 lines and knows what's in flight, what's blocked, and what just shipped — without re-deriving anything from a conversation that no longer exists.
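The read-pick-log cycle can be sketched in Python as follows. The schema is an assumption, since the text names the fields but not their types: here priority is an integer where a lower number means more urgent, status uses the values pending and done, and the sample entries other than auth-002 are invented for illustration.

```python
import json
from datetime import datetime

# Hypothetical contents of feature_list.json.
SAMPLE = """
[
  {"id": "auth-001", "priority": 2, "status": "done",
   "acceptance": ["login works"]},
  {"id": "auth-002", "priority": 1, "status": "pending",
   "acceptance": ["rate limiter returns 429"]},
  {"id": "ui-003", "priority": 3, "status": "pending",
   "acceptance": ["dark mode toggle"]}
]
"""


def next_feature(features):
    """Pick the most urgent pending feature, or None if nothing is pending."""
    pending = [f for f in features if f["status"] == "pending"]
    return min(pending, key=lambda f: f["priority"]) if pending else None


def progress_line(feature_id, status, note, when):
    """Format one log entry in the pipe-separated layout shown above."""
    return f"{when:%Y-%m-%d %H:%M} | {feature_id} | {status} | {note}"


def recent_entries(log_text, n=50):
    """Return the last n lines of the progress log."""
    return log_text.splitlines()[-n:]


features = json.loads(SAMPLE)
chosen = next_feature(features)
```

In a real harness the agent would read both files from disk and append the formatted line to claude-progress.txt. The pure functions are kept separate here so the selection and formatting logic can be tested without touching the filesystem.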

The companion pattern is three shell scripts that turn vague intentions into gates: init.sh (deterministic bootstrap — toolchain checks, frozen-lockfile install, migrations), test-all.sh (single-command verification: typecheck, lint, unit, e2e, build), and review.sh (pre-merge diff review and secret scan). To be clear about provenance: none of these filenames is mandated by Anthropic's first-party docs — they're community-converged patterns visible in anthropics/claude-quickstarts and harness repos like everything-claude-code. But the logic is first-party: Anthropic's harness guidance describes exactly this loop of "a feature list, a progress log, a test gate, and a review gate."
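The gate logic itself is simple: run the checks in a fixed order and stop at the first failure. The article's pattern uses shell scripts; the Python sketch below is a stand-in that shows the same control flow. The function run_gate is hypothetical, and the step list is left to the caller because the actual commands depend on your toolchain.

```python
import subprocess
import sys


def run_gate(steps):
    """Run (name, argv) steps in order.

    Returns the name of the first step that exits non-zero,
    or None if every step passes. Later steps are skipped
    once one fails.
    """
    for name, argv in steps:
        result = subprocess.run(argv)
        if result.returncode != 0:
            return name
    return None


# Stand-in steps using the Python interpreter itself, so the sketch
# runs anywhere. Real steps would be typecheck, lint, unit, e2e, build.
demo_steps = [
    ("typecheck", [sys.executable, "-c", "pass"]),
    ("lint", [sys.executable, "-c", "raise SystemExit(3)"]),
    ("unit", [sys.executable, "-c", "pass"]),
]
failed = run_gate(demo_steps)
```

In this demo the lint stand-in exits non-zero, so run_gate reports it and never reaches the unit step. A single pass-or-fail answer is what lets the session protocol say "must pass" without ambiguity.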

Your CLAUDE.md then becomes mostly a session protocol, not a style guide:

## On session start
1. Run ./init.sh if not yet run on this branch.
2. Read ./feature_list.json; pick the highest-priority pending item.
3. Skim the last 50 lines of ./claude-progress.txt.

## On task completion
1. Run ./test-all.sh — must pass.
2. Run ./review.sh and address findings.
3. Update feature_list.json; append one line to claude-progress.txt.

What this means for you

If you're setting this up today, the order of operations is:

  1. Write AGENTS.md first — stack versions, build commands, testing rules, boundaries. This is the file every tool reads. Keep it under a page.
  2. Add .claude/settings.json with real allow/ask/deny rules. Deny sudo, curl, and all secret-file reads. Allow your test/lint/build commands so the agent stops asking. This is the single highest-leverage file in the stack.
  3. Write a thin CLAUDE.md that points to AGENTS.md, defines the session-start/completion protocol, and references settings.json — not a second copy of your style guide.
  4. If your team uses Cursor, add stack.mdc (alwaysApply: true) and glob-scoped rules like testing.mdc so context only loads when relevant.
  5. Add the state layer (feature_list.json and claude-progress.txt) before you attempt any multi-session or parallel-agent work, not after your first lost afternoon.

And audit for the known failure modes: AGENTS.md prose that nothing enforces, a local settings file silently overriding your project permissions, and Cursor globs that went stale after a directory rename and now never fire.
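The second failure mode lends itself to a mechanical check. The sketch below compares two parsed settings files and flags any rule string that appears under different tiers in each. It makes no claim about which file wins at runtime, it only compares exact rule strings (a broader pattern that shadows a narrower one goes undetected), and the function name tier_conflicts is hypothetical.

```python
def tier_conflicts(project, local):
    """List rules that sit in different tiers in the two settings dicts.

    Returns sorted (rule, project_tier, local_tier) tuples.
    """
    def index(settings):
        perms = settings.get("permissions", {})
        return {
            rule: tier
            for tier in ("allow", "ask", "deny")
            for rule in perms.get(tier, [])
        }

    project_tiers = index(project)
    local_tiers = index(local)
    shared = project_tiers.keys() & local_tiers.keys()
    return sorted(
        (rule, project_tiers[rule], local_tiers[rule])
        for rule in shared
        if project_tiers[rule] != local_tiers[rule]
    )


# Hypothetical parsed contents of a project and a local settings file.
project_settings = {"permissions": {"deny": ["Bash(curl:*)"],
                                    "ask": ["Bash(git push:*)"]}}
local_settings = {"permissions": {"allow": ["Bash(curl:*)",
                                            "Bash(pnpm test:*)"]}}
conflicts = tier_conflicts(project_settings, local_settings)
```

Here the check surfaces Bash(curl:*) as denied in one file and allowed in the other, which is worth resolving by hand regardless of how the tool merges the two.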

The agents are getting more capable every quarter. The teams getting the most out of them aren't the ones writing the longest config files — they're the ones treating configuration as a control plane: small always-loaded surface, machine-enforced permissions, durable file-based state. Get those three right and everything else is iteration.

Frequently asked questions

Should I use AGENTS.md or CLAUDE.md for my project?

Use both, with AGENTS.md as the canonical source. AGENTS.md is the vendor-neutral Linux Foundation standard read by Codex, Copilot, Cursor, Devin, and others. Then add a thin CLAUDE.md that points to AGENTS.md and adds Claude-Code-specific features like settings.json permissions and sub-agents.

What is the three-tier permission model for coding agents?

It's the Always / Ask First / Never pattern: pre-approve safe, reversible actions (running tests), require confirmation for risky ones (git push, rm), and hard-block dangerous ones (sudo, reading .env). Claude Code implements it as allow/ask/deny arrays in .claude/settings.json; Cursor approximates it with per-rule activation modes and ask/never toggles; AGENTS.md has no built-in enforcement.

How long should a CLAUDE.md file be?

Short — HumanLayer keeps their root CLAUDE.md under 60 lines (~1k tokens), citing evidence that frontier models reliably follow only ~150–200 instructions, of which Claude Code's system prompt already consumes ~50. Put detail in linked files via @imports rather than inlining everything.

Why store agent state in files like feature_list.json instead of chat history?

Files survive session restarts and model switches, are version-controlled, queryable, and shareable across parallel agents — chat history is none of these. The pattern from Anthropic's autonomous-coding harness: the agent reads feature_list.json and the tail of claude-progress.txt at session start instead of re-deriving state from conversation.

Do Cursor rules (.mdc files) work with Claude Code?

No — .cursor/rules/*.mdc files are Cursor-specific, just as settings.json permissions are Claude Code-specific. Cursor does read AGENTS.md, which is why the mid-2026 best practice is to author AGENTS.md canonically and keep .mdc files as thin adapters for Cursor's glob-scoped activation modes.