AI coding CLI telemetry crossed from privacy footnote to hardware cost when GitHub issue #28224 reported that OpenAI's Codex CLI could write an annualized ~640 TB/year into a local SQLite feedback database, enough to burn through a 2 TB consumer NVMe drive's rated endurance in roughly two years of continuous use.
That number is not a universal benchmark. It came from one observed long-running instance, and actual write volume depends on usage, codebase size, session duration, and version.
But the lesson is bigger than one CLI logging bug: local AI tools now deserve the same operational scrutiny teams already apply to cloud telemetry, build caches, and observability agents.
Last updated: June 22, 2026
TL;DR
- GitHub issue #28224 reports that Codex CLI feedback logs can write an annualized 640 TB/year to `~/.codex/logs_2.sqlite`.
- The reported root cause is a TRACE-level SQLite feedback sink using `Targets::new().with_default(Level::TRACE)`, with `RUST_LOG` ignored for that sink.
- Consumer SSDs commonly ship with about 1,200 TBW at 2 TB capacity, so sustained high-volume logging can become a real endurance budget issue.
- Claude Code, Gemini CLI, GitHub Copilot, Aider, and others expose different telemetry controls, but local file growth still needs auditing.
- The safe first move is measurement: file sizes, WAL growth, SQLite metadata, and documented vendor toggles.
AI coding CLI telemetry is the local collection, persistence, and optional export of coding assistant operational data from command-line tools, including logs, traces, feedback records, prompts, shell events, filesystem activity, and OpenTelemetry payloads. The operational risk is that telemetry can consume disk, leak sensitive workflow metadata, or create write amplification without a visible user-facing failure.
Why AI Coding CLI Telemetry Now Has a Hardware Cost
The Codex report is useful because it makes an invisible class of risk measurable.
The affected file path is concrete: ~/.codex/logs_2.sqlite on macOS and Linux, with related logs_2.sqlite-wal and logs_2.sqlite-shm files. OpenAI's own Codex configuration reference documents the ~/.codex/ configuration location, while the issue report identifies the SQLite feedback database as the growth point.
The reported write rate is also concrete. A developer said one Codex instance ran for about 21 days and generated roughly 37 TB of SSD writes, which extrapolates to about 640 TB/year, according to the DEV Community write-up “Stop OpenAI Codex Writing 640 TB/Year to Your SSD”.
That is the right way to state the figure: annualized from one reported instance. Treat it as a high-end observed scenario, not a median across all Codex users.
A second Codex issue, #26374, reported logs_N.sqlite growth of about 0.75 GB/day with unbounded file growth. That extrapolates to about 274 GB/year, far below 640 TB/year but still a signal that coding assistant logs can grow without a clear retention model.
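The annualized figures above are simple linear extrapolations, and a minimal sketch makes the arithmetic easy to check. The `annualize` helper name is hypothetical, and the sketch assumes the observed write rate stays constant for a full 365-day year, which a real workload may not do.

```shell
#!/bin/sh
# annualize AMOUNT DAYS: scale an observed write total to a 365-day year.
# Units pass through unchanged (TB in gives TB/year out).
# Assumption: the observed rate is constant, which is a simplification.
annualize() {
  awk -v amount="$1" -v days="$2" 'BEGIN { printf "%.0f\n", amount / days * 365 }'
}

annualize 37 21    # issue #28224 report: 37 TB over about 21 days -> 643
annualize 0.75 1   # issue #26374 report: 0.75 GB per day          -> 274
```

The first result, 643 TB/year, is what the reports round to "about 640 TB/year". The second is in GB/year, roughly three orders of magnitude smaller.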
What Actually Broke In The CLI Logging Bug?
The core bug reported in openai/codex#28224 is that a SQLite feedback sink is wired at TRACE verbosity.
The issue traces the behavior to `Targets::new().with_default(Level::TRACE)`. TRACE is the noisiest conventional Rust tracing level, meant for detailed execution flow rather than a persistent default feedback stream.
The problem gets worse because RUST_LOG reportedly does not reduce writes to logs_2.sqlite. Developers who set RUST_LOG=warn or RUST_LOG=error may reduce normal logging while the SQLite feedback sink keeps writing at its own effective level.
The issue analysis says TRACE-level entries account for about 70.7% of retained bytes in the database. It also says that combined OpenTelemetry sink categories account for roughly 96% of written data, meaning the bulk of the footprint is vendor-side operational telemetry rather than something most developers will inspect locally.
That distinction matters. This is not just “large logs are annoying.” This is observability behavior living inside a developer workstation, writing to a local database, with unclear retention and an ineffective standard log-level escape hatch.
Key Takeaways
- A local AI tool can create storage risk without failing, slowing down, or showing an obvious warning.
- SQLite WAL files matter because write-ahead logging can make growth and write volume look different from the base database size.
- Developer telemetry controls should be evaluated by behavior, not by privacy-page language.
- `RUST_LOG`, `DISABLE_TELEMETRY`, and `--no-telemetry` are only meaningful after you verify which sinks obey them.
- SSD endurance math belongs in AI workstation policy for teams running agents continuously.
How Bad Is 640 TB/Year For An SSD?
The 2 TB Samsung 990 Pro, WD Black SN850X, and Crucial T700 are all commonly specified at around 1,200 TBW, according to vendor materials from Western Digital and the manufacturer datasheets referenced in the research report. At 640 TB/year, that rating is consumed in about 1.9 years (1,200 TBW ÷ 640 TB/year), before normal OS, build, browser, Docker, package manager, and database writes are counted.
A higher-end consumer Kingston KC3000 at 1,600 TBW would last about 2.5 years under the same assumption. Enterprise drives change the math completely: a Samsung PM9A3 7.68 TB read-intensive drive at about 14,028 TBW would have roughly 21.9 years of headroom at 640 TB/year, and a Micron 7450 MAX 6.4 TB at 35,040 TBW would have about 54.8 years.
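The endurance figures in this section all come from one division: rated TBW over annual write volume. The sketch below reproduces them with a hypothetical `years_to_rated_tbw` helper. It assumes the assistant is the only writer and that the annualized rate holds steady, so treat the results as upper bounds on headroom rather than predictions.

```shell
#!/bin/sh
# years_to_rated_tbw TBW TB_PER_YEAR: years until the rated TBW is reached.
# Assumption: constant write rate and no other writers on the drive.
years_to_rated_tbw() {
  awk -v tbw="$1" -v rate="$2" 'BEGIN { printf "%.1f\n", tbw / rate }'
}

years_to_rated_tbw 1200 640    # 2 TB consumer NVMe class    -> 1.9
years_to_rated_tbw 1600 640    # Kingston KC3000 figure      -> 2.5
years_to_rated_tbw 14028 640   # Samsung PM9A3 7.68 TB       -> 21.9
years_to_rated_tbw 35040 640   # Micron 7450 MAX 6.4 TB      -> 54.8
```

Swapping in your own measured TB/year in place of 640 gives a more honest number for your workstation than the single reported instance does.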
TBW is a warranty and endurance rating, not a cliff. Many SSDs keep working after the rated number, and write patterns, compression, spare area, firmware, and temperature all affect real wear.
Still, developer machines are no longer light-write devices. Containers, local databases, embeddings, model caches, browser profiles, CI loops, and coding assistants all compete for the same NAND budget.
Which Coding Assistants Give You Real Telemetry Controls?
The practical question is not whether a tool has telemetry. The question is whether you can inspect it, constrain it, and prove the constraint works.
| Tool | Default or Reported Behavior | Control Surface | Local Logging Risk |
|---|---|---|---|
| OpenAI Codex CLI | Reported TRACE writes to SQLite feedback sink | ~/.codex/config.toml; RUST_LOG reportedly ignored by SQLite sink | High until issue #28224 is fixed |
| Claude Code | Configurable telemetry | DISABLE_TELEMETRY, settings.json | More governable, still audit local state |
| Gemini CLI | Telemetry disabled by default, per docs | --no-telemetry, settings, OTLP controls | Lower default risk |
| GitHub Copilot | IDE and CLI telemetry controls vary by surface | VS Code telemetry settings, GitHub CLI opt-out | Depends on editor and policy |
| Aider | Analytics can be disabled | --no-analytics and YAML config | Lower by design, still measure history files |
| Cursor | CLI documented, enterprise privacy controls exist | Cursor CLI docs | Telemetry toggle detail was not fully captured in the research |
| Sourcegraph Cody | Self-hosting changes data-control model | Sourcegraph releases | Stronger in self-hosted deployments |
Claude Code has unusually explicit enterprise controls. Anthropic documents user and project settings in Claude Code settings, environment variables in Claude Code env vars, OpenTelemetry behavior in monitoring usage, and zero data retention for eligible plans.
Gemini CLI is notable because its telemetry docs say telemetry is disabled by default, with --telemetry, --no-telemetry, telemetry.enabled, telemetry.target, and telemetry.logPrompts controls in its OpenTelemetry documentation.
The pattern is clear: the better tools separate local diagnostics, exported telemetry, prompt logging, and enterprise retention. The dangerous design is a single opaque “feedback” pipe that silently persists verbose traces.
How To Audit Coding Assistant Logs Without Breaking Anything
Start read-only. You need a baseline before deciding whether anything needs cleanup.
On macOS or Linux, locate the major assistant directories:
ls -la ~/.codex/ 2>/dev/null
ls -la ~/.claude/ 2>/dev/null
ls -la ~/.gemini/ 2>/dev/null
ls -la ~/.config/ 2>/dev/null | grep -iE 'codex|claude|gemini|cursor'
ls -la ~/.local/share/ 2>/dev/null | grep -iE 'codex|claude|gemini|cursor'
Measure sizes and modification times:
du -sh ~/.codex ~/.claude ~/.gemini 2>/dev/null
ls -lah ~/.codex/logs_2.sqlite* 2>/dev/null
ls -lah ~/.codex/logs/ 2>/dev/null
wc -l ~/.codex/history.jsonl 2>/dev/null
Inspect configuration and environment controls:
grep -nE '^\[telemetry\]|log[_-]?level|RUST_LOG' ~/.codex/config.toml 2>/dev/null
grep -nE '"level"|"telemetry"|"env"' ~/.claude/settings.json 2>/dev/null
grep -nE '"telemetry"|"logPrompts"|"otlpEndpoint"' ~/.gemini/settings.json 2>/dev/null
echo "RUST_LOG=$RUST_LOG"
echo "DISABLE_TELEMETRY=$DISABLE_TELEMETRY"
echo "CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC=$CLAUDE_CODE_DISABLE_NONESSENTIAL_TRAFFIC"
echo "OTEL_EXPORTER_OTLP_ENDPOINT=$OTEL_EXPORTER_OTLP_ENDPOINT"
Then query SQLite metadata without mutating the database:
DB="$HOME/.codex/logs_2.sqlite"
file "$DB"
ls -lah "$DB" "$DB-wal" "$DB-shm" 2>/dev/null
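# -readonly opens the database without write access, so the queries cannot modify it
sqlite3 -readonly "$DB" "PRAGMA journal_mode;"
# wal_autocheckpoint and journal_size_limit are per-connection settings:
# these two report the sqlite3 shell's own defaults, not the values Codex uses
sqlite3 -readonly "$DB" "PRAGMA wal_autocheckpoint;"
sqlite3 -readonly "$DB" "PRAGMA journal_size_limit;"
sqlite3 -readonly "$DB" "PRAGMA page_count;"
sqlite3 -readonly "$DB" "PRAGMA page_size;"
sqlite3 -readonly "$DB" "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name;"
sqlite3 -readonly "$DB" "PRAGMA integrity_check;"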
Avoid VACUUM, DELETE, DROP, UPDATE, REINDEX, and PRAGMA wal_checkpoint(TRUNCATE) during measurement. Those commands change the state you are trying to observe.
When Should Teams Treat Developer Telemetry As Production Risk?
Treat local AI tools as production-adjacent infrastructure when they run continuously, touch proprietary repositories, or execute shell commands inside build environments.
A founder using Codex for two short sessions a week has a different risk profile from an engineering org that leaves agentic CLIs running across dozens of workstations. The second group needs policy.
Policy does not need to be heavy. It should answer four questions:
- Which AI coding tools are approved?
- Where do they store local logs, sessions, databases, and auth files?
- Which telemetry controls are required for default use?
- How often do developer machines report disk growth and SSD wear indicators?
The most useful control is behavioral verification. Set DISABLE_TELEMETRY=true or RUST_LOG=warn, or pass --no-telemetry, then check whether files actually stop growing.
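One way to run that check is to sample file sizes before and after a fixed interval and project the difference to a daily rate. The sketch below is read-only and uses hypothetical helper names (`files_bytes`, `growth_per_day`, `sample_growth`). It measures file growth, not bytes written: a WAL file that is recycled in place can absorb heavy writes without growing, so a flat result here is necessary evidence that a toggle worked, not sufficient evidence.

```shell
#!/bin/sh
# files_bytes FILE...: combined size in bytes; missing files count as 0.
files_bytes() {
  total=0
  for f in "$@"; do
    if [ -f "$f" ]; then
      size=$(wc -c < "$f")
      total=$((total + size))
    fi
  done
  echo "$total"
}

# growth_per_day BEFORE_BYTES AFTER_BYTES SECONDS: project growth to bytes/day.
growth_per_day() {
  awk -v a="$1" -v b="$2" -v s="$3" 'BEGIN { printf "%.0f\n", (b - a) / s * 86400 }'
}

# sample_growth SECONDS FILE...: sample sizes twice and print projected bytes/day.
sample_growth() {
  interval="$1"; shift
  before=$(files_bytes "$@")
  sleep "$interval"
  after=$(files_bytes "$@")
  growth_per_day "$before" "$after" "$interval"
}

# Example (not run automatically): one-hour sample of the Codex feedback database.
#   DB="$HOME/.codex/logs_2.sqlite"
#   sample_growth 3600 "$DB" "$DB-wal" "$DB-shm"
```

Run it once with the toggle unset and once with it set, under comparable usage, and compare the two projections. Short intervals are noisy, so a longer sample during normal work gives a steadier baseline.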
What This Means For You
If you run Codex CLI as of June 22, 2026, check issue #28224 and the official Codex changelog before assuming the bug is fixed. The research found no changelog entry resolving the SQLite feedback logging issue by the publish date.
If you run Claude Code, Gemini CLI, Copilot, Aider, Cursor, or Cody, don't copy the Codex conclusion onto those tools. Audit them separately. Claude Code and Gemini CLI both document specific telemetry controls, while GitHub Copilot inherits some behavior from VS Code's telemetry settings and enterprise policy controls.
If you manage a team, add local AI assistant storage to your workstation baseline. This is the same category as Docker volume growth, package caches, editor indexes, and local observability agents.
Practical Checklist
- Record current assistant versions and publish-date context for any issue you file.
- Measure `~/.codex`, `~/.claude`, `~/.gemini`, editor extension folders, and local cache directories.
- For Codex, check `~/.codex/logs_2.sqlite`, `logs_2.sqlite-wal`, and `logs_2.sqlite-shm`.
- Measure growth over a fixed interval, ideally 24 hours of normal use.
- Verify whether documented telemetry toggles actually reduce local file growth.
- Prefer vendor-supported controls over cleanup scripts.
- Track SSD wear using your OS or vendor SMART tools if agents run continuously.
- For enterprise fleets, use managed settings where available and keep local telemetry defaults in endpoint policy.
- Follow the upstream issue and changelog before resuming continuous Codex runs on consumer SSDs.
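For the SSD wear item in the checklist, NVMe drives expose a "Data Units Written" counter in their SMART log, where one unit is 1,000 blocks of 512 bytes. The sketch below assumes you collect that counter with smartmontools (for example `smartctl -A /dev/nvme0`, where the device path is an assumption and the command usually needs root). The helper names are hypothetical and the sample line is illustrative, not a measurement.

```shell
#!/bin/sh
# parse_units_written: read smartctl-style output on stdin and print the raw
# "Data Units Written" counter with thousands separators removed.
parse_units_written() {
  awk -F: '/Data Units Written/ { sub(/\[.*/, "", $2); gsub(/[^0-9]/, "", $2); print $2 }'
}

# nvme_units_to_tb UNITS: convert data units (1,000 x 512 bytes each) to decimal TB.
nvme_units_to_tb() {
  awk -v u="$1" 'BEGIN { printf "%.2f\n", u * 512000 / 1e12 }'
}

# Illustrative input line in smartctl's format (made-up value):
sample='Data Units Written:                 12,345,678 [6.32 TB]'
units=$(printf '%s\n' "$sample" | parse_units_written)
nvme_units_to_tb "$units"   # prints 6.32
```

Recording this counter at the start and end of a measured interval gives total host writes for the whole machine, which complements per-file growth checks because it also captures rewrites that never change a file's size.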
LinkedIn Teaser
A Codex CLI issue turned “developer telemetry” into a hardware-cost story.
GitHub issue #28224 reports that Codex feedback logs can write an annualized 640 TB/year into `~/.codex/logs_2.sqlite`. On a 2 TB consumer NVMe drive rated around 1,200 TBW, that is roughly 1.9 years of endurance budget before normal workstation writes.
The uncomfortable part: RUST_LOG=warn reportedly does not constrain the SQLite feedback sink.
The fix for teams is not panic. It is measurement. Audit local assistant directories, watch WAL files, verify telemetry toggles by file growth, and treat coding agents like any other workstation daemon with logs, retention, and operational cost.
Sources
- Codex SQLite feedback logs can write ~640 TB/year and rapidly consume SSD endurance
- Stop OpenAI Codex Writing 640 TB/Year to Your SSD
- Excessive SQLite WAL writes during streaming due to TRACE logs ignoring RUST_LOG
- feedback log sqlite grows unbounded, ~0.75 GB/day
- Codex Configuration Reference
- Codex Changelog
- Claude Code settings
- Claude Code environment variables
- Claude Code monitoring usage
- Claude Code zero data retention
- Gemini CLI telemetry documentation
- VS Code telemetry settings
- VS Code enterprise policies
- GitHub CLI opt-out usage telemetry
- Aider configuration documentation
