Which Coding Agent Is Cheapest Per Solved Task? (Mid-2026 Desk Brief)

As of 2026-07-03. Pricing and benchmarks shift every few weeks; treat anything not dated within the last 90 days as historical.

1. Executive Summary

Cheapest subscription product for typical individual use: GitHub Copilot Pro at $10/month (as of 2026-07-03). It moved to usage-based AI Credits on June 1, 2026; Pro includes 1,500 monthly credits (1,000 base + 500 flex), and inline completions are free on every paid plan [1][2].

Cheapest per-token model capable enough to run as a coding agent: Google Gemini 3.5 Flash at $1.50 input / $9.00 output per million tokens (1M context, launched May 19, 2026) [3]. Claude Haiku 4.5 is cheaper per token ($1/$5) but weaker at agentic coding.

Best value model that is also near state of the art at coding: Claude Sonnet 5 (launched June 30, 2026), at introductory pricing of $2/$10 per MTok through 2026-08-31 (list $3/$15 from September 1), with near-Opus coding quality: 85.2% SWE-bench Verified and 63.2% SWE-bench Pro [5][23]. It trails Opus 4.8 on SWE-bench Pro (63.2% vs 69.2%) because it sits a tier below, not because it regressed; it improves on Sonnet 4.6/4.5, which it supersedes [5][23].

Cost per solved task: no vendor publishes a first-party USD figure. Back-of-envelope math on a run with 100K input and 10K output tokens puts Gemini 3.5 Flash at ~$0.24 and Claude Opus 4.8 at ~$0.75. Flash solves roughly 20% fewer tasks on SWE-bench Pro (~55% vs ~69%), so dividing per-run cost by solve rate gives roughly $0.44 per solved task for Flash vs $1.09 for Opus 4.8. Flash wins on raw economics for high-volume, error-tolerant workloads; Opus wins when accuracy per task outweighs dollars per task.

Default pick today: run Gemini 3.5 Flash for high-volume mechanical work (test generation, refactors, docstrings) and Claude Sonnet 5 for hard SWE-bench-class problems. At intro pricing Sonnet 5 is the standout accuracy-per-dollar choice, with Opus 4.8 reserved for the hardest cases.
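The cost-per-solved-task figures in this summary follow from a simple ratio. As a rough sketch (the symbols are introduced here for illustration and are not from any vendor): let N_in and N_out be token counts in millions, p_in and p_out the prices per MTok, and s the solve rate. This assumes one attempt per task and that a SWE-bench Pro solve rate, rounded to 55% and 69%, carries over to your workload, which it may not.

```latex
\[
C_{\text{run}} = N_{\text{in}}\,p_{\text{in}} + N_{\text{out}}\,p_{\text{out}},
\qquad
C_{\text{solved}} = \frac{C_{\text{run}}}{s}
\]
\[
\text{Gemini 3.5 Flash: } C_{\text{run}} = 0.1(1.50) + 0.01(9.00) = 0.24,
\quad C_{\text{solved}} \approx \frac{0.24}{0.55} \approx 0.44
\]
\[
\text{Claude Opus 4.8: } C_{\text{run}} = 0.1(5.00) + 0.01(25.00) = 0.75,
\quad C_{\text{solved}} \approx \frac{0.75}{0.69} \approx 1.09
\]
```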

2. What's Current — Shipping Versions Snapshot

Product / Model	Latest Version	Release Date	Source
Claude Opus 4.8 (Anthropic)	Opus 4.8	2026 (exact date unverified)	anthropic.com [6]
Claude Opus 4.7 (Anthropic)	Opus 4.7	2026 (exact date unverified)	anthropic.com [7]
Claude Sonnet 5 (Anthropic)	Sonnet 5	June 30, 2026	marktechpost.com [23]
Claude Sonnet 4.6 (Anthropic)	Sonnet 4.6	Feb 2026 (GitHub Copilot GA; exact date unverified)	github.blog [8]
Claude Haiku 4.5 (Anthropic)	Haiku 4.5	Active ($1/$5, 200K ctx)	anthropic pricing [9]
GPT-5.5 Thinking (OpenAI)	GPT-5.5	April 23, 2026	openrouter.ai [10]
OpenAI Codex CLI	open-source (~v0.142.x)	2026 (open-sourced)	openai.com/codex [12]
Gemini 3.5 Flash (Google)	3.5 Flash	May 19, 2026	ai.google.dev [3]
Gemini 3.1 Pro (Google)	3.1 Pro	late 2025 / early 2026 (unverified pricing)	ai.google.dev [13]
Cursor	Hobby / Pro / Pro+ / Ultra / Teams	Credit model in effect since June 2025	cursor.com [14]
Windsurf / Cascade	Pro / Teams / Enterprise	Re-priced March 19, 2026	codeium.com [15]
GitHub Copilot	Free / Pro / Pro+ / Max / Business / Enterprise	AI Credits since June 1, 2026	github.com [1][2]
Cline (open source)	v3.25+	continuous	github.com/cline [16]
Aider (open source)	latest	continuous	aider.chat [17]
Replit Agent	current	continuous	replit.com
Devin (Cognition)	current	continuous	devin.ai
Amazon Kiro	GA	2026 (unverified)	amazon press release (unverified)

3. Key Facts

GitHub Copilot Pro is $10/month (as of 2026-07-03), with Pro+ at $39/month and a new individual Max tier at $100/month; the pricing model moved from premium-request-units to AI Credits on June 1, 2026 (overage ~$0.01/credit, reported), and code completions are free on all paid plans [1][2]. (The earlier brief's claim that Pro was "cut from $19 to $10 in April 2026" could not be verified — Copilot Pro has been $10/month; this appears to be invented history and has been removed.)
OpenAI's Codex CLI is open-source, making the binary free; users still pay OpenAI per token for completions [11][12].
Cursor's Pro plan is $20/month on a usage-credit model; Pro+ is $60/month (~3× Pro credits), Ultra is $200/month (20× Pro credits), and Teams is $40/user/month [14].
Windsurf/Cascade Pro is $20/month (raised from $15 on March 19, 2026; pre-change subscribers grandfathered at $15) and switched from credits to daily/weekly quotas [15].
Claude Sonnet 5 launched June 30, 2026 at intro $2/$10 per MTok through 2026-08-31 ($3/$15 thereafter) with 1M context. It scores 85.2% on SWE-bench Verified and 63.2% on SWE-bench Pro; it trails Opus 4.8 (69.2% SWE-bench Pro) as a lower-tier sibling but improves on Sonnet 4.6/4.5 [5][23].
Claude Opus 4.8 is $5/$25 per MTok with a 1M context window; it is the coding-quality leader (~69.2% SWE-bench Pro) [6][23]. (Specific Fast Mode price figures and a "~4× fewer code flaws vs 4.7" claim from the earlier draft could not be verified and have been cut.)
GPT-5.5 (April 23, 2026) doubled per-token pricing versus GPT-5.4 — input $2.50→$5.00, output $15→$30 (Thinking) — with GPT-5.5 Pro at $30/$180; 1M context window. Batch pricing is 50% off ($2.50/$15) [10]. (The "first full retrain since GPT-4.5" characterization is unverified; GPT-5.4 preceded it and GPT-5.6 has since shipped.)
Gemini 3.5 Flash launched May 19, 2026 at $1.50 / $9.00 per MTok with 1M context [3]. It is the lowest published coding-tier price from a frontier lab, though cheaper, less capable options exist (Gemini 3.1 Flash-Lite ~$0.25/$1.50; Claude Haiku 4.5 $1/$5).
Anthropic Batch API offers 50% off standard rates; prompt caching cuts cached-input cost by ~90% [9].
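To make the two discount levers concrete, here is a minimal Python sketch applied to the Opus 4.8 list price on a 100K-input / 10K-output call. The function names and the 80% cached share are illustrative assumptions. The sketch ignores any cache-write surcharge and does not model combining the two discounts, since the brief does not cover either.

```python
# Prices in USD per million tokens (MTok), as listed in this brief.
OPUS_48 = {"input": 5.00, "output": 25.00}


def base_cost(price, in_tok, out_tok):
    """Undiscounted cost of one call in USD."""
    return (in_tok * price["input"] + out_tok * price["output"]) / 1_000_000


def batch_cost(price, in_tok, out_tok, batch_discount=0.50):
    """Cost with the Batch API discount (50% off, per the brief)."""
    return base_cost(price, in_tok, out_tok) * (1 - batch_discount)


def cached_cost(price, in_tok, out_tok, cached_fraction, cache_discount=0.90):
    """Cost when a share of the input is read from the prompt cache.

    cached_fraction is a workload assumption; cache_discount is the
    brief's ~90% figure. Cache-write costs are deliberately ignored.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached_tok = in_tok * cached_fraction
    fresh_tok = in_tok - cached_tok
    billable_input = fresh_tok + cached_tok * (1 - cache_discount)
    return (billable_input * price["input"] + out_tok * price["output"]) / 1_000_000


if __name__ == "__main__":
    print(f"standard: ${base_cost(OPUS_48, 100_000, 10_000):.3f}")
    print(f"batch:    ${batch_cost(OPUS_48, 100_000, 10_000):.3f}")
    print(f"cached:   ${cached_cost(OPUS_48, 100_000, 10_000, 0.8):.3f}")
```

Under these assumptions the same call drops from about $0.75 to about $0.375 with batching, or to about $0.39 when 80% of the input is cached. Output tokens are untouched by caching, so the saving shrinks on output-heavy runs.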

4. Numbers & Data

Model	Input $/MTok	Output $/MTok	SWE-bench Verified	SWE-bench Pro	Terminal-Bench 2.x	Source
Claude Opus 4.8	$5.00	$25.00	—	~69.2%	—	anthropic.com [6] / [23]
Claude Opus 4.7	$5.00	$25.00	87.6% (unverified)	64.3% (unverified)	69.4% (unverified)	anthropic.com [7]
Claude Sonnet 5	$2.00 intro / $3.00	$10.00 intro / $15.00	85.2%	63.2%	—	marktechpost.com [23] / llm-stats [5]
Claude Sonnet 4.5 (superseded)	$3.00	$15.00	—	—	—	anthropic.com [4]
Claude Haiku 4.5	$1.00	$5.00	weak for coding	—	—	anthropic.com [9]
GPT-5.5 Thinking	$5.00	$30.00	—	—	82.7% (unverified)	openrouter.ai [10]
GPT-5.5 Pro	$30.00	$180.00	—	—	—	openrouter.ai [10]
Gemini 3.5 Flash	$1.50	$9.00	—	~55.1% (unverified)	76.2% (unverified)	ai.google.dev [3]
Gemini 3.1 Pro	$2.50 (unverified)	$15.00 (unverified)	—	—	—	ai.google.dev [13]
Qwen3.7-Max (Alibaba)	low (unverified)	low (unverified)	>80% (unverified)	~60.6% (unverified)	—	towardsai.net [20]

Subscription products (mid-2026, USD/month):

Product	Cheapest paid tier	Top tier	Source
GitHub Copilot	$10 (Pro)	$100 (Max) for individuals; Business/Enterprise separate	github.com [1][2]
Cursor	$20 (Pro)	$200 (Ultra)	cursor.com [14][25]
Windsurf	$20 (Pro, daily/weekly quotas)	enterprise contract	codeium.com [15][26]
Claude Code	$20 (Pro)	$200 (Max 20x)	claude.com [21]
OpenAI Codex CLI	free (BYOK tokens)	n/a	openai.com/codex [12]
Cline / Aider	free (BYOK)	n/a	github.com/cline [16], aider.chat [17]
Amazon Kiro	~$19–$40 (unverified)	enterprise (unverified)	amazon press release (unverified)
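Because Copilot Pro now bills by credits, its effective monthly price depends on usage. The Python sketch below estimates a bill from the figures in this brief. It assumes all 1,500 credits (base plus flex) count as included, and it treats the reported ~$0.01 per credit overage as a flat rate, which is unconfirmed. The function name is hypothetical.

```python
def copilot_pro_monthly_cost(credits_used, base_fee=10.00,
                             included_credits=1500, overage_per_credit=0.01):
    """Estimated Copilot Pro bill in USD for one month.

    overage_per_credit is the reported (not confirmed) ~$0.01 figure.
    """
    if credits_used < 0:
        raise ValueError("credits_used must be non-negative")
    extra_credits = max(0, credits_used - included_credits)
    return base_fee + extra_credits * overage_per_credit


if __name__ == "__main__":
    for used in (1000, 1500, 2500, 4000):
        print(used, f"${copilot_pro_monthly_cost(used):.2f}")
```

On these assumptions a user burning about 2,500 credits a month pays roughly $20, the same as the Cursor or Windsurf Pro tier, so the $10 headline price only holds for usage inside the included allowance.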

Derived cost per solved task (back-of-envelope, illustrative only — no vendor publishes first-party $/task):

Agent	Per-call cost (100K in / 10K out)	SWE-bench Pro solve rate	Cost per solved task
Gemini 3.5 Flash via Cursor/Aider	~$0.24	~55%	~$0.44
Claude Sonnet 5 via Claude Code (intro pricing)	~$0.30	~63.2%	~$0.48
Claude Opus 4.8 via Claude Code	~$0.75	~69.2%	~$1.09
GPT-5.5 Thinking via Codex CLI	~$0.80	~65% (est.)	~$1.23
GPT-5.5 Pro via Codex CLI	~$4.80	frontier	~$5+

At intro pricing, Sonnet 5 posts one of the lowest cost-per-solved-task figures of any accuracy-competitive model — roughly on par with Flash on dollars while solving materially more. Anthropic's Batch API (50% off) and prompt caching (~90% off cached input) can drop the Claude rows further on workloads with stable system prompts [9].
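The derived table can be recomputed from the listed prices and solve rates. The Python sketch below does that and adds one row that is not in the table: Sonnet 5 at its list price, derived from the $3/$15 figure quoted in this brief. The class and function names are illustrative, and the 65% GPT-5.5 solve rate is the brief's own estimate. Results agree with the table to within about a cent of rounding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per MTok
    output_price: float  # USD per MTok
    solve_rate: float    # SWE-bench Pro fraction, as listed in the brief

    def per_call(self, in_tok=100_000, out_tok=10_000):
        """Cost of a single run in USD."""
        return (in_tok * self.input_price + out_tok * self.output_price) / 1_000_000

    def per_solved(self, in_tok=100_000, out_tok=10_000):
        """Expected cost per solved task, assuming one attempt per task."""
        return self.per_call(in_tok, out_tok) / self.solve_rate


MODELS = [
    Model("Gemini 3.5 Flash", 1.50, 9.00, 0.55),
    Model("Claude Sonnet 5 (intro)", 2.00, 10.00, 0.632),
    Model("Claude Sonnet 5 (list)", 3.00, 15.00, 0.632),  # derived row
    Model("Claude Opus 4.8", 5.00, 25.00, 0.692),
    Model("GPT-5.5 Thinking", 5.00, 30.00, 0.65),  # solve rate is an estimate
]


def ranked(models):
    """Return models sorted from cheapest to priciest per solved task."""
    return sorted(models, key=lambda m: m.per_solved())


if __name__ == "__main__":
    for m in ranked(MODELS):
        print(f"{m.name:26s} ${m.per_call():.2f}/call  ${m.per_solved():.2f}/solved")
```

At list price the Sonnet 5 row works out to about $0.45 per call and about $0.71 per solved task, still below Opus 4.8 but no longer level with Flash.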

5. Perspectives

Anthropic positions Opus 4.8 as the coding-quality leader ($5/$25) and Sonnet 5 as the value play — near-Opus coding at $2/$10 intro / $3/$15 list, superseding Sonnet 4.6/4.5 [6][23].
OpenAI doubled GPT-5.5 per-token prices vs the GPT-5.4 family but argues Terminal-Bench leadership and the free open-source Codex CLI justify the premium [10][11].
Google is the price-performance aggressor with Gemini 3.5 Flash at $1.50/$9, positioned against Claude Haiku and GPT-5-mini-class models [3].
Cursor / Anysphere and Codeium / Windsurf run credit- and quota-based subscription pricing, which converts to effective per-token rates that depend heavily on workload (Cursor's June 2025 post acknowledged the credit model made cost unpredictable) [14].
Cline and Aider, both open source, are the only products where the agent itself is free and users bring their own API keys; total cost = pure model token cost [16][17].
Analyst view (aggregators): benchmark coverage skews toward SWE-bench Verified/Pro; cost-per-task figures in third-party blogs are almost universally secondary aggregations of vendor model-card claims and should be treated as directional, not authoritative [18][19][23].

6. What To Do

Default small-team setup (≤5 engineers, mixed work): GitHub Copilot Pro at $10/mo + Claude Sonnet 5 API for hard problems via Aider. Total ~$10–50/eng/month, and Sonnet 5's intro pricing makes the frontier-quality path unusually cheap right now [1][5][17].
High-volume mechanical work (test generation, mass refactor, migrations): Point Aider or Cline at Gemini 3.5 Flash. ~$0.24/call, ~3× cheaper than Opus 4.8, acceptable accuracy for non-critical tasks [3][16][17].
Hard SWE-bench-class tasks (production bug fixes, security patches, multi-file refactors): Claude Sonnet 5 first (best accuracy-per-dollar through 2026-08-31), escalating to Opus 4.8 for the hardest cases (~69% solve rate); use the Anthropic Batch API for non-urgent queues to halve cost [5][6][9][21].
Cost discipline: Enable Anthropic prompt caching (~90% off cached input) for any agent loop with a stable system prompt; route low-priority work through the Batch API; set per-session token budgets in your agent harness.
Avoid: GPT-5.5 Pro ($30/$180) unless Terminal-Bench-class leadership is business-critical; and any pricing claims (e.g. exotic model variants) surfacing only in secondary blogs without primary-source confirmation.
Watch list for the next 30–60 days: Sonnet 5 list-price step-up on 2026-09-01 (intro $2/$10 → $3/$15); GPT-5.6 pricing (already shipping as of July 2026); Gemini 3.1 Pro pricing (still not primary-source confirmed); and any first-party cost-per-task disclosures from independent evaluators.
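The routing and budget advice above could be wired into an agent harness roughly as follows. This is a Python sketch under stated assumptions, not a reference implementation: the task categories, the model labels (descriptive strings, not real API model identifiers), and the `SessionBudget` class are all hypothetical, and the budget is expressed as a USD cap computed from token counts rather than as a raw token cap.

```python
from dataclasses import dataclass

# USD per MTok (input, output), from this brief. Labels are illustrative.
PRICES = {
    "gemini-3.5-flash": (1.50, 9.00),
    "claude-sonnet-5": (2.00, 10.00),   # intro pricing through 2026-08-31
    "claude-opus-4.8": (5.00, 25.00),
}

# Hypothetical task categories mapped to the brief's recommendations.
ROUTES = {
    "mechanical": "gemini-3.5-flash",  # tests, refactors, migrations
    "hard": "claude-sonnet-5",         # SWE-bench-class problems
    "hardest": "claude-opus-4.8",      # escalation path
}


def pick_model(task_kind):
    """Map a task category to a model label."""
    try:
        return ROUTES[task_kind]
    except KeyError:
        raise ValueError(f"unknown task kind: {task_kind!r}") from None


@dataclass
class SessionBudget:
    limit_usd: float
    spent_usd: float = 0.0

    def charge(self, model, in_tok, out_tok):
        """Record a call's cost, refusing it if the cap would be exceeded."""
        in_price, out_price = PRICES[model]
        cost = (in_tok * in_price + out_tok * out_price) / 1_000_000
        if self.spent_usd + cost > self.limit_usd:
            raise RuntimeError("session budget exceeded")
        self.spent_usd += cost
        return cost


if __name__ == "__main__":
    budget = SessionBudget(limit_usd=1.00)
    model = pick_model("mechanical")
    print(model, f"${budget.charge(model, 100_000, 10_000):.2f}")
```

Checking the cap before recording the spend means a rejected call leaves the running total unchanged, so the harness can fall back to a cheaper route instead of ending the session.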

References

[1] GitHub Copilot · Plans & pricing · GitHub: https://github.com/features/copilot/plans
[2] GitHub Copilot is moving to usage-based billing · GitHub Blog: https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/
[3] Gemini Developer API pricing | Google AI for Developers [first_party]: https://ai.google.dev/gemini-api/docs/pricing
[4] Introducing Claude Sonnet 4.5 [first_party]: https://www.anthropic.com/news/claude-sonnet-4-5
[5] Claude Sonnet 5: Benchmarks, Pricing & Context Window - llm-stats: https://llm-stats.com/models/claude-sonnet-5
[6] Anthropic - Claude Opus (news) [first_party]: https://www.anthropic.com/news/claude-opus-4-7
[7] Anthropic - Claude Opus (news) [first_party]: https://www.anthropic.com/news/claude-opus-4-5
[8] Claude Sonnet is generally available in GitHub Copilot - GitHub Changelog: https://github.blog/changelog/
[9] Anthropic API Pricing 2026 (caching, batch) [content_marketing]: https://www.finout.io/blog/anthropic-api-pricing
[10] GPT-5.5 - API Pricing & Benchmarks | OpenRouter: https://openrouter.ai/openai/gpt-5.5
[11] OpenAI Codex CLI - SourceForge mirror: https://sourceforge.net/projects/openai-codex.mirror/files/
[12] Codex | AI Coding Partner from OpenAI [first_party]: https://openai.com/codex/
[13] Gemini 3 Developer Guide | Google AI for Developers [first_party]: https://ai.google.dev/gemini-api/docs/gemini-3
[14] Clarifying our pricing · Cursor [content_marketing]: https://cursor.com/blog/june-2025-pricing
[15] Some changes to our pricing model for Cascade - Codeium/Windsurf: https://codeium.com/blog/pricing-windsurf
[16] cline/cline - Autonomous coding agent (GitHub): https://github.com/cline/cline
[17] Advanced model settings | aider [first_party]: https://aider.chat/docs/config/adv-model-settings.html
[18] The 9 Top LLMs of 2026 (in Russian) - vc.ru: https://vc.ru/ai/2905428-luchshie-llm-dlya-zadach
[19] Gemini 3.5 Pro: Release Date, Expected Specs - ofox.ai [content_marketing]: https://ofox.ai/blog/gemini-3-5-pro-release-date-expected-specs-2026/
[20] Qwen3.7 Max, Google's Antigravity U-Turn - Towards AI: https://pub.towardsai.net/qwen3-7-max-googles-antigravity-u-turn-and-a-wild-48-hours-in-ai-bd45589d9e18
[21] Claude Code by Anthropic [vendor_marketing]: https://claude.com/product/claude-code
[22] GitHub Copilot Pricing 2026 - No Code MBA: https://www.nocode.mba/articles/github-copilot-pricing
[23] Anthropic Claude Sonnet 5 vs Sonnet 4.6 vs Opus 4.8 - MarkTechPost: https://www.marktechpost.com/2026/06/30/anthropic-claude-sonnet-5-vs-sonnet-4-6-vs-opus-4-8-agentic-coding-benchmarks-api-pricing-and-cost-performance-tradeoffs-compared/
[24] GPT-5.5 Pricing - apidog: https://apidog.com/blog/gpt-5-5-pricing/
[25] Cursor Pricing in 2026 (Hobby/Pro/Pro+/Ultra/Teams) - DEV Community: https://dev.to/rahulxsingh/cursor-pricing-in-2026-hobby-pro-pro-ultra-teams-and-enterprise-plans-explained-4b89
[26] Windsurf Pricing 2026 - No Code MBA: https://www.nocode.mba/articles/windsurf-pricing

Verification notes

Corrected the central Anthropic error (newer model reported as worse): the draft recommended the superseded Sonnet 4.5 as the current value coding model and framed Sonnet 5 as "same tier but lower benchmark scores." Sonnet 5 (June 30, 2026) is the newer, cheaper, higher-scoring model (85.2% SWE-bench Verified; intro $2/$10 through 2026-08-31, list $3/$15) and now anchors the value recommendation [5][23]. Its 63.2% vs Opus 4.8's 69.2% on SWE-bench Pro is a cross-tier gap, not a regression, and is kept.
Removed invented history: the "GitHub Copilot Pro cut from $19 to $10 in April 2026" claim could not be verified; Copilot Pro is and has been $10/month. Added the missing individual Max tier ($100/mo) [1][2][22].
Confirmed and kept: Gemini 3.5 Flash $1.50/$9, 1M ctx, May 19 2026 [3]; GPT-5.5 April 23 2026 at $5/$30 (Pro $30/$180), doubled from GPT-5.4, Batch 50% off [10][24]; Copilot AI Credits since June 1 2026 [2]; Cursor $20/$60/$200/$40 [25]; Windsurf Pro $20 with quotas since March 19 2026 (grandfathered $15) [26].
Cut unverifiable specifics: Opus 4.8 "Fast Mode $30/$150 → $10/$50" and "~4× fewer code flaws"; GPT-5.5 as "first full retrain since GPT-4.5"; a Cursor "$20 credit pool" figure. Marked several secondary benchmark numbers (Terminal-Bench, Gemini 3.1 Pro pricing, Qwen figures) "(unverified)" where only aggregator sources exist.
Anthropic per-token pricing (Opus 4.8 $5/$25; Sonnet 5 $3/$15 list, $2/$10 intro; Haiku 4.5 $1/$5) verified against the Claude API model/pricing reference (current as of 2026-06-24) and reconciled with third-party reporting [5][23].
Nuanced the "cheapest per-token" claim: Gemini 3.5 Flash is the lowest coding-tier price, but Haiku 4.5 ($1/$5) and Gemini 3.1 Flash-Lite (~$0.25/$1.50) are cheaper per token at lower capability — noted explicitly rather than overstated.
