Which Coding Agent Is Cheapest Per Solved Task? (Mid-2026 Desk Brief)
As of 2026-07-03. Pricing and benchmarks shift every few weeks; treat anything not dated within the last 90 days as historical.
1. Executive Summary
The cheapest subscription product for typical individual use is GitHub Copilot Pro at $10/month (as of 2026-07-03). It moved to usage-based AI Credits on June 1, 2026: Pro includes 1,500 monthly credits (1,000 base + 500 flex), and inline completions are free on every paid plan [1][2].

The cheapest per-token model capable enough to run as a coding agent is Google Gemini 3.5 Flash at $1.50 input / $9.00 output per million tokens (1M context, launched May 19, 2026) [3]. Claude Haiku 4.5 is cheaper per token ($1/$5) but weaker at agentic coding.

The best-value model that is also near state-of-the-art at coding is now Claude Sonnet 5 (launched June 30, 2026), at introductory pricing of $2/$10 per MTok through 2026-08-31 (list $3/$15 from September). It delivers near-Opus coding quality: 85.2% SWE-bench Verified and 63.2% SWE-bench Pro [4][5]. It trails Opus 4.8 on SWE-bench Pro (63.2% vs 69.2%) because it sits a tier below, not because it regressed; it improves on Sonnet 4.6/4.5, which it supersedes [5][23].

For pure "cost per solved task," no vendor publishes a first-party USD figure. Back-of-envelope math on a 100K-input / 10K-output run puts Gemini 3.5 Flash at ~$0.24 per call vs Claude Opus 4.8 at ~$0.75. Flash solves roughly 20% fewer tasks on SWE-bench Pro (~55% vs ~69%), so dividing per-call cost by solve rate gives roughly $0.44 per solved task (Flash) vs $1.08 (Opus 4.8). Flash wins on raw economics for high-volume, error-tolerant workloads; Opus wins when accuracy per task outweighs dollars per task.

If you must pick one default today, run Gemini 3.5 Flash for high-volume mechanical work (test generation, refactors, docstrings) and Claude Sonnet 5 for hard SWE-bench-class problems. At intro pricing Sonnet 5 is the standout accuracy-per-dollar choice, with Opus 4.8 reserved for the hardest cases.
2. What's Current — Shipping Versions Snapshot
| Product / Model | Latest Version | Release Date | Source |
|---|---|---|---|
| Claude Opus 4.8 (Anthropic) | Opus 4.8 | 2026 (exact date unverified) | anthropic.com [6] |
| Claude Opus 4.7 (Anthropic) | Opus 4.7 | 2026 (exact date unverified) | anthropic.com [7] |
| Claude Sonnet 5 (Anthropic) | Sonnet 5 | June 30, 2026 | marktechpost.com [23] |
| Claude Sonnet 4.6 (Anthropic) | Sonnet 4.6 | Feb 2026 (GitHub Copilot GA; exact date unverified) | github.blog [8] |
| Claude Haiku 4.5 (Anthropic) | Haiku 4.5 | Active ($1/$5, 200K ctx) | anthropic pricing [9] |
| GPT-5.5 Thinking (OpenAI) | GPT-5.5 | April 23, 2026 | openrouter.ai [10] |
| OpenAI Codex CLI | open-source (~v0.142.x) | 2026 (open-sourced) | openai.com/codex [12] |
| Gemini 3.5 Flash (Google) | 3.5 Flash | May 19, 2026 | ai.google.dev [3] |
| Gemini 3.1 Pro (Google) | 3.1 Pro | late 2025 / early 2026 (unverified pricing) | ai.google.dev [13] |
| Cursor | Hobby / Pro / Pro+ / Ultra / Teams | Credit model in effect since June 2025 | cursor.com [14] |
| Windsurf / Cascade | Pro / Teams / Enterprise | Re-priced March 19, 2026 | codeium.com [15] |
| GitHub Copilot | Free / Pro / Pro+ / Max / Business / Enterprise | AI Credits since June 1, 2026 | github.com [1][2] |
| Cline (open source) | v3.25+ | continuous | github.com/cline [16] |
| Aider (open source) | latest | continuous | aider.chat [17] |
| Replit Agent | current | continuous | replit.com |
| Devin (Cognition) | current | continuous | devin.ai |
| Amazon Kiro | GA | 2026 (unverified) | amazon press release (unverified) |
3. Key Facts
- GitHub Copilot Pro is $10/month (as of 2026-07-03), with Pro+ at $39/month and a new individual Max tier at $100/month; the pricing model moved from premium-request-units to AI Credits on June 1, 2026 (overage ~$0.01/credit, reported), and code completions are free on all paid plans [1][2]. (The earlier brief's claim that Pro was "cut from $19 to $10 in April 2026" could not be verified — Copilot Pro has been $10/month; this appears to be invented history and has been removed.)
- OpenAI's Codex CLI is open-source, making the binary free; users still pay OpenAI per token for completions [11][12].
- Cursor's Pro plan is $20/month on a usage-credit model; Pro+ is $60/month (~3× Pro credits), Ultra is $200/month (20× Pro credits), and Teams is $40/user/month [14].
- Windsurf/Cascade Pro is $20/month (raised from $15 on March 19, 2026; pre-change subscribers grandfathered at $15) and switched from credits to daily/weekly quotas [15].
- Claude Sonnet 5 launched June 30, 2026 at intro $2/$10 per MTok through 2026-08-31 ($3/$15 thereafter), 1M context — 85.2% SWE-bench Verified, 63.2% SWE-bench Pro; it trails Opus 4.8 (69.2% SWE-bench Pro) as a lower-tier sibling but improves on Sonnet 4.6/4.5 [5][23][24].
- Claude Opus 4.8 is $5/$25 per MTok with a 1M context window; it is the coding-quality leader (69.2% SWE-bench Pro) [6][23]. *(Specific Fast Mode price figures and a "4× fewer code flaws vs 4.7" claim from the earlier draft could not be verified and have been cut.)*
- GPT-5.5 (April 23, 2026) doubled per-token pricing versus GPT-5.4: input $2.50→$5.00 and output $15→$30 (Thinking), with GPT-5.5 Pro at $30/$180 and a 1M context window. Batch pricing is 50% off ($2.50/$15) [10]. (The "first full retrain since GPT-4.5" characterization is unverified; GPT-5.4 preceded it and GPT-5.6 has since shipped.)
- Gemini 3.5 Flash launched May 19, 2026 at $1.50 / $9.00 per MTok with 1M context [3] — the lowest published coding-tier price from a frontier lab, though cheaper non-coding options exist (Gemini 3.1 Flash-Lite ~$0.25/$1.50; Claude Haiku 4.5 $1/$5).
- Anthropic Batch API offers 50% off standard rates; prompt caching cuts cached-input cost by ~90% [9].
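The batch and caching discounts quoted above can be turned into a rough per-call estimate. The following is a minimal sketch, not Anthropic's billing logic: the function name `call_cost` and the `cached_fraction` parameter are hypothetical, cache-write surcharges are ignored, and the assumption that the two discounts stack multiplicatively is not confirmed by the sources cited here.

```python
# Prices in USD per million tokens, as quoted in this brief for Claude Opus 4.8.
OPUS_INPUT_PER_MTOK = 5.00
OPUS_OUTPUT_PER_MTOK = 25.00

# Discount factors as quoted in this brief; both are approximations.
BATCH_DISCOUNT = 0.50         # Batch API: 50% off standard rates
CACHED_INPUT_DISCOUNT = 0.90  # prompt caching: ~90% off cached input


def call_cost(input_tokens, output_tokens,
              input_price=OPUS_INPUT_PER_MTOK,
              output_price=OPUS_OUTPUT_PER_MTOK,
              cached_fraction=0.0, batch=False):
    """Estimate the USD cost of one call.

    cached_fraction is the (hypothetical) share of input tokens served
    from the prompt cache. ASSUMPTIONS: cache-write surcharges are ignored,
    and the batch and caching discounts are assumed to stack.
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    # Cached tokens are billed at (1 - discount) of the normal input price.
    billable_input = uncached + cached * (1 - CACHED_INPUT_DISCOUNT)
    total = (billable_input * input_price + output_tokens * output_price) / 1e6
    return total * (1 - BATCH_DISCOUNT) if batch else total
```

Under these assumptions, a 100K-input / 10K-output Opus 4.8 call costs about $0.75 undiscounted, about $0.39 with 80% of the input cached, and about $0.20 if that same call also runs through the Batch API.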
4. Numbers & Data
| Model | Input $/MTok | Output $/MTok | SWE-bench Verified | SWE-bench Pro | Terminal-Bench 2.x | Source |
|---|---|---|---|---|---|---|
| Claude Opus 4.8 | $5.00 | $25.00 | — | ~69.2% | — | anthropic.com [6] / [23] |
| Claude Opus 4.7 | $5.00 | $25.00 | 87.6% (unverified) | 64.3% (unverified) | 69.4% (unverified) | anthropic.com [7] |
| Claude Sonnet 5 | $2.00 intro / $3.00 | $10.00 intro / $15.00 | 85.2% | 63.2% | — | marktechpost.com [23] / llm-stats [5] |
| Claude Sonnet 4.5 (superseded) | $3.00 | $15.00 | — | — | — | anthropic.com [4] |
| Claude Haiku 4.5 | $1.00 | $5.00 | weak for coding | — | — | anthropic.com [9] |
| GPT-5.5 Thinking | $5.00 | $30.00 | — | — | 82.7% (unverified) | openrouter.ai [10] |
| GPT-5.5 Pro | $30.00 | $180.00 | — | — | — | openrouter.ai [10] |
| Gemini 3.5 Flash | $1.50 | $9.00 | — | ~55.1% (unverified) | 76.2% (unverified) | ai.google.dev [3] |
| Gemini 3.1 Pro | $2.50 (unverified) | $15.00 (unverified) | — | — | — | ai.google.dev [13] |
| Qwen3.7-Max (Alibaba) | low (unverified) | low (unverified) | >80% (unverified) | ~60.6% (unverified) | — | towardsai.net [20] |
Subscription products (mid-2026, USD/month):
| Product | Cheapest paid tier | Top tier | Source |
|---|---|---|---|
| GitHub Copilot | $10 (Pro) | $100 (Max) for individuals; Business/Enterprise separate | github.com [1][2] |
| Cursor | $20 (Pro) | $200 (Ultra) | cursor.com [14][25] |
| Windsurf | $20 (Pro, daily/weekly quotas) | enterprise contract | codeium.com [15][26] |
| Claude Code | $20 (Pro) | $200 (Max 20x) | claude.com [21] |
| OpenAI Codex CLI | free (BYOK tokens) | n/a | openai.com/codex [12] |
| Cline / Aider | free (BYOK) | n/a | github.com/cline [16], aider.chat [17] |
| Amazon Kiro | ~$19–$40 (unverified) | enterprise (unverified) | amazon press release (unverified) |
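For the credit-metered Copilot Pro plan, a monthly bill can be approximated from the figures quoted in this brief ($10 base, 1,500 included credits, roughly $0.01 per overage credit). This is a sketch under stated assumptions: the function name is hypothetical, the overage rate is only reported rather than primary-source confirmed, and it assumes flex credits are consumed like base credits before any overage applies.

```python
PRO_MONTHLY_USD = 10.00
PRO_INCLUDED_CREDITS = 1_500   # 1,000 base + 500 flex, per this brief
OVERAGE_USD_PER_CREDIT = 0.01  # reported figure, treat as an assumption


def copilot_pro_monthly_bill(credits_used: int) -> float:
    """Approximate monthly USD bill for Copilot Pro given credits consumed."""
    if credits_used < 0:
        raise ValueError("credits_used must be non-negative")
    # Only credits beyond the included allowance are billed as overage.
    overage_credits = max(0, credits_used - PRO_INCLUDED_CREDITS)
    return round(PRO_MONTHLY_USD + overage_credits * OVERAGE_USD_PER_CREDIT, 2)
```

On these assumptions, a user who burns 2,500 credits in a month would pay about $20, which is the same as the cheapest paid Cursor or Windsurf tier listed above.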
Derived cost per solved task (back-of-envelope, illustrative only — no vendor publishes first-party $/task):
| Agent | Per-call cost (100K in / 10K out) | SWE-bench Pro solve rate | Cost per solved task |
|---|---|---|---|
| Gemini 3.5 Flash via Cursor/Aider | ~$0.24 | ~55% | ~$0.44 |
| Claude Sonnet 5 via Claude Code (intro pricing) | ~$0.30 | ~63.2% | ~$0.47 |
| Claude Opus 4.8 via Claude Code | ~$0.75 | ~69.2% | ~$1.08 |
| GPT-5.5 Thinking via Codex CLI | ~$0.80 | ~65% (est.) | ~$1.23 |
| GPT-5.5 Pro via Codex CLI | ~$4.80 | frontier | ~$5+ |
At intro pricing, Sonnet 5 posts one of the lowest cost-per-solved-task figures of any accuracy-competitive model — roughly on par with Flash on dollars while solving materially more. Anthropic's Batch API (50% off) and prompt caching (~90% off cached input) can drop the Claude rows further on workloads with stable system prompts [9].
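The arithmetic behind the derived table can be reproduced in a few lines. This sketch assumes the same simplification the table does: one attempt per task at 100K input / 10K output tokens, failed attempts cost as much as successful ones, and cost per solved task is per-call cost divided by solve rate. The `Model` class and function names are illustrative, and the GPT-5.5 Thinking solve rate is the brief's own estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_per_mtok: float   # USD per million input tokens
    output_per_mtok: float  # USD per million output tokens
    solve_rate: float       # SWE-bench Pro solve rate as a fraction


# Prices and solve rates as listed in this brief.
MODELS = [
    Model("Gemini 3.5 Flash", 1.50, 9.00, 0.55),
    Model("Claude Sonnet 5 (intro)", 2.00, 10.00, 0.632),
    Model("Claude Opus 4.8", 5.00, 25.00, 0.692),
    Model("GPT-5.5 Thinking", 5.00, 30.00, 0.65),  # solve rate is an estimate
]


def per_call_cost(m: Model, input_tokens: int = 100_000,
                  output_tokens: int = 10_000) -> float:
    """USD cost of a single agent run at the given token counts."""
    return (input_tokens * m.input_per_mtok
            + output_tokens * m.output_per_mtok) / 1e6


def cost_per_solved_task(m: Model, input_tokens: int = 100_000,
                         output_tokens: int = 10_000) -> float:
    """Expected USD spent per successfully solved task."""
    if m.solve_rate <= 0:
        raise ValueError("solve_rate must be positive")
    return per_call_cost(m, input_tokens, output_tokens) / m.solve_rate


if __name__ == "__main__":
    for m in MODELS:
        print(f"{m.name}: ${per_call_cost(m):.2f}/call, "
              f"${cost_per_solved_task(m):.2f}/solved task")
```

With the table's inputs this reproduces the per-call column and lands within a cent of the cost-per-solved-task column. Changing the token counts or solve rates shows how sensitive the ranking is to workload shape.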
5. Perspectives
- Anthropic positions Opus 4.8 as the coding-quality leader ($5/$25) and Sonnet 5 as the value play — near-Opus coding at $2/$10 intro / $3/$15 list, superseding Sonnet 4.6/4.5 [6][23].
- OpenAI doubled GPT-5.5 per-token prices vs the GPT-5.4 family but argues Terminal-Bench leadership and the free open-source Codex CLI justify the premium [10][11].
- Google is the price-performance aggressor with Gemini 3.5 Flash at $1.50/$9, positioned against Claude Haiku and GPT-5-mini-class models [3].
- Cursor / Anysphere and Codeium / Windsurf run credit- and quota-based subscription pricing, which converts to effective per-token rates that depend heavily on workload (Cursor's June 2025 post acknowledged the credit model made cost unpredictable) [14].
- Cline and Aider, both open source, are the only products where the agent itself is free and users bring their own API keys; total cost = pure model token cost [16][17].
- Analyst view (aggregators): benchmark coverage skews toward SWE-bench Verified/Pro; cost-per-task figures in third-party blogs are almost universally secondary aggregations of vendor model-card claims and should be treated as directional, not authoritative [18][19][23].
6. What To Do
- Default small-team setup (≤5 engineers, mixed work): GitHub Copilot Pro at $10/mo + Claude Sonnet 5 API for hard problems via Aider. Total ~$10–50/eng/month, and Sonnet 5's intro pricing makes the frontier-quality path unusually cheap right now [1][5][17].
- High-volume mechanical work (test generation, mass refactor, migrations): Point Aider or Cline at Gemini 3.5 Flash. ~$0.24/call, ~3× cheaper than Opus 4.8, acceptable accuracy for non-critical tasks [3][16][17].
- Hard SWE-bench-class tasks (production bug fixes, security patches, multi-file refactors): Claude Sonnet 5 first (best accuracy-per-dollar through 2026-08-31), escalating to Opus 4.8 for the hardest cases (~69% solve rate); use the Anthropic Batch API for non-urgent queues to halve cost [5][6][9][21].
- Cost discipline: Enable Anthropic prompt caching (~90% off cached input) for any agent loop with a stable system prompt; route low-priority work through the Batch API; set per-session token budgets in your agent harness.
- Avoid: GPT-5.5 Pro ($30/$180) unless Terminal-Bench-class leadership is business-critical; and any pricing claims (e.g. exotic model variants) surfacing only in secondary blogs without primary-source confirmation.
- Watch list for the next 30–60 days: Sonnet 5 list-price step-up on 2026-09-01 (intro $2/$10 → $3/$15); GPT-5.6 pricing (already shipping as of July 2026); Gemini 3.1 Pro pricing (still not primary-source confirmed); and any cost-per-task disclosures, whether first-party from vendors or from independent evaluators.
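The routing and budget advice above could be wired into an agent harness roughly as follows. This is a sketch only: the model labels are placeholders rather than real API model identifiers, `SessionBudget` and `pick_model` are hypothetical names, and escalating to Opus 4.8 after a single failed attempt is an assumed policy, not something the sources prescribe.

```python
from dataclasses import dataclass

# Placeholder labels for illustration; NOT actual API model identifiers.
FLASH = "gemini-3.5-flash"
SONNET = "claude-sonnet-5"
OPUS = "claude-opus-4.8"


class BudgetExceeded(RuntimeError):
    """Raised when a call would push the session past its token budget."""


@dataclass
class SessionBudget:
    max_tokens: int
    used_tokens: int = 0

    def charge(self, input_tokens: int, output_tokens: int) -> int:
        """Record a call's token usage and return the tokens remaining."""
        total = input_tokens + output_tokens
        if self.used_tokens + total > self.max_tokens:
            raise BudgetExceeded(
                f"call needs {total} tokens, only "
                f"{self.max_tokens - self.used_tokens} left")
        self.used_tokens += total
        return self.max_tokens - self.used_tokens


def pick_model(task_kind: str, failed_attempts: int = 0) -> str:
    """Route mechanical work to Flash and hard work to Sonnet 5,
    escalating to Opus 4.8 once an attempt has failed (assumed policy)."""
    if task_kind == "mechanical":
        return FLASH
    if task_kind == "hard":
        return OPUS if failed_attempts >= 1 else SONNET
    raise ValueError(f"unknown task kind: {task_kind!r}")
```

A harness would call `pick_model` before each attempt and `charge` before sending the request, so a runaway loop stops at the budget instead of at the invoice.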
References
[1] GitHub Copilot · Plans & pricing · GitHub: https://github.com/features/copilot/plans
[2] GitHub Copilot is moving to usage-based billing · GitHub Blog: https://github.blog/news-insights/company-news/github-copilot-is-moving-to-usage-based-billing/
[3] Gemini Developer API pricing | Google AI for Developers [first_party]: https://ai.google.dev/gemini-api/docs/pricing
[4] Introducing Claude Sonnet 4.5 [first_party]: https://www.anthropic.com/news/claude-sonnet-4-5
[5] Claude Sonnet 5: Benchmarks, Pricing & Context Window · llm-stats: https://llm-stats.com/models/claude-sonnet-5
[6] Anthropic · Claude Opus (news) [first_party]: https://www.anthropic.com/news/claude-opus-4-7
[7] Anthropic · Claude Opus (news) [first_party]: https://www.anthropic.com/news/claude-opus-4-5
[8] Claude Sonnet is generally available in GitHub Copilot · GitHub Changelog: https://github.blog/changelog/
[9] Anthropic API Pricing 2026 (caching, batch) [content_marketing]: https://www.finout.io/blog/anthropic-api-pricing
[10] GPT-5.5 · API Pricing & Benchmarks | OpenRouter: https://openrouter.ai/openai/gpt-5.5
[11] OpenAI Codex CLI · SourceForge mirror: https://sourceforge.net/projects/openai-codex.mirror/files/
[12] Codex | AI Coding Partner from OpenAI [first_party]: https://openai.com/codex/
[13] Gemini 3 Developer Guide | Google AI for Developers [first_party]: https://ai.google.dev/gemini-api/docs/gemini-3
[14] Clarifying our pricing · Cursor [content_marketing]: https://cursor.com/blog/june-2025-pricing
[15] Some changes to our pricing model for Cascade · Codeium/Windsurf: https://codeium.com/blog/pricing-windsurf
[16] cline/cline · Autonomous coding agent (GitHub): https://github.com/cline/cline
[17] Advanced model settings | aider [first_party]: https://aider.chat/docs/config/adv-model-settings.html
[18] The 9 main LLMs of 2026 (9 главных LLM 2026 года) · vc.ru: https://vc.ru/ai/2905428-luchshie-llm-dlya-zadach
[19] Gemini 3.5 Pro: Release Date, Expected Specs · ofox.ai [content_marketing]: https://ofox.ai/blog/gemini-3-5-pro-release-date-expected-specs-2026/
[20] Qwen3.7 Max, Google's Antigravity U-Turn · Towards AI: https://pub.towardsai.net/qwen3-7-max-googles-antigravity-u-turn-and-a-wild-48-hours-in-ai-bd45589d9e18
[21] Claude Code by Anthropic [vendor_marketing]: https://claude.com/product/claude-code
[22] GitHub Copilot Pricing 2026 · No Code MBA: https://www.nocode.mba/articles/github-copilot-pricing
[23] Anthropic Claude Sonnet 5 vs Sonnet 4.6 vs Opus 4.8 · MarkTechPost: https://www.marktechpost.com/2026/06/30/anthropic-claude-sonnet-5-vs-sonnet-4-6-vs-opus-4-8-agentic-coding-benchmarks-api-pricing-and-cost-performance-tradeoffs-compared/
[24] GPT-5.5 Pricing · apidog: https://apidog.com/blog/gpt-5-5-pricing/
[25] Cursor Pricing in 2026 (Hobby/Pro/Pro+/Ultra/Teams) · DEV Community: https://dev.to/rahulxsingh/cursor-pricing-in-2026-hobby-pro-pro-ultra-teams-and-enterprise-plans-explained-4b89
[26] Windsurf Pricing 2026 · No Code MBA: https://www.nocode.mba/articles/windsurf-pricing
Verification notes
- Corrected the central Anthropic error (newer-model-reported-as-worse): the draft recommended superseded Sonnet 4.5 as the current value coding model and framed Sonnet 5 as "same tier but lower benchmark scores." Sonnet 5 (June 30, 2026) is the newer, cheaper, higher-scoring model — 85.2% SWE-bench Verified, intro $2/$10 through 2026-08-31 (list $3/$15) — and now anchors the value recommendation [5][23][24]. Its 63.2% vs Opus 4.8's 69.2% SWE-bench Pro is a cross-tier gap, not a regression, and is kept.
- Removed invented history: the "GitHub Copilot Pro cut from $19 to $10 in April 2026" claim could not be verified; Copilot Pro is and has been $10/month. Added the missing individual Max tier ($100/mo) [1][2][22].
- Confirmed and kept: Gemini 3.5 Flash $1.50/$9, 1M ctx, May 19 2026 [3]; GPT-5.5 April 23 2026 at $5/$30 (Pro $30/$180), doubled from GPT-5.4, Batch 50% off [10][24]; Copilot AI Credits since June 1 2026 [2]; Cursor $20/$60/$200/$40 [25]; Windsurf Pro $20 with quotas since March 19 2026 (grandfathered $15) [26].
- Cut unverifiable specifics: Opus 4.8 "Fast Mode $30/$150 → $10/$50" and "~4× fewer code flaws"; GPT-5.5 as "first full retrain since GPT-4.5"; a Cursor "$20 credit pool" figure. Marked several secondary benchmark numbers (Terminal-Bench, Gemini 3.1 Pro pricing, Qwen figures) "(unverified)" where only aggregator sources exist.
- Anthropic per-token pricing (Opus 4.8 $5/$25; Sonnet 5 $3/$15 list, $2/$10 intro; Haiku 4.5 $1/$5) verified against the Claude API model/pricing reference (current as of 2026-06-24) and reconciled with third-party reporting [23][24].
- Nuanced the "cheapest per-token" claim: Gemini 3.5 Flash is the lowest coding-tier price, but Haiku 4.5 ($1/$5) and Gemini 3.1 Flash-Lite (~$0.25/$1.50) are cheaper per token at lower capability — noted explicitly rather than overstated.