Archive

All articles

Every piece we've published, newest first. 189 articles and counting.

July 2026

  1. AI Coding Agents Are Phoning Home. Read the Prompt · AI Tools · July 2, 2026 · 9 min
  2. A 7B Model Beat Claude Opus by Routing, Not Reasoning · Models & Releases · July 2, 2026 · 8 min
  3. Claude Fable 5 Returns With a Government Key in the Lock · Models & Releases · July 1, 2026 · 15 min
  4. Cloudflare Is Rewiring GEO: Block, Charge, or Allow AI Crawlers · Search & GEO · July 1, 2026 · 11 min
  5. Fable 5 Is Back: The Power-User Playbook for Long-Horizon Agents · AI Tools · July 1, 2026 · 16 min

June 2026

  1. Claude Science Is a Workflow Bet, Not a Model Bet · AI Frontiers · June 30, 2026 · 9 min
  2. Claude Sonnet 5 Makes Opus Hard to Justify · AI Frontiers · June 30, 2026 · 11 min
  3. A Delivery Company Trained a 1.6T Coding Model, No Nvidia · AI Frontiers · June 30, 2026 · 8 min
  4. Meta Curbed Claude Code and Codex. Distillation Is Why · AI Tools · June 29, 2026 · 11 min
  5. Brain2Qwerty v2 Hits Real-Time Typing, Still Stuck in a Room · AI Frontiers · June 29, 2026 · 12 min
  6. ChatGPT Logs as Evidence: What the Palisades Fire Trial Means for AI · AI Frontiers · June 29, 2026 · 12 min
  7. Cascaded vs End-to-End Voice Agents: Which Ships in Healthcare? · Agents & Harnesses · June 29, 2026 · 13 min
  8. Pentagon Agent Network: The Multi-Agent Architecture No One Is Parsing · Agents & Harnesses · June 29, 2026 · 10 min
  9. Agent Reliability Needs a Score, Not a Gut Feeling · Agents & Harnesses · June 29, 2026 · 11 min
  10. Beyond Accuracy: The UX Metrics That Decide If AI Products Survive · AI Frontiers · June 29, 2026 · 11 min
  11. Ford's Gray Beard Reversal: What It Teaches About AI Limits · AI Frontiers · June 29, 2026 · 10 min
  12. The 100-Tool Agent Trap: Why Less Is More in Production · Agents & Harnesses · June 28, 2026 · 10 min
  13. Shiller Was Right About AI Doommaxxing. The Polling Proves It · AI Frontiers · June 28, 2026 · 12 min
  14. STT API Showdown: TTFT vs WER in 2026 · AI Frontiers · June 28, 2026 · 10 min
  15. A Clinical Scribe Fell to Three Prompts. The VA Scaled It to 130 Sites · Security & Safety · June 28, 2026 · 12 min
  16. OpenAI's Jalapeño Chip Is an Inference Hedge, Not a Nvidia Killer · AI Economics · June 28, 2026 · 12 min
  17. Your AI Agent Went Rogue on Friday. Here's the Postmortem · Agents & Harnesses · June 28, 2026 · 17 min
  18. Five Weeks Until EU AI Act High-Risk Day. Is Your Stack Ready? · Security & Safety · June 28, 2026 · 11 min
  19. LLM-as-Judge Reliability: The Cohen's Kappa Every Production Eval Needs · Model Evaluation · June 28, 2026 · 12 min
  20. Why Memory Bandwidth, Not Compute, Is the LLM Inference Bottleneck · Memory & Context · June 28, 2026 · 12 min
  21. Government-Gated AI: Who Decides What Models You Can Run · AI Frontiers · June 28, 2026 · 10 min
  22. Voice AI Under 500ms: The Latency Budget That Decides Who Ships · AI Economics · June 27, 2026 · 12 min
  23. 15 Agent Benchmarks, Zero Safety Scores. Here's the Fix. · Model Evaluation · June 27, 2026 · 12 min
  24. LPU vs GPU Inference: Groq's 70% Latency Win, Decoded · Model Evaluation · June 27, 2026 · 12 min
  25. 92% Blew Their AI Budget. AI FinOps Is the Fix · AI Economics · June 27, 2026 · 11 min
  26. The "US" vs Them: Fable Off, GPT-5.6 Gated · Models & Releases · June 26, 2026 · 18 min
  27. GPT-5.6 Deployment Starts Behind a Federal Gate · Models & Releases · June 26, 2026 · 17 min
  28. Reasoning Models Break Guardrails 97% of the Time. Score It Like CVSS. · Model Evaluation · June 26, 2026 · 11 min
  29. C2PA Watermarking for Model Outputs: A 2026 Ship Plan · AI Frontiers · June 26, 2026 · 10 min
  30. VLLM vs TensorRT-LLM vs SGLang: 2026 Serving Benchmark, Same Hardware · Model Evaluation · June 26, 2026 · 11 min
  31. Multimodal Evals Are Now the Hardest Part of the Stack · Model Evaluation · June 26, 2026 · 10 min
  32. Facebook AI Mode Cites Your Friends' Posts. Here's the GEO Play · Search & GEO · June 26, 2026 · 10 min
  33. Generative UI Quietly Became the Third Interface Pattern · AI Tools · June 26, 2026 · 14 min
  34. Neocloud GPU Economics Are Cheap, Fragile, and Winning Anyway · AI Economics · June 26, 2026 · 13 min
  35. Fable 5 Went Dark. Praxis Is the City Built to Outrun It. · Models & Releases · June 26, 2026 · 18 min
  36. John Jumper's Move Says AI Life Sciences Is Now a Platform War · AI Frontiers · June 26, 2026 · 12 min
  37. Multimodal Evaluation Broke. Here's How Teams Fix It · Model Evaluation · June 26, 2026 · 10 min
  38. The AI Hallucination Tariff: 2026 Lawyer Sanctions Decoded · AI Frontiers · June 26, 2026 · 13 min
  39. AI Data Center Permitting Met a New Bypass: National Security · Security & Safety · June 26, 2026 · 9 min
  40. AI's Grid Crunch: FERC Forces a Fast Lane for Data Center Power · AI Frontiers · June 26, 2026 · 13 min
  41. McCarthy Put Palantir on the Jobsite. Here's What Actually Ships. · AI Frontiers · June 26, 2026 · 10 min
  42. Document AI 2026: VLMs Didn't Kill OCR, Hybrid Pipelines Did · AI Frontiers · June 26, 2026 · 11 min
  43. Ambient AI Scribes Break at the EMR, Not the Mic · AI Frontiers · June 26, 2026 · 13 min
  44. On-Device AI's Real Bottleneck Isn't the Chip. It's the Memory · Memory & Context · June 26, 2026 · 12 min
  45. Prompts Are Production Code. Treat Your Agent Pipeline Like Infrastructure. · Agents & Harnesses · June 26, 2026 · 10 min
  46. Multimodal Evaluation Has a 35-Point Blind Spot · Model Evaluation · June 24, 2026 · 10 min
  47. Synthetic Data Generation Breaks at the Tails · AI Frontiers · June 24, 2026 · 10 min
  48. The AI Biotech Stack Needs a Wet-Lab Clock · Agents & Harnesses · June 24, 2026 · 10 min
  49. AI Biology Timeline: When Models Reached the Wet Lab · AI Frontiers · June 24, 2026 · 16 min
  50. AI-Designed Medicines Just Hit the Biology Wall · AI Frontiers · June 24, 2026 · 11 min
  51. Fable 5 Biology Restrictions Have a Real Job · Models & Releases · June 24, 2026 · 10 min
  52. The GEO Playbook: Getting Cited by AI Engines · Search & GEO · June 23, 2026 · 25 min
  53. 14 Days of Fable 5: The Shutdown That Rewired AI · Models & Releases · June 23, 2026 · 28 min
  54. US AI Regulation Is a Patchwork. Ship Controls · AI Frontiers · June 23, 2026 · 10 min
  55. AI Inference Hardware Has a New Cost Bottleneck · AI Economics · June 23, 2026 · 10 min
  56. LLM Evaluation Breaks When Teams Trust One Score · Model Evaluation · June 23, 2026 · 9 min
  57. EU AI Act GPAI Transparency Code: Ship the Controls · AI Frontiers · June 23, 2026 · 12 min
  58. Small Models Are Taking Over the Sovereign AI Stack · AI Frontiers · June 23, 2026 · 10 min
  59. Your ML Team Probably Doesn't Need a Feature Store · AI Frontiers · June 23, 2026 · 12 min
  60. Running Local AI Models Just Crossed the Line · AI Frontiers · June 23, 2026 · 12 min
  61. Long Context vs RAG: Stop Chunking at the Right Time · Memory & Context · June 22, 2026 · 11 min
  62. AI Coding CLI Telemetry Has an SSD Problem · Model Evaluation · June 22, 2026 · 10 min
  63. Conductor LLMs Make Model Choice a Product Lever · AI Frontiers · June 22, 2026 · 12 min
  64. Frontier Model Access Can Vanish. EU Teams Need a Plan · AI Frontiers · June 22, 2026 · 11 min
  65. Vector Database Comparison: Speed Is the Trap · Memory & Context · June 22, 2026 · 12 min
  66. AI Video Generator Comparison: Pick What Ships · AI Economics · June 22, 2026 · 13 min
  67. Self Hosted Open Models Win After This Cost Cliff · AI Economics · June 22, 2026 · 11 min
  68. Fable Without Fable: Sakana Fugu Ultra's Big Bet · Models & Releases · June 22, 2026 · 10 min
  69. LLM as Judge Needs Calibration Before CI Gates · Model Evaluation · June 22, 2026 · 10 min
  70. AI Product UX Is Moving Past the Chatbox · AI Frontiers · June 22, 2026 · 11 min
  71. Production RAG Chunking Breaks at the Boundary · Memory & Context · June 21, 2026 · 11 min
  72. EU AI Act Compliance Has an Audit Gap Problem · AI Frontiers · June 21, 2026 · 11 min
  73. LLM Observability Must Catch Drift Before Incidents · Model Evaluation · June 21, 2026 · 11 min
  74. AI Safety Routing Is Real. The Audit Trail Isn't Yet · Security & Safety · June 21, 2026 · 12 min
  75. KV Cache Compression Is the New Inference Lever · AI Economics · June 21, 2026 · 11 min
  76. The MCP Server Boom Moved the Moat to Gateways · Agents & Harnesses · June 21, 2026 · 11 min
  77. AI Feature Engineering Is the Product Moat Now · AI Frontiers · June 21, 2026 · 10 min
  78. Voice Agent Latency Hit a Wall. Design Around It · Agents & Harnesses · June 21, 2026 · 11 min
  79. Siri AI Is Now a Routing Problem Developers Own · AI Frontiers · June 21, 2026 · 11 min
  80. AI Agent Identity Is the Next Platform Battle · Agents & Harnesses · June 20, 2026 · 10 min
  81. Physical AI 2026 Hits the Jobsite Bottleneck · AI Frontiers · June 20, 2026 · 12 min
  82. AI Radiology Report Generation Moves Into the Cursor · AI Tools · June 20, 2026 · 10 min
  83. EU AI Act GPAI Enforcement June 2026: 14 Moves Before Fines · AI Frontiers · June 20, 2026 · 11 min
  84. The Fable 5 Mythos 5 Export Directive Hit Your API · Models & Releases · June 20, 2026 · 11 min
  85. AI Inference TCO 2026: Tokens Beat FLOPS · AI Frontiers · June 20, 2026 · 11 min
  86. VLLM vs SGLang: Pick by Workload Shape · AI Frontiers · June 20, 2026 · 11 min
  87. Custom AI Silicon Inference Cost Is Now Board-Level · AI Economics · June 20, 2026 · 11 min
  88. AI FinOps Is Now Board Work: Forecast Token Spend · AI Frontiers · June 20, 2026 · 11 min
  89. AI Model Shutdown Risk Is Now a Friday Problem · Security & Safety · June 20, 2026 · 12 min
  90. Your Model Isn't the Agent. Your Agentic Harness Is. · Agents & Harnesses · June 19, 2026 · 11 min
  91. One Mind or Many? The 2026 Subagent Systems Playbook · Agents & Harnesses · June 19, 2026 · 11 min
  92. Long-Horizon Agents Run for Hours. Wield Them Safely · Models & Releases · June 19, 2026 · 11 min
  93. Your MCP Server Is a Backdoor. Here's How to Harden It · Agents & Harnesses · June 19, 2026 · 12 min
  94. Your AI Agent Has the Keys. Here Is How to Contain It · Agents & Harnesses · June 19, 2026 · 12 min
  95. Human-in-the-Loop Doesn't Scale. Build On-the-Loop · Agents & Harnesses · June 19, 2026 · 10 min
  96. Memory Poisoning: The Agent Attack That Survives a Reset · Memory & Context · June 19, 2026 · 11 min
  97. The 800ms Bar Quietly Decides Your Voice Agent Stack · Agents & Harnesses · June 19, 2026 · 11 min
  98. Claude Artifacts Quietly Became an App Platform · AI Frontiers · June 19, 2026 · 10 min
  99. The AI Stack Is Fracturing. Here's What Builders Do Now · AI Frontiers · June 18, 2026 · 12 min
  100. AI Agent Memory Got Crowded. Here's What Shipped · Memory & Context · June 18, 2026 · 8 min
  101. Context Graphs: The Missing Layer Between Tools and AI Agents · Memory & Context · June 18, 2026 · 12 min
  102. AI Voice Agent Production Governance Checklist 2026 · AI Economics · June 18, 2026 · 9 min
  103. Voice Agent Evaluation: The Four-Metric Scorecard · Model Evaluation · June 18, 2026 · 11 min
  104. EU AI Act August 2 Deadline: The GPAI Provider Checklist · AI Frontiers · June 18, 2026 · 12 min
  105. Continuous LLM Evaluation in Production: 7 Patterns · Model Evaluation · June 18, 2026 · 10 min
  106. Static HTML vs JavaScript Rendering: The AI Crawler Gap · Search & GEO · June 18, 2026 · 9 min
  107. Getting Cited by Perplexity: What It Actually Quotes · AI Frontiers · June 18, 2026 · 11 min
  108. AI Hallucinated Citations in Court: 2026 Sanctions Rules · Search & GEO · June 17, 2026 · 10 min
  109. GPT-5.4 Drug Discovery: AI Improves a Lab Reaction · Models & Releases · June 17, 2026 · 11 min
  110. OpenTelemetry GenAI Conventions: Instrument AI Agents · Model Evaluation · June 17, 2026 · 10 min
  111. How to Design a Custom LLM Eval in 2026 (Without MMLU) · Model Evaluation · June 17, 2026 · 9 min
  112. AI Export Controls for Founders: A Deemed-Export Playbook · AI Frontiers · June 17, 2026 · 11 min
  113. EU AI Act August 2026: The Engineer's Compliance Checklist · AI Frontiers · June 17, 2026 · 11 min
  114. Fable 5 Export Controls: A New Model-Recall Precedent · Models & Releases · June 17, 2026 · 7 min
  115. The 2026 AI Coding Tool Stack: Which Tool for Which Job · AI Tools · June 17, 2026 · 9 min
  116. Gemini CLI & Code Assist: Google's 2026 Coding Stack · AI Tools · June 17, 2026 · 10 min
  117. Aider in Practice: Terminal-Native AI Pair Programming · AI Tools · June 17, 2026 · 10 min
  118. Windsurf for Serious Builders: Cascade, Rules & MCP · AI Tools · June 16, 2026 · 11 min
  119. Will Google Gemini Coding Catch Up to Codex and Claude? · AI Tools · June 16, 2026 · 10 min
  120. Claude Code vs Codex 2026: Which Coding Agent Ships More · AI Tools · June 16, 2026 · 13 min
  121. GitHub Copilot Power-User Guide 2026: Beyond Autocomplete · AI Tools · June 16, 2026 · 14 min
  122. Cursor, Tuned: The Power-User Setup That Compounds · AI Tools · June 16, 2026 · 11 min
  123. Getting 10x More Out of OpenAI Codex: A Power-User Playbook · AI Tools · June 16, 2026 · 11 min
  124. AI Frontiers 2026: Diffusion Models, Multimodal AI & More · AI Frontiers · June 16, 2026 · 18 min
  125. AI Models 2026: The Mid-Year Frontier and Open-Weight Map · AI Frontiers · June 16, 2026 · 23 min
  126. The Magic They Switched Off: Get Your Claude Max Ready for Fable 5 · Models & Releases · June 15, 2026 · 20 min
  127. Securing AI Agents and LLM Apps: The 2026 Threat Model · Security & Safety · June 15, 2026 · 20 min
  128. Context Engineering for AI Agents: Memory, RAG & MCP · Memory & Context · June 15, 2026 · 21 min
  129. Evaluating AI Models and Agents: The 2026 Field Guide · Model Evaluation · June 15, 2026 · 22 min
  130. How to Make Your Claude Code Setup Far More Productive · AI Tools · June 15, 2026 · 10 min
  131. AI Coding Tools in 2026: The Power-User Field Guide · AI Tools · June 15, 2026 · 20 min
  132. GEO vs SEO: What Changes When You Optimize for AI · Search & GEO · June 15, 2026 · 9 min
  133. Block or Allow AI Crawlers? GPTBot, ClaudeBot, Cloudflare · Search & GEO · June 15, 2026 · 16 min
  134. Schema.org for AI Citations: What Actually Works in 2026 · Search & GEO · June 15, 2026 · 11 min
  135. AI Share of Voice: How to Measure It Across AI Engines · Search & GEO · June 15, 2026 · 11 min
  136. Llms.txt Explained: Does It Actually Get You AI Citations? · Search & GEO · June 15, 2026 · 10 min
  137. US Blocks Foreign Access to Anthropic's Fable 5 and Mythos 5 · Models & Releases · June 13, 2026 · 5 min
  138. AI Compute Cost in 2026: Build vs. Buy vs. Lease, by the Numbers · AI Economics · June 12, 2026 · 10 min
  139. Neural Memory Abstraction: Context Management for AI Agents · Memory & Context · June 12, 2026 · 9 min
  140. Red-teaming AI in 2026: a practical adversarial testing guide · Security & Safety · June 12, 2026 · 10 min
  141. Open-Source Reasoning Models in 2026: The Gap Has Closed · Models & Releases · June 12, 2026 · 9 min
  142. Geo-Aware AI Search: How Maps Grounding Rewires AI Answers · Search & GEO · June 12, 2026 · 11 min
  143. Multimodal AI UX in 2026: voice, vision, and text patterns · AI Frontiers · June 12, 2026 · 9 min
  144. AI in Education 2026: What the Evidence and Rollbacks Show · AI Frontiers · June 12, 2026 · 10 min
  145. LLMOps vs MLOps: The 2026 Guide to Operating AI Agents · Agents & Harnesses · June 12, 2026 · 10 min
  146. Multi-Modal RAG in 2026: Architecture, Benchmarks, and Costs · Model Evaluation · June 12, 2026 · 9 min
  147. AI Agent Cost in Production: Real Per-Run Numbers for 2026 · AI Economics · June 12, 2026 · 10 min
  148. What Is MCP? Model Context Protocol Explained for 2026 · Memory & Context · June 12, 2026 · 10 min
  149. DiffusionGemma 26B-A4B: Can Diffusion Beat Autoregression? · AI Frontiers · June 12, 2026 · 9 min
  150. OpenAI vs Anthropic IPOs: What the S-1 Race Means for AI Costs · AI Economics · June 12, 2026 · 10 min
  151. Cursor vs Copilot vs Windsurf: The 2026 AI Coding Tool Test · AI Tools · June 12, 2026 · 9 min
  152. SWE-bench Is Dead: Build Your Own LLM Eval Harness in 2026 · Model Evaluation · June 12, 2026 · 10 min
  153. Harness Engineering: Why Agent Reliability Beats Model IQ · Agents & Harnesses · June 12, 2026 · 10 min
  154. Claude Fable 5 vs GPT-5.5: Coding Benchmarks That Matter · Model Evaluation · June 12, 2026 · 8 min
  155. Agentic AI vs Traditional Automation: 2026 Cost-Benefit Analysis · AI Economics · June 12, 2026 · 12 min
  156. Stateful vs. Stateless Agents: The 2026 Architecture Decision · Agents & Harnesses · June 12, 2026 · 9 min
  157. Modular Context Windows: The Future of AI Agent Reasoning · Memory & Context · June 11, 2026 · 11 min
  158. Multi-Hop Reasoning vs Single-Hop Retrieval for AI Agents · Memory & Context · June 11, 2026 · 11 min
  159. AI Risk Management for Enterprises: Closing the Shadow AI Gap · Security & Safety · June 11, 2026 · 11 min
  160. RAG vs Fine-Tuning for LLM Agents: 2026 Cost Breakdown · AI Economics · June 11, 2026 · 10 min
  161. Inference-as-a-Service in 2026: Cost, Speed, and Scale · AI Economics · June 11, 2026 · 11 min
  162. AI Agent Evaluation in 2026: Beyond LLM Benchmarks · Model Evaluation · June 11, 2026 · 10 min
  163. Hybrid Context Storage: Vector + Graph Databases for LLM Agents · Memory & Context · June 11, 2026 · 10 min
  164. Fine-Tuning vs Prompt Engineering: The 2026 Cost Breakdown · AI Economics · June 11, 2026 · 10 min
  165. Modular vs Monolithic Agent Architecture: 2026 Verdict · Agents & Harnesses · June 11, 2026 · 10 min
  166. AI Decision-Making in High-Stakes Sectors: Risks and Rewards · Security & Safety · June 11, 2026 · 10 min
  167. Is the AI Agent Memory Layer the Wrong Abstraction? 2026 · Memory & Context · June 11, 2026 · 10 min
  168. Anthropic S-1 IPO: What's Confirmed vs. The $965B Leak · Models & Releases · June 11, 2026 · 12 min
  169. Best Local LLM for Coding on 16GB VRAM: June 2026 Rankings · Models & Releases · June 11, 2026 · 10 min
  170. Agentic AI in 2026: Real Deployments, Real Failure Rates · Agents & Harnesses · June 11, 2026 · 10 min
  171. Prompt Injection in 2026 Looks Nothing Like 2023. Here's Proof · Security & Safety · June 11, 2026 · 10 min
  172. RAGAS vs TruLens vs DeepEval: The 2026 LLM Eval Showdown · Model Evaluation · June 11, 2026 · 10 min
  173. Stateless MCP Migration Guide: The 2026-07-28 RC Explained · Agents & Harnesses · June 11, 2026 · 9 min
  174. AI Agent Observability in 2026: The New Telemetry Stack · Model Evaluation · June 11, 2026 · 10 min
  175. Reading AI System Cards in 2026: The Anthropic Walk-Back Test · Security & Safety · June 11, 2026 · 11 min
  176. Claude Fable 5 First Look: Retention Rules Beat Benchmarks · Model Evaluation · June 11, 2026 · 10 min
  177. Agent Harness Engineering and Agentic Loops: 2026 Field Guide · Agents & Harnesses · June 11, 2026 · 17 min
  178. Generative Engine Optimization: How to Earn AI Citations · Search & GEO · June 11, 2026 · 17 min
  179. AI Coding Agent Economics: Real ROI and Cost per Pull Request · AI Tools · June 11, 2026 · 20 min
  180. Context Rot and the Dumb Zone: Engineering Past 100k Tokens · Memory & Context · June 10, 2026 · 11 min
  181. SWE-bench Pro vs Verified: Can You Trust Coding Benchmarks? · Model Evaluation · June 10, 2026 · 18 min
  182. AGENTS.md vs CLAUDE.md vs Cursor Rules: Config Done Right · AI Tools · June 10, 2026 · 9 min
  183. The Ralph Wiggum Loop: Why Stateless Agents Beat Smart Ones · Agents & Harnesses · June 10, 2026 · 9 min
  184. Reasoning-First LLMs: Make Models Reason, Not Rationalize · Models & Releases · June 10, 2026 · 11 min