Archive
All articles
Every piece we've published, newest first. 189 articles and counting.
July 2026
June 2026
- Claude Science Is a Workflow Bet, Not a Model Bet
- Claude Sonnet 5 Makes Opus Hard to Justify
- A Delivery Company Trained a 1.6T Coding Model, No Nvidia
- Meta Curbed Claude Code and Codex. Distillation Is Why
- Brain2Qwerty v2 Hits Real-Time Typing, Still Stuck in a Room
- ChatGPT Logs as Evidence: What the Palisades Fire Trial Means for AI
- Cascaded vs End-to-End Voice Agents: Which Ships in Healthcare?
- Pentagon Agent Network: The Multi-Agent Architecture No One Is Parsing
- Agent Reliability Needs a Score, Not a Gut Feeling
- Beyond Accuracy: The UX Metrics That Decide If AI Products Survive
- Ford's Gray Beard Reversal: What It Teaches About AI Limits
- The 100-Tool Agent Trap: Why Less Is More in Production
- Shiller Was Right About AI Doommaxxing. The Polling Proves It
- STT API Showdown: TTFT vs WER in 2026
- A Clinical Scribe Fell to Three Prompts. The VA Scaled It to 130 Sites
- OpenAI's Jalapeño Chip Is an Inference Hedge, Not an Nvidia Killer
- Your AI Agent Went Rogue on Friday. Here's the Postmortem
- Five Weeks Until EU AI Act High-Risk Day. Is Your Stack Ready?
- LLM-as-Judge Reliability: The Cohen's Kappa Every Production Eval Needs
- Why Memory Bandwidth, Not Compute, Is the LLM Inference Bottleneck
- Government-Gated AI: Who Decides What Models You Can Run
- Voice AI Under 500ms: The Latency Budget That Decides Who Ships
- 15 Agent Benchmarks, Zero Safety Scores. Here's the Fix.
- LPU vs GPU Inference: Groq's 70% Latency Win, Decoded
- 92% Blew Their AI Budget. AI FinOps Is the Fix
- The "US" vs Them: Fable Off, GPT-5.6 Gated
- GPT-5.6 Deployment Starts Behind a Federal Gate
- Reasoning Models Break Guardrails 97% of the Time. Score It Like CVSS.
- C2PA Watermarking for Model Outputs: A 2026 Ship Plan
- vLLM vs TensorRT-LLM vs SGLang: 2026 Serving Benchmark, Same Hardware
- Multimodal Evals Are Now the Hardest Part of the Stack
- Facebook AI Mode Cites Your Friends' Posts. Here's the GEO Play
- Generative UI Quietly Became the Third Interface Pattern
- Neocloud GPU Economics Are Cheap, Fragile, and Winning Anyway
- Fable 5 Went Dark. Praxis Is the City Built to Outrun It.
- John Jumper's Move Says AI Life Sciences Is Now a Platform War
- Multimodal Evaluation Broke. Here's How Teams Fix It
- The AI Hallucination Tariff: 2026 Lawyer Sanctions Decoded
- AI Data Center Permitting Met a New Bypass: National Security
- AI's Grid Crunch: FERC Forces a Fast Lane for Data Center Power
- McCarthy Put Palantir on the Jobsite. Here's What Actually Ships.
- Document AI 2026: VLMs Didn't Kill OCR, Hybrid Pipelines Did
- Ambient AI Scribes Break at the EMR, Not the Mic
- On-Device AI's Real Bottleneck Isn't the Chip. It's the Memory
- Prompts Are Production Code. Treat Your Agent Pipeline Like Infrastructure.
- Multimodal Evaluation Has a 35-Point Blind Spot
- Synthetic Data Generation Breaks at the Tails
- The AI Biotech Stack Needs a Wet-Lab Clock
- AI Biology Timeline: When Models Reached the Wet Lab
- AI-Designed Medicines Just Hit the Biology Wall
- Fable 5 Biology Restrictions Have a Real Job
- The GEO Playbook: Getting Cited by AI Engines
- 14 Days of Fable 5: The Shutdown That Rewired AI
- US AI Regulation Is a Patchwork. Ship Controls
- AI Inference Hardware Has a New Cost Bottleneck
- LLM Evaluation Breaks When Teams Trust One Score
- EU AI Act GPAI Transparency Code: Ship the Controls
- Small Models Are Taking Over the Sovereign AI Stack
- Your ML Team Probably Doesn't Need a Feature Store
- Running Local AI Models Just Crossed the Line
- Long Context vs RAG: Stop Chunking at the Right Time
- AI Coding CLI Telemetry Has an SSD Problem
- Conductor LLMs Make Model Choice a Product Lever
- Frontier Model Access Can Vanish. EU Teams Need a Plan
- Vector Database Comparison: Speed Is the Trap
- AI Video Generator Comparison: Pick What Ships
- Self Hosted Open Models Win After This Cost Cliff
- Fable Without Fable: Sakana Fugu Ultra's Big Bet
- LLM as Judge Needs Calibration Before CI Gates
- AI Product UX Is Moving Past the Chatbox
- Production RAG Chunking Breaks at the Boundary
- EU AI Act Compliance Has an Audit Gap Problem
- LLM Observability Must Catch Drift Before Incidents
- AI Safety Routing Is Real. The Audit Trail Isn't Yet
- KV Cache Compression Is the New Inference Lever
- The MCP Server Boom Moved the Moat to Gateways
- AI Feature Engineering Is the Product Moat Now
- Voice Agent Latency Hit a Wall. Design Around It
- Siri AI Is Now a Routing Problem Developers Own
- AI Agent Identity Is the Next Platform Battle
- Physical AI 2026 Hits the Jobsite Bottleneck
- AI Radiology Report Generation Moves Into the Cursor
- EU AI Act GPAI Enforcement June 2026: 14 Moves Before Fines
- The Fable 5 Mythos 5 Export Directive Hit Your API
- AI Inference TCO 2026: Tokens Beat FLOPS
- vLLM vs SGLang: Pick by Workload Shape
- Custom AI Silicon Inference Cost Is Now Board-Level
- AI FinOps Is Now Board Work: Forecast Token Spend
- AI Model Shutdown Risk Is Now a Friday Problem
- Your Model Isn't the Agent. Your Agentic Harness Is.
- One Mind or Many? The 2026 Subagent Systems Playbook
- Long-Horizon Agents Run for Hours. Wield Them Safely
- Your MCP Server Is a Backdoor. Here's How to Harden It
- Your AI Agent Has the Keys. Here Is How to Contain It
- Human-in-the-Loop Doesn't Scale. Build On-the-Loop
- Memory Poisoning: The Agent Attack That Survives a Reset
- The 800ms Bar Quietly Decides Your Voice Agent Stack
- Claude Artifacts Quietly Became an App Platform
- The AI Stack Is Fracturing. Here's What Builders Do Now
- AI Agent Memory Got Crowded. Here's What Shipped
- Context Graphs: The Missing Layer Between Tools and AI Agents
- AI Voice Agent Production Governance Checklist 2026
- Voice Agent Evaluation: The Four-Metric Scorecard
- EU AI Act August 2 Deadline: The GPAI Provider Checklist
- Continuous LLM Evaluation in Production: 7 Patterns
- Static HTML vs JavaScript Rendering: The AI Crawler Gap
- Getting Cited by Perplexity: What It Actually Quotes
- AI Hallucinated Citations in Court: 2026 Sanctions Rules
- GPT-5.4 Drug Discovery: AI Improves a Lab Reaction
- OpenTelemetry GenAI Conventions: Instrument AI Agents
- How to Design a Custom LLM Eval in 2026 (Without MMLU)
- AI Export Controls for Founders: A Deemed-Export Playbook
- EU AI Act August 2026: The Engineer's Compliance Checklist
- Fable 5 Export Controls: A New Model-Recall Precedent
- The 2026 AI Coding Tool Stack: Which Tool for Which Job
- Gemini CLI & Code Assist: Google's 2026 Coding Stack
- Aider in Practice: Terminal-Native AI Pair Programming
- Windsurf for Serious Builders: Cascade, Rules & MCP
- Will Google Gemini Coding Catch Up to Codex and Claude?
- Claude Code vs Codex 2026: Which Coding Agent Ships More
- GitHub Copilot Power-User Guide 2026: Beyond Autocomplete
- Cursor, Tuned: The Power-User Setup That Compounds
- Getting 10x More Out of OpenAI Codex: A Power-User Playbook
- AI Frontiers 2026: Diffusion Models, Multimodal AI & More
- AI Models 2026: The Mid-Year Frontier and Open-Weight Map
- The Magic They Switched Off: Get Your Claude Max Ready for Fable 5
- Securing AI Agents and LLM Apps: The 2026 Threat Model
- Context Engineering for AI Agents: Memory, RAG & MCP
- Evaluating AI Models and Agents: The 2026 Field Guide
- How to Make Your Claude Code Setup Far More Productive
- AI Coding Tools in 2026: The Power-User Field Guide
- GEO vs SEO: What Changes When You Optimize for AI
- Block or Allow AI Crawlers? GPTBot, ClaudeBot, Cloudflare
- Schema.org for AI Citations: What Actually Works in 2026
- AI Share of Voice: How to Measure It Across AI Engines
- llms.txt Explained: Does It Actually Get You AI Citations?
- US Blocks Foreign Access to Anthropic's Fable 5 and Mythos 5
- AI Compute Cost in 2026: Build vs. Buy vs. Lease, by the Numbers
- Neural Memory Abstraction: Context Management for AI Agents
- Red-teaming AI in 2026: a practical adversarial testing guide
- Open-Source Reasoning Models in 2026: The Gap Has Closed
- Geo-Aware AI Search: How Maps Grounding Rewires AI Answers
- Multimodal AI UX in 2026: voice, vision, and text patterns
- AI in Education 2026: What the Evidence and Rollbacks Show
- LLMOps vs MLOps: The 2026 Guide to Operating AI Agents
- Multi-Modal RAG in 2026: Architecture, Benchmarks, and Costs
- AI Agent Cost in Production: Real Per-Run Numbers for 2026
- What Is MCP? Model Context Protocol Explained for 2026
- DiffusionGemma 26B-A4B: Can Diffusion Beat Autoregression?
- OpenAI vs Anthropic IPOs: What the S-1 Race Means for AI Costs
- Cursor vs Copilot vs Windsurf: The 2026 AI Coding Tool Test
- SWE-bench Is Dead: Build Your Own LLM Eval Harness in 2026
- Harness Engineering: Why Agent Reliability Beats Model IQ
- Claude Fable 5 vs GPT-5.5: Coding Benchmarks That Matter
- Agentic AI vs Traditional Automation: 2026 Cost-Benefit Analysis
- Stateful vs. Stateless Agents: The 2026 Architecture Decision
- Modular Context Windows: The Future of AI Agent Reasoning
- Multi-Hop Reasoning vs Single-Hop Retrieval for AI Agents
- AI Risk Management for Enterprises: Closing the Shadow AI Gap
- RAG vs Fine-Tuning for LLM Agents: 2026 Cost Breakdown
- Inference-as-a-Service in 2026: Cost, Speed, and Scale
- AI Agent Evaluation in 2026: Beyond LLM Benchmarks
- Hybrid Context Storage: Vector + Graph Databases for LLM Agents
- Fine-Tuning vs Prompt Engineering: The 2026 Cost Breakdown
- Modular vs Monolithic Agent Architecture: 2026 Verdict
- AI Decision-Making in High-Stakes Sectors: Risks and Rewards
- Is the AI Agent Memory Layer the Wrong Abstraction? 2026
- Anthropic S-1 IPO: What's Confirmed vs. The $965B Leak
- Best Local LLM for Coding on 16GB VRAM: June 2026 Rankings
- Agentic AI in 2026: Real Deployments, Real Failure Rates
- Prompt Injection in 2026 Looks Nothing Like 2023. Here's Proof
- RAGAS vs TruLens vs DeepEval: The 2026 LLM Eval Showdown
- Stateless MCP Migration Guide: The 2026-07-28 RC Explained
- AI Agent Observability in 2026: The New Telemetry Stack
- Reading AI System Cards in 2026: The Anthropic Walk-Back Test
- Claude Fable 5 First Look: Retention Rules Beat Benchmarks
- Agent Harness Engineering and Agentic Loops: 2026 Field Guide
- Generative Engine Optimization: How to Earn AI Citations
- AI Coding Agent Economics: Real ROI and Cost per Pull Request
- Context Rot and the Dumb Zone: Engineering Past 100k Tokens
- SWE-bench Pro vs Verified: Can You Trust Coding Benchmarks?
- AGENTS.md vs CLAUDE.md vs Cursor Rules: Config Done Right
- The Ralph Wiggum Loop: Why Stateless Agents Beat Smart Ones
- Reasoning-First LLMs: Make Models Reason, Not Rationalize