Gen α AI · Field notes for AI builders

Depth over hype, for people who bet on AI.

Evidence-first analysis of agentic systems, model evaluation, and the economics of AI software. We read the system card, find the primary source, and tell you what actually changed — and what didn't.

Evidence over vibes · Depth over volume · Honest about uncertainty
179 · Deep dives published
9 · Evergreen pillar guides
Biweekly · The field briefing
Editor’s picks
New here? These are the pieces we’d hand you first.

The latest
Fresh analysis, published continuously. The full archive lives in the rail and the pillars below.

ChatGPT Logs Are Now Evidence. The Palisades Fire Trial Rewrites AI Governance.
AI Frontiers

ChatGPT Logs as Evidence: What the Palisades Fire Trial Means for AI

A federal arson case just made AI conversation transcripts admissible in court, and that changes what every company must do about AI governance.

12 min · June 29, 2026
Cascaded vs End-to-End Voice Agents: Which Architecture Ships in Healthcare?
Agents & Harnesses

Cascaded vs End-to-End Voice Agents: Which Ships in Healthcare?

The latency gap is narrowing, but the workflow, not the benchmark, picks the architecture.

13 min · June 29, 2026
Pentagon Agent Network: What PSP-2 Actually Tells Us About Defense Multi-Agent AI
Agents & Harnesses

Pentagon Agent Network: The Multi-Agent Architecture No One Is Parsing

The DoW's second Pace-Setting Project names vendors, latency targets, and a hard human-in-the-loop line. The engineering questions are the interesting part.

10 min · June 29, 2026
Agent Reliability Needs a Score, Not a Gut Feeling
Agents & Harnesses

A five-metric scoring framework that turns production agent reliability from vibes into a number you can alert on.

11 min · June 29, 2026
Beyond Accuracy: The UX Metrics That Decide If AI Products Survive
AI Frontiers

Model accuracy gets the press release; task completion, trust, and retention decide what ships and what sticks.

11 min · June 29, 2026
Ford's Gray Beard Reversal: What It Teaches About AI Engineering Limits
AI Frontiers

Ford's Gray Beard Reversal: What It Teaches About AI Limits

A 350-engineer rehire exposes where domain knowledge still beats AI, and where it doesn't.

10 min · June 29, 2026
The 100-Tool Agent Trap: Why Less Is More in Production
Agents & Harnesses

The famous 3.2× failure stat is unverified, but the mechanisms behind it are real, and composition beats proliferation every time.

10 min · June 28, 2026
Shiller Was Right About AI Doommaxxing, and the Polling Already Proves It
AI Frontiers

Shiller Was Right About AI Doommaxxing. The Polling Proves It

A 70% job-loss fear gap is reshaping enrollment, hiring, and spending before any mass displacement has arrived.

12 min · June 28, 2026
Time-to-First-Token vs Word Error Rate: Picking an STT API in 2026
AI Frontiers

STT API Showdown: TTFT vs WER in 2026

The fast-and-leaky vs slow-and-tight split between Deepgram and Gradium is now the production-defining buying decision for voice agents.

10 min · June 28, 2026
Clinical AI's Real Attack Surface Is the EHR Integration, Not the Model
Security & Safety

A Clinical Scribe Fell to Three Prompts. The VA Scaled It to 130 Sites

The Heidi Health NEXUS jailbreak proved safety lives in a text layer the model will gladly rewrite, and the VA just multiplied that risk across 130 facilities.

12 min · June 28, 2026
OpenAI's Jalapeño Chip Is an Inference Hedge, Not a Nvidia Killer
AI Economics

The first OpenAI custom AI chip keeps the API intact, which is the part earlier custom-silicon efforts got wrong.

12 min · June 28, 2026
How to Debug an AI Agent Incident: A Postmortem Playbook
Agents & Harnesses

Your AI Agent Went Rogue on Friday. Here's the Postmortem

A blameless, SRE-style framework for the five failure modes traditional incident response was never built to handle.

17 min · June 28, 2026

Explore the pillars
Nine durable guides that organize everything we publish.

AI Tools · 16 pieces

AI Coding Tools in 2026: The Power-User Field Guide

The gap between demo and production is the harness you build around the model, not the…

Search & GEO · 11 pieces

Generative Engine Optimization: How to Earn AI Citations

Search is becoming synthesis. If ChatGPT, Perplexity, and Google's AI Overviews don't cite…

Agents & Harnesses · 24 pieces

Agent Harness Engineering and Agentic Loops: 2026 Field Guide

Execution loops, externalized state, and verification gates now matter more than raw model…

AI Economics · 17 pieces

AI Coding Agent Economics: Real ROI and Cost per Pull Request

Frontier labs now ship more AI-written code than human-written code, but the viral ROI…

Model Evaluation · 25 pieces

Evaluating AI Models and Agents: The 2026 Field Guide

Why static leaderboards lost authority, and how to build an eval program that survives…

Memory & Context · 16 pieces

Context Engineering for AI Agents: Memory, RAG & MCP

Why the context window, not the prompt, is the real bottleneck, and how to engineer…

Security & Safety · 11 pieces

Securing AI Agents and LLM Apps: The 2026 Threat Model

Why indirect prompt injection, tool-mediated exfiltration, and rogue agents now define LLM…

Models & Releases · 16 pieces

AI Models 2026: The Mid-Year Frontier and Open-Weight Map

How the open-weight cluster closed the gap, why reasoning became the default, and which of…

AI Frontiers · 43 pieces

AI Frontiers 2026: Diffusion Models, Multimodal AI & More

A practitioner's map of frontier AI in mid-2026, where independent measurement finally…
