Security & Safety
Prompt injection, agent security, and AI safety in production: real attack surfaces, defenses that hold up, and how teams ship agents without shipping incidents.
Pillar: Securing AI Agents and LLM Apps: The 2026 Threat Model
Why indirect prompt injection, tool-mediated exfiltration, and rogue agents now define LLM security, and the layered controls that actually hold.
A Clinical Scribe Fell to Three Prompts. The VA Scaled It to 130 Sites
The Heidi Health NEXUS jailbreak proved safety lives in a text layer the model will gladly rewrite, and the VA just multiplied that risk across 130 facilities.
Five Weeks Until EU AI Act High-Risk Day. Is Your Stack Ready?
The August 2, 2026 high-risk deadline stacks three compliance regimes onto a single AI product. Here's how to satisfy them simultaneously.
AI Data Center Permitting Met a New Bypass: National Security
The DOJ's xAI intervention turns 'national security' into a legal lever to skip Clean Air Act permits, and founders building AI infrastructure should track every move.
AI Safety Routing Is Real. The Audit Trail Isn't Yet
Routing risky prompts to safer models can be a serious governance control, but only if buyers can inspect the classifier, fallback chain, logs, and audit evidence.
AI Model Shutdown Risk Is Now a Friday Problem
Anthropic's Fable 5 suspension turned model choice into an availability-control problem, and the fix is contractual, technical, and operational.
Red-teaming AI in 2026: a practical adversarial testing guide
A step-by-step methodology for designing AI red-team exercises, plus an honest comparison of PyRIT, Garak, HarmBench, and Promptfoo.
AI Risk Management for Enterprises: Closing the Shadow AI Gap
Four in five enterprise AI tools run unmanaged while the EU's high-risk deadline lands in August. Here's the playbook that actually closes the gap.
AI Decision-Making in High-Stakes Sectors: Risks and Rewards
From NHS radiology wards to courtrooms and kill chains, AI is making consequential calls faster than the law can assign blame for them.
Prompt Injection in 2026 Looks Nothing Like 2023. Here's Proof
Production attacks have moved to multi-step goal hijacking, context pollution, and delayed payloads while most deployed defenses still grep for 'ignore previous instructions.'
Reading AI System Cards in 2026: The Anthropic Walk-Back Test
Anthropic reversed Claude Fable 5's silent anti-sabotage clause in 48 hours. The episode is a repeatable audit template for every system card you'll read this year.