# GenAlphAI

> GenAlphAI is a research-driven AI publication for engineers and operators: deep, evidence-backed analysis of agentic systems, model evaluation, and the economics of AI software.

## Pillars

- [SWE-bench Pro vs Verified: Can You Trust Coding Benchmarks?](https://genalphai.com/swe-bench-pro-vs-verified/): OpenAI deprecated SWE-bench Verified after finding flawed tests in 59.4% of hard tasks. How SWE-bench Pro and DeepSWE's 32.5% verifier error rate change agent evaluation.

## Articles

- [Context Rot and the Dumb Zone: Engineering Past 100k Tokens](https://genalphai.com/context-rot-and-the-dumb-zone/): Context rot degrades LLM agents well inside advertised windows. Why the ~100k dumb zone exists, what 'lost in the middle' research shows, and the inner-loop/outer-loop architecture that fixes it.
- [AGENTS.md vs CLAUDE.md vs Cursor Rules: Agent Config Done Right](https://genalphai.com/agents-md-vs-claude-md/): AGENTS.md, CLAUDE.md, and .cursor/rules compared: three-tier permissions, context budgeting, and the canonical-plus-adapters pattern that keeps coding agents obedient.
- [The Ralph Wiggum Loop: Why Stateless Agents Beat Smart Ones](https://genalphai.com/ralph-wiggum-loop-stateless-agents/): The Ralph Wiggum loop re-feeds one prompt to a fresh agent process forever, using files and git as the only memory. Why this dumb pattern keeps winning.
- [Reasoning-First LLMs: Make Models Reason, Not Rationalize](https://genalphai.com/reasoning-first-llms/): LLMs rationalize answers they already chose. Process supervision, self-consistency, and faithfulness probes force models to reason to the right answer.

## Important Links

- [llms-full.txt](https://genalphai.com/llms-full.txt): Consolidated Markdown archive of all articles.
- [RSS](https://genalphai.com/rss.xml)