Ahrefs studied AI Overview citations in 2026 and found that 76% of them came from pages that did not rank in the top 10 organic results for the same query.
Read that again. Three-quarters of the most valuable real estate in modern search goes to pages that classic SEO would call losers.
That single number is why generative engine optimization exists as a discipline. AI engines run their own retrieval stack, weight different signals, and render answers where the citation is the click. If you're optimizing for the ten blue links, you're optimizing for a surface that's shrinking under your feet.
Generative engine optimization (GEO) is the practice of making your content the source an AI engine chooses to retrieve, summarize, and cite inside its generated answer. It is adjacent to SEO but targets a different pipeline: chunk-level retrieval, reranking, and grounded generation rather than a ranked list of links.
TL;DR: AI answer engines now reach billions of users, and the citation slots inside their answers have become the contested real estate that page-one rankings used to be. The playbook has three layers: crawler access (allow retrieval bots, decide on training bots), extractability (answer-first structure, statistics, schema, static HTML), and measurement (AI share of voice tracked across a prompt library). The on-page tactics help at the margin; authority and retrieval signals still dominate.
Key takeaways
- ChatGPT hit 900 million weekly active users in February 2026, and Google's Gemini app passed 750 million monthly actives. These are discovery surfaces on par with 2010-era Google.
- Training crawlers (GPTBot, ClaudeBot) and retrieval crawlers (OAI-SearchBot, Claude-User) are different bots needing opposite robots.txt stances. Publishers figured this out: GPTBot crawl volume fell 87% in 2025 while OAI-SearchBot rose 312%, per Etavrian's server-log analysis.
- llms.txt does not move citation rates in any controlled test to date. It is agent infrastructure, not a ranking tactic.
- Roughly 69% of AI crawlers can't execute JavaScript, per Vercel's testing. Client-side rendered content is invisible to most of them.
- AI share of voice, measured against a real prompt library, is the KPI that replaces rank tracking. The discovery-gap audit is the operational core of a GEO program.
Why generative engine optimization matters now
The scale argument is no longer speculative. OpenAI disclosed 900 million weekly active ChatGPT users and 50 million paid subscribers in February 2026, confirmed by Search Engine Land. Google said AI Overviews reached 2 billion monthly users back in mid-2025, and Sundar Pichai has called AI Mode "the future of Search".
The displacement argument is just as concrete. SparkToro's 2024 zero-click study found roughly 60% of Google searches end without a click, and later Datos/SparkToro updates put the figure at 65-70% for queries that trigger AI Overviews. Chartbeat's publisher panel showed news referral traffic down 33% globally between mid-2024 and mid-2025.
And the clicks that survive go to cited sources. Seer Interactive's data suggests pages holding a top-3 cited position in AI Overviews saw organic click uplifts in the +35% to +91% range, though I'd treat those exact figures as directional; the underlying dataset hasn't been independently audited.
The qualitative point is solid either way: a query that used to send 100 clicks now sends 30 to 60, and they concentrate on whoever the engine cites.
So the strategic frame for 2026 is simple. AI citations sit upstream of the click. Getting cited is the click.
How AI engines decide what to cite
Every grounded AI answer comes out of roughly the same ten-stage pipeline, whatever the vendor calls it. Discover, fetch, extract, chunk, embed, index, retrieve, rerank, ground, cite.
The stages that matter most for practitioners are extraction and chunking. Engines strip your nav, footer, and cookie banners, then split the remaining content into retrieval units of a few hundred tokens.
If your key claim is buried in paragraph six of a meandering section, it lands in a weak chunk that loses the k-nearest-neighbor race.
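To see why position inside a section matters, here is a minimal sketch of the chunking step. It assumes a naive fixed-size word window; real engines use tokenizers and smarter boundaries, and the 40-word size, the filler text, and the function names are illustrative choices, not any vendor's behavior.

```python
def chunk_words(text: str, size: int = 40) -> list[str]:
    """Split text into consecutive windows of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def chunk_index_of(claim: str, chunks: list[str]) -> int:
    """Return the index of the first chunk containing `claim`, or -1.

    A claim that straddles a chunk boundary is reported as -1 here,
    which is itself a realistic failure mode of naive chunking.
    """
    for i, chunk in enumerate(chunks):
        if claim in chunk:
            return i
    return -1


# Hypothetical page content: 20 low-information sentences plus one key claim.
filler = "This sentence is background that says little. " * 20
claim = "GEO targets chunk-level retrieval."

# Claim buried at the end of the section vs. stated first.
buried = chunk_index_of(claim, chunk_words(filler + claim))
answer_first = chunk_index_of(claim, chunk_words(claim + " " + filler))
print(buried, answer_first)
```

In the buried version the claim shares its retrieval unit with trailing filler several chunks down; in the answer-first version it opens the first chunk.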
At generation time, the model gets the top retrieved chunks with an instruction to answer only from those sources, then renders citations. Anthropic's citations documentation and Google's Vertex grounding docs describe this behavior at the API level, and it's the same mechanic behind the consumer surfaces.
One distinction has become consensus among practitioners: extractability versus rankability. Extractability is whether your content can be cleanly fetched, parsed, and chunked. Rankability is whether the engine prefers you over competitors for a given prompt.
The 2026 C-SEO Bench study sharpened this boundary. When the authors controlled for answer position and domain authority, the on-page tactics from the original GEO paper produced no statistically significant lift in citation rate.
Ethan Smith of Graphite put the practitioner version bluntly in a 2025 post: "You don't get cited by tweaking H2s. You get cited by being the page the engine already wants to retrieve. The on-page work is about not getting filtered out, not getting picked."
That doesn't make on-page work pointless. It makes it table stakes that decides near-ties, while authority signals decide everything else.
GPTBot vs. OAI-SearchBot: the crawler split you must get right
The most consequential technical decision in GEO is understanding that AI vendors run two kinds of crawlers with opposite value propositions for you.
Training crawlers ingest your content to train models. You never recover a click from a training fetch. This category includes GPTBot (OpenAI), ClaudeBot (Anthropic), Google-Extended, CCBot (Common Crawl), Applebot-Extended, and Bytespider (ByteDance).
Retrieval crawlers fetch your content at answer time to ground a response and cite you. This category includes OAI-SearchBot and ChatGPT-User (OpenAI), Claude-User (Anthropic), PerplexityBot and Perplexity-User, and ordinary Googlebot, which grounds AI Overviews. OpenAI's bot documentation and Perplexity's crawler docs describe the splits explicitly.
Publishers internalized this asymmetry through 2025, and the server logs show it. Etavrian's analysis of 1,200 publisher domains found GPTBot crawl volume dropped 87% between March and October 2025 while OAI-SearchBot volume rose 312%. Block the bot that takes; allow the bot that gives back.
The consensus publisher robots.txt now looks like this:
# Retrieval bots: allow for citation
User-agent: OAI-SearchBot
Allow: /
User-agent: ChatGPT-User
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Claude-User
Allow: /
# Training bots: block
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: Bytespider
Disallow: /
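Before deploying a file like this, it can be sanity-checked locally. The sketch below uses Python's standard-library `urllib.robotparser`; the `check_access` helper and the example.com URL are hypothetical. Python's parser matches agent names by substring, which is looser than RFC 9309, so treat the result as an approximation of how each crawler will read the file.

```python
from urllib.robotparser import RobotFileParser

# The dual-pattern file from above, abbreviated to a few groups.
ROBOTS_TXT = """\
# Retrieval bots: allow for citation
User-agent: OAI-SearchBot
Allow: /
User-agent: Claude-User
Allow: /
# Training bots: block
User-agent: GPTBot
Disallow: /
User-agent: ClaudeBot
Disallow: /
"""


def check_access(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Return whether each agent may fetch `url` under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = check_access(
    ROBOTS_TXT,
    ["OAI-SearchBot", "Claude-User", "GPTBot", "ClaudeBot", "Googlebot"],
    "https://example.com/guide",
)
for agent, allowed in result.items():
    print(f"{agent:15} {'allow' if allowed else 'block'}")
```

Googlebot has no group in this file and no `User-agent: *` fallback exists, so the parser treats it as allowed by default.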
One trap deserves its own paragraph because it's the most misunderstood point in all of GEO. Blocking Google-Extended does not remove you from AI Overviews. Per Google's own crawler documentation, Google-Extended only controls Gemini and Vertex AI training. AI Overviews are grounded by standard Googlebot, a clarification Search Engine Journal covered in detail after publishers discovered their "opted out" content was still being cited.
The only true opt-out from AI Overviews is noindexing or blocking Googlebot itself, which removes you from Google entirely.
Do AI crawlers actually honor robots.txt?
Not reliably. Tollbit's Q1 2025 State of the Bots report found AI bots ignored disallow directives 30% to 50% of the time depending on the bot, with training-only crawlers the worst offenders and Googlebot and OAI-SearchBot the best behaved.
The pattern across studies is consistent and telling. The more a bot drives visible citations, the more it respects RFC 9309. The more a bot exists purely to harvest training data, the more it operates in a grey zone. Perplexity's fetcher has been observed in multiple investigations making requests under generic Chrome user-agent strings.
This compliance gap is why edge enforcement took off. On July 1, 2025, Cloudflare, which fronts roughly a fifth of the web, switched to default-blocking known AI crawlers and launched Pay Per Crawl, an HTTP 402-based micropayment gate documented in its AI Crawl Control changelog.
Robots.txt is a polite request. A WAF rule is a decision.
The litigation wave reinforces the stakes: Reddit sued Anthropic and later Perplexity, Ziff Davis sued OpenAI, and a parallel wave of licensing deals (OpenAI with News Corp, Axel Springer, Condé Nast, and Stack Overflow) created a two-tier world where licensed publishers get guaranteed ingestion and everyone else negotiates with their robots.txt and their CDN.
Is llms.txt worth publishing?
llms.txt is a Markdown index at your domain root, proposed by Jeremy Howard in September 2024, that lists your most important pages with one-line descriptions so an LLM agent can find your best content without crawling everything. Its companion, llms-full.txt, concatenates the full content of those pages.
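For a sense of how small the file is, here is a sketch that renders a minimal llms.txt following the layout in the llmstxt.org proposal: an H1 title, a blockquote summary, then H2 sections of link lists. The `build_llms_txt` helper, the site name, and the URLs are all hypothetical.

```python
def build_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render an llms.txt body.

    `sections` maps a section heading to (title, url, description) tuples.
    """
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, pages in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, description in pages:
            lines.append(f"- [{title}]({url}): {description}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


llms_txt = build_llms_txt(
    "Example Co",
    "Guides and reference docs for Example Co's analytics product.",
    {
        "Guides": [
            ("GEO guide", "https://example.com/geo", "How AI engines pick citations"),
        ],
        "Reference": [
            ("API docs", "https://example.com/api", "Endpoints and authentication"),
        ],
    },
)
print(llms_txt)
```

Write the result to `/llms.txt` at the domain root; an llms-full.txt would need the page bodies themselves, which this sketch does not handle.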
Here's the honest scorecard as of mid-2026. Every controlled test has failed to find a citation-rate effect. Profound compared 200 matched pages with and without the file across ChatGPT, Perplexity, and Gemini and found no significant difference.
Mike King at iPullRank ran a 40-domain, 90-day A/B test and got the same null result. John Mueller's October 2025 statement was unambiguous: there's no llms.txt support in Google Search or AI features.
Adoption data matches the skepticism. Profound's May 2026 study found just 2.7% of the top 1,000 domains had a valid llms.txt, though notably 14% of the 200 most-cited domains in ChatGPT did.
But the file isn't dead; it's mislabeled. Otterly AI's March 2026 testing found that Claude-User and Perplexity-User do fetch llms.txt and llms-full.txt in a small but measurable share of agent sessions, and Vercel reported the file being requested about 1,800 times per day across a 100,000-site sample, largely by Claude's Research mode. Mordy Oberstein at Wix reports agent referral traffic to sites with llms-full.txt growing 5-10% month over month, albeit from a base under 1% of referrals.
So the practitioner verdict: publish a minimal llms.txt because the cost is an hour of work and the agent curve is real. Just don't book a ranking lift in your forecast, and don't let it displace work that actually moves citations.
The on-page citability playbook
The term GEO comes from Aggarwal et al.'s November 2023 paper, which benchmarked nine content tactics and reported citation-rate lifts of +40.4% for adding source citations, +41.3% for expert quotations, and +39.5% for statistics. Friendly tone did nothing, and keyword stuffing was actively negative.
Then C-SEO Bench re-tested those tactics in 2026 with tighter controls and found the lifts statistically insignificant once answer position and domain authority were held constant. This is the central empirical dispute in the field.
The honest reading: content quality moves citations directionally, the magnitude is contested, and nothing on-page overcomes a weak authority position.
With that caveat stated, the practitioner data converges on which content shapes get extracted:
| Pattern | Evidence | Practical move |
|---|---|---|
| Data-backed claims with named sources | Ahrefs found pages with 3+ sourced statistics cited at ~2.4x baseline (observational, so treat as an upper bound) | Put a specific number and an attribution in every pillar section |
| Lists and tables | Otterly: 62% of cited AI Overview snippets came from a list or table | Convert anything enumerable into structured markup |
| Answer-first structure | Perplexity citations resolve to sentence-level excerpts | State the answer in sentences one and two, evidence after |
| Visible author bylines + Person schema | Seer's dataset shows ~1.5-2x citation rates for attributed pages | Byline every page, link to a real bio page |
| Freshness | Chartbeat: pages updated within 30 days cited at ~1.7x equivalent stale pages | Maintain honest dateModified values, update quarterly |
On structured data, the types that matter are Article, Person, Organization, and FAQPage, in JSON-LD, validated before deploy. The schema.org documentation and Google's Article markup guide are the canonical references. Profound's audit of 100,000 cited pages found valid Article + Person + Organization markup correlated with roughly 1.6-1.9x citation rates, strongest on YMYL topics where identity signals carry the most weight.
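As an illustration of the Article + Person + Organization combination, the sketch below builds the JSON-LD in Python and serializes it. The property names come from schema.org; the `article_jsonld` helper and every value passed to it are hypothetical. It does not validate anything and omits FAQPage, so run the output through a structured-data validator before deploying.

```python
import json


def article_jsonld(
    headline: str,
    url: str,
    author_name: str,
    author_url: str,
    org_name: str,
    org_url: str,
    published: str,
    modified: str,
) -> str:
    """Return Article JSON-LD with nested Person and Organization nodes."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "mainEntityOfPage": url,
        "datePublished": published,
        "dateModified": modified,  # keep this honest; see the freshness row above
        "author": {"@type": "Person", "name": author_name, "url": author_url},
        "publisher": {"@type": "Organization", "name": org_name, "url": org_url},
    }
    return json.dumps(data, indent=2)


snippet = article_jsonld(
    headline="What is generative engine optimization?",
    url="https://example.com/geo",
    author_name="Jane Doe",
    author_url="https://example.com/authors/jane-doe",
    org_name="Example Co",
    org_url="https://example.com",
    published="2026-01-15",
    modified="2026-05-02",
)
print(snippet)
```

The output goes inside a `<script type="application/ld+json">` element in the page's initial HTML, with the author URL pointing at a real bio page.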
The JavaScript problem is back
The single most expensive technical mistake in GEO is client-side rendering. Vercel and searchVIU tested 30 AI crawlers in 2026 and found 69% could not execute JavaScript at all. GPTBot, OAI-SearchBot, ClaudeBot, and Claude-User all received an empty or near-empty DOM from JS-dependent pages.
This is 2015 SEO all over again, and the fix is the same: server-side rendering, static generation, or a pre-rendered HTML snapshot served to known AI user-agents. If your content doesn't exist in the initial HTML response, it doesn't exist for most of the machines deciding whether to cite you.
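A quick way to test this is to look for a key phrase in the raw HTML, ignoring anything inside script tags. The sketch below uses the standard-library `html.parser` on two made-up responses; the class name, helper name, and sample markup are illustrative, and in practice you would pass in the body of an actual non-JS fetch of your page.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text outside <script> and <style>, roughly what a non-JS fetcher sees."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def survives_without_js(html: str, key_phrase: str) -> bool:
    """True if `key_phrase` appears in the text of the initial HTML response."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return key_phrase.lower() in " ".join(parser.parts).lower()


# Hypothetical responses: one server-rendered, one client-side rendered.
ssr_html = "<html><body><h1>Pricing</h1><p>Plans start at $10.</p></body></html>"
csr_html = (
    '<html><body><div id="root"></div>'
    '<script>render("Plans start at $10.")</script></body></html>'
)

print(survives_without_js(ssr_html, "Plans start at $10"))
print(survives_without_js(csr_html, "Plans start at $10"))
```

The client-side version contains the phrase only as a JavaScript string, so a crawler that never runs the script never sees it as content.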
Don't forget the boring base layer either: a clean sitemap.xml with lastmod timestamps, one descriptive <h1>, H2s phrased as the questions users actually ask, <blockquote> for quotes, and real <table> markup for tabular data.
How to measure AI share of voice
You can't manage what you can't see, and rank trackers see nothing here. The replacement metric is AI share of voice: the share of category-relevant prompts where your brand appears in the citation list, weighted by citation position and prompt volume.
The definitions have fragmented across vendors. Profound uses a composite visibility score with exponential position decay (slot one worth 0.40, slot two 0.25, and so on). Ahrefs Brand Radar uses citation rate times prompt volume.
The choice of weighting matters less than the prompt library: 50 hand-picked prompts and 5,000 volume-weighted prompts will tell you different stories, and the gap between them is often bigger than the gap between you and your competitor.
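To make the metric concrete, here is one way to compute a position- and volume-weighted share of voice. It is a sketch under stated assumptions, not any vendor's formula: the 0.40 and 0.25 slot weights come from the description above, while the geometric continuation (ratio 0.625), the normalization against "slot one on every prompt", and all prompt data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PromptResult:
    prompt: str
    volume: int                # estimated prompt volume (hypothetical numbers)
    cited_domains: list[str]   # citation list in order, slot one first


def position_weight(slot: int, top: float = 0.40, ratio: float = 0.625) -> float:
    """Slot 1 -> 0.40, slot 2 -> 0.25; later slots keep decaying (assumed ratio)."""
    return top * ratio ** (slot - 1)


def share_of_voice(results: list[PromptResult], domain: str) -> float:
    """Weighted citations earned, divided by the maximum attainable."""
    earned = 0.0
    possible = 0.0
    for r in results:
        possible += r.volume * position_weight(1)
        if domain in r.cited_domains:
            slot = r.cited_domains.index(domain) + 1
            earned += r.volume * position_weight(slot)
    return earned / possible if possible else 0.0


results = [
    PromptResult("best crm for startups", 1000, ["rival.com", "you.com"]),
    PromptResult("crm pricing comparison", 500, ["you.com"]),
    PromptResult("crm vs erp", 500, ["other.org"]),
]
sov = share_of_voice(results, "you.com")
print(round(sov, 4))
```

Swapping the three hand-picked prompts for a larger volume-weighted library changes `possible` far more than tweaking the decay ratio does, which is the point made above about the prompt library dominating the weighting scheme.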
The operational core of a GEO program is the discovery-gap audit:
- Build a prompt library of 100 to 500 prompts covering head terms, comparisons, and decision queries.
- Run them weekly across ChatGPT, Perplexity, AI Overviews, Gemini, and Copilot, capturing cited sources.
- Flag every prompt where a competitor is cited and you aren't. That's your discovery gap.
- Diagnose each gap: weaker content, missing statistics, no schema, JS-rendered page, or no content at all.
- Prioritize by prompt volume times position weight, close the top 10 to 20 gaps, and re-measure after one engine-update cycle of 2 to 4 weeks.
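The flagging and prioritization steps above can be sketched in a few lines. This assumes you already have captured citation lists per prompt; the data, domain names, and `discovery_gaps` helper are hypothetical, the position weights reuse the 0.40/0.25 decay mentioned earlier with an assumed geometric continuation, and "position weight" is interpreted here as the best slot a competitor holds.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    prompt: str
    volume: int                # estimated prompt volume (hypothetical numbers)
    cited_domains: list[str]   # citation list in order, slot one first


def position_weight(slot: int, top: float = 0.40, ratio: float = 0.625) -> float:
    """Slot 1 -> 0.40, slot 2 -> 0.25, then an assumed geometric decay."""
    return top * ratio ** (slot - 1)


def discovery_gaps(
    observations: list[Observation],
    you: str,
    competitors: list[str],
    top_n: int = 20,
) -> list[tuple[float, str, int]]:
    """Return (priority, prompt, best competitor slot), highest priority first."""
    gaps = []
    for obs in observations:
        if you in obs.cited_domains:
            continue  # you are cited, so there is no gap on this prompt
        rival_slots = [
            obs.cited_domains.index(c) + 1
            for c in competitors
            if c in obs.cited_domains
        ]
        if not rival_slots:
            continue  # nobody you compete with is cited either
        best_slot = min(rival_slots)
        gaps.append((obs.volume * position_weight(best_slot), obs.prompt, best_slot))
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top_n]


observations = [
    Observation("best crm for startups", 900, ["rival.com", "you.com"]),
    Observation("crm pricing comparison", 400, ["rival.com"]),
    Observation("crm migration checklist", 1200, ["wiki.org", "rival.com"]),
    Observation("what is a crm", 5000, ["wiki.org"]),
]
gaps = discovery_gaps(observations, "you.com", ["rival.com"])
for priority, prompt, slot in gaps:
    print(f"{priority:8.1f}  slot {slot}  {prompt}")
```

The diagnosis step (weaker content, missing statistics, no schema, JS rendering) stays manual; the code only decides which prompts deserve that attention first.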
The case-study record suggests this loop works when executed seriously. Hashmeta documented a B2B SaaS client whose ChatGPT citations rose 300% and Perplexity citations rose 156% over six months, measured against a 1,000-prompt library.
Geoly.ai's Velvet & Vine engagement reportedly moved a DTC brand from 12% to 74% citation rate in 90 days. Vendor case studies, so apply the usual discount, but the intervention pattern (new gap-targeting articles, schema rollout, statistic-rich rewrites) is consistent across them.
On the analytics side, expect undercounting. ChatGPT often strips referrers; Perplexity passes them most reliably. Build a GA4 custom channel group for the known AI referrer strings, then reconcile against server-side user-agent logs to catch the dark traffic, an approach Elevar documents for GA4 attribution cleanup.
One number worth watching weekly: AI bot crawl volume in your server logs. It's the leading indicator. Citation changes follow crawl changes.
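A rough way to get that number is to count AI bot hits per user-agent token in your access logs. The sketch below assumes combined log format, where the user-agent is the last quoted field; the bot tokens are a subset of the ones named earlier, and the sample log lines and their user-agent strings are made up for illustration.

```python
import re
from collections import Counter

# Token -> crawler class, following the training/retrieval split described earlier.
BOT_CLASSES = {
    "GPTBot": "training",
    "ClaudeBot": "training",
    "CCBot": "training",
    "Bytespider": "training",
    "OAI-SearchBot": "retrieval",
    "ChatGPT-User": "retrieval",
    "Claude-User": "retrieval",
    "PerplexityBot": "retrieval",
    "Perplexity-User": "retrieval",
}

# Last double-quoted field on the line (the user-agent in combined log format).
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def count_ai_bots(log_lines: list[str]) -> Counter:
    """Count requests per (class, token) for known AI crawler tokens."""
    counts: Counter = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for token, kind in BOT_CLASSES.items():
            if token.lower() in user_agent:
                counts[(kind, token)] += 1
                break
    return counts


# Hypothetical log lines; the user-agent strings are illustrative only.
sample_log = [
    '203.0.113.7 - - [12/May/2026:10:00:00 +0000] "GET /geo HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.8 - - [12/May/2026:10:00:05 +0000] "GET /geo HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '203.0.113.8 - - [12/May/2026:10:00:09 +0000] "GET /api HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '198.51.100.2 - - [12/May/2026:10:01:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0 (X11; Linux x86_64) Chrome/120.0"',
]
counts = count_ai_bots(sample_log)
for (kind, token), hits in sorted(counts.items()):
    print(f"{kind:10} {token:15} {hits}")
```

Bots that spoof a generic browser user-agent, as noted earlier, will not show up in a count like this, so treat the totals as a floor.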
Does any of this convert?
Honestly: it depends on your category, and the evidence is mixed. Amsive's 2025 study found cited pages converting 18.68% better than uncited equivalents in enterprise e-commerce. C-SEO Bench found no significant conversion lift in its controlled setup, and Gap's much-cited case showed citation growth with flat conversions.
The pattern in the data: branded, e-commerce, and consideration-stage queries show the clearest lift. Long-cycle B2B shows the weakest. Track your own SOV-to-revenue correlation instead of borrowing an industry average.
What this means for you
If you run content, SEO, or growth, here's the Monday-morning version.
This week (engineering): Audit your robots.txt against the dual pattern: allow OAI-SearchBot, ChatGPT-User, Claude-User, PerplexityBot, and Googlebot; decide deliberately on GPTBot, ClaudeBot, and CCBot. Check whether your content survives JavaScript-disabled fetching. Start logging AI user-agents server-side.
This month (content): Retrofit your top 20 pages with answer-first openings, at least three sourced statistics each, visible bylines, and Article + Person + Organization JSON-LD. Publish a minimal llms.txt while you're at it; it costs nothing.
This quarter (measurement): Build the prompt library, baseline your AI share of voice with a tool like Profound, Otterly, or Brand Radar, and run your first discovery-gap audit. Make the monthly gap report the artifact your executive team sees.
And keep the strategic frame straight. The C-SEO Bench result isn't a reason to skip GEO; it's a reason to sequence it correctly. Extractability work keeps you from being filtered out.
Authority work, meaning the citable statistics, original data, and named expertise that make engines want to retrieve you in the first place, is what gets you picked.
The web's biggest distribution shift since mobile is being decided one citation slot at a time. The brands measuring it are already taking those slots from the brands that aren't.
Sources
- GEO: Generative Engine Optimization (Aggarwal et al., arXiv 2311.09735)
- C-SEO Bench: Does Conversational SEO Work? (arXiv 2506.11097)
- ChatGPT reaches 900M weekly active users (TechCrunch)
- Gemini app surpasses 750M monthly active users (TechCrunch)
- Ahrefs Brand Radar and AI visibility data
- Ahrefs: AI Overview brand visibility factors
- Overview of OpenAI crawlers (OpenAI developer docs)
- Perplexity crawler documentation
- Google crawler overview, including Google-Extended
- Search Engine Journal: Google clarifies Google-Extended
- GPTBot collapse, OAI-SearchBot surge (Etavrian)
- The /llms.txt proposal (llmstxt.org)
- RFC 9309: Robots Exclusion Protocol
- Cloudflare AI Crawl Control changelog
- Tollbit: State of the Bots
- Otterly AI: AI search monitoring
- Anthropic citations API documentation
- SparkToro 2024 zero-click search study
- Seer Interactive: how traffic from ChatGPT converts
- Hashmeta case study: 3x AI citations in 90 days
- Google: Article structured data
- Google: build and submit a sitemap
- Elevar: fixing direct/unassigned attribution in GA4
- Graphite: The future of search
