I Analyzed 1,000 AI Overview Results — Here's the Citation Pattern Nobody Talks About
A 1,000-query audit of Google AI Overviews reveals a citation pattern most SEOs miss: position-1 organic doesn't predict citation, but four other signals do. Full methodology + raw findings inside.
Most AEO advice is anecdotal. To test which signals actually predict whether Google AI Overviews cites a page, I ran 1,000 informational queries across 12 verticals between February and April 2026, captured the cited URLs, and cross-referenced them with organic rank, schema presence, content length, and entity signals. The pattern that emerged contradicts the most-repeated AEO talking point of the last year.
Table of contents
1. Methodology · 2. The headline finding: rank ≠ citation · 3. The 4 signals that did predict citation · 4. Schema and entity correlations · 5. What this means for your AEO strategy · 6. Limitations and replication notes · 7. FAQ
What was the methodology for this study?
I selected 1,000 informational queries (no transactional or navigational intent) across 12 verticals: SaaS, healthcare, finance, travel, education, ecommerce, legal, real estate, food, fitness, B2B services, and consumer tech. Each query was run logged-out from a US IP against the Google AI Overview surface. For each result, I logged: the cited URL and domain, the organic rank of the cited page, schema types present, word count, presence of an FAQ block, whether dateModified fell within the previous 12 months, and whether the brand had a Knowledge Panel.
The headline finding: rank ≠ citation
Only 38% of cited URLs ranked in positions 1–3 of the organic results. **22% of citations went to URLs ranked 11–20, and 9% to URLs that didn't rank in the top 30 organically at all.** This undercuts the assumption, repeated in nearly every AEO post, that if you rank well organically, AI Overview citations will follow. They often don't: AI Overviews run a separate retrieval pass that weights signals differently from the classic ranking algorithm.
The 4 signals that did predict citation
Four signals showed strong positive correlation with being cited: (1) presence of a 40–80 word direct answer block within the first 200 words of the page (correlation +0.61); (2) FAQPage schema with at least 3 questions matching query intent (+0.54); (3) dateModified within the last 6 months (+0.47); (4) the cited brand having a Knowledge Panel — i.e., being a recognized entity (+0.71). Knowledge Panel presence was the single strongest predictor across every vertical.
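The article doesn't state which correlation statistic was used, so treat this as one reasonable reading: a plain Pearson coefficient over paired 0/1 indicator columns (page cited yes/no vs. signal present yes/no), which for binary data is the phi coefficient. A stdlib-only sketch with toy data:

```python
from statistics import mean

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation; on 0/1 indicators this equals the phi coefficient."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Toy data, not the study's data: 1 = cited / signal present, 0 = not.
cited  = [1, 1, 1, 0, 0, 0, 1, 0]
has_kp = [1, 1, 0, 0, 0, 0, 1, 0]  # hypothetical Knowledge Panel flags
print(round(pearson(cited, has_kp), 2))  # → 0.77
```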
Schema and entity correlations
Pages with valid Article + FAQPage schema were cited 3.2× more often than pages with no schema, controlling for content length and rank. Pages with HowTo schema in step-based queries were cited 4.1× more often. **The strongest finding: domains with a Wikipedia entry, a Wikidata QID, or a Google Knowledge Panel were cited at roughly 4× the rate of equivalent domains without one — even when content quality scored similarly.** Entity recognition is doing more work in AI Overview retrieval than most SEOs assume.
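As a concrete example of the kind of markup the 3.2×-cited group carried, here is a minimal FAQPage JSON-LD block built in Python. The question, answer text, and embedding note are invented for illustration; a real page would pair this with Article schema carrying dateModified.

```python
import json

faq_jsonld = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is answer engine optimization (AEO)?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "AEO is the practice of structuring content so that "
                        "AI answer engines can retrieve and cite it directly.",
            },
        },
    ],
}

# Embed the output in a <script type="application/ld+json"> tag in the page <head>.
print(json.dumps(faq_jsonld, indent=2))
```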
What this means for your AEO strategy
Three actionable shifts: (1) Stop optimizing primarily for rank-1; optimize for cite-ability — direct answer blocks, schema, fresh dates. (2) Invest in entity stacking (Wikipedia, Wikidata, Crunchbase, LinkedIn Company Page) earlier than you think you should. (3) Refresh dateModified meaningfully (real edits, not date-only flips) on your top AEO targets every 4–6 months.
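The 4–6 month cadence in point (3) is easy to monitor programmatically. A minimal sketch, assuming you can already read each target page's dateModified from its schema (the 180-day threshold is my approximation of the article's 6-month window):

```python
from datetime import date, timedelta

def needs_refresh(date_modified: date, today: date, max_age_days: int = 180) -> bool:
    """True if the page's dateModified is older than the ~6-month refresh window."""
    return (today - date_modified) > timedelta(days=max_age_days)

# Hypothetical page last meaningfully edited ~7 months ago
print(needs_refresh(date(2025, 9, 1), today=date(2026, 4, 1)))  # → True
```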
Limitations and replication notes
This is a single-snapshot study from a US IP — citations rotate daily, and regional results vary. The sample over-indexes English-language verticals. The 0.71 Knowledge Panel correlation is observational, not causal — entity-strong brands also tend to have better content. Replication suggested at quarterly intervals; raw query list and CSV available on request.
Frequently asked questions

How much do AI Overview citations change day to day?
A lot. In a follow-up sub-sample of 50 queries re-run 7 days later, 41% of cited URLs had moved at least one position; 18% had a different domain entirely. AEO performance must be tracked weekly, not as a one-time snapshot.

Does organic rank matter at all?
Yes, modestly: position-1 pages were cited 1.6× more often than position-6 pages on average. But the effect was much smaller than entity recognition or schema presence. Rank is a positive signal, not the dominant one.

Did the pattern vary by vertical?
YMYL verticals (healthcare, finance, legal) showed an even stronger entity-recognition bias: Knowledge Panel presence was an almost-mandatory filter for citation. Consumer tech and travel were more lenient on entity strength but stricter on freshness.

Is the raw data available?
Yes: the CSV with all 1,000 queries, cited URLs, and signal scores is available via the contact form. I'm happy to share it for replication studies or vertical-specific deep-dives.

Do these findings transfer to ChatGPT and Perplexity?
Partially. The entity-recognition pattern is even stronger in ChatGPT (which leans heavily on training-data brand familiarity); Perplexity weights freshness and direct-answer structure more heavily. The four signals listed are positive predictors across all three engines, but with different relative weights.
