What Share of AI Overviews Answers Cite a Source Published in the Last 30 Days? (2026 Data)

11 min read · By Khalid Hamadeh, Founder, LumenGEO
Tags: Google AI Overviews, AI citations, content freshness, original research, citation analysis, GEO data, AIO

27.1% of Google AI Overviews answers (36 of 133) cite a source published in the last 30 days. That figure rises to 37.1% (36 of 97) if you only count the answers where a reference was actually captured; a separate 36 of the 133 AIO answers in our sample returned an AI Overview with no attributable source at all. We captured Google AI Overviews for 160 commercial queries across 8 industries, pulled every cited URL, and fetched each one to extract its datePublished and dateModified. The median cited source is 8 months old by original publish date but just 3.2 months old by last-modified date, because AI Overviews leans heavily on pages that get refreshed often even when their original publish date is years old.

Last updated: July 2026. First-party data — full methodology below.

Two numbers answer the freshness question, and they diverge on purpose. Answer-level: 27.1% of AIO answers cite at least one source from the last 30 days (37.1% among answers with a captured reference). Source-level: among the citations we could date, only 11.5% were published in the last 30 days — most cited pages are not brand-new. The bridge between those two facts is dateModified: 25.4% of citations were last updated in the last 30 days, more than double the publish-date rate. AI Overviews doesn't require new pages. It rewards old pages that keep getting refreshed.
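As a quick check on the answer-level arithmetic, the two headline rates can be reproduced from the counts stated above. This is a minimal Python sketch; the variable names are illustrative and are not taken from the study's own scripts.

```python
# Counts as stated in the article; names are illustrative only.
aio_answers = 133          # queries that returned an AI Overview
answers_with_refs = 97     # answers where at least one reference was captured
answers_with_recent = 36   # answers citing at least one source from the last 30 days


def pct(numerator: int, denominator: int) -> float:
    """Return a percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


# Conservative rate: zero-reference answers count as "no recent source".
conservative = pct(answers_with_recent, aio_answers)
# Captured-only rate: zero-reference answers are excluded from the denominator.
captured_only = pct(answers_with_recent, answers_with_refs)

print(f"conservative: {conservative}%  captured-only: {captured_only}%")
```

The numerator stays the same in both cases; only the denominator changes, which is why the two figures are reported together rather than collapsed into one.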

How fresh are AI Overviews' cited sources, really

We ran the numbers two ways, because "freshness" means something different depending on which date field you measure — and AI Overviews' cited pages tell a genuinely different story by each one.

| Age threshold | % of citations, by publish date | % of citations, by modified date |
| --- | --- | --- |
| Under 30 days old | 11.5% | 25.4% |
| Under 90 days old | 27.1% | 48.4% |
| Under 1 year old | 59.1% | 85.3% |
| Median age | 250.1 days (~8 months) | 96.3 days (~3.2 months) |
| Mean age | 585.3 days (~1.6 years) | 198.6 days (~6.5 months) |

Read the two columns side by side and the pattern is clear: at every threshold we measured, "last modified" freshness runs higher than "originally published" freshness, most sharply at 30 days, where the modified-date rate (25.4%) is more than double the publish-date rate (11.5%). By publish date, 40.9% of dateable citations are over a year old; by modified date, that drops to 14.7%. The mean age (585.3 days) is heavily skewed by a long tail of genuinely old evergreen pages still getting cited, some with 2014-2017 publish dates; the median (250.1 days) is the more representative "typical" figure for a page in our sample.
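To make the mechanics behind those figures concrete, the sketch below computes the same kinds of statistics (share under each age threshold, median, mean) for a list of dates, measured against the fixed 2026-07-02 capture date. The dates in `sample` are invented for illustration and are not the study's data; the function name and the strict "less than N days" cutoff are assumptions as well.

```python
from datetime import date
from statistics import mean, median

CAPTURE_DATE = date(2026, 7, 2)  # fixed capture date stated in the article


def freshness_summary(dates, capture=CAPTURE_DATE, thresholds=(30, 90, 365)):
    """Summarise source ages in days relative to a fixed capture date."""
    ages = [(capture - d).days for d in dates]
    shares = {
        t: round(100 * sum(age < t for age in ages) / len(ages), 1)
        for t in thresholds
    }
    return {
        "shares_pct": shares,
        "median_days": median(ages),
        "mean_days": round(mean(ages), 1),
    }


# Hypothetical publish dates (NOT the study's data): four recent-ish pages
# plus one old evergreen page that drags the mean upward.
sample = [
    date(2026, 6, 20),
    date(2026, 5, 1),
    date(2025, 12, 1),
    date(2025, 3, 1),
    date(2016, 4, 15),
]

summary = freshness_summary(sample)
print(summary)
```

Even in this toy sample, a single decade-old page pulls the mean far above the median, which is the same skew described for the 585.3-day mean versus the 250.1-day median. Running the function once on publish dates and once on modified dates would give the two columns of the table.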

Why published and modified dates tell different stories

The gap between 250.1 days (median publish age) and 96.3 days (median modified age) is not measurement noise — it is the single most useful finding in this dataset. Many of the pages Google AI Overviews cites are not new content. They are old URLs — sometimes years old — that get substantively updated on a recurring cadence, and the update is what keeps them in the cited pool.

This lines up with what we found in our earlier study of why AI citations decay: AI-cited content has a short measured half-life, and the winning defense is a maintenance cadence, not a constant stream of net-new URLs. This freshness dataset is the AI-Overviews-specific evidence for the same mechanic — a page's dateModified field is doing real work in what gets cited, even when datePublished is old.

If your content calendar is optimized for publishing new URLs, you are optimizing for the wrong date field. In our sample, AI Overviews' cited pages look far fresher by last-modified date (median 3.2 months) than by original publish date (median 8 months). A well-maintained page that has existed for two years and was substantively updated last month reads as "fresh" to whatever signal AI Overviews is using — a brand-new page you never touch again does not stay that way for long.

By industry: who gets the freshest citations

Freshness is not evenly distributed. Some categories reward recency far more than others — and the spread between the leader and the laggard is large enough to change how you'd plan a content-refresh calendar by vertical.

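| Industry | AIO answers | % with a <30-day source | Citations | Date coverage | Median source age (published) |
| --- | --- | --- | --- | --- | --- |
| SaaS / software | 19 | 63.2% | 135 | 71.9% | 221 days |
| Travel | 18 | 27.8% | 89 | 50.6% | 375 days |
| Ecommerce / retail | 18 | 27.8% | 132 | 55.3% | 226 days |
| Healthcare / wellness | 16 | 25.0% | 98 | 55.1% | 140 days |
| Finance / fintech | 18 | 22.2% | 99 | 57.6% | 510 days |
| Marketing / SEO | 20 | 20.0% | 121 | 66.1% | 238 days |
| Legal services | 15 | 13.3% | 20 | 60.0% | 163 days |
| Home services* | 9 | 0.0% | 7 | 28.6% | 267 days |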

*Home services has only 9 AIO-present answers and 7 total citation instances at 28.6% date coverage — too sparse to treat as more than directional; the 0% figure is likely an artifact of the small sample, not a real freshness gap.

SaaS/software stands well apart: 63.2% of its AIO answers surfaced a source published in the last 30 days, roughly triple Marketing/SEO's 20.0% (comparing only industries with a reasonably sized sample). A plausible explanation is that SaaS pricing, feature, and comparison queries change fast enough that Google's AI favors recently published pages, while marketing/SEO informational content skews toward evergreen guides. Finance/fintech shows the oldest typical source by publish date (510 days, ~17 months) despite decent date coverage (57.6%), which is consistent with high-stakes categories leaning on established, slower-moving sources rather than the newest thing published.
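The per-industry rows can be cross-checked against the overall totals. The sketch below copies the answer counts and percentages from the industry table and back-computes the implied number of recent-source answers per industry; those implied counts are an inference from the rounded percentages, not figures published in the study.

```python
# (industry, AIO answers, % of answers with a <30-day source), from the table.
rows = [
    ("SaaS / software", 19, 63.2),
    ("Travel", 18, 27.8),
    ("Ecommerce / retail", 18, 27.8),
    ("Healthcare / wellness", 16, 25.0),
    ("Finance / fintech", 18, 22.2),
    ("Marketing / SEO", 20, 20.0),
    ("Legal services", 15, 13.3),
    ("Home services", 9, 0.0),
]


def implied_count(answers: int, pct: float) -> int:
    """Back out the whole-number count behind a rounded percentage."""
    return round(answers * pct / 100)


total_answers = sum(answers for _, answers, _ in rows)
total_recent = sum(implied_count(answers, pct) for _, answers, pct in rows)

rates = {name: pct for name, _, pct in rows}
saas_vs_marketing = round(rates["SaaS / software"] / rates["Marketing / SEO"], 1)

print(total_answers, total_recent, saas_vs_marketing)
```

The industry rows sum to the 133 AIO-present answers, and the implied counts sum to the 36 recent-source answers behind the 27.1% headline, so the breakdown is internally consistent with the overall figure.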

Why 40% of citations don't have a usable date

Of 699 unique URLs cited across our 133 AIO-present answers, 420 (60.1%) yielded an extractable datePublished or dateModified — above the 40% honesty-gate threshold we hold ourselves to before reporting a rate as if it applies broadly. The remaining 39.9% splits into two structurally different buckets:

  • 96 URLs (13.7%) failed to fetch outright. 80 of those were HTTP 403 bot-blocks, concentrated on a handful of large, well-known publishers: Forbes, Investopedia, PCMag, Quora, Serious Eats, and Trustpilot. These sites are cited by AI Overviews but actively block the kind of automated fetch our extraction pipeline uses — so their real freshness is unknown to us, not old.
  • 183 URLs (26.2%) fetched fine but carry no byline date at all. This bucket is dominated by Reddit threads (85 URLs, 0% dated) and Google SERP/shopping result fragments (35 URLs); the remainder is a long tail of small-count domains, including Facebook (4 URLs) and Amazon (3). This is less an extraction failure than a feature of what AI Overviews cites: forum threads and search fragments simply don't carry a datePublished meta tag the way an article does.

Neither bucket should be read as "old." A 403-blocked Forbes page could be five days old; we cannot tell. Our 27.1% / 37.1% headline and the age-bucket percentages rest on the 60.1% of cited URLs we could actually date. The true rate for the rest is unknown, and any tool reporting one clean number without naming this gap is glossing over almost 40% of its own sample.
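The coverage split described above is simple to verify. This sketch recomputes each bucket's share of the 699 unique URLs and applies the 40% coverage gate; the bucket names are illustrative, and treating the gate as "coverage at or above 40%" is an assumption about how the threshold is applied.

```python
total_urls = 699
buckets = {
    "dated": 420,            # datePublished or dateModified extracted
    "fetch_failed": 96,      # mostly HTTP 403 bot-blocks
    "fetched_no_date": 183,  # page loaded but exposes no date
}

# The three buckets should partition the full set of unique URLs.
assert sum(buckets.values()) == total_urls

shares = {name: round(100 * count / total_urls, 1) for name, count in buckets.items()}
undated_share = round(100 - shares["dated"], 1)

HONESTY_GATE_PCT = 40.0  # threshold named in the article
passes_gate = shares["dated"] >= HONESTY_GATE_PCT  # ">=" is an assumption

print(shares, undated_share, passes_gate)
```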

Method

Built to be reproducible, with every judgment call stated plainly.

  • Engine and window: Google AI Overviews, captured for 160 US commercial queries spanning 8 industries (Marketing/SEO, SaaS/software, Finance/fintech, Ecommerce/retail, Travel, Healthcare/wellness, Legal services, Home services), all fetched the same day, 2026-07-02.
  • AIO presence: 133 of 160 queries (83.1%) returned a visible AI Overview; the remaining 27 are excluded from the freshness analysis (no AIO, no citations to date).
  • What we counted: every reference URL the AI Overview displayed — 701 citation instances across 699 unique URLs (some URLs were cited more than once across different queries).
  • Date extraction — a 3-tier fallback: for each unique URL we fetched the live page and tried, in order, (1) JSON-LD structured data (datePublished/dateModified) — the source for 268 of 420 dated URLs, (2) meta tags (Open Graph article:published_time, etc.) — 112 URLs, and (3) a visible-text regex looking for a byline-style date on the rendered HTML — 40 URLs, the least reliable tier. We ran no DOM/JS-rendering library, so a page that only renders a date via client-side JavaScript shows as undateable even if a date exists.
  • Two caught errors, disclosed: the visible-text tier — the least reliable of the three — produced two confirmed false positives, both nulled before analysis. The first: a Semrush tool page whose regex match read as a publish date more than three months in the future (2026-10-13, against the 2026-07-02 capture date). The second: royalcaribbean.com's homepage, whose regex match read as the day after the capture date (2026-07-03). Neither was caught by the codified extraction bound, which only rejects a matched year after 2027 (see extract-dates.mjs) — both were caught in a separate manual cleaning pass. Treat any single visible-text-sourced date with more skepticism than a JSON-LD-sourced one.
  • Two denominators for the headline number, reported together, not collapsed: 27.1% treats the 36 of 133 AIO answers with zero captured references as "no recent source" (conservative). 37.1% excludes those 36 and asks the question only of the 97 answers where a reference was actually captured. We can't tell whether a zero-reference answer reflects a genuinely sourceless AI Overview or a gap in our own scrape of Google's UI, so we report both instead of picking one.
  • Age math: every "age" is computed against the fixed capture date, 2026-07-02, not the date you're reading this — reproducing this analysis later against the same raw data returns identical numbers.
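The 3-tier fallback and the future-date problem described above can be sketched as follows. This is a simplified Python illustration, not the study's extract-dates.mjs: it uses regular expressions instead of a real HTML parser, only recognises ISO-style dates in visible text, ignores JSON-LD `@graph` nesting, and expects `property` before `content` in the meta tag. The `is_plausible` check (reject anything after the capture date) is a hypothetical tighter bound than the "year after 2027" rule the article mentions.

```python
import json
import re
from datetime import date
from typing import Optional, Tuple

CAPTURE_DATE = date(2026, 7, 2)

JSONLD_RE = re.compile(
    r'<script[^>]*type=["\']application/ld\+json["\'][^>]*>(.*?)</script>',
    re.S | re.I,
)
META_RE = re.compile(
    r'<meta[^>]*property=["\']article:published_time["\'][^>]*content=["\']([^"\']+)["\']',
    re.I,
)
TEXT_DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def parse_iso_day(value) -> Optional[date]:
    """Parse the YYYY-MM-DD prefix of an ISO 8601 string, or return None."""
    try:
        return date.fromisoformat(str(value)[:10])
    except ValueError:
        return None


def extract_published(html: str) -> Tuple[Optional[date], Optional[str]]:
    """Return (date, tier) using JSON-LD, then meta tags, then visible text."""
    # Tier 1: JSON-LD structured data.
    for block in JSONLD_RE.findall(html):
        try:
            data = json.loads(block)
        except json.JSONDecodeError:
            continue
        for item in data if isinstance(data, list) else [data]:
            if isinstance(item, dict) and "datePublished" in item:
                found = parse_iso_day(item["datePublished"])
                if found:
                    return found, "json-ld"
    # Tier 2: Open Graph meta tag.
    match = META_RE.search(html)
    if match:
        found = parse_iso_day(match.group(1))
        if found:
            return found, "meta"
    # Tier 3: visible text (least reliable), with scripts, styles, and tags removed.
    visible = re.sub(r"<(script|style)\b.*?</\1>", " ", html, flags=re.S | re.I)
    visible = re.sub(r"<[^>]+>", " ", visible)
    match = TEXT_DATE_RE.search(visible)
    if match:
        found = parse_iso_day(match.group(0))
        if found:
            return found, "visible-text"
    return None, None


def is_plausible(found: date, capture: date = CAPTURE_DATE) -> bool:
    """A publish date after the capture date cannot be real."""
    return found <= capture


page = "<p>Published 2026-10-13 by staff</p>"
found, tier = extract_published(page)
print(found, tier, is_plausible(found))
```

A bound tied to the capture date would have flagged both disclosed false positives (2026-10-13 and 2026-07-03) automatically, whereas a year-based bound lets them through to manual cleaning.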

Limitations

  • Date coverage is 60.1% (420/699 URLs) — above our 40% honesty-gate threshold, so we're not required to strictly reframe every number as "of dateable sources only." Still, every age-bucket percentage here is implicitly a rate among dateable citations unless stated as an answer-level metric (27.1% and 37.1% are answer-level and already account for undated references).
  • The undateable 39.9% is not one thing. 96 URLs (13.7%) failed to fetch, mostly HTTP 403 bot-blocking on major publishers. 183 URLs (26.2%) fetched fine but structurally carry no date — dominated by Reddit and Google SERP fragments. The second bucket is a feature of what AI Overviews cites, not a tooling gap.
  • Denominator ambiguity in the headline. 36 of 133 AIO-present answers (27%) had zero references captured by our scraper — we can't distinguish "genuinely no visible sources" from "our scrape missed them," so we report both the 27.1% and 37.1% figures.
  • The mean is not the typical value. Mean age by publish date (585.3 days) is pulled upward by a long tail of older evergreen sources, some dated 2014-2017, still being cited today. The median (250.1 days) is the more representative figure.
  • Published and modified dates measure different things — any freshness claim needs to specify which one it means, and this piece has tried to be explicit every time.
  • Per-industry breakdowns for Home services (9 answers, 7 citations, 28.6% date coverage) and Legal services (20 citations) are small-n — directional only.
  • Single-snapshot capture, one engine. All fetches happened the same day (2026-07-02) against 160 US commercial queries — not a statistically representative sample of all Google AI Overviews traffic, and Google AIO only, not ChatGPT, Perplexity, or other AI search engines. Treat the directional finding (modified-date freshness roughly doubles publish-date freshness) as durable and the exact percentages as a July-2026 snapshot. For how citation sets behave over time more broadly, see our ChatGPT domain-authority study and Perplexity's Reddit-citation breakdown — first-party snapshots with the same discipline about what a single measurement can and can't tell you. Our live citation tracker shows how these numbers move as we keep sampling.
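One way to see why the small-n industry figures are only directional is to put an interval around them. The sketch below computes a 95% Wilson score interval for a proportion. This analysis is not part of the original study; it assumes each answer is an independent trial, which may not hold for related queries, and the 12-of-19 count for SaaS/software is inferred from the reported 63.2%.

```python
from math import sqrt


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half), min(1.0, center + half)


examples = [
    ("Home services", 0, 9),      # 0% of 9 answers, as reported
    ("SaaS / software", 12, 19),  # 12 is inferred from 63.2% of 19
]

for label, successes, n in examples:
    low, high = wilson_interval(successes, n)
    print(f"{label}: {successes}/{n} -> {100 * low:.1f}% to {100 * high:.1f}%")
```

Under these assumptions, an observed 0 of 9 is still compatible with a true rate of up to roughly 30%, which supports reading the Home services figure as a sample-size artifact rather than a real freshness gap.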

What this means for GEO

  • Stop optimizing for "newly published." Only 11.5% of dateable AIO citations were published in the last 30 days. Chasing a constant stream of net-new URLs is the wrong lever for most categories.
  • Start optimizing for "recently substantively updated." 25.4% of dateable citations were modified in the last 30 days — more than double the publish-date rate. A real refresh (new data, a rewritten section, updated numbers) on a year-old page is a stronger freshness signal than a thin new page you never touch again. Update the visible dateModified timestamp together with a genuine change — see Google AI Overviews optimization for the fuller playbook and why AI citations decay for the maintenance cadence that holds citations once you have them.
  • Weight your refresh calendar by category. SaaS/software rewarded recency at roughly 3x the rate of Marketing/SEO; in a fast-moving vertical, a monthly or quarterly refresh cadence is defensible. In slower-moving, authority-driven categories like finance or legal, a well-maintained but less frequently updated page still gets cited.
  • Know what you can't measure. Roughly 40% of the pages AI Overviews cites — bot-blocked publishers, Reddit threads, SERP fragments — don't expose a usable date. Don't assume a source is old just because you can't verify its date; plan your own site to be the opposite: dated, structured, and fetchable.
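For the "dated, structured, and fetchable" advice, one concrete option is to expose both dates as schema.org Article JSON-LD, the format that supplied most of the dates in this study's extraction. The sketch below builds such a snippet in Python. The helper name and the example headline and dates are hypothetical, and nothing here implies that Google AI Overviews reads these fields in any particular way.

```python
import json
from datetime import date


def article_jsonld(headline: str, published: date, modified: date) -> str:
    """Build a schema.org Article JSON-LD string with both date fields."""
    if modified < published:
        raise ValueError("dateModified cannot precede datePublished")
    payload = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }
    return json.dumps(payload, indent=2)


# Hypothetical page: first published two years ago, substantively updated last month.
snippet = article_jsonld("Example pricing guide", date(2024, 6, 1), date(2026, 6, 15))
print(f'<script type="application/ld+json">\n{snippet}\n</script>')
```

The `dateModified` value should change only when the page content genuinely changes, in line with the "real refresh, not just a date bump" point above.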

FAQ

What share of Google AI Overviews answers cite a source published in the last 30 days?

In our July 2026 first-party study of 160 commercial queries, 27.1% of AI Overviews answers (36 of 133 AIO-present answers) cited at least one source published within the last 30 days. That rises to 37.1% (36 of 97) if you only count answers where a reference was actually captured by our scrape. A separate 36 of the 133 answers returned an AI Overview with no attributable source in our capture, and we can't tell whether that reflects a genuinely sourceless answer or a gap in the scrape itself.

How old is a typical source cited by Google AI Overviews?

By original publish date, the median cited source in our sample is 250.1 days old, about 8 months. The mean (585.3 days, ~1.6 years) is much higher because a long tail of older evergreen pages, some dated 2014-2017, is still getting cited; the median is the more representative "typical" figure. Measured by last-modified date instead, the typical cited source looks far fresher: a median of 96.3 days, about 3.2 months.

Does Google AI Overviews prefer newly published content or recently updated content?

Recently updated, based on our data. Only 11.5% of dateable citations were published in the last 30 days, but 25.4% were last modified in the last 30 days — more than double. Modified-date freshness was higher than publish-date freshness at every threshold we measured (30, 90, and 365 days), though the gap narrows at longer horizons: about 1.8x at 90 days and 1.4x at 365 days. This suggests AI Overviews' cited pool leans on pages that get substantively refreshed on a recurring cadence, even when their original publish date is old, rather than requiring a constant stream of brand-new URLs.

Which industries get the freshest AI Overviews citations?

SaaS/software had the highest rate of recent-source answers in our sample at 63.2%, roughly 3x Marketing/SEO's 20.0% (the two had 19 and 20 AIO-present answers, respectively, so both are reasonably well sampled). Finance/fintech had the oldest typical cited source by publish date at a 510-day median (~17 months), consistent with high-stakes categories leaning on established, slower-moving sources. Home services showed 0%, but with only 9 answers and 7 total citations at 28.6% date coverage, that figure is too small-n to trust as a real finding.

Why don't all AI Overviews citations have a traceable publish date?

Of 699 unique URLs cited in our sample, 60.1% (420) yielded an extractable date. The remaining 39.9% splits into two different problems: 96 URLs (13.7%) failed to fetch, mostly HTTP 403 bot-blocks on sites like Forbes, Investopedia, PCMag, Quora, Serious Eats, and Trustpilot, so their real freshness is unknown to us rather than old. The other 183 URLs (26.2%) fetched fine but structurally carry no byline date — dominated by Reddit threads and Google SERP/shopping fragments, where a datePublished field simply doesn't exist to extract.