What Percentage of Perplexity Citations Are Paywalled? We Checked 1,385 Citations (2026 Data)
Just 0.4% of Perplexity's citations sit behind a hard paywall — 5 citations out of 1,385, all from the Wall Street Journal (4) and Barron's (1). Widen the definition to any paywall type — hard, metered, and registration walls combined — and the share rises to 6.1% (84 of 1,385). We measured this directly: 1,385 citations across 160 commercial queries spanning 8 industries, captured 2026-07-02. Paywalled sources are a small minority of what Perplexity actually cites — but they aren't rare in the answers themselves: 33.8% of answers (54 of 160) include at least one paywalled citation somewhere in their list.
Captured 2026-07-02. First-party data — full methodology below.
Two numbers answer two different questions. "How often does Perplexity link to a page I'd need a subscription to read?" — 0.4% (STRICT: hard paywalls only) to 6.1% (BROAD: hard + metered + registration combined). "How often does an answer include at least one paywalled source?" — 33.8%, roughly one answer in three. Perplexity isn't routing buyers into subscription walls as a rule. But a third of its commercial answers do brush up against one, usually via a single Forbes, New York Times, or Consumer Reports citation sitting alongside seven or eight free sources.
This is a live gap in the public record. Ask an AI engine "what percentage of Perplexity citations are paywalled" and you get hedging, not a number, because no one had published one. We built the dataset to close it.
Why this question exists in the first place
Publishers and GEO practitioners have a genuine, unresolved worry: if AI answer engines increasingly cite subscription media (the Wall Street Journal, Consumer Reports), does that squeeze out smaller, freely-accessible sites — or are paywalled publishers becoming irrelevant to AI search because their content can't be crawled and cited at all? Both stories circulate. Neither had a number attached to it, for Perplexity specifically, on commercial buying-decision queries — until now.
The overall paywall rate: two ways to read it
We ran 160 commercial queries — "best [category]," "is [product] worth it," "[A] vs [B]" — through Perplexity across 8 industries and classified every one of the 1,385 citations it returned by its domain's paywall status.
| Metric | % of citations | Count |
|---|---|---|
| STRICT (hard paywall only) | 0.4% | 5 / 1,385 |
| BROAD (hard + metered + registration) | 6.1% | 84 / 1,385 |
| Answers with ≥1 paywalled citation | 33.8% | 54 / 160 answers |
Three paywall classes make up BROAD:
- Hard — no free reads at all. WSJ and Barron's are the only two domains classified this way, accounting for all 5 STRICT citations.
- Metered — a limited number of free articles per month before the wall drops (NYT/Wirecutter, Consumer Reports, RTINGS, Wired, Fortune, Good Housekeeping, Business Insider, Condé Nast Traveler).
- Registration — free to read after creating an account or handing over an email (Forbes — the single largest paywalled-domain contributor by citation count).
STRICT is the narrowest, most defensible number: content you flatly cannot read without paying. BROAD is the more useful planning number, since a metered or registration wall still stops a large share of readers, even if the specific article Perplexity linked to happened to be a free read that day.
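The three headline figures follow directly from the counts in the table. As a minimal sketch of that arithmetic (the variable and function names are ours, chosen for illustration):

```python
# Counts as reported in the table above.
TOTAL_CITATIONS = 1385
TOTAL_ANSWERS = 160
HARD_CITATIONS = 5           # STRICT numerator: hard paywalls only
PAYWALLED_CITATIONS = 84     # BROAD numerator: hard + metered + registration
ANSWERS_WITH_PAYWALL = 54    # answers with at least one paywalled citation


def share(part: int, whole: int) -> float:
    """Return part/whole as an unrounded percentage."""
    return 100 * part / whole


strict = share(HARD_CITATIONS, TOTAL_CITATIONS)
broad = share(PAYWALLED_CITATIONS, TOTAL_CITATIONS)
answer_level = share(ANSWERS_WITH_PAYWALL, TOTAL_ANSWERS)

print(f"STRICT: {strict:.2f}%")
print(f"BROAD: {broad:.2f}%")
print(f"Answers with >=1 paywalled citation: {answer_level:.2f}%")
```

The two citation-level rates share a denominator (all 1,385 citations), while the answer-level rate is computed over the 160 answers, which is why the figures are not directly comparable.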
Where the paywall rate concentrates: per-industry breakdown
Paywall exposure is not evenly spread across commercial categories — it swings from 0% to 14.7% depending on what people are shopping for.
BROAD = hard + metered + registration paywalls combined. Each industry: 20 commercial queries. LumenGEO × DataForSEO, Perplexity, captured 2026-07-02.
| Industry | Queries | Citations | STRICT | BROAD | Answers touching a paywall |
|---|---|---|---|---|---|
| Ecommerce / retail | 20 | 150 | 0% | 14.7% (22) | 65% |
| Healthcare / wellness | 20 | 164 | 0% | 12.8% (21) | 65% |
| Finance / fintech | 20 | 177 | 2.8% (5) | 7.9% (14) | 35% |
| Legal services | 20 | 187 | 0% | 5.9% (11) | 50% |
| Home services | 20 | 189 | 0% | 4.2% (8) | 35% |
| Travel | 20 | 173 | 0% | 4.0% (7) | 15% |
| SaaS / software | 20 | 172 | 0% | 0.6% (1) | 5% |
| Marketing / SEO | 20 | 173 | 0% | 0% (0) | 0% |
| All industries | 160 | 1,385 | 0.4% | 6.1% | 33.8% |
A few things jump out:
- Ecommerce/retail has the highest BROAD paywall share of any vertical (14.7%). Product-buying questions ("best robot vacuum," "is [brand] worth it") pull Perplexity toward consumer-testing publishers (Consumer Reports, RTINGS, Wirecutter), all of them metered. 65% of ecommerce answers touched a paywalled source.
- Healthcare/wellness is close behind (12.8% BROAD, 65% of answers) for the same reason: treatment and wellness-product comparisons pull from the same consumer-safety and lifestyle-media cluster.
- Finance/fintech is the only vertical with hard-paywall citations. All 5 STRICT citations came from finance queries, where Perplexity occasionally cited WSJ or Barron's directly — unsurprising, given what those outlets cover.
- Legal services is a split case. Its BROAD rate (5.9%) is mid-pack, but 50% of answers touched a paywalled source — driven mostly by Forbes Advisor's legal content recurring across many queries rather than many paywalled domains.
- SaaS/software (0.6%) and Marketing/SEO (0%) are the floor. Both categories are served almost entirely by trade publications, vendor pages, and review platforms (G2, Capterra) — structurally open, not paywalled.
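The per-industry rows can be checked against the all-industries totals. The sketch below re-enters the table by hand; the "answers touching a paywall" counts are not printed in the table, so they are derived here from the percentages (for example, 65% of 20 answers = 13), and the tuple layout is our own.

```python
QUERIES_PER_INDUSTRY = 20

# (industry, citations, hard citations, paywalled citations of any class,
#  answers touching a paywall). The last column is derived from the table's
#  percentages and the 20 queries per industry.
ROWS = [
    ("Ecommerce / retail", 150, 0, 22, 13),
    ("Healthcare / wellness", 164, 0, 21, 13),
    ("Finance / fintech", 177, 5, 14, 7),
    ("Legal services", 187, 0, 11, 10),
    ("Home services", 189, 0, 8, 7),
    ("Travel", 173, 0, 7, 3),
    ("SaaS / software", 172, 0, 1, 1),
    ("Marketing / SEO", 173, 0, 0, 0),
]

for name, cites, hard, gated, touched in ROWS:
    strict_pct = 100 * hard / cites
    broad_pct = 100 * gated / cites
    answers_pct = 100 * touched / QUERIES_PER_INDUSTRY
    print(f"{name:<22} STRICT {strict_pct:5.2f}%  BROAD {broad_pct:5.2f}%  answers {answers_pct:3.0f}%")

# Column totals, to compare with the "All industries" row.
total_citations = sum(row[1] for row in ROWS)
total_hard = sum(row[2] for row in ROWS)
total_gated = sum(row[3] for row in ROWS)
total_touched = sum(row[4] for row in ROWS)
print(total_citations, total_hard, total_gated, total_touched)
```

The column totals reproduce the all-industries row: 1,385 citations, 5 hard, 84 paywalled of any class, and 54 answers.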
Which domains actually carry the paywall
Only eleven of the 689 cited domains were classified as paywalled at all. Together they account for all 84 BROAD citations, and the top three (Forbes, NYT/Wirecutter, Consumer Reports) supply 61 of them.
| Domain | Paywall class | Paywalled citations |
|---|---|---|
| forbes.com | Registration | 23 |
| nytimes.com (incl. Wirecutter) | Metered | 22 |
| consumerreports.org | Metered | 16 |
| rtings.com | Metered | 5 |
| wired.com | Metered | 4 |
| fortune.com | Metered | 4 |
| wsj.com | Hard | 4 |
| goodhousekeeping.com | Metered | 3 |
| barrons.com | Hard | 1 |
| businessinsider.com | Metered | 1 |
| cntraveler.com (Condé Nast Traveler) | Metered | 1 |
Forbes.com alone outweighs every hard-paywalled domain combined (23 citations vs. 5). That is the headline inside the headline: the largest paywalled contributor to Perplexity's citations isn't a subscription outlet at all but a registration wall, the mildest form of gating there is. NYT/Wirecutter and Consumer Reports round out the top three, both metered rather than hard-walled. WSJ and Barron's, the only two genuinely subscription-only sources in the sample, contributed a combined 5 citations out of 1,385.
Only 11 of the 689 unique domains cited across the whole study (1.6%) were classified as paywalled at all. Everything else — including Perplexity's two largest single sources by raw volume, reddit.com (146 citations) and youtube.com (133 citations) — is freely accessible. Those two alone account for 20.1% of all citations in the sample (279 of 1,385), which mechanically caps how high a paywalled share could ever climb. See our companion study on Reddit's share of Perplexity citations by industry for how that free, high-volume source behaves category by category, and our broader AI citation index for how brand, aggregator, and editorial domains split across AI search's retrieval landscape.
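Grouping the domain table by paywall class makes the Forbes comparison and the Reddit/YouTube share easy to verify. This is a small sketch over the published counts; the dictionary layout and names are illustrative.

```python
from collections import Counter

TOTAL_CITATIONS = 1385

# Domain -> (paywall class, paywalled citations), copied from the table above.
PAYWALLED_DOMAINS = {
    "forbes.com": ("registration", 23),
    "nytimes.com": ("metered", 22),          # includes Wirecutter
    "consumerreports.org": ("metered", 16),
    "rtings.com": ("metered", 5),
    "wired.com": ("metered", 4),
    "fortune.com": ("metered", 4),
    "wsj.com": ("hard", 4),
    "goodhousekeeping.com": ("metered", 3),
    "barrons.com": ("hard", 1),
    "businessinsider.com": ("metered", 1),
    "cntraveler.com": ("metered", 1),
}

# Sum paywalled citations per class.
by_class = Counter()
for paywall_class, count in PAYWALLED_DOMAINS.values():
    by_class[paywall_class] += count

broad_total = sum(by_class.values())

# Reddit (146) and YouTube (133) are the two largest free sources.
free_platform_citations = 146 + 133
free_platform_share = 100 * free_platform_citations / TOTAL_CITATIONS

print(dict(by_class), broad_total)
print(f"Reddit + YouTube share: {free_platform_share:.1f}%")
```

By class, the 84 BROAD citations split into 56 metered, 23 registration, and 5 hard.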
Why "answers touching a paywall" runs so far ahead of citation share
33.8% of answers vs. 6.1% of citations looks like a contradiction until you run the arithmetic. Perplexity cited an average of 8.7 sources per answer. Across the 54 answers that touched a paywalled source, those 84 paywalled citations average out to about 1.6 per answer — a paywall-touched answer is still overwhelmingly built from free sources. One Forbes or Consumer Reports link supplements an answer's source base; it doesn't replace it.
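The arithmetic behind that paragraph, as a short sketch. One assumption is flagged in the code: the study reports only the overall average answer length, so treating paywall-touched answers as average-length is an approximation.

```python
TOTAL_CITATIONS = 1385
TOTAL_ANSWERS = 160
PAYWALLED_CITATIONS = 84
ANSWERS_WITH_PAYWALL = 54

# Average number of sources cited per answer, across all 160 answers.
avg_citations_per_answer = TOTAL_CITATIONS / TOTAL_ANSWERS

# Average number of paywalled citations within the 54 paywall-touched answers.
paywalled_per_touched_answer = PAYWALLED_CITATIONS / ANSWERS_WITH_PAYWALL

# Assumption: a paywall-touched answer is about as long as the average answer.
# Under that assumption, this estimates the free citations it still contains.
free_slots_estimate = avg_citations_per_answer - paywalled_per_touched_answer

print(f"{avg_citations_per_answer:.2f} citations per answer")
print(f"{paywalled_per_touched_answer:.2f} paywalled citations per touched answer")
print(f"about {free_slots_estimate:.1f} free citations per touched answer")
```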
The practical read: a paywalled citation is usually a garnish, not the main course. A competitor cited via Forbes Advisor or Consumer Reports isn't winning the whole answer on that one link. They're getting one slot in a list still dominated by free content: Reddit threads, YouTube reviews, open editorial and vendor pages. The domain-authority study we ran on ChatGPT citations found something structurally similar: the factor under study (there, Tranco rank; here, access model) is an edge, not a gate. Being paywalled doesn't disqualify a source, and being free doesn't guarantee a citation, but free, open, easily-fetched content fills the other 93.9% of the list.
Is your content actually fetchable — or effectively invisible?
Perplexity cites open, freely-crawlable pages 93.9% of the time in our sample. Your free audit checks whether your site is structured to be fetched, extracted, and cited at all.
Methodology
Built to be reproducible.
- Engine: Perplexity, queried live with web search enabled, US locale, single capture on 2026-07-02.
- Sample: 8 industries × 20 commercial queries = 160 queries ("best [category]," "is [product] worth it," "[A] vs [B]" intent patterns): ecommerce/retail, healthcare/wellness, finance/fintech, legal services, home services, travel, SaaS/software, marketing/SEO.
- What we counted: every source Perplexity returned as an inline citation — 1,385 citations across 689 unique domains, averaging 8.66 per answer.
- How domains were classified: we explicitly classified the top 120 domains by citation count (57.8% of all citations), then grepped the remaining long tail against roughly 110 known paywall-brand names. In total, 124 of 689 domains were explicitly classified; everything else defaults to "free." Each classified domain got one of four states: free, registration (Forbes), metered (NYT/Wirecutter, Consumer Reports, RTINGS, Wired, Fortune, Good Housekeeping, Business Insider, Condé Nast Traveler), or hard (WSJ, Barron's) — see Limitations below for what "domain-level" classification does and doesn't confirm.
- Two metrics reported: STRICT (hard-paywall citations only) and BROAD (hard + metered + registration), plus an answer-level metric (share of the 160 answers with ≥1 paywalled citation).
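The classification and the three metrics described above can be sketched as follows. This is an illustration of the domain-level approach, not the pipeline we ran: the function names, the suffix-matching rule, and the sample URLs are all hypothetical, and any domain missing from the lookup defaults to "free", mirroring the stated default.

```python
from urllib.parse import urlparse

# Domain-level paywall classes for the eleven paywalled domains in the study.
PAYWALL_CLASS = {
    "wsj.com": "hard",
    "barrons.com": "hard",
    "forbes.com": "registration",
    "nytimes.com": "metered",
    "consumerreports.org": "metered",
    "rtings.com": "metered",
    "wired.com": "metered",
    "fortune.com": "metered",
    "goodhousekeeping.com": "metered",
    "businessinsider.com": "metered",
    "cntraveler.com": "metered",
}


def classify(url: str) -> str:
    """Return the paywall class of a URL's domain, defaulting to 'free'."""
    host = (urlparse(url).hostname or "").lower()
    for domain, paywall_class in PAYWALL_CLASS.items():
        # Match the bare domain or any subdomain of it (e.g. www.wsj.com).
        if host == domain or host.endswith("." + domain):
            return paywall_class
    return "free"


def paywall_metrics(answers: list[list[str]]) -> dict[str, float]:
    """Compute STRICT, BROAD, and answer-level rates for non-empty input.

    `answers` holds one list of cited URLs per answer.
    """
    classes = [[classify(url) for url in answer] for answer in answers]
    flat = [c for answer in classes for c in answer]
    hard = flat.count("hard")
    gated = len(flat) - flat.count("free")
    touched = sum(any(c != "free" for c in answer) for answer in classes)
    return {
        "strict_pct": 100 * hard / len(flat),
        "broad_pct": 100 * gated / len(flat),
        "answers_touched_pct": 100 * touched / len(classes),
    }


# Hypothetical sample: three answers, ten citations in total.
sample_answers = [
    ["https://www.wsj.com/a", "https://www.reddit.com/r/x",
     "https://www.youtube.com/watch?v=1", "https://example.com/p"],
    ["https://www.forbes.com/advisor/x", "https://example.org/a",
     "https://www.nytimes.com/wirecutter/x", "https://example.net"],
    ["https://example.com/1", "https://example.com/2"],
]
metrics = paywall_metrics(sample_answers)
print(metrics)
```

Because the lookup works on the domain alone, it carries the same caveat as the study: a "metered" result says nothing about whether that specific URL was gated at fetch time.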
Limitations
Read before treating any number above as more precise than it is.
- Single-snapshot capture. One Perplexity run per query on 2026-07-02 — not a longitudinal average. The citation mix can and does shift run-to-run; treat the percentages as a snapshot, not a permanent ranking.
- Domain-level classification, not URL-level. A domain marked "paywalled" reflects its general access model, not confirmation that the exact cited URL was gated when Perplexity fetched it. Several specific articles we spot-checked were, in fact, free to read.
- Long-tail coverage is partial. Only the top 120 domains (57.8% of all citations) plus a targeted grep against ~110 paywall-brand names were explicitly classified. The remaining 553 single-citation long-tail domains default to "free." If a hidden paywalled outlet exists among them, the true BROAD percentage could be marginally higher than 6.1%.
- Four domains relied on indirect verification. Bot defenses blocked a clean fetch for barrons.com, consumerreports.org, and forbes.com (all three classified paywalled), and g2.com (checked and confirmed free); each classification rests on corroborating public evidence rather than a direct page check.
- Per-industry percentages rest on 20 queries each. With an n of 20 per vertical, a single citation swings the smaller per-industry numbers materially — SaaS/software's 0.6% is 1 citation out of 172, and Marketing/SEO's 0% is 0 out of 173. Treat the per-industry breakdown as directional, not precise to a tenth of a percent.
- Reddit and YouTube structurally cap the ceiling. These two free platforms alone are 20% of all citations in this sample (146 + 133 of 1,385) — a fifth of the pool is already locked onto two unambiguously free sources before anything else is counted.
- Live paywall enforcement varies by session — logged-in vs. anonymous, geography, referrer, and monthly-meter state all affect whether a wall appears. A stateless fetch can't always observe a metered gate (nytimes.com/Wirecutter showed no server-side block on a fresh check, but its well-documented ~10-free-articles/month model is exactly the gate a returning reader would eventually hit).
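To put a number on the per-industry caveat above, the sketch below computes how far each rate moves if a single citation or a single answer were classified differently. The citation counts come from the per-industry table; the names are illustrative.

```python
QUERIES_PER_INDUSTRY = 20

# Citations per industry, from the per-industry table.
CITATIONS = {
    "Ecommerce / retail": 150,
    "Healthcare / wellness": 164,
    "Finance / fintech": 177,
    "Legal services": 187,
    "Home services": 189,
    "Travel": 173,
    "SaaS / software": 172,
    "Marketing / SEO": 173,
}

# Percentage-point change in an industry's BROAD rate per reclassified citation.
swing_per_citation = {name: 100 / n for name, n in CITATIONS.items()}

# Percentage-point change in "answers touching a paywall" per changed answer.
swing_per_answer = 100 / QUERIES_PER_INDUSTRY

for name, swing in swing_per_citation.items():
    print(f"{name:<22} one citation = {swing:.2f} points")
print(f"one answer = {swing_per_answer:.0f} points in any industry")
```

One citation moves a per-industry BROAD rate by roughly half a point, and one answer moves the answer-level rate by five points, which is why the breakdown is best read as directional.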
We publish the limits because a paywall number is easy to overstate in either direction — "AI search ignores paywalled media" and "AI search is squeezing out the open web" are both simpler stories than the data supports. For how we approach measuring AI citations generally, see how to track your brand's citations in ChatGPT, Claude, and Perplexity and the best Perplexity SEO tracking tools.
What this means for GEO
- Paywalling your own content is a real citation cost, not a neutral choice. 93.9% of citations in this sample went to freely-accessible domains, and only 1.6% of unique cited domains were paywalled at all. Hard and metered domains together earned just 61 of 1,385 citations (4.4%), so content behind either kind of wall competes for a small slice of what Perplexity cites, regardless of editorial authority.
- Registration walls are the least costly gate, if you must gate something (based on Forbes, our only registration-class domain). Forbes was the single largest paywalled contributor — bigger than every hard-paywalled domain combined — suggesting a light registration gate carries a much smaller citation tax than a hard or metered one.
- In ecommerce/retail or healthcare/wellness, expect paywalled competitors in roughly two-thirds of answers (65% in both verticals). That reflects those verticals' review ecosystems (Consumer Reports, Wirecutter, RTINGS, Good Housekeeping). The lever isn't to out-paywall them; it's to be the best free, extractable alternative next to them. See our Perplexity SEO guide for how sources actually get selected.
- In SaaS/software and marketing/SEO, paywalls are close to irrelevant on both counts: BROAD citation share is 0.6% and 0%, and only 5% and 0% of answers touch a paywalled source. The competition for citations there is almost entirely other open content, not subscription media. Legal services doesn't fit that grouping: its BROAD citation share is a modest 5.9%, but 50% of legal-services answers touch a paywalled source (mostly recurring Forbes Advisor content). It is light on citation share, heavy on answer-level exposure.
- Don't confuse "a paywalled competitor got cited" with "you lost the answer." A paywall-touched answer averages 1.6 paywalled citations, against an overall average of 8.7 citations per answer, so roughly 7 slots are still decided by the same free-content competition every other GEO lever addresses.
FAQ
What percentage of Perplexity's citations are paywalled?
In our first-party study of 1,385 citations across 160 commercial queries (2026-07-02), only 0.4% sit behind a hard paywall (5 citations — WSJ and Barron's). Widening the definition to include metered and registration walls raises the share to 6.1% (84 citations). Despite that small share, 33.8% of answers (54 of 160) include at least one paywalled citation somewhere in their list.
Does Perplexity avoid citing paywalled sources like the New York Times or Wall Street Journal?
No, but it clearly favors open content. Perplexity does cite subscription and metered outlets — NYT/Wirecutter (22 citations), WSJ (4), and Barron's (1) all appeared in our sample — but they're a small fraction of the total. Forbes, a registration wall rather than a hard paywall, was actually the single largest paywalled-domain contributor at 23 citations. Freely-accessible domains made up 93.9% of all citations in the study.
Which industries have the highest rate of paywalled Perplexity citations?
Ecommerce/retail had the highest BROAD-paywalled share at 14.7%, followed by healthcare/wellness at 12.8% — both driven by consumer-testing publishers like Consumer Reports, RTINGS, and Wirecutter. Marketing/SEO had zero paywalled citations, and SaaS/software had just 0.6%, since both are served almost entirely by trade publications and vendor content.
Should I put my content behind a paywall if I want to be cited by Perplexity?
The data argues against it. Only 1.6% of the 689 unique domains in our sample were classified as paywalled at all, and 93.9% of citations went to freely-accessible pages. If you must gate something, a light registration wall (Forbes' model — our only registration-class domain) appears to carry a much smaller citation cost than a metered or hard paywall — Forbes was the largest single paywalled contributor, ahead of every hard-paywalled outlet combined.
How was this Perplexity paywall study measured?
We ran 160 commercial queries (8 industries × 20 buyer-intent queries: "best X," "is X worth it," "A vs B") through Perplexity with web search enabled on 2026-07-02, and classified every one of the 1,385 inline citations by its domain's access model — free, registration, metered, or hard paywall. Classification covered the top 120 domains by citation volume (57.8% of all citations) plus a targeted grep of the long tail against known paywall brands; unchecked domains default to free. Full limits, including the domain-level (not URL-level) method, are disclosed above.