Research · March 11, 2026 · 8 min read

Early Data From Our AI Visibility Scanner — What We're Seeing So Far

We looked at early scan data from LLMGeoKit to understand where websites stand on AI visibility. The average score is 59 out of 100, and most sites have clear gaps in the same dimensions.

Early findings: After 200+ scans through LLMGeoKit, the average AI visibility score is 59 out of 100 — a C grade. Most websites get the basics right (metadata, robots.txt) but fall short on AI-specific signals like llms.txt, structured data depth, and content extractability. This is early data from a self-selected sample, not a comprehensive study — but the patterns are consistent enough to be useful.

A note on methodology

This data comes from 200+ scans run through LLMGeoKit between January and March 2026. The sample is self-selected — people who found our scanner and chose to scan their sites. This skews toward more tech-aware companies. The real average across all websites is likely lower. We're sharing these patterns because they're directionally useful, not because they're statistically representative. We'll publish a more rigorous analysis once we have a larger, more diverse dataset.

The Average Score: 59 out of 100

Across our scan data, the average AI visibility score is 59 out of 100 — a C grade. That means the typical website gets the fundamentals right but has significant gaps in the dimensions that differentiate AI-visible sites from invisible ones.

  Average score: 59 out of 100
  Average grade: C
  Scans analyzed: 200+
  Dimensions scored: 7

A score of 59 means there's room to improve on almost every site we've scanned. But it also means that most of the sites reaching our scanner already have some awareness of technical SEO basics — which gives them a head start.

The 7 Dimensions: What We're Seeing

Our scanner evaluates 7 dimensions. While we're not publishing exact pass rates from this early dataset, the patterns across scans are clear enough to rank the dimensions from strongest to weakest:

Where most sites do well

Metadata is the strongest dimension overall. Most CMS platforms generate title tags, meta descriptions, and Open Graph tags by default. This is table stakes — important, but not a differentiator.

Robots.txt is generally fine. Most sites don't explicitly block AI crawlers. The exceptions are notable, though — some sites have inadvertently blocked GPTBot, ClaudeBot, or PerplexityBot, making themselves completely invisible to those AI assistants.
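A quick way to check is to open yoursite.com/robots.txt and look for the AI crawler user agents. As a minimal sketch, a file that explicitly allows the major AI crawlers might look like this (the Disallow path under the wildcard rule is just a placeholder for whatever rules you already have):

    # robots.txt — explicitly allow the major AI crawlers
    User-agent: GPTBot
    Allow: /

    User-agent: ClaudeBot
    Allow: /

    User-agent: PerplexityBot
    Allow: /

    # Everything else keeps your existing rules
    User-agent: *
    Disallow: /admin/

If any of those user agents appears with a Disallow: / line, that assistant can't see your site at all.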

Content structure varies widely. Sites with proper heading hierarchies and semantic HTML score well here. Sites built primarily with divs and inline styles do not.
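For illustration, the difference looks roughly like this (a simplified sketch, not taken from any scanned site):

    <!-- Easy to parse: semantic structure with a clear heading hierarchy -->
    <article>
      <h1>What Is AI Visibility?</h1>
      <section>
        <h2>Why it matters</h2>
        <p>...</p>
      </section>
    </article>

    <!-- Hard to parse: presentation-only markup -->
    <div class="title" style="font-size:32px">What Is AI Visibility?</div>
    <div class="text">...</div>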

Where most sites fall short

Structured data is where the gap starts. Many sites have basic Organization or WebSite schemas (often auto-generated by CMS plugins), but few have the specific schema types that help AI understand page content — Article, FAQ, Product, HowTo. The difference between "has some schema" and "has useful schema" is significant.

Important nuance

Having "some" structured data isn't the same as having useful structured data. An Organization schema tells AI who you are. An Article schema with author, dates, and topic helps AI cite your content. A FAQ schema makes your answers directly extractable. Most sites stop at the first level.
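As a rough sketch of what "useful schema" looks like in practice, here is an Article schema with the citation-relevant fields filled in (all values below are placeholders; the properties you need depend on the page type):

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "Example article title",
      "author": { "@type": "Person", "name": "Jane Doe" },
      "datePublished": "2026-01-15",
      "dateModified": "2026-03-01",
      "keywords": "AI visibility"
    }
    </script>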

Citation signals — author attribution, publication dates, canonical URLs — are frequently incomplete. Many sites have canonical URLs (CMS default) but are missing author and date signals that AI models use to evaluate source credibility and freshness.
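These signals live in the page head and are quick to audit. A minimal sketch, with placeholder URLs and dates:

    <link rel="canonical" href="https://example.com/guide/ai-visibility" />
    <meta name="author" content="Jane Doe" />
    <meta property="article:published_time" content="2026-01-15T09:00:00Z" />
    <meta property="article:modified_time" content="2026-03-01T12:00:00Z" />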

Extractability is consistently weak. FAQ sections, definition lists, data tables, comparison grids — the content structures that AI can directly quote — are rare. Most websites are walls of continuous prose, which is harder for AI to extract specific facts from.
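For illustration, the kinds of structures that score well here look roughly like this (the questions, answers, and definitions below are placeholders):

    <!-- FAQ: each question is a heading, each answer a short standalone paragraph -->
    <h2>Frequently asked questions</h2>
    <h3>How long does a scan take?</h3>
    <p>About 30 seconds per site.</p>

    <!-- Definition list: term/definition pairs AI can quote directly -->
    <dl>
      <dt>llms.txt</dt>
      <dd>A plain-text file at the site root that summarizes the site for AI crawlers.</dd>
    </dl>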

llms.txt is the weakest dimension by far. The vast majority of sites we've scanned have no llms.txt file. This is an emerging standard, so low adoption is expected — but it's also the single biggest quick win available.

The quick win

Creating an llms.txt file takes about 20 minutes and immediately improves your AI visibility score. It's the best impact-to-effort ratio of any single optimization we've seen in the data.
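As a starting point, a minimal llms.txt is a short markdown file at your site root that says what the site is and where the important pages live. A sketch with placeholder names and links:

    # Example Co
    > Example Co builds billing software for small agencies.

    ## Docs
    - [Getting started](https://example.com/docs/start): Setup guide
    - [Pricing](https://example.com/pricing): Plans and billing FAQ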

Consistent Patterns

Even in early data, some patterns are consistent enough to act on:

  1. llms.txt is the biggest gap. Nearly every site we scan is missing it, and it's a 20-minute fix.
  2. Structured data needs depth, not just presence. Move beyond Organization/WebSite to Article, FAQ, and Product schemas on relevant pages.
  3. Extractable content is rare. Adding FAQ sections, comparison tables, and definition lists to key pages makes your content easier for AI to quote.
  4. Citation signals are incomplete. Author, dates, and canonical URLs on every page help AI trust and attribute your content.
  5. Most sites aren't blocking AI crawlers — but check yours to be sure. Accidental blocks on GPTBot or ClaudeBot make you invisible to those platforms.

What's Next

This is early data from a small, self-selected sample. As our scanner processes more sites across more industries, we'll publish a more rigorous analysis with per-dimension pass rates, industry breakdowns, and score distributions.

For now, the directional takeaway is clear: the fundamentals are usually in place, but the AI-specific signals are missing. That's where the opportunity is.

Where does your site stand? Run a free AI visibility scan and see how you score across all 7 dimensions. 30 seconds, no signup required.