How Webappski Checks Whether AI Recommends YOUR Site — the Honest AEO Audit We Run for Clients (and Why It Beats a Closed Score)

Jun 19, 2026

When a buyer asks ChatGPT, Gemini, or Claude for the best provider in your category, are you in the answer? Webappski's AEO audit reads each engine's answer to your real buyer questions, lets you click any result open and read the engine's word-for-word answer behind it, and states in writing what it does not measure. This is the exact audit we run on a client domain — walked through on a real one — and why a checkable report beats a closed dashboard score.

By Alex Isa, Lead Full-Stack Developer & AEO Lead, Webappski

To check whether AI recommends your site, run your real buyer questions through each engine and read the verbatim answers — not one closed score. That is a Webappski AEO audit: we point our open tool at your domain and give you a report where every result clicks open to the engine's verbatim answer — showing where ChatGPT, Gemini, and Claude name you, where they name a rival, and what we did not measure.

An AEO audit is a structured read of how AI answer engines respond when a buyer asks for the best provider in your category. It is not a guess and not a vanity dashboard: it queries each engine on the exact questions your customers ask, records whether your brand is named or cited, and keeps the engine's word-for-word answer so every result is re-checkable. Webappski is an Answer Engine Optimization studio; we run this audit on a client's site the same way we run it on our own brand — with the same open tool and the same honesty about its limits.

This article is a buyer's walkthrough of that audit, not a tour of the tool. We show how we configure the audit for a client domain and a regional Poland/DACH question basket, run it, and what the client actually receives. The tool's internals — why the score is auditable, how citations are matched, why a single zero is reported as a range — are compressed below into one-paragraph trust proofs, with the full engineering deferred to our companion piece on whether you can trust an AI visibility score.

How Do I Check Whether AI Recommends My Site?

Ask the engines the questions your buyers ask, then read the answers — do not trust a single number. A real check means running each AI engine on your category's buyer questions ("best CDN providers 2026", "best low-latency video streaming CDN") and recording, per question and per engine, whether your brand is named, whether your domain is cited as a source, and which competitor took the spot you wanted.

You can do this by hand — open ChatGPT, Gemini, and Claude, ask the question, and read who they recommend. That is exactly how every honest audit starts, and you should do it once before paying anyone. The reason Webappski runs it as a structured audit instead is reproducibility: a one-shot manual check is a single sample of a non-deterministic system, so it tells you almost nothing about a trend. A configured audit freezes the questions, queries each engine's API with your keys, and saves every raw answer, so next month's run is comparable to this one's. That is the difference between a screenshot and a measurement.
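One way to make "frozen" checkable is to fingerprint the question basket and store that fingerprint with each run. The sketch below is an assumption about how this could be done, not a description of how Webappski's tool does it; the function names and the run layout are hypothetical.

```python
import hashlib
import json

def basket_fingerprint(questions: list[str]) -> str:
    """Short, stable hash of the question basket (order-insensitive)."""
    canonical = json.dumps(sorted(q.strip() for q in questions), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]

def comparable(run_a: dict, run_b: dict) -> bool:
    """Two runs are comparable only if they asked the same questions of the same engines."""
    return (
        run_a["basket"] == run_b["basket"]
        and sorted(run_a["engines"]) == sorted(run_b["engines"])
    )

engines = ["chatgpt", "gemini", "claude"]
june = {
    "basket": basket_fingerprint(
        ["best CDN providers 2026", "best low-latency video streaming CDN 2026"]
    ),
    "engines": engines,
}
july = {
    # Same questions in a different order: still the same basket.
    "basket": basket_fingerprint(
        ["best low-latency video streaming CDN 2026", "best CDN providers 2026"]
    ),
    "engines": engines,
}
print(comparable(june, july))
```

If someone edits a question between runs, the fingerprints differ and the two numbers should not be read as a trend.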

What Does Webappski Configure Before Running a Client Audit?

We configure three things for the client: their domain, their real buyer questions, and the region those buyers are in. The audit is driven by a single config file (.aeo-tracker.json) that names the client's brand and registrable domain, the engines to query, and a question basket built from the prompts the client's customers actually type into an AI assistant.

The questions are the audit. A weak basket of vanity phrases produces a flattering, useless number; a basket of the buyer's real decision questions produces a number tied to the moments when buyers actually choose a provider. So we draft the basket from the client's category and sales calls (for a content-delivery provider, questions like "best CDN providers 2026", "best low-latency video streaming CDN 2026", and "alternatives to the market-leading CDN 2026") and we freeze it, so every future run measures the same ground.
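As a rough picture of what such a config could hold, here is a minimal Python sketch that builds one and sanity-checks it. The field names (brand, domain, engines, questions) are assumptions made for this example, not the documented .aeo-tracker.json schema, and the brand is the fictional Northwind CDN used in the walkthrough later in this article.

```python
import json

# Hypothetical config shape. Field names are illustrative assumptions,
# not the real .aeo-tracker.json schema.
config = {
    "brand": "Northwind CDN",
    "domain": "northwind.example",  # registrable domain, bare host only
    "engines": ["chatgpt", "gemini", "claude"],
    "questions": [
        "best CDN providers 2026",
        "best low-latency video streaming CDN 2026",
        "alternatives to the market-leading CDN 2026",
    ],
}

def validate(cfg: dict) -> list[str]:
    """Return human-readable problems; an empty list means the config looks usable."""
    problems = []
    for key in ("brand", "domain", "engines", "questions"):
        if not cfg.get(key):
            problems.append(f"missing or empty field: {key}")
    domain = str(cfg.get("domain", ""))
    if "/" in domain:
        problems.append("domain should be a bare host, not a URL")
    questions = cfg.get("questions") or []
    if len(set(questions)) != len(questions):
        problems.append("duplicate questions in the basket")
    return problems

print(json.dumps(config, indent=2))
print(validate(config))
```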

For European clients we add a regional axis. The audit can ask each question localized to a market with --lang and --geo — for example, running the Poland/DACH basket in the buyer's language. We label this honestly as soft-geo: it localizes the prompt, it does not spoof a German or Polish IP, so it is a directional regional signal rather than a guarantee of what a logged-in user in Munich sees. Naming that limit is the point — a closed tool would let the regional number read as more than it is.

What's Actually In the Report — and How You Verify Every Number

The report is a self-contained HTML file that shows the visibility number, the matrix of engines against questions, the verbatim answer behind every cell, the competitors who took the spots, and a header line stating exactly what the run did and did not measure. Every number is yours to keep, re-open, and re-run — nothing is locked behind a login.

Concretely, the report opens with one headline sentence of fact — in the illustrative walkthrough below, "Named in 5 of 18 answers" — followed by a grid. Each cell in the grid is clickable: open it and you read the engine's word-for-word answer to that question, so you can see why a cell is a hit or a miss instead of trusting a parser you cannot inspect. And because AI answers are non-deterministic, each cell can be asked several times: the report shows the presence rate with a Wilson confidence interval, so a single lucky or unlucky answer never decides your number. The competitor table lists only brands that two independent models both extracted from the answers, which is what keeps it from listing names a single model hallucinated.
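The step from grid to headline is plain counting. The following sketch assumes a hypothetical Cell record (the real report's internal structure is not described here) and uses synthetic data shaped like the illustrative example: 3 engines, 6 questions, 5 hits.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    engine: str
    question: str
    named: bool       # brand appears in the answer text
    cited: bool       # brand's domain appears among the cited sources
    raw_answer: str   # verbatim engine answer, kept as the receipt

def headline(cells: list[Cell]) -> str:
    """One sentence of fact: hits over total, plus the rounded presence rate."""
    hits = sum(c.named for c in cells)
    total = len(cells)
    rate = round(100 * hits / total) if total else 0
    return f"Named in {hits} of {total} answers ({rate}% presence rate)"

def hits_by_engine(cells: list[Cell]) -> dict[str, tuple[int, int]]:
    """Per-engine (hits, total), so one strong engine cannot hide a weak one."""
    out: dict[str, tuple[int, int]] = {}
    for c in cells:
        hits, total = out.get(c.engine, (0, 0))
        out[c.engine] = (hits + int(c.named), total + 1)
    return out

# Synthetic grid: 3 engines x 6 questions = 18 cells, 5 of them hits.
engines = ["ChatGPT", "Gemini", "Claude"]
questions = [f"question {i}" for i in range(1, 7)]
hit_cells = {
    ("Gemini", "question 1"), ("Gemini", "question 2"),
    ("Claude", "question 1"), ("Claude", "question 3"),
    ("ChatGPT", "question 4"),
}
cells = [
    Cell(e, q, (e, q) in hit_cells, False, "<verbatim answer>")
    for e in engines for q in questions
]
print(headline(cells))  # Named in 5 of 18 answers (28% presence rate)
print(hits_by_engine(cells))
```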

Below is an illustrative fragment showing the shape of a client-domain audit deliverable — built around a fictional content-delivery provider we call Northwind CDN (northwind.example), on a frozen CDN buyer basket. Illustrative example — synthetic data, not a real client run, shown only to demonstrate the report's shape. We use a made-up brand on purpose: the layout, the receipts and the honesty markers are exactly what a client deliverable carries, while no real third party's figures appear.

Report element	What the client sees in the deliverable
Headline fact	Named in 5 of 18 answers (a 28% presence rate on the frozen CDN basket) — illustrative figures
A cell that is a hit — clicked open	Gemini, "best low-latency video streaming CDN 2026": "Northwind CDN: a strong choice for ultra-low-latency, real-time streaming, with a large global edge network..." (illustrative — not a real engine response)
A second hit on another engine — clicked open	Claude, "best CDN providers 2026": "...providers worth shortlisting include Northwind CDN for its edge footprint..." (illustrative — not a real engine response)
A cell that is a miss — clicked open	ChatGPT, "alternatives to the market-leading CDN 2026": answer names CDN A, CDN B, and a hyperscaler CDN; the audited brand is absent, a gap visible in the raw text rather than a parser error (illustrative)
Citation match (your domain cited as a source)	Gemini cited the brand's own domain (northwind.example) as a source on "best CDN providers 2026" — counted as the client's, because it is the client's registrable domain
Competitors who took the spots (two-model verified)	CDN A, CDN B, CDN C, a hyperscaler CDN, and a niche CDN (illustrative generic labels)
Measurement-surface disclaimer (header line)	API surface via your own keys (ChatGPT, Gemini, Claude) — a reproducible proxy, not the consumer apps; excludes Google AI Overviews / Copilot, which have no query API, and Perplexity, which is not part of this reproducible API run and is available only by manual paste

That fragment is the whole argument. In a real deliverable the 28% is not a number to take on faith — under it sits the exact Gemini sentence that names the brand and the exact ChatGPT answer that leaves it out, so the client can verify both with their own eyes and re-run the audit with their own keys. (The figures above are illustrative; on your own domain the receipts are your engines' real words.)

Why Is This Audit More Trustworthy Than a Closed Visibility Score?

Because every number in it is backed by evidence you can re-derive, and the report names its own blind spots. Six engineering choices make the audit honest; each is summarized here in one paragraph, with the full mechanism in our companion engineering piece so this walkthrough stays about your site, not our tooling.

It states what it does not measure. Every report carries a fixed disclaimer line: it measures each engine's API surface with your own keys — a reproducible proxy — not the consumer app a human uses, and it does not cover Google AI Overviews / AI Mode or Microsoft Copilot, which have no first-party query API. No closed tracker we have seen names what it omits; a number that hides its boundary is making a sale, not a measurement.

The score is not proprietary — and that is a feature for you. The visibility number is a plain presence rate over an open formula, auditable from the saved answers with your own keys. You cannot re-derive a competitor's closed dashboard score; you can re-derive ours from the files on disk, which is exactly why it is worth trusting.

You get the artifact, not a login. The output is a self-contained HTML report you keep, re-open, and re-run with your own keys — not a dashboard you rent access to. Every result clicks open to the engine's exact words, so the number is yours to verify, not ours to assert.

Every cell has a receipt, and competitors are dual-model verified. Behind each cell is a click-to-reveal of the engine's verbatim answer, so a hit or a miss is readable evidence rather than a parser's claim; and the competitor table keeps only brands two independent models both named, with a verbatim guard, which is the proof the tool does not invent rivals.
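The dual-model rule with a verbatim guard can be pictured as a set intersection followed by a literal check against the saved answers. This is a simplified sketch under our own assumptions (the function name, case-insensitive comparison, and plain substring guard are illustrative), not the tool's actual implementation.

```python
def verified_competitors(
    extracted_a: list[str], extracted_b: list[str], answers: list[str]
) -> list[str]:
    """Keep a name only if both extractors returned it AND it literally
    occurs in at least one saved answer (the verbatim guard)."""
    corpus = "\n".join(answers).casefold()
    b_keys = {name.casefold() for name in extracted_b}
    kept: list[str] = []
    for name in extracted_a:
        key = name.casefold()
        if key in b_keys and key in corpus and name not in kept:
            kept.append(name)
    return sorted(kept)

answers = [
    "Top picks include CDN A and CDN B.",
    "CDN A is often shortlisted alongside CDN C.",
]
model_a = ["CDN A", "CDN B", "CDN Z", "CDN Q"]  # CDN Z: named by one model only
model_b = ["CDN A", "cdn b", "CDN C", "CDN Q"]  # CDN Q: both models, but not in any answer

print(verified_competitors(model_a, model_b, answers))  # ['CDN A', 'CDN B']
```

A name that only one model produced is dropped, and so is a name both models produced that never appears in the raw text. A plain substring guard is deliberately crude; a stricter version would match on word boundaries.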

Citations are matched to your real domain, not a look-alike. A citation counts as yours when the source host is your registrable domain or a true subdomain of it; a spoof host that merely contains your name as a string does not count. So your blog subdomain is credited and an impersonating host is not, which keeps the "cited you" figure honest in both directions. Name-matching is just as careful: a real mention is not lost to a stray separator, so a brand written as "Northwind CDN", "Northwind-CDN", or "NorthwindCDN" is treated as the same brand. The absence of that separator tolerance is exactly the class of bug that used to turn a genuine mention into a false 0%: the engine named you, but a literal byte-for-byte match missed it because of a hyphen.
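Both rules are small enough to sketch. The code below is a minimal illustration, assuming the registrable domain is supplied by the config (it does not consult the Public Suffix List) and that the tolerated separators are whitespace, hyphen, underscore, and dot. The function names and the spoof hosts are hypothetical.

```python
import re
from urllib.parse import urlsplit

def is_clients_citation(source_url: str, registrable_domain: str) -> bool:
    """True when the cited host IS the domain or a real subdomain of it.
    URLs without a scheme have no parsed hostname and are rejected."""
    host = (urlsplit(source_url).hostname or "").lower().rstrip(".")
    domain = registrable_domain.lower()
    return host == domain or host.endswith("." + domain)

def mentions_brand(answer: str, brand: str) -> bool:
    """Separator-tolerant, case-insensitive brand match on word boundaries."""
    tokens = re.findall(r"[A-Za-z0-9]+", brand)
    if not tokens:
        return False
    pattern = (
        r"(?<![A-Za-z0-9])"
        + r"[\s\-_.]*".join(re.escape(t) for t in tokens)
        + r"(?![A-Za-z0-9])"
    )
    return re.search(pattern, answer, flags=re.IGNORECASE) is not None

domain = "northwind.example"
print(is_clients_citation("https://blog.northwind.example/post", domain))   # True
print(is_clients_citation("https://northwind.example.evil.test/", domain))  # False
print(is_clients_citation("https://fake-northwind.example/", domain))       # False

for text in ("Try Northwind CDN", "Try Northwind-CDN", "Try NorthwindCDN"):
    print(mentions_brand(text, "Northwind CDN"))                            # True each time
```

Anchoring the suffix check on a leading dot is what separates a true subdomain from a host that merely ends in the same characters.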

A 0% is reported as a hypothesis, not a verdict. AI answers are non-deterministic, so the audit can sample each cell several times and report the presence rate as a Wilson confidence interval, alongside a representativeness panel that shows how many of your product lines the basket actually touched. A single empty cell is an interval to test next run, not a declaration that you are invisible — and small samples are flagged as small, never sold as certainty. The report also splits your hit-rate by how close each question sits to what you actually sell — core (your direct offering), adjacent (a neighbouring market you compete in), and aspirational (where you are aiming next) — so a low overall number reads in context: missing in an aspirational category is ambition, not invisibility, while a gap in your core is the one that costs you buyers.
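To see why a zero from a small sample is a range rather than a verdict, here is the standard Wilson score interval as a short Python function. The 95% level (z = 1.96) is our assumption for the example; the level the report uses is not stated here.

```python
import math

def wilson_interval(hits: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a presence rate of hits/n (default about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= hits <= n:
        raise ValueError("hits must be between 0 and n")
    p = hits / n
    z2 = z * z
    denom = 1 + z2 / n
    centre = (p + z2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z2 / (4 * n * n)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the edges.
    return max(0.0, centre - half), min(1.0, centre + half)

for hits, n in [(0, 3), (0, 30), (5, 18)]:
    low, high = wilson_interval(hits, n)
    print(f"{hits}/{n}: {low:.3f} to {high:.3f}")
```

With 0 hits in 3 samples the upper bound is still about 0.56, so the data are compatible with the brand appearing in more than half of answers. With 0 hits in 30 the upper bound drops to roughly 0.11. The illustrative 5 of 18 gives an interval from roughly 12% to 51%, which is why a 28% headline should be read with its range.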

How Does the Webappski Client Audit Compare to a Closed Score Dashboard or Agency?

The honest difference is what you walk away holding: a checkable report with the raw answers behind every number, versus a number you must take on faith. The table below compares on buyer axes — what you receive, whether the evidence is in the report, what is honestly excluded, and the cost to start.

Buyer axis	Webappski client AEO audit	Typical closed score dashboard / agency
What you walk away with	A self-contained HTML report you keep and re-run — every cell click-reveals the engine's verbatim answer	A login to a hosted dashboard, or a slide with a single score
Verbatim receipts in the report	Every cell click-reveals the engine's word-for-word answer	Usually a number only; raw answers not exposed
Re-derivable by the client	Yes — run it again with your own API keys, open formula	No — the score cannot be re-computed from outside
What it honestly does NOT measure	Stated on every run: API surface, not consumer apps; no AI Overviews / Copilot	Rarely disclosed
Competitor list	Only brands two models both named (verbatim-guarded)	Varies; single-model lists can include hallucinated names
Cost to start	Free initial AEO audit of your site	Often a paid subscription or retainer before any data

This is not a claim that a dashboard is worthless — a good one tracks a trend usefully. It is a claim about evidence: when an AI engine's answer decides whether a buyer ever sees you, the report that shows you the answer is worth more than the report that shows you only a number.

What Should I Honestly Expect From an AEO Audit?

Expect a clear, evidenced starting line — not a magic switch. An audit is a readout, not a lever: it tells you where you stand across the engines today, with the proof attached, and it is the input to a plan, not the plan itself. Improving the number is separate work, and engines re-crawl on their own schedule, so a change you ship now typically shows up in a re-measure two to four weeks later, not the next day.

Expect honesty about scope, too. The audit reads the API surface, not the personalized consumer app; it does not cover Google AI Overviews or Copilot; and a small sample is a hypothesis, not a verdict. Those limits are printed on the report on purpose. An audit that promised certainty from a handful of queries would be the exact closed-score dishonesty this one is built to replace.

Frequently Asked Questions

How do I check whether AI recommends my site?

Ask each AI engine the buyer questions your customers ask — "best [your category] 2026", "[competitor] alternatives" — and read whether it names you, cites your domain, or names a rival instead. Do it once by hand across ChatGPT, Gemini, and Claude to get a feel for it. To make it repeatable and comparable month over month, Webappski runs a structured audit that freezes the questions, queries each engine's API with your keys, and saves every raw answer so the next run is comparable to this one.

What is an AEO audit and what do I get from Webappski?

An AEO audit is a structured read of how AI answer engines respond to your category's buyer questions. From Webappski you get a self-contained HTML report: the visibility number, the engine-by-question matrix, the verbatim answer behind every cell (click any result open to read it), the dual-model-verified competitors who took the spots, and a header line stating exactly what the run measured and did not measure. A deliverable you keep and can re-derive, not a login to a dashboard.

Do you measure what a real user sees in ChatGPT or Gemini?

We measure each engine's official API surface using your own keys, which is reproducible and auditable, and we say so on every run. That is a faithful proxy, but it is not the consumer app a logged-in human sees, which can use a different model version, personalization, and locale. We also do not cover Google AI Overviews / AI Mode or Microsoft Copilot, because they have no first-party query API. Naming those boundaries is what separates a measurement from a sales number.

Can you audit my site for a specific country like Poland or Germany?

Yes, with an honest caveat. The audit can localize each question to a market and run it in the buyer's language using a regional axis, so a Poland/DACH basket is read the way a local buyer would phrase it. We label this as soft-geo: it localizes the prompt, it does not spoof a national IP, so it is a directional regional signal rather than a guarantee of what a user physically in that country sees. We tell you which it is, on the report.

How much does a Webappski AEO audit cost?

The initial AEO audit of your site is free — we run your real buyer questions across the engines and show you where you stand, with the raw answers attached. Pricing for ongoing work is discussed after that first read, because the right scope depends on what the audit reveals. The point of the free audit is that you see the evidence before any commitment.

How long until an AEO audit shows improvement?

The audit itself is a readout you get quickly; moving the number is separate work. Because AI engines re-crawl and re-index on their own schedule, a change you ship today typically appears in a re-measure roughly two to four weeks later, not the next day. We re-run the same frozen questions so the before-and-after is a real comparison rather than two unrelated snapshots.

See Where AI Names You — Free

The fastest way to know whether AI recommends your site is to let us run your real buyer questions across ChatGPT, Gemini, and Claude and show you the raw answers behind every result. Request a free AEO audit and you will get a report where every result clicks open to the engine's verbatim answer: where your brand is named, where a competitor took your spot, and the exact measurement surface those numbers came from.

Webappski is an Answer Engine Optimization studio that runs this exact audit on its own brand before selling the service — see the AEO services we build on it. If you want the engineering behind why the number is honest, our companion piece walks through whether you can trust an AI visibility score in full.

This article was published on 19 June 2026. The audit described uses aeo-platform, the open-source tool Webappski builds and maintains; the current version is 1.5.0. The worked-example figures — named in 5 of 18 answers, a 28% presence rate, the competitor labels — are illustrative synthetic data for a fictional brand (Northwind CDN, northwind.example), shown only to demonstrate the report's shape; they are not a real client run. AEO is a fast-moving field; we update this article as the engines evolve. If you notice outdated information, contact us at info@webappski.com.
