Apr 26, 2026

AEO Visibility Challenge — Week 3: 0/10 on Our Own Brand, npm Downloads Almost Doubled to 729

Week 2 went into building the tracker, not writing about it — so this is the combined Week 2 / Week 3 update for the Webappski AEO services visibility challenge. We open-sourced the measurement layer and re-ran our own brand through it. Score: still zero. Downloads: a quiet ~400 per week while the package shipped a plain text-and-table report, then almost doubled to 729 within ~48 hours of v0.2.5 replacing it with the visualized HTML report.


Quick answer

Webappski's open-source AEO tracker — @webappski/aeo-tracker (npm, MIT licence) — sat at roughly 400 weekly downloads for the first two weeks of its life as a plain text-and-table report, then almost doubled to 729 within ~48 hours of shipping v0.2.5 with a visualized HTML/SVG report. Webappski's own AEO visibility score for the same week: 0 of 10 automated cells (3 Gemini + 4 ChatGPT + 3 Claude). The comparable-to-Week-1 figure — dropping the ChatGPT stability cell and adding 3 manual Perplexity + 3 manual Copilot queries — is 0 of 15 (3 queries × 5 engines), exactly the Week 1 baseline. Both numbers are true. Below: why we built the tool instead of paying for one, what its first run on Webappski revealed, and the plan to move our number off zero.


Why we stopped paying for AEO trackers

In Week 1 we ran webappski.com through HubSpot's AEO Grader and Ahrefs' free AI Visibility tool. HubSpot's grader returned a score in the 28–44 range across separate runs the same week. Ahrefs' tool returned a different shape of result entirely. Direct API tests against the same engines showed zero mentions across every query.

Over the two weeks that followed, we extended the experiment to the paid trackers our prospective clients had been asking us about: Profound, Otterly, and Peec.ai (pricing as of April 2026: $499/mo, €69/mo and €89/mo respectively, entry plans).

The results were not encouraging.

  • Different vendors returned different numbers for the same brand in the same week.
  • None of them would show us the underlying AI response their score was supposedly computed from.
  • Several appeared to scrape Bing or Google SERPs and infer "ChatGPT visibility" from the result — which is not the same thing as a ChatGPT API call.
  • Every dashboard wrapped its measurement in a proprietary scoring layer ("Brand Presence Index", "Visibility Score", "AEO Index") whose formula was not documented anywhere we could find.

For an agency that intends to charge clients for AEO work, this was untenable. We need a number we can defend in a meeting. A black-box score from a vendor we cannot independently verify is not a number we can defend.

So we wrote down the minimum requirements for a tracker we would trust:

  1. Direct API calls to ChatGPT, Gemini, Claude and Perplexity — not scraping, not proxied browser sessions.
  2. Every raw response saved to disk, so any number in the report can be audited back to the exact AI reply it came from.
  3. No proprietary scoring layer — show mention counts and ranks, not an opaque index.
  4. Open source, so anyone (clients, competitors, journalists) can read the code and contest the logic.
  5. Cheap enough to run weekly on a pre-revenue brand. Sub-dollar weekly cycles.

Nothing on the market cleared that bar. So we built it.


What @webappski/aeo-tracker is

@webappski/aeo-tracker is a Node.js command-line tool published on npm under the MIT licence. It runs locally on your machine, calls the official APIs of ChatGPT, Gemini, Claude and Perplexity, and produces a Markdown report with inline SVG charts plus a fully interactive HTML dashboard. The package has zero runtime dependencies — the entire CLI, including the chart renderer, is plain Node.js 18+. You can audit it in an afternoon.

One global install, then three commands, start to finish:

npm install -g @webappski/aeo-tracker
aeo-tracker init --auto
aeo-tracker run
aeo-tracker report --html

init --auto fetches your homepage, asks an LLM to suggest category-appropriate queries, validates them with a second model, and writes a config. run calls each AI engine whose API key is set in your shell environment. report --html renders the deliverables.
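For illustration, here is the kind of file that flow ends with. The field names below are hypothetical — check the repository for the real format — but the shape is the point: a brand, a domain, the validated queries, and the engines to call.

# Hypothetical config shape — illustrative field names, not the tool's documented format
{
  "brand": "Webappski",
  "domain": "webappski.com",
  "queries": [
    "best Answer Engine Optimization agencies 2026",
    "Answer Engine Optimization service for startups",
    "how to fix AI search visibility for my startup"
  ],
  "engines": ["openai", "gemini", "anthropic", "perplexity"]
}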

A few design choices that map directly to the frustrations above:

  • Direct API calls, nothing in between. No web scraping, no browser automation, no proxied sessions. When the tracker says "ChatGPT" it means a call to the OpenAI API. When it says "Gemini" it means the Google Gemini API. The model identifiers are config-driven defaults you can override per run.
  • Two-model LLM cross-check on competitor extraction. After each AI response comes back, two cheap classification models independently extract every brand name mentioned in the text. If both agree, the brand lands in the "verified" tier of the report. If only one model finds it, it lands in "unverified" with a dashed badge. Hallucinated brand names — which single-model extractors return routinely — are filtered out by this cross-check (a minimal sketch of the agreement rule follows this list).
  • Pre-flight query validation. Before any query hits the engines, a separate LLM pass checks whether each query is commercially ambiguous, too acronym-heavy, or outside the brand's actual category. Bad queries are rejected before you spend API credits on them.
  • Raw responses saved to disk. Every query × engine combination writes a JSON file under aeo-responses/YYYY-MM-DD/. Any number in the report is auditable back to the exact AI reply it came from.
  • Sub-dollar weekly runs. Two-engine coverage (OpenAI + Gemini) is roughly twenty cents per run. Full four-engine coverage with Anthropic and Perplexity added is approximately fifty-five cents.
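The agreement rule behind the two-model cross-check is compact enough to sketch. The following is not the tracker's actual source — the repository has that — just a minimal Node.js illustration of the tiering logic, with extractWithModel standing in for whatever cheap classification call the real tool makes:

// Minimal sketch of the two-model cross-check — illustrative, not the tracker's real code.
// extractWithModel(model, text) stands in for a cheap classification-model call
// that returns an array of brand names found in an AI response.
async function crossCheckBrands(responseText, extractWithModel) {
  const [a, b] = await Promise.all([
    extractWithModel('model-a', responseText),
    extractWithModel('model-b', responseText),
  ]);
  const setA = new Set(a.map((s) => s.toLowerCase()));
  const setB = new Set(b.map((s) => s.toLowerCase()));
  const verified = [];   // both models agree the brand is present
  const unverified = []; // only one model saw it — gets the dashed badge
  for (const brand of setA) (setB.has(brand) ? verified : unverified).push(brand);
  for (const brand of setB) if (!setA.has(brand)) unverified.push(brand);
  return { verified, unverified }; // a single-model hallucination never reaches "verified"
}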

The full source is at github.com/DVdmitry/aeo-tracker. The repository lives under the maintainer's personal GitHub account while the npm package is published under the @webappski organisation scope — a deliberate choice made to ship faster, not an accident, and one we may consolidate in a later release. Going forward we will dogfood the tracker weekly on Webappski itself and on every future client engagement; this post is the first weekly cycle.


What it shows for Webappski today

Here is the headline from the latest run, dated 21 April 2026:

0 out of 10 query-engine cells named Webappski. Score: 0%. Status: INVISIBLE.

We chose three queries that match the verticals we sell into — a commercial agency search, a startup-focused service search, and a problem-led informational search. None of them mention our brand by name. A branded query proves nothing; the engine cites you because you asked for yourself.

  1. Q1: "best Answer Engine Optimization agencies 2026" (commercial intent)
  2. Q2: "Answer Engine Optimization service for startups" (startup-vertical intent)
  3. Q3: "how to fix AI search visibility for my startup" (problem-led informational intent)

Per-engine breakdown:

  • Gemini: 0 of 3. Webappski did not appear in the answer text or the cited sources for any of the three queries.
  • ChatGPT: 0 of 4. Same outcome. ChatGPT's API was queried four times instead of three: three runs against the current browser-default model (the headline cells) plus one comparative run against an earlier model generation as a stability sanity check, counted as the fourth cell. All four calls returned zero mentions, so the extra cell changes nothing in the headline.
  • Claude: 0 of 3. Webappski did not appear in any response.

That gives 0 of 10 automated cells (3 Gemini + 4 ChatGPT + 3 Claude), zero mentions in any of them. To reconcile with the Week 1 baseline of 0/15 (3 queries × 5 engines), we drop the extra ChatGPT stability cell and add manual incognito checks for Perplexity (3 queries) and Microsoft Copilot (3 queries) run on the same date: 9 automated + 6 manual = 0 of 15, the same shape as Week 1. Across every engine the tool can reach automatically and every engine our prospective clients use manually, we remain invisible.

There is one upside in this measurement that Week 1's manual approach did not give us: we now see exactly which competitors are being recommended in our place, on which query, by which engine.


Who AI named instead of us

Aggregated across all ten cells, four other AEO trackers showed up by name in the responses — each in a single cell.

  • Webappski (us): 0 / 10 cells — Baseline.
  • Profound: 1 / 10 cells — Closed-source dashboard, entry plan from $499/mo.
  • Scrunch AI: 1 / 10 cells — Closed-source dashboard, pricing not publicly listed.
  • Peec.ai: 1 / 10 cells — Closed-source dashboard, from €89/mo.
  • Otterly.ai: 1 / 10 cells — Closed-source dashboard, from €69/mo. Surfaced on the Q1 Claude response.

Below the tracker tier, AI engines named several agencies:

  • For Q1 ("best AEO agencies 2026"): ChatGPT cited LSEO and WebFX in source text. Claude cited Profound, Scrunch AI, Peec AI and Otterly AI as the tooling layer.
  • For Q2 ("AEO service for startups"): Gemini named Flying Cat, NoGood, Marcel Digital and Minuttia. Claude named Discovered Labs, Omniscient Digital, Omnius and RevenueZen.
  • For Q3 ("how to fix AI search visibility for my startup"): Mostly category-level guidance, not specific brand recommendations. The cell where you most expect competitive citations turned out to be the cell with the least competitive density.

For a Week-1 agency in this category, that map is the plan. Each named competitor is somewhere AI engines have already learned to recommend, on a query our prospective clients are typing this month. Closing the gap means appearing on the same kind of source the engines used to learn those names.


What the tracker tells us to do this week

The tool generates engine-specific recommendations from the run data. Each engine pulls from a different source pool — what works on ChatGPT does not necessarily work on Claude — so the action list is split per engine. The Week 3 list is straightforward and predictable for a brand at zero:

  • Gemini is grounded in Google Search results — this is documented behaviour (Google AI for Developers — Grounding with Google Search). Domain-authority sites Google already indexes for the relevant keywords therefore carry the most weight. Action: earn citations from high-DR publications. Add FAQPage and Article schema to landing pages so Gemini's structured-data signals can find us (example markup after this list).
  • ChatGPT browsing has historically been powered by Bing's search index, per OpenAI's own Browse with Bing help docs and Search Engine Land reporting. Review platforms and community Q&A pages are over-represented in Bing's authoritative result pool. Action: get listed on G2 and Capterra. Answer questions on Reddit and Quora where the tool name is mentioned naturally. Publish a comparison post.
  • Claude combines training-corpus knowledge with web search. Anthropic documents the web-search tool; the underlying provider has shifted over time, with Brave widely cited as the historical default. Developer ecosystems (npm, GitHub, dev.to) and product launch pages are over-represented in the training corpus. Action: ship on npm (we did — that is partly why this article exists). Publish on dev.to. Launch on Product Hunt.
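On the Gemini action item above, FAQPage markup is plain schema.org JSON-LD — nothing proprietary. A minimal example for a landing page; the question and answer text here are ours and purely illustrative:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is Answer Engine Optimization?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "Answer Engine Optimization (AEO) is the practice of making a brand appear inside AI assistant answers, the way SEO targets Google's blue links."
    }
  }]
}
</script>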

We are executing on the Claude recommendation first, because it is the one with the most leverage for a company whose own product is also a developer tool: Webappski's tracker on npm and GitHub is itself an attempt to populate Claude's training corpus.


Why we open-sourced the measurement layer

A reasonable business move would have been to wrap @webappski/aeo-tracker as a paid SaaS at the low end of category pricing, charge for a hosted dashboard, and gate every export. We considered it. We went the other way for one straightforward reason: in this category, the measurement should be commodity — the interpretation and execution should not. Closed-source dashboards in this space currently price anywhere from tens of euros per month (Otterly, Peec.ai) to several hundred dollars (Profound) for the measurement layer alone.

A client who can independently install our tracker and see their own raw numbers is a client who can check our work week over week. They do not have to take our word for the score we present in the Monday meeting; they can run the same script and read the same JSON files. That is the kind of relationship we want.
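Concretely, that audit does not even require trusting the report renderer. Assuming the dated layout described earlier (aeo-responses/YYYY-MM-DD/), one grep over the raw response files settles whether a zero is real:

# List every raw response file that mentions the brand (case-insensitive).
# No output means no engine mentioned it — the zero holds.
grep -ril "webappski" aeo-responses/2026-04-21/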

What stays with Webappski — and what we charge for — is the part the CLI does not do. Reading the matrix on Monday morning. Deciding which gaps are worth filling and which are query noise. Pitching the canonical sources AI engines cite. Writing the comparison pages that move the score. Negotiating the listicle inclusions. Drafting the Reddit and Quora answers that will not be removed by moderators. None of that work is in the CLI, none of it is cheap, and all of it is why an AEO agency exists.

If that sounds interesting to you: webappski.com/aeo-services is where we book introductory calls. If it does not: the tracker is yours anyway — npm install -g @webappski/aeo-tracker, no telemetry, no analytics, no traffic to our servers. Your API keys and your data stay on your machine.


729 weekly downloads after the visualized report shipped

The chronology matters more than the headline number, because the curve is a more interesting story than the peak.

[Screenshot: npm Weekly Downloads panel showing 729 for @webappski/aeo-tracker v0.2.5, captured 2026-04-25.] npm's weekly download counter for @webappski/aeo-tracker, captured 25 April 2026 — 729 downloads, almost double the ~400 baseline of the prior text-only versions.

We first published @webappski/aeo-tracker on npm roughly two weeks before this report (around 12 April 2026). The early versions produced a plain text-and-table report — useful, but visually unremarkable. Across the first ten to twelve days the package settled into the ~400 weekly downloads range — a respectable baseline for a niche utility, but well within what a free open-source AEO tool can earn from word-of-mouth and an organisation-scope npm namespace alone. (For readers landing here cold: Answer Engine Optimization is the practice of making a brand surface inside AI assistant answers — ChatGPT, Perplexity, Gemini, Claude — the same way SEO makes a brand surface inside Google's blue links. A tracker, in this category, is the instrument that tells you whether the practice is working.)

On Thursday, 23 April 2026 we shipped v0.2.5, which replaced the plain text-and-table report with a visualized HTML report (inline SVG charts, an interactive engine matrix, a per-engine recommendation panel) plus a Markdown variant carrying the same charts. Within roughly 48 hours of that release, npm's weekly counter read 729 downloads — almost double the prior baseline, on the same package, in the same niche, with no new distribution channel switched on between the two snapshots.

We attribute the doubling to the visualization upgrade, not to a sudden burst of inbound traffic. Our TypelessForm case study — where we ran the new visualized tracker on a sister product and published the entire HTML report, including the cells where TypelessForm scored 33 out of 100 and the cells where it scored zero — publishes alongside this Webappski post on 26 April 2026, so it cannot have driven the 23–25 April download surge captured in the screenshot above. The 729 number came in before the TypelessForm cross-link went live. Likely contributors during that 48-hour window: direct npm-package-page traffic (the visualized README is materially more clickable than the previous text-only one), residual word-of-mouth from earlier versions, and CI / mirror caches that npm does not de-bot. Full referrer attribution and the steady-state weekly number will appear in the Week 4 status note on 3 May 2026, once the npm 7-day rolling window has fully turned over after v0.2.5 and once the TypelessForm cross-link has had a full week to contribute.

Note on the npm counter itself: weekly downloads include mirror caches and CI installs — npm does not de-bot its public counter — so the 729 figure should be read as "rough-order-of-magnitude growth signal", not a precise human-install count.
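The counter is also independently checkable. npm's registry exposes a public point-downloads endpoint, so anyone can re-pull the same 7-day rolling figure instead of trusting our screenshot — a minimal Node.js 18+ sketch (save as check-downloads.mjs so top-level await works):

// Fetch the same 7-day rolling count the npm package page displays.
const res = await fetch(
  'https://api.npmjs.org/downloads/point/last-week/@webappski/aeo-tracker'
);
const { downloads, start, end } = await res.json();
console.log(`${downloads} downloads between ${start} and ${end}`);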

This is the cross-link the open-source release was designed to enable: a working agency dogfooding a working tool on a working product, with the numbers visible. The summary of that case study in one line: the tool produced a defensible 33/100 baseline for a brand that had been live for five weeks, and surfaced a competitor map (AnveVoice, Wispr Flow, Form2Agent) that mapped cleanly onto where TypelessForm needs to invest content next.

We will publish a similar Webappski-specific competitor breakdown when our own number begins to move. Until then, the meaningful update is that we have an instrument calibrated to detect movement when it happens, instead of relying on a vendor dashboard whose number can shift by fifteen points between Tuesday and Thursday for reasons no one will explain.


What stays the same in Week 3

Two things that have not changed since Week 1 are worth restating, because in this category restraint is a positioning move.

We have zero AI-engine citations. The 0 of 10 in this run is the third consecutive flat-zero measurement (we ran the tracker daily on April 19, 20 and 21 to validate stability — same result every day, same brands cited instead of us). An agency that just pivoted into AEO and cannot move its own number after three weeks is a fair target for scepticism. We are publishing this anyway, because the alternative — pretending we are further along than we are — is the exact thing we sell against.

We have not yet pitched any of the canonical sources AI engines cite. That is the obvious first move (a single citation on a high-DR domain propagates across multiple engines). Pitching takes a week per outlet to do properly. We will report on the first outreach round when it concludes — names, dates, response rates, and the engine-specific delta against the Week 3 baseline — in the Week 6 update on 17 May 2026.

Status of the five SaaS-AEO listicles flagged in Week 1. Week 1 named five existing lists (Discovered Labs, Team4 Agency, ABM Agency, Maximus Labs, Omnius blog) where Webappski was absent and committed to landing on at least one within four weeks. Honest update at the end of Week 3: outreach has not begun yet — the engineering effort to ship the tracker consumed the available capacity. We are extending the deadline by three weeks; the new target date for first listicle placement is Week 6 (the report published on 17 May 2026). Names, outlet contacts, dates and response rates will appear in that Week 6 update — we acknowledge the extension explicitly so the public commitment stays trackable.

The challenge is public for a reason. Anything we hide about our own visibility, we cannot credibly claim to fix for a client.


Try @webappski/aeo-tracker on your brand

The whole point of open-sourcing the tracker is that you should not take our number on faith. The minimum viable setup is two API keys and under a minute of terminal time:

npm install -g @webappski/aeo-tracker

export OPENAI_API_KEY="sk-proj-..."
export GEMINI_API_KEY="AIzaSy..."

aeo-tracker init --yes --brand=YOURBRAND --domain=YOURDOMAIN.COM --auto
aeo-tracker run
aeo-tracker report --html

That covers ChatGPT and Gemini at roughly twenty cents per run. Add an Anthropic key for Claude (about thirty cents) and a Perplexity key for the Perplexity column (about five cents) for full four-engine coverage at roughly fifty-five cents per weekly run.

After the first run, the workflow is two commands once a week: aeo-tracker run && aeo-tracker report --html. The HTML report opens in your browser automatically.
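If you would rather not remember the cadence, a plain cron entry covers it. The env file path below (~/.aeo-keys) is our own convention for storing the export lines, not something the tool requires:

# Every Monday at 09:00: load API keys, run the tracker, render the report.
# ~/.aeo-keys is a hypothetical file of "export OPENAI_API_KEY=..." lines.
# (Cron has no browser; the HTML report simply lands on disk for Monday morning.)
0 9 * * 1  . "$HOME/.aeo-keys" && aeo-tracker run && aeo-tracker report --html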

If you run it on your own brand and want help interpreting the matrix, our AEO services page is the right starting point — bring the JSON file from aeo-responses/ and the HTML report, and we will tell you which gaps are worth your time and which are query noise.


Quick decision guide

A short situation-to-action map for readers who do not want to read the whole post:

  • You manage AEO for a brand and want a number you can defend in a meeting → install @webappski/aeo-tracker, run it weekly, ship the JSON output to whoever asks. Cost: $0.20–$0.55 per run.
  • You already pay for Profound, Otterly or Peec.ai and want to verify their numbers → run our tracker in parallel for one week and diff the per-engine cells. Disagreements indicate which dashboard's scoring layer is doing the heavy lifting.
  • You want help interpreting the matrix once you have it, or you want a third party to do the citation pitching → book a Webappski AEO services call. Bring the JSON + HTML report from the CLI.
  • You are evaluating whether AEO is worth it before spending a dollar → run our tracker once on your own brand, free, and read the Week 1 baseline post to see what zero looks like for an agency that just pivoted into the category.
  • You are an open-source contributor or AEO researcher → repository, issues and PRs at github.com/DVdmitry/aeo-tracker. The two-model competitor-extraction logic is the part most worth reviewing.

FAQ

What is @webappski/aeo-tracker?

An open-source MIT-licensed Node.js CLI that measures how often AI answer engines — ChatGPT, Gemini, Claude, Perplexity — mention a brand. It calls the official APIs directly, saves every raw response to disk, extracts competitor mentions via a two-model cross-check, and produces a Markdown plus HTML report. Installable with npm install -g @webappski/aeo-tracker.

How is it different from Profound, Otterly, Peec.ai or HubSpot's AEO Grader?

Three differences. (1) Direct API calls to the engines — no scraping, no proxied browser sessions, no third-party scoring layer. (2) Every raw AI response is saved to disk, so any number in the report can be audited back to the underlying text. (3) It is free and open source under MIT. You read the code, you contest the logic, you fork it.

Does Webappski offer paid AEO services?

Yes — that is the agency. The tracker is the measurement layer. The agency is the interpretation and the execution: third-party citations, comparison content, listicle inclusions, schema audits, weekly reading of the report. webappski.com/aeo-services is where introductory calls are booked.

Why is your own AEO score still zero?

Because we are eighteen days into the agency's existence in this category and we have not yet pitched any of the canonical sources AI engines cite. We measure first, we publish honestly, and we will move the number when we move it. The challenge is public so that the gap between "what we sell" and "what our own number does" is visible.

What does 729 weekly downloads mean for a brand-new package?

npm's weekly download counter is a 7-day rolling window that includes mirror and CI installs. The package sat at roughly 400 weekly downloads for its first two weeks while it shipped a plain text-and-table report, then almost doubled to 729 within ~48 hours of the v0.2.5 release that replaced that report with a visualized HTML/SVG dashboard. The doubling is best read as a visualization-upgrade signal, not a fresh-launch spike. We will report the steady-state weekly number in Week 4 once the 7-day window has fully turned over after v0.2.5.


What's next

The publishing cadence between now and the next big report is deliberate, not a silence. Here is the schedule:

  • Week 4 (publishing 3 May 2026) — short status note. First outreach round to the canonical sources aeo-tracker flagged from competitor citations begins this week; we publish a compact pitch-list with outlets contacted, dates, and tracker delta (if any) against the Week 3 baseline.
  • Week 5 (publishing 10 May 2026) — short status note. Distribution: dev.to article live, plus a self-promo-tolerant Reddit community (likely r/SideProject or r/SEO — exact subreddit confirmed only after we read each rule set) and Hacker News (Show HN, when account karma threshold permits). Each is a different engine's preferred source pool. We do not commit to any specific subreddit until moderation rules are checked, because a removed post is a publicly visible failure we cannot afford this early.
  • Week 6 (publishing 17 May 2026) — full report. Outreach response rates, the first listicle placement (if any), the engine-specific delta against the Week 3 baseline, and a re-run of aeo-tracker daily through the week with the diff published.

The last row of the Webappski platform table will be the same in every one of these updates: zero paying AEO clients onboarded so far. That is also the row that matters most. Visibility moves first, revenue moves second. The bridge between the two is what an agency is for.


Read Week 1 of the AEO Visibility Challenge for the full baseline measurement, or book an AEO services call if you want help interpreting your own tracker matrix. Newsletter signup and contact email are in the site footer.
