Synthetic Query Seeding for AEO: Use LLM Exploratory Queries to Win Early Discovery
Spotify’s Sep 8, 2025 AudioBoost results show that seeding autocomplete and retrieval with LLM-generated synthetic queries can lift impressions, clicks, and exploratory completions. This playbook shows growth teams how to generate, score, and deploy intent-rich queries by season, use case, price band, symptoms, and jobs to be done, so they capture high-intent discovery across site search, retailer apps, and answer engines before a traditional results page loads.

Vicky
Sep 20, 2025
Why this matters now
On Sep 8, 2025, Spotify released the AudioBoost study showing that LLM-generated synthetic queries, when indexed in autocomplete and retrieval, lifted audiobook impressions, clicks, and exploratory query completions for a cold-start catalog. Treating autocomplete and retrieval as rankable surfaces measurably improved exploration. See the Spotify AudioBoost study for details. Earlier work from Spotify found that graph-based query suggestions increased coverage and clicks on exploratory queries, underscoring autocomplete as a high-leverage entry point. Review the graph learning for exploratory suggestions approach.
Thesis
Discovery now begins before a results page. Autocomplete menus, dynamic facets, related questions, and in-app Q&A prompts are all rankable surfaces. Seeding them with intent-rich synthetic queries lets brands win high-intent moments before a traditional results page has a chance to load.
The synthetic query matrix
Cover the full space of early exploration:
- Season or event
- Use case
- Price band
- Symptoms or problems
- Jobs to be done
- Audience segment
- Locale or language
- Compatibility or fit
- Benefits and tradeoffs
- Constraints like time and space
Example intents:
- Seasonal: patio heater for small balcony winter use
- Use case: laptop for video editing while traveling
- Price band: noise canceling headphones under 150
- Symptoms: itchy eyes relief for spring allergies
- JTBD: organize receipts for quarterly taxes
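One way to cover the matrix systematically is to expand it into one intent seed per combination of dimension values and hand each seed to the generation step. The sketch below is a minimal illustration: the `MATRIX` values and the `expand_matrix` helper are hypothetical and should be replaced with dimensions drawn from your own catalog.

```python
from itertools import product

# Hypothetical dimension values; swap in values from your own catalog.
MATRIX = {
    "use_case": ["video editing", "travel"],
    "price_band": ["under 800", "under 1500"],
    "season": ["back to school"],
}


def expand_matrix(category: str, matrix: dict[str, list[str]]) -> list[dict[str, str]]:
    """Return one intent seed per combination of dimension values."""
    keys = list(matrix)
    seeds = []
    for combo in product(*(matrix[key] for key in keys)):
        seed = {"category": category}
        seed.update(zip(keys, combo))
        seeds.append(seed)
    return seeds


seeds = expand_matrix("laptop", MATRIX)
print(len(seeds))  # 2 use cases x 2 price bands x 1 season = 4
```

The number of seeds grows multiplicatively with each dimension, so in practice it may be worth expanding only the dimensions that matter for a given category.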
Generation workflow
1) Source ground truth
- Pull product and service metadata, reviews, UGC, customer support logs, past queries, SKU attributes, inventory and pricing, store availability, and policy constraints.
2) Prompt the LLM for exploratory coverage
- Template: Generate 25 natural language queries a shopper might type for {category or symptom} that reflect early exploration. Vary specificity, attribute combinations, and constraints. Include phrases common in {locale}. Range across price bands and jobs to be done. Exclude brand slurs and regulated claims.
- Add guardrails: policy snippets, banned claims list, medical or legal disclaimers, locale vocabulary, and competitive rules.
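The template and guardrails above can be assembled in code before the prompt is sent to whichever LLM you use. This is a minimal sketch: `build_prompt` and its parameters are assumptions made for illustration, and the call to the model itself is deliberately left out because it depends on your provider.

```python
# Template text taken from step 2; the {n}, {topic}, and {locale} slots are
# filled per request.
PROMPT_TEMPLATE = (
    "Generate {n} natural language queries a shopper might type for "
    "{topic} that reflect early exploration. Vary specificity, attribute "
    "combinations, and constraints. Include phrases common in {locale}. "
    "Range across price bands and jobs to be done. "
    "Exclude brand slurs and regulated claims."
)


def build_prompt(topic: str, locale: str, banned_claims: list[str], n: int = 25) -> str:
    """Fill the template and append the banned-claims guardrail, if any."""
    prompt = PROMPT_TEMPLATE.format(n=n, topic=topic, locale=locale)
    if banned_claims:
        rules = "\n".join(f"- {claim}" for claim in banned_claims)
        prompt += "\n\nNever use these claims:\n" + rules
    return prompt


# Hypothetical inputs for illustration only.
prompt = build_prompt("spring allergy relief", "en-GB", ["cures allergies"])
print(prompt)
```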
3) Score, dedupe, and cluster
- Normalize the text and detect its language. Remove near duplicates with MinHash or cosine-similarity thresholds. Cluster by intent and attribute bundle. Score each query for novelty against historical logs and for expected answerability given current content.
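The dedupe and novelty steps can be prototyped with the standard library before reaching for MinHash or embeddings. The sketch below swaps in exact Jaccard similarity over token sets, which is the quantity MinHash approximates and is fine at small scale. The function names and the 0.8 threshold are assumptions, not values from the AudioBoost study.

```python
import re


def tokens(query: str) -> frozenset[str]:
    """Lowercase the query and split it into a set of alphanumeric tokens."""
    return frozenset(re.findall(r"[a-z0-9]+", query.lower()))


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def dedupe(queries: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a query only if it is not too similar to one already kept."""
    kept: list[tuple[str, frozenset[str]]] = []
    for query in queries:
        toks = tokens(query)
        if all(jaccard(toks, kept_toks) < threshold for _, kept_toks in kept):
            kept.append((query, toks))
    return [query for query, _ in kept]


def novelty(query: str, logged_queries: list[str]) -> float:
    """Return 1 minus the highest similarity to any historical query."""
    toks = tokens(query)
    best = max((jaccard(toks, tokens(logged)) for logged in logged_queries), default=0.0)
    return 1.0 - best


candidates = [
    "Noise canceling headphones under 150",
    "noise canceling headphones, under 150",
    "patio heater for small balcony",
]
unique = dedupe(candidates)  # the second candidate is dropped as a near duplicate
```

The pairwise comparison is quadratic in the number of queries, which is why MinHash with locality-sensitive hashing becomes the better choice at the 5k to 10k scale and beyond.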
4) Map queries to answers
- For each synthetic query choose a best landing object: answer snippet, buying guide module, comparison table, PDP section, store locator, or expert Q&A.
- Generate one short answer, one follow-up suggestion, and the canonical facets to preselect.
- Attach schema such as FAQPage, HowTo, Product variant, and medical disclaimers where appropriate. Create retrieval chunks tied to SKUs and policies so answer engines can ground responses.
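As an illustration of the schema step, the sketch below builds schema.org FAQPage markup as JSON-LD from query and answer pairs. The `faq_jsonld` helper and the sample question and answer are hypothetical placeholders. In practice the answer text should come from grounded, reviewed content.

```python
import json


def faq_jsonld(pairs: list[tuple[str, str]]) -> dict:
    """Build a schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Placeholder content for illustration only.
doc = faq_jsonld([
    (
        "Which patio heater suits a small balcony in winter?",
        "Look for a compact model with tip-over shutoff.",
    ),
])
print(json.dumps(doc, indent=2))
```

The serialized output would typically be embedded in the page inside a `<script type="application/ld+json">` element.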
5) Seed every rankable surface
- Autocomplete and suggested searches: push top intent clusters into site query autocomplete (QAC) with labels like "for small rooms" or "under 200". Mirror to retailer partner QAC programs where available.
- Facets and filters: add synonyms and composite facet presets that reflect the synthetic intents. Example preset: "small balcony safe heaters", with power from 1,200 to 1,500 W and tip-over shutoff.
- On-site answers and Q&A: prewrite short answers tied to inventory and policy. Surface them in chat widgets and on category landing pages.
- Collections and landing pages: auto build curated collections for each high value cluster and link from autocomplete and related searches modules.
- Emerging answer engines: provide structured answers and SKU grounded evidence in feeds or APIs where supported. Optimize for answer boxes that show clarifying prompts and follow ups. For parallel tactics at the browser level, see the Chrome-as-Answer AEO playbook.
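For the autocomplete surface, seeding amounts to ranking stored synthetic queries against whatever prefix the shopper has typed so far. The sketch below is a deliberately simple in-memory version: the `suggest` function, the scores, and the sample queries are assumptions, and a production QAC service would use a dedicated index rather than a linear scan.

```python
def suggest(prefix: str, seeded: dict[str, float], limit: int = 5) -> list[str]:
    """Return seeded queries that start with the prefix, best score first."""
    needle = prefix.strip().lower()
    matches = [
        (score, query)
        for query, score in seeded.items()
        if query.lower().startswith(needle)
    ]
    # Sort by descending score, then alphabetically for stable ties.
    matches.sort(key=lambda match: (-match[0], match[1]))
    return [query for _, query in matches[:limit]]


# Hypothetical seeded queries with scores from the scoring step.
seeded_queries = {
    "heater for small rooms": 0.9,
    "heater under 200": 0.7,
    "headphones under 150": 0.8,
}
print(suggest("heat", seeded_queries))
# ['heater for small rooms', 'heater under 200']
```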
Test and learn like a performance channel
Measure both offline and online, and iterate:
- Offline: retrievability uplift, intent coverage, novelty vs logs, answerability rate, toxicity and policy compliance.
- Online: autocomplete click through, exploratory query completion rate, search refinement reduction, PDP views per search, add to cart rate, margin weighted conversion, store pickup starts, customer support deflection.
- Use the AudioBoost deltas as directional benchmarks for early phases when you seed both autocomplete and retrieval.
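The online metrics reduce to rates compared between a control arm and a seeded arm. The sketch below shows that arithmetic for autocomplete click-through. All counts are made-up numbers for illustration, not AudioBoost results, and a real readout would also need a significance test.

```python
def rate(events: int, opportunities: int) -> float:
    """Events divided by opportunities, or 0.0 when there were none."""
    return events / opportunities if opportunities else 0.0


def relative_uplift(control: float, treatment: float) -> float:
    """Relative change of the treatment rate over the control rate."""
    if control == 0:
        raise ValueError("control rate is zero; relative uplift is undefined")
    return (treatment - control) / control


# Made-up counts: autocomplete clicks over autocomplete impressions.
control_ctr = rate(40, 1000)    # 4.0 percent
treatment_ctr = rate(50, 1000)  # 5.0 percent
uplift = relative_uplift(control_ctr, treatment_ctr)
print(f"{uplift:.1%}")
```

The same two functions apply to exploratory query completion rate or add-to-cart rate. Only the event and opportunity definitions change.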
Governance and risk controls
- Hallucination control: require evidence snippets and SKU ties for any claim. Block unsupported medical, legal, or safety statements. Insert disclaimers for symptom based guidance and route to licensed content where needed.
- Brand and compliance: enforce canonical terminology, regional compliance flags, and accessible language. Maintain a denylist and a required-phrases list per market. Extend governance into crawling surfaces by treating robots.txt as paid AEO.
- Freshness: auto retire queries that lead to out of stock items or expired promotions. Refresh seasonally and ahead of events.
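The freshness rule can be expressed as a filter that runs before each publish to autocomplete. In the sketch below, the `SeededQuery` record, its fields, and the stock lookup are all hypothetical. They stand in for whatever inventory and promotion data your platform exposes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class SeededQuery:
    text: str
    sku: str
    promo_ends: Optional[date] = None  # None means no promotion attached


def active_queries(
    queries: list[SeededQuery], stock: dict[str, int], today: date
) -> list[SeededQuery]:
    """Keep queries whose SKU is in stock and whose promotion has not expired."""
    kept = []
    for query in queries:
        in_stock = stock.get(query.sku, 0) > 0
        promo_ok = query.promo_ends is None or query.promo_ends >= today
        if in_stock and promo_ok:
            kept.append(query)
    return kept


# Hypothetical data for illustration only.
seeded = [
    SeededQuery("patio heater for small balcony", "H1"),
    SeededQuery("patio heater with remote", "H2"),
    SeededQuery("patio heater sale", "H3", promo_ends=date(2025, 9, 1)),
]
stock_levels = {"H1": 3, "H2": 0, "H3": 5}
live = active_queries(seeded, stock_levels, today=date(2025, 9, 20))
```

Here only the first query survives: the second points at an out-of-stock SKU and the third at a promotion that has already ended.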
Operating cadence
- Weekly: regenerate the top 20 percent of clusters by traffic and margin. Update QAC and related search modules.
- Monthly: expand matrix coverage and rotate collections. Rerun dedupe and novelty checks.
- Seasonal: pre-seed 6 to 8 weeks before peaks like back to school or the holidays.
Quick start in 30 days
- Week 1: define the intent matrix and success metrics. Integrate product and policy data.
- Week 2: generate and score 5k to 10k synthetic queries. Map to answers and facets.
- Week 3: seed site QAC, related searches, and Q&A modules. Launch an A/B experiment. Extend to visual entry points with camera-first AEO with Amazon Lens.
- Week 4: analyze uplift, prune low performers, export winning clusters to retailer partners and answer engines.
Stack blueprint
- LLM orchestration with policy prompts and toxicity filters
- Vector store and lexical index for dual retrieval
- Query store for QAC and suggestion APIs
- Feature flags and experimentation
- Observability for coverage and compliance
What good looks like
- Autocomplete includes attribute-rich intents that feel human and local.
- Exploratory query completions rise and refinements fall.
- Category pages and Q&A show grounded answers with clear next steps.
- Retailer app search exposes your curated presets.
- Answer engines surface your concise, supported answers before a traditional results page loads.
Summary
Spotify’s AudioBoost offers production evidence that LLM-generated synthetic queries can lift exploration when seeded into autocomplete and retrieval. Applying the same strategy across brand-owned surfaces, retailer apps, and answer engines creates a durable early-discovery moat for AEO.