Every AI search engine gives answers. Not every answer includes your brand.
ChatGPT, Perplexity, and Google’s AI Overviews all pull from web content to generate responses. But they don’t pull equally. Some brands get cited repeatedly. Others, with similar content quality, get skipped entirely.
Understanding how to get cited by AI starts with understanding what these engines actually look for when they decide which sources to reference, and which to ignore.
The mechanics differ across engines. But the underlying signals overlap more than you’d expect. Here’s what actually drives citation decisions, and what you can do about each one.
How AI citation works under the hood
All three major AI search engines use some form of retrieval-augmented generation (RAG). The model doesn’t just “know” the answer. It searches the web, retrieves relevant pages, then synthesises a response from what it finds.
Google’s documentation on AI features explains how this works for AI Overviews and AI Mode: these features “surface relevant links to help people find the information they’re looking for quickly and reliably.” Both use a technique called “query fan-out,” which Google describes as “issuing multiple related searches across subtopics and data sources” to develop a response.
That fan-out step is critical. It means your content doesn’t just need to match the original query. It needs to match the subtopic queries the model generates internally. A page about “best CRM software” might get pulled into an AI response about “how to manage customer relationships for a small business,” because the model issued a fan-out query about CRM tools while building its answer.
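To make the fan-out idea concrete, here is a toy sketch in Python. It is not how Google implements fan-out: the sub-queries, the page text, and the word-overlap scoring are all illustrative assumptions, chosen only to show how a page can match a generated sub-query better than it matches the question the user typed.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for real query understanding."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def overlap(query: str, page_text: str) -> float:
    """Fraction of the query's distinct words that also appear in the page."""
    q = tokens(query)
    return len(q & tokens(page_text)) / len(q) if q else 0.0

# Hypothetical sub-queries an engine might generate for one user question.
# The first entry is the user's original question.
fan_out = [
    "how to manage customer relationships for a small business",
    "best crm software for small business",
    "crm tools pricing comparison",
]

# Hypothetical page copy.
page = "Best CRM software compared: pricing, tools and features for small business teams."

scores = {q: round(overlap(q, page), 2) for q in fan_out}
best_query = max(scores, key=scores.get)
```

In this sketch the page scores lowest against the original question and highest against the CRM sub-query, which is the situation the paragraph above describes.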
Perplexity works similarly, with one notable difference: it has always included inline citations in every answer. As Perplexity states, “From day one, we’ve included citations in each answer, ensuring publishers receive proper credit and building user trust.”
ChatGPT search also retrieves web pages and cites them, though its retrieval approach uses Bing’s index rather than Google’s.
The result is the same pattern across all three: retrieve, evaluate, cite. The differences are in how they evaluate.
Signal 1: Source authority
This is the foundational signal across every AI search engine.
According to Semrush’s analysis of AI Overviews, the systems behind AI Overviews “evaluate many signals, like content quality, source authority, and relevance, to determine what information to include.” Source authority isn’t a single metric. It’s a bundle of signals that includes domain reputation, backlink profile, brand mentions across the web, and historical trustworthiness.
For Google’s AI features specifically, there’s a direct connection to traditional search authority. Google’s documentation states that you should “apply the same foundational SEO best practices for AI features as you do for Google Search overall.” The AI layer doesn’t have its own separate authority system. It inherits authority signals from Google’s core search ranking systems.
This means every backlink, every brand mention, every piece of E-E-A-T signal you’ve built for traditional search also feeds your AI citation potential. Brands that already rank well organically have a head start.
For Perplexity, source authority shows up in which pages get cited when multiple sources cover the same topic. Perplexity’s Publishers’ Program launched with partners including TIME, Der Spiegel, Fortune, Entrepreneur, The Texas Tribune, and WordPress.com. These are established, high-authority publishers that Perplexity explicitly chose to build its citation system around.
What to do about it: Authority isn’t something you build overnight. But you can accelerate it by earning mentions on authoritative sites in your niche, building a consistent backlink profile, and ensuring your brand appears across multiple trusted sources. The AI ranking factors that matter most are the same ones that have driven organic search for years, with a few AI-specific additions.
Signal 2: Content structure and clarity
AI search engines don’t read pages the way humans do. They parse them. Content that’s well-structured, with clear headings, logical flow, and distinct sections, is easier for retrieval systems to extract and cite accurately.
Google’s AI features documentation lists specific technical best practices for appearing in AI responses: “making sure that important content is available in textual form,” “supporting your textual content with high-quality images and videos, when applicable,” and “making sure your structured data matches the visible text on the page.”
That last point is worth pausing on. Structured data (schema markup) acts as a translation layer between your content and AI retrieval systems. When your JSON-LD declares an entity type, its properties, and its relationships to other entities, you’re giving the AI engine a machine-readable map of what your page contains. This makes it significantly easier for the retrieval step to match your content to the right query.
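As a minimal sketch of that idea, the Python below builds a schema.org Article description and checks that the declared headline also appears in the visible page text. The function names, the author, and the publisher are hypothetical placeholders, and the substring check is only a rough approximation of "structured data matches the visible text."

```python
import json

def article_jsonld(headline: str, author: str, publisher: str) -> str:
    """Serialise a small schema.org Article description as JSON-LD."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "publisher": {"@type": "Organization", "name": publisher},
    }
    return json.dumps(data, indent=2)

def matches_visible_text(jsonld: str, visible_text: str) -> bool:
    """True if the headline declared in the markup also appears on the page."""
    headline = json.loads(jsonld)["headline"]
    return headline.lower() in visible_text.lower()

# Hypothetical page: the markup and the visible copy agree on the headline.
markup = article_jsonld("How AI search engines pick sources", "Jane Doe", "Example Co")
page_text = "How AI Search Engines Pick Sources. By Jane Doe, Example Co."
consistent = matches_visible_text(markup, page_text)
```

On a real page the JSON-LD string would sit inside a script element of type application/ld+json in the page head.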
Content structure also matters at the sentence level. AI engines extract specific claims and attribute them to sources. If your content buries a key fact inside a long, meandering paragraph, the retrieval system is less likely to pull it. Clear, direct statements that answer specific questions are more citable than walls of text.
What to do about it: Structure your content with descriptive H2 and H3 headings. Write clear topic sentences that state your main point up front. Use lists and tables where they make the content easier to parse. Add structured data that accurately reflects the content on the page.
Signal 3: Entity clarity
AI search engines think in entities, not just keywords. An entity is a distinct, well-defined thing: a brand, a person, a product, a concept. The clearer your content is about which entities it discusses and how they relate to each other, the more likely it is to be cited.
This shows up in how query fan-out works. When Google’s AI features issue “multiple related searches across subtopics and data sources,” they’re essentially decomposing a query into its component entities and relationships. A query like “best project management tool for remote teams” gets broken into entities (project management tools, remote teams) and relationships (best for).
Your content needs to clearly establish your brand as an entity, define what it does, who it’s for, and how it relates to the broader category. This is where having consistent information across your website, your Google Business Profile, and third-party sources matters. When multiple sources agree on what your brand is and does, AI engines have higher confidence in citing you.
What to do about it: Make sure your brand, products, and key people are clearly defined on your site. Use consistent naming across all platforms. Implement Organization, Product, or Person schema that maps your entities and their relationships. Keep your Google Business Profile and Merchant Center data accurate, as Google specifically recommends “checking that your Merchant Center and Business Profile information is up-to-date.”
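One way to sketch the consistency check, assuming you keep a simple record of how your brand is named on each platform: the Python below builds a schema.org Organization description with sameAs links and flags listings whose name differs from the canonical one. The brand, URLs, and platform labels are made-up examples.

```python
def organization_jsonld(name: str, url: str, profiles: list[str]) -> dict:
    """Describe the brand as a schema.org Organization; sameAs links its other profiles."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "sameAs": profiles,
    }

def inconsistent_names(canonical: str, listings: dict[str, str]) -> list[str]:
    """Return the platforms whose listed name differs from the canonical name."""
    target = canonical.strip().lower()
    return [platform for platform, name in listings.items()
            if name.strip().lower() != target]

# Hypothetical brand data.
org = organization_jsonld(
    "Example Co",
    "https://www.example.com",
    ["https://www.linkedin.com/company/example-co"],
)
listings = {
    "website": "Example Co",
    "business_profile": "Example Company Ltd",
    "directory": "example co",
}
mismatched = inconsistent_names(org["name"], listings)
```

Here only the business profile is flagged, because the comparison ignores case and surrounding whitespace. A stricter or looser rule may suit your brand better.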
Signal 4: Recency and freshness
AI search engines weight recency differently depending on the query type. For news and trending topics, freshness is critical. For evergreen content, it matters less, but it still matters.
The fan-out mechanism reinforces this. When Google’s AI features issue subtopic queries, some of those queries may be time-sensitive even if the original query isn’t. A question about “best practices for email marketing” might trigger fan-out queries about recent changes to email deliverability standards. If your content was last updated in 2022, it’s unlikely to match those fresh subtopic queries.
Semrush’s research on AI Overviews shows that AI Overviews are expanding beyond informational queries: “keywords triggering AI Overviews went from being 89.03% informational in October 2024 to just 57.16% informational in October 2025.” As AI Overviews expand into commercial and transactional queries, recency becomes even more important. Product pages, pricing, and feature comparisons go stale fast.
What to do about it: Audit your highest-value pages quarterly. Update statistics, examples, and references. Add last-updated dates to your content (and back them up with schema). Prioritise freshness on pages targeting queries where the answer changes over time.
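A quarterly audit can start from something as simple as the sketch below, which flags pages whose last-modified date is older than a threshold. The 90-day default is an assumption standing in for "quarterly," and the URLs and dates are invented. In practice the dates might come from your CMS or from the dateModified property in your schema.

```python
from datetime import date

def stale_pages(pages: dict[str, date], today: date, max_age_days: int = 90) -> list[str]:
    """Return URLs last modified more than max_age_days before today, sorted."""
    return sorted(
        url for url, modified in pages.items()
        if (today - modified).days > max_age_days
    )

# Hypothetical inventory of high-value pages and their last-modified dates.
pages = {
    "/pricing": date(2025, 1, 10),
    "/blog/email-deliverability": date(2022, 6, 1),
    "/features": date(2025, 3, 1),
}
needs_review = stale_pages(pages, today=date(2025, 3, 15))
```

With these example dates, only the 2022 blog post is flagged for review.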
Signal 5: Citation chains and corroboration
This is the signal most brands miss entirely.
AI search engines don’t just evaluate your page in isolation. They evaluate it in the context of what other sources say about the same topic. When multiple authoritative sources agree on a claim and reference each other, that claim becomes more “citable” for AI engines.
Think of it as corroboration. If your page says Product X costs $49/month, and three other authoritative sources confirm the same pricing, the AI engine has high confidence citing that claim. If only your page makes that claim, the engine may skip it or hedge with language like “according to one source.”
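The pricing example can be sketched as a simple agreement count. This is an illustration of the corroboration idea, not a description of how any engine actually scores claims. The source names and the majority-share measure are assumptions.

```python
from collections import Counter

def corroboration(claims: dict[str, str]) -> tuple[str, float]:
    """Return the most commonly reported value and the share of sources reporting it."""
    counts = Counter(claims.values())
    value, n = counts.most_common(1)[0]
    return value, n / len(claims)

# Hypothetical sources and what each says Product X costs.
claims = {
    "yourbrand.example": "$49/month",
    "review-site.example": "$49/month",
    "news.example": "$49/month",
    "old-forum.example": "$39/month",
}
value, agreement = corroboration(claims)
```

Three of the four sources agree on $49/month, so the agreement share is 0.75. A claim made by only one source would score far lower in a check like this.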
This is also where being cited by others creates a compounding effect. When authoritative publications reference your brand, your research, or your data, those references become part of the citation graph that AI engines use to evaluate trustworthiness. Perplexity’s Publishers’ Program is an explicit example: the company built revenue-sharing relationships with publishers specifically because their content forms the backbone of Perplexity’s citation system.
Google’s approach to corroboration comes through query fan-out. According to Google, AI features use fan-out to identify “more supporting web pages, allowing us to display a wider and more diverse set of helpful links associated with the response than with a classic web search.” The word “supporting” is key: it suggests that pages which back up the same answer are more likely to be surfaced alongside it.
What to do about it: Create original research, data, or frameworks that other sites will reference. Make your claims easy to verify with clear sources. Build relationships with publications in your space that naturally cite each other’s work. The goal is to become part of the citation network, not just a standalone source.
Signal 6: Technical accessibility
None of the above matters if AI crawlers can’t reach your content.
Google’s documentation is explicit about the baseline requirements: “To be eligible to be shown as a supporting link in AI Overviews or AI Mode, a page must be indexed and eligible to be shown in Google Search with a snippet.” There are “no additional technical requirements” beyond what’s needed for standard Google Search.
But “no additional requirements” doesn’t mean technical issues don’t block you. Semrush’s analysis identifies four common technical blockers: robots.txt files blocking crawler access, accidental noindex tags, 4XX errors on important pages, and poor site structure that prevents efficient crawling.
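An accidental noindex tag is easy to test for. The sketch below uses Python’s standard html.parser to look for a robots meta tag containing noindex. It checks only the HTML, so a noindex sent via the X-Robots-Tag HTTP header would need a separate check, and the sample markup is invented.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Records whether any robots-style meta tag contains a noindex directive."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        # Generic "robots" plus two crawler-specific meta names.
        if attr.get("name", "").lower() in {"robots", "googlebot", "bingbot"}:
            if "noindex" in attr.get("content", "").lower():
                self.noindex = True

def has_noindex(html: str) -> bool:
    """True if the HTML contains a meta tag telling crawlers not to index the page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex

# Hypothetical page heads.
blocked = has_noindex('<head><meta name="robots" content="noindex, nofollow"></head>')
allowed = has_noindex('<head><meta name="robots" content="index, follow"></head>')
```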
For ChatGPT search, which relies on Bing’s index, you also need to ensure Bing can crawl and index your pages. Check your robots.txt for any rules that might block Bingbot, and verify your pages appear in Bing Webmaster Tools. Perplexity crawls the web with its own crawler, PerplexityBot, so check that it isn’t blocked either.
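What to do about it: Run a technical audit focused on crawlability and indexability. Check that your robots.txt doesn’t block the crawlers that feed AI search: Googlebot, Bingbot, PerplexityBot, and OpenAI’s OAI-SearchBot. Two tokens are often confused with these: GPTBot collects training data for OpenAI’s models, and Google-Extended is a robots.txt control token for Gemini rather than a crawler, and by Google’s own description doesn’t affect inclusion in Google Search. Verify your important pages are indexed in both Google and Bing. Fix any 4XX errors and ensure your internal linking makes every important page discoverable.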
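The robots.txt part of that audit can be sketched with Python’s standard urllib.robotparser. The robots.txt content and the URL below are made up, and the list of user agents is an assumption about which crawlers you care about, so adjust it to your own priorities.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks GPTBot everywhere, and /admin/ for everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /admin/
"""

def crawler_access(robots_txt: str, url: str, agents: list[str]) -> dict[str, bool]:
    """Report, per user agent, whether robots.txt allows fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}

access = crawler_access(
    ROBOTS_TXT,
    "https://example.com/blog/post",
    ["Googlebot", "Bingbot", "GPTBot", "PerplexityBot", "OAI-SearchBot"],
)
blocked_agents = [agent for agent, ok in access.items() if not ok]
```

To audit a live site, you could call set_url and read on the parser instead of parsing a string. Note that a robots.txt check says nothing about indexing, which still needs to be verified in Search Console and Bing Webmaster Tools.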
How the signals differ by engine
The six signals above apply across all AI search engines, but each engine weights them differently.
Google AI Overviews lean heavily on existing search authority. If you rank well organically for a query, you have a strong baseline for appearing in the AI Overview for that query. Google’s official guidance reinforces this: the same SEO best practices that drive organic rankings also drive AI feature inclusion.
Perplexity places more emphasis on citation chains and source diversity. It pulls from multiple sources per answer and actively shows inline citations, making corroboration a more visible part of its citation logic. Its Publishers’ Program suggests that editorial authority and trustworthiness carry significant weight in source selection.
ChatGPT search uses Bing’s index, which means Bing-specific ranking signals (page authority, social signals, content freshness) influence what gets retrieved. Brands that have neglected Bing optimisation may find themselves underrepresented in ChatGPT’s citations.
Understanding these differences helps you prioritise. If most of your audience uses Google, double down on organic authority. If ChatGPT is a growing referral source, make sure Bing can see your best content.
The click-through reality
Getting cited is only half the equation. The other half is whether citations actually drive traffic.
Google’s blog on the AI Overviews launch states that “the links included in AI Overviews get more clicks than if the page had appeared as a traditional web listing for that query.” Google also notes that “people have been visiting a greater diversity of websites for help with more complex questions.”
But there’s a counterpoint. According to Semrush, a Pew Research Center report found that “users click traditional search result links just 8% of the time when an AI Overview is present, versus 15% of the time when there’s no AI Overview present. And users only click results within the AI Overview 1% of the time.”
Both can be true simultaneously. Individual links in AI Overviews may get higher click-through rates than they would as organic listings, while the overall click-through rate on the page decreases because many users get their answer from the AI summary itself. The implication: getting cited matters more, not less, because fewer total clicks are available. Each citation carries more weight.
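A rough back-of-the-envelope calculation shows the scale of the shrinking click pool. The sketch below applies the Pew percentages quoted above to 1,000 searches. Treating the two click types as non-overlapping and simply adding them is a simplifying assumption, not something the report states.

```python
def clicks_per_1000(organic_pct: int, overview_pct: int = 0) -> int:
    """Clicks per 1,000 searches, given whole-number click percentages."""
    return 10 * (organic_pct + overview_pct)

# Figures quoted from the Pew Research Center report, via Semrush.
without_overview = clicks_per_1000(15)    # no AI Overview: 15% click a result
with_overview = clicks_per_1000(8, 1)     # AI Overview: 8% organic + 1% in-overview

# Share of clicks that disappear when an AI Overview is present.
drop = round(1 - with_overview / without_overview, 2)
```

Under these assumptions, 150 clicks per 1,000 searches fall to 90, a drop of about 40%. That is the sense in which fewer total clicks are available and each citation carries more weight.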
This is why learning how to get cited by AI isn’t optional anymore. As the informational share of keywords triggering AI Overviews falls from 89% to 57% (with commercial and transactional queries filling the gap), citation visibility becomes a direct revenue signal, not just a brand awareness play.
Where to start
If you’re approaching this for the first time, don’t try to optimise for all six signals at once. Start with what’s blocking you.
- Technical audit first. Make sure AI crawlers can access your content. This is binary: either they can or they can’t.
- Content structure second. Restructure your highest-value pages with clear headings, direct answers, and accurate structured data.
- Entity clarity third. Ensure your brand information is consistent across your site, Google Business Profile, and third-party sources.
- Authority and citations ongoing. Build original research and relationships that earn references from other authoritative sources.
The brands that get cited consistently aren’t doing anything mysterious. They’re doing the fundamentals well, across every surface where AI engines look for answers.
For a deeper look at the specific factors each engine weighs, explore our AI SEO research hub.