Schema markup has always been about translation. You write content for humans. Structured data translates it into a format that machines can read without guessing.
For the last decade, that translation mostly served one purpose: earning rich snippets in Google Search. Star ratings, FAQ dropdowns, recipe cards. Visual enhancements that made your listing stand out on a results page.
That use case still matters. But structured data has picked up a second job that most schema guides don’t cover.
AI search engines like ChatGPT, Google’s AI Overviews, and Microsoft Copilot now use retrieval-augmented generation (RAG) to pull content from search indexes and generate answers. According to Google’s AI optimization guide, their generative AI features are “rooted in our core Search ranking and quality systems” and rely on RAG, “a technique used to improve the quality, accuracy, and freshness of AI responses by relying on our core Search ranking systems to retrieve relevant, up-to-date web pages from our Search index.”
Your structured data is one of the inputs to those ranking systems. It helps tell the retrieval layer what your page is about, what entities it references, and how those entities relate to each other. The question isn’t whether schema markup matters for AI search. It’s which types matter, and which ones you can safely deprioritise.
How AI search engines consume structured data
To understand which schema types feed AI citations, you need to understand how AI search actually works under the hood.
Google’s AI Overviews use two key mechanisms. The first is RAG, where the model retrieves web pages from Google’s search index and reviews specific information from those pages to generate responses. The second is what Google calls “query fan-out,” which it describes as “a set of concurrent, related queries generated by the model to request more information and fetch additional relevant search results to address the user’s query.” Google’s guide gives an example: if the original query is “how to fix a lawn that’s full of weeds,” fan-out queries might include “best herbicides for lawns,” “remove weeds without chemicals,” and “how to prevent weeds in lawn.”
Both mechanisms rely on Google’s core search index. Structured data that helps Google categorise and understand your pages improves your chances of being retrieved during both the initial query and the fan-out queries that follow.
Microsoft takes a more direct stance on the relationship. According to Semrush’s schema markup guide, Microsoft’s guidance states that “schema is a type of code that helps search engines and AI systems understand your content.” That phrasing puts search engines and AI systems on equal footing as consumers of structured data.
The schema types that feed AI citation
Schema.org’s vocabulary currently consists of 823 types and 1,529 properties. You don’t need most of them. Google’s structured data documentation supports a specific set of markup types, each with its own rich result features. But not all of these carry equal weight when it comes to AI search.
Here’s where to focus.
Organization markup
Organization schema is the closest thing to an identity card for your business in the eyes of both search engines and AI systems. It tells machines your legal name, logo, address, contact information, social profiles, and business identifiers.
Google’s documentation describes Organization markup as helping “Google better understand your organization’s administrative details and disambiguate your organization in search results.” That disambiguation is critical for AI search. When a language model needs to decide which “Acme” to cite, Organization schema provides the structured signals that resolve ambiguity.
Key properties to include: name, url, logo, description, sameAs (linking to your social profiles and other authoritative pages about your business), address, contactPoint, and business identifiers like vatID or iso6523Code. Google notes that “some properties are used behind the scenes to disambiguate your organization from other organizations (like iso6523 and naics).” Those behind-the-scenes properties are exactly the kind of signals that help AI systems match your entity to the right queries.
If you’re an ecommerce business, Google supports a more specific OnlineStore subtype that can include shipping and return policy information. This feeds directly into the merchant knowledge panel and product-related AI responses.
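As a rough illustration of the properties listed above, here is a minimal Python sketch that builds Organization markup as a dictionary and wraps it in the script tag used to embed JSON-LD in a page. Every value (the company name, URLs, address, phone number, and VAT number) is a hypothetical placeholder, and the helper name to_jsonld_script is made up for this example.

```python
import json

# All values are hypothetical placeholders, not real identifiers.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Acme Tools Ltd",
    "url": "https://www.example.com",
    "logo": "https://www.example.com/logo.png",
    "description": "Hand tools for professional workshops.",
    # sameAs points at other authoritative profiles for the same entity.
    "sameAs": [
        "https://www.linkedin.com/company/example",
        "https://en.wikipedia.org/wiki/Example",
    ],
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "London",
        "postalCode": "EC1A 1AA",
        "addressCountry": "GB",
    },
    "contactPoint": {
        "@type": "ContactPoint",
        "telephone": "+44-20-0000-0000",
        "contactType": "customer service",
    },
    "vatID": "GB000000000",
}


def to_jsonld_script(data: dict) -> str:
    """Wrap a dictionary in the script tag used to embed JSON-LD in HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


print(to_jsonld_script(organization))
```

The printed block would go in the HTML of your homepage. Only include properties that are true of your business and visible or verifiable elsewhere.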
Product markup
Product structured data comes in two flavours. Product snippets apply to pages where users can’t purchase directly (review pages, comparison articles). Merchant listings apply to pages where users can buy.
For AI search, both matter. Google states that “when you add structured data to your product pages, your product information can appear in richer ways in Google Search results (including Google Images and Google Lens).” That same product data feeds into the retrieval systems that AI Overviews query when someone asks a product comparison question.
The properties that carry the most weight for AI retrieval include price, availability, review ratings, shipping information, and return policies. Google notes that “it is recommended to provide as much rich product information as available, without concern for the exact experiences that will use it.” That guidance applies doubly for AI search, where the model may pull your product data to construct a comparison or recommendation it couldn’t have built from page text alone.
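To make those properties concrete, here is a hedged Python sketch of Product markup carrying price, availability, ratings, shipping, and a return policy. The product, prices, and policy details are invented for illustration. The property names come from Schema.org, but check Google's current documentation for which ones it requires on your page type.

```python
import json

# Hypothetical product data for illustration only.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example Cordless Drill",
    "sku": "DRILL-001",
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": 4.6,
        "reviewCount": 128,
    },
    "offers": {
        "@type": "Offer",
        "price": "49.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
        "shippingDetails": {
            "@type": "OfferShippingDetails",
            "shippingRate": {
                "@type": "MonetaryAmount",
                "value": "4.95",
                "currency": "USD",
            },
            "shippingDestination": {
                "@type": "DefinedRegion",
                "addressCountry": "US",
            },
        },
        "hasMerchantReturnPolicy": {
            "@type": "MerchantReturnPolicy",
            "applicableCountry": "US",
            "returnPolicyCategory": "https://schema.org/MerchantReturnFiniteReturnWindow",
            "merchantReturnDays": 30,
        },
    },
}

# Serialise to the JSON-LD text that would be embedded in the product page.
product_jsonld = json.dumps(product, indent=2)
print(product_jsonld)
```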
FAQ markup
FAQ structured data defines question-and-answer pairs on a page. While Google has pulled back on displaying FAQ rich results for most sites, the markup itself still serves an important function for AI search.
Google’s search gallery lists FAQ as a supported structured data type, noting that “a Frequently Asked Question (FAQ) page contains a list of questions and answers pertaining to a particular topic.” The question-and-answer format aligns naturally with how AI search works. When someone asks ChatGPT or an AI Overview a direct question, the retrieval system looks for pages that contain clear, structured answers. FAQ schema makes those answers machine-readable.
Even without the visual rich result, FAQ markup creates a structured signal that says “this page answers these specific questions.” That signal helps retrieval systems match your page to natural language queries during both the primary search and query fan-out.
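One way to generate that signal is to build the markup from the question-and-answer pairs already on the page. The sketch below assumes you have those pairs as a list of tuples. The function name build_faq_page and the sample questions are hypothetical.

```python
import json


def build_faq_page(pairs):
    """Build FAQPage markup from (question, answer) tuples taken from the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Hypothetical content; use the questions that actually appear on your page.
faq = build_faq_page([
    ("Do you ship internationally?", "Yes, we ship to most countries."),
    ("What is your return window?", "You can return items within 30 days."),
])

print(json.dumps(faq, indent=2))
```

Generating the markup from the visible content, rather than writing it separately, keeps the two from drifting apart.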
Article markup
Article schema tells search engines the type of content on your page, including the headline, author, date published, and date modified. Google’s search gallery describes Article markup as applicable to “a news, sports, or blog article displayed in various rich result features.”
For AI search, Article schema serves two purposes. First, it establishes publication freshness. AI systems, like traditional search, prioritise recent content for topics where recency matters. The datePublished and dateModified properties give the model a structured timestamp rather than forcing it to guess from page text. Second, author information embedded in Article schema contributes to the experience and authority signals that influence AI rankings.
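A minimal sketch of those timestamp and author properties, assuming your CMS can hand you publication dates as timezone-aware datetime objects. The function name build_article, the headline, and the author are placeholders.

```python
import json
from datetime import datetime, timezone


def build_article(headline, author_name, published, modified):
    """Build Article markup with ISO 8601 timestamps and a named author."""
    if modified < published:
        raise ValueError("dateModified cannot be earlier than datePublished")
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author_name},
        # isoformat() produces the ISO 8601 strings Schema.org expects for dates.
        "datePublished": published.isoformat(),
        "dateModified": modified.isoformat(),
    }


# Hypothetical article details.
article = build_article(
    headline="Which schema types matter for AI search",
    author_name="Jane Example",
    published=datetime(2025, 1, 15, 9, 0, tzinfo=timezone.utc),
    modified=datetime(2025, 3, 2, 14, 30, tzinfo=timezone.utc),
)

print(json.dumps(article, indent=2))
```

The guard on the dates is a design choice for this sketch: a modified date that precedes the published date is almost always a data error, so it is cheaper to catch it before the markup ships.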
HowTo markup
HowTo schema breaks a process into discrete, ordered steps. Each step can include text descriptions, images, tools required, and estimated time. Google retired its HowTo rich results in 2023, but the type remains part of Schema.org’s vocabulary, and it maps directly to instructional queries, a common category of AI search query.
When Google’s AI Overview uses query fan-out to answer a question like “how to fix a lawn that’s full of weeds,” it generates related sub-queries and retrieves pages for each of them. HowTo schema that explicitly defines the steps of a process in structured data makes your page easier to parse and cite.
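Here is a rough sketch of what explicitly defined steps look like, using the lawn example as the title. The step text, tools, and duration are invented, and build_howto is a hypothetical helper, not part of any library.

```python
import json


def build_howto(name, steps, tools=(), total_time=None):
    """Build HowTo markup from ordered (step_name, step_text) tuples."""
    data = {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {
                "@type": "HowToStep",
                "position": position,
                "name": step_name,
                "text": step_text,
            }
            for position, (step_name, step_text) in enumerate(steps, start=1)
        ],
    }
    if tools:
        data["tool"] = [{"@type": "HowToTool", "name": tool} for tool in tools]
    if total_time:
        # totalTime is an ISO 8601 duration, for example "PT2H" for two hours.
        data["totalTime"] = total_time
    return data


# Hypothetical steps; mirror the steps that appear in your visible content.
howto = build_howto(
    "How to fix a lawn that's full of weeds",
    [
        ("Identify the weeds", "Work out which weed species are present."),
        ("Remove or treat them", "Pull weeds by hand or apply a suitable treatment."),
        ("Overseed bare patches", "Sow grass seed where weeds were removed."),
    ],
    tools=["Garden fork", "Lawn spreader"],
    total_time="PT2H",
)

print(json.dumps(howto, indent=2))
```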
LocalBusiness markup
LocalBusiness is a subtype of Organization that adds location-specific properties: opening hours, geographic coordinates, service areas, and menu URLs for restaurants. Google’s search gallery lists local business details as appearing “in the Google knowledge panel, including open hours, ratings, directions, and actions to book appointments or order items.”
For AI search, local markup matters because AI engines increasingly handle location-based queries. When someone asks ChatGPT for “best plumber near me,” the answer draws on business data from the underlying search index, which can include the signals that LocalBusiness schema provides.
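The location-specific properties can be sketched the same way. The example below uses Plumber, one of Schema.org's more specific LocalBusiness subtypes, to match the query above. The business name, address, coordinates, and hours are all placeholders.

```python
import json

# Hypothetical business details for illustration only.
local_business = {
    "@context": "https://schema.org",
    "@type": "Plumber",  # a more specific subtype of LocalBusiness
    "name": "Example Plumbing Co",
    "url": "https://www.example.com",
    "telephone": "+44-20-0000-0000",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "1 Example Street",
        "addressLocality": "London",
        "postalCode": "EC1A 1AA",
        "addressCountry": "GB",
    },
    "geo": {
        "@type": "GeoCoordinates",
        "latitude": 51.5074,
        "longitude": -0.1278,
    },
    "areaServed": "London",
    "openingHoursSpecification": [
        {
            "@type": "OpeningHoursSpecification",
            "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
            "opens": "08:00",
            "closes": "18:00",
        }
    ],
}

local_business_jsonld = json.dumps(local_business, indent=2)
print(local_business_jsonld)
```

For a multi-location brand, you would typically emit one such block per location page, each with its own address, coordinates, and hours.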
Schema types that don’t move the AI needle
Not every structured data type contributes to AI citation. Several exist primarily to trigger specific visual treatments in traditional search results.
Breadcrumb markup tells Google where a page sits in your site hierarchy. It generates the breadcrumb trail under your search listing. Useful for user navigation, but it doesn’t provide the kind of entity or factual data that AI retrieval systems query.
Sitelinks search box lets users search your site directly from Google results. This is a UI feature with no bearing on how AI systems evaluate or cite your content.
Carousel markup creates a scrollable gallery of results from a single site. It’s a display format, not a content signal. The underlying content types (Recipe, Movie, Course) carry the weight for AI retrieval, not the carousel wrapper.
Event markup triggers event listings in search results with dates, times, and venues. While useful for event discovery, AI search engines rarely cite event pages when answering informational queries. The exception is if someone asks a direct question about a specific event, in which case the underlying content matters more than the schema.
The gap between schema and AI citation
As Semrush notes, “schema markup may be important for AI visibility, but there’s limited evidence showing it definitively improves your likelihood of showing up in AI responses.” That’s an honest assessment, and it’s worth understanding why.
AI search engines don’t read your JSON-LD directly and feed it into a prompt. They use structured data as one input among many during the retrieval phase. Your schema helps the retrieval system decide which pages to pull. The language model then reads the actual page content to generate its answer.
This means schema markup works as a targeting mechanism, not a content mechanism. It helps your page get retrieved for the right queries. But the content on the page still determines whether you get cited.
Google’s AI optimization guide reinforces this distinction. Rather than emphasising schema specifically, Google recommends focusing on “creating content that people find unique, compelling, and useful” and notes this “will likely influence your website’s presence in generative AI search in the long run more than any of the other suggestions in this guide.” The best generative engine optimization strategy combines strong structured data with content that actually answers the query better than competing pages.
How to prioritise your schema implementation
If you’re starting from zero or auditing your existing structured data, here’s the order that maximises impact for both traditional and AI search.
Start with Organization schema on your homepage. This establishes your entity identity across all search systems. Include as many properties as apply to your business, paying particular attention to sameAs links and business identifiers that disambiguate you from other organisations.
Add Product or Merchant Listing schema to every product page. Include price, availability, ratings, shipping, and return information. Google recommends providing “as much rich product information as available.”
Implement Article schema on all blog and editorial content. Always include datePublished, dateModified, and author information. These signals feed into freshness and authority evaluations.
Add FAQ schema where you have genuine question-and-answer content. Don’t manufacture FAQ sections just for the markup. But where you naturally answer specific questions, structured data helps retrieval systems match your page to those queries.
Use HowTo schema for process-oriented content. Break complex procedures into explicit steps with descriptions and, where relevant, required tools and estimated times.
Layer in LocalBusiness schema if you serve specific geographic areas. This applies whether you’re a single-location business or a multi-location brand.
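If you are auditing a larger site, the order above can be turned into a simple rollout plan. This sketch assumes you have already labelled your page templates with kinds such as "homepage" or "blog". Those labels, the mapping, and the function name rollout_order are hypothetical and would need adapting to your own site.

```python
# Priority order taken from the steps above.
PRIORITY = ["Organization", "Product", "Article", "FAQPage", "HowTo", "LocalBusiness"]

# Hypothetical page-kind labels mapped to the schema type each one needs.
PAGE_KIND_TO_TYPE = {
    "homepage": "Organization",
    "product": "Product",
    "blog": "Article",
    "faq": "FAQPage",
    "guide": "HowTo",
    "location": "LocalBusiness",
}


def rollout_order(page_kinds):
    """Return the schema types a site needs, sorted by implementation priority."""
    needed = {
        PAGE_KIND_TO_TYPE[kind] for kind in page_kinds if kind in PAGE_KIND_TO_TYPE
    }
    return [schema_type for schema_type in PRIORITY if schema_type in needed]


# A site with location pages, a blog, and a homepage, but no product pages.
print(rollout_order(["location", "blog", "homepage"]))
```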
Validation and testing
Google recommends using the Rich Results Test to check which Google rich results your structured data can generate. For generic schema validation that isn’t tied to Google’s specific features, use the Schema Markup Validator at validator.schema.org.
The Rich Results Test will flag errors and warnings specific to Google’s structured data requirements. Fix critical errors first. Non-critical warnings are worth addressing too, as Google notes they “can help improve the quality of your structured data.”
One crucial point from Google’s AI optimization guide: “Make sure structured data matches the visible content.” Systems that consume your markup can compare it against what’s actually on the page. If your Product schema lists a price of $49 but your page shows $59, that mismatch undermines trust in both traditional and AI search.
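You can catch that kind of mismatch yourself before a search engine does. The sketch below uses only Python's standard library to pull JSON-LD out of a page and check that a Product price also appears in the visible text. It is a deliberately naive check, assuming one top-level object per JSON-LD block and a plain substring match, and it is not how Google or any AI system performs the comparison. The class and function names are made up for this example.

```python
import json
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collect JSON-LD blocks and visible text from an HTML document."""

    def __init__(self):
        super().__init__()
        self.jsonld_blocks = []
        self.visible_text = []
        self._buffer = []
        self._in_jsonld = False
        self._skip = False  # inside a non-JSON-LD <script> or a <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
        elif tag in ("script", "style"):
            self._skip = True

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.jsonld_blocks.append(json.loads("".join(self._buffer)))
            self._buffer = []
            self._in_jsonld = False
        elif tag in ("script", "style"):
            self._skip = False

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)
        elif not self._skip:
            self.visible_text.append(data)


def price_matches_page(html: str) -> bool:
    """Return False if a Product price in JSON-LD is missing from the visible text."""
    parser = PageParser()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.visible_text)
    for block in parser.jsonld_blocks:
        if isinstance(block, dict) and block.get("@type") == "Product":
            price = str(block.get("offers", {}).get("price", ""))
            if price and price not in text:  # naive substring comparison
                return False
    return True


# Hypothetical page where the markup says 49 but the page shows $59.
sample = """<html><body>
<h1>Example Drill</h1><p>Now only $59</p>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Example Drill",
 "offers": {"@type": "Offer", "price": "49", "priceCurrency": "USD"}}
</script>
</body></html>"""

print(price_matches_page(sample))
```

A production check would need to handle arrays, @graph wrappers, and number formatting, so treat this as a starting point for a pre-publish test rather than a complete validator.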
What this means for your site
Schema markup isn’t a switch you flip to appear in AI search results. It’s infrastructure that makes your content easier for retrieval systems to find, parse, and match to the right queries.
The schema types that matter most for AI search are the ones that describe real entities and provide factual data: Organization, Product, FAQ, Article, HowTo, and LocalBusiness. The types that exist purely for visual formatting in traditional search results don’t carry the same weight.
Start with the types that match your content. Validate them. Keep them in sync with what’s actually on your pages. And pair them with the kind of unique, non-commodity content that AI systems want to cite in the first place.