Schema Markup: The Definitive Guide for 2026
Which schema types actually earn you something, and which ones waste your afternoon
How to use structured data to feed entities to Google's Knowledge Graph and to LLMs
Copy-paste JSON-LD you can ship today, plus the mistakes that get you ignored
KEY TAKEAWAYS
- Schema is a clarity layer, not a ranking factor. It changes how machines understand and display your content.
- JSON-LD is the only format worth using. It is decoupled from your HTML, auditable in one place, and Google's preference.
- Match the type to the content. Article for articles, Product for products, Person for people. Never fake reviews or ratings.
- Entity markup with sameAs and a stable @id is the highest-leverage work. It feeds the Knowledge Graph and the LLMs.
- Use @graph to connect your entities into one web instead of scattered, isolated blocks.
- Validate before every deploy and monitor Search Console after. Schema breaks silently when templates change.
CHAPTER 01
What Schema Markup Actually Does
Let me clear something up before we go one inch further. Schema markup is not a ranking factor. I have watched people chase it for a decade like it was a magic dial. It is not.
Here is what it actually is. Schema markup is a way to label the things on your page so a machine knows what they are. A price is a price. A review is a review. An author is a person, not a string of pixels. You are adding a layer of meaning on top of your HTML.
Think about it from Google's side. A crawler lands on your page and sees a wall of text, some images, and a few numbers floating around. It has to guess. Is that number a price, a phone number, or a date? Schema removes the guessing. You hand the machine a clean, labeled answer.
And here is the part most people miss. When the machine is sure about what your content means, good things follow. Rich results. Knowledge panel entries. Citations inside AI answers. None of those are a ranking boost in the classic sense. They are visibility. They are real estate. They are the difference between a plain blue link and a result that takes up half the screen.
Clarity, not ranking
Schema does not move you up the list. It changes how you look once you are on the list, and how confidently machines repeat what you said.
The vocabulary comes from Schema.org, a shared project backed by Google, Microsoft, Yandex, and others. That shared part matters. It means the labels you write are understood by every major engine, not just one. You learn it once, and it pays off across the whole search and AI landscape. We will lean on that vocabulary the rest of this guide.
PRO TIP
If you only remember one sentence from this chapter: schema does not make your content better, it makes your content legible. Garbage with perfect markup is still garbage.
CHAPTER 02
Rich Results vs the Knowledge Graph
People lump these two together. They are not the same thing, and treating them the same is why so much schema work goes nowhere.
Rich results are about a single page
A rich result is the enhanced look of one page in the search listing. Star ratings under a product. The little FAQ accordion. A recipe card with cook time and calories. Breadcrumbs instead of a raw URL. These are page-level. You earn them by marking up the specific content on that specific page.
Rich results are transactional in a way. You give Google clean data, Google may give you a fancier listing. May. Nothing is guaranteed. Google decides per query whether to show the enhancement at all, and it changes its mind often.
The Knowledge Graph is about entities
The Knowledge Graph is Google's internal map of things in the world. People, companies, places, books, concepts. Each is an entity with properties and relationships. When you search a brand and a panel appears on the right with the logo, founders, and social links, that is the Knowledge Graph talking.
You do not directly write into the Knowledge Graph. You contribute signals. Your Organization and Person markup, your sameAs links to verified profiles, your consistency across the web. Google connects those dots and decides whether you are a real entity worth tracking. This is the long game, and it is the part that matters most for the AI era. More on that in chapter five.
- Rich results: how one page looks in the listing. Fast feedback, page-level markup.
- Knowledge Graph: whether you exist as a recognized entity. Slow, cumulative, site-wide.
- You optimize for rich results with content schema. You optimize for the graph with entity schema.
"Rich results win you the click today. The Knowledge Graph wins you the citation forever. Most people only play for the click." (Shmul)
If you want to go deeper on the AI side of this, I broke it down in the guide to generative engine optimization. Entities are the bridge between old-school SEO and getting named inside AI answers.
CHAPTER 03
Why JSON-LD Wins (And the Others Lost)
There are three ways to put structured data on a page. Microdata, RDFa, and JSON-LD. I am going to save you the history lecture. Use JSON-LD. Full stop.
Microdata and RDFa work by sprinkling attributes all over your visible HTML. You wrap a span here, add an itemprop there, tangle your markup into your content. It works, technically. It is also a nightmare to maintain. Every template change risks breaking your structured data, and you cannot see it all in one place to audit it.
JSON-LD is different. It is a single block of structured data, usually dropped in a script tag in the head or body. It sits separate from your visible content. You can read it top to bottom. You can validate it on its own. You can generate it from a template or a CMS without touching your HTML layout.
Google has openly recommended JSON-LD for years. That alone should settle the argument. But the practical reasons are stronger. Here is a minimal Article block so you can see how clean it is.
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Schema Markup: The Definitive Guide",
  "author": {
    "@type": "Person",
    "name": "Samuel Dorenbaum"
  },
  "datePublished": "2026-01-15",
  "dateModified": "2026-06-30",
  "publisher": {
    "@type": "Organization",
    "name": "Shmul Playbook"
  }
}
- It lives in one place, so you can read and audit it in seconds.
- It is decoupled from your HTML, so a design change will not silently break it.
- It is trivial to generate dynamically from your CMS or a script.
- Google prefers it, and the testing tools assume it.
PRO TIP
If you inherited a site full of Microdata, do not panic-migrate everything at once. Add JSON-LD for new pages and high-value templates first, then phase the old stuff out. Do not run both formats describing the same thing on one page. Duplicate descriptions can conflict and confuse parsers.
CHAPTER 04
The Schema Types That Actually Earn You Something
Schema.org has hundreds of types. You do not need most of them. I am going to give you the ones that earn a rich result, feed an entity, or help an AI understand the page. Ignore the rest until you have a reason.
The workhorses
- Article, NewsArticle, or BlogPosting: tells engines this is editorial content, who wrote it, and when. The foundation for authorship and freshness signals.
- Product plus Offer: price, availability, currency, condition. Earns price and availability in listings and feeds shopping surfaces.
- Review and AggregateRating: star ratings, but only when they describe genuine reviews on your own page. Never fake them.
- BreadcrumbList: turns a raw URL into a clean clickable path in the result. Easy win, low effort, broadly supported.
- FAQPage: marks up question and answer pairs. Note Google has narrowed where this shows, so treat the rich result as a bonus, not the goal.
- HowTo: step-by-step instructions. Also narrowed by Google over time, still useful for machine comprehension.
- Organization: who you are as a company. Logo, name, contact, social profiles. Pure entity fuel.
- Person: an individual, usually your author. Critical for E-E-A-T and the Knowledge Graph.
- LocalBusiness: address, hours, geo, phone. Essential if you have a physical location.
- VideoObject: thumbnail, duration, upload date. Helps video show and get understood.
- WebSite: names your site as an entity. Its SearchAction used to earn the sitelinks search box, which Google retired in late 2024, so treat that property as optional.
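To make the BreadcrumbList entry in the list above concrete, here is a minimal sketch. The example.com URLs and the page names are placeholders I made up for illustration, so swap in your real paths.

```json
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Playbook",
      "item": "https://example.com/playbook/"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Schema Markup",
      "item": "https://example.com/playbook/schema-markup/"
    }
  ]
}
```

Each ListItem carries a position, a name, and the URL for that level of the path, so the order of the trail stays explicit.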
Match type to intent
A blog post gets Article. A store page gets Product. A bio page gets Person. Do not staple Product onto an essay because you read it earns stars. It will not, and you risk a penalty.
A word on FAQ and HowTo
Since 2023, Google has shown FAQ rich results mainly for well-known government and health sites, and it has dropped HowTo rich results. I still recommend the markup. Why? Because the rich result was never the only point. That structured question and answer is exactly the format an AI engine wants when it is pulling an answer. You are not just dressing up a listing, you are pre-chewing your content for the machines that summarize the web. That ties directly into how you get cited in ChatGPT and how you win AI Overviews.
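Here is roughly what that question and answer structure looks like as FAQPage markup. This is a sketch, not a guaranteed rich result. The questions come from this guide's own FAQ, and the answer text is my condensed paraphrase of points made elsewhere on this page. Remember that the same questions and answers must be visible on the page itself.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "Does schema markup improve my rankings?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Not directly. Schema is a clarity layer. It changes how machines understand and display your content."
      }
    },
    {
      "@type": "Question",
      "name": "Which schema format should I use?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "JSON-LD. It is decoupled from your HTML, auditable in one place, and the format Google recommends."
      }
    }
  ]
}
```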
PRO TIP
Before you add any rich-result type, check Google's Search Central docs for that type's required and recommended properties. A required property missing means no rich result, period. The validator will tell you, but knowing up front saves a round trip.
CHAPTER 05
Entities, sameAs, and Becoming a Real Thing
This is the chapter most guides skip, and it is the one that matters most going forward. If you only do FAQ markup and call it a day, you are leaving the real value on the table.
An entity is a thing the search engines recognize as distinct and real. You as a person. Your company. Your product line. The difference between being a string of text and being a known entity is enormous. Known entities get cited. Known entities get knowledge panels. Known entities get trusted by LLMs that are deciding whose claims to repeat.
The single most powerful property for this is sameAs. It is an array of URLs that point to other authoritative profiles describing the same entity. Your LinkedIn. Your verified social accounts. Your Wikipedia page if you have one. Your Crunchbase, your author page on a publication, your GitHub. You are telling Google: all of these are me, connect the dots.
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Samuel Dorenbaum",
  "jobTitle": "SEO Consultant",
  "url": "https://shmulplay.com/about/",
  "sameAs": [
    "https://www.linkedin.com/in/example",
    "https://x.com/example",
    "https://github.com/example"
  ],
  "knowsAbout": ["Search Engine Optimization", "Structured Data", "Generative Engine Optimization"]
}
Notice knowsAbout in there. That property tells engines what topics this person is an authority on. It is a quiet but useful signal for tying your name to your subject area. Same idea for an Organization: list the sameAs profiles, the founders, the logo, the contact points.
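For the Organization side, a minimal sketch might look like this. Every value here is a placeholder: "Example Co", "Jane Example", the example.com URLs, and the support email are invented for illustration and do not describe a real company.

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://example.com/",
  "logo": "https://example.com/assets/logo.png",
  "founder": {
    "@type": "Person",
    "name": "Jane Example"
  },
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "customer support",
    "email": "support@example.com"
  },
  "sameAs": [
    "https://www.linkedin.com/company/example",
    "https://x.com/example",
    "https://github.com/example"
  ]
}
```

The shape mirrors the Person block: one name, one url, one logo, and a sameAs array pointing at the profiles you actually control.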
Consistency is the whole game here. Your name, your logo, your social links must match across every place they appear. If your Organization schema says one thing and your LinkedIn says another, you have weakened the connection. Pick a canonical version of every fact about your entity and repeat it everywhere, identically.
sameAs is your identity glue
Every authoritative profile you link with sameAs is another vote that you are a real, verifiable entity. This is the foundation of trust at the machine level.
This connects straight to trust. I wrote a full breakdown of how experience, expertise, authority, and trust get evaluated in the E-E-A-T guide. Entity markup is how you make those abstract trust signals machine-readable.
CHAPTER 06
Nesting and @graph: Connecting Your Schema
Most sites bolt on schema blocks that have no idea the others exist. An Article here, an Organization there, a Person somewhere else, all strangers. You can do much better by connecting them into one graph.
There are two ways to connect schema. The first is nesting: you place one type inside a property of another. The author of an Article is a Person object, nested right inside. The publisher is an Organization object nested inside. This is fine for simple pages and you have already seen it above.
The second, and better, way for a full page is @graph. Instead of nesting everything inside one root object, you list your entities as separate nodes and link them by @id. Each node gets a stable @id URL. Then anywhere you reference that entity, you point to its @id instead of repeating the whole object.
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://shmulplay.com/#org",
      "name": "Shmul Playbook",
      "url": "https://shmulplay.com/"
    },
    {
      "@type": "Person",
      "@id": "https://shmulplay.com/#shmul",
      "name": "Samuel Dorenbaum"
    },
    {
      "@type": "Article",
      "@id": "https://shmulplay.com/playbook/schema-markup/#article",
      "headline": "Schema Markup: The Definitive Guide",
      "author": { "@id": "https://shmulplay.com/#shmul" },
      "publisher": { "@id": "https://shmulplay.com/#org" }
    }
  ]
}
See what happened. The Article references the Person and Organization by @id, not by copying them. Define each entity once, reference it everywhere. The whole page becomes one connected graph instead of three lonely islands. Machines love this because it mirrors how they actually model the world, as nodes and edges.
- Give every meaningful entity a stable @id (a URL with a fragment works well).
- Define each entity once, then reference it by @id wherever it appears.
- Reuse the same @id site-wide for your Organization and Person so engines merge them across pages.
- Use @graph at the page level when you have three or more connected types.
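To show the site-wide reuse from the list above, here is a sketch of the graph on a second page. The Organization and Person keep the exact @id values used earlier, and only the Article node is new. The "another-guide" slug and its headline are hypothetical, made up for this example.

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://shmulplay.com/#org",
      "name": "Shmul Playbook",
      "url": "https://shmulplay.com/"
    },
    {
      "@type": "Person",
      "@id": "https://shmulplay.com/#shmul",
      "name": "Samuel Dorenbaum"
    },
    {
      "@type": "Article",
      "@id": "https://shmulplay.com/playbook/another-guide/#article",
      "headline": "Another Guide",
      "author": { "@id": "https://shmulplay.com/#shmul" },
      "publisher": { "@id": "https://shmulplay.com/#org" }
    }
  ]
}
```

Because the two entity @id values are identical on both pages, an engine can treat them as the same Organization and the same Person rather than as lookalikes.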
PRO TIP
Keep your @id values consistent forever. If your Organization @id changes between pages, engines may treat them as different organizations. Pick the pattern once and never drift.
CHAPTER 07
Schema in the Age of AI: Feeding the Machines
Since Schema.org launched in 2011, schema was mostly about pleasing one machine, Google's crawler. The game changed. Now there is a whole class of machines reading your page: the language models behind ChatGPT, Perplexity, Gemini, and AI Overviews. Schema speaks their language too.
Let me be careful here, because there is a lot of hype. No, ChatGPT does not require your JSON-LD to understand your page. Modern LLMs read raw text reasonably well. So why bother? Because schema removes ambiguity, and ambiguity is where machines hallucinate or skip you entirely.
When you mark up an author with a Person type and a sameAs to a real LinkedIn, you are not just hoping the model figures out who wrote this. You are stating it as fact in a structured form. When you mark up a price with currency and validity dates, the model does not have to guess whether nineteen ninety-nine is dollars or euros or a year. You collapse the uncertainty.
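The price case can be sketched like this. The product name, the price, and the validity date are invented placeholders. The point is only that 19.99 arrives with its currency and an expiry attached.

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "offers": {
    "@type": "Offer",
    "price": "19.99",
    "priceCurrency": "USD",
    "priceValidUntil": "2026-12-31",
    "availability": "https://schema.org/InStock",
    "itemCondition": "https://schema.org/NewCondition"
  }
}
```

With priceCurrency set to an ISO 4217 code and priceValidUntil set to an ISO 8601 date, there is nothing left to guess about what the number means or how long it holds.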
There is a second, bigger reason. Entity recognition. The AI engines are increasingly grounded in entity knowledge, the same Knowledge Graph thinking we covered in chapter five. If you are a recognized entity with consistent structured data across the web, you are more likely to be the thing the model reaches for when it answers a question in your space. Schema is one of the inputs that builds that recognition.
Schema reduces ambiguity
An LLM that is sure what your page says is an LLM that will repeat it. An LLM that has to guess is an LLM that may cite your competitor instead.
The structured question and answer format of FAQPage is genuinely useful for AI extraction even where Google stopped showing the rich result. So is clean Article markup with a clear author and date. The same hygiene that earned rich results yesterday makes you more quotable today. I go much deeper on this in the guides on how to rank in Perplexity and how to win AI Overviews.
"Schema does not make an AI cite you. It makes you the easiest, safest thing to cite. In a race between you and an unclear competitor, easy wins." (Shmul)
CHAPTER 08
Common Mistakes and How to Validate
I have audited a lot of sites. The schema mistakes are remarkably consistent. Knock these out and you are ahead of most of the web.
The mistakes that get you ignored or penalized
- Marking up content that is not on the page. Schema must describe visible content. Invisible markup is a violation and can earn a manual action.
- Fake or self-serving review and rating markup. Putting AggregateRating on a page with no real reviews is the fastest way to lose rich result eligibility.
- Wrong type for the content. Slapping Product on an article, or Recipe on a non-recipe, confuses engines and earns nothing.
- Missing required properties. Every rich-result type has required fields. Miss one and the whole thing is ineligible.
- Inconsistent entity data. Different name, logo, or social links across pages weakens your entity. Pick one canonical set.
- Two formats fighting. Running Microdata and JSON-LD that describe the same thing on one page. Pick one.
- Stale dateModified. Bumping the modified date without actually changing content. Engines notice patterns and discount them.
How to validate, in order
1. Run the page through Google's Rich Results Test. It tells you which rich result types you qualify for and flags errors and warnings on the specific properties.
2. Run it through the Schema.org validator at validator.schema.org. This checks your markup against the vocabulary itself, broader than Google's rich-result lens.
3. Fix every error. Errors mean ineligible. Treat warnings as a priority list, not a crisis, but clear the ones tied to rich results you want.
4. Check Search Console under the enhancement reports. This is real-world data on what Google actually parsed across your live pages, not a single-page test.
5. Re-test after any template change. Schema breaks silently when templates change. Make a re-test part of your deploy checklist.
PRO TIP
Warnings are not errors. A warning usually means a recommended property is missing, which can improve your result but will not block it. Do not lose a day chasing every warning to zero. Clear errors first, then add recommended properties where the payoff is real.
Validation is part of broader site hygiene. If your schema is clean but your pages are slow or uncrawlable, none of it lands. I cover the foundations in the technical SEO guide, and the speed side in Core Web Vitals.
CHAPTER 09
Implementation: From Zero to Shipped
Enough theory. Here is exactly how I roll schema onto a site, in the order I do it. Follow this and you will not waste effort on markup that earns nothing.
1. Inventory your page types. Most sites have five to ten: home, article, product, category, author bio, contact, about. Map each to the schema type it deserves before you write a single line.
2. Build your entity foundation first. Create one canonical Organization block and one Person block per author, with logo, name, url, and sameAs. Give each a stable @id. This goes site-wide.
3. Template it, do not hand-write it. Add schema to your CMS templates so every page of a type gets it automatically. Hand-writing per page does not scale and drifts immediately.
4. Pull values dynamically. Title, author, dates, price, availability should come from your actual page data, not hardcoded strings. Hardcoded schema goes stale and lies.
5. Connect with @graph and @id. Reference your Organization and Person by @id from every Article and page. One connected graph beats scattered blocks.
6. Validate before deploy. Run the Rich Results Test and Schema.org validator on a sample of each page type. Clear all errors.
7. Deploy, then monitor Search Console. Watch the enhancement reports for the first few weeks. Errors there are real-world and worth chasing.
8. Add re-validation to your deploy checklist. Every future template change gets re-tested. Non-negotiable.
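The inventory from the first step in the list above can be as simple as a small map kept next to your templates. This is a planning note, not JSON-LD and not something engines read. The key names and the type choices are my assumptions for a typical site, so adjust them to your own page types.

```json
{
  "home": ["WebSite", "Organization"],
  "article": ["Article", "Person", "BreadcrumbList"],
  "product": ["Product", "Offer", "BreadcrumbList"],
  "category": ["BreadcrumbList"],
  "author-bio": ["Person"],
  "contact": ["Organization"],
  "about": ["Organization", "Person"]
}
```

Writing it down first makes it obvious which templates share the Organization and Person entities, and those are the ones that should reference the same @id values.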
On the build versus plugin question: if you are on WordPress, a reputable schema plugin handles the heavy lifting and keeps up with Google's spec changes. If you are on a custom stack, generate JSON-LD from your data layer with the @graph pattern. Either way, the rules above hold. Real data, stable @id, validate, monitor.
Prioritize by payoff. Do not schema-everything on day one. Start with the templates that touch money and authority: product pages, your highest-traffic articles, and your author and organization entity blocks. Those earn rich results and feed the Knowledge Graph fastest.
Templates over pages
The leverage is in the template. Mark up one Article template correctly and a thousand articles inherit it. Mark up a thousand pages by hand and you will be wrong on half of them by next quarter.
One last thing. Schema amplifies content, it does not replace it. If your content is thin, perfect markup just makes the machines very sure your page is thin. Get the content and the targeting right first (see the keyword research guide), then layer schema on top to make it legible. That order matters.
PRO TIP
Set a quarterly reminder to re-audit your top twenty pages in the Rich Results Test. Google changes which types earn results, plugins update, templates drift. Schema is not set-and-forget, it is set-and-check.
Frequently asked
Does schema markup improve my rankings?
Which schema format should I use?
Does FAQ schema still earn a rich result?
What is sameAs and why does it matter?
Can schema markup hurt my site?
How do I check if my schema is working?