PLAY 11

Schema Markup: The Definitive Guide for 2026

Which schema types actually earn you something, and which ones waste your afternoon

How to use structured data to feed entities to Google's Knowledge Graph and to LLMs

Copy-paste JSON-LD you can ship today, plus the mistakes that get you ignored

14 min read · Updated 2026 · By Shmul

KEY TAKEAWAYS

  • Schema is a clarity layer, not a ranking factor. It changes how machines understand and display your content.
  • JSON-LD is the only format worth using. It is decoupled from your HTML, auditable in one place, and Google's preference.
  • Match the type to the content. Article for articles, Product for products, Person for people. Never fake reviews or ratings.
  • Entity markup with sameAs and a stable @id is the highest-leverage work. It feeds the Knowledge Graph and the LLMs.
  • Use @graph to connect your entities into one web instead of scattered, isolated blocks.
  • Validate before every deploy and monitor Search Console after. Schema breaks silently when templates change.

CHAPTER 01

What Schema Markup Actually Does

Let me clear something up before we go one inch further. Schema markup is not a ranking factor. I have watched people chase it for a decade like it was a magic dial. It is not.

Here is what it actually is. Schema markup is a way to label the things on your page so a machine knows what they are. A price is a price. A review is a review. An author is a person, not a string of pixels. You are adding a layer of meaning on top of your HTML.

Think about it from Google's side. A crawler lands on your page and sees a wall of text, some images, and a few numbers floating around. It has to guess. Is that number a price, a phone number, or a date? Schema removes the guessing. You hand the machine a clean, labeled answer.

And here is the part most people miss. When the machine is sure about what your content means, good things follow. Rich results. Knowledge panel entries. Citations inside AI answers. None of those are a ranking boost in the classic sense. They are visibility. They are real estate. They are the difference between a plain blue link and a result that takes up half the screen.

Clarity, not ranking

Schema does not move you up the list. It changes how you look once you are on the list, and how confidently machines repeat what you said.

The vocabulary comes from Schema.org, a shared project backed by Google, Microsoft, Yandex, and others. That shared part matters. It means the labels you write are understood by every major engine, not just one. You learn it once, and it pays off across the whole search and AI landscape. We will lean on that vocabulary the rest of this guide.

PRO TIP

If you only remember one sentence from this chapter: schema does not make your content better, it makes your content legible. Garbage with perfect markup is still garbage.

CHAPTER 02

Rich Results vs the Knowledge Graph

People lump these two together. They are not the same thing, and treating them the same is why so much schema work goes nowhere.

Rich results are about a single page

A rich result is the enhanced look of one page in the search listing. Star ratings under a product. The little FAQ accordion. A recipe card with cook time and calories. Breadcrumbs instead of a raw URL. These are page-level. You earn them by marking up the specific content on that specific page.

Rich results are transactional in a way. You give Google clean data, Google may give you a fancier listing. May. Nothing is guaranteed. Google decides per query whether to show the enhancement at all, and it changes its mind often.

The Knowledge Graph is about entities

The Knowledge Graph is Google's internal map of things in the world. People, companies, places, books, concepts. Each is an entity with properties and relationships. When you search a brand and a panel appears on the right with the logo, founders, and social links, that is the Knowledge Graph talking.

You do not directly write into the Knowledge Graph. You contribute signals. Your Organization and Person markup, your sameAs links to verified profiles, your consistency across the web. Google connects those dots and decides whether you are a real entity worth tracking. This is the long game, and it is the part that matters most for the AI era. More on that in chapter five.

  • Rich results = how one page looks in the listing. Fast feedback, page-level markup.
  • Knowledge Graph = whether you exist as a recognized entity. Slow, cumulative, site-wide.
  • You optimize for rich results with content schema. You optimize for the graph with entity schema.

"Rich results win you the click today. The Knowledge Graph wins you the citation forever. Most people only play for the click."
Shmul

If you want to go deeper on the AI side of this, I broke it down in the guide to generative engine optimization. Entities are the bridge between old-school SEO and getting named inside AI answers.

CHAPTER 03

Why JSON-LD Wins (And the Others Lost)

There are three ways to put structured data on a page. Microdata, RDFa, and JSON-LD. I am going to save you the history lecture. Use JSON-LD. Full stop.

Microdata and RDFa work by sprinkling attributes all over your visible HTML. You wrap a span here, add an itemprop there, tangle your markup into your content. It works, technically. It is also a nightmare to maintain. Every template change risks breaking your structured data, and you cannot see it all in one place to audit it.

JSON-LD is different. It is a single block of structured data, dropped in a script tag with type application/ld+json in the head or body. It sits separate from your visible content. You can read it top to bottom. You can validate it on its own. You can generate it from a template or a CMS without touching your HTML layout.

Google has openly recommended JSON-LD for years. That alone should settle the argument. But the practical reasons are stronger. Here is a minimal Article block so you can see how clean it is.

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Schema Markup: The Definitive Guide",
  "author": {
    "@type": "Person",
    "name": "Samuel Dorenbaum"
  },
  "datePublished": "2026-01-15",
  "dateModified": "2026-06-30",
  "publisher": {
    "@type": "Organization",
    "name": "Shmul Playbook"
  }
}
  • It lives in one place, so you can read and audit it in seconds.
  • It is decoupled from your HTML, so a design change will not silently break it.
  • It is trivial to generate dynamically from your CMS or a script.
  • Google prefers it, and the testing tools assume it.
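
To make the "one block in a script tag" idea concrete, here is a rough sketch of how a template helper might wrap a schema dict in that tag. The function name render_jsonld_script is my own invention for this example, and the escaping step is a defensive habit, not something any spec forces on you.

```python
import json


def render_jsonld_script(data: dict) -> str:
    """Serialize a schema.org dict into a JSON-LD script tag."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value can never close the script tag early.
    # "\/" is a legal JSON escape, so parsers still read it as "/".
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Schema Markup: The Definitive Guide",
    "author": {"@type": "Person", "name": "Samuel Dorenbaum"},
    "datePublished": "2026-01-15",
}

print(render_jsonld_script(article))
```

The returned string goes into your page template once. Everything else about the page layout stays untouched, which is the whole point of the format.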

PRO TIP

If you inherited a site full of Microdata, do not panic-migrate everything at once. Add JSON-LD for new pages and high-value templates first, then phase the old stuff out. Do not run both formats describing the same thing on one page. That confuses parsers.

CHAPTER 04

The Schema Types That Actually Earn You Something

Schema.org has hundreds of types. You do not need most of them. I am going to give you the ones that earn a rich result, feed an entity, or help an AI understand the page. Ignore the rest until you have a reason.

The workhorses

  • Article, NewsArticle, or BlogPosting: tells engines this is editorial content, who wrote it, and when. The foundation for authorship and freshness signals.
  • Product plus Offer: price, availability, currency, condition. Earns price and availability in listings and feeds shopping surfaces.
  • Review and AggregateRating: star ratings, but only when they describe genuine reviews on your own page. Never fake them.
  • BreadcrumbList: turns a raw URL into a clean clickable path in the result. Easy win, low effort, broadly supported.
  • FAQPage: marks up question and answer pairs. Note that since 2023 Google shows the FAQ rich result mainly for well-known government and health sites, so treat the rich result as a bonus, not the goal.
  • HowTo: step-by-step instructions. Google has retired the HowTo rich result, so the markup is now about machine comprehension, not a fancier listing.
  • Organization: who you are as a company. Logo, name, contact, social profiles. Pure entity fuel.
  • Person: an individual, usually your author. Critical for E-E-A-T and the Knowledge Graph.
  • LocalBusiness: address, hours, geo, phone. Essential if you have a physical location.
  • VideoObject: thumbnail, duration, upload date. Helps video show and get understood.
  • WebSite: names your site as an entity. You can still add a SearchAction, but Google retired the sitelinks search box in late 2024, so do not expect a visual feature from it.
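
BreadcrumbList is the easiest of these to generate, so here is a minimal sketch of one way to build it from a page's trail. The helper name breadcrumb_list is made up for this example, and the /playbook/ level in the sample trail is an assumed URL, not a confirmed one.

```python
import json


def breadcrumb_list(trail):
    """Build a BreadcrumbList from (name, url) pairs, ordered root first."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": position, "name": name, "item": url}
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }


# The /playbook/ URL is an assumed intermediate level for this example.
crumbs = breadcrumb_list([
    ("Home", "https://shmulplay.com/"),
    ("Playbook", "https://shmulplay.com/playbook/"),
    ("Schema Markup", "https://shmulplay.com/playbook/schema-markup/"),
])
print(json.dumps(crumbs, indent=2))
```

Positions start at 1 and follow the order of the trail, so feed it the same path your visible breadcrumb shows.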

Match type to intent

A blog post gets Article. A store page gets Product. A bio page gets Person. Do not staple Product onto an essay because you read it earns stars. It will not, and you risk a penalty.
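
If it helps to see that rule as code, this is roughly how a template layer could enforce it. The page-type keys are hypothetical labels I picked for the example, so swap in whatever your CMS actually calls them.

```python
from typing import Optional

# Page-type keys are hypothetical labels; use whatever your CMS calls them.
PAGE_TYPE_TO_SCHEMA = {
    "blog_post": "Article",
    "store_page": "Product",
    "bio_page": "Person",
}


def schema_type_for(page_type: str) -> Optional[str]:
    """Return the schema type for a page type, or None if nothing fits."""
    return PAGE_TYPE_TO_SCHEMA.get(page_type)


for page_type in ("blog_post", "store_page", "landing_page"):
    chosen = schema_type_for(page_type)
    print(page_type, "->", chosen or "no content schema, skip it")
```

The useful part is the None branch. A page type with no honest match gets no content schema at all, instead of borrowing a type it has not earned.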

A word on FAQ and HowTo

Google now limits FAQ rich results to a narrow set of authoritative sites and has dropped the HowTo rich result altogether. I still recommend the markup. Why? Because the rich result was never the only point. That structured question and answer is exactly the format an AI engine wants when it is pulling an answer. You are not just dressing up a listing, you are pre-chewing your content for the machines that summarize the web. That ties directly into how you get cited in ChatGPT and how you win AI Overviews.
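
Here is a small sketch of what that question and answer structure looks like when you generate it from the pairs already on your page. The helper name faq_page is mine, and the sample pair is lifted from this guide's own FAQ.

```python
import json


def faq_page(pairs):
    """Build a FAQPage from (question, answer) pairs visible on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


faq = faq_page([
    (
        "Does schema markup improve my rankings?",
        "Not directly. Schema is not a ranking factor.",
    ),
])
print(json.dumps(faq, indent=2))
```

Only pass in pairs that a visitor can actually read on the page. The markup describes the content, it does not replace it.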

PRO TIP

Before you add any rich-result type, check Google's Search Central docs for that type's required and recommended properties. A required property missing means no rich result, period. The validator will tell you, but knowing up front saves a round trip.
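
You can catch the obvious misses locally before the validator ever sees the page. This is a rough sketch only: the required-property table below is illustrative and deliberately tiny, and you should confirm every entry against Google's current docs before trusting it.

```python
# Illustrative only: confirm each set against Google's current docs.
REQUIRED_PROPERTIES = {
    "BreadcrumbList": {"itemListElement"},
    "FAQPage": {"mainEntity"},
    "VideoObject": {"name", "thumbnailUrl", "uploadDate"},
}


def missing_required(node: dict) -> list:
    """List required properties that are absent or empty on a node."""
    node_type = node.get("@type")
    if not isinstance(node_type, str):
        return []  # multi-typed or untyped nodes are out of scope here
    required = REQUIRED_PROPERTIES.get(node_type, set())
    return sorted(prop for prop in required if not node.get(prop))


video = {"@type": "VideoObject", "name": "Schema walkthrough"}
print(missing_required(video))  # prints ['thumbnailUrl', 'uploadDate']
```

An empty list means nothing in your table is missing. It does not mean Google will show the rich result.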

CHAPTER 05

Entities, sameAs, and Becoming a Real Thing

This is the chapter most guides skip, and it is the one that matters most going forward. If you only do FAQ markup and call it a day, you are leaving the real value on the table.

An entity is a thing the search engines recognize as distinct and real. You as a person. Your company. Your product line. The difference between being a string of text and being a known entity is enormous. Known entities get cited. Known entities get knowledge panels. Known entities get trusted by LLMs that are deciding whose claims to repeat.

The single most powerful property for this is sameAs. It is an array of URLs that point to other authoritative profiles describing the same entity. Your LinkedIn. Your verified social accounts. Your Wikipedia page if you have one. Your Crunchbase, your author page on a publication, your GitHub. You are telling Google: all of these are me, connect the dots.

{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Samuel Dorenbaum",
  "jobTitle": "SEO Consultant",
  "url": "https://shmulplay.com/about/",
  "sameAs": [
    "https://www.linkedin.com/in/example",
    "https://x.com/example",
    "https://github.com/example"
  ],
  "knowsAbout": ["Search Engine Optimization", "Structured Data", "Generative Engine Optimization"]
}

Notice knowsAbout in there. That property tells engines what topics this person is an authority on. It is a quiet but useful signal for tying your name to your subject area. Same idea for an Organization: list the sameAs profiles, the founders, the logo, the contact points.
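
For the Organization side, a minimal sketch might look like this. The helper name, the logo path, the profile URLs, and the founder value are all placeholders I made up to show the shape, so replace every one of them with your real, canonical facts.

```python
import json


def organization_node(name, url, logo_url, same_as, founder_name):
    """Build an Organization node with a stable @id derived from the site URL."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "@id": url.rstrip("/") + "/#org",
        "name": name,
        "url": url,
        "logo": logo_url,
        "founder": {"@type": "Person", "name": founder_name},
        "sameAs": list(same_as),
    }


org = organization_node(
    name="Shmul Playbook",
    url="https://shmulplay.com/",
    logo_url="https://shmulplay.com/logo.png",  # hypothetical path
    same_as=[
        "https://www.linkedin.com/company/example",  # placeholder profile
        "https://x.com/example",  # placeholder profile
    ],
    founder_name="Samuel Dorenbaum",  # assumed for illustration
)
print(json.dumps(org, indent=2))
```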

Consistency is the whole game here. Your name, your logo, your social links must match across every place they appear. If your Organization schema says one thing and your LinkedIn says another, you have weakened the connection. Pick a canonical version of every fact about your entity and repeat it everywhere, identically.
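
One way to police that consistency on your own site is to collect the entity nodes from several pages and compare them by @id. A rough sketch, assuming you have already parsed the JSON-LD into dicts. The function name and the sample mismatch are invented for the example.

```python
def find_conflicts(nodes, fields=("name", "logo", "sameAs")):
    """Report (entity @id, field) pairs whose value differs from the first sighting."""
    first_seen = {}
    conflicts = []
    for node in nodes:
        node_id = node.get("@id")
        if node_id is None:
            continue  # nodes without an @id cannot be matched across pages
        baseline = first_seen.setdefault(node_id, node)
        for field in fields:
            if field in node and field in baseline and node[field] != baseline[field]:
                conflicts.append((node_id, field))
    return conflicts


# Hypothetical nodes pulled from two different pages of the same site.
home_page_org = {"@id": "https://shmulplay.com/#org", "name": "Shmul Playbook"}
about_page_org = {"@id": "https://shmulplay.com/#org", "name": "Shmul Playbook Inc."}

print(find_conflicts([home_page_org, about_page_org]))
# prints [('https://shmulplay.com/#org', 'name')]
```

This only covers your own markup. Matching it against LinkedIn and the rest is still a manual check.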

sameAs is your identity glue

Every authoritative profile you link with sameAs is another vote that you are a real, verifiable entity. This is the foundation of trust at the machine level.

This connects straight to trust. I wrote a full breakdown of how experience, expertise, authority, and trust get evaluated in the E-E-A-T guide. Entity markup is how you make those abstract trust signals machine-readable.

CHAPTER 06

Nesting and @graph: Connecting Your Schema

Most sites bolt on schema blocks that have no idea the others exist. An Article here, an Organization there, a Person somewhere else, all strangers. You can do much better by connecting them into one graph.

There are two ways to connect schema. The first is nesting: you place one type inside a property of another. The author of an Article is a Person object, nested right inside. The publisher is an Organization object nested inside. This is fine for simple pages and you have already seen it above.

The second, and better, way for a full page is @graph. Instead of nesting everything inside one root object, you list your entities as separate nodes and link them by @id. Each node gets a stable @id URL. Then anywhere you reference that entity, you point to its @id instead of repeating the whole object.

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "Organization",
      "@id": "https://shmulplay.com/#org",
      "name": "Shmul Playbook",
      "url": "https://shmulplay.com/"
    },
    {
      "@type": "Person",
      "@id": "https://shmulplay.com/#shmul",
      "name": "Samuel Dorenbaum"
    },
    {
      "@type": "Article",
      "@id": "https://shmulplay.com/playbook/schema-markup/#article",
      "headline": "Schema Markup: The Definitive Guide",
      "author": { "@id": "https://shmulplay.com/#shmul" },
      "publisher": { "@id": "https://shmulplay.com/#org" }
    }
  ]
}

See what happened. The Article references the Person and Organization by @id, not by copying them. Define each entity once, reference it everywhere. The whole page becomes one connected graph instead of three lonely islands. Machines love this because it mirrors how they actually model the world, as nodes and edges.

  • Give every meaningful entity a stable @id (a URL with a fragment works well).
  • Define each entity once, then reference it by @id wherever it appears.
  • Reuse the same @id site-wide for your Organization and Person so engines merge them across pages.
  • Use @graph at the page level when you have three or more connected types.
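
The failure mode with this pattern is a reference that points at an @id nobody defined on the page. Here is a small sketch of a check for that, assuming the JSON-LD is already parsed into a dict. The function name dangling_refs is mine, and it treats any object holding nothing but @id as a reference.

```python
def dangling_refs(doc: dict) -> list:
    """Return @id references in a @graph document that no node defines."""
    graph = doc.get("@graph", [])
    defined = {node["@id"] for node in graph if "@id" in node}
    missing = set()

    def walk(value):
        if isinstance(value, dict):
            # A dict holding nothing but "@id" is a reference, not a definition.
            if set(value) == {"@id"} and value["@id"] not in defined:
                missing.add(value["@id"])
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for child in value:
                walk(child)

    walk(graph)
    return sorted(missing)


# The Person node is left out on purpose to trigger the check.
doc = {
    "@context": "https://schema.org",
    "@graph": [
        {"@type": "Organization", "@id": "https://shmulplay.com/#org"},
        {
            "@type": "Article",
            "@id": "https://shmulplay.com/playbook/schema-markup/#article",
            "author": {"@id": "https://shmulplay.com/#shmul"},
            "publisher": {"@id": "https://shmulplay.com/#org"},
        },
    ],
}
print(dangling_refs(doc))  # prints ['https://shmulplay.com/#shmul']
```

A reference to an entity defined on another page is not strictly wrong, but within a single page I prefer every reference to resolve, so I treat anything this returns as a bug.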

PRO TIP

Keep your @id values consistent forever. If your Organization @id changes between pages, engines may treat them as different organizations. Pick the pattern once and never drift.

CHAPTER 07

Schema in the Age of AI: Feeding the Machines

For well over a decade, schema was about pleasing one machine: Google's crawler. The game changed. Now there is a whole class of machines reading your page: the language models behind ChatGPT, Perplexity, Gemini, and AI Overviews. Schema speaks their language too.

Let me be careful here, because there is a lot of hype. No, ChatGPT does not require your JSON-LD to understand your page. Modern LLMs read raw text reasonably well. So why bother? Because schema removes ambiguity, and ambiguity is where machines hallucinate or skip you entirely.

When you mark up an author with a Person type and a sameAs to a real LinkedIn, you are not just hoping the model figures out who wrote this. You are stating it as fact in a structured form. When you mark up a price with currency and validity dates, the model does not have to guess whether nineteen ninety-nine is dollars or euros or a year. You collapse the uncertainty.
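
To show what collapsing that uncertainty looks like for a price, here is a minimal sketch of a Product with an Offer. The helper name and the product itself are hypothetical. The properties are standard schema.org ones, with the currency written as a three-letter ISO 4217 code.

```python
import json
from datetime import date


def product_with_offer(name, price, currency, valid_until, in_stock):
    """Build a Product whose Offer states price, currency, and validity explicitly."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,  # ISO 4217 code, for example USD or EUR
            "priceValidUntil": valid_until.isoformat(),
            "availability": "https://schema.org/" + availability,
        },
    }


# "Example Widget" is a made-up product for illustration.
product = product_with_offer("Example Widget", 19.99, "USD", date(2026, 12, 31), True)
print(json.dumps(product, indent=2))
```

Now nineteen ninety-nine is unambiguously 19.99 US dollars, good until the end of 2026. No guessing required.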

There is a second, bigger reason. Entity recognition. The AI engines are increasingly grounded in entity knowledge, the same Knowledge Graph thinking we covered in chapter five. If you are a recognized entity with consistent structured data across the web, you are more likely to be the thing the model reaches for when it answers a question in your space. Schema is one of the inputs that builds that recognition.

Schema reduces ambiguity

An LLM that is sure what your page says is an LLM that will repeat it. An LLM that has to guess is an LLM that may cite your competitor instead.

The structured question and answer format of FAQPage is genuinely useful for AI extraction even where Google stopped showing the rich result. So is clean Article markup with a clear author and date. The same hygiene that earned rich results yesterday makes you more quotable today. I go much deeper on this in the guides on how to rank in Perplexity and how to win AI Overviews.

"Schema does not make an AI cite you. It makes you the easiest, safest thing to cite. In a race between you and an unclear competitor, easy wins."
Shmul

CHAPTER 08

Common Mistakes and How to Validate

I have audited a lot of sites. The schema mistakes are remarkably consistent. Knock these out and you are ahead of most of the web.

The mistakes that get you ignored or penalized

  • Marking up content that is not on the page. Schema must describe visible content. Invisible markup is a violation and can earn a manual action.
  • Fake or self-serving review and rating markup. Putting AggregateRating on a page with no real reviews is the fastest way to lose rich result eligibility.
  • Wrong type for the content. Slapping Product on an article, or Recipe on a non-recipe, confuses engines and earns nothing.
  • Missing required properties. Every rich-result type has required fields. Miss one and the whole thing is ineligible.
  • Inconsistent entity data. Different name, logo, or social links across pages weakens your entity. Pick one canonical set.
  • Two formats fighting. Running Microdata and JSON-LD that describe the same thing on one page. Pick one.
  • Fake freshness in dateModified. Bumping the modified date without actually changing content. Engines notice patterns and discount them.
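
The "two formats fighting" mistake is easy to flag with a crude scan. This is a sketch using only the standard library, and it is deliberately blunt: it tells you both formats are present on a page, not whether they describe the same thing, so treat a hit as a prompt to look closer. The class and function names are my own.

```python
from html.parser import HTMLParser


class FormatSniffer(HTMLParser):
    """Record whether a page contains JSON-LD, Microdata, or both."""

    def __init__(self):
        super().__init__()
        self.has_jsonld = False
        self.has_microdata = False

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        script_type = (attr_map.get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self.has_jsonld = True
        if "itemscope" in attr_map or "itemprop" in attr_map:
            self.has_microdata = True


def uses_both_formats(html: str) -> bool:
    sniffer = FormatSniffer()
    sniffer.feed(html)
    sniffer.close()
    return sniffer.has_jsonld and sniffer.has_microdata


sample = (
    '<script type="application/ld+json">{"@type": "Article"}</script>'
    '<div itemscope itemtype="https://schema.org/Article">'
    '<span itemprop="headline">Hello</span></div>'
)
print(uses_both_formats(sample))  # prints True
```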

How to validate, in order

  1. Run the page through Google's Rich Results Test. It tells you which rich result types you qualify for and flags errors and warnings on the specific properties.
  2. Run it through the Schema.org validator at validator.schema.org. This checks your markup against the vocabulary itself, broader than Google's rich-result lens.
  3. Fix every error. Errors mean ineligible. Treat warnings as a priority list, not a crisis, but clear the ones tied to rich results you want.
  4. Check Search Console under the enhancement reports. This is real-world data on what Google actually parsed across your live pages, not a single-page test.
  5. Re-test after any template change. Schema breaks silently when templates change. Make a re-test part of your deploy checklist.
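
Before any of those tools, a local smoke test catches the dumbest failure: JSON-LD that does not even parse after a template change. Here is a rough sketch with the standard library only. It is not a substitute for the Rich Results Test or the Schema.org validator, it just confirms the blocks exist and are valid JSON. The class and function names are invented for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the raw text of every JSON-LD script block on a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.blocks.append("".join(self._buffer))
            self._in_jsonld = False


def check_jsonld(html: str):
    """Return (parsed_blocks, errors) for the JSON-LD found in an HTML string."""
    extractor = JsonLdExtractor()
    extractor.feed(html)
    extractor.close()
    parsed, errors = [], []
    for index, raw in enumerate(extractor.blocks):
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError as exc:
            errors.append(f"block {index}: {exc.msg} (line {exc.lineno})")
    return parsed, errors


page_html = (
    '<script type="application/ld+json">{"@type": "Article"}</script>'
    '<script type="application/ld+json">{"@type": "Person",}</script>'
)
parsed, errors = check_jsonld(page_html)
print(len(parsed), "valid block(s),", len(errors), "broken block(s)")
```

Wire something like this into your deploy pipeline against a sample URL per template, and a trailing comma never reaches production.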

PRO TIP

Warnings are not errors. A warning usually means a recommended property is missing, which can improve your result but will not block it. Do not lose a day chasing every warning to zero. Clear errors first, then add recommended properties where the payoff is real.

Validation is part of broader site hygiene. If your schema is clean but your pages are slow or uncrawlable, none of it lands. I cover the foundations in the technical SEO guide, and the speed side in Core Web Vitals.

CHAPTER 09

Implementation: From Zero to Shipped

Enough theory. Here is exactly how I roll schema onto a site, in the order I do it. Follow this and you will not waste effort on markup that earns nothing.

  1. Inventory your page types. Most sites have five to ten: home, article, product, category, author bio, contact, about. Map each to the schema type it deserves before you write a single line.
  2. Build your entity foundation first. Create one canonical Organization block and one Person block per author, with logo, name, url, and sameAs. Give each a stable @id. This goes site-wide.
  3. Template it, do not hand-write it. Add schema to your CMS templates so every page of a type gets it automatically. Hand-writing per page does not scale and drifts immediately.
  4. Pull values dynamically. Title, author, dates, price, availability should come from your actual page data, not hardcoded strings. Hardcoded schema goes stale and lies.
  5. Connect with @graph and @id. Reference your Organization and Person by @id from every Article and page. One connected graph beats scattered blocks.
  6. Validate before deploy. Run the Rich Results Test and Schema.org validator on a sample of each page type. Clear all errors.
  7. Deploy, then monitor Search Console. Watch the enhancement reports for the first few weeks. Errors there are real-world and worth chasing.
  8. Add re-validation to your deploy checklist. Every future template change gets re-tested. Non-negotiable.
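
Steps three through five can be sketched as one template function. This is a minimal illustration, assuming your CMS can hand you a record shaped like the hypothetical ArticlePage below. The field names are mine, not any particular CMS's API.

```python
import json
from dataclasses import dataclass
from datetime import date

SITE = "https://shmulplay.com"
ORG_ID = SITE + "/#org"  # one stable @id, reused on every page


@dataclass
class ArticlePage:
    """Hypothetical record for one article, as a CMS might expose it."""
    url: str
    title: str
    author_name: str
    author_id: str
    published: date
    modified: date


def article_graph(page: ArticlePage) -> dict:
    """Build the page-level @graph from page data instead of hardcoded strings."""
    return {
        "@context": "https://schema.org",
        "@graph": [
            {"@type": "Organization", "@id": ORG_ID, "name": "Shmul Playbook", "url": SITE + "/"},
            {"@type": "Person", "@id": page.author_id, "name": page.author_name},
            {
                "@type": "Article",
                "@id": page.url + "#article",
                "headline": page.title,
                "datePublished": page.published.isoformat(),
                "dateModified": page.modified.isoformat(),
                "author": {"@id": page.author_id},
                "publisher": {"@id": ORG_ID},
            },
        ],
    }


page = ArticlePage(
    url="https://shmulplay.com/playbook/schema-markup/",
    title="Schema Markup: The Definitive Guide",
    author_name="Samuel Dorenbaum",
    author_id=SITE + "/#shmul",
    published=date(2026, 1, 15),
    modified=date(2026, 6, 30),
)
print(json.dumps(article_graph(page), indent=2))
```

Every article rendered through this function gets the same Organization @id and a fresh Article node built from its own data. Change the function once and the whole template updates with it.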

On the build versus plugin question: if you are on WordPress, a reputable schema plugin handles the heavy lifting and keeps up with Google's spec changes. If you are on a custom stack, generate JSON-LD from your data layer with the @graph pattern. Either way, the rules above hold. Real data, stable @id, validate, monitor.

Prioritize by payoff. Do not schema-everything on day one. Start with the templates that touch money and authority: product pages, your highest-traffic articles, and your author and organization entity blocks. Those earn rich results and feed the Knowledge Graph fastest.

Templates over pages

The leverage is in the template. Mark up one Article template correctly and a thousand articles inherit it. Mark up a thousand pages by hand and you will be wrong on half of them by next quarter.

One last thing. Schema amplifies content, it does not replace it. If your content is thin, perfect markup just makes the machines very sure your page is thin. Get the content and the targeting right first (see the keyword research guide), then layer schema on top to make it legible. That order matters.

PRO TIP

Set a quarterly reminder to re-audit your top twenty pages in the Rich Results Test. Google changes which types earn results, plugins update, templates drift. Schema is not set-and-forget, it is set-and-check.

Frequently asked

Does schema markup improve my rankings?
Not directly. Schema is not a ranking factor. What it does is make you eligible for rich results and feed the entity graph, both of which can increase clicks and trust at the same rank. Any ranking lift is a second-order effect of a bigger, more credible listing, not the markup itself.

Which schema format should I use?
JSON-LD, every time. It sits in a script tag separate from your visible HTML, you can read and validate it in one place, it generates cleanly from a CMS, and Google has recommended it for years. Skip Microdata and RDFa unless you are stuck maintaining a legacy site, and even then, migrate toward JSON-LD.

Does FAQ schema still earn a rich result?
For most sites, Google has narrowed where FAQ rich results appear, so treat the visual enhancement as a bonus rather than the goal. Keep the markup anyway. The structured question-and-answer format is exactly what AI engines want when they extract an answer, so it still pays off for AI visibility.

What is sameAs and why does it matter?
sameAs is a property that holds an array of URLs pointing to other authoritative profiles for the same entity, like your LinkedIn, verified social accounts, or Crunchbase page. It tells engines that all those profiles are the same person or organization, which helps you get recognized as a real entity in the Knowledge Graph and trusted by AI models.

Can schema markup hurt my site?
Yes, if you abuse it. Marking up content that is not visible on the page, or adding fake review and rating markup, can earn a manual action and lose your rich-result eligibility. Used honestly to describe real, visible content, schema carries no downside.

How do I check if my schema is working?
Run the page through Google's Rich Results Test and the Schema.org validator to catch errors before you ship. After deploying, watch the enhancement reports in Google Search Console, which show real-world data on what Google actually parsed across your live pages. Re-test after any template change, because schema breaks silently.
