XML Sitemap
An XML sitemap is a structured file that lists the URLs you want search engines to crawl and index, along with optional metadata about each one. It helps discovery but does not guarantee indexing.
An XML sitemap is a file that hands search engines a clean list of the URLs on your site that you actually want them to find. Instead of relying entirely on the crawler stumbling across pages through links, you give it a map. This is especially useful for large sites, new sites with few external links, and pages that are buried deep in your structure.
Be clear about what a sitemap does and does not do. It is a discovery aid. It tells the engine that these URLs exist and supplies some metadata about them. It does not force indexing, it does not boost rankings, and it does not override other signals. A URL in your sitemap can still be excluded if the engine decides it is low quality or a duplicate.
A sitemap is a suggestion for what to crawl, not a command to index. It improves discovery, especially for pages that are hard to reach through links alone.
What a sitemap looks like
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
<url>
<loc>https://www.yoursite.com/</loc>
<lastmod>2026-06-30</lastmod>
</url>
<url>
<loc>https://www.yoursite.com/playbook/technical-seo/</loc>
<lastmod>2026-06-15</lastmod>
</url>
</urlset>
The loc element is the URL itself and it is required. The lastmod element tells the engine when the page last changed, which can help it prioritize re-crawling. The protocol also defines changefreq and priority, but Google has said it ignores those, so do not waste effort fine-tuning them. Accurate lastmod values are far more valuable.
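If you generate the file yourself instead of relying on a platform, the structure is simple enough to build with a standard library. Here is a minimal Python sketch that writes loc and lastmod only. The function name build_sitemap and the shape of the pages input are assumptions made for illustration, not part of any tool mentioned here.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, last_modified_date) pairs.

    `pages` is a hypothetical input shape chosen for this sketch.
    """
    # Setting xmlns as a plain attribute is enough for serialization.
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, last_modified in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the W3C date format, e.g. 2026-06-30
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    # Returns bytes, including the XML declaration (Python 3.8+).
    return ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)


if __name__ == "__main__":
    xml_bytes = build_sitemap([
        ("https://www.yoursite.com/", date(2026, 6, 30)),
        ("https://www.yoursite.com/playbook/technical-seo/", date(2026, 6, 15)),
    ])
    print(xml_bytes.decode("utf-8"))
```

Feed lastmod from the real modification date of the page content, not from the time the sitemap was generated.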
Rules worth following
- Only include canonical, indexable URLs that return a 200 status code.
- Keep out noindexed pages, redirected URLs, and pages blocked by robots.txt.
- Cap each sitemap file at 50,000 URLs or 50 MB uncompressed, and split larger sites across multiple files.
- Use a sitemap index file to point to all your individual sitemaps when you have many.
- Keep lastmod honest. If every URL claims it changed today, the engine stops trusting the signal.
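The inclusion rules above can be expressed as a single filter that runs before the file is written. This is a sketch under stated assumptions: the CrawledPage record and its fields are hypothetical and stand in for whatever your crawler or CMS export actually provides.

```python
from dataclasses import dataclass


@dataclass
class CrawledPage:
    # Hypothetical record; map these fields from your own crawl data.
    url: str
    status: int
    canonical_url: str
    noindex: bool
    blocked_by_robots: bool


def is_sitemap_eligible(page):
    """True only for canonical, indexable URLs that return a 200."""
    return (
        page.status == 200                      # excludes redirects and 404s
        and page.canonical_url == page.url      # excludes non-canonical variants
        and not page.noindex
        and not page.blocked_by_robots
    )


def eligible_urls(pages):
    return [page.url for page in pages if is_sitemap_eligible(page)]
```

Running every candidate URL through one predicate like this keeps the rules in a single place, which makes it easier to see why a given URL was left out.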
WATCH OUT
A sitemap full of non-canonical URLs, redirects, or 404s sends mixed signals and erodes trust in the file. Treat your sitemap as a promise: every URL in it should be a page you genuinely want indexed, returning a clean 200.
How to submit it
Add a Sitemap: line with the full URL of the file to your robots.txt (for example, Sitemap: https://www.yoursite.com/sitemap.xml) so crawlers find it automatically, and also submit it directly in the Sitemaps report inside Google Search Console. That report shows how many URLs were discovered and flags errors, which makes it your best monitoring tool.
Index file for big sites
If your site has thousands of pages, do not cram them into one giant file. Split by content type or section and reference them all from a single sitemap index. It is easier to manage and easier to debug when something breaks.
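Splitting by section and building the index can be sketched in a few lines of Python. The file naming pattern (sitemap-SECTION-N.xml) and the function names are assumptions chosen for illustration. The sketch enforces only the 50,000 URL cap; a real generator would also need to check the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size`."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def plan_sitemap_files(urls_by_section, base_url):
    """Map each sitemap file URL to the page URLs it should contain.

    The naming scheme is hypothetical; use whatever suits your site.
    """
    files = {}
    for section, urls in urls_by_section.items():
        for n, chunk in enumerate(chunk_urls(urls), start=1):
            files[f"{base_url}/sitemap-{section}-{n}.xml"] = chunk
    return files


def build_sitemap_index(sitemap_urls):
    """Build a sitemap index that points at each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    return ET.tostring(index, encoding="UTF-8", xml_declaration=True)
```

With one file per section, a spike in errors for a single sitemap points straight at the part of the site that broke.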
Most modern platforms generate sitemaps for you automatically, and for a lot of sites that default output is good enough. The catch is that automated sitemaps are only as disciplined as the platform behind them. They will happily include redirected URLs, noindexed pages, and dead links if your site has them, because the generator just dumps what it knows about. So even when a plugin or framework builds the file, spot check it. Open the sitemap, click a handful of URLs, and confirm they return clean 200 pages that you actually want in search.
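The spot check described above can be partly automated. The following Python sketch samples a few URLs from a sitemap and reports any that do not return a 200, without following redirects. The function names are assumptions made for illustration, and the default fetcher uses HEAD requests, which some servers reject. It checks status codes only, so noindex tags still need a manual look.

```python
import random
import urllib.error
import urllib.request
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None stops urllib from following the redirect,
        # so the 3xx surfaces as an HTTPError instead.
        return None


def fetch_status(url):
    """Return the HTTP status for `url` without following redirects."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code


def sitemap_locs(xml_bytes):
    """Extract every <loc> value from a urlset sitemap."""
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


def spot_check(xml_bytes, sample_size=5, get_status=fetch_status, seed=None):
    """Return {url: status} for sampled URLs that are not a clean 200."""
    urls = sitemap_locs(xml_bytes)
    sample = random.Random(seed).sample(urls, min(sample_size, len(urls)))
    problems = {}
    for url in sample:
        status = get_status(url)
        if status != 200:
            problems[url] = status
    return problems
```

An empty result from spot_check means every sampled URL returned a 200. Anything else is a URL that should be fixed or removed from the file.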
There is also a diagnostic angle most people miss. Because Google reports how many sitemap URLs it discovered versus how many it indexed, your sitemap becomes a measuring tool. If you submit 5,000 URLs and only 1,200 get indexed, that gap is telling you something about content quality, duplication, or crawl issues across a big chunk of your site. A well-built sitemap does not just help discovery; it gives you a clean denominator for spotting indexing problems at scale.
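The arithmetic behind that gap is worth making explicit. Here is a small Python sketch using the numbers from the example above; the function name index_coverage is an assumption, and the two counts would come from your own Search Console reports.

```python
def index_coverage(submitted, indexed):
    """Share of submitted sitemap URLs that ended up indexed."""
    if submitted <= 0:
        raise ValueError("submitted must be a positive count")
    return indexed / submitted


if __name__ == "__main__":
    submitted, indexed = 5000, 1200
    coverage = index_coverage(submitted, indexed)
    # prints "24% indexed, 3800 URLs not indexed"
    print(f"{coverage:.0%} indexed, {submitted - indexed} URLs not indexed")
```

Tracking this ratio per sitemap file, rather than for the whole site, shows which sections account for most of the gap.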
A clean, accurate sitemap is low effort and high value, especially when you are migrating or launching new sections and want the engine to find everything fast. For how sitemaps work alongside crawling and indexing, see my technical SEO guide, and the site migrations guide when you are moving URLs around.