Duplicate Content
Duplicate content is the same or substantially similar content appearing at more than one URL. It rarely earns a penalty, but it splits ranking signals and forces Google to guess which version to show.
Duplicate content is exactly what it sounds like: the same or near-identical content living at more than one URL. The myth I have to correct most often is that duplicate content carries a penalty. For the vast majority of sites, it does not. What it actually does is quieter and more expensive: it splits your signals across multiple URLs and makes Google pick a winner for you, often the wrong one. When the same content sits at three URLs, any links and authority that should reinforce one strong page get scattered across all three. Instead of one page with the full strength of its signals, you get three weaker pages competing with themselves. Google also has to choose which version to index and rank, and it does not always choose the one you want. Worse, it can spend crawl budget revisiting near-identical pages instead of finding the new content you actually want indexed.
Duplicate content rarely gets you penalized. It gets you diluted, which over time can hurt just as much.
Where duplicate content comes from
Most duplicate content is self-inflicted and technical, not stolen by someone else. The classic sources are URL variations: with and without a trailing slash, www and non-www, http and https. Then there are URL parameters from tracking, sorting, and filtering that create endless variants of the same page. Add printer-friendly or AMP versions that duplicate the main article, boilerplate product descriptions repeated across many similar pages, and content syndicated to other sites without canonical attribution. Notice how few of these involve anyone copying you. The duplicates are coming from inside the house, generated by your own URL handling, which is good news because it means you control the fix. A single ecommerce category with sort, filter, and pagination parameters can spin off hundreds of duplicate URLs without anyone writing a word of new content. That is why duplicate content is usually a configuration problem dressed up as a writing problem, and why crawling your own site honestly is the first step to seeing it.
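To make the ecommerce arithmetic concrete, here is a small Python sketch that enumerates every URL one category page could produce. The base URL, the parameter names (`sort`, `color`, `page`), and their values are all hypothetical; substitute whatever your own platform emits.

```python
from itertools import product
from urllib.parse import urlencode


def category_variants(base: str, params: dict[str, list]) -> list[str]:
    """Enumerate every URL one category page can produce.

    A value of None means "parameter absent from the URL".
    """
    names = list(params)
    urls = []
    for combo in product(*(params[name] for name in names)):
        pairs = [(n, v) for n, v in zip(names, combo) if v is not None]
        query = urlencode(pairs)
        urls.append(f"{base}?{query}" if query else base)
    return urls


# Hypothetical parameters for a single category page.
params = {
    "sort": [None, "price_asc", "price_desc", "newest"],
    "color": [None, "red", "blue", "black", "white"],
    "page": [None, 2, 3, 4, 5, 6, 7, 8, 9, 10],
}
urls = category_variants("https://example.com/shoes", params)
```

With these assumed values, `len(urls)` is 4 × 5 × 10 = 200 distinct URLs for one category, all serving the same or overlapping content.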
- URL variations: with and without a trailing slash, www and non-www, http and https.
- URL parameters from tracking, sorting, and filtering that create endless variants.
- Printer-friendly versions or AMP pages duplicating the main article.
- Boilerplate product descriptions repeated across many similar pages.
- The same content syndicated to other sites without canonical attribution.
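The first two sources in this list can be detected by normalizing URLs before comparing them. The following is a minimal Python sketch, not a definitive rule set: it assumes the preferred form is https, non-www, with no trailing slash, and that the parameters in `IGNORED_PARAMS` never change page content. Both assumptions are illustrative and need adjusting per site.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: on this hypothetical site these parameters do not change content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sort", "ref"}


def normalize_url(url: str) -> str:
    """Collapse common URL variants into one comparable form."""
    parts = urlsplit(url)

    # Treat www and non-www as the same host (any port is dropped).
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    # Treat /shoes and /shoes/ as the same path; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"

    # Drop ignored parameters and sort the rest so order does not matter.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    query = urlencode(sorted(kept))

    # Treat http and https as the same; the fragment is discarded.
    return urlunsplit(("https", host, path, query, ""))
```

Two crawled URLs that normalize to the same string are candidates for the same canonical. For example, `normalize_url("http://www.example.com/shoes/?utm_source=news")` and `normalize_url("https://example.com/shoes")` both give `https://example.com/shoes`.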
| Situation | Correct fix |
|---|---|
| Multiple URLs, one preferred version | Canonical tag pointing to the preferred URL |
| Old URL fully replaced by a new one | 301 redirect from old to new |
| Parameter URLs you never want indexed | Canonical to the clean URL |
| Syndicated to another site | Ask for a canonical back to your original |
WATCH OUT
Do not reach for noindex when canonical is the right tool. Noindex drops the page from the index entirely; canonical keeps the page accessible and consolidates its signals into your preferred version. They solve different problems.
1. Crawl the site and identify clusters of pages with identical or near-identical content.
2. Pick the single preferred URL for each cluster and point canonicals at it.
3. Use 301 redirects where an old URL is truly retired, then verify in Search Console.
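The first two steps can be sketched in Python. This sketch assumes you already have a mapping of URL to extracted page text from your crawler, and it only catches text that is identical after case and whitespace are normalized; near-identical pages would need a similarity measure on top. The `pick_preferred` heuristic (no parameters, then shortest URL) is an illustrative assumption, not a rule.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash of the text with case and whitespace differences removed."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def cluster_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Group URLs whose extracted text has the same fingerprint."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    # Only groups with more than one URL are duplicate clusters.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


def pick_preferred(urls: list[str]) -> str:
    """Assumed heuristic: prefer parameter-free URLs, then the shortest."""
    return min(urls, key=lambda u: ("?" in u, len(u), u))


# Hypothetical crawl output: URL -> extracted body text.
pages = {
    "https://example.com/shoes": "Red running shoes.",
    "https://example.com/shoes/": "Red  running shoes.\n",
    "https://example.com/shoes?sort=price": "red running shoes.",
    "https://example.com/boots": "Leather boots.",
}
clusters = cluster_duplicates(pages)
preferred = [pick_preferred(cluster) for cluster in clusters]
```

Here the three shoe URLs land in one cluster and `https://example.com/shoes` is chosen as the preferred version; the boots page is unique and is left alone.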
Canonical to consolidate
The default move for duplicates you want to keep accessible is a canonical tag. It tells Google which version is the real one and pools the signals there.
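For reference, the canonical tag is a `<link rel="canonical">` element placed in the page's `<head>`. A minimal Python sketch of rendering it for a template follows; the function name `canonical_link` is hypothetical, and how the tag reaches your templates depends on your CMS.

```python
from html import escape


def canonical_link(preferred_url: str) -> str:
    """Render the canonical link element for a page's <head>."""
    # Escape the URL so characters such as & stay valid inside the attribute.
    href = escape(preferred_url, quote=True)
    return f'<link rel="canonical" href="{href}">'
```

Every duplicate in a cluster would emit the same tag, for instance `<link rel="canonical" href="https://example.com/shoes">` on both `/shoes/` and `/shoes?sort=price`.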
Fix your own URLs first
URL parameters and trailing-slash variants quietly generate duplicates by the thousand, so fix your own URL handling before you ever worry about another site copying you. Duplication is one of the cleanest problems to solve once you can see it. Walk through the detection and canonical strategy in my duplicate content playbook.