How to Do an SEO Audit: The Definitive Playbook
Most SEO audits are checklists wearing a suit. They list 200 problems, rank none of them, and get filed in a folder nobody opens again.
I will walk you through the audit I have run for twenty years: crawl, indexation, technical health, on-page, links, E-E-A-T, Core Web Vitals, and GEO readiness.
You will leave with a prioritized, defensible deliverable that tells a client exactly what to fix first, why it matters, and what it is worth.
KEY TAKEAWAYS
- An SEO audit is a prioritized diagnosis, not an exhaustive list of every flaw. Judgment is the deliverable.
- Start by reconciling your crawl against Search Console's index. The gap, and its reasons, set the agenda.
- Fix the issues that uncap rankings (indexation, duplication, thin content) before the cosmetic ones.
- Internal linking is the highest-ROI lever you fully control, and most audits barely touch it.
- Audit E-E-A-T and GEO readiness as first-class sections. The AI answer layer is not optional anymore.
- Rank every finding by impact and effort, and present a three-layer document a busy person will actually run.
CHAPTER 01
What an SEO Audit Is Actually For
Let me start by killing the most common mistake. An SEO audit is not a list of everything wrong with a website. If that were the goal, a crawler could do it in twenty minutes and you would be redundant. The audit is the part the crawler cannot do: deciding what matters, in what order, and why.
A good audit answers three questions in plain language. What is holding this site back right now? In what order should we fix it? And what is each fix worth if we do? If your deliverable does not answer those three, you did not run an audit. You ran a scan and pasted the results into a document.
I have watched agencies hand clients a 180-page PDF with every missing alt tag and every 302 redirect lovingly cataloged, and watched those PDFs die in a Downloads folder. The client cannot tell whether a missing meta description matters more than a crawl trap eating half their budget. That is your job. The value you add is judgment, not enumeration.
A finding without a priority is just trivia. The entire job of an audit is to turn a flat list of problems into a ranked plan a busy person can actually act on.
The deliverable, defined up front
Before I crawl a single URL, I know what I am building toward. The deliverable is a short, ranked document. At the top, three to five issues genuinely costing the site traffic or money right now, each with the fix and the expected impact. Below that, the medium and low priority items, grouped so they do not drown the headline.
What goes in the deliverable
An executive summary in plain English. A prioritized issue list (critical, high, medium, low). For each issue: what it is, why it matters, the fix, and the rough effort. A short technical appendix with the raw data for whoever implements it. That is the whole shape. If you want the wider strategic frame, pair this with my SEO strategy guide.
Diagnosis over inventory
Your crawler finds problems. You decide which problems matter. The audit is the second part, and it is the only part a client is actually paying for. Scope it before you start: a 50-page brochure site and a 2-million-URL marketplace need different audits.
CHAPTER 02
Crawl and Indexation: See What Google Sees
Every audit starts the same way for me. I need to see the site the way a crawler sees it, then compare that against what Google has actually indexed. The gap between those two numbers is where most of the interesting problems live.
Run a full crawl first. I use Screaming Frog for most sites, the cloud crawlers for the big ones. Configure it to render JavaScript if the site relies on it, because crawling the raw HTML of a React app and calling it a day is how you miss half the site. The crawl gives you the architecture: every URL, its status code, its title, its canonical, its depth from the homepage, its internal links in and out.
Then pull the indexation picture from Google Search Console. The Pages report tells you how many URLs Google has indexed and, more usefully, how many it has chosen not to, and why. Excluded by canonical, crawled but not indexed, discovered but not indexed, soft 404. Those exclusion reasons are Google telling you in writing what it thinks of your pages.
1. Crawl the site and count the indexable URLs (200 status, self-canonical, not noindexed).
2. Pull the indexed count from Search Console's Pages report.
3. Compare. A big gap in either direction is a flag.
4. If crawl is much bigger than index, Google is rejecting pages. Read the exclusion reasons.
5. If index is much bigger than crawl, you have orphaned or parameter URLs the crawler never reached.
6. Check the XML sitemap against both. Sitemap URLs should be indexable and live, with no redirects or 404s.
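The comparison in the steps above comes down to set differences. Here is a minimal sketch, assuming you have already exported three URL lists (indexable URLs from the crawl, indexed URLs from Search Console, and the sitemap). The function name `reconcile` and the example.com URLs are hypothetical, and Search Console exports are sampled, so treat the output as a lead to investigate rather than a complete picture.

```python
def reconcile(crawled_indexable, indexed, sitemap):
    """Compare three URL collections and return the gaps worth investigating."""
    crawled_indexable = set(crawled_indexable)
    indexed = set(indexed)
    sitemap = set(sitemap)
    return {
        # Indexable in the crawl, but Google has not indexed them.
        "crawled_not_indexed": sorted(crawled_indexable - indexed),
        # Indexed by Google, but the crawler never reached them as indexable.
        "indexed_not_crawled": sorted(indexed - crawled_indexable),
        # Listed in the sitemap, but not indexable according to the crawl.
        "sitemap_not_indexable": sorted(sitemap - crawled_indexable),
    }


# Hypothetical sample data for illustration only.
crawl = ["https://example.com/", "https://example.com/a", "https://example.com/b"]
index = ["https://example.com/", "https://example.com/a", "https://example.com/a?sort=price"]
sitemap = ["https://example.com/", "https://example.com/a", "https://example.com/old"]

gaps = reconcile(crawl, index, sitemap)
for name, urls in gaps.items():
    print(name, len(urls), urls)
```

Each bucket maps to one of the steps: the first is where you read the exclusion reasons, the second points at orphans and parameter URLs, and the third is the sitemap hygiene check.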
The most revealing line in Search Console
"Crawled - currently not indexed" is the report row I read first. When you see it at scale, Google found the pages, looked at them, and decided they were not worth keeping. That is almost always a quality or duplication problem, not a technical one, and it points your content review straight at the weak URLs.
Crawl budget and the server logs
Crawl budget matters here too, though far less than people think for small sites. If you have 5,000 pages, ignore it. If you have 500,000, look hard at what Googlebot is actually spending its time on. The honest way to answer that is the server logs, not a tool's estimate. I cover that in my guide on log file analysis, and for big sites it is the single most underused audit input there is.
Example
A retailer once asked me why new products were taking weeks to rank. The crawl looked fine. The logs told the real story: Googlebot was burning 60 percent of its visits on faceted-navigation URLs, every color-and-size combination its own crawlable page. The fix was not more content. It was robots rules and canonicals to stop Google wasting itself on junk, so it could reach the products that mattered. The gap between crawl and index, and the reasons behind it, set the agenda for the rest of the audit.
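To put a number on that kind of waste, you can measure what share of Googlebot requests land on parameter URLs. The sketch below assumes access logs in the common "combined" format and uses the presence of a query string as a rough proxy for faceted navigation. Both are assumptions, as are the regex and the function name. It matches on the user-agent string only, which can be spoofed, so a real analysis would also verify the requesting IPs.

```python
import re
from urllib.parse import urlsplit

# Pulls the request path and user agent out of a combined-format log line.
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def googlebot_parameter_share(log_lines):
    """Return the fraction of Googlebot requests that hit URLs with a query string."""
    total = 0
    with_params = 0
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match or "Googlebot" not in match.group("agent"):
            continue  # skip unparseable lines and non-Googlebot traffic
        total += 1
        if urlsplit(match.group("path")).query:
            with_params += 1
    return with_params / total if total else 0.0


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
# Hypothetical log lines for illustration only.
sample = [
    f'66.249.66.1 - - [10/Mar/2025:10:00:00 +0000] "GET /shoes?color=red&size=9 HTTP/1.1" 200 512 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Mar/2025:10:00:01 +0000] "GET /shoes?color=blue&size=9 HTTP/1.1" 200 512 "-" "{GOOGLEBOT}"',
    f'66.249.66.1 - - [10/Mar/2025:10:00:02 +0000] "GET /shoes/trail-runner HTTP/1.1" 200 2048 "-" "{GOOGLEBOT}"',
    '203.0.113.9 - - [10/Mar/2025:10:00:03 +0000] "GET /shoes?color=red HTTP/1.1" 200 512 "-" "Mozilla/5.0 (Windows NT 10.0)"',
]

share = googlebot_parameter_share(sample)
print(f"{share:.0%} of Googlebot requests hit parameter URLs")
```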
CHAPTER 03
Technical Health: The Plumbing Check
With the crawl in hand, I walk the technical foundation. This is the plumbing. None of it is glamorous, and most of it will not make a site rank on its own, but any one of these can quietly cap your ceiling no matter how good your content is.
Status codes and redirect chains
Start with status codes because they are unambiguous. 404s that still have internal links pointing at them are bleeding link equity. 302s used where a 301 belongs are telling Google the move is temporary when it is not. Redirect chains, where A points to B points to C, waste crawl and dilute signals at every hop. Map every chain and collapse it to a single hop.
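Collapsing chains is mechanical once you have the redirect map. A minimal sketch, assuming the crawl export has been reduced to a dictionary of source URL to immediate target. The function name and the sample paths are hypothetical.

```python
def collapse_redirects(redirects):
    """Map every redirecting URL straight to its final destination.

    `redirects` maps a source URL to its immediate target.
    URLs caught in a redirect loop are mapped to None.
    """
    final = {}
    for start in redirects:
        seen = {start}
        current = redirects[start]
        while current in redirects:
            if current in seen:  # we have come back around: redirect loop
                current = None
                break
            seen.add(current)
            current = redirects[current]
        final[start] = current
    return final


# Hypothetical redirect map: one two-hop chain and one loop.
redirect_map = {
    "/old-page": "/interim-page",
    "/interim-page": "/new-page",
    "/x": "/y",
    "/y": "/x",
}

single_hops = collapse_redirects(redirect_map)
for source, target in single_hops.items():
    print(source, "->", target if target else "LOOP, fix by hand")
```

The output is the single-hop map to hand to whoever owns the redirect rules. Internal links pointing at any source URL should be updated to the final target as well.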
Canonicals and duplication
Canonical tags are where I find the most self-inflicted damage. Pages canonicalizing to the homepage. Paginated sets canonicalizing to page one and losing their deeper products. Parameter URLs with no canonical at all, multiplying into thousands of near-duplicates. Every canonical should point to the one version of a page you want ranked, and it should match the URL in your sitemap and your internal links. When they disagree, Google picks, and it does not always pick your favorite.
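A quick consistency check can surface the cases where the tag and the sitemap disagree. This is a sketch under stated assumptions: `pages` is a hypothetical mapping of crawled URL to its declared canonical (or `None` when the tag is missing), and the three issue labels are my own wording, not a tool's output.

```python
def canonical_conflicts(pages, sitemap_urls):
    """Flag canonical problems. `pages` maps crawled URL -> declared canonical or None."""
    sitemap_urls = set(sitemap_urls)
    issues = []
    for url, canonical in pages.items():
        if canonical is None:
            issues.append((url, "no canonical tag"))
        elif canonical != url and url in sitemap_urls:
            # The sitemap says "index this", the tag says "index something else".
            issues.append((url, "in sitemap but canonicalizes elsewhere"))
        elif canonical not in pages:
            issues.append((url, "canonical target not found in crawl"))
    return issues


# Hypothetical crawl data for illustration only.
crawled = {
    "/shoes": "/shoes",
    "/shoes?page=2": "/shoes",
    "/shoes?color=red": None,
    "/about": "/",
}
conflicts = canonical_conflicts(crawled, sitemap_urls=["/shoes", "/about"])
for url, problem in conflicts:
    print(url, "->", problem)
```

Note that the paginated URL passes this check even though it canonicalizes to page one. Whether that is acceptable is a judgment call the script cannot make, so review those by hand.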
The technical checklist that earns its place
HTTPS everywhere with no mixed content. One canonical home for every page, consistent across sitemap, links, and the tag itself. Robots.txt that blocks what should be blocked and nothing more. No accidental noindex on money pages (it happens more than you would believe). Clean redirect map, single hops only. Valid, lint-clean structured data on the templates that qualify for rich results. Hreflang correct and reciprocal if the site is multi-region. For the full treatment, this all lives in my technical SEO guide.
WATCH OUT
The most expensive bug I find on audits is a stray noindex tag or a Disallow line someone added during a staging push and forgot to remove. Always check robots.txt and the meta robots on your top revenue pages by hand. A tool will report it, but only if you actually read that row.
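The by-hand check can be backed up with a small script run against the top revenue pages. The sketch below uses only the Python standard library and works on text you have already fetched. The helper names and sample content are hypothetical. Two caveats: `urllib.robotparser` does not implement every detail of Google's own matching (wildcards and longest-match precedence), and this does not look at the `X-Robots-Tag` HTTP header, so treat a clean result as a first pass only.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


def blocked_pages(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


class _RobotsMetaFinder(HTMLParser):
    """Records whether a robots or googlebot meta tag contains noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = (attrs.get("name") or "").lower()
        content = (attrs.get("content") or "").lower()
        if tag == "meta" and name in ("robots", "googlebot") and "noindex" in content:
            self.noindex = True


def has_noindex(html):
    finder = _RobotsMetaFinder()
    finder.feed(html)
    return finder.noindex


# Hypothetical robots.txt with a leftover staging rule.
robots = "User-agent: *\nDisallow: /checkout/\nDisallow: /products/\n"
money_pages = ["https://example.com/products/widget", "https://example.com/blog/post"]

blocked = blocked_pages(robots, money_pages)
print("Blocked for Googlebot:", blocked)
print("noindex found:", has_noindex('<head><meta name="robots" content="noindex, follow"></head>'))
```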
Find the ceiling, not the cosmetics
Structured data comes last here for a reason: it does not lift rankings by itself, but it earns rich results and feeds the AI engines. Validate it, do not just add it, and make the markup match what is visible, because invented review stars can earn a manual action. Above all, technical SEO is about removing caps on what good content can achieve. Hunt the issues that block indexing or split signals. Deprioritize the cosmetic ones no human or bot will notice.
CHAPTER 04
On-Page and Content: Does the Page Deserve to Rank?
Now we leave the plumbing and ask the question that actually decides rankings. For each important page, does it deserve to win? Technical perfection on a thin, intent-mismatched page is a fast car with no engine. This is where most of the real upside hides.
I audit on-page in three passes. First, intent. Pull the keywords each page targets and look at what currently ranks for them. If the page is a product page but the results are all how-to guides, you have an intent mismatch no title-tag tweak will fix. That is a strategy problem dressed up as an on-page problem, and it is worth catching early.
Second, the fundamentals. Title tags that earn the click and contain the primary term naturally. One H1 per page that matches the promise. A meta description that sells, even though it is not a ranking factor, because click-through is. Headings that structure the content for a skimmer. On a neglected site, fixing titles alone can move things in weeks.
Judging content depth honestly
Third, depth, and here you have to be ruthless. Open the page and ask: does this fully answer the query better than the pages above it? Not "is it long?" Word count is not depth. A 400-word answer that nails the question beats a 3,000-word one padded to hit a target. Look for genuine first-hand substance and specifics a rewrite of page one could never contain.
Example
On one audit, a client had 90 blog posts and 70 of them were 500-word listicles that all said roughly the same thing. None ranked. The fix was not to improve all 70. It was to consolidate the best five into genuinely deep guides, redirect the rest into them, and let the link equity pool. Traffic to the survivors roughly doubled within two quarters. Density beat volume, as it nearly always does.
Spotting the two silent killers
Thin content: pages too shallow to satisfy intent, flagged by low word count plus low time-on-page plus no rankings. Cannibalization: two or more pages competing for the same query, where Google keeps swapping which one ranks and neither wins. Both are content-audit territory. I cover the cleanup process in detail in my content audit guide and the specific fix in keyword cannibalization.
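Cannibalization candidates can be pulled out of a query-by-page performance export. A minimal sketch, assuming rows of `(query, page, impressions)`. The 20 percent share cutoff, the function name, and the sample rows are all illustrative assumptions, so tune the cutoff to the site. A flagged query is a prompt to look at the pages, not proof of a problem.

```python
from collections import defaultdict


def cannibalized_queries(rows, min_share=0.2):
    """Return queries where two or more pages each take a meaningful share.

    `rows` is an iterable of (query, page, impressions) tuples.
    """
    by_query = defaultdict(lambda: defaultdict(int))
    for query, page, impressions in rows:
        by_query[query][page] += impressions

    flagged = {}
    for query, pages in by_query.items():
        total = sum(pages.values())
        if total == 0:
            continue
        contenders = sorted(p for p, n in pages.items() if n / total >= min_share)
        if len(contenders) >= 2:
            flagged[query] = contenders
    return flagged


# Hypothetical export rows for illustration only.
rows = [
    ("seo audit", "/audit-guide", 600),
    ("seo audit", "/audit-checklist", 400),
    ("seo audit", "/blog/misc", 10),
    ("log analysis", "/logs", 900),
    ("log analysis", "/blog/misc", 50),
]

flagged = cannibalized_queries(rows)
print(flagged)
```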
The output of this chapter is a content verdict for every important URL: keep as is, improve, consolidate, or remove. Page by page, that verdict is often the single most valuable thing in the audit, because it tells the client where to point their writers next instead of publishing more of what is not working.
CHAPTER 05
Internal Linking: The Free Lever Everyone Ignores
Internal linking is the most underrated section of any audit, because it costs nothing to fix and you control all of it. No outreach, no waiting on Google, no third parties. You decide how authority moves through your own site, and most sites do it by accident.
Three things I check every time. Click depth: how many clicks from the homepage to your most important pages. If a money page sits five clicks deep, you are telling Google it is unimportant, whatever you actually believe. The pages you care about most should be close to the surface and linked from your strongest pages.
Orphans: pages with zero internal links pointing at them. The crawler finds these by comparing your sitemap and analytics against the link graph. An orphan is a page Google struggles to find and has no reason to value, because nothing on your own site vouches for it. Every orphan is either a page to link or a page to delete.
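Both checks fall out of the internal link graph. The sketch below assumes the crawl has been reduced to a dictionary of page to the pages it links to, plus a list of known URLs from the sitemap or analytics. The function names and the sample graph are hypothetical. Click depth is a breadth-first search from the homepage, and orphans are known URLs that nothing links to.

```python
from collections import deque


def click_depths(links, home):
    """Breadth-first search from the homepage. Returns page -> clicks from home."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path in BFS
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def orphans(links, known_urls, home):
    """Known URLs (sitemap, analytics) that no other page links to."""
    linked = {t for source, targets in links.items() for t in targets if t != source}
    return sorted(set(known_urls) - linked - {home})


# Hypothetical link graph for illustration only.
link_graph = {
    "/": ["/category", "/blog"],
    "/category": ["/product"],
    "/blog": ["/", "/blog/post"],
    "/blog/post": [],
}
known = ["/", "/category", "/product", "/blog", "/blog/post", "/landing-2021"]

depths = click_depths(link_graph, "/")
orphan_pages = orphans(link_graph, known, "/")
print(depths)
print("Orphans:", orphan_pages)
```

Sort the depth output by business value and the deep money pages stand out immediately.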
Your internal links are the one ranking lever you control completely. No outreach, no budget, no waiting. If you are not auditing them, you are leaving the cheapest wins on the site untouched.
Where does the authority pool?
Look at the distribution of internal links. Most sites accidentally pour links into the footer and navigation, blasting equity at the privacy policy and the contact page while their best commercial pages get a single link from one blog post. Map your highest-value pages and check whether your internal link counts actually reflect that priority. They almost never do, and rebalancing them is free.
PRO TIP
A fast heuristic: sort all URLs by internal inlinks descending, then by business value descending. Where those two orders disagree most is your internal linking to-do list. Pages high in value but low in links are starving, and they are the easiest wins you will find all audit.
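One way to make that heuristic concrete is to compare each page's rank in the two orderings. This is a sketch, not a standard formula: the business value score is something you assign yourself (hypothetical here, on a 1 to 10 scale), and the function name and sample numbers are illustrative.

```python
def linking_todo(pages, top=3):
    """Return the pages most under-linked relative to their business value.

    `pages` maps URL -> (internal_inlinks, business_value).
    """
    by_links = sorted(pages, key=lambda u: pages[u][0], reverse=True)
    by_value = sorted(pages, key=lambda u: pages[u][1], reverse=True)
    link_rank = {url: i for i, url in enumerate(by_links)}
    value_rank = {url: i for i, url in enumerate(by_value)}

    # Positive gap: the page ranks higher on value than on links, so it is starving.
    gaps = {url: link_rank[url] - value_rank[url] for url in pages}
    starving = [u for u in sorted(gaps, key=gaps.get, reverse=True) if gaps[u] > 0]
    return starving[:top]


# Hypothetical data: (inlinks, business value from 1 to 10).
site = {
    "/privacy": (500, 1),
    "/contact": (480, 2),
    "/pricing": (12, 10),
    "/product-a": (3, 9),
    "/blog/post": (40, 4),
}

todo = linking_todo(site)
print(todo)
```

Pages with a negative gap (the privacy policy here) are the over-fed ones. They are not a problem by themselves, but they show where the links are currently going.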
Free and underused
Anchor text matters here too, but gently: internal anchors should be descriptive and varied, clarifying rather than gaming. A serious internal link pass needs no budget and no permission, and it is often the highest-ROI item out of an audit. For the full playbook, see my internal linking guide.
CHAPTER 06
Backlinks and E-E-A-T: The Trust Layer
Off the site now, into the trust layer. Backlinks and E-E-A-T are different things, but I audit them together because they answer the same underlying question: does the rest of the world treat this site as credible, and would a careful person trust it?
On backlinks, I am looking for quality and relevance, not raw volume. Pull the profile from your link tool of choice and judge it. Are the linking domains relevant, or a graveyard of directories and irrelevant blogs? Is the growth curve natural, or are there suspicious spikes that scream a past link scheme? A hundred links from real, relevant sites beat ten thousand from a network nobody respects.
Benchmark against the competitors who actually outrank the client for the money terms. If the three sites beating them have links from industry publications and the client has none, that gap is a finding with a clear action. Resist the urge to obsess over a toxic-links cleanup unless there is real evidence of a penalty. Most disavow panic is wasted effort on links Google already ignores.
What I report on the link profile
Referring domains versus the top three ranking competitors. Quality and topical relevance of the strongest links, not just the count. Anchor text distribution, watching for over-optimized money-term anchors that look manipulative. Any unnatural pattern worth investigating. The action is almost always "earn more relevant links here," rarely "disavow these." For the deeper how-to, this connects to the wider link-building work outside the audit.
Auditing E-E-A-T as four questions
E-E-A-T is not a score you can read off a tool, so I audit it by asking the four questions on the pages that matter. Where is the proof of first-hand experience? Where is the proof of expertise, like a credentialed author? Who off-site vouches for this site? And why would a stranger trust this page, today, on the first read? Gaps in any of those four are findings, and on YMYL topics they move from nice-to-have to mandatory.
PRO TIP
Run the E-E-A-T pass hardest on your YMYL (Your Money or Your Life) pages: health, finance, legal, anything that affects a reader's wellbeing. A missing author bio on a recipe is cosmetic. The same gap on a page about medication or mortgages is a real ranking liability.
The signals are concrete and checkable: named authors with relevant bios, real credentials shown not just claimed, primary-source citations, a strong About page with a transparent identity, and Organization and Author schema that links to verifiable profiles. I lay out the full framework in my E-E-A-T guide, and an audit is the perfect moment to find where those proofs are missing.
CHAPTER 07
Core Web Vitals: Speed as an Audit Input
Performance gets either ignored or wildly overhyped in audits. Let me set the weight correctly. Core Web Vitals are a real, confirmed ranking signal, but a small one, and they are mostly a tiebreaker. A fast page with weak content still loses. That said, on a competitive query where everything else is equal, speed and stability can be the edge.
The first thing to get right is field data versus lab data. Field data is what real Chrome users actually experienced, reported through the Chrome User Experience Report and visible in Search Console and PageSpeed Insights. That is what Google uses. Lab data, from a single Lighthouse run, is a diagnostic tool for finding the cause, not for judging whether you pass. Always anchor your verdict to the field data.
The three metrics, in plain terms
- Largest Contentful Paint (LCP): how fast the main content shows up. Usually a server, image, or render-blocking-resource problem.
- Interaction to Next Paint (INP): how snappy the page feels when the user taps or clicks. Usually a heavy JavaScript problem.
- Cumulative Layout Shift (CLS): how much the page jumps around as it loads. Usually missing image dimensions or late-loading fonts and ads.
In the audit, I report which of the three is failing at the 75th percentile, on mobile, because that is the bar Google holds. I do not report a generic speed score. I report the specific failing metric and its likely cause, because LCP, INP, and CLS have completely different fixes and lumping them into one number helps nobody implement anything.
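Reporting by metric is easy to automate once you have the 75th-percentile field values. The sketch below assumes you have already pulled mobile p75 numbers from a field-data source. It uses Google's published "good" thresholds (LCP 2.5 seconds, INP 200 milliseconds, CLS 0.1), and the likely causes are the ones listed above. The function name and the sample values are hypothetical.

```python
# Published "good" thresholds at the 75th percentile.
GOOD_THRESHOLDS = {"LCP": 2500, "INP": 200, "CLS": 0.1}  # LCP and INP in ms, CLS unitless

LIKELY_CAUSES = {
    "LCP": "server response, images, or render-blocking resources",
    "INP": "heavy JavaScript",
    "CLS": "missing image dimensions or late-loading fonts and ads",
}


def failing_metrics(p75):
    """Return (metric, value, likely cause) for each metric above its good threshold."""
    return [
        (metric, p75[metric], LIKELY_CAUSES[metric])
        for metric in GOOD_THRESHOLDS
        if metric in p75 and p75[metric] > GOOD_THRESHOLDS[metric]
    ]


# Hypothetical mobile field data for one template.
article_template = {"LCP": 2100, "INP": 180, "CLS": 0.27}

failures = failing_metrics(article_template)
for metric, value, cause in failures:
    print(f"{metric} fails at p75 ({value}). Likely cause: {cause}.")
```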
Example
A media site was failing CLS sitewide and could not work out why their score was red despite a fast server. The cause was ad slots with no reserved height, so every page visibly jumped as the ads loaded. Reserving the dimensions fixed CLS without touching a single line of content. The lesson: the metric tells you exactly where to look, which is why you report the metric and not the score.
Weight it correctly
Do not let a performance section eat the whole audit. If a site is failing on indexation or running thin content, Core Web Vitals belongs in the medium-priority pile, not the headline. Fix the things that uncap rankings first, then chase the tiebreaker. Treat performance as one input among many, weighted by where the site sits competitively.
For the full diagnosis-and-fix process on each metric, including the common causes and the order to tackle them, see my Core Web Vitals guide.
CHAPTER 08
GEO Readiness: Auditing for the AI Answer Layer
This is the chapter most audits in the wild still skip, and it is the one that separates a current playbook from a 2019 one. A growing share of searches now end in an AI answer instead of a list of blue links. So I audit a new question: when an AI engine builds an answer in your space, can it find you, parse you, and quote you with confidence?
Generative Engine Optimization is not a separate discipline bolted on the side. It is mostly the same fundamentals seen through a new lens. The same clear structure, strong E-E-A-T, and clean technical foundation that win in classic search also make a page easy for a language model to extract and trust. But there are specific things I now check that a 2019 audit never would have.
Can the AI crawlers even reach you?
First, discoverability for AI crawlers. Check robots.txt for the AI user-agents like GPTBot, OAI-SearchBot, ClaudeBot, and PerplexityBot. Plenty of sites have blocked these without realizing it, through a blanket rule or an over-eager security plugin, then wonder why they are invisible in ChatGPT. An audit should flag whether that block is a deliberate choice or an accident, because being uncited because you blocked the crawler is a self-inflicted wound.
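A quick way to flag this is to test each AI user-agent against the site's robots.txt. The sketch below uses Python's standard `urllib.robotparser` on robots.txt text you have already fetched. The agent list is the one named above, and the function name and sample file are hypothetical. The standard-library parser is simpler than each crawler's own implementation, so confirm any surprising result by reading the file.

```python
from urllib.robotparser import RobotFileParser

AI_AGENTS = ("GPTBot", "OAI-SearchBot", "ClaudeBot", "PerplexityBot")


def ai_crawler_access(robots_txt, url="https://example.com/"):
    """Return agent -> True if the robots.txt text allows that agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_AGENTS}


# Hypothetical robots.txt that blocks one AI crawler by name.
robots = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Disallow: /
"""

access = ai_crawler_access(robots)
for agent, allowed in access.items():
    print(f"{agent}: {'allowed' if allowed else 'BLOCKED'}")
```

The script tells you what is blocked. Whether the block is deliberate is a question for the client, and that answer belongs in the finding.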
The GEO readiness checklist
AI crawlers allowed (or a conscious decision to block them). Content structured so answers are extractable: clear headings, direct answers near the top, lists and tables where they fit. Genuine first-hand experience and specifics a model cannot generate itself. Strong entity clarity through schema and a consistent identity across the web. Factual, citable statements rather than vague marketing prose. This is the audit lens for the AI era, and it maps closely onto good E-E-A-T.
Second, extractability. AI engines lift answers from content structured to be lifted. A page that buries its answer under 600 words of preamble is hard to quote. A page that states the answer directly, supports it, and structures the supporting points clearly is easy to quote. In the audit I flag the high-intent pages that are structurally hostile to extraction, because that is a fixable reason they are not getting cited.
PRO TIP
The single best GEO signal is something an AI cannot manufacture: real, first-hand, specific experience. Original data, your own testing, photos, named details. When you audit a page for AI readiness, ask what is on here that a model could not have written from thin air. If the answer is nothing, that is your finding.
Treat GEO readiness as a first-class section, not an afterthought. The encouraging part is that the work overlaps heavily with everything else in this audit. Fix the fundamentals, prove your experience, structure for clarity, and you improve classic rankings and AI citations at the same time. The payoff is double, and skipping this section dates your whole audit.
CHAPTER 09
Prioritize and Present: Turning Findings Into Action
You have gathered everything. Crawl, indexation, technical, on-page, links, E-E-A-T, performance, GEO. Now comes the part that actually determines whether your audit changes anything: turning a pile of findings into a ranked plan someone will execute. A brilliant diagnosis that nobody acts on is worth exactly nothing.
I rank every finding on two axes: impact and effort. Impact is how much traffic, revenue, or risk reduction the fix is likely to deliver. Effort is how hard it is to implement, factoring in developer time and dependencies. Plot them and your priorities draw themselves. High impact and low effort go to the top. Low impact and high effort go to the bottom, or off the list entirely.
| | Low effort | High effort |
|---|---|---|
| High impact | Do first. The headline wins. | Plan and schedule. The big projects. |
| Low impact | Quick housekeeping when convenient. | Usually not worth doing. Cut it. |
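The matrix can be applied mechanically once each finding has an impact and an effort score. A minimal sketch, assuming a 1 to 5 scale with 3 as the high/low cutoff. The scale, the cutoff, the class and function names, and the sample findings are all illustrative assumptions, and the scores themselves still come from your judgment.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    what: str
    why: str
    fix: str
    impact: int  # 1 (low) to 5 (high), assumed scale
    effort: int  # 1 (low) to 5 (high), assumed scale


QUADRANT_ORDER = ["Do first", "Plan and schedule", "Quick housekeeping", "Cut"]


def quadrant(finding, threshold=3):
    """Place a finding in one of the four impact/effort quadrants."""
    high_impact = finding.impact >= threshold
    high_effort = finding.effort >= threshold
    if high_impact:
        return "Plan and schedule" if high_effort else "Do first"
    return "Cut" if high_effort else "Quick housekeeping"


def prioritize(findings):
    """Order by quadrant, then by impact descending, then by effort ascending."""
    return sorted(
        findings,
        key=lambda f: (QUADRANT_ORDER.index(quadrant(f)), -f.impact, f.effort),
    )


# Hypothetical findings for illustration only.
findings = [
    Finding("Redirect chains on legacy URLs", "Wastes crawl", "Collapse to single hops", 2, 1),
    Finding("Stray noindex on category pages", "Pages cannot rank", "Remove the tag", 5, 1),
    Finding("Faceted navigation crawl trap", "Burns crawl budget", "Robots rules and canonicals", 4, 5),
    Finding("Alt text gaps on archive images", "Minor accessibility gain", "Rewrite in bulk", 1, 4),
]

plan = prioritize(findings)
for f in plan:
    print(f"[{quadrant(f)}] {f.what}: {f.fix}")
```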
Presenting so it actually gets done
Be honest about that bottom-right quadrant. High effort and low impact is where good audits show discipline: you note the issue exists and explicitly recommend not doing it yet. Then structure the deliverable for the person who reads it, not for you. Open with an executive summary a non-technical decision-maker understands in two minutes: the state of the site, the three things costing the most, and what fixing them is worth. Then the prioritized issue list. Then a technical appendix for whoever implements it. Three layers, three audiences, one document.
Writing a finding that gets fixed
Every issue gets four lines. What it is, in plain language. Why it matters, tied to traffic, revenue, or risk. The specific fix, concrete enough to action. The rough effort, so they can plan. Skip any of the four and the finding stalls. "Why it matters" is the one people forget, and it is the one that gets the ticket approved. For turning this into ongoing reporting, this dovetails with my SEO strategy guide.
An audit nobody implements is a very expensive way of being right. Your job is not to be thorough. Your job is to get the important things fixed.
Shmul
Close with a clear next step. Not "here are 200 problems, good luck." Instead: start with these three this month, here is who owns each, here is how we measure it. An audit is the beginning of work, not the end. The best audit I hand over is the one that gets the top three findings shipped within a quarter, and that is the standard I hold every one of mine to.
Frequently asked
How long does an SEO audit take?
What tools do I need to do an SEO audit?
How often should I audit a website?
What is the most common issue found in SEO audits?
Should an SEO audit include AI and GEO readiness?
How do I prioritize the findings from an audit?