Every site has its own development release schedule, publishing cadence, and a myriad of other variables that could affect the need for technical analysis.
So how often should you perform technical website crawls for SEO? It depends.
What does it depend on? That is the crucial question.
Let’s take a quick look at what a website crawl is and why we run them before diving into how frequently to do them.
What Is a Technical SEO Website Crawl?
A crawl of a website is when a piece of software’s “crawler,” or bot, visits each page on a website, extracting data as it goes. This is similar to how a search engine’s bot might visit your site.
It follows the rules you give it: whether to respect or ignore your robots.txt, whether to follow or disregard nofollow directives, and any other conditions you specify.
It will then crawl each page it can find by following links and reading XML sitemaps.
As it goes, the crawler will bring back information about the pages. This might be HTTP status codes like 404, the presence of a noindex tag on the page, or whether bots would be blocked from crawling it via the robots.txt, for example.
It may also bring back HTML information like page titles and descriptions, the layout of the site’s architecture, and any duplicate content discovered.
All of this information gives you a powerful snapshot of your website’s ability to be crawled and indexed.
It can also highlight issues that may affect rankings, such as load speed or missing metadata.
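To make the kind of data a crawler brings back more concrete, here is a minimal sketch of the per-page extraction step using only Python's standard library. The names PageAuditParser and audit_html are made up for this illustration, and a real crawling tool collects far more than this; treat it as a rough picture rather than how any particular product works.

```python
from html.parser import HTMLParser


class PageAuditParser(HTMLParser):
    """Collects a few on-page signals that a crawler typically reports."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.noindex = False
        self.links = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content") or ""
            if name == "description":
                self.meta_description = content
            elif name == "robots" and "noindex" in content.lower():
                self.noindex = True
        elif tag == "a" and attrs.get("href"):
            # Links found here are what the crawler would follow next.
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def audit_html(html):
    """Return the title, meta description, noindex flag, and links of a page."""
    parser = PageAuditParser()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title.strip(),
        "meta_description": parser.meta_description,
        "noindex": parser.noindex,
        "links": parser.links,
    }
```

Calling audit_html on the HTML of each fetched page gives one row of the crawl report: an empty title or a missing description shows up as a gap, and a noindex flag on a page you expect to rank is the sort of thing worth a closer look.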
The Purpose of a Technical SEO Website Crawl
When you conduct a crawl of a site, it’s usually to identify issues that could be affecting one or more of the following: crawling, indexing, or rankings.
Running a site crawl is an easy job once you have the software in place. If you are looking to spot potential or current issues with your site, it makes sense to crawl it regularly and often.
Why Wouldn’t You Crawl a Site All the Time?
In SEO, there are near-unlimited tasks we could be carrying out at any given moment: SERP analyses, refreshing meta titles, and rewriting copy in the hope of ranking higher, to name a few.
Without a strategy behind these activities, you are at best distracting yourself from impactful work. At worst, you could be reducing the performance of your site.
As with other SEO tasks, there must be a strategy behind website crawls.
The flip-side of the question “How often should you perform technical website crawls?” is understanding why you wouldn’t run them all the time.
Essentially, they take up time and resources — if not to run, then at least to analyze effectively.
Adding a URL to a website crawler and clicking go isn’t a particularly onerous task. It becomes even less of a time drain if you schedule crawls to happen automatically.
So why is time a deciding factor in how often you crawl a site?
It’s because there is no point in crawling a site if you are not going to analyze the results. That’s what takes time — the interpretation of the data.
You may well have software that highlights errors in a color-coded traffic-light system of urgency that you can cast your eye down quickly. This isn’t analyzing a crawl.
You may miss important issues that way. You might get overly reliant on a tool to tell you how your site is optimized.
Although very helpful, those sorts of reports need to be coupled with deeper checks and analysis to see how your site is supporting your SEO strategy.
There will likely be good reasons why you would want to set up these automated reports to run frequently. You may have a few issues, like server errors, that you want to be alerted to every day.
These should be considered alerts, though, and ones that may need a deeper investigation. Proper analysis of your crawls, with knowledge of your SEO plan, takes time.
Do you have the capacity, or need, to do that full crawl and analysis daily?
In order to crawl your site, you will need software.
Some software can be used without limit once you have paid a license fee. Other tools charge according to how much you use them.
If your crawling software cost is based on usage, crawling your site every day might be cost-prohibitive. You may end up using your month’s allowance too early, meaning you can’t crawl the site when you need to.
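The arithmetic behind this is simple enough to sketch. The example below assumes a hypothetical plan that bills per URL crawled; the function names and the figures are invented for illustration, and your own tool may meter usage differently.

```python
def full_crawls_available(monthly_url_allowance, urls_per_crawl):
    """How many complete crawls fit inside a monthly URL allowance."""
    if urls_per_crawl <= 0:
        raise ValueError("urls_per_crawl must be positive")
    return monthly_url_allowance // urls_per_crawl


def allowance_needed(urls_per_crawl, crawls_per_month):
    """URL allowance required to run a given number of full crawls."""
    return urls_per_crawl * crawls_per_month


# Hypothetical numbers: a 12,000-URL site on a 100,000-URL monthly plan.
print(full_crawls_available(100_000, 12_000))  # 8 full crawls
print(allowance_needed(12_000, 30))            # 360000 URLs for daily crawls
```

With these made-up numbers, a daily crawl would need more than three times the allowance, which is why usage-based pricing tends to push you toward crawling only when there is a reason to.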
Unfortunately, some sites rely on servers that are not particularly robust. As a result, a crawl conducted too quickly, or at a busy time, can bring the site down.
I’ve experienced frantic calls from the server manager to the SEO team asking if we’re crawling the site again.
I’ve also worked on sites that have crawling tools blocked in the robots.txt in the hopes it will prevent an over-zealous SEO bringing down the site.
Although this obviously isn’t an ideal situation to be in, for SEOs working for smaller companies, it’s an all too common scenario.
Crawling the website safely might require that tools are slowed down, rendering the process more time-consuming.
It might mean liaising with the individual in charge of maintaining the server to ensure they can prepare for the crawl.
Doing this too frequently or without good reason isn’t sustainable.
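As a rough illustration of what slowing a crawl down means in practice, the sketch below pauses between requests and skips anything robots.txt disallows. It uses Python's standard urllib.robotparser; the polite_crawl and build_robots helpers, the example-seo-crawler user agent, and the two-second default delay are all assumptions made for this example, and the fetch function is passed in so that you can plug in whatever HTTP client you already use.

```python
import time
from urllib.robotparser import RobotFileParser


def build_robots(robots_txt):
    """Parse the text of a robots.txt file into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def polite_crawl(urls, fetch, robots, user_agent="example-seo-crawler",
                 default_delay=2.0, sleep=time.sleep):
    """Fetch URLs one at a time, pausing between requests.

    Uses the Crawl-delay from robots.txt when one is set, otherwise
    falls back to default_delay (an arbitrary value for this sketch).
    """
    delay = robots.crawl_delay(user_agent) or default_delay
    results = {}
    first_request = True
    for url in urls:
        if not robots.can_fetch(user_agent, url):
            results[url] = "blocked by robots.txt"
            continue
        if not first_request:
            sleep(delay)  # give the server room to breathe
        first_request = False
        results[url] = fetch(url)
    return results
```

The trade-off is the one described above: at a five-second delay, a 10,000-page site takes roughly 14 hours to crawl, so a throttled crawl is something you plan for rather than run on a whim.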
Alternatives to Crawling Your Site
You don’t necessarily need to crawl your site daily in order to pick up on the issues. You may be able to reduce the need for frequent crawls by putting other processes and tools in place.
Software That Monitors for Changes
Some software can monitor your site for a whole variety of changes. For instance, you can set up an alert for individual pages to monitor if content changes.
This can be helpful if you have important conversion pages that are critical to the success of your site and you want to know the moment anyone makes a change to them.
You can also use software to alert you to server status, SSL expiration, robots.txt changes, and XML sitemap validation issues. All of these types of alerts can reduce your need to crawl the site to identify issues.
Instead, you can save those crawls and audits for when an issue is discovered and needs to be remedied.
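Change monitoring of this kind usually comes down to comparing a stored fingerprint of a page with a fresh one. Here is a minimal sketch of that idea using a SHA-256 hash; the fingerprint and detect_changes names are invented for the example, and it assumes you already have the text of each monitored page or file (a key conversion page, your robots.txt) from some other process.

```python
import hashlib


def fingerprint(content):
    """Return a stable hash of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def detect_changes(previous, current):
    """Compare stored fingerprints with freshly fetched content.

    previous: dict mapping URL -> fingerprint from the last check
    current:  dict mapping URL -> content fetched now
    Returns (changed_urls, updated_fingerprints).
    """
    updated = {url: fingerprint(content) for url, content in current.items()}
    changed = sorted(
        url for url, new_print in updated.items()
        if url in previous and previous[url] != new_print
    )
    return changed, updated
```

Anything in the changed list is a prompt to look closer, and possibly to run a targeted crawl. Note that hashing raw HTML will also flag trivial differences such as rotating banners or timestamps, so in practice you would likely hash only the parts of the page you care about.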
Processes That Inform SEO Professionals of Changes/Plans
The other way to minimize the need to crawl your site often is to put processes in place with other team members that keep you in the loop about changes that might be happening to the site. This is easier said than done in most instances but is a good practice to instill.
If you have access to the development team or agency’s ticketing system and are in frequent communications with the project manager, you are likely to know when deployments might affect SEO.
Even if you don’t know exactly what the roll-out will change, if you are aware of deployment dates, you can schedule your crawls to happen around them.
By staying aware of when new pages are going live, content is going to be rewritten, or new products launched, you will know when a crawl will be needed.
This will save you from needing to pre-emptively crawl weekly in case of changes.
Automated Crawls With Tailored Reports
As mentioned above, crawling tools often allow you to schedule your crawls. Your server and your processes may well be able to withstand this.
Don’t forget that you still need to read and analyze the crawls, so scheduling them won’t necessarily save you that much time unless they are producing an insightful report at the end.
You may be able to output the results of the crawl into a dashboard that alerts you to the specific issues you are concerned about.
For instance, it may give you a snapshot of how the volume of pages returning 404 server responses has increased over time.
This automation and reporting could then give cause for you to conduct a more specific crawl and analysis rather than requiring very frequent human-initiated crawling.
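A dashboard metric like this can be as simple as counting 404s in each scheduled crawl and flagging a jump. The sketch below assumes each crawl has been exported as a mapping of URL to status code; the function names and the threshold of 10 are arbitrary choices for illustration, not a recommendation.

```python
def count_404s(crawl):
    """Count the URLs in one crawl that returned a 404 status."""
    return sum(1 for status in crawl.values() if status == 404)


def not_found_trend(snapshots):
    """Summarize 404 counts across crawls in chronological order.

    snapshots: list of (label, {url: status_code}) tuples
    Returns a list of (label, count, change_since_previous_crawl).
    """
    trend = []
    previous = None
    for label, crawl in snapshots:
        count = count_404s(crawl)
        change = None if previous is None else count - previous
        trend.append((label, count, change))
        previous = count
    return trend


def flag_spikes(trend, threshold=10):
    """Return the crawls where 404s rose by at least `threshold`."""
    return [label for label, _, change in trend
            if change is not None and change >= threshold]
```

Only the crawls returned by flag_spikes would then call for a person to sit down and investigate, which keeps the scheduled crawls from turning into a daily analysis task.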
When Should a Crawl Be Done?
As we’ve already discussed, frequent crawls just to check up on site health might not be necessary.
Crawls should really be carried out in the following situations.
Before Development or Content Changes
If you are preparing your site for a change — for instance, a migration of content to a new URL structure — you will need to crawl your site.
This will help you to identify if there are any issues already existing on the pages that are changing that could affect their performance post-migration.
Crawling your site before a development or content change is about to be carried out on the site ensures it is in the optimum condition for that change to be positive.
Before Carrying Out Experiments
If you are preparing to carry out an experiment on your site, for example, checking to see what effect disavowing spammy backlinks might have, you need to control the variables.
Crawling your website to get an idea of any other issues that might also affect the outcome of the experiment is important.
You want to be able to say with confidence that it was the disavow file that caused the increase in rankings for a troubled area of your site, and not that those URLs’ load speed had increased around the same time.
When Something Has Happened
You will need to check up on any major changes on your site that could affect the code. This will require a technical crawl.
For example, you should crawl after a migration, once new development changes have been deployed, or after work to add schema markup to the site, since any of these could have broken something or been deployed incorrectly.
When You Are Alerted to an Issue
It may be that you are alerted to a technical SEO issue, like a broken page, through tools or human discovery. This should kick-start your crawl and audit process.
The idea of the crawl will be to ascertain if the issue is widespread or contained to the area of the site you have already been alerted to.
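One quick way to answer the "widespread or contained" question is to group the affected URLs by site section and compare them with the total pages crawled in that section. The sketch below does this by top-level directory; treating the first path segment as the section is an assumption that suits many sites but not all, and the function names are illustrative.

```python
from collections import Counter
from urllib.parse import urlparse


def section_of(url):
    """Return the top-level directory of a URL, or "/" for root-level pages."""
    path = urlparse(url).path
    segments = [part for part in path.split("/") if part]
    if not segments:
        return "/"
    if len(segments) == 1 and not path.endswith("/"):
        return "/"  # a page such as /about sits at the root
    return f"/{segments[0]}/"


def issue_spread(broken_urls, all_urls):
    """Show how many crawled URLs in each section are affected."""
    broken = Counter(section_of(url) for url in broken_urls)
    totals = Counter(section_of(url) for url in all_urls)
    spread = {}
    for section, count in sorted(broken.items()):
        total = totals[section]
        spread[section] = {
            "broken": count,
            "total": total,
            "share": count / total if total else None,
        }
    return spread
```

If one section accounts for nearly all the affected URLs, the issue is probably tied to a template or a single deployment. If the affected URLs are scattered across every section, you are more likely looking at a site-wide problem.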
What Can Affect How Often You Need to Perform Technical SEO Crawls?
No two websites are identical (unless yours has been cloned, but that’s a different issue). Sites will have different crawl and audit needs based on a variety of factors.
The size of a site, its complexity, and how often things change can all affect how often it needs to be crawled.
If your website is only a few pages, the need to crawl it frequently is low.
Chances are you are well aware of what changes are being made to the small site and will easily be able to spot any significant problems. You are firmly in the loop of any development changes.
Enterprise sites, however, may run to tens of thousands of pages. These are likely to see more issues arise as changes are deployed across hundreds of pages at a time.
With just one bug, you could find a large volume of pages affected at once. Websites that size may need much more frequent crawls.
The type of website you are working on might also dictate how often and regularly it needs to be crawled.
An informational site that has few changes to its core pages until its annual review will likely need to be crawled less frequently than one where product pages go live often.
One of the particular nuances of ecommerce sites when it comes to SEO is the stock. Product pages might come online every day, and products may go out of stock as frequently. This can raise technical SEO issues that need to be dealt with quickly.
You might find that a website’s way of dealing with out-of-stock products is to redirect them, temporarily or permanently. It might be that out-of-stock products return a 404 code.
Whatever method for dealing with them is chosen, you need to be alerted to this when it happens.
You may be tempted to crawl your site daily to pick up on these new or deleted pages. There are better ways of identifying these changes though, as we’ve already discussed.
A website monitoring tool would alert you to these pages returning a 404 status code. Additional software might be out of your current budget, however. In this instance, you might still need to crawl your site weekly or more often.
This is one of the examples where automated crawls to catch these issues would come in handy.
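If you do fall back on automated weekly crawls, comparing each crawl with the previous one is what turns them into alerts. Here is a minimal sketch, assuming each crawl is available as a mapping of URL to status code; the status_changes function and its category names are made up for this example.

```python
REDIRECT_CODES = {301, 302, 307, 308}


def status_changes(previous, current):
    """Compare two crawls and report pages whose status has changed.

    previous, current: dicts mapping URL -> HTTP status code
    """
    changes = {"new": [], "removed": [], "now_404": [], "now_redirect": []}
    for url, status in current.items():
        if url not in previous:
            changes["new"].append(url)
        elif previous[url] == 200 and status == 404:
            changes["now_404"].append(url)
        elif previous[url] == 200 and status in REDIRECT_CODES:
            changes["now_redirect"].append(url)
    # URLs the crawler found last time but could not find this time.
    changes["removed"] = [url for url in previous if url not in current]
    return {kind: sorted(urls) for kind, urls in changes.items()}
```

For an ecommerce site, the now_404 and now_redirect lists are where out-of-stock handling would show up, so you can check that each product is being treated the way your strategy intends.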
News websites tend to add new pages often; there may be multiple new pages a day, sometimes hundreds for large news sites. This is a lot of change to a site happening each day.
Depending on your internal processes, these new pages may be published with great consideration of how they will affect a site’s SEO performance… or very little.
Forum and User Generated Content
Any site that has the ability for the general public to add content will have an increased risk of technical SEO errors occurring.
For instance, broken links, duplicate content, and missing metadata are all common on sites with forums.
These sorts of sites may need more frequent crawls than content sites that only allow publishing by webmasters.
A content site with few template types may sound relatively low risk when it comes to incurring technical SEO issues. Unfortunately, if you have “many cooks” there is a risk of the broth being spoiled.
Users with little understanding of how to form URLs, or of which CMS fields are crucial, might create technical SEO problems.
Although this is really a training issue, there may still be an increased need to crawl sites whilst that training is being completed.
Schedule and Cadence
The other important factor to consider is the schedule of other teams in your company.
Your development team might work in two-week sprints. You may only need to crawl your site once every two weeks to see their impact on your SEO efforts.
If your writers publish new blogs daily, you may want to crawl the site more frequently.
There is no one-size-fits-all schedule for technical website crawls. Your individual SEO strategy, processes, and type of website will all impact the optimal frequency for conducting crawls.
Your own capacity and resources will also affect this schedule.
Be considerate of your SEO strategy and implement other alerts and checks to minimize the need for frequent website crawls.
Your crawls should not just be a website maintenance tick-box exercise; they should respond to a preventative or reactive need.