Last Updated on: 14th January 2022, 11:03 pm
Technical SEO is a tricky, nuanced field that requires a deep knowledge of the inner workings of search engine algorithms. Technical SEO is NOT just about having an understanding of code and programming languages, though it certainly helps. The goal of this guide is to give you a broad overview of how technical SEO fits into your overall marketing strategy—and what you can do to improve your site’s performance in search engines.
When people think of SEO, they usually think of two things: content and links. However, many people overlook the technical side of SEO, which is just as important as the rest of it. Technical SEO covers the technical standards that search engines like Google expect websites to meet so that their pages can be crawled, indexed, and ranked well.
What is Technical SEO?
Technical SEO is the process of optimizing a website for search engines. This includes many areas like page speed optimization, schema markup, mobile-friendly sites, and other technical on-site factors that help improve rankings in search engines.
How complicated is technical SEO?
Technical SEO is all about making sure that your website is ready for search engines to crawl and index. Although it’s more technical than content and link-building strategies, there are some simple things you can do to improve your search engine traffic.
It’s important to understand how search engines work and what they expect from you. A beginner’s guide to technical SEO will provide you with a solid foundation to build upon in the future.
How crawling works
Crawling is the act of visiting a website to retrieve its pages and the resources they link to. Crawling is also known as spidering, and search engine crawlers are often called spiders. These crawlers are software applications that build a map of a website by following all of the links on a page, then following the links on those pages, and so on.
By crawling a website, search engines collect information about the site’s content, structure, connectivity, and other elements. This data is then used by search engines to determine how relevant any of your pages are.
A crawler has to start somewhere. Generally, crawlers build a list of all the URLs they find through links on pages. A secondary way to find more URLs is through sitemaps: lists of pages created by site owners or by various systems.
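A sitemap is just an XML file that lists the URLs you want crawled, following the sitemaps.org protocol. A minimal example might look like this (the domain and dates are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2022-01-14</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/blog/</loc>
  </url>
</urlset>
```

You then reference this file in Google Search Console or in your robots.txt so crawlers can find it.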
Google’s crawling pipeline has a few main parts:
- Crawl queue: all the URLs that need to be crawled or re-crawled are prioritized and added here. This is basically an ordered list of URLs Google wants to crawl.
- Crawler: the system that grabs the content of the pages.
- Processing systems: various systems that handle canonicalization (which we’ll talk about in a minute), send pages to the renderer, which loads the page like a browser would, and process the pages to find more URLs to crawl.
- Index: the stored pages that Google shows to users.
There are a few ways you can control what gets crawled on your website. Here are a few options.
A robots.txt file tells search engines where they can and can’t go on your site.
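For example, a robots.txt file that blocks all crawlers from an admin area while allowing everything else might look like this (the paths and sitemap URL are placeholders):

```
User-agent: *
Disallow: /admin/
Allow: /

Sitemap: https://www.example.com/sitemap.xml
```

The file lives at the root of your domain (e.g. example.com/robots.txt), and the Sitemap line helps crawlers discover your sitemap.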
Just one quick note: Google may index pages that they can’t crawl if links are pointing to those pages. This can be confusing, but if you want to keep pages from being indexed, check out this guide and flowchart, which can walk you through the process.
There’s a crawl-delay directive you can use in robots.txt that many crawlers support that lets you set how often they can crawl pages. Unfortunately, Google doesn’t respect this. For Google, you’ll need to change the crawl rate in Google Search Console as described here.
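For crawlers that do support it, the directive is just an extra line in robots.txt with a delay in seconds (the value here is an example):

```
# Respected by some crawlers (e.g. Bing), ignored by Googlebot
User-agent: *
Crawl-delay: 10
```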
If you want the page to be accessible to some users but not search engines, then what you probably want is one of these three options:
- Some kind of login system;
- HTTP authentication (where a password is required for access);
- IP whitelisting (which only allows specific IP addresses to access the pages).
This type of setup is best for things like internal networks, members-only content, or staging, test, or development sites. It allows a group of users to access the pages, but search engines will not be able to access or index them.
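As a sketch of the HTTP authentication option, here is what it could look like on an Apache server using an .htaccess file (the AuthUserFile path is a placeholder; you’d generate the .htpasswd file with the htpasswd utility):

```
# Apache .htaccess: require a username/password for this directory
AuthType Basic
AuthName "Restricted area"
AuthUserFile /path/to/.htpasswd
Require valid-user
```

Crawlers can’t supply credentials, so any page behind this prompt stays out of the index.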
Crawlers are applications that can visit every single page on the web. The purpose of crawling is to find new pages, update existing lists of pages, or both. The general idea is to start with a list of links and follow each and every one of them until there are no links left.
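That “follow every link until none are left” idea is essentially a breadth-first search. Here is a minimal sketch in Python using a toy in-memory link graph instead of real HTTP requests (the URLs and graph are made up for illustration):

```python
from collections import deque

# Hypothetical link graph standing in for real pages:
# each URL maps to the URLs it links to.
LINK_GRAPH = {
    "/": ["/about", "/blog"],
    "/about": ["/"],
    "/blog": ["/blog/post-1", "/blog/post-2"],
    "/blog/post-1": ["/blog"],
    "/blog/post-2": ["/", "/about"],
}

def crawl(start_url):
    """Breadth-first crawl: keep a queue of URLs to visit and a set of
    URLs already seen, and follow every link until the queue is empty."""
    queue = deque([start_url])
    seen = {start_url}
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)  # a real crawler would fetch and process here
        for link in LINK_GRAPH.get(url, []):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order

print(crawl("/"))  # → ['/', '/about', '/blog', '/blog/post-1', '/blog/post-2']
```

A real crawler adds politeness delays, robots.txt checks, and HTML parsing on top of this loop, but the queue-plus-seen-set core is the same.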
A robots meta tag is an HTML snippet that tells search engines how to crawl or index a certain page. It’s placed into the <head> section of a web page, and looks like this:
<meta name="robots" content="noindex" />
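The same directives can also be sent as an HTTP response header called X-Robots-Tag, which is useful for non-HTML files like PDFs that have no <head> section. On an Apache server with mod_headers enabled, a sketch might look like this:

```
# Apache config: keep all PDFs out of the index
<Files "*.pdf">
  Header set X-Robots-Tag "noindex"
</Files>
```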
When there are multiple versions of the same page, Google will select one to store in their index. This process is called canonicalization, and the URL selected as the canonical will be the one Google shows in search results. There are many different signals they use to select the canonical URL, including:
- Duplicate page detection;
- Canonical link tags;
- Sitemap URLs;
- Redirects;
- Internal links.
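The most direct of these signals is the canonical link tag: an HTML element placed in the <head> of duplicate versions that points at the preferred URL. For example (the URL is a placeholder):

```html
<link rel="canonical" href="https://www.example.com/preferred-page/" />
```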
Technical SEO quick wins
SEO is a complex field with a lot of moving parts. When you’re just getting started, it can be hard to know where to start. So if you’re an SEO beginner and would like a quick win to boost your traffic, here are some technical SEO tips that will help you rank faster.
It’s no secret that Google is obsessed with speed – in fact, they claim that site speed can directly impact how well you rank.
Make sure pages you want people to find can be indexed in Google. The two previous chapters were all about crawling and indexing, and that was no accident.
Reclaim lost links
Websites tend to change their URLs over the years. In many cases, these old URLs have links from other websites. If they’re not redirected to the current pages then those links are lost and no longer count for your pages. It’s not too late to do these redirects and you can quickly reclaim any lost value. Think of this as the fastest link building you will ever do.
In Ahrefs’ Site Explorer: yourdomain.com -> Pages -> Best by Links -> add a “404 not found” HTTP response filter. I usually sort this by “Referring Domains”.
This is what it looks like for 1800flowers.com.
Looking at the first URL in archive.org, I see that this was previously the Mother’s Day page. By redirecting that one page to the current version, you’d reclaim 225 links from 59 different websites and there are plenty more opportunities.
You’ll want to 301 redirect any old URLs to their current locations to reclaim this lost value.
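How you set up a 301 redirect depends on your server. As one illustration, on an Apache server you could add a line like this to .htaccess (both paths are placeholders for your own old and current URLs):

```
# Apache .htaccess: permanently redirect an old URL to its current location
Redirect 301 /old-mothers-day-page/ https://www.example.com/mothers-day/
```

A 301 tells search engines the move is permanent, so link value passes to the new URL.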
Add internal links
Internal links are links from one page on your site to another page on your site. They help your pages be found and also help the pages rank better. We have a tool within Site Audit called “Link opportunities” that helps you quickly locate these opportunities.
Add schema markup
Schema markup is code that helps search engines understand your content better and powers many features that can help your website stand out from the rest in search results. Google has a search gallery that shows the various search features and the schema needed for your site to be eligible.
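Schema markup is usually added as a JSON-LD script in the page’s HTML. As a sketch, here is a minimal FAQPage snippet using schema.org types (the question and answer text are placeholders):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is technical SEO?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "The process of optimizing a website so search engines can crawl and index it."
    }
  }]
}
</script>
```

You can validate snippets like this with Google’s Rich Results Test before deploying them.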
The basics of technical SEO will help you in your content creation efforts and make it easier to implement changes when necessary. While this guide is not comprehensive, it should provide a good foundation for anyone new to technical SEO. If you have any questions or need more information, please leave a comment below!