Technical SEO is the process of optimizing your website to help search engines like Google find, understand, and index your pages.

Although modern search engines like Google are relatively good at discovering and understanding content, they’re far from perfect. Technical issues can easily prevent them from crawling, indexing, and showing web pages in the search results.

In this post, we’ll cover a few technical SEO best practices that anyone can implement, regardless of technical prowess.

  1. Ensure important content is ‘crawlable’ and ‘indexable’
  2. Use HTTPS
  3. Fix duplicate content issues
  4. Create a sitemap
  5. Use hreflang for multilingual content
  6. Redirect HTTP to HTTPS
  7. Use schema markup to win ‘rich snippets’
  8. Fix orphaned pages
  9. Make sure your pages load fast
  10. Use schema to improve your chance of Knowledge Graph inclusion
  11. Don’t nofollow internal links

1. Ensure important content is ‘crawlable’ and ‘indexable’

Crawling is how search engines discover most new content. It's the process by which a spider visits known webpages, downloads their content, and follows the links it finds to new URLs.

For example, let’s say you add a new page to your site and link to it from your homepage. When Google next crawls your homepage, it’ll discover the link to the new page. Then, if it decides the content on that page is valuable for searchers, it’ll get indexed.

This process works well, as long as you’re not blocking search engines from crawling or indexing a page.

Robots.txt is the file that tells search engines like Google which pages they can and can’t crawl. You can view it by navigating to yourwebsite.com/robots.txt.
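For example, a robots.txt file containing these two lines blocks all search engines from crawling the entire site:

```
User-agent: *
Disallow: /
```

The wildcard user-agent applies the rule to every crawler, and `Disallow: /` covers every URL on the site.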

In the example above, those two simple lines of code block search engines from crawling every page on the website. So you can see how temperamental this file can be, and how easy it is to make costly mistakes.

You can check which pages (if any) are blocked by robots.txt in Google Search Console. Just go to the Coverage report, toggle to view excluded URLs, then look for the “Blocked by robots.txt” error.


If there are any URLs in there that shouldn’t be blocked, you’ll need to remove or edit your robots.txt file to fix the issue.

However, crawlable pages aren't always indexable. If your webpage has a meta robots tag or X-Robots-Tag HTTP header set to "noindex," search engines won't index the page.
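For reference, a noindex directive in the page's HTML looks like this:

```html
<meta name="robots" content="noindex">
```

The header-based equivalent is an `X-Robots-Tag: noindex` line in the HTTP response, which is useful for non-HTML files like PDFs.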

You can check if that’s the case for any webpage with the free on-page report on Ahrefs’ SEO Toolbar.


To check for rogue noindex tags across all pages, run a crawl with Site Audit in Ahrefs Webmaster Tools and check the Indexability report for “Noindex page” warnings.


Fix these by removing the ‘noindex’ meta tag or x‑robots-tag for any pages that should be indexed.

2. Use HTTPS

HTTPS encrypts the data sent between a website and its visitors. It helps protect sensitive information like credit card details from being compromised.

Given the benefits of HTTPS for web users, it probably comes as no surprise that it’s been a ranking factor since 2014.

How do you know if your site uses HTTPS?

Go to https://www.yourwebsite.com, and check for a lock icon in the address bar.


If you see a red 'Not secure' warning, you're not using HTTPS, and you need to install a TLS/SSL certificate. You can get one for free from Let's Encrypt.


If you see a grey 'Not secure' warning, then you have a mixed content issue. That means the page itself loads over HTTPS, but some of its resource files (images, CSS, etc.) load over HTTP.

There are four ways to fix this issue:

  • Choose a secure host for the resource (if one is available).
  • Host the resource locally (if you’re legally allowed to do so).
  • Exclude the resource from your site.
  • Use a Content Security Policy (CSP) to upgrade insecure requests.
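The last option works through the `upgrade-insecure-requests` CSP directive, a response header that tells browsers to fetch the page's HTTP resources over HTTPS instead:

```
Content-Security-Policy: upgrade-insecure-requests
```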

However, if you have mixed content issues on one page, it’s quite likely that other pages are affected too. To check if that’s the case, crawl your site with Ahrefs Webmaster Tools. It checks for 100+ predefined SEO issues, including HTTP/HTTPS mixed content.

Recommended reading: What is HTTPS? Everything You Need to Know

3. Fix duplicate content issues

Duplicate content is where the same or similar content appears in more than one place on the web. It can happen on one website or across multiple sites.

For example, this post from Buffer appears at two locations:

https://buffer.com/library/social-media-manager-checklist
https://buffer.com/resources/social-media-manager-checklist

Despite what a lot of people think, Google doesn’t penalize sites for having duplicate content. They’ve confirmed this on multiple occasions.

But duplicate content can cause other issues, such as:

  • Undesirable or unfriendly URLs in search results;
  • Backlink dilution;
  • Wasted crawl budget;
  • Scraped or syndicated content outranking you.

You can see pages with duplicate content issues in Google Search Console. Just go to the Coverage report, toggle to view excluded URLs, then look for issues related to duplicates.


Google explains what these issues mean and how to fix them here.

However, Search Console only tells you about URLs that Google has recognized as duplicates. There could very well be other duplicate content issues that Google hasn’t noticed. To find these, run a free crawl with Ahrefs Webmaster Tools and check the Duplicate content report.


Fix the issues by choosing one URL within each group of duplicates to be the ‘canonical’ (main) version.
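You declare the canonical version with a canonical tag in the <head> of each duplicate page. Sticking with the earlier Buffer example (and assuming the /library/ URL is the preferred one), both URLs would point here:

```html
<link rel="canonical" href="https://buffer.com/library/social-media-manager-checklist" />
```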


4. Create a sitemap

Sitemaps list all the important content on your website. They come in various formats, but XML files are the most common.

Here’s what our blog’s sitemap looks like:

[Screenshot: the Ahrefs blog's XML sitemap]
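A minimal XML sitemap follows this structure (the URL and date below are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/blog/my-post/</loc>
    <lastmod>2021-06-15</lastmod>
  </url>
</urlset>
```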

Many people question the importance of sitemaps these days, as Google can usually find most of your content even without one. However, a Google representative confirmed in 2019 that sitemaps remain valuable, stating that they're the second most important source of URLs for Google:

Sitemaps are the second Discovery option most relevant for Googlebot @methode #SOB2019— Enrique Hidalgo (@EnriqueStinson) June 15, 2019

But why is this?

One reason is that sitemaps usually contain ‘orphan’ pages. These are pages that Google can’t find through crawling because they have no internal links from crawlable pages on your website.

Most modern CMSs, including Wix, Squarespace, and Shopify, automatically generate a sitemap for you. If you're using WordPress, you'll need to create one using a popular SEO plugin like Yoast or RankMath.


You can then submit that to Google through Search Console.


It’s worth noting that Google also sees the URLs in sitemaps as suggested canonicals. This can help combat duplicate content issues (see the previous point), but it’s still best practice to use canonical tags where possible.

https://www.youtube.com/watch?v=JLCwGo43fAY&feature=youtu.be&t=3m16s

Recommended reading: How to Create an XML Sitemap (and Submit It to Google)

5. Use hreflang for multilingual content

Hreflang is an HTML attribute used to specify the language and geographical targeting of a webpage. It’s used on sites with variations of pages in other languages or alternative geographic targeting.

For example, we have versions of our homepage in multiple languages, including English and Polish.

Each of these variations uses hreflang to tell search engines about its language and geographic targeting.

There are two main reasons why hreflang is important for SEO:

  1. It helps combat duplicate content. Let's say you have two near-identical pages targeting different countries in the same language. Without hreflang, Google will likely see these pages as duplicates and index only one of them.
  2. It can help rankings. In this video, Google's Gary Illyes explains that pages in a hreflang cluster share ranking signals. That means if you have an English page with lots of links, the Spanish version of that page effectively shares those signals, which may help it rank in Google in other countries.

Implementing hreflang is easy. Just add the appropriate hreflang tags to all versions of the page.

For instance, if you have versions of your homepage in English, Spanish, and German, each version should include hreflang tags referencing all three.
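As a sketch (example.com is a placeholder domain), the tags in each page's <head> might look like this:

```html
<link rel="alternate" hreflang="en" href="https://example.com/" />
<link rel="alternate" hreflang="es" href="https://example.com/es/" />
<link rel="alternate" hreflang="de" href="https://example.com/de/" />
<link rel="alternate" hreflang="x-default" href="https://example.com/" />
```

The x-default line tells Google which version to show searchers whose language doesn't match any of the others.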

Learn more about implementing hreflang and multilingual SEO in the resources below.

6. Redirect HTTP to HTTPS

Even if you're using HTTPS, your website may still be accessible at the HTTP version. That isn't ideal, as there's no point in having HTTPS if visitors can still reach the non-secure version of your website.

To check if this is the case, try to navigate to the HTTP version of your site. If the browser redirects you automatically, then there likely isn’t an issue.


If you're able to access the HTTP version, you need to redirect HTTP to HTTPS.

You can do this by adding the following code to your .htaccess file:

# Force HTTPS with a permanent (301) redirect for any request arriving over HTTP
RewriteEngine On
RewriteCond %{HTTPS} off
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Or, if you’re using WordPress, just change your WordPress Address and Site Address to the HTTPS version under Settings.


It’s also possible to do this at the server level, as explained here. Just make sure to use a permanent (301) redirect when doing this, not a temporary (302) redirect.

Learn more about 301 and 302 redirects in the resources below.

7. Use schema markup to win ‘rich snippets’

Rich snippets are search results with extra information shown under the title, description, and URL.


The benefit of rich snippets is more real estate in the search results, and sometimes an improved clickthrough rate.

However, Google only shows rich snippets for certain types of content, and only if you provide them with information using schema markup. If you haven’t heard of schema markup before, it’s additional code that helps search engines to better understand and represent your content in the search results.

For example, if you had a recipe for Kung Pao Chicken on your site, you could add recipe schema markup to give Google information about cooking time, calories, and more.
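As an illustration (the values below are made up, not a real recipe's data), that markup could be a JSON-LD snippet like this:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Recipe",
  "name": "Kung Pao Chicken",
  "cookTime": "PT30M",
  "nutrition": {
    "@type": "NutritionInformation",
    "calories": "280 calories"
  }
}
</script>
```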

Not only does this provide Google with more information about your page, but it also makes the page eligible for recipe rich snippets, which can show ratings, cooking time, and calorie counts right in the search results.

Learn more about implementing rich snippet schema in the guides below.

8. Fix orphaned pages

Orphaned pages have no internal links from crawlable pages on your website. As a result, search engines can't find or index them (unless they have backlinks from other websites).

It’s often difficult to find orphaned pages with most auditing tools, as they crawl your site much like search engines do. However, if you’re using a CMS that generates a sitemap for you, you can use this as a source of URLs in Ahrefs’ Site Audit. Just check the option to crawl auto-detected sitemaps and backlinks in the crawl settings.


Sidenote.

If the location of your sitemap isn’t in your robots.txt file, and isn’t accessible at yourwebsite.com/sitemap.xml, then you should check the “Specific sitemaps” option in the crawl settings and paste in your sitemap URL(s). 

When the crawl is complete, go to the Links report and check for the “Orphan page (has no incoming internal links)” issue.


If any of the URLs are important, you should incorporate them into your site structure. This might mean adding internal links from your navigation bar or other relevant crawlable pages. If they’re not important, then you can delete, redirect, or ignore them. It’s up to you.

Recommended reading: Internal Links for SEO: An Actionable Guide

9. Make sure your pages load fast

Pages that load slowly are annoying for visitors. That's one of the reasons why Google made page speed a ranking factor on desktop in 2010, and on mobile in 2018.

Unfortunately, page speed is a complex topic. There are many tools and metrics you can use to benchmark speed, but Google's PageSpeed Insights is a reasonable starting point. It gives you a performance score from 0–100 on desktop and mobile and tells you which areas could use some improvement.


But rather than focusing on individual areas, let’s cover a few things that will likely have the most significant positive impact on your page speed for the least effort.

  • Switch to a faster DNS provider. Cloudflare is a good (and free) option. Just sign up for a free account, then swap your nameservers with your domain registrar.
  • Install a caching plugin. Caching stores files temporarily so they can be delivered to visitors faster and more efficiently. If you're using WordPress, WP Rocket and WP Super Cache are two good options.
  • Minify HTML, CSS, and JavaScript files. Minification removes whitespace and comments from code to reduce file sizes. You can do this with WP Rocket or Autoptimize.
  • Use a CDN. A content delivery network (CDN) stores copies of your web pages on servers around the globe and connects each visitor to the nearest one, so requested files have less distance to travel. There are lots of CDN providers out there, but Cloudflare is a good option.
  • Compress your images. Images are usually the biggest files on a web page. Compressing them reduces their size and ensures they take as little time to load as possible. There are plenty of image compression plugins out there, but we like Shortpixel.

Learn more about improving page speed in the video and linked resources below.

Recommended reading: How to Improve Page Speed from Start to Finish (Advanced Guide)

10. Use schema to improve your chance of Knowledge Graph inclusion

Google’s Knowledge Graph is a knowledge-base of entities and the relationships between them. Its data often shows up in SERP features, like this Knowledge Panel for Ahrefs:


While there’s no definitive process for getting in the Knowledge Graph, using organization markup can help.

You can add this with popular WordPress plugins like Yoast and RankMath, or create and add it manually using a schema markup generator.

Just make sure to:

  • Use at least the name, logo, url and sameAs properties
  • Include all social profiles as your sameAs reference (and Wikidata and Wikipedia pages where possible)
  • Validate the markup using Google’s structured data testing tool.

Organization markup typically takes the form of a JSON-LD script in the <head> of the page.
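For illustration, organization markup using the properties above might look like this (all values are placeholders, not Ahrefs' actual markup):

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co",
  "url": "https://example.com/",
  "logo": "https://example.com/logo.png",
  "sameAs": [
    "https://twitter.com/exampleco",
    "https://www.facebook.com/exampleco"
  ]
}
</script>
```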

It doesn’t matter too much which page you add the markup to, but the homepage, contact, or about page is usually your best bet. There’s no need to include it on every page, as Google’s John Mueller confirmed in a 2019 Webmaster Central hangout.

Recommended reading: Google’s Knowledge Graph Explained: How It Influences SEO

11. Don’t nofollow internal links

Nofollow links are commonly used to flag outbound links to pages that you don’t want to endorse. They tell Google not to “pass along ranking credit” to the linked page (although Google may choose to ignore that suggestion).

For that reason, they shouldn’t be used for internal links. Yet, according to our study of the top 110,000 websites, 3.6% of internal links are nofollowed.


Many website owners do this to try to block indexing of pages, but nofollow doesn’t work like that. Using nofollow on internal links can only do harm, as it could cut off crawling and lead to orphaned content.

This is a common issue when it comes to pagination.


To check your site for nofollowed internal links, run a crawl in Ahrefs Webmaster Tools, then go to the Links report and look for related issues.


Fixing this problem is easy. Just remove the nofollow attribute from the affected links.
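For illustration (the URL is a placeholder), the fix is simply dropping the rel attribute:

```html
<!-- Before: crawlers are told not to pass signals through this internal link -->
<a href="https://example.com/category/page-2/" rel="nofollow">Page 2</a>

<!-- After: a normal, followed internal link -->
<a href="https://example.com/category/page-2/">Page 2</a>
```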

Final thoughts

Technical SEO is a complex business, and there are many more best practices than we had time to cover in this post. However, the advice above should be enough to nip the most common technical mishaps in the bud and comfortably put your website's technical performance in the top 10% of the web.

Got questions or suggestions? Ping me on Twitter.