SEO - Canonical URL

Canonical Tags: What Are They?

The rel="canonical" HTML element indicates the primary version among duplicate, nearly identical, or similar pages. In other words, you can use canonical tags to identify which version of a piece of content is the original. Google will generally index the canonical URL and show it to users in search results.

<link rel="canonical" href="https://www.tutorialspoint.com/seo/what-is-seo.htm"/>
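To see which canonical a page declares, the tag can be read straight out of the HTML. The following is a minimal sketch using Python's standard html.parser module; the class and function names (CanonicalFinder, find_canonicals) are illustrative assumptions, not part of any SEO tool.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        # rel may hold several space-separated tokens, so split before matching.
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])


def find_canonicals(html):
    """Return the list of canonical hrefs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


page = '<head><link rel="canonical" href="https://tutorialspoint.com/example-page/"/></head>'
print(find_canonicals(page))
```

The function returns a list rather than a single value so that a page declaring more than one canonical can be detected by the caller.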

Types of Canonical URLs

There are two types of Canonical URLs −

Self-Referencing Canonical URLs
Canonical URLs that point from an alternative page to the preferred page.

Both the user-declared canonical and the Google-selected canonical can be viewed with Google Search Console's URL Inspection tool.

Canonical URLs: Why Are They Important?

Canonical URLs are crucial because Google primarily indexes canonical URLs. In a nutshell, if your website contains duplicated content (that is, pages with the same or nearly the same content), Google will index only one of them, the one it treats as canonical.
If your canonical URLs are set up correctly, Google will usually recognise your choice and treat that page as the official one. However, if you don't provide a canonical for identical or nearly identical pages, Google will choose a canonical based on its own best assessment.
You may not want Google to select that address as canonical. Therefore, you should explicitly declare a canonical URL if you want the best chance that Google uses the version you prefer.

Guidelines for Canonical URLs

Although canonicalization can be complicated, most website owners only need to know a few best practices −

Utilising self-referencing canonical tags

A canonical tag that points to the URL of the page it appears on is called a self-referencing canonical tag.
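Whether a tag is self-referencing can be checked by comparing the page URL with the declared canonical. This is a small sketch using urllib.parse; the function name is_self_referencing is hypothetical, and the comparison deliberately treats any difference in path or query string as a different URL.

```python
from urllib.parse import urlsplit


def is_self_referencing(page_url, canonical_href):
    """Return True when the canonical href names the page's own URL."""
    page = urlsplit(page_url)
    canonical = urlsplit(canonical_href)
    # Scheme and host are case-insensitive; path and query are compared as-is.
    return (
        page.scheme.lower() == canonical.scheme.lower()
        and page.netloc.lower() == canonical.netloc.lower()
        and page.path == canonical.path
        and page.query == canonical.query
    )


print(is_self_referencing(
    "https://tutorialspoint.com/example-page/",
    "https://tutorialspoint.com/example-page/",
))
```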

Avoid using non-canonical URLs in your sitemap

Google treats the URLs listed in a sitemap as suggested canonicals, so it advises listing only canonical URLs there and leaving non-canonical URLs out.
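One way to spot non-canonical URLs in a sitemap is to compare each listed URL with the canonical its page declares. The sketch below assumes you already have that information in a dictionary (the hypothetical canonical_of mapping of page URL to declared canonical); the function name is also an assumption.

```python
import xml.etree.ElementTree as ET

# Namespace used by the sitemaps.org XML format.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def non_canonical_sitemap_urls(sitemap_xml, canonical_of):
    """List sitemap URLs whose declared canonical is a different URL.

    canonical_of is a hypothetical mapping: page URL -> declared canonical URL.
    URLs missing from the mapping are assumed to be their own canonical.
    """
    root = ET.fromstring(sitemap_xml)
    flagged = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        if canonical_of.get(url, url) != url:
            flagged.append(url)
    return flagged


sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://tutorialspoint.com/example-page/</loc></url>
  <url><loc>https://tutorialspoint.com/example-page/?sort=price</loc></url>
</urlset>"""
declared = {
    "https://tutorialspoint.com/example-page/?sort=price":
        "https://tutorialspoint.com/example-page/",
}
print(non_canonical_sitemap_urls(sitemap, declared))
```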

Canonical URLs shouldn't be configured to 404 pages

When a page or resource cannot be found, the server returns a 404 status code. The most common reason is that the page has been removed or is no longer active. A canonical tag should point to a page that exists, because a missing page cannot be indexed.
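Canonical targets can be checked for 4XX responses with a short script. The following sketch uses urllib from the standard library; fetch_status and broken_canonicals are hypothetical helper names, and the status lookup is passed in as a parameter so it can be replaced with a stub when testing.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Return the HTTP status code for a HEAD request (redirects are followed)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4XX/5XX responses; the code is still available.
        return error.code


def broken_canonicals(canonical_urls, get_status=fetch_status):
    """Return (url, status) pairs for canonical targets in the 4XX range."""
    broken = []
    for url in canonical_urls:
        status = get_status(url)
        if 400 <= status < 500:
            broken.append((url, status))
    return broken
```

Some servers do not answer HEAD requests the same way as GET, so treat the result as a first pass rather than a final verdict.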

Multiple Canonical Tags Are Not Acceptable

Placing multiple canonical tags on a single page is not acceptable. Google is likely to disregard all of them and choose a canonical on its own.

Why Do I Have Duplicate Content?

There are numerous causes of duplicate content on a website, including −

Region-specific content, such as an article with separate URLs for the USA and the UK but essentially the same text in the same language.
Device variations, such as a web page with separate mobile and desktop versions.
Protocol variations, such as a website's HTTP and HTTPS versions.
Site functionality, such as the results of sorting and filtering operations on a category page.
Accidental variations, such as a test version of the website unintentionally left open to crawlers.
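Protocol and sorting or filtering variations like those listed above can often be found by normalising URLs and grouping the ones that collapse to the same address. This is a rough sketch, not a complete duplicate detector; the parameter names in IGNORED_PARAMS and the function names are assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical query parameters that only reorder or filter a listing.
IGNORED_PARAMS = {"sort", "order", "filter"}


def normalize(url):
    """Force HTTPS, lowercase the host, and drop sorting/filtering parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    ]
    return urlunsplit(("https", parts.netloc.lower(), parts.path, urlencode(kept), ""))


def group_duplicates(urls):
    """Map each normalised URL to the original URLs that collapse onto it."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    # Only groups with more than one member are duplicate candidates.
    return {key: members for key, members in groups.items() if len(members) > 1}


crawled = [
    "http://tutorialspoint.com/example-page/",
    "https://tutorialspoint.com/example-page/?sort=price",
    "https://tutorialspoint.com/other-page/",
]
print(group_duplicates(crawled))
```

Each group that comes back is a set of pages that probably needs one shared canonical URL.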

Implementing canonical tags: the fundamentals

Implementing canonicals is simple. There are five basic rules that you should always follow −

First Rule: Only Use Absolute URLs

Use the following format: <link rel="canonical" href="https://tutorialspoint.com/example-page/" />

Rather than this one: <link rel="canonical" href="/example-page/" />
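A template or build step can enforce this rule by resolving any relative href against the page URL before the tag is written. The sketch below uses urllib.parse; is_absolute and to_absolute are hypothetical helper names.

```python
from urllib.parse import urljoin, urlsplit


def is_absolute(href):
    """True when href carries both a scheme and a host."""
    parts = urlsplit(href)
    return bool(parts.scheme and parts.netloc)


def to_absolute(page_url, href):
    """Resolve href against the page URL so the tag can carry a full URL."""
    return urljoin(page_url, href)


print(is_absolute("/example-page/"))
print(to_absolute("https://tutorialspoint.com/seo/", "/example-page/"))
```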

Second Rule: Use lowercase URLs

Google may treat the lowercase and uppercase forms of a URL as two distinct URLs. Before using lowercase URLs in your canonical tags, make sure the server is configured to serve lowercase URLs.
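As an illustration, a canonical href can be lowercased before it is emitted. This sketch lowercases only the host and path and leaves the query string alone, on the assumption that query values may be case-sensitive; lowercase_url is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def lowercase_url(url):
    """Lowercase the host and path; leave query and fragment untouched."""
    parts = urlsplit(url)
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), parts.path.lower(), parts.query, parts.fragment)
    )


print(lowercase_url("https://tutorialspoint.com/Example-Page/"))
```

This only helps if the server actually serves the lowercase address, as the rule above requires.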

Third Rule: Select the appropriate domain protocol (HTTP or HTTPS)

Make sure your canonical tags don't contain any HTTP URLs once you've migrated from HTTP to HTTPS. Doing so could send conflicting signals with unexpected consequences. If you're on a secure domain (HTTPS), use the URL in the following format −

<link rel="canonical" href="https://tutorialspoint.com/example-page/" />

Rather than −

<link rel="canonical" href="http://tutorialspoint.com/example-page/" />

The reverse applies if your site is still served over HTTP.
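A protocol mismatch between a page and its canonical is easy to detect programmatically. A minimal sketch, with the hypothetical helper name scheme_mismatch:

```python
from urllib.parse import urlsplit


def scheme_mismatch(page_url, canonical_href):
    """True when the canonical uses a different protocol than the page itself."""
    return urlsplit(page_url).scheme != urlsplit(canonical_href).scheme


print(scheme_mismatch(
    "https://tutorialspoint.com/example-page/",
    "http://tutorialspoint.com/example-page/",
))
```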

Fourth Rule: Use self-referencing canonical tags

When a site runs on a custom CMS (Content Management System), the developer must define the self-referencing URLs. Most current mainstream CMSs, however, do this automatically.

Fifth Rule: Every web page should have a single canonical tag

A page should not contain more than one canonical tag. When there are several rel=canonical declarations, Google will likely ignore all of them.

The Best Way To Use Canonical

Canonical URLs can be specified in five different ways. Canonicalization signals include the following −

HTML tag (rel=canonical).
HTTP header (rel=canonical).
Sitemap.
301 redirect.
Internal linking.
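The HTTP header method is useful for files that have no HTML head section, such as PDFs. The canonical is sent in a Link response header of the form Link: <URL>; rel="canonical". The sketch below only builds the header pair; how it is attached to a response depends on your server or framework, and canonical_link_header is a hypothetical helper name.

```python
def canonical_link_header(url):
    """Build the (name, value) pair for a rel="canonical" Link response header."""
    return ("Link", f'<{url}>; rel="canonical"')


name, value = canonical_link_header("https://tutorialspoint.com/example-page/")
print(f"{name}: {value}")
```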

What Not To Do While Canonicalizing?

The concept of canonicalization is a little complicated. As a result, many people are unsure how to canonicalize correctly.

Those who attempt to canonicalize frequently make the following errors −

Setting the canonicalized URL to "noindex".
Using robots.txt to block access to the canonicalized URL.
Canonicalizing every paginated page to the first page of the series.
Returning a 4XX HTTP status code for the canonicalized URL.
Adding several rel=canonical tags.
Using hreflang without canonical tags.
Placing rel=canonical tags in the <body> section of the source code.
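Some of the mistakes above can be detected from a page's HTML alone. The following is a rough audit sketch built on html.parser; it checks only three of the items (several canonical tags, a canonical tag inside the body, and a canonical tag on a page that also carries a robots noindex meta tag). The class name, function name, and message strings are assumptions for illustration.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Records where canonical tags appear and whether the page is noindexed."""

    def __init__(self):
        super().__init__()
        self.in_body = False
        self.head_canonicals = []
        self.body_canonicals = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        # Valueless attributes arrive as None, so normalise them to "".
        attr_map = {name: (value or "") for name, value in attrs}
        if tag == "body":
            self.in_body = True
        elif tag == "link" and "canonical" in attr_map.get("rel", "").lower().split():
            target = self.body_canonicals if self.in_body else self.head_canonicals
            target.append(attr_map.get("href", ""))
        elif tag == "meta" and attr_map.get("name", "").lower() == "robots":
            if "noindex" in attr_map.get("content", "").lower():
                self.noindex = True


def audit(html):
    """Return a list of human-readable problems found in the HTML string."""
    parser = CanonicalAudit()
    parser.feed(html)
    problems = []
    total = len(parser.head_canonicals) + len(parser.body_canonicals)
    if total > 1:
        problems.append("multiple canonical tags")
    if parser.body_canonicals:
        problems.append("canonical tag inside <body>")
    if parser.noindex and total:
        problems.append("noindex combined with a canonical tag")
    return problems
```

The remaining items in the list (robots.txt blocking, 4XX responses, pagination, and hreflang) need information from outside the page, so they are not covered here.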

Conclusion

Canonical tags are relatively straightforward; they just take some time to understand. Do not forget that canonical tags serve as a hint to search engine crawlers rather than a directive, so search engines might select a different canonical from the one you specify.