Googlebot Initiates Crawling With HTTP/2 Protocol

 

Googlebot can crawl over the HTTP/2 protocol starting from November 2020. Google announced the change in September, and it is now in effect. The Googlebot developer page was updated on November 12, 2020, to reflect it.

According To Google

Generally, Googlebot crawls over HTTP/1.1. However, starting November 2020, Googlebot may crawl sites that may benefit from it over HTTP/2 if it’s supported by the site.

Why HTTP/2 Network Protocol

HTTP/2 is the next major version of HTTP after HTTP/1.1. It transfers data between a browser and a server faster and more efficiently, which decreases the time it takes for a page to be delivered from a server to a browser. It also decreases overhead by compressing HTTP header fields. Under the earlier protocol (HTTP/1.x), a client had to open several connections in parallel to download a page quickly, because effectively only one request at a time could be outstanding on each connection.

With HTTP/2, Googlebot and browsers can take advantage of “multiplexing.” This means several resources can be downloaded as concurrent streams over a single connection, rather than opening multiple connections to download the same web page.

As Per The IETF FAQ Page On GitHub:

HTTP/1.x has a problem called “head-of-line blocking,” where effectively only one request can be outstanding on a connection at a time.

…Multiplexing addresses these problems by allowing multiple request and response messages to be in flight at the same time; it’s even possible to intermingle parts of one message with another on the wire.

This, in turn, allows a client to use just one connection per origin to load a page.
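
To make the difference concrete, here is a deliberately idealized sketch in Python. It is not how Googlebot or any browser measures anything. The resource timings are hypothetical, and the model ignores bandwidth sharing, connection setup, and server processing limits. It only contrasts "one request at a time on a connection" with "all requests in flight at once on one connection."

```python
def http1_single_connection_time(durations_ms):
    """Head-of-line blocking: each request waits for the previous one,
    so the total is the sum of the individual transfer times."""
    return sum(durations_ms)


def http2_multiplexed_time(durations_ms):
    """Idealized multiplexing: all requests are in flight together,
    so the total is bounded by the slowest single transfer."""
    return max(durations_ms, default=0)


# Hypothetical per-resource transfer times in milliseconds
# (for example an HTML page, a stylesheet, a script, and an image).
resources_ms = [120, 80, 200, 50]

print(http1_single_connection_time(resources_ms))  # 450
print(http2_multiplexed_time(resources_ms))        # 200
```

In practice, HTTP/1.x clients narrow this gap by opening several connections per origin, which is exactly the extra server load that a single multiplexed connection avoids.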

HTTP/2 also reduces server congestion and conserves server resources. Reducing the burden on servers is better for websites. Sometimes not only Googlebot but many other bots hit a site at the same time, and the site responds sluggishly because so many server resources are in use. This is bad for users who are trying to view web pages, and bad for publishers if Googlebot cannot crawl the site because the server is overloaded by rogue bots run by hackers and scrapers.

According To Google

…starting November 2020, Googlebot may crawl sites that may benefit from it over HTTP/2 if it’s supported by the site.

This may save computing resources (for example, CPU, RAM) for the site and Googlebot, but otherwise it doesn’t affect indexing or ranking of your site.

HTTP/2 Crawling Opt-Out

Publishers can also opt out of HTTP/2 crawling by configuring the server to send a 421 response code. The Internet Engineering Task Force (IETF) names the 421 status code Misdirected Request. In this case, it tells Googlebot that the server will not answer the request over HTTP/2.

According To The IETF

The 421 (Misdirected Request) status code indicates that the request was directed at a server that is not able to produce a response.
This can be sent by a server that is not configured to produce responses for the combination of scheme and authority that are included in the request URI.

Recommendation By Google’s Developer Page

To opt out from crawling over HTTP/2, instruct the server that’s hosting your site to respond with a 421 HTTP status code when Googlebot attempts to crawl your site over HTTP/2. If that’s not feasible, you can send a message to the Googlebot team (however this solution is temporary).
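
The opt-out rule can be sketched as a small decision function. This is only an illustration of the logic, not Google’s or any web server’s actual configuration. The function name, the protocol strings, and the idea of detecting Googlebot by a substring of the User-Agent header are all assumptions made for this example. A real deployment would express the same rule in the web server’s own configuration.

```python
def status_for_request(http_version: str, user_agent: str) -> int:
    """Return 421 when Googlebot asks over HTTP/2, otherwise 200.

    Assumptions for this sketch: the server reports the protocol as a
    string such as "HTTP/2" or "HTTP/1.1", and Googlebot is recognized
    by the word "Googlebot" in its User-Agent header.
    """
    is_http2 = http_version.upper() in ("HTTP/2", "HTTP/2.0")
    is_googlebot = "googlebot" in user_agent.lower()
    if is_http2 and is_googlebot:
        return 421  # Misdirected Request: decline HTTP/2 crawling
    return 200      # serve everything else normally


# Hypothetical User-Agent value used only for illustration.
googlebot_ua = "Mozilla/5.0 (compatible; Googlebot/2.1)"

print(status_for_request("HTTP/2", googlebot_ua))    # 421
print(status_for_request("HTTP/1.1", googlebot_ua))  # 200
```

Regular visitors on HTTP/2 are unaffected in this sketch, because the rule only fires when both conditions hold.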

Googlebot’s ability to crawl using the HTTP/2 protocol is good news for publishers. It can decrease server load and make it easier for Googlebot to crawl sites.
