Indexing HTTPS pages by default
December 17th, 2015 | Published in Google Online Security
[Cross-posted from the Webmaster Central Blog]
At Google, user security has always been a top priority. Over the years, we’ve worked hard to promote a more secure web and to provide a better browsing experience for users. Gmail, Google search, and YouTube have had secure connections for some time, and we also started giving a slight ranking boost to HTTPS URLs in search results last year. Browsing the web should be a private experience between the user and the website, and must not be subject to eavesdropping, man-in-the-middle attacks, or data modification. This is why we’ve been strongly promoting HTTPS everywhere.
As a natural continuation of this, today we'd like to announce that we're adjusting our indexing system to look for more HTTPS pages. Specifically, we’ll start crawling HTTPS equivalents of HTTP pages, even when the former are not linked to from any page. When two URLs from the same domain appear to have the same content but are served over different protocol schemes, we’ll typically choose to index the HTTPS URL if:
- It doesn’t contain insecure dependencies.
- It isn’t blocked from crawling by robots.txt.
- It doesn’t redirect users to or through an insecure HTTP page.
- It doesn’t have a rel="canonical" link to the HTTP page.
- It doesn’t contain a noindex robots meta tag.
- It doesn’t have on-host outlinks to HTTP URLs.
- The sitemaps lists the HTTPS URL, or doesn’t list the HTTP version of the URL.
- The server has a valid TLS certificate.
Although our systems prefer the HTTPS version by default, you can also make this clearer for other search engines by redirecting your HTTP site to your HTTPS version and by implementing the HSTS header on your server.
We’re excited about taking another step forward in making the web more secure. By showing users HTTPS pages in our search results, we’re hoping to decrease the risk for users to browse a website over an insecure connection and making themselves vulnerable to content injection attacks. As usual, if you have any questions or comments, please let us know in the comments section below or in our webmaster help forums.