Handling domain redirection while maintaining different Google crawl attributes using Cloudflare

Ghouse Mohamed

August 31, 2023

neeto is a collection of different software products. Each neeto product gets its own page. For example, NeetoCal gets the URL https://www.neeto.com/neetocal.

When it comes to actually using these products, you are redirected to https://subdomain.neetocal.com. Here "subdomain" is the subdomain allocated to you when you signed up for neeto.

We planned to add a "Google sign in" feature to make it easier for people to both sign up and log in. During testing it all worked fine. However, when we asked Google to approve the NeetoCal app for "Google sign in", Google required that our users be able to see the "Privacy Policy" and "Terms of Service" on the website. To satisfy this requirement, we added a redirection from "neetocal.com" to "neeto.com/neetocal".

However, Google was not satisfied with this. Users log in at https://subdomain.neetocal.com, so the "Privacy Policy" and "Terms of Service" must be visible on the domain "neetocal.com" itself.

We use Cloudflare as our DNS provider. Using the tools Cloudflare provides, we decided to show the content of "neeto.com/neetocal" on "neetocal.com" without redirecting the user to another domain.

Note that in this case, if you type "neetocal.com", you will see the URL change to "neetocal.com/neetocal" instantly. That's because the URL of the main marketing site is "neeto.com/neetocal".

Cloudflare provides Page Rules, which we used to achieve this. Below is a video of how it was done.

SEO duplicate content issue

The Google search engine doesn't like it when exactly the same content is shown on two different domains. Google may treat this as an attempt to manipulate its rankings, and both sites can suffer in search results.

We want Google to index our main marketing site https://neeto.com/neetocal and to ignore "neetocal.com". One way to tell Google not to index a site is to add a noindex meta tag.

<meta name="robots" content="noindex">

In the above example, we are asking all bots not to index the page containing this meta tag.

We planned to inject this meta tag when a page is rendered for the URL "neetocal.com" and to leave it out when the page is rendered for the URL "neeto.com".
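To make the idea concrete, here is a minimal sketch of host-based injection in Python. It is not our actual rendering code: the names NOINDEX_HOSTS, normalize_host, robots_meta_for and render_head are hypothetical, and we assume the server can read the Host header of the incoming request.

```python
NOINDEX_META = '<meta name="robots" content="noindex">'

# Assumption: the only host we want kept out of the index is neetocal.com.
NOINDEX_HOSTS = {"neetocal.com"}


def normalize_host(host_header: str) -> str:
    """Lower-case the Host header, drop any port and a leading 'www.'."""
    host = host_header.strip().lower().split(":")[0]
    return host[4:] if host.startswith("www.") else host


def robots_meta_for(host_header: str) -> str:
    """Return the noindex tag for hosts that must not be indexed, else ''."""
    if normalize_host(host_header) in NOINDEX_HOSTS:
        return NOINDEX_META
    return ""


def render_head(host_header: str, title: str) -> str:
    """Build a <head> element, adding the robots tag only where needed."""
    parts = [f"<title>{title}</title>"]
    meta = robots_meta_for(host_header)
    if meta:
        parts.append(meta)
    return "<head>" + "".join(parts) + "</head>"
```

With this sketch, render_head("neetocal.com", "NeetoCal") includes the noindex tag, while render_head("www.neeto.com", "NeetoCal") does not.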

Upon further research, we found that search engines also look at the response headers. Given below is the sequence that search engines follow when indexing web pages.

  • Crawler gets the raw page source as a response to the HTTP request.
  • Crawler checks if x-robots-tag: noindex, nofollow header is present in the response.
  • Crawler checks the meta tags to determine if the page needs to be indexed or not.

If a page has x-robots-tag: noindex, nofollow in its response headers, then the crawler will not index the page.
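The sequence above can be sketched as a small Python function. This is a simplified illustration, not how any real crawler is implemented: is_indexable, parse_directives and RobotsMetaParser are hypothetical names, and only the noindex directive is considered.

```python
from html.parser import HTMLParser


def parse_directives(value):
    """Split a value such as 'noindex, nofollow' into a set of directives."""
    return {d.strip().lower() for d in value.split(",") if d.strip()}


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            self.directives.update(parse_directives(attr.get("content", "")))


def is_indexable(headers, html):
    """Check the response header first, then the meta tags in the page."""
    header_value = next(
        (v for k, v in headers.items() if k.lower() == "x-robots-tag"), ""
    )
    if "noindex" in parse_directives(header_value):
        return False  # the header alone is enough to skip indexing
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" not in parser.directives
```

The point of the sketch is that the header check needs no change to the page itself, which is why setting the header at the proxy layer is enough.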

Based on this information, we decided to use the Response Header Modification Rules feature of Cloudflare. Below is a video of how it was done.
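The rule itself is configured in the Cloudflare dashboard, not in code. Its effect, though, can be described with a small Python sketch. The function name with_robots_header is hypothetical and this is not Cloudflare's API; we also assume the rule matches only the exact hostname "neetocal.com".

```python
from typing import Dict

# Assumption: the rule matches this hostname exactly.
NOINDEX_HOST = "neetocal.com"
ROBOTS_HEADER = "X-Robots-Tag"
ROBOTS_VALUE = "noindex, nofollow"


def with_robots_header(host: str, headers: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the response headers, adding the robots header
    only when the request was made to the host we want left unindexed."""
    updated = dict(headers)  # never mutate the caller's headers
    if host.strip().lower() == NOINDEX_HOST:
        updated[ROBOTS_HEADER] = ROBOTS_VALUE
    return updated
```

Responses served for "neeto.com" pass through unchanged, so the main marketing site stays indexable while the same content on "neetocal.com" carries the noindex header.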
