Sometimes, your WordPress website might not appear on Google's search results page. One possible reason is that the website is blocking Google's crawlers from indexing it.
In this article, you will find the basic checks you can run to ensure that the website is visible to and indexable by Google's crawlers.
Ensure that WordPress allows access to the website
WordPress can prevent Google from accessing/reading/indexing your website.
Ensure that the "Discourage search engines from indexing this site" option under WordPress Dashboard → Settings → Reading is disabled (unchecked).
For more granular control over what Google indexes and shows on its search results page, check this complete guide on How to Get My Site indexed by Google.
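One way to double-check the Reading setting from outside the Dashboard is to look at the HTML of a page for a robots meta tag containing "noindex". The following is a minimal sketch in Python, assuming that WordPress emits such a tag when the option is enabled (the exact markup can differ between WordPress versions and SEO plugins). The names RobotsMetaFinder and has_noindex and the sample HTML are hypothetical and used only for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects the content attribute of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            self.directives.append(attr_map.get("content", "").lower())


def has_noindex(html: str) -> bool:
    """Return True if any robots meta tag in the HTML contains 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return any("noindex" in directive for directive in finder.directives)


# Hypothetical page source, as it might look with the option enabled.
sample = "<html><head><meta name='robots' content='noindex, nofollow' /></head></html>"
print(has_noindex(sample))
```

If this check reports a noindex tag on your own pages, the option in Settings → Reading (or an SEO plugin) is the first place to look.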
Ensure that Google's access is not being blocked
Google's access can be blocked through the robots.txt file. This file, which contains instructions for web crawlers such as Google's, is located in the root directory of your website.
Using a Hosting File Manager app or a WordPress file manager plugin such as the Advanced File Manager plugin, check whether the root folder contains a file called robots.txt.
If there is one, check its content and make sure that Google's crawlers are not being blocked. To allow Google access, the file should contain the following rules:
User-agent: *
Disallow:
With the above rules in the robots.txt file, your website tells all web crawlers that they may crawl every page, including the homepage.
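To confirm how a given set of rules will be interpreted, you can test it locally. The sketch below uses Python's standard urllib.robotparser module; the helper name is_allowed, the "Googlebot" user-agent string, and the example.com URL are illustrative assumptions, and real search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Parse robots.txt content and report whether user_agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The rules shown above: an empty Disallow blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# For comparison: a single slash blocks the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

print(is_allowed(ALLOW_ALL, "Googlebot", "https://example.com/"))
print(is_allowed(BLOCK_ALL, "Googlebot", "https://example.com/"))
```

The only difference between the two rule sets is the slash after Disallow, which is why it is worth checking the file carefully.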
How does robots.txt work?
Search engines have two main jobs:
Crawling the web to discover content;
Indexing that content so that it can be served up to searchers who are looking for information.
To crawl sites, search engines follow links to get from one site to another — ultimately crawling across many billions of links and websites. This crawling behavior is sometimes known as “spidering.”
After arriving at a website but before crawling it, the search crawler will look for a robots.txt file.
If it finds one, the crawler will read that file before continuing through the site. Because the robots.txt file contains information about how the search engine should crawl, the rules found there determine what the crawler does next on that site.
If the robots.txt file does not contain directives that disallow a user-agent's activity (or if the site doesn't have a robots.txt file), the crawler will go on to crawl the rest of the site.
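The decision described above can be sketched in a few lines of Python. This is a simplified illustration, not how any particular search engine is implemented: the function name crawler_may_fetch, the crawler name ExampleBot, and the /private/ path are hypothetical, and a missing robots.txt file is represented by None.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser


def crawler_may_fetch(robots_txt: Optional[str], user_agent: str, url: str) -> bool:
    """Mimic a well-behaved crawler deciding whether to fetch a URL."""
    # No robots.txt on the site: nothing is disallowed.
    if robots_txt is None:
        return True
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules: ExampleBot is kept out of /private/,
# every other crawler may go anywhere.
RULES = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow:
"""

print(crawler_may_fetch(RULES, "ExampleBot", "https://example.com/private/page"))
print(crawler_may_fetch(RULES, "Googlebot", "https://example.com/private/page"))
print(crawler_may_fetch(None, "Googlebot", "https://example.com/private/page"))
```

Rules are matched per user-agent, so a directive that blocks one crawler does not affect the others.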
Other things to know about robots.txt:
The robots.txt file must be placed in a website’s top-level (root) directory.
The file name is case sensitive: the file must be named robots.txt (not Robots.txt, robots.TXT, or otherwise).
Some user agents (robots) may choose to ignore your robots.txt file. This is especially common with crawlers like malware robots or email address scrapers.
The robots.txt file should be publicly accessible so that crawlers can read it.
Each subdomain on a root domain uses a separate robots.txt file. This means that both blog.example.com and example.com should have their own robots.txt files (at blog.example.com/robots.txt and example.com/robots.txt).
It's generally a best practice to indicate the location of any sitemaps associated with the domain at the bottom of the robots.txt file.
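The last two points can be illustrated with a short Python sketch. It derives the robots.txt address for whichever host a page lives on, and reads the Sitemap lines from a robots.txt file. The helper names robots_url_for and listed_sitemaps and the sitemap URL are assumptions made for this example; site_maps() requires Python 3.8 or later.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_url_for(page_url: str) -> str:
    """Return the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host (including any subdomain), replace the path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def listed_sitemaps(robots_txt: str) -> list:
    """Return the sitemap URLs declared in robots.txt content."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # site_maps() returns None when no Sitemap line is present.
    return parser.site_maps() or []


# Hypothetical robots.txt with the sitemap listed at the bottom.
EXAMPLE = """\
User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
"""

print(robots_url_for("https://blog.example.com/2024/post"))
print(robots_url_for("https://example.com/about"))
print(listed_sitemaps(EXAMPLE))
```

Because the robots.txt address depends on the host name, a rule placed in example.com/robots.txt does not apply to blog.example.com.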