Site owners have always wanted to know whether Google has indexed all of their site’s pages, and for an excellent reason. I mean, you don’t have to be a genius to understand that if you want to receive Google search traffic, your site has to be indexed (duh!).
But Google never quite revealed to webmasters exactly how many of their pages are indexed. Even with the site: operator (site:example.com), Google only shows a lower, representative number of pages, which I’m sure left many webmasters frustrated and possibly cursing the Google gods.
However, that already belongs to Google’s dark ages, because the company is apparently trying to be more transparent (even though it’s sometimes very confusing), and as such it’s now offering a more complete and accurate view of how a website has been indexed so far.
That’s right, it is now possible to see an overview of how Googlebot has scoured your website in the new “Index Status” feature in Google Webmaster Tools (under the “Health” section). When you open it, you’ll land on the “Basic” graph option.
The Basic graph shows, week by week, how many of your site’s pages are in the index and ready to appear in the search results, over the past year (up to last week, to be exact).
Unless you removed a significant number of pages, you should see this graph ascending, meaning more of your pages are being added to the index. If instead the graph is descending, something went wrong and Google is removing more of the website’s pages than it is adding.
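If you jot down the weekly numbers from the graph, a few lines of Python can point out the weeks where the count fell. This is only a rough sketch: the `weekly_drops` function, the `threshold` parameter and the sample numbers are all made up for illustration, and nothing here talks to Webmaster Tools itself.

```python
def weekly_drops(counts, threshold=0.0):
    """Return (week_index, previous, current) for each week-over-week decline
    whose relative size is larger than `threshold` (0.05 means 5%)."""
    drops = []
    for i in range(1, len(counts)):
        prev, cur = counts[i - 1], counts[i]
        # Skip weeks with nothing indexed before, to avoid dividing by zero.
        if prev > 0 and (prev - cur) / prev > threshold:
            drops.append((i, prev, cur))
    return drops


# Hypothetical weekly indexed-page counts, copied by hand from the graph.
sample_counts = [1200, 1250, 1310, 1290, 1100, 1150]

for week, prev, cur in weekly_drops(sample_counts, threshold=0.05):
    print(f"Week {week}: indexed pages fell from {prev} to {cur}")
```

A small threshold like 5% keeps ordinary week-to-week wobble from being flagged, so only the bigger drops that deserve a closer look show up.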
A few things that can cause a drop in the Index Status are a misconfigured robots.txt file, a long period during which the site’s server was down, or canonicalization issues. By the way, the Index Status count doesn’t include non-canonical (duplicate) URLs, so there might be a gap between the number of pages Googlebot crawled and the number of pages eventually indexed.
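To rule out the robots.txt suspect, you can test a handful of important URLs against your rules with Python’s standard `urllib.robotparser` module. Treat this as a minimal sketch: the `blocked_urls` helper, the sample rules and the example.com URLs are assumptions for illustration, and the standard-library parser may not interpret every rule exactly the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt left over from a staging site: it blocks everything.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /
"""

# Hypothetical pages you expect to be indexed.
SAMPLE_URLS = [
    "http://example.com/",
    "http://example.com/blog/post-1",
]

for url in blocked_urls(SAMPLE_ROBOTS, SAMPLE_URLS):
    print("Blocked by robots.txt:", url)
```

Paste in the contents of your real robots.txt and a few URLs you care about; if pages you want indexed show up in the output, that is a likely place to start looking.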
Anyway, if you want to dive deeper into the Index Status data, just hit the Advanced graph option button.
On the Advanced option, besides the total number of indexed pages, you can also see a few other things:
- Ever crawled: A historical view of all your pages that Google has crawled, regardless of whether they appear in the index or not.
- Not selected: Pages that Google chose not to show in the index (and in the search results), such as duplicate content pages or 301 redirects.
- Blocked by robots: Pages that are blocked by the robots.txt file.
Personally, I think the Index Status feature can be extremely useful for checking whether your website is in a good state and whether any problems are affecting the site’s appearance in the index. Also, I can’t help wondering (and feeling a bit ungrateful about it) why it took Google so long to roll out this feature.