Indexed, Though Blocked by robots.txt
This status indicates that Google has discovered the page, but Googlebot is blocked from crawling it due to rules in your robots.txt file.
Even though crawling is blocked, Google may still index the page if it finds the URL through other sources, such as:
- External links
- Internal links
- Previously indexed versions of the page
As a result, the page can appear in Google Search results even though Google has not crawled its content.
Why does this happen?
- The page URL is disallowed in your robots.txt file
- Google already knows about the URL from links or past crawls
- robots.txt blocks crawling, not indexing
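The crawl-blocking half of this behavior can be illustrated with Python's standard-library `urllib.robotparser`. This is a minimal sketch, not a model of Googlebot itself: the robots.txt content, the `/private/` path, the example.com URLs, and the helper name `is_crawl_allowed` are all assumptions made for illustration. It shows only that a Disallow rule answers the question "may this URL be fetched?" and says nothing about whether the URL may be indexed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /private/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A URL under the disallowed path cannot be crawled...
blocked = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "https://example.com/private/report.html"
)
# ...while a URL outside it can.
allowed = is_crawl_allowed(
    ROBOTS_TXT, "Googlebot", "https://example.com/public/index.html"
)

print(blocked, allowed)
```

Note that the check has no notion of indexing at all: a blocked URL that is linked from elsewhere remains discoverable, which is how this status arises.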
Recommended action (Best Practice)
If your goal is to prevent the page from appearing in Google Search results, blocking it in robots.txt alone is not sufficient.
Correct way to handle it:
- Remove the robots.txt block for that page or path
- Add a noindex directive to the page (via meta tag or HTTP header)
- Allow Googlebot to crawl the page so it can see the noindex instruction
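To verify that a page actually carries a noindex directive, a check along the following lines may help. It is a rough sketch using only the Python standard library: the helper names `RobotsMetaParser` and `has_noindex` are hypothetical, the sample HTML and headers are invented for illustration, and the parsing is simplified (for example, it does not handle an `X-Robots-Tag` value that is scoped to a specific user agent). It looks for noindex in either place the directive can live: a robots meta tag in the HTML or an `X-Robots-Tag` HTTP response header.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> or <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None when the attribute has no value.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]


def has_noindex(headers: dict, html: str) -> bool:
    """Return True if noindex appears in the X-Robots-Tag header or a robots meta tag."""
    header_value = ""
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":  # header names are case-insensitive
            header_value = value
    header_directives = [d.strip().lower() for d in header_value.split(",")]

    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in header_directives or "noindex" in parser.directives


# Hypothetical page using the meta tag form.
PAGE_HTML = (
    '<html><head><meta name="robots" content="noindex"></head>'
    "<body>Private</body></html>"
)

via_meta = has_noindex({}, PAGE_HTML)
via_header = has_noindex({"X-Robots-Tag": "noindex, nofollow"}, "<html></html>")
missing = has_noindex({}, "<html><head></head><body></body></html>")

print(via_meta, via_header, missing)
```

A positive result only matters if the page is also crawlable: while the robots.txt block remains in place, Googlebot never fetches the page and so never sees the directive.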
