Most site owners are eager for Google to index their websites. However, there are instances in which you don’t want Google to index content on your site. Perhaps you have content for internal use only or accidentally used copyrighted material.
There are several relatively easy ways to remove a page from Google. Be aware that none of these options is immediate. All will take time to register on the Google search result pages.
Methods for Removing a Page from Google
1. Robots Exclusion Protocol
The first option involves using the robots meta-element NOINDEX. You can warn Google off from indexing a webpage by adding NOINDEX to the <head> area of a webpage. Problem solved. Now all you have to do is wait for Google and Bing to register the change.
Code Example:
<meta name="robots" content="noindex">
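Once the tag is in place, it can be worth confirming that the page's HTML really carries it. The following is a minimal sketch using only Python's standard library; the class name RobotsMetaParser, the function robots_directives, and the sample page are illustrative assumptions, not part of any Google tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # Attribute values can be None, so fall back to an empty string.
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page used only for illustration.
page = '<html><head><meta name="robots" content="noindex"></head><body></body></html>'
print("noindex" in robots_directives(page))  # True
```

The check only inspects the HTML you pass in. It says nothing about whether Google has already recrawled the page and acted on the tag.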
You can also implement an exclusion protocol with a robots.txt file. A Disallow statement in robots.txt tells Googlebot not to crawl the pages identified in the file. Unfortunately, blocking crawling is not the same as blocking indexing: if other sites link to a disallowed page, Google may still index the URL and show it in search results without a description. A disallowed page also cannot be crawled, so Google will never see a NOINDEX tag placed on it; use one method or the other for a given page, not both. Nor is robots.txt a reliable way to keep these pages from skewing your analytics.
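To see how a Disallow rule is interpreted, you can test it locally before publishing the file. This is a rough sketch using Python's standard urllib.robotparser module; the rules, the /internal/ directory, and the example.com URLs are assumptions made up for illustration. Python's parser is not Googlebot's own, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: block every crawler from /internal/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the rules above allow user_agent to crawl url."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://www.example.com/internal/report.html"))  # False
print(is_crawlable("https://www.example.com/blog/post.html"))  # True
```

A False result only means the page should not be crawled. It does not tell you whether the URL is in Google's index.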
2. Google Search Console
Google Search Console (GSC) is a free tool that can be quickly and easily activated. Using Google Search Console, you can select a page to be removed from Google’s index. Bing also offers its own webmaster console that you can use to remove webpages from its index.
In order to remove a URL using GSC:
- Log into Google Search Console and select the website you’d like to manage.
- Select “Optimization” in the left-hand navigation menu.
- Select the “Remove URL” option in the sub-menu.
- Select “Create a new request for removal.”
- Enter the URL for the page you want to be removed and confirm your choice.
- Wait. It may take up to 48 hours.
- GSC also now provides an update on the state of a removal request. If your request is denied, clicking “Learn more” will give you more information.
If you change your mind, you can cancel a removal request in Google Search Console by clicking “Reinclude” beside the URL.
Keep in mind that a removal using GSC or Bing Webmaster Tools may only be temporary. After 90 days, the removal request will expire. The webpage may be re-added to the index if there are still links pointing to the page and it remains available for indexing. For this reason, it is still important to include a NOINDEX tag in the <head> of the webpage to ensure it isn’t indexed again.
Lastly, the remove page request in GSC is not the appropriate tool for fixing canonicalization issues or managing web migrations. There are more successful methods for resolving these situations with Google.
3. Hide the Page Behind a Login
Google is unable to index pages that require a login for access. A sure-fire way to ensure internal content isn’t mistakenly indexed by Google is to hide these pages behind a password. Secure information, such as personal and financial data, should always be protected in this manner.
4. Remove Old Content but Not the URL
Sometimes you just want to refresh Google’s cache of your content. Once you’ve updated the content on the page, you can wait for Google to re-index your content or you can request that Google remove the cache.
To request that the cache be removed until the page is crawled again, go to Google Search Console and proceed as you would to remove a page from the index. To keep the URL but delete the cache, select “Remove page from cache only.”
You can also add a NOARCHIVE tag to the page. This will prevent the page from being cached by Google until the tag is removed.
5. Delete the Page
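Certainly, the most reliable way to prevent a webpage from being indexed is to delete it from the server. Requests for the deleted URL will then return a 404 (Not Found) status code, or a 410 (Gone) if you configure the server to send one. Once Google recrawls the URL and receives a 404 or 410, it will remove the page from its index.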
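The difference between the two status codes can be sketched with a small local server. This is only an illustration built on Python's standard http.server module; the paths in REMOVED_PATHS and the names status_for, RemovedPageHandler, and make_server are hypothetical, and a production site would normally set this up in its web server configuration instead.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical paths that were deliberately deleted from the site.
REMOVED_PATHS = {"/old-press-release.html", "/internal/draft.html"}


def status_for(path, removed=REMOVED_PATHS):
    """Return 410 (Gone) for deliberately removed paths, 404 otherwise."""
    return 410 if path in removed else 404


class RemovedPageHandler(BaseHTTPRequestHandler):
    """Answers every GET with 410 or 404; a real server would also serve live pages."""

    def do_GET(self):
        self.send_response(status_for(self.path))
        self.end_headers()


def make_server(port=8000):
    """Build (but do not start) a local server that uses the handler above."""
    return HTTPServer(("127.0.0.1", port), RemovedPageHandler)


print(status_for("/old-press-release.html"))  # 410
print(status_for("/never-existed.html"))  # 404
```

To try it locally, call make_server().serve_forever() and request one of the paths in a browser. Either status code signals that the page is gone; 410 simply states that the removal was intentional.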
6. Remove Content on Another Website
If another site is infringing on your copyright by displaying your content on their website, you can file a Digital Millennium Copyright Act notice to request the removal of the content. You can also request removal using the public removal tool. You should also try contacting the site owner and asking that the content be removed or modified.
Methods That Don’t Work
1. Not Linking
There is no guarantee that eliminating links to a page will keep the page from being crawled by Google. You may be able to limit access to the page on your own site but you have no control over links elsewhere.
2. NOFOLLOW Attribute
Though a NOFOLLOW attribute in a link will stop Google from following that link, there is no way to prevent links to your page that do not have the NOFOLLOW attribute. Also, the NOFOLLOW attribute does not prevent a page from being indexed.
3. Forms, JavaScript, Flash
Customizing content with form inputs, JavaScript and Flash used to be a reliable way to limit indexing. However, Google is significantly better at crawling these formats and continues to improve.
Conclusion
If you’ve followed all the rules for removing content and yet it still appears, you may be dealing with multiple URLs displaying duplicate content or an outdated cache. This may require you to request the removal of several outdated pages.
Finally, keep in mind that there are consequences for removing pages from your site. It can have a temporary impact on your search engine rankings. Proceed cautiously with these methods for removing pages and managing cached content.
For more information about how to remove content from Google, such as images, products, and videos, Google has a fairly useful help site that can guide you through the process.