Latest discoveries about deindexations and Search Engine crawling


Every time a blog gets deindexed on EBN, we save all the information we have about it so we can analyze it later. We then run batch analyses to see if there are any patterns or footprints that we can report to the community.

In the last few weeks, we found a few interesting things. We’re still discussing how to build these into Blog Health to improve deindexation prevention, but they’re useful insights nonetheless.

Here’s what we found.

Search Engine Crawler is visiting old URLs all the time

If you buy an expired domain with history, its old URLs get crawled often and for long periods of time. This can go on for months, even if the URLs return a 404 error.
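If you want to see this on your own blog, one way is to count crawler requests that ended in a 404 in your server's access log. The snippet below is only a rough sketch, not how we do it at EBN: it assumes the log is in Combined Log Format and that crawlers can be recognised by the user-agent substrings in `CRAWLER_TOKENS`. Both are assumptions you may need to adjust, and the sample log lines are made up.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "METHOD path protocol" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: crawlers identify themselves with one of these substrings.
CRAWLER_TOKENS = ("googlebot", "bingbot")


def crawler_404_hits(lines):
    """Count crawler requests that returned 404, grouped by URL path."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the expected format
        agent = match.group("agent").lower()
        is_crawler = any(token in agent for token in CRAWLER_TOKENS)
        if is_crawler and match.group("status") == "404":
            hits[match.group("path")] += 1
    return hits


# Made-up sample lines, for illustration only.
sample = [
    '66.249.66.1 - - [10/Mar/2015:06:25:24 +0000] "GET /old-post/ HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [11/Mar/2015:07:01:02 +0000] "GET /old-post/ HTTP/1.1" 404 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [11/Mar/2015:07:05:00 +0000] "GET /old-post/ HTTP/1.1" 404 512 "-" "Mozilla/5.0 (Windows NT 6.1)"',
    '66.249.66.1 - - [11/Mar/2015:08:00:00 +0000] "GET / HTTP/1.1" 200 2048 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]

for path, count in crawler_404_hits(sample).most_common():
    print(path, count)
```

Paths that keep showing up in this list are the old URLs the crawler still remembers, which makes them good candidates for rebuilding with relevant content.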

Unrelated content and/or language can cause deindexation

We found that rebuilding old domains with unrelated content and/or in a different language can cause deindexation.

Comments increase Search Engine Crawler visits

The comment feed is checked daily. If there are no comments, the blog is crawled less often.

Blocking crawlers can cause deindexation

We did not find any issues with users of Spider Blocker. However, a lot of users install more than one blocker plugin and block additional crawlers. Do NOT do this. Use one blocker and block as few crawlers as possible.
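A quick sanity check is to confirm that your blocking rules don't catch the search engine crawlers themselves. The sketch below assumes the rules live in robots.txt and uses Python's standard `urllib.robotparser` to test them; blocker plugins that work at the server level (for example through .htaccess rules) won't show up here and need a different check. The crawler names in `SEARCH_CRAWLERS` and the sample robots.txt are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Assumption: these are the search engine crawlers you want to keep unblocked.
SEARCH_CRAWLERS = ("Googlebot", "Bingbot")


def blocked_search_crawlers(robots_txt, url="/"):
    """Return the search engine crawlers that robots.txt forbids from `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [bot for bot in SEARCH_CRAWLERS if not parser.can_fetch(bot, url)]


# Made-up robots.txt where a second blocker has also shut out Bingbot.
robots = """\
User-agent: AhrefsBot
Disallow: /

User-agent: Bingbot
Disallow: /

User-agent: *
Disallow:
"""

print(blocked_search_crawlers(robots))
```

An empty list means none of the listed search engine crawlers is blocked by robots.txt. Anything else is worth fixing before it costs you your indexation.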

Some domains are permanently penalized

Some penalized domains never get any crawler traffic and will therefore never get indexed. Unfortunately, we don’t yet have data on how long this penalty can persist or what its root cause is (email spam, malware, phishing etc.).

Search Engine Crawler still visits the blog after deindexation (!)

When a domain gets removed from the SERP (deindexed), the old URLs still get crawled regularly and that stops only after 5-7 days. This could mean there are still options to save your blog after it gets deindexed by rebuilding URLs with relevant content.

Because we use a passive indexation check, our indexation status can lag by 7-14 days (even though Blog Health is checked daily).

Summary

Here’s a quick recap:

  • Rebuild URLs with relevant content that would fit on the old domain. Use the same language.
  • Check domains in spam and malware databases before buying them.
  • Use only one spider blocker. We recommend our free Spider Blocker plugin. Block only the most important crawlers.
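For the blacklist check, many spam and malware databases can be queried over DNS: you prepend the domain to the blocklist's zone and see whether the name resolves. Here is a minimal sketch of that idea. The zone `dbl.example.org` is a hypothetical placeholder, not a real service, and some blocklists return special codes for rate-limited or refused queries, so check the operator's documentation before relying on the result.

```python
import socket


def blocklist_query_name(domain, zone):
    """Build the DNS name used to look up `domain` in a domain blocklist zone."""
    return f"{domain.strip().rstrip('.').lower()}.{zone}"


def is_listed(domain, zone):
    """Return True if the blocklist zone has a record for the domain.

    Needs network access. A failed lookup is treated as "not listed",
    which is an assumption: it can also mean the query itself failed.
    """
    try:
        socket.gethostbyname(blocklist_query_name(domain, zone))
    except socket.gaierror:
        return False
    return True


# Hypothetical zone name, for illustration only.
print(blocklist_query_name("Example.COM.", "dbl.example.org"))
# To run a real check, call is_listed("example.com", "<your blocklist zone>").
```

Running this against a few blocklists before you buy a domain takes seconds and can save you from a domain that never gets crawled.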

While none of this is a complete surprise, it’s just something that we can now confirm with data, not just speculation.

In the future, we’re going to start collecting even more information about domains – from social metrics to backlinks and blacklist databases. Once we have that, our analysis and deindexation prevention will greatly improve.


9 thoughts on “Latest discoveries about deindexations and Search Engine crawling”

    1. Dejan Murko Post author

      It’s definitely one of the flags that can get your blogs deindexed. But it’s the same as with other flags – one (usually) is not a problem, but have a few and you’re looking at a very high chance of being deindexed.

  1. Mike Haydon

    Thanks for sharing this Dejan. It’s particularly interesting that even after deindexing the crawlers keep coming back. The unrelated content is something I’ve been testing and early results seem to back up what you’re saying here.

  2. Paul Tufts

    Dejan – really excellent work digging into the deindexation issues and accumulating the data. I feel that this type of extra input is a very valuable benefit to being part of the easy blog community. Thanks so much for all your hard work.

  3. Samreen Zahra

    Thanks for sharing the knowledge. As you mentioned penalized pages, previously new URLs and new content was helpful and it was an alternative for penalized pages. This deindexation thing is quite amazing and helpful for more concentration. Looking forward for more posts to follow.

  4. Amal B

    What if I am ready to put an extra effort to update my pbns weekly with quality articles along with other guidelines? That will make my all pbns look legit right?

    1. Dejan Murko Post author

      Yes, if you can update blogs weekly and make them look good, the chances of deindexation go way down.

