
To NoFollow or To NoIndex: That Is the Question

September 16, 2008 – 9:24 pm

by Dave Snyder

(Update: Edward Lewis of SEOConsultants.com, a person I trust and respect, pointed out some inaccuracies and hard-to-understand parts of this post. Since I wrote it to be a discussion starter and a resource, I thought it important to update.)

This is a beginner-to-intermediate post, but I think the topic confuses some marketers.

There is a difference between NoFollow and NoIndex.

NoFollow is an HTML attribute value used to instruct some search engines that a hyperlink should not influence the link target’s ranking in the search engine’s index.

NoIndex is an HTML meta tag that advises automated Internet bots to avoid indexing a Web page.

(Update: There is also a NoFollow meta tag, which works differently from the link-level attribute.)
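To make the difference concrete, here is a minimal Python sketch that reads a page and separates the two kinds of signal: page-level robots meta directives and link-level rel="nofollow" attributes. The class name RobotsDirectiveParser and the sample page are made up for illustration, and the sketch only uses the standard library's html.parser module. It shows where each directive lives in the markup, not how any particular engine acts on it.

```python
from html.parser import HTMLParser


class RobotsDirectiveParser(HTMLParser):
    """Collects page-level robots meta directives and link-level nofollow links."""

    def __init__(self):
        super().__init__()
        self.meta_directives = set()  # values from <meta name="robots" content="...">
        self.nofollow_links = []      # hrefs whose <a> tag carries rel="nofollow"
        self.followed_links = []      # hrefs without a nofollow rel value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.meta_directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )
        elif tag == "a" and attrs.get("href"):
            rel_values = (attrs.get("rel") or "").lower().split()
            if "nofollow" in rel_values:
                self.nofollow_links.append(attrs["href"])
            else:
                self.followed_links.append(attrs["href"])


# Hypothetical page: NoIndexed at the page level, with one NoFollowed link.
SAMPLE_PAGE = """
<html><head>
<meta name="robots" content="noindex, follow">
</head><body>
<a href="/products">Products</a>
<a href="/privacy" rel="nofollow">Privacy</a>
</body></html>
"""

parser = RobotsDirectiveParser()
parser.feed(SAMPLE_PAGE)
print(parser.meta_directives, parser.nofollow_links, parser.followed_links)
```

In this sample the meta tag applies to the whole page, while the rel attribute applies only to the single link that carries it.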

When you disallow pages in your robots.txt file you are basically NoIndexing those files.

(Update: Halfdeck on Sphinn had a great correction on this: when you disallow pages, you’re telling Google not to crawl them. Disallow does not say “do not display this URL in the SERPs.” NoIndex tells Google to crawl a page, pull links from the HTML and follow them (unless nofollow is in META ROBOTS), but not to display the URL in the SERPs.)
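Halfdeck's point is that robots.txt answers only one question: may this bot fetch this URL? A rough sketch of that check, using Python's standard urllib.robotparser module, looks like this. The rules and the example.com URLs are invented for the illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block crawling of everything under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())

# can_fetch() only reports whether crawling is allowed.
# It says nothing about whether the URL may appear in search results.
can_crawl_private = robots.can_fetch(
    "Googlebot", "https://example.com/private/report.html"
)
can_crawl_public = robots.can_fetch(
    "Googlebot", "https://example.com/products.html"
)

print(can_crawl_private, can_crawl_public)
```

Nothing in this check touches indexing. A disallowed URL can still be listed if an engine learns about it from links elsewhere.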

What happens when you NoIndex or NoFollow?

What I think some marketers do not understand is that just because you NoIndex, or add a web page to your robots.txt, does not mean that it is NoFollowed.

The NoFollow attribute was created by Matt Cutts and Jason Shellen from Blogger.com in order to reduce search engine spam and the effectiveness of automated link building schemes. The NoFollow attribute functions differently in each of the three major engines.

(Update: It has been noted that the link-level NoFollow attribute was a joint creation of the three major engines.)

Google still sometimes follows the link, but it does not index the linked-to page, show the existence of the link, or use it for ranking purposes.

(Update: I thought it important to add an update here. Although Google states that a NoFollowed link will not be followed, that is not entirely true. Even Matt Cutts has stated the opposite. The NoFollow attribute should not be used for any purpose other than dropping a link from a search engine’s web graph.)

Yahoo does not utilize a NoFollowed link for ranking purposes, but does for everything else.

It has not been established whether MSN follows the link, but it does not index the linked-to page or use the link in search ranking calculations.

When you want to NoFollow

Some utilize the attribute to prevent spam, and some use it to sculpt link juice.

There are a ton of differing opinions on these topics in the SEO community.

Michael Martinez from SEO Theory thinks that any concept of link shaping is relatively flawed in logic and helps perpetuate the use of the attribute.

Joost De Valk weighs in with some great opinions here. He basically sees the value in the concept, and notes it is not new.

Rhea Drysdale told me she believes, “NoFollow is a basic tool everyone should use, but it shouldn’t be a band aid for poor site architecture.”

Michael Gray defends the concept of link juice shaping, but not if it means killing the concept of good site architecture.

On the topic of preventing spam, I have hundreds of blocked spam messages for SnydeySense.com that say NoFollow does not prevent it. The initial intent of NoFollow has been an absolute failure, and all it has done is manipulate the web the way Google wants it to be manipulated.

On the topic of shaping link juice I think it has some value, but it is not a magic pill.

How to utilize NoFollow and NoIndex

So when do you use these concepts to make your site better?

If you have pages on your site that you do not want in the index, those pages need to be added to your robots.txt file and/or given a robots meta tag with a NoIndex value.

(Update: Another great observation from Halfdeck on Sphinn: If you use disallow and META noindex simultaneously, the noindex doesn’t get read by Googlebot because disallow prevents the bot from scraping the content of the URL, which leads to a situation where a noindexed page still may show up in Google SERPs depending on its PageRank.)
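The conflict Halfdeck describes can be checked for mechanically. Below is a minimal sketch, assuming you already have the robots.txt text and the page HTML in hand. The helper names MetaRobots and noindex_is_hidden are hypothetical, and "Googlebot" is only used as an example user agent string.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobots(HTMLParser):
    """Collects the directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def noindex_is_hidden(robots_txt, url, html, user_agent="Googlebot"):
    """Return True when a page carries noindex but robots.txt blocks the fetch,
    so a compliant crawler would never get to read the noindex tag."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    meta = MetaRobots()
    meta.feed(html)
    return "noindex" in meta.directives and not robots.can_fetch(user_agent, url)


# Hypothetical inputs for illustration.
PAGE = '<html><head><meta name="robots" content="noindex"></head></html>'
BLOCKING_RULES = "User-agent: *\nDisallow: /old/\n"
OTHER_RULES = "User-agent: *\nDisallow: /tmp/\n"

hidden = noindex_is_hidden(BLOCKING_RULES, "https://example.com/old/page.html", PAGE)
visible = noindex_is_hidden(OTHER_RULES, "https://example.com/old/page.html", PAGE)
print(hidden, visible)
```

When the function returns True, the two settings are working against each other: pick one, depending on whether you want to stop crawling or stop the URL from being displayed.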

If you also want to make sure the engines do not pass any link equity to those pages, add either rel="nofollow" to your link tag or NoFollow to your robots meta tag.

If you utilize NoFollow without also using NoIndex or robots.txt, there is a chance that your page may still get indexed.

While shaping your site, in order to keep a majority of your link equity flowing to important pages, and in order to guide the spiders the way you wish, use the NoFollow and NoIndex attributes in conjunction with one another.

A Search Engine’s View of Link Shaping

A Search Engines View of Link Shaping

I was lucky enough to get the perspective of Jeremiah Andrick, Program Manager at Microsoft Live Search:

So in terms of Pagerank sculpting, Live Search really doesn’t recommend putting much effort into this. For most sites there are probably much more impactful efforts that can be made. Creating a high quality site, etc. After you’ve taken care of high priority optimizations, then think about sculpting. Always ensure your best pages are near the top (especially if the site is big). For example, on MSDN they have Developer centers that lead to deeper content. As the owner of a site like MSDN you may want your products dev centers to be driven from your homepage, etc. Same with product pages in a shopping site. (I think Matt Cutts has described it this way.)

But there are some things you should be aware of:

Our view of the robots.txt protocol:

http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx

We support the protocol with some minor differences. Most important to your question is that we treat nofollow on links as a suggestion, not as an imperative, because some of the links people nofollow still help us understand the value of a page.

So if you choose to spend much time on this, the best thing to do is to do it at the robots.txt level. I wrote a blog post about some bad practices webmasters are trying, such as changing their robots.txt at different hours of the day and assuming that the engines will pick up the changes immediately. It is a really weird form of sculpting based on time, the desire to control server load, and hiding content at different periods of the day. But the engines just don’t work based on the assumptions behind that behavior. Just don’t do stuff like that.

I personally believe that across the engines only very large sites will benefit from sculpting. I just don’t think for most sites that it will have much impact.

Where Should You Begin with These Concepts

You should begin this entire process by making sure your website has a strong information architecture. A strong IA is the first key in shaping your link equity.

From that point you can use the above strategies to further cut off pages you do not want to gain link equity.

For instance, if you have a privacy page coming off of your homepage, which is likely the most heavily linked-to page on your site, you will not want link equity going to that page when it can be used more effectively elsewhere on your site. So by NoIndexing the privacy page and NoFollowing the link to it from your homepage, you can take care of this issue.

Again, this concept is not new, and it is not foolproof.

Testing has been done on the effects of link shaping, and most results have been inconclusive.

I utilize it as a secondary optimization technique, as a safety net of sorts.

Other Strategies for NoFollow

Some industry types have been talking about a concept called black hole SEO, which is basically the process of putting a cap on (NoFollowing) all outbound links from your site in order to keep all of your link equity.

The result is supposed to float a site’s content to the top of the rankings.

There needs to be a massive amount of inbound links to a site for it to become one of the SEO black holes. A working example of such a site is Wikipedia, which NoFollowed its outbound links and has had content pages steadily at the top of SERPs ever since.

(Update: One of my fav ladies, Jill Whalen, was correct in pointing out that Wikipedia was ranking for terms long before it capped its outbound links. I wanted to point out the documented ascension of rankings after the cap, but I do not think I was clear at first.)

In the end, you are not likely to build enough links to create one of these black hole sites, and you could actually hinder organic link building by capping all of your outbound links. So this is not a strategy I recommend. Also it means that you are aiding the Google agenda for the web, and isn’t it more fun to fight the power?

Resource Update: Edward pointed me to some amazing resources on this topic today:

1) His own site NoArchive.net has some amazing information on how to utilize different Robots Meta Tags.

2) He also pointed me to what seems to be the definitive post on this topic by Vanessa Fox. She really breaks this down in every facet imaginable. Kudos.
