Search and Social

To NoFollow or To NoIndex That is the Question

September 16, 2008 – 9:24 pm

by Dave Snyder

(Update: Edward Lewis of SEOConsultants.com, a person I trust and respect, pointed out some inaccuracies and hard-to-understand parts of this post. Since I wrote it to be a discussion starter and a resource, I thought it important to update.)

This is kind of a beginner-to-moderate post, but I think it covers something that confuses some marketers.

There is a difference between NoFollow and NoIndex.

NoFollow is an HTML attribute value used to instruct some search engines that a hyperlink should not influence the link target’s ranking in the search engine’s index.

NoIndex is an HTML meta tag that advises automated Internet bots to avoid indexing a Web page.

(Update: There is also a NoFollow meta tag, that works differently than the link level attribute)

When you disallow pages in your robots.txt file you are basically NoIndexing those files.

(Update: Halfdeck on Sphinn had a great correction on this: When you disallow pages you’re telling Google not to crawl a page. It does not say “do not display this URL in the SERPs.” Noindex tells Google to crawl a page, pull links from the HTML and follow (unless nofollow is in META ROBOTS), but do not display the URL in the SERPs.)
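
To make the "disallow means do not crawl" half of that correction concrete, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the bot name, and the example.com URLs are all made up for illustration, and this only shows how a well-behaved crawler might read the file, not how any particular engine actually does it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this robots.txt lets the user agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a made-up crawler name.
print(can_crawl(ROBOTS_TXT, "ExampleBot", "http://example.com/private/page.html"))  # False
print(can_crawl(ROBOTS_TXT, "ExampleBot", "http://example.com/about.html"))  # True
```

Notice that the answer only covers fetching. Nothing in robots.txt says whether a URL may be listed in the results, which is exactly the point of the correction.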

What happens when you NoIndex or NoFollow?

What I think some marketers do not understand is that just because you NoIndex, or add a web page to your robots.txt, does not mean that it is NoFollowed.

The NoFollow attribute was created by Matt Cutts and Jason Shellen from Blogger.com in order to reduce search engine spam and the effectiveness of automated link building schemes. The NoFollow attribute functions differently in each of the three major engines.

(Update: It has been noted that the link-level NoFollow attribute was a joint creation of the three major engines.)

Google still sometimes follows the link, however does not index the page that is linked to, or show the existence of the link or use it for ranking purposes.

(Update: I thought it important to add an update here. Although Google states that NoFollowed links will not be followed, that is not entirely true. Even Matt Cutts has stated the opposite. NoFollow attributes should not be used for any purpose other than dropping a link from a search engine's web graph.)

Yahoo does not utilize a NoFollowed link for ranking purposes, but does for everything else.

It has not been established whether MSN follows the link, but it does not index the linked-to page or use the link in search ranking calculations.

When you want to NoFollow

Some utilize the attribute to prevent spam, and some use it to sculpt link juice.

There are a ton of differing opinions on these topics in the SEO community.

Michael Martinez from SEO Theory thinks that any concept of link shaping is flawed in its logic and helps perpetuate the use of the attribute.

Joost De Valk weighs in with some great opinions, but basically sees the value in the concept and notes that it is not new.

Rhea Drysdale told me she believes, “NoFollow is a basic tool everyone should use, but it shouldn’t be a band aid for poor site architecture.”

Michael Gray defends the concept of link juice shaping, but not if it means killing the concept of good site architecture.

On the topic of preventing spam, I have hundreds of blocked spam messages for SnydeySense.com that say otherwise. The initial intent of NoFollow has been an absolute failure, and all it has done is manipulate the web the way Google wants it to be manipulated.

On the topic of shaping link juice I think it has some value, but it is not a magic pill.

How to utilize NoFollow and NoIndex

So when do you use these concepts to make your site better?

If you have pages on your site that you do not want in the index, those pages need to be added to your robots.txt file, and/or given a robots meta tag with a NoIndex value.

(Update: Another great observation from Halfdeck on Sphinn: If you use disallow and META noindex simultaneously, the noindex doesn’t get read by Googlebot because disallow prevents the bot from scraping the content of the URL, which leads to a situation where a noindexed page still may show up in Google SERPs depending on its PageRank.)
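
Halfdeck's point is easy to see in a small sketch. The Python below uses only the standard library; the robots.txt rules, the bot name, the URL, and the page markup are hypothetical, and the logic is my simplified reading of the behavior described above rather than a model of how Googlebot is really implemented.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser


class MetaRobotsParser(HTMLParser):
    """Collect the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def noindex_is_visible(robots_txt, user_agent, url, html):
    """True only if the bot may crawl the URL and the page carries noindex."""
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if not robots.can_fetch(user_agent, url):
        # The bot never reads the page, so it never sees the meta tag.
        return False
    meta = MetaRobotsParser()
    meta.feed(html)
    return "noindex" in meta.directives


# Hypothetical page, URL, and rule sets, for illustration only.
PAGE = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
URL = "http://example.com/orders/dvd.html"
OPEN_RULES = "User-agent: *\nDisallow: /tmp/\n"
BLOCKING_RULES = "User-agent: *\nDisallow: /orders/\n"

print(noindex_is_visible(OPEN_RULES, "ExampleBot", URL, PAGE))
print(noindex_is_visible(BLOCKING_RULES, "ExampleBot", URL, PAGE))
```

With the blocking rules the function returns False even though the page says noindex, which is the situation where the URL can still turn up in the SERPs.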

If you then want to make sure the engines do not pass any link equity to those pages, add either rel="nofollow" to your link tag or NoFollow to your robots meta tag.
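
The two places NoFollow can live are easy to mix up, so here is a rough Python sketch that audits a page for both. The class name, function name, and sample markup are my own inventions for illustration, and the rule that a page-level NoFollow covers every link on the page is the simplified reading used in this post, not an official specification.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Record each link's rel="nofollow" flag and any page-level nofollow."""

    def __init__(self):
        super().__init__()
        self.page_nofollow = False
        self.links = []  # list of (href, has_rel_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            tokens = [t.strip().lower() for t in (attr.get("content") or "").split(",")]
            if "nofollow" in tokens:
                self.page_nofollow = True
        elif tag == "a" and attr.get("href"):
            rel_tokens = (attr.get("rel") or "").lower().split()
            self.links.append((attr["href"], "nofollow" in rel_tokens))


def nofollowed_links(html):
    """Map each href to True if it is nofollowed at the link or page level."""
    audit = LinkAudit()
    audit.feed(html)
    return {href: audit.page_nofollow or flagged for href, flagged in audit.links}


# Hypothetical homepage markup, for illustration only.
HOMEPAGE = """
<html><head><title>Home</title></head><body>
<a href="/privacy.html" rel="nofollow">Privacy</a>
<a href="/products.html">Products</a>
</body></html>
"""

print(nofollowed_links(HOMEPAGE))
```

Adding a robots meta tag with a nofollow value to the head of the same page would flip every link to True, which is the practical difference between the link-level attribute and the page-level tag.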

If you use NoFollow without also using NoIndex or a robots.txt entry, there is a chance that your page may still get indexed.

While shaping your site, in order to keep a majority of your link equity flowing to important pages, and in order to guide the spiders the way you wish, use the NoFollow and NoIndex attributes in conjunction with one another.

A Search Engine’s View of Link Shaping

I was lucky enough to get the perspective of Jeremiah Andrick, Program Manager at Microsoft - Live Search:

So in terms of Pagerank sculpting, Live Search really doesn’t recommend putting much effort into this. For most sites there are probably much more impactful efforts that can be made. Creating a high quality site, etc. After you’ve taken care of high priority optimizations, then think about sculpting. Always ensure your best pages are near the top (especially if the site is big). For example, on MSDN they have Developer centers that lead to deeper content. As the owner of a site like MSDN you may want your products dev centers to be driven from your homepage, etc. Same with product pages in a shopping site. (I think Matt Cutts has described it this way.)

But there are some things you should be aware of:

Our view of the robots.txt protocol:

http://blogs.msdn.com/webmaster/archive/2008/06/03/robots-exclusion-protocol-joining-together-to-provide-better-documentation.aspx

We support the protocol with some minor differences. Most important to your question is that we treat nofollow on links as a suggestion, not as an imperative, because some of the links people nofollow still help us understand the value of a page.

So if you choose to spend much time on this, the best thing to do is to do it at the robots.txt level. I wrote a blog post about some bad practices webmasters are trying in regards to changing their robots.txt different hours of the day and assuming that the engines will pick up the changes immediately. It is a really weird form of sculpting based on time and the desire to control server load and hiding content at different periods of the day. But the engines just don’t work based on the assumptions behind that behavior. Just don’t do stuff like that.

I personally believe that across the engines only very large sites will benefit from sculpting. I just don’t think for most sites that it will have much impact.

Where Should You Begin with These Concepts

You should begin this entire process by making sure your website has a strong information architecture. A strong IA is the first key in shaping your link equity.

From that point you can use the above strategies to further cut off pages you do not want to gain link equity.

For instance, if you have a privacy page coming off of your homepage, which is likely the most heavily linked-to page on your site, you will not want link equity going to that page when it can be used more effectively elsewhere on your site. So by NoIndexing the privacy page and NoFollowing the link to it from your homepage, you can take care of this issue.

Again, this concept is not new, and not foolproof.

Testing has been done on the effects of link shaping, and most results have been found to be less than telling.

I utilize it as a secondary optimization technique, as a safety of sorts.

Other Strategies for NoFollow

Some industry types have been talking about a concept called black hole SEO, which is basically the process of NoFollowing all outbound links from your site in order to keep all of your link equity.

The result is supposed to float a site’s content to the top of the rankings.

There needs to be a massive amount of inbound links to a site for it to become one of the SEO black holes. A working example of such a site is Wikipedia, which NoFollowed its links and has had content pages steadily at the top of SERPs ever since.

(Update: One of my fav ladies, Jill Whalen, was correct in pointing out that Wikipedia was ranking for terms long before they capped their outbound links. I wanted to point out the documented ascension of rankings after the cap, but do not think I was clear at first.)

In the end, you are not likely to build enough links to create one of these black hole sites, and you could actually hinder organic link building by capping all of your outbound links. So this is not a strategy I recommend. Also it means that you are aiding the Google agenda for the web, and isn’t it more fun to fight the power?

Resource Update: Edward pointed me to some amazing resources on this topic today -

1) His own site NoArchive.net has some amazing information on how to utilize different Robots Meta Tags.

2) He also pointed me to what seems to be the definitive post on this topic by Vanessa Fox. She really breaks this down in every facet imaginable. Kudos.

  1. 14 Responses to “To NoFollow or To NoIndex That is the Question”

  2. Hey Dave,
    I agree with you that this type of post is beginner to moderate for most SEOs, and also agree that this topic confuses not just some marketers but most. As you have referenced and supported with your own statements, and I agree, the best way to maximize your site’s link equity is through the site’s information architecture (IA).

    In my view, the assumed importance of these techniques over a well thought out IA is mostly due to the lack of knowledge a company has when setting out to develop a strong web business initiative. If no one in the company, and that includes the Executives, IT/Development, Marketing/Editorial and Product Directors, has a clear understanding of how all elements of a web site need to work together, then they cobble together some solution that is most likely designed to keep everyone working. Over time, when the site does not perform, they fall back on some heavily talked about ‘secondary technique’ as the silver bullet to fix all problems, when in reality they have to go back, rebuild from the beginning, and pay attention to business fundamentals.

    Christopher Hart
    Director, Eastern Region Operations
    Bruce Clay Inc.

    By Christopher Hart on Sep 17, 2008

  3. I just had to use the noindex tag the other day to keep the order page for a client from showing up in SERPs… i would rather they hit the other pages based on marketing content, however the page showed up due to a bunch of links to it for people to order the DVD.

    also add the googlebot=”noarchive” attribute to really remove the page from google..

    RE: no follow. i think it should be written into forums, blog, etc that allow public comments without moderation to keep spam from getting link juice… otherwise, if i am going to take the time to make a link to another website from mine.. please follow, i want you to go there, otherwise i would not have made the link.

    By paisley on Sep 17, 2008

  4. I recently wrote a blog piece outing some big name bloggers who don’t have any “dofollow” links, as it’s my belief that those blogs could not only benefit from the link juice, including the big time bloggers, but that it’s kind of a gift and responsibility they could be giving back to those people who do take the time to comment. If bloggers worry about some of the links they get, they can always delete or turn those few off.

    By Mitch on Sep 17, 2008

  5. I say get rid of nofollow especially from blogs.
    Time is valuable and if im going to sit down and comment on your blog, i would rather something in return.

    On my blogs i ALWAYS remove the no-follow tag and make it ‘approve first’ and delete spam before it gets onto the page.

    Spammers will spam both nofollow and do-follow blogs, so it doesnt matter which you use.

    There are 2 ways to noindex
    1. in metatag
    2. in robots.txt

    i say forget about your metatag and only use robots.txt.

    i only use it for not indexing my ‘free’ pages like when im building my email list and giving stuff away for free.

    ok, since i dont get a reward (link) from your site, ill stop here! =]

    Nice article BTW.

    By Christopher on Sep 17, 2008

  6. I’ve read a post before that blogs that have nofollow attributes are causing negative effect on their rankings as well…what do you think of this? is it true?

    By Internet Marketing Joy on Sep 17, 2008

  7. I am going to respectfully disagree with you on one point.

    “Google still sometimes follows the link, however does not index the page that is linked to, or show the existence of the link or use it for ranking purposes.”

    There are nofollowed links from Yahoo! Answers to my client homepage that do show as links in my Webmaster Tools list of backlinks. While I know Google does not use them for discovery, they do show the existence of the links in Webmaster Tools. Google has always been selective in showing backlinks publicly, but they do recognize nofollows as links. Just not as “juicy” as others. ;)

    That would be my only point. Otherwise, with the updates you added, this is a must read for all SEOs who are still having issues understanding these attributes and meta tags. Great job ;)

    By Kate Morris on Sep 17, 2008

  8. thanks for the informative information.

    By filipino blogger on Sep 17, 2008

  9. This really isn’t a complicated subject but for some reason this post makes it sound that way! Although the advice is ok the description of the ways to noindex/nofollow is confusing. You also stated that when using the nofollow attribute “Google still sometimes follows the link, however does not index the page that is linked to, or show the existence of the link or use it for ranking purposes”

    This is incorrect. You can get a site indexed just with nofollowed links as Jon Waraas proved: http://www.jonwaraas.com/nofollow-test-results-are-in/

    There is a much more straightforward way to explain the types of noindex/nofollow:

    1. Meta tags. There is the nofollow meta tag, this essentially says to Google don’t follow all of the links on this page. Often G will follow the links to check if the pages linked to are indexable, but will not pass any ranking factors through those links.

    There is the noindex metatag, this tells Google not to index the content on the current page, hence the page will not rank for its content and probably won’t appear in the SERPs.

    You could use noindex,follow together and Google won’t index the content but will follow all the links in the usual way (ranking/indexing). This is often useful for sitemap pages which do not contain any relevant content but which you still want Google to utilize.

    2. Link attributes: You can add the rel=”nofollow” attribute to a link. This is often used when linking out to external sites to consolidate link equity, or if the link could be to an untrusted source (e.g. blog commenters’ sites). It can also be used to sculpt pagerank so irrelevant pages on your site do not dilute the pagerank that can be passed to more appropriate pages, e.g. it’s worth nofollowing links to your privacy policy which you don’t want to rank.

    The rel=”noindex” link attribute is a poor way of trying to stop a page being indexed since anybody else could link to that page rendering your noindex link useless.

    Hope this helps!

    By Andrew on Sep 18, 2008

  10. @kate and @andrew - thanks for the comments

    while I am aware of the fact that there is mounds of evidence to the contrary I was relaying the official position of Google in respect to its use of the “NoFollow” attribute

    From their WebMaster Help

    “How does Google handle nofollowed links?

    We don’t follow them. This means that Google does not transfer PageRank or anchor text across these links. Essentially, using nofollow causes us to drop the target links from our overall graph of the web. However, the target pages may still appear in our index if other sites link to them without using nofollow, or if the URLs are submitted to Google in a Sitemap. Also, it’s important to note that other search engines may handle nofollow in slightly different ways.”

    Experimentation aside, it is important for people new to the industry to know the SEs’ official position on this topic.

    They go as far as to state that rel=”NoFollow” can be used for bot control, which is obviously not true.

    “Crawl prioritization: Search engine robots can’t sign in or register as a member on your forum, so there’s no reason to invite Googlebot to follow “register here” or “sign in” links. Using nofollow on these links enables Googlebot to crawl other pages you’d prefer to see in Google’s index. However, a solid information architecture — intuitive navigation, user- and search-engine-friendly URLs, and so on — is likely to be a far more productive use of resources than focusing on crawl prioritization via nofollowed links.”

    While there is disinformation, it is still important to look at what Google is saying, especially for newer SEOs and marketers.

    Also @andrew I think you will enjoy my next post that will explain why your concept of utilizing rel=”nofollow” to sculpt link equity is perhaps not the best route

    And Andrew, I don’t think this topic is as clear as you think, since robots.txt and Meta robots NoIndex handle bots differently, and not many know this.

    By davesnyder on Sep 18, 2008

  11. @christopher
    “i say forget about your metatag and only use robots.txt”

    This is not the best solution. The meta tag works better, the robots text file can be queried, however the “noindex” tag is associated with the URL in Google’s DATABASE.

    again.. if you want a page removed.. use the following

    I’m actually sending this to a fortune 500 client right now to remove a form in an iframe from google’s results..

    it will be removed by monday at lunch..

    anyone care to wager?

    By paisley on Sep 18, 2008

  12. Cool and indeed worth covering the robots.txt usage.

    Probably worth looking at TripAdvisor if you are posting on sculpting pagerank as they heavily control what is nofollowed throughout the site. The rest of our Expedia Inc sites not so “optimised” in this way.

    By Andrew on Sep 18, 2008

  1. 3 Trackback(s)

  2. Sep 17, 2008: Joe Hall » Links are making people go crazy!
  3. Sep 19, 2008: .eduGuru Links of the Week for September 19th, 2008 | .eduGuru
  4. Sep 22, 2008: Link Shaping and Bot Herding | SEO Blog | Search and Social
