Yahoo Crawler obeys a new class: Robots-Nocontent
May 2nd, 2007 by
Jay Westerdal
The Yahoo blog reports that its crawler now obeys a “robots-nocontent” class. The class lets webmasters exclude parts of a page from indexing: if there is text on a webpage that should not be indexed by a robot, a webmaster can tell the robot to ignore it. The implications are broad. Search engines could now direct webmasters to clearly mark paid text on a page with the robots-nocontent class. The class is also useful for the headers and footers of a site, since a webmaster can separate the template from the content. No word yet from Google or MSN on whether they will support it, but we can expect Google to be the first to follow.
- <div class="robots-nocontent"> This is the navigational menu of the site and is common on all pages. It contains many terms and keywords not related to this site</div>
- <span class="robots-nocontent"> This is the site header that is present on all pages of the site and is not related to any particular page</span>
- <p class="robots-nocontent"> This is a boilerplate legal disclaimer required on each page of the site</p>
- <div class="robots-nocontent"> This is a section where ads are displayed on the page. Words that show up in ads may be entirely unrelated to the page contents</div>
If an element already has a class attribute, a webmaster can still use the functionality, because more than one class is allowed in the class attribute. For example, if the tag was <p class="highlight">green tea</p>, it would be changed to <p class="highlight robots-nocontent">green tea</p>.
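To make the multiple-class point concrete, here is a minimal Python sketch of how a crawler might test for the class. It assumes the class attribute is treated as a whitespace-separated list of tokens, as HTML defines it. The function name `has_nocontent` is made up for illustration, and nothing here is taken from Yahoo's actual crawler.

```python
NOCONTENT = "robots-nocontent"

def has_nocontent(class_attr):
    """Return True if the class attribute value contains the robots-nocontent token.

    Hypothetical helper: splits on whitespace so that extra classes on the
    same element do not hide the token, and similar-looking names do not
    match by accident.
    """
    return NOCONTENT in (class_attr or "").split()

print(has_nocontent("highlight robots-nocontent"))  # True
print(has_nocontent("highlight"))                   # False
print(has_nocontent("robots-nocontent-sidebar"))    # False (different token)
```

Matching whole tokens rather than substrings is the design choice that matters here: a plain substring search would wrongly treat a class such as robots-nocontent-sidebar as a match.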
Now for the interesting part: how can this be gamed? Well, how about something like this…
<span class="robots-nocontent">Nearly 50.77 percent of the U.S. peanut production went to </span> free <span class="robots-nocontent"> peanut butter factories in 2001. This makes the U.S. the world’s largest peanut butter supplier and consumer. Peanuts grown in other countries are usually harvested for cooking oil called peanut oil.<P>
There are many types of peanuts. Small-seed peanuts are rich in oil and usually grown for peanut butter and oil. In the U.S., Runner Types and Spanish Types are two families of peanuts grown in southern states including Alabama, Florida, Georgia, Oklahoma, South Carolina and Texas. The first three states produce 60% of the peanuts that are used in peanut butter. These three states also produce</span>porn <span class="robots-nocontent">and oranges.<P>
After harvest, peanuts are sent to factories for inspection. The inspected peanuts are roasted in ovens. After roasting, they are rapidly cooled by air to stop cooking. This helps to retain its color and oil contents.<BR></span>
If the page had lots of incoming links, someone could change the context of the page. Would the search engine believe something like this? I think so: the weight of the two remaining words would carry the page’s meaning into the search index.
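To see what a robot would be left with, here is a rough Python sketch that drops everything inside robots-nocontent elements and keeps the rest. It is an assumption about how such filtering could work, not Yahoo's implementation. The names `IndexableText` and `indexable_text` are invented for this example, and the sample is a shortened version of the peanut text above.

```python
from html.parser import HTMLParser

class IndexableText(HTMLParser):
    """Collects text that is NOT inside an element marked robots-nocontent."""

    def __init__(self):
        super().__init__()
        self.skip_tag = None   # tag name of the element currently being skipped
        self.depth = 0         # nesting depth of that tag name while skipping
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.skip_tag is not None:
            # Only count nested tags of the same name, so stray <P> or <BR>
            # tags inside a skipped <span> do not confuse the bookkeeping.
            if tag == self.skip_tag:
                self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "robots-nocontent" in classes:
            # Simplification: a void element (e.g. <br>) carrying the class
            # would never be closed; this sketch does not handle that case.
            self.skip_tag = tag
            self.depth = 1

    def handle_endtag(self, tag):
        if self.skip_tag is not None and tag == self.skip_tag:
            self.depth -= 1
            if self.depth == 0:
                self.skip_tag = None

    def handle_data(self, data):
        if self.skip_tag is None:
            self.chunks.append(data)

def indexable_text(html):
    parser = IndexableText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace into single spaces.
    return " ".join(" ".join(parser.chunks).split())

sample = (
    '<span class="robots-nocontent">Nearly 50.77 percent of the U.S. peanut '
    'production went to </span> free <span class="robots-nocontent"> peanut '
    'butter factories in 2001.<P>These three states also produce</span>porn '
    '<span class="robots-nocontent">and oranges.<P>After harvest, peanuts are '
    'sent to factories for inspection.<BR></span>'
)
print(indexable_text(sample))  # free porn
```

Under these assumptions, the only text that survives is the two words left outside the marked spans, which is exactly the gaming scenario described above.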
Posted in SEO, Yahoo
3 Comments

May 5th, 2007 at 11:12 PM
Interesting.
Any plans to make DomainTools’ SEO Text Browser do the same?
May 6th, 2007 at 10:49 AM
If Google follows it, you bet we will.
October 3rd, 2007 at 9:58 PM
Why would Google be sure to follow? That’s highly speculative.