Mitch Lynn: Search Engine Exclusion Protocol

There are many situations where you don't want a webpage to be crawled by a web crawler. This article explains some of the methods that can be used to prevent your webpage from being indexed by the search engines.

A webmaster can tell a search engine user agent (webbot/spider/crawler) what to index and what not to index using special external files called robots exclusion protocol files, along with on-page meta tags or nofollow attributes. It is useful, for example, to tell a search engine not to crawl and index a page that is under construction.

Adding the nofollow attribute to a link takes the form:

<a href="http://www.example.com/" rel="nofollow">Anchor Text</a>

This asks search engine spiders not to follow the link. The "nofollow" value can also be used in a robots meta tag placed in the head of a webpage.
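To make the effect concrete, here is a minimal sketch of how a crawler might separate links it may follow from links marked nofollow. It uses Python's standard html.parser module; the names LinkCollector and split_links are made up for this illustration, and real search engine spiders may well handle the attribute differently.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values, separating links marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.followable = []
        self.nofollow = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "external nofollow".
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followable.append(href)


def split_links(html):
    """Return (followable_links, nofollow_links) found in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.followable, collector.nofollow
```

A polite spider built along these lines would queue only the first list for crawling and leave the second list alone.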

The following will instruct search engines not to index this page and not to follow any links from this page for use in indexing or weighting:

<meta name="robots" content="noindex, nofollow">

The following will tell a spider not to index this page, but to allow it to follow the links, which can then be indexed and weighted:

<meta name="robots" content="noindex, follow">

The following will instruct the spider to index this page but not to follow any links from it, and is most commonly used on message boards:

<meta name="robots" content="index, nofollow">
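The three combinations above come down to two independent yes/no decisions: may the page be indexed, and may its links be followed. The sketch below reads a page's robots meta tag and returns both answers. It assumes the usual comma-separated noindex/nofollow values in the tag's content attribute, and the names RobotsMetaParser and robots_meta_policy are invented for this example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Gathers the directives listed in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        content = attr_map.get("content") or ""
        # Directives are comma separated; case and extra spaces are ignored here.
        self.directives.update(
            token.strip().lower() for token in content.split(",") if token.strip()
        )


def robots_meta_policy(html):
    """Return (may_index, may_follow) for a page given its HTML source."""
    parser = RobotsMetaParser()
    parser.feed(html)
    may_index = "noindex" not in parser.directives
    may_follow = "nofollow" not in parser.directives
    return may_index, may_follow
```

When no robots meta tag is present, this sketch treats the page as indexable and its links as followable, which is the assumed default here.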

The "Robots exclusion protocol" is used to prevent directories from being indexed. Its rules are kept in a separate robots.txt file located in the site's root directory.

The following directives disallow NO directories for any search engine.

User-agent: *
Disallow:

Conversely, the following directives will disallow ALL directories for any search engine.

User-agent: *
Disallow: /
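One way to check what a set of robots.txt rules actually permits is Python's standard urllib.robotparser module. The sketch below feeds it the two rule sets shown above, plus a third one that blocks a single directory. The helper name can_fetch, the /private/ directory and the example.com URLs are assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
# Hypothetical rule set that blocks only one directory.
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"


def can_fetch(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://www.example.com/private/page.html"
    for label, rules in [
        ("allow all", ALLOW_ALL),
        ("block all", BLOCK_ALL),
        ("block /private/", BLOCK_PRIVATE),
    ]:
        print(label, can_fetch(rules, page))
```

Keep in mind that robots.txt is a request, not an access control: well-behaved spiders honor it, but nothing forces a crawler to do so.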


Thursday, September 4, 2008
