WWW wanderers or spiders are programs that traverse many pages in the World Wide Web by recursively retrieving linked pages. Search engines such as Google frequently spider web pages for indexing.

How will you stop web spiders from crawling certain directories on your website?

A.
Place a robots.txt file in the root of your website with a listing of the directories that you don’t want to be crawled

B.
Place authentication on the root directories, which will prevent these spiders from crawling

C.
Place "HTTP:NO CRAWL" on the html pages that you don’t want the crawlers to index

D.
Enable SSL on the restricted directories, which will block these spiders from crawling

Correct answer: A

Explanation:
WWW Robots (also called wanderers or spiders) are programs that traverse many pages in the World Wide Web by recursively retrieving linked pages. The method used to exclude robots from a server is to create a file on the server which specifies an access policy for robots.
This file must be accessible via HTTP on the local URL "/robots.txt".
http://www.robotstxt.org/orig.html#format
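For illustration, a minimal robots.txt placed at the web root might look like the sketch below. The directory names /private/ and /tmp/ are only example paths, not taken from the question.

    # Applies to all robots
    User-agent: *
    # Keep crawlers out of these directories
    Disallow: /private/
    Disallow: /tmp/

Compliant crawlers such as Googlebot fetch /robots.txt before crawling a site and skip any path matching a Disallow rule. Note that the file is purely advisory: it does not stop direct requests to those directories, so it is not an access-control mechanism.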

