ISSN : 2319-7323




INTERNATIONAL JOURNAL OF COMPUTER SCIENCE ENGINEERING


Open Access

ABSTRACT

Title : Differences in Caching of Robots.txt by Search Engine Crawlers
Authors : Jeeva Jose
Keywords : Web Log Mining; Search Engine Crawlers; robots.txt; cache
Issue Date : September 2015
Abstract : Web Log Mining gives insight into the behavior of search engine crawlers accessing a Website. Crawlers periodically revisit the Website to update their copy of its contents. The behavior of search engine crawlers gives vital information about the ethics of crawlers, the dynamicity of crawling, and how much they contribute to the server load. Ethical crawlers first access the “robots.txt” file and then proceed with crawling according to the permissions and restrictions given in this file. This paper attempts to identify the differences among various search engine crawlers in the time delay in caching the “robots.txt” file. The results revealed a significant difference in the caching of the “robots.txt” file by the various crawlers.
Page(s) : 208-213
Source : Vol. 4, No. 5
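
The caching delay described in the abstract can be estimated from a server access log by looking at how long each crawler waits between successive requests for "robots.txt". The following is a minimal sketch of that idea, not the paper's own procedure. It assumes the log is in the common Apache/Nginx "combined" format, that crawlers can be recognized by a substring of their user-agent string, and that the gap between two fetches approximates how long the crawler kept its cached copy. The function name `robots_fetch_gaps` and the `CRAWLER_TOKENS` list are hypothetical and chosen only for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumption: logs follow the "combined" access log format.
LOG_PATTERN = re.compile(
    r'\S+ \S+ \S+ \[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: illustrative user-agent substrings, not a list from the paper.
CRAWLER_TOKENS = ("Googlebot", "bingbot", "Baiduspider")


def robots_fetch_gaps(lines):
    """Return {crawler: [hours between successive robots.txt fetches]}."""
    fetches = defaultdict(list)
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        if match.group("path").split("?")[0] != "/robots.txt":
            continue  # only robots.txt requests are of interest
        agent = match.group("agent").lower()
        crawler = next((t for t in CRAWLER_TOKENS if t.lower() in agent), None)
        if crawler is None:
            continue  # not one of the crawlers being tracked
        stamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        fetches[crawler].append(stamp)

    gaps = {}
    for crawler, times in fetches.items():
        times.sort()
        # Gap between consecutive fetches, in hours.
        gaps[crawler] = [
            (later - earlier).total_seconds() / 3600
            for earlier, later in zip(times, times[1:])
        ]
    return gaps
```

Calling `robots_fetch_gaps(open("access.log"))` (the file name is a placeholder) yields one list of gaps per crawler. Comparing the distributions of those gaps is one way to look for the kind of per-crawler difference the abstract reports.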