perl-WWW-RobotRules

database of robots.txt-derived permissions

This module parses _/robots.txt_ files as specified in "A Standard for Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters can use the _/robots.txt_ file to forbid conforming robots from accessing parts of their web site. The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can be used for one or more parsed _/robots.txt_ files on any number of hosts.

The following methods are provided:

* $rules = WWW::RobotRules->new($robot_name)

  This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.

* $rules->parse($robot_txt_url, $content, $fresh_until)

  The parse() method takes as arguments the URL that was used to retrieve the _/robots.txt_ file, and the contents of the file.

* $rules->allowed($uri)

  Returns TRUE if this robot is allowed to retrieve this URL.

* $rules->agent([$name])

  Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.
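A minimal usage sketch built from the methods listed above; the robot name, the example URLs, and the use of LWP::Simple to fetch the file are illustrative assumptions, not part of this package's documentation:

    use strict;
    use warnings;
    use WWW::RobotRules;
    use LWP::Simple qw(get);   # assumed here only for fetching robots.txt

    # Name the robot; rules are matched against this agent name.
    my $rules = WWW::RobotRules->new('ExampleBot/1.0');      # hypothetical robot name

    # Fetch and parse robots.txt from a host we intend to crawl.
    my $robots_url = 'http://www.example.com/robots.txt';    # placeholder host
    my $robots_txt = get($robots_url);
    $rules->parse($robots_url, $robots_txt) if defined $robots_txt;

    # The same object can hold parsed rules for any number of hosts.
    for my $url ('http://www.example.com/index.html',
                 'http://www.example.com/private/data.html') {
        if ($rules->allowed($url)) {
            print "allowed:   $url\n";
        } else {
            print "forbidden: $url\n";
        }
    }

Because changing the agent name clears the cached rules and expire times, it is simplest to set the robot name once in new() rather than switching agents on a shared object.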

There is no official openSUSE Leap 15.5 package available.

Distributions

openSUSE Tumbleweed

openSUSE Leap 15.6

openSUSE Leap 15.5

openSUSE Leap 15.4

Unsupported distributions

The following distributions are not officially supported. Use these packages at your own risk.