perl-WWW-RobotRules

database of robots.txt-derived permissions

This module parses /robots.txt files as specified in "A Standard for Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters can use the /robots.txt file to forbid conforming robots from accessing parts of their web site. The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can be used for one or more parsed /robots.txt files on any number of hosts.

The following methods are provided:

* $rules = WWW::RobotRules->new($robot_name)
  This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.

* $rules->parse($robot_txt_url, $content, $fresh_until)
  The parse() method takes as arguments the URL that was used to retrieve the /robots.txt file, and the contents of the file.

* $rules->allowed($uri)
  Returns TRUE if this robot is allowed to retrieve this URL.

* $rules->agent([$name])
  Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.
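A minimal usage sketch in Perl, tying the methods above together; the robot name, the example.com URLs, and the use of LWP::Simple to fetch the robots.txt file are illustrative assumptions, not part of this package's documentation:

    use strict;
    use warnings;
    use WWW::RobotRules;
    use LWP::Simple qw(get);

    # Illustrative robot name; use your crawler's real User-Agent name.
    my $rules = WWW::RobotRules->new('ExampleBot/1.0');

    # Hypothetical host; fetch its /robots.txt and feed it to the parser.
    my $robots_url = 'http://www.example.com/robots.txt';
    my $robots_txt = get($robots_url);
    $rules->parse($robots_url, $robots_txt) if defined $robots_txt;

    # Check whether a URL on that host may be retrieved before fetching it.
    my $page = 'http://www.example.com/private/page.html';
    if ($rules->allowed($page)) {
        my $content = get($page);
        # ... process $content ...
    }
    else {
        print "Fetching $page is disallowed by robots.txt\n";
    }

The same $rules object can then be fed the /robots.txt files of further hosts with additional parse() calls, and allowed() will consult the rules for whichever host the given URL belongs to.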

No official package available for openSUSE Leap 15.5

Distributions

openSUSE Tumbleweed

openSUSE Leap 15.6

openSUSE Leap 15.5

openSUSE Leap 15.4

Unsupported distributions

The following distributions are not officially supported. Use these packages at your own risk.