perl-WWW-RobotRules

database of robots.txt-derived permissions

This module parses /robots.txt files as specified in "A Standard for Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters can use the /robots.txt file to forbid conforming robots from accessing parts of their web site. The parsed files are kept in a WWW::RobotRules object, and this object provides methods to check if access to a given URL is prohibited. The same WWW::RobotRules object can be used for one or more parsed /robots.txt files on any number of hosts.

The following methods are provided:

* $rules = WWW::RobotRules->new($robot_name)
  This is the constructor for WWW::RobotRules objects. The first argument given to new() is the name of the robot.

* $rules->parse($robot_txt_url, $content, $fresh_until)
  The parse() method takes as arguments the URL that was used to retrieve the /robots.txt file, and the contents of the file.

* $rules->allowed($uri)
  Returns TRUE if this robot is allowed to retrieve this URL.

* $rules->agent([$name])
  Get/set the agent name. NOTE: Changing the agent name will clear the robots.txt rules and expire times out of the cache.
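A minimal usage sketch in Perl, tying the methods above together; the robot name, the example.com URLs, and the use of LWP::Simple to fetch the robots.txt file are illustrative assumptions, not part of this package's documentation:

    use strict;
    use warnings;
    use WWW::RobotRules;
    use LWP::Simple qw(get);

    # Illustrative robot name; use your crawler's real User-Agent name.
    my $rules = WWW::RobotRules->new('ExampleBot/1.0');

    # Hypothetical host; fetch its /robots.txt and feed it to the parser.
    my $robots_url = 'http://www.example.com/robots.txt';
    my $robots_txt = get($robots_url);
    $rules->parse($robots_url, $robots_txt) if defined $robots_txt;

    # Check whether a URL on that host may be retrieved before fetching it.
    my $page = 'http://www.example.com/private/page.html';
    if ($rules->allowed($page)) {
        my $content = get($page);
        # ... process $content ...
    }
    else {
        print "Fetching $page is disallowed by robots.txt\n";
    }

The same $rules object can then be fed the /robots.txt files of further hosts with additional parse() calls, and allowed() will consult the rules for whichever host the given URL belongs to.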

No official package available for openSUSE Leap 15.5

Distributions

openSUSE Tumbleweed

openSUSE Leap 15.6

openSUSE Leap 15.5

openSUSE Leap 15.4

Unsupported distributions

The following distributions are not officially supported. Use these packages at your own risk.