NAME

parsero - Audit tool for robots.txt of a site

SYNOPSIS

parsero [-h] [-u URL] [-o] [-sb] [-f FILE]

DESCRIPTION

Parsero is a free script written in Python that reads the robots.txt file of a web server over the network and examines its Disallow entries. The Disallow entries tell search engines which directories or files hosted on a web server must not be indexed. For example, "Disallow: /portal/login" means that the content at www.example.com/portal/login is not allowed to be indexed by crawlers such as Google, Bing, or Yahoo. This is how an administrator avoids sharing sensitive or private information with search engines.
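
The Disallow extraction described above can be sketched in a few lines of Python. This is a minimal illustration, not Parsero's actual code: the function name `disallow_paths` and the sample robots.txt text are assumptions made for the example, and no network request is made.

```python
def disallow_paths(robots_txt):
    """Return the paths named in Disallow entries of a robots.txt body."""
    paths = []
    for line in robots_txt.splitlines():
        # Drop comments, which start with '#' and run to the end of the line.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are case-insensitive; an empty Disallow value blocks nothing.
        if sep and field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


# Hypothetical robots.txt content, used only for illustration.
sample = """\
User-agent: *
Disallow: /portal/login
Disallow: /private/  # internal area
Allow: /public/
Disallow:
"""

for path in disallow_paths(sample):
    print("www.example.com" + path)
```

Each printed line is a candidate URL that an audit could then request to see whether the supposedly hidden path is actually reachable.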

OPTIONS

-h, --help: Show help message and exit.
-u URL: URL to analyze.
-o: Show only results with the "HTTP 200" status code.
-sb: Search Bing for indexed Disallow entries.
-f FILE: Scan the domains listed in FILE.
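
For readers who want to wrap or imitate this interface, the option set above can be mirrored with Python's standard `argparse` module. This is a hypothetical sketch rather than Parsero's own source: the function name `build_parser` and the destination names `url`, `only200`, `search_bing`, and `file` are assumptions chosen for the example.

```python
import argparse


def build_parser():
    # Mirrors the SYNOPSIS: parsero [-h] [-u URL] [-o] [-sb] [-f FILE]
    # (-h/--help is added by argparse automatically).
    parser = argparse.ArgumentParser(prog="parsero")
    parser.add_argument("-u", dest="url", metavar="URL",
                        help="URL to analyze")
    parser.add_argument("-o", dest="only200", action="store_true",
                        help='show only results with the "HTTP 200" status code')
    parser.add_argument("-sb", dest="search_bing", action="store_true",
                        help="search Bing for indexed Disallow entries")
    parser.add_argument("-f", dest="file", metavar="FILE",
                        help="scan the domains listed in FILE")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is repeatable.
args = build_parser().parse_args(["-u", "www.example.com", "-o"])
print(args.url, args.only200, args.search_bing, args.file)
```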

EXAMPLE

Common usage:

    $ parsero -u www.example.com

Scanning the domains listed in a file:

    $ parsero -f /tmp/list-of-domains.txt

AUTHOR

parsero was written by Javier Nieto <javier.nieto@behindthefirewalls.com>.

This manual page was written by Thiago Andrade Marques <thmarques@gmail.com> for the Debian project (but may be used by others).

27 Jan 2020

parsero-0.0+git20140929.e5b585a
