NAME¶
dl10n-spider -- crawl translator mailing lists (and BTS) for status updates
SYNOPSIS¶
dl10n-spider [options] lang+
DESCRIPTION¶
This script parses the debian-l10n-<language> mailing list archives. It
looks for emails whose subject follows a specific format indicating what the
author intends to translate, or the current status of their work on that
translation.
This information is saved to a dl10n database, which can then be used to build
an l10n coordination page or any other useless statistics.
get_header(HTML)¶
get_header extracts the email header from the HTML page. The header starts at
<!--X-Head-of-Message--> and stops at
<!--X-Head-of-Message-End-->. Since it contains HTML tags, these are
removed as well.
It takes a reference to an array of lines (HTML) containing the HTML code of the
page.
It returns a reference to an array containing the email header lines.
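The extraction described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the script's own code. The two marker comments come from the description; the tag-stripping regular expression and the choice to drop empty lines are assumptions.

```python
import re

HEAD_START = "<!--X-Head-of-Message-->"
HEAD_END = "<!--X-Head-of-Message-End-->"
# Assumption: a simple regex is enough to drop the tags in the header block.
TAG_RE = re.compile(r"<[^>]*>")


def get_header(html_lines):
    """Return the lines between the two markers, with HTML tags removed."""
    header = []
    inside = False
    for line in html_lines:
        if not inside:
            # Skip everything up to and including the start marker.
            if HEAD_START in line:
                inside = True
            continue
        if HEAD_END in line:
            break
        text = TAG_RE.sub("", line).strip()
        if text:  # assumption: lines left empty after stripping are dropped
            header.append(text)
    return header


page = [
    "<html>",
    "<!--X-Head-of-Message-->",
    "<li><em>Subject</em>: hello</li>",
    "<!--X-Head-of-Message-End-->",
    "<p>body</p>",
]
print(get_header(page))
```

The sketch does not decode HTML entities such as &amp;; a fuller version would need to.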
get_message(LANGUAGE, YEAR, MONTH, MESSAGE)¶
get_message requests a message from the archives of an l10n mailing list.
It takes the language string (LANGUAGE) and the year (YEAR), month (MONTH) and
message number (MESSAGE) as integers.
It returns a reference to an array containing the HTML lines, or 'undef' if an
error occurred.
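As a rough illustration of this interface, here is a Python sketch (not the script's own code). The URL layout in URL_TEMPLATE is an assumption about how the list archives are organised and is not stated on this page. The helper names message_url and fetch_url are hypothetical, and None stands in for 'undef'.

```python
import urllib.request

# Assumed archive layout; adjust if the real archives differ.
URL_TEMPLATE = (
    "https://lists.debian.org/debian-l10n-{lang}/"
    "{year:04d}/{month:02d}/msg{msg:05d}.html"
)


def message_url(language, year, month, message):
    """Build the (assumed) archive URL of one message."""
    return URL_TEMPLATE.format(lang=language, year=year, month=month, msg=message)


def fetch_url(url, timeout=30):
    """Download a page and return its text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def get_message(language, year, month, message, fetch=fetch_url):
    """Return the page as a list of HTML lines, or None if fetching fails."""
    try:
        page = fetch(message_url(language, year, month, message))
    except (OSError, ValueError):
        # urllib's URLError and HTTPError are subclasses of OSError.
        return None
    return page.splitlines()
```

The fetch parameter exists only so that the function can be exercised without network access, for example get_message("french", 2004, 1, 12, fetch=lambda url: "<html>").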
get_indexpage¶
get_indexpage retrieves all message numbers and subjects from a page of
messages sorted by date. It returns a hash table with message numbers as keys
and subjects as values (this is much quicker than retrieving each message
individually).
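The idea of reading numbers and subjects from the index page can be sketched as follows, in Python and for illustration only. The link markup matched by LINK_RE (an href of the form msgNNNNN.html wrapping the subject) is an assumption about the index page, and parse_indexpage is a hypothetical name that works on lines already downloaded.

```python
import re
from html import unescape

# Assumed markup: <a ... href="msg00012.html">Subject</a>
LINK_RE = re.compile(r'href="msg(\d+)\.html"[^>]*>(.*?)</a>', re.IGNORECASE)


def parse_indexpage(html_lines):
    """Map message number (int) to subject for every message link found."""
    subjects = {}
    for line in html_lines:
        for number, subject in LINK_RE.findall(line):
            subjects[int(number)] = unescape(subject.strip())
    return subjects


index = [
    '<li><strong><a name="00001" href="msg00001.html">First &amp; only</a></strong> <em>Alice</em></li>',
    '<li><a href="msg00002.html">Second</a></li>',
    "<p>no link here</p>",
]
print(parse_indexpage(index))
```

One request for the index page yields every subject of the month, whereas fetching each message costs one request per message.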
LICENSE¶
This program is free software; you can redistribute it and/or modify it under
the terms of the GNU General Public License as published by the Free Software
Foundation; either version 2 of the License, or (at your option) any later
version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR
A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
this program; if not, write to the Free Software Foundation, Inc., 59 Temple
Place - Suite 330, Boston, MA 02111-1307, USA.
COPYRIGHT (C)¶
2003,2004 Tim Dijkstra
2004 Nicolas Bertolissio
2004 Martin Quinson