PUBLIC-INBOX-EXTINDEX(1) public-inbox user manual PUBLIC-INBOX-EXTINDEX(1)

NAME

public-inbox-extindex - create and update external search indices

SYNOPSIS

public-inbox-extindex [OPTIONS] EXTINDEX_DIR INBOX_DIR...

public-inbox-extindex [OPTIONS] [EXTINDEX_DIR] --all

DESCRIPTION

public-inbox-extindex creates and updates an external search and overview database used by the read-only public-inbox PSGI (HTTP), NNTP, and IMAP interfaces. This requires either the Search::Xapian XS bindings OR the Xapian SWIG bindings, along with DBD::SQLite and DBI Perl modules.

OPTIONS

These switches behave as they do for public-inbox-index(1).

--all

Index all "publicinbox" entries in "PI_CONFIG".

"publicinbox" entries indexed by "public-inbox-extindex" can have full Xapian searching abilities with the per-"publicinbox" "indexlevel" set to "basic" and their respective Xapian ("xap15" or "xapian15") directories removed. For multiple public-inboxes where cross-posting is common, this allows significant space savings on Xapian indices.
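Before removing anything, it can help to see which per-inbox Xapian directories exist. The following is a minimal sketch, not part of public-inbox: the function name "list_xapian_dirs" and the "INBOXDIR" variable are hypothetical, and the script only lists directories named "xap15" or "xapian15" without deleting them.

```shell
#!/bin/sh
# list_xapian_dirs INBOX_DIR (hypothetical helper, not shipped with public-inbox)
# Print every directory named "xap15" or "xapian15" under INBOX_DIR.
# Nothing is removed; -prune only stops find from descending into matches.
list_xapian_dirs() {
	find "$1" -type d \( -name xap15 -o -name xapian15 \) -prune -print
}

# Hypothetical inbox location; override it through the environment.
inboxdir=${INBOXDIR:-/path/to/inbox}
if [ -d "$inboxdir" ]; then
	list_xapian_dirs "$inboxdir"
fi
```

Whether a listed directory is safe to remove depends on the inbox already being covered by the extindex and having its "indexlevel" set to "basic", as described above.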

--gc

Perform garbage collection instead of indexing. Use this if inboxes are removed from the extindex, or if messages are purged or removed from some inboxes.

--reindex

Force a re-index of all messages in the extindex. This can be used for in-place upgrades and bugfixes while read-only server processes are utilizing the index. Keep in mind this roughly doubles the size of the already-large Xapian database.

The extindex locks will be released roughly every 10s to allow public-inbox-mda(1) and public-inbox-watch(1) processes to write to the extindex.

--fast

When used with "--reindex", only look for new and stale entries and do not touch already-indexed messages.
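The maintenance invocations described above can be sketched as a small dry-run script. This is an illustration under assumptions: the "run" helper and the "DRY_RUN" variable are hypothetical, and the switch names "--gc" and "--fast" are assumed from the upstream public-inbox-extindex(1) manual. By default the script only prints the commands; set "DRY_RUN=0" to execute them.

```shell
#!/bin/sh
# run COMMAND... (hypothetical helper)
# Print the command; execute it only when DRY_RUN=0.
run() {
	printf '+ %s\n' "$*"
	if [ "${DRY_RUN:-1}" = 0 ]; then
		"$@"
	fi
}

# Garbage-collect after inboxes or messages were removed.
run public-inbox-extindex --gc --all

# Full re-index, e.g. for an in-place upgrade; expect the Xapian
# database to roughly double in size.
run public-inbox-extindex --reindex --all

# Cheaper variant: only pick up new and stale entries.
run public-inbox-extindex --reindex --fast --all
```

Keeping the dry-run default makes it easy to review the commands before committing to a long-running re-index.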

FILES

public-inbox-extindex-format(5)

CONFIGURATION

public-inbox-extindex does not currently write to the public-inbox-config(5) file; configuration may be entered manually. The extindex name of "all" is a special case which corresponds to indexing "--all" inboxes. An example for "--all" is as follows:

        [extindex "all"]
                topdir = /path/to/extindex_dir
                url = all
                coderepo = foo
                coderepo = bar

See public-inbox-config(5) for more details.

ENVIRONMENT

PI_CONFIG

Used to override the default "~/.public-inbox/config" value.

XAPIAN_FLUSH_THRESHOLD

The number of documents to update before committing changes to disk. This environment variable is handled directly by Xapian; refer to the Xapian API documentation for more details.

Setting "XAPIAN_FLUSH_THRESHOLD" or "publicinbox.indexBatchSize" for a large "--reindex" may cause public-inbox-mda(1), public-inbox-learn(1) and public-inbox-watch(1) tasks to wait for long and unpredictable periods of time.

Default: none, uses "publicinbox.indexBatchSize"
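One way to keep an inherited "XAPIAN_FLUSH_THRESHOLD" from affecting a large "--reindex" is to clear it for that one command. The sketch below assumes a hypothetical wrapper named "with_extindex_env" and placeholder paths; it sets "PI_CONFIG" and unsets "XAPIAN_FLUSH_THRESHOLD" in a subshell, so the calling shell is left unchanged. It does not affect "publicinbox.indexBatchSize", which lives in the config file.

```shell
#!/bin/sh
# with_extindex_env CONFIG_FILE COMMAND... (hypothetical helper)
# Run COMMAND with PI_CONFIG pointing at CONFIG_FILE and without
# XAPIAN_FLUSH_THRESHOLD. The function body is a subshell, so the
# caller's environment is not modified.
with_extindex_env() (
	PI_CONFIG=$1
	shift
	export PI_CONFIG
	unset XAPIAN_FLUSH_THRESHOLD
	"$@"
)

# Demonstration with a harmless command and a placeholder path.
with_extindex_env /path/to/config sh -c 'echo "PI_CONFIG=$PI_CONFIG"'

# A real invocation might look like this (not executed here):
#   with_extindex_env /path/to/config public-inbox-extindex --reindex --all
```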

UPGRADING

Occasionally, public-inbox will update its schema version and require a full index by running this command.

CONTACT

Feedback welcome via plain-text mail to <mailto:meta@public-inbox.org>

The mail archives are hosted at <https://public-inbox.org/meta/> and <http://4uok3hntl7oi7b4uf4rtfwefqeexfzil2w6kgk2jn5z2f764irre7byd.onion/meta/>

COPYRIGHT

Copyright all contributors <mailto:meta@public-inbox.org>

License: AGPL-3.0+ <https://www.gnu.org/licenses/agpl-3.0.txt>

SEE ALSO

Search::Xapian, DBD::SQLite

1993-10-02 public-inbox.git