.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.47.4.
.TH OMINDEX "1" "April 2017" "xapian-omega 1.4.4" "User Commands"
.SH NAME
omindex \- Index static website data via the filesystem
.SH SYNOPSIS
.B omindex
[\fI\,OPTIONS\/\fR] \fI\,--db DATABASE \/\fR[\fI\,BASEDIR\/\fR] \fI\,DIRECTORY\/\fR
.SH DESCRIPTION
omindex \- Index static website data via the filesystem
.PP
DIRECTORY is the directory to start indexing from.
.PP
BASEDIR is the directory corresponding to URL (default: DIRECTORY).
.SH OPTIONS
.TP
\fB\-d\fR, \fB\-\-duplicates\fR
set duplicate handling ('ignore' or 'replace')
.TP
\fB\-p\fR, \fB\-\-no\-delete\fR
skip the deletion of documents corresponding to
deleted files (\fB\-\-preserve\-nonduplicates\fR is a deprecated
alias for \fB\-\-no\-delete\fR)
.TP
\fB\-e\fR, \fB\-\-empty\-docs\fR=\fI\,ARG\/\fR
how to handle documents we extract no text from:
ARG can be index, warn (issue a diagnostic and
index), or skip.  (default: warn)
.TP
\fB\-D\fR, \fB\-\-db\fR=\fI\,DATABASE\/\fR
path to database to use
.TP
\fB\-U\fR, \fB\-\-url\fR=\fI\,URL\/\fR
base url BASEDIR corresponds to (default: /)
.TP
\fB\-M\fR, \fB\-\-mime\-type\fR=\fI\,EXT\/\fR:TYPE
assume any file with extension EXT has MIME
Content\-Type TYPE, instead of using libmagic (empty
TYPE removes any existing mapping for EXT)
.TP
\fB\-F\fR, \fB\-\-filter\fR=\fI\,M[\/\fR,[T][,C]]:CMD
process files with MIME Content\-Type M using command
CMD, which produces output (on stdout or in a
temporary file) with format T (Content\-Type or file
extension; currently txt (default) or html) in
character encoding C (default: UTF\-8).  E.g.
\fB\-Fapplication\fR/octet\-stream:'strings \fB\-n8\fR'
or
\fB\-Ftext\fR/x\-foo,,utf\-16:'foo2utf16 %f %t'
.TP
\fB\-l\fR, \fB\-\-depth\-limit\fR=\fI\,LIMIT\/\fR
set recursion limit (0 = unlimited)
.TP
\fB\-f\fR, \fB\-\-follow\fR
follow symbolic links
.TP
\fB\-i\fR, \fB\-\-ignore\-exclusions\fR
ignore meta robots tags and similar exclusions
.TP
\fB\-S\fR, \fB\-\-spelling\fR
index data for spelling correction
.TP
\fB\-m\fR, \fB\-\-max\-size\fR
maximum size of file to index (in bytes or with a
suffix of 'K'/'k', 'M'/'m', 'G'/'g')
(default: unlimited)
.TP
\fB\-\-sample\fR=\fI\,SOURCE\/\fR
what to use for the stored sample of text for
HTML documents \- SOURCE can be 'body' or
\&'description' (default: 'body')
.TP
\fB\-E\fR, \fB\-\-sample\-size\fR=\fI\,SIZE\/\fR
maximum size for the document text sample
(supports the same formats as \fB\-\-max\-size\fR).
(default: 512)
.TP
\fB\-T\fR, \fB\-\-title\-size\fR=\fI\,SIZE\/\fR
maximum size for the document title
(supports the same formats as \fB\-\-max\-size\fR).
(default: 128)
.TP
\fB\-R\fR, \fB\-\-retry\-failed\fR
retry files which omindex failed to extract text
from on a previous run
.TP
\fB\-\-opendir\-sleep\fR=\fI\,SECS\/\fR
sleep for SECS seconds before opening each
directory \- sleeping for 2 seconds seems to
reliably work around problems with indexing files
on Microsoft DFS shares.
.TP
\fB\-C\fR, \fB\-\-track\-ctime\fR
track each file's ctime so we can detect changes
to ownership or permissions.
.TP
\fB\-v\fR, \fB\-\-verbose\fR
show more information about what is happening
.TP
\fB\-\-overwrite\fR
create the database anew (the default is to update
if the database already exists)
.TP
\fB\-s\fR, \fB\-\-stemmer\fR=\fI\,LANG\/\fR
set the stemming language (default: english).
Possible values: arabic armenian basque catalan danish dutch earlyenglish
english finnish french german german2 hungarian italian kraaij_pohlmann
lovins norwegian porter portuguese romanian russian spanish swedish turkish
(pass 'none' to disable stemming)
.TP
\fB\-h\fR, \fB\-\-help\fR
display this help and exit
.TP
\fB\-V\fR, \fB\-\-version\fR
output version information and exit
.PP
Please report bugs at:
https://xapian.org/bugs
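.SH EXAMPLES
The invocations below are illustrative sketches that combine only the
options documented above.
The database path, the URL and the directory are hypothetical placeholders,
not defaults, and should be replaced with local values.
.PP
Index the files under /var/www/html, assuming that directory is served
at the URL /:
.PP
.nf
.RS
omindex \-\-db /var/lib/omega/data/default \-\-url / /var/www/html
.RE
.fi
.PP
Create the same database anew, with German stemming and with data for
spelling correction:
.PP
.nf
.RS
omindex \-\-overwrite \-\-stemmer=german \-\-spelling \e
    \-\-db /var/lib/omega/data/default \-\-url / /var/www/html
.RE
.fi
.PP
Extract text from otherwise unrecognised binary files with the filter
shown under \fB\-\-filter\fR, and descend at most three directory levels:
.PP
.nf
.RS
omindex \-Fapplication/octet\-stream:'strings \-n8' \-\-depth\-limit=3 \e
    \-\-db /var/lib/omega/data/default \-\-url / /var/www/html
.RE
.fi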