.\" Automatically generated by Pod::Man 2.25 (Pod::Simple 3.16) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is turned on, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .ie \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . nr % 0 . rr F .\} .el \{\ . de IX .. .\} .\" .\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2). .\" Fear. Run. Save yourself. No user-serviceable parts. . \" fudge factors for nroff and troff .if n \{\ . ds #H 0 . ds #V .8m . ds #F .3m . ds #[ \f1 . ds #] \fP .\} .if t \{\ . ds #H ((1u-(\\\\n(.fu%2u))*.13m) . ds #V .6m . ds #F 0 . ds #[ \& . ds #] \& .\} . \" simple accents for nroff and troff .if n \{\ . ds ' \& . 
ds ` \& . ds ^ \& . ds , \& . ds ~ ~ . ds / .\} .if t \{\ . ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u" . ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u' . ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u' . ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u' . ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u' . ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u' .\} . \" troff and (daisy-wheel) nroff accents .ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V' .ds 8 \h'\*(#H'\(*b\h'-\*(#H' .ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#] .ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H' .ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u' .ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#] .ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#] .ds ae a\h'-(\w'a'u*4/10)'e .ds Ae A\h'-(\w'A'u*4/10)'E . \" corrections for vroff .if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u' .if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u' . \" for low resolution devices (crt and lpr) .if \n(.H>23 .if \n(.V>19 \ \{\ . ds : e . ds 8 ss . ds o a . ds d- d\h'-1'\(ga . ds D- D\h'-1'\(hy . ds th \o'bp' . ds Th \o'LP' . ds ae ae . ds Ae AE .\} .rm #[ #] #H #V #F C .\" ======================================================================== .\" .IX Title "SWISH.CGI 7" .TH SWISH.CGI 7 "2012-03-12" "perl v5.14.2" "User Contributed Perl Documentation" .\" For nroff, turn off justification. Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .if n .ad l .nh .SH "NAME" swish.cgi \-\- Example Perl script for searching with the SWISH\-E search engine. .SH "DESCRIPTION" .IX Header "DESCRIPTION" \&\f(CW\*(C`swish.cgi\*(C'\fR is a \s-1CGI\s0 script for searching with the SWISH-E search engine version 2.1\-dev and above. 
It returns results a page at a time, with matching words from the source document highlighted, showing a few words of content on either side of the highlighted word. .PP The script is highly configurable. Features include searching multiple (or selectable) indexes, limiting searches to a subset of documents, sorting by a number of different properties, and limiting results to a date range. .PP On Unix-type systems the swish.cgi script is installed in the directory \&\f(CW$prefix\fR/lib/swish\-e, which is typically /usr/local/lib/swish\-e. This can be overridden by the configure options \-\-prefix or \-\-libexecdir. .PP The standard configuration (i.e. not using a config file) should work with most swish index files. Customization of the parameters will be needed if you are indexing special meta data and want to search and/or display the meta data. The configuration can be modified by editing this script directly, or by using a configuration file (.swishcgi.conf by default). The script's configuration file is described below. .PP You are strongly encouraged to get the default configuration working before making changes. Most problems using this script are the result of configuration modifications. .PP The script is modular in design. Both the highlighting code and output generation are handled by modules, which are included in the \&\fIexample/modules\fR distribution directory and installed in the \&\f(CW$libexecdir\fR/perl directory. This allows for easy customization of the output without changing the main \s-1CGI\s0 script. .PP Included with the Swish-e distribution is a module to generate standard \s-1HTML\s0 output. There are also modules and template examples to use with the popular Perl templating systems HTML::Template and Template-Toolkit. This is very useful if your site already uses one of these templating systems. The HTML::Template and Template-Toolkit packages are not distributed with Swish-e.
They are available from the \s-1CPAN\s0 (http://search.cpan.org). .PP This script can also run basically unmodified as a mod_perl handler, providing much better performance than running as a \s-1CGI\s0 script. Usage under mod_perl is described below. .PP Please read the rest of the documentation. There's a \f(CW\*(C`DEBUGGING\*(C'\fR section, and a \f(CW\*(C`FAQ\*(C'\fR section. .PP This script should work on Windows, but security may be an issue. .SH "REQUIREMENTS" .IX Header "REQUIREMENTS" A reasonably current version of Perl is required. Version 5.00503 or above is recommended (anything older will not be supported). .PP The Date::Calc module is required to use the date range feature of the script. The Date::Calc module is also available from \s-1CPAN\s0. .SH "INSTALLATION" .IX Header "INSTALLATION" Here's an example installation session under Linux. It should be similar for other operating systems. .PP For the sake of simplicity in this installation example all files are placed in web server space, including files such as swish-e index and configuration files that would normally not be made available via the web server. Access to these files should be limited once the script is running. Either move the files to other locations (and adjust the script's configuration) or use features of the web server to limit access (such as with \fI.htaccess\fR). .PP Please get a simple installation working before modifying the configuration file. Most problems reported with this script have been due to improper configuration. .PP The script's default settings are set up for initial testing. By default the settings expect to find most files and the swish-e binary in the same directory as the script. .PP For \fIsecurity\fR reasons, once you have tested the script you will want to change settings to limit access to some of these files by the web server (either by moving them out of web space, or using access control such as \fI.htaccess\fR).
An example of using \fI.htaccess\fR on Apache is given below. .PP It's expected that swish-e has already been unpacked and the swish-e binary has been compiled from source and \*(L"make install\*(R" has been run. If swish-e was installed from a vendor package (such as from an \s-1RPM\s0 or Debian package) see that package's documentation for where files are installed. .PP Example Installation: .IP "1 Symlink or copy the swish.cgi." 4 .IX Item "1 Symlink or copy the swish.cgi." Symlink (or copy if your platform or webserver does not allow symlinks) the swish.cgi script from the installation directory to a local directory. Typically, this would be the cgi-bin directory or a location where \s-1CGI\s0 scripts are located. In this example a new directory is created and the script is symlinked. .Sp .Vb 3 \& ~$ mkdir swishdir \& ~$ cd swishdir \& ~/swishdir$ ln \-s /usr/local/lib/swish\-e/swish.cgi .Ve .Sp The installation directory is set at configure time with the \-\-prefix or \&\-\-libexecdir options, but by default is /usr/local/lib/swish\-e. .IP "2 Create an index" 4 .IX Item "2 Create an index" Use an editor and create a simple configuration file for indexing your files. In this example the Apache documentation is indexed. Finally, we run a simple query to test that the index works correctly. .Sp .Vb 7 \& ~/swishdir$ cat swish.conf \& IndexDir /usr/local/apache/htdocs \& IndexOnly .html .htm \& DefaultContents HTML* \& StoreDescription HTML* <body>
200000 \& MetaNames swishdocpath swishtitle \& ReplaceRules remove /usr/local/apache/ .Ve .Sp If you do not have the Apache docs installed then pick another directory to index such as /usr/share/doc. .Sp Create the index. .Sp .Vb 10 \& ~/swishdir$ swish\-e \-c swish.conf \& Indexing Data Source: "File\-System" \& Indexing "/usr/local/apache/htdocs" \& Removing very common words... \& no words removed. \& Writing main index... \& Sorting words ... \& Sorting 7005 words alphabetically \& Writing header ... \& Writing index entries ... \& Writing word text: Complete \& Writing word hash: Complete \& Writing word data: Complete \& 7005 unique words indexed. \& 5 properties sorted. \& 124 files indexed. 1485844 total bytes. 171704 total words. \& Elapsed time: 00:00:02 CPU time: 00:00:02 \& Indexing done! .Ve .Sp Now, verify that the index can be searched: .Sp .Vb 8 \& ~/swishdir$ swish\-e \-w install \-m 1 \& # SWISH format: 2.1\-dev\-25 \& # Search words: install \& # Number of hits: 14 \& # Search time: 0.001 seconds \& # Run time: 0.040 seconds \& 1000 htdocs/manual/dso.html "Apache 1.3 Dynamic Shared Object (DSO) support" 17341 \& . .Ve .Sp Let's see what files we have in our directory now: .Sp .Vb 5 \& ~/swishdir$ ls \-1 \& index.swish\-e \& index.swish\-e.prop \& swish.cgi \& swish.conf .Ve .IP "3 Test the \s-1CGI\s0 script" 4 .IX Item "3 Test the CGI script" This is a simple step, but often overlooked. You should test from the command line instead of jumping ahead and testing with the web server. See the \f(CW\*(C`DEBUGGING\*(C'\fR section below for more information. .Sp .Vb 2 \& ~/swishdir$ ./swish.cgi | head \& Content\-Type: text/html; charset=ISO\-8859\-1 \& \& \& \& \&