KinoSearch1::InvIndexer(3pm) User Contributed Perl Documentation KinoSearch1::InvIndexer(3pm)
 

NAME

KinoSearch1::InvIndexer - build inverted indexes

SYNOPSIS

    use KinoSearch1::InvIndexer;
    use KinoSearch1::Analysis::PolyAnalyzer;
    my $analyzer
        = KinoSearch1::Analysis::PolyAnalyzer->new( language => 'en' );
    my $invindexer = KinoSearch1::InvIndexer->new(
        invindex => '/path/to/invindex',
        create   => 1,
        analyzer => $analyzer,
    );
    $invindexer->spec_field(
        name  => 'title',
        boost => 3,
    );
    $invindexer->spec_field( name => 'bodytext' );
    while ( my ( $title, $bodytext ) = each %source_documents ) {
        my $doc = $invindexer->new_doc;
        $doc->set_value( title    => $title );
        $doc->set_value( bodytext => $bodytext );
        $invindexer->add_doc($doc);
    }
    $invindexer->finish;

DESCRIPTION

The InvIndexer class is KinoSearch1's primary tool for creating and modifying inverted indexes, which may be searched using KinoSearch1::Searcher.

METHODS

new

    my $invindexer = KinoSearch1::InvIndexer->new(
        invindex => '/path/to/invindex',  # required
        create   => 1,                    # default: 0
        analyzer => $analyzer,            # default: no-op Analyzer
    );
Create an InvIndexer object.
invindex - either a filepath, or an object belonging to an InvIndex subclass such as KinoSearch1::Store::FSInvIndex or KinoSearch1::Store::RAMInvIndex.
create - if true, create a new invindex, clobbering an existing one if necessary.
analyzer - an object which subclasses KinoSearch1::Analysis::Analyzer, such as a PolyAnalyzer.
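
Because create => 1 clobbers an existing invindex, a script that runs repeatedly may want to set the flag only when nothing exists at the target path yet. The following is a minimal sketch of such a check. The helper name create_flag_for is hypothetical and not part of KinoSearch1, and the sketch assumes that an existing directory at the path is a usable invindex, which it does not verify.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch1): return 1 when no directory
# exists at $path, so that a fresh invindex gets created, and 0 otherwise,
# so that an existing invindex is kept and modified.
# Assumption: a directory at $path is a valid invindex; this is not checked.
sub create_flag_for {
    my ($path) = @_;
    my $exists = -d $path;
    return $exists ? 0 : 1;
}

# Intended use, with $analyzer set up as in the SYNOPSIS:
#
#     my $invindexer = KinoSearch1::InvIndexer->new(
#         invindex => $path,
#         create   => create_flag_for($path),
#         analyzer => $analyzer,
#     );
```

Keeping the decision in one small function makes it easy to swap in a stricter test later, for example one that looks for specific files inside the directory.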

spec_field

    $invindexer->spec_field(
        name       => 'url',      # required
        boost      => 1,          # default: 1
        analyzer   => undef,      # default: analyzer spec'd in new()
        indexed    => 0,          # default: 1
        analyzed   => 0,          # default: 1
        stored     => 1,          # default: 1
        compressed => 0,          # default: 0
        vectorized => 0,          # default: 1
    );
Define a field.
name - the field's name.
boost - a multiplier which determines how much the field contributes to a document's score.
analyzer - By default, all indexed fields are analyzed using the analyzer that was supplied to new(). Supplying an alternate for a given field overrides the primary analyzer.
indexed - index the field, so that it can be searched later.
analyzed - analyze the field, using the relevant Analyzer. Fields such as "category" or "product_number" might be indexed but not analyzed.
stored - store the field, so that it can be retrieved when the document turns up in a search.
compressed - compress the stored field, using the zlib compression algorithm.
vectorized - store the field's "term vectors", which are required by KinoSearch1::Highlight::Highlighter for excerpt selection and search term highlighting.
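
The options above combine into a few common field types. As a rough sketch, the table below defines two analyzed full-text fields, one exact-match field, and one retrieval-only field. The field names and the helper define_fields are hypothetical; only spec_field and its parameters come from this documentation.

```perl
use strict;
use warnings;

# Illustrative field table; the field names are hypothetical.
my @field_specs = (
    # Analyzed full-text fields; boost multiplies the title's contribution
    # to a document's score by 3.
    { name => 'title', boost => 3 },
    { name => 'bodytext' },

    # Exact-match field: indexed but not analyzed, no term vectors.
    { name => 'category', analyzed => 0, vectorized => 0 },

    # Retrieval-only field: stored by default, but not searchable.
    { name => 'url', indexed => 0, analyzed => 0, vectorized => 0 },
);

# Hypothetical helper: call spec_field once per entry on the supplied
# InvIndexer and return the number of fields defined.
sub define_fields {
    my ( $invindexer, @specs ) = @_;
    $invindexer->spec_field( %{$_} ) for @specs;
    return scalar @specs;
}

# Intended use: define_fields( $invindexer, @field_specs );
```

Options left out of an entry fall back to the defaults listed in the spec_field signature above.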

new_doc

    my $doc = $invindexer->new_doc;
Spawn an empty KinoSearch1::Document::Doc object, primed to accept values for the fields spec'd by spec_field.

add_doc

    $invindexer->add_doc($doc);
Add a document to the invindex.

add_invindexes

    my $invindexer = KinoSearch1::InvIndexer->new( 
        invindex => $invindex,
        analyzer => $analyzer,
    );
    $invindexer->add_invindexes( $another_invindex, $yet_another_invindex );
    $invindexer->finish;
Absorb existing invindexes into this one. May only be called once per InvIndexer. add_invindexes() and add_doc() cannot be called on the same InvIndexer.

delete_docs_by_term

    my $term = KinoSearch1::Index::Term->new( 'id', $unique_id );
    $invindexer->delete_docs_by_term($term);
Mark any document which contains the supplied term as deleted, so that it will be excluded from search results. For more info, see Deletions in KinoSearch1::Docs::FileFormat.
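
A typical use of delete_docs_by_term is replacing a document: mark the old version as deleted, then add the new one. The sketch below shows that pattern under two assumptions that this documentation does not state: that deletions and additions may share one InvIndexer session, and that the unique 'id' field was spec'd as indexed but not analyzed so the term matches exactly. The helper name replace_doc is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of KinoSearch1): replace a document by
# marking the old version as deleted and adding a fresh one.
# $term identifies the old document, e.g. a KinoSearch1::Index::Term built
# on a unique 'id' field.
# Assumption: deletions and additions may share one InvIndexer session.
sub replace_doc {
    my ( $invindexer, $term, %values ) = @_;

    # Exclude every existing document containing the term from results.
    $invindexer->delete_docs_by_term($term);

    # Build the replacement from field name => value pairs.
    my $doc = $invindexer->new_doc;
    $doc->set_value( $_ => $values{$_} ) for sort keys %values;
    $invindexer->add_doc($doc);
    return $doc;
}

# Intended use:
#
#     my $term = KinoSearch1::Index::Term->new( 'id', $unique_id );
#     replace_doc( $invindexer, $term, id => $unique_id, title => $title );
```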

finish

    $invindexer->finish( 
        optimize => 1, # default: 0
    );
Finish the invindex. Invalidates the InvIndexer. Takes one hash-style parameter.
optimize - If optimize is set to 1, the invindex will be collapsed to its most compact form, which will yield the fastest queries.

COPYRIGHT

Copyright 2005-2010 Marvin Humphrey

LICENSE, DISCLAIMER, BUGS, etc.

See KinoSearch1 version 1.01.
2014-08-15 perl v5.20.0