.\" Automatically generated by Pod::Man 4.11 (Pod::Simple 3.35) .\" .\" Standard preamble: .\" ======================================================================== .de Sp \" Vertical space (when we can't use .PP) .if t .sp .5v .if n .sp .. .de Vb \" Begin verbatim text .ft CW .nf .ne \\$1 .. .de Ve \" End verbatim text .ft R .fi .. .\" Set up some character translations and predefined strings. \*(-- will .\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left .\" double quote, and \*(R" will give a right double quote. \*(C+ will .\" give a nicer C++. Capital omega is used to do unbreakable dashes and .\" therefore won't be available. \*(C` and \*(C' expand to `' in nroff, .\" nothing in troff, for use with C<>. .tr \(*W- .ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p' .ie n \{\ . ds -- \(*W- . ds PI pi . if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch . if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch . ds L" "" . ds R" "" . ds C` "" . ds C' "" 'br\} .el\{\ . ds -- \|\(em\| . ds PI \(*p . ds L" `` . ds R" '' . ds C` . ds C' 'br\} .\" .\" Escape single quotes in literal strings from groff's Unicode transform. .ie \n(.g .ds Aq \(aq .el .ds Aq ' .\" .\" If the F register is >0, we'll generate index entries on stderr for .\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index .\" entries marked with X<> in POD. Of course, you'll have to process the .\" output yourself in some meaningful fashion. .\" .\" Avoid warning from groff about undefined register 'F'. .de IX .. .nr rF 0 .if \n(.g .if rF .nr rF 1 .if (\n(rF:(\n(.g==0)) \{\ . if \nF \{\ . de IX . tm Index:\\$1\t\\n%\t"\\$2" .. . if !\nF==2 \{\ . nr % 0 . nr F 2 . \} . 
\} .\} .rr rF .\" ======================================================================== .\" .IX Title "Bio::DB::GFF::Adaptor::dbi::pg_fts 3pm" .TH Bio::DB::GFF::Adaptor::dbi::pg_fts 3pm "2020-01-13" "perl v5.30.0" "User Contributed Perl Documentation" .\" For nroff, turn off justification. Always turn off hyphenation; it makes .\" way too many mistakes in technical documents. .if n .ad l .nh .SH "NAME" Bio::DB::GFF::Adaptor::dbi::pg_fts \-\- Database adaptor for a specific postgres schema with a TSearch2 implementation .SH "SYNOPSIS" .IX Header "SYNOPSIS" .Vb 3 \& #create new GFF database connection \& my $db = Bio::DB::GFF\->new( \-adaptor => \*(Aqdbi::pg_fts\*(Aq, \& \-dsn => \*(Aqdbi:Pg:dbname=worm\*(Aq); \& \& #add full text indexing \*(Aqstuff\*(Aq \& #assumes that TSearch2 is available to PostgreSQL \& #this will take a VERY long time for a reasonably large database \& $db\->install_TSearch2(); \& \& ...some time later... \& #we don\*(Aqt like full text searching... \& $db\->remove_TSearch2(); .Ve .SH "DESCRIPTION" .IX Header "DESCRIPTION" This adaptor is based on Bio::DB::GFF::Adaptor::dbi::pg but it implements the TSearch2 PostgreSQL contrib module for fast full text searching. To use this module with your PostgreSQL \s-1GFF\s0 database, you need to make TSearch2 available in the database. .PP To use this adaptor, follow these steps: .IP "Install TSearch2 contrib module for Pg" 4 .IX Item "Install TSearch2 contrib module for Pg" Can be as easy as `sudo yum install postgresql\-contrib`, or you may need to recompile PostgreSQL to include it. 
See for more details .IP "Load the TSearch2 functions into your database" 4 .IX Item "Load the TSearch2 functions into your database" .Vb 1 \& % cat tsearch2.sql | psql .Ve .IP "Load your data using the pg adaptor:" 4 .IX Item "Load your data using the pg adaptor:" .Vb 1 \& % bp_pg_bulk_load_gff.pl \-c \-d yeast saccharomyces_cerevisiae.gff .Ve .Sp or .Sp .Vb 1 \& % bp_load_gff.pl \-c \-d yeast \-a dbi::pg saccharomyces_cerevisiae.gff .Ve .IP "Add GFF/TSearch2 specific modifications" 4 .IX Item "Add GFF/TSearch2 specific modifications" Execute a perl script like this one: .Sp .Vb 2 \& #!/usr/bin/perl \-w \& use strict; \& \& use Bio::DB::GFF; \& \& my $db = Bio::DB::GFF\->new( \& \-adaptor => \*(Aqdbi::pg_fts\*(Aq, \& \-dsn => \*(Aqdbi:Pg:dbname=yeast\*(Aq, \& \-user => \*(Aqscott\*(Aq, \& ); \& \& print "Installing TSearch2 columns...\en"; \& \& $db\->install_TSearch2(); \& \& print "Done\en"; .Ve .PP Note that this last step will take a long time. For an S. cerevisiae database with 15K rows, it took over an hour on my laptop, and with a C. elegans database (~10 million rows) it took well over a day. .PP If at some point you add more data to your database, run a script like the one above that calls the \fBupdate_TSearch2()\fR method instead. Finally, if you want to remove the TSearch2 columns from your database and go back to using the pg adaptor, run a script like the one above that calls the \fBremove_TSearch2()\fR method instead. .SH "NOTES ABOUT TSearch2 SEARCHING" .IX Header "NOTES ABOUT TSearch2 SEARCHING" You should know a few things about how searching with TSearch2 works in the GBrowse environment: .IP "1." 4 TSearch2 does not do wild cards, so you should encourage your users not to use them. If wild cards are used, the adaptor will fall back on an \s-1ILIKE\s0 search, which will be much slower. .IP "2." 4 However, TSearch2 does do 'word stemming'. That is, if you search for 'copy', it will find 'copy', 'copies', and 'copied'. .IP "3." 
4 TSearch2 does not do phrase searching; all of the terms in the search string are ANDed together. .SH "ACKNOWLEDGEMENTS" .IX Header "ACKNOWLEDGEMENTS" Special thanks to Russell Smithies and Paul Smale at AgResearch in New Zealand for giving me their recipe for doing full text indexing in a \s-1GFF\s0 database. .SH "BUGS" .IX Header "BUGS" Please report bugs to the BioPerl and/or GBrowse mailing lists ( and respectively). .SH "SEE ALSO" .IX Header "SEE ALSO" Please see Bio::DB::GFF::Adaptor::dbi::pg for more information about tuning your PostgreSQL server for \s-1GFF\s0 data, and for general information about \s-1GFF\s0 database access, see Bio::DB::GFF. .SH "AUTHOR" .IX Header "AUTHOR" Scott Cain, cain@cshl.edu .SH "APPENDIX" .IX Header "APPENDIX" .SS "search_notes" .IX Subsection "search_notes" .Vb 6 \& Title : search_notes \& Usage : @search_results = $db\->search_notes("full text string",$limit) \& Function: Search the notes for a text string, using PostgreSQL TSearch2 \& Returns : array of results \& Args : full text search string, and an optional row limit \& Status : public .Ve .PP This method is based on the mysql-specific one, but it uses the TSearch2 functionality in PostgreSQL's contrib directory. Given a search string, it performs a full-text search of the notes table and returns an array of results. Each row of the returned array is an arrayref containing the following fields: .PP .Vb 3 \& column 1 A Bio::DB::GFF::Featname object, for passing to segment() \& column 2 The text of the note \& column 3 A relevance score. 
.Ve .SS "make_features_by_name_where_part" .IX Subsection "make_features_by_name_where_part" .Vb 3 \& Title : make_features_by_name_where_part \& Function: constructs a TSearch2\-compliant WHERE clause for a name search \& Status : protected .Ve .SS "install_TSearch2" .IX Subsection "install_TSearch2" .Vb 4 \& Title : install_TSearch2 \& Function: installs schema modifications for use with TSearch2 \& Usage : $db\->install_TSearch2 \& Status : public .Ve .SS "update_TSearch2" .IX Subsection "update_TSearch2" .Vb 4 \& Title : update_TSearch2 \& Function: Updates TSearch2 columns \& Usage : $db\->update_TSearch2 \& Status : public .Ve .SS "remove_TSearch2" .IX Subsection "remove_TSearch2" .Vb 4 \& Title : remove_TSearch2 \& Function: Removes TSearch2 columns \& Usage : $db\->remove_TSearch2 \& Status : public .Ve
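The row layout returned by search_notes (a feature name, the note text, and a relevance score) lends itself to simple post-processing. The following is a minimal sketch, not part of the adaptor: rank_notes is a hypothetical helper, the sample rows are made up, and plain strings stand in for the Bio::DB::GFF::Featname objects that a real search would return in column 1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not provided by the adaptor): order rows shaped like
# search_notes results by descending relevance score (column 3, index 2).
sub rank_notes {
    my @rows = @_;
    return sort { $b->[2] <=> $a->[2] } @rows;
}

# Made-up stand-in for: my @results = $db->search_notes('kinase', 10);
# Each row is an arrayref: [ feature name, note text, relevance score ].
my @results = (
    [ 'Gene:example1', 'putative protein kinase',  0.12 ],
    [ 'Gene:example2', 'serine/threonine kinase',  0.48 ],
    [ 'Gene:example3', 'kinase regulatory domain', 0.30 ],
);

# Print the most relevant notes first.
for my $row (rank_notes(@results)) {
    my ($name, $note, $score) = @$row;
    printf "%-15s %.2f  %s\n", $name, $score, $note;
}
```

With a live database, the first element of each row could be passed to segment() as described under search_notes; that step is left out here so the sketch runs without a PostgreSQL connection.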