'\" t
.\"     Title: gt-select
.\"    Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author]
.\" Generator: DocBook XSL Stylesheets vsnapshot
.\"      Date: 02/28/2024
.\"    Manual: GenomeTools Manual
.\"    Source: GenomeTools 1.6.5
.\"  Language: English
.\"
.TH "GT\-SELECT" "1" "02/28/2024" "GenomeTools 1\&.6\&.5" "GenomeTools Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
gt-select \- Select certain features (specified by the used options) from given GFF3 file(s)\&.
.SH "SYNOPSIS"
.sp
\fBgt select\fR [option \&...] [GFF3_file \&...]
.SH "DESCRIPTION"
.PP
\fB\-retainids\fR [\fIyes|no\fR]
.RS 4
when available, use the original IDs provided in the source file (memory consumption is proportional to the input file size(s)) (default: no)
.RE
.PP
\fB\-seqid\fR [\fIstring\fR]
.RS 4
select features with the given sequence ID (all comments are selected)\&.
(default: undefined)
.RE
.PP
\fB\-source\fR [\fIstring\fR]
.RS 4
select features with the given source (the source is column 2 in regular GFF3 lines) (default: undefined)
.RE
.PP
\fB\-contain\fR [\fIstart\fR \fIend\fR]
.RS 4
select all features which are contained in the given range (default: undefined)
.RE
.PP
\fB\-overlap\fR [\fIstart\fR \fIend\fR]
.RS 4
select all features which overlap the given range (default: undefined)
.RE
.PP
\fB\-strand\fR [\fIstring\fR]
.RS 4
select all top\-level features (i\&.e\&., features without parents) whose strand equals the given one (must be one of \fI+\-\&.?\fR) (default: undefined)
.RE
.PP
\fB\-targetstrand\fR [\fIstring\fR]
.RS 4
select all top\-level features (i\&.e\&., features without parents) which have exactly one target attribute whose strand equals the given one (must be one of \fI+\-\&.?\fR) (default: undefined)
.RE
.PP
\fB\-targetbest\fR [\fIyes|no\fR]
.RS 4
if multiple top\-level features (i\&.e\&., features without parents) with exactly one target attribute have the same target_id, keep only the feature with the best score\&. If \-targetstrand is used at the same time, this option is applied after \-targetstrand\&. Memory consumption is proportional to the input file size(s)\&.
(default: no)
.RE
.PP
\fB\-hascds\fR [\fIyes|no\fR]
.RS 4
select all top\-level features which have a CDS child (default: no)
.RE
.PP
\fB\-maxgenelength\fR [\fIvalue\fR]
.RS 4
select genes up to the given maximum length (default: undefined)
.RE
.PP
\fB\-maxgenenum\fR [\fIvalue\fR]
.RS 4
select the first genes up to the given maximum number (default: undefined)
.RE
.PP
\fB\-mingenescore\fR [\fIvalue\fR]
.RS 4
select genes with the given minimum score (default: undefined)
.RE
.PP
\fB\-maxgenescore\fR [\fIvalue\fR]
.RS 4
select genes with the given maximum score (default: undefined)
.RE
.PP
\fB\-minaveragessp\fR [\fIvalue\fR]
.RS 4
set the minimum average splice site probability (default: undefined)
.RE
.PP
\fB\-rule_files\fR
.RS 4
specify Lua filter rule files to be used for selection (terminate list with \fI\-\-\fR)
.RE
.PP
\fB\-rule_logic\fR [\fI\&...\fR]
.RS 4
select how multiple Lua files should be combined; choose from AND|OR (default: AND)
.RE
.PP
\fB\-dropped_file\fR [\fIfilename\fR]
.RS 4
save non\-selected features to file (default: undefined)
.RE
.PP
\fB\-v\fR [\fIyes|no\fR]
.RS 4
be verbose (default: no)
.RE
.PP
\fB\-o\fR [\fIfilename\fR]
.RS 4
redirect output to specified file (default: undefined)
.RE
.PP
\fB\-gzip\fR [\fIyes|no\fR]
.RS 4
write gzip compressed output file (default: no)
.RE
.PP
\fB\-bzip2\fR [\fIyes|no\fR]
.RS 4
write bzip2 compressed output file (default: no)
.RE
.PP
\fB\-force\fR [\fIyes|no\fR]
.RS 4
force writing to output file (default: no)
.RE
.PP
\fB\-help\fR
.RS 4
display help and exit
.RE
.PP
\fB\-version\fR
.RS 4
display version information and exit
.RE
.sp
File format for option \fI\-rule_files\fR:
.sp
The files supplied to option \fI\-rule_files\fR define a function for filtering by user\-given criteria (see example below):
.sp
.if n \{\
.RS 4
.\}
.nf
function filter(gn)
  local target = "exon"
  \-\- look for a child node of the target type
  for curnode in gn:children() do
    if (curnode:get_type() == target) then
      return false  \-\- found one: gn is not sorted out
    end
  end
  return true  \-\- no such child: gn is sorted out
end
.fi
.if n \{\
.RE
.\}
.sp
The above
function iterates over all children of \fIgn\fR and checks whether there is a node of type \fIexon\fR\&. If there is such a node, the function returns \fIfalse\fR, indicating that the parent node \fIgn\fR will not be sorted out\&. Otherwise it returns \fItrue\fR and \fIgn\fR is sorted out\&.
.sp
NOTE: The function must be named \fIfilter\fR and must return a boolean: \fIfalse\fR indicates that the node survived the filtering process, \fItrue\fR that it is sorted out\&.
.SH "REPORTING BUGS"
.sp
Report bugs to https://github\&.com/genometools/genometools/issues\&.
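.SH "EXAMPLES"
.sp
The following is an illustrative sketch, not part of the official GenomeTools documentation\&. It shows a second rule file for \fB\-rule_files\fR, using only the node methods that appear in the example above (\fIchildren\fR and \fIget_type\fR)\&. The file name \fIkeep_exon_cds\&.lua\fR and the input name \fIinput\&.gff3\fR are hypothetical, and the sketch assumes, as the example above implies, that returning \fItrue\fR sorts the node out\&. Under these assumptions, the rule keeps a node only if both an exon and a CDS are found among its children:
.sp
.if n \{\
.RS 4
.\}
.nf
function filter(gn)
  local has_exon, has_cds = false, false
  \-\- record which of the two types occur among the children
  for curnode in gn:children() do
    local t = curnode:get_type()
    if t == "exon" then has_exon = true end
    if t == "CDS" then has_cds = true end
  end
  \-\- false keeps gn, true sorts it out (assumed semantics)
  return not (has_exon and has_cds)
end
.fi
.if n \{\
.RE
.\}
.sp
Such a file could then be passed to \fBgt select\fR as follows, where \fI\-\-\fR terminates the list of rule files:
.sp
.if n \{\
.RS 4
.\}
.nf
gt select \-rule_files keep_exon_cds\&.lua \-\- input\&.gff3
.fi
.if n \{\
.RE
.\}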