.\" Man page generated from reStructuredText.
.
.TH "ECOTAXSPECIFICITY" "1" "Jul 27, 2019" " 1.02 13" "OBITools"
.SH NAME
ecotaxspecificity \- description of ecotaxspecificity
.
.nr rst2man-indent-level 0
.
.de1 rstReportMargin
\\$1 \\n[an-margin]
level \\n[rst2man-indent-level]
level margin: \\n[rst2man-indent\\n[rst2man-indent-level]]
-
\\n[rst2man-indent0]
\\n[rst2man-indent1]
\\n[rst2man-indent2]
..
.de1 INDENT
.\" .rstReportMargin pre:
. RS \\$1
. nr rst2man-indent\\n[rst2man-indent-level] \\n[an-margin]
. nr rst2man-indent-level +1
.\" .rstReportMargin post:
..
.de UNINDENT
. RE
.\" indent \\n[an-margin]
.\" old: \\n[rst2man-indent\\n[rst2man-indent-level]]
.nr rst2man-indent-level -1
.\" new: \\n[rst2man-indent\\n[rst2man-indent-level]]
.in \\n[rst2man-indent\\n[rst2man-indent-level]]u
..
.sp
The \fI\%ecotaxspecificity\fP command evaluates barcode resolution at different taxonomic ranks.
.sp
As input, it takes a sequence record file annotated with taxids in the sequence headers, and a database formatted either as an ecoPCR database (see obitaxonomy) or as an NCBI taxdump (see the NCBI ftp site).
.sp
An example of output is shown below:
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
Number of sequences added in graph: 284
Number of nodes in all components: 269
Number of sequences lost: 15!
rank                    taxon_ok    taxon_total     percent
order                          8              8      100.00
superfamily                    1              1      100.00
parvorder                      1              1      100.00
subkingdom                     1              1      100.00
superkingdom                   1              1      100.00
kingdom                        3              3      100.00
phylum                         5              5      100.00
infraorder                     1              1      100.00
subfamily                      3              3      100.00
class                          6              6      100.00
species                       35            176       19.89
superorder                     1              1      100.00
suborder                       1              1      100.00
subtribe                       1              1      100.00
subclass                       3              3      100.00
genus                          9             15       60.00
superclass                     1              1      100.00
family                        10             10      100.00
tribe                          2              2      100.00
subphylum                      1              1      100.00
.ft P
.fi
.UNINDENT
.UNINDENT
.sp
In this example, the input sequence file contains 284 sequence records, but only 269 have been examined, because taxonomic information could not be recovered for the 15 remaining ones.
.sp
The “taxon_total” column gives the number of different taxa observed at a given rank in the sequence record file (when taxonomic information is available at that rank), and “taxon_ok” gives the number of those taxa that the barcode sequence identifies unambiguously in the taxonomic database. In this example, the sequence records correspond to 176 different species, but only 35 of these have specific barcodes. “percent” is the percentage of unambiguously identified taxa among the total number of taxa (taxon_ok/taxon_total*100), here 35/176*100 = 19.89 for species.
.SH ECOTAXSPECIFICITY SPECIFIC OPTIONS
.INDENT 0.0
.TP
.B \-e INT, \-\-errors=
.INDENT 7.0
.INDENT 3.5
Two sequences are considered different if they have INT or more differences (default: 1).
.UNINDENT
.UNINDENT
.sp
\fIExample:\fP
.INDENT 7.0
.INDENT 3.5
.INDENT 0.0
.INDENT 3.5
.INDENT 0.0
.INDENT 3.5
.sp
.nf
.ft C
> ecotaxspecificity \-d my_ecopcr_database \-e 5 seq.fasta
.ft P
.fi
.UNINDENT
.UNINDENT
.UNINDENT
.UNINDENT
.sp
With this command, two sequences with fewer than 5 differences are considered to correspond to the same barcode.
.UNINDENT
.UNINDENT
.UNINDENT
.SH TAXONOMY RELATED OPTIONS
.INDENT 0.0
.TP
.B \-d , \-\-database=
ecoPCR taxonomy database name
.UNINDENT
.INDENT 0.0
.TP
.B \-t , \-\-taxonomy\-dump=
NCBI taxonomy dump repository name
.UNINDENT
.SH OPTIONS TO SPECIFY INPUT FORMAT
.SS Restrict the analysis to a sub\-part of the input file
.INDENT 0.0
.TP
.B \-\-skip
The first N sequence records of the file are excluded from the analysis and are not reported to the output file.
.UNINDENT
.INDENT 0.0
.TP
.B \-\-only
Only the next N sequence records of the file are analyzed. The remaining sequences in the file are neither analyzed nor reported to the output file. This option can be used together with the \fI\-\-skip\fP option.
.UNINDENT
.SS Sequence annotated format
.INDENT 0.0
.TP
.B \-\-genbank
Input file is in GenBank format.
.UNINDENT
.INDENT 0.0
.TP
.B \-\-embl
Input file is in EMBL format.
.UNINDENT
.SS fasta related format
.INDENT 0.0
.TP
.B \-\-fasta
Input file is in fasta format (including OBITools fasta extensions).
.UNINDENT
.SS fastq related format
.INDENT 0.0
.TP
.B \-\-sanger
Input file is in Sanger fastq format (standard fastq used by HiSeq/MiSeq sequencers).
.UNINDENT
.INDENT 0.0
.TP
.B \-\-solexa
Input file is in the fastq format produced by Solexa (GA IIx) sequencers.
.UNINDENT
.SS ecoPCR related format
.INDENT 0.0
.TP
.B \-\-ecopcr
Input file is in ecoPCR format.
.UNINDENT
.INDENT 0.0
.TP
.B \-\-ecopcrdb
Input is an ecoPCR database.
.UNINDENT
.SS Specifying the sequence type
.INDENT 0.0
.TP
.B \-\-nuc
Input file contains nucleic acid sequences.
.UNINDENT
.INDENT 0.0
.TP
.B \-\-prot
Input file contains protein sequences.
.UNINDENT
.SH COMMON OPTIONS
.INDENT 0.0
.TP
.B \-h, \-\-help
Shows this help message and exits.
.UNINDENT
.INDENT 0.0
.TP
.B \-\-DEBUG
Sets logging to debug mode.
.UNINDENT
.SH ECOTAXSPECIFICITY USED SEQUENCE ATTRIBUTE
.INDENT 0.0
.INDENT 3.5
.INDENT 0.0
.IP \(bu 2
taxid
.UNINDENT
.UNINDENT
.UNINDENT
.SH AUTHOR
The OBITools Development Team - LECA
.SH COPYRIGHT
2019 - 2015, OBITool Development Team
.\" Generated by docutils manpage writer.
.