OBITAXONOMY(1) | OBITools | OBITAXONOMY(1) |
NAME
obitaxonomy - description of obitaxonomy
The obitaxonomy command can generate an ecoPCR database from an NCBI taxdump (see NCBI ftp site) and allows managing the taxonomic data contained in both types of database.
Several types of editing are possible:
Adding a taxon to the database
Deleting a taxon from the database
Adding a species to the database
Adding a preferred scientific name for a taxon in the database
Adding all the taxa from a sequence file in the OBITools extended fasta format to the database
The header of each sequence record must contain the attribute defined by the -k option (default key: species_name), whose value is the scientific name of the taxon to be added.
A taxonomic path for each sequence record can be specified with the -p option, as the attribute key that contains the taxonomic path of the taxon to be added.
A restricting ancestor can be specified with the -A option, either as a taxid (integer) or a key (string). If it is a taxid, this taxid is the default taxid under which the new taxon is added if none of its ancestors are specified or can be found. If it is a key, obitaxonomy looks for the ancestor taxid in the corresponding attribute, and the new taxon is systematically added under this ancestor. By default, the restricting ancestor is the root of the taxonomic tree for all the new taxa.
If neither a path nor an ancestor is specified in the header of the sequence record, obitaxonomy tries to read the taxon name as a species name and to find the genus in the taxonomic database. If the genus is found, the new taxon is added under it. If not, it is added under the restricting ancestor.
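The fallback rule described above can be pictured with a small shell sketch. It is an illustration only: the function choose_parent, the list known_genera (standing in for the taxonomic database) and the rule "genus = first word of the name" are assumptions made for this example, and the actual parsing done by obitaxonomy may differ.

```shell
# Illustrative only: obitaxonomy's real parsing rules are ad hoc and may differ.
# known_genera stands in for the taxonomic database (hypothetical data).
known_genera='Gentiana Picea'
restricting_ancestor='root'   # default restricting ancestor

choose_parent() {
    # Assumption: the genus candidate is the first word of the species name.
    genus=${1%% *}
    for g in $known_genera; do
        if [ "$g" = "$genus" ]; then
            # Genus found: the new taxon would be added under it.
            printf '%s\n' "$genus"
            return 0
        fi
    done
    # Genus not found: fall back to the restricting ancestor.
    printf '%s\n' "$restricting_ancestor"
}

choose_parent 'Gentiana alpina'
choose_parent 'Unknownus exampli'
```

The first call prints the genus, the second falls back to the restricting ancestor because the made-up genus is not in the list.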
It is highly recommended to check exactly what was done by reading the output, since obitaxonomy uses ad hoc parsing and decision rules.
Taxa are added from a sequence file by using the -F option.
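As a minimal sketch of a suitable input file, the following creates two records whose headers carry the default species_name attribute. The file name, sequence identifiers and sequences are invented for the example, and the obitaxonomy call is left commented out because it requires an installed OBITools and an existing database.

```shell
# Hypothetical file name and sequences, for illustration only.
cat > my_sequences.fasta <<'EOF'
>seq1 species_name=Gentiana alpina; an example record
acgtacgtacgt
>seq2 species_name=Gentiana algida; another example record
ttgacgtaacgt
EOF

# List the names stored under the default key (species_name).
names=$(grep '^>' my_sequences.fasta | sed 's/.*species_name=\([^;]*\);.*/\1/')
printf '%s\n' "$names"

# Placeholder database name; uncomment if OBITools is installed.
# obitaxonomy -d my_ecopcr_database -F my_sequences.fasta
```

Listing the names first is a cheap way to spot misspelled or missing attributes before the database is modified.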
Notes:
- When a taxon is added, a new taxid is assigned to it. The minimum for the new taxids can be specified by the -m option and is equal to 10000000 by default.
- For each modification, a line is printed with details on what was done.
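Since new taxids start at the minimum given by -m, a taxid can be compared with that minimum to guess whether it was added locally. The sketch below assumes that -m was left at its default and that taxids coming from the NCBI taxdump stay below this value. The helper is_local_taxid is hypothetical and not part of OBITools.

```shell
min_local_taxid=10000000   # default value of -m, per the note above

# Hypothetical helper: succeeds if the taxid is at or above the minimum.
is_local_taxid() {
    [ "$1" -ge "$min_local_taxid" ]
}

if is_local_taxid 10000832; then
    echo "10000832: probably added locally"
fi
if ! is_local_taxid 49934; then
    echo "49934: below the local minimum"
fi
```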
OBITAXONOMY SPECIFIC OPTIONS
Example:
> obitaxonomy -d my_ecopcr_database \
-a 'Gentiana alpina':'species':49934
Adds a taxon with the scientific name Gentiana alpina and the rank species under the taxon whose taxid is 49934.
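The value passed to -a is a single shell word of the form name:rank:taxid. When the three parts come from variables, it can be assembled as in the sketch below. The variable names are chosen for this example, and the obitaxonomy call is commented out because it needs an installed OBITools and an existing database.

```shell
name='Gentiana alpina'
rank='species'
parent_taxid=49934

# The shell joins adjacent quoted strings into one word, so the quoted form
# used in the example above and this variable hold the same text.
taxon_spec="${name}:${rank}:${parent_taxid}"
printf '%s\n' "$taxon_spec"

# Placeholder database name; uncomment if OBITools is installed.
# obitaxonomy -d my_ecopcr_database -a "$taxon_spec"
```

Quoting "$taxon_spec" keeps the space inside the scientific name from splitting the argument.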
Example:
> obitaxonomy -d my_ecopcr_database -m 1000000000 \
-a 'Gentiana alpina':'species':49934
Adds a taxon with the scientific name Gentiana alpina and the rank species under the taxon whose taxid is 49934, with a taxid greater than or equal to 1000000000.
Example:
> obitaxonomy -d my_ecopcr_database -D 10000832
Deletes the local taxon with the taxid 10000832 from the taxonomic database.
Example:
> obitaxonomy -d my_ecopcr_database -s 'Gentiana alpina'
Adds the species with the scientific name Gentiana alpina under the genus Gentiana.
Example:
> obitaxonomy -d my_ecopcr_database \
-f 'Gentiana algida':50748
Adds the favorite scientific name Gentiana algida for the taxid 50748 in the taxonomic database.
Example:
> obitaxonomy -d my_ecopcr_database \
-k my_taxon_name_key -F my_sequences.fasta
Adds the taxon of each sequence record from the file my_sequences.fasta in the taxonomic database, based on the scientific name contained in the my_taxon_name_key attribute.
- -k <KEY_NAME>, --key-name=<KEY_NAME>
- Works with the -F option. Defines the key of the attribute that contains the scientific name of the taxon to be added. See example above.
Example:
> obitaxonomy -d my_ecopcr_database -A 33090 \
-k my_taxon_name_key -F my_sequences.fasta
Adds the taxon of each sequence record from the file my_sequences.fasta in the taxonomic database, based on the scientific name contained in the my_taxon_name_key attribute. If the genus of the new taxon cannot be found, the new taxon is added under the taxon whose taxid is 33090.
Example:
> obitaxonomy -d my_ecopcr_database -p my_taxonomic_path_key \
-k my_taxon_name_key -F my_sequences.fasta
Adds the taxon of each sequence record from the file my_sequences.fasta in the taxonomic database, based on the scientific name contained in the my_taxon_name_key attribute. Each ancestor contained in the my_taxonomic_path_key attribute is added if it does not already exist, and the new taxon is added under the latest ancestor of the path.
TAXONOMY RELATED OPTIONS
- -d <FILENAME>, --database=<FILENAME>
- ecoPCR taxonomy Database name
- -t <FILENAME>, --taxonomy-dump=<FILENAME>
- NCBI Taxonomy dump repository name
COMMON OPTIONS
- -h, --help
- Shows this help message and exits.
- --DEBUG
- Sets logging in debug mode.
AUTHOR
The OBITools Development Team - LECA
COPYRIGHT
2019 - 2015, OBITools Development Team
July 27, 2019 | 1.02 13 |