NAME
cleanasn - clean up irregularities in NCBI ASN.1 objects
SYNOPSIS
cleanasn [-] [-A filename] [-C str] [-D str] [-F str] [-K str]
[-L filename] [-M filename] [-N str] [-P str] [-Q str] [-R] [-S str]
[-T] [-U str] [-V str] [-X str] [-Z str] [-a str] [-b] [-c] [-d str]
[-f str] [-i filename] [-j filename] [-k filename] [-m str] [-n path]
[-o filename] [-p path] [-q path] [-r path] [-v path] [-x ext]
DESCRIPTION
cleanasn is a utility program to clean up irregularities in NCBI ASN.1
objects.
OPTIONS
A summary of options is included below.
- -
- Print usage message
- -A filename
- Accession list file
- -C str
- Sequence operations, per the flags in str:
- c
- Compress
- d
- Decompress
- v
- Virtual gaps inside segmented sequence
- s
- Convert segmented set to delta sequence
- -D str
- Clean up descriptors, per the flags in str:
- t
- Remove Title
- c
- Remove Comment
- n
- Remove Nuc-Prot Set title
- e
- Remove Pop/Phy/Mut/Eco Set title
- m
- Remove mRNA title
- p
- Remove Protein title
- -F str
- Clean up features, per the flags in str:
- u
- Remove User-objects
- d
- Remove db_xrefs
- e
- Remove /evidence and /inference
- r
- Remove redundant gene xrefs
- f
- Fuse duplicate features
- k
- Package coding-region or parts features
- z
- Delete or update EC numbers
- -K str
- Perform a general cleanup, per the flags in str:
- b
- BasicSeqEntryCleanup
- p
- C++ BasicCleanup (via an external utility)
- s
- SeriousSeqEntryCleanup
- g
- GpipeSeqEntryCleanup
- n
- Normalize descriptor order
- u
- Remove NcbiCleanup User Objects
- c
- Synchronize genetic Codes
- d
- Resynchronize CDS partials
- m
- Resynchronize mRNA partials
- t
- Resynchronize Peptide partials
- a
- Adjust consensus splice
- i
- Promote to "worst" Seq-ID
- -L filename
- Log file
- -M filename
- Macro file
- -N str
- Clean up links, per the flags in str:
- o
- Link CDS mRNA by Overlap
- p
- Link CDS mRNA by Product
- r
- Reassign feature IDs
- f
- Fix missing reciprocal feature IDs
- c
- Clear feature IDs
- -P str
- Publication options, per the flags in str:
- a
- Remove All publications
- s
- Remove Serial number
- f
- Remove Figure, numbering, and name
- r
- Remove Remark
- u
- Update PMID-only publication
- #
- Replace unpublished with PMID
- -Q str
- Report:
- c
- Record count
- r
- ASN.1 BSEC report
- s
- ASN.1 SSEC report
- n
- NORM vs. SSEC report
- e
- PopPhyMutEco AutoDef report
- o
- Overlap report
- l
- Latitude-longitude country diff
- d
- Log SSEC differences
- g
- GenBank SSEC diff
- f
- asn2gb/asn2flat diff
- h
- Seg-to-delta GenBank diff
- v
- Validator SSEC diff
- m
- Modernize Gene/RNA/PCR
- u
- Unpublished Pub lookup
- p
- Published Pub lookup
- j
- Unindexed Journal report
- x
- Custom scan
- -R
- Remote fetching from ID (NCBI sequence databases)
- -S str
- Selective difference filter, per the flags in str (a capital letter skips that category):
- s
- SSEC
- b
- BSEC
- A
- Author
- p
- Publication
- l
- Location
- r
- RNA
- q
- Qualifier sort order
- g
- GenBank block
- k
- Package CdRegion or parts features
- m
- Move publication
- o
- Leave duplicate Bioseq publication
- d
- Automatic definition line
- e
- Pop/Phy/Mut/Eco Set definition line
- -T
- Taxonomy Lookup
- -U str
- Modernize, per the flags in str:
- g
- Genes
- r
- RNA
- p
- PCR Primers
- -V str
- Remove features by validator severity:
- r
- Reject
- e
- Error
- w
- Warning
- i
- Info
- -X str
- Miscellaneous options, per str:
- d
- Automatic definition line
- e
- Pop/Phy/Mut/Eco Set definition line
- n
- Instantiate NC title
- m
- Instantiate NM titles
- x
- Special XM titles
- p
- Instantiate Protein titles
- c
- Create mRNAs for coding sequences
- f
- Fix reciprocal protein_id/transcript_id
- -Z str
- Remove indicated User-object
- -a str
- ASN.1 type
- a
- Any (default)
- e
- Seq-entry
- b
- Bioseq
- s
- Bioseq-set
- m
- Seq-submit
- t
- Batch Processing [String]
- -b
- Input ASN.1 is Binary
- -c
- Input ASN.1 is Compressed
- -d str
- Source database
- a
- Any (default)
- g
- GenBank
- e
- EMBL
- d
- DDBJ
- b
- EMBL or DDBJ
- r
- RefSeq
- n
- NCBI
- v
- Only segmented sequences
- w
- Exclude segmented sequences
- x
- Exclude EMBL/DDBJ
- y
- Exclude gbcon, gbest, gbgss, gbhtg, gbpat, gbsts
- -f str
- Substring filter
- -i filename
- Single input file (defaults to stdin)
- -j filename
- First filename
- -k filename
- Last filename
- -m str
- Flatfile mode:
- r
- Release
- e
- Entrez
- s
- Sequin
- d
- Dump
- -n path
- asn2flat executable (default is /netopt/ncbi_tools/bin/asn2flat)
- -o filename
- Single output file (defaults to stdout)
- -p path
- Process all matching files in path
- -q path
- ffdiff executable (default is /netopt/genbank/subtool/bin/ffdiff)
- -r path
- Path for results
- -v path
- asnval executable (default is /netopt/ncbi_tools/bin/asnval)
- -x ext
- File selection suffix for use with -p (defaults to .ent)
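Several options above take a string of single-letter flags, so a mistyped letter is easy to miss. The following Python sketch shows one way to assemble a cleanasn command line and check each flag letter against the lists documented above. The helper name build_cleanasn_argv, the table DOCUMENTED_FLAGS, and the file names input.sqn and cleaned.sqn are hypothetical and are not part of cleanasn. The sketch also assumes that several letters may be combined in one str and that several option groups may be given in one invocation, as the synopsis suggests. It only builds and prints the command; it does not run it.

```python
import shlex

# Flag letters copied from the OPTIONS list above. This table is an
# illustration only; it covers a subset of the options and is not derived
# from the cleanasn source code.
DOCUMENTED_FLAGS = {
    "-D": set("tcnemp"),        # descriptor cleanup
    "-F": set("uderfkz"),       # feature cleanup
    "-K": set("bpsgnucdmtai"),  # general cleanup
    "-N": set("oprfc"),         # link cleanup
    "-U": set("grp"),           # modernize
}


def build_cleanasn_argv(options, infile=None, outfile=None, executable="cleanasn"):
    """Return an argv list for cleanasn (hypothetical helper).

    `options` maps an option such as "-K" to its flag string, e.g. "bs".
    Raises ValueError for an option or flag letter not listed in the table.
    """
    argv = [executable]
    for opt, letters in options.items():
        allowed = DOCUMENTED_FLAGS.get(opt)
        if allowed is None:
            raise ValueError(f"{opt} is not in the table of checked options")
        unknown = sorted(set(letters) - allowed)
        if unknown:
            raise ValueError(f"{opt} does not document flag(s): {''.join(unknown)}")
        argv += [opt, letters]
    if infile is not None:
        argv += ["-i", infile]   # single input file (stdin if omitted)
    if outfile is not None:
        argv += ["-o", outfile]  # single output file (stdout if omitted)
    return argv


if __name__ == "__main__":
    # BasicSeqEntryCleanup plus removal of redundant gene xrefs.
    argv = build_cleanasn_argv(
        {"-K": "b", "-F": "r"}, infile="input.sqn", outfile="cleaned.sqn"
    )
    print(shlex.join(argv))
```

On a system where cleanasn is installed, the returned list could be passed to subprocess.run. Whether a given combination of flags is meaningful for a particular record is not something this check can determine.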
AUTHOR
The National Center for Biotechnology Information.
SEE ALSO
asndisc(1), asnval(1), sequin(1).