
NORMALISEFASTA(1) General Commands Manual NORMALISEFASTA(1)

NAME

normalisefasta - normalise line length in a FastA file

SYNOPSIS

normalisefasta [options]

DESCRIPTION

normalisefasta reads a FastA file from standard input and writes a reformatted version to standard output in which the lines containing sequence information have a consistent length. The output is either uncompressed or BGZF compressed. For uncompressed output, a FastA index (.fai) is written to the standard error channel.
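The reformatting described above can be illustrated with a minimal Python sketch. It is not the program's implementation: the function name normalise_fasta is hypothetical, the index columns follow the common samtools-style .fai layout (name, length, byte offset of the first base, bases per line, bytes per line), and the sketch always records the nominal line width, which may differ from what normalisefasta itself reports for short sequences.

```python
def normalise_fasta(text, cols=80):
    """Rewrap sequence lines to `cols` bases; return (fasta_text, fai_text).

    Illustrative only: assumes ASCII input with "\n" line endings.
    """
    # Collect (header, sequence) pairs, joining all sequence lines per record.
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line.strip())
    if name is not None:
        records.append((name, "".join(chunks)))

    out, fai, offset = [], [], 0
    for header, seq in records:
        head_line = ">" + header + "\n"
        offset += len(head_line.encode())  # offset now points at the first base
        short_name = (header.split() or [""])[0]
        # .fai columns: name, length, offset, bases per line, bytes per line
        fai.append("\t".join(map(str, (short_name, len(seq), offset, cols, cols + 1))))
        body = "".join(seq[i:i + cols] + "\n" for i in range(0, len(seq), cols))
        offset += len(body)
        out.append(head_line + body)
    return "".join(out), "".join(entry + "\n" for entry in fai)
```

Calling normalise_fasta(text, cols=60) on the contents of a FastA file returns the rewrapped text and the matching index lines as two strings.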

The following key=value pairs can be given:

cols=<[80]> width, in number of bases, of the lines containing sequence information. This option is only considered for uncompressed output (i.e. bgzf=0).

bgzf=<0|1> produce uncompressed (bgzf=0) or compressed (bgzf=1) output

index=<> if bgzf=1, this key gives the file name of the index file, which allows (pseudo) random access in the output file. If the key is not given when bgzf=1, no index is written.

level=<-1|0|1|9|11>: set the compression level of the output file if bgzf=1. Valid values are:

-1: zlib/gzip default compression level
0: uncompressed
1: zlib/gzip level 1 (fast) compression
9: zlib/gzip level 9 (best) compression

If libmaus has been compiled with support for igzip (see https://software.intel.com/en-us/articles/igzip-a-high-performance-deflate-compressor-with-optimizations-for-genomic-data) then an additional valid value is

11: igzip compression

minlength=<[0]> Minimum length. Reads shorter than this will be discarded. By default all reads are kept.
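As an illustration of how the key=value pairs above combine on the command line, the following Python sketch builds an argument list and pipes a file through the program. The helper names build_command and run_normalisefasta and all file paths are hypothetical; the sketch assumes normalisefasta is installed and on the PATH, and it relies only on the behaviour described in this page (input on standard input, output on standard output, index on standard error for uncompressed output).

```python
import subprocess


def build_command(**options):
    """Turn keyword options into normalisefasta key=value arguments."""
    return ["normalisefasta"] + [f"{key}={value}" for key, value in options.items()]


def run_normalisefasta(in_path, out_path, err_path, **options):
    """Pipe in_path through normalisefasta.

    Standard output goes to out_path and standard error to err_path
    (for uncompressed output, the latter receives the .fai index).
    """
    with open(in_path, "rb") as src, \
            open(out_path, "wb") as dst, \
            open(err_path, "wb") as err:
        subprocess.run(build_command(**options),
                       stdin=src, stdout=dst, stderr=err, check=True)
```

For example, run_normalisefasta("in.fa", "out.fa", "out.fa.fai", cols=60, minlength=100) would request 60 bases per line and discard reads shorter than 100 bases.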

AUTHOR

Written by German Tischler.

REPORTING BUGS

Report bugs to <germant@miltenyibiotec.de>

COPYRIGHT

Copyright © 2009-2014 German Tischler, © 2011-2014 Genome Research Limited. License GPLv3+: GNU GPL version 3 <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.

January 2014 BIOBAMBAM