BAMAUXMERGE2(1) General Commands Manual BAMAUXMERGE2(1)

NAME

bamauxmerge2 - merge information in unmapped and mapped BAM files

SYNOPSIS

bamauxmerge2 [options] in_unmapped in_mapped

DESCRIPTION

bamauxmerge2 reads two BAM files and merges them into a single alignment file. The input files are expected to have the following properties:

  • the first file contains only unmapped reads and its header contains no SQ lines

  • the second file was produced by an aligner based on the content of the first file

  • both files are sorted in query name order

The headers of the two files are merged in the following way:

  • the SQ lines contained in the header of the second file are appended to the header of the first file to obtain the header of the output file

  • all other header information from the second file is discarded
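The header rule above can be illustrated with a minimal sketch that works on SAM-style header text. This is not the tool's implementation; the function name merge_headers and the example header strings are assumptions made for illustration only.

```python
def merge_headers(unmapped_header: str, mapped_header: str) -> str:
    """Append the @SQ lines of the mapped header to the unmapped header."""
    first = unmapped_header.splitlines()
    # Keep only the @SQ (reference sequence) lines of the second header;
    # every other line of that header is discarded.
    sq_lines = [line for line in mapped_header.splitlines()
                if line.startswith("@SQ\t")]
    return "\n".join(first + sq_lines) + "\n"


# Hypothetical example headers (tab-separated, as in SAM text).
unmapped = "@HD\tVN:1.5\tSO:queryname\n@RG\tID:rg1\tSM:sample1\n"
mapped = "@HD\tVN:1.5\tSO:queryname\n@SQ\tSN:chr1\tLN:1000\n@PG\tID:aligner\n"

merged = merge_headers(unmapped, mapped)
print(merged, end="")
```

In this sketch the @HD and @PG lines of the second header are dropped, and the @RG line of the first header is kept.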

The output records are constructed in the following way:

1.
Take a record from the second file

2.
Copy all aux fields from the corresponding record in the first file which are not already present.

3.
Reinsert the clipped adapter bases and quality values stored in the qs/qq aux fields by fastqtobam2, remove the qs/qq aux fields, and insert the appropriate soft clipping CIGAR operations.

4.
Fix mate information like bamfixmateinformation.

5.
Insert the mate CIGAR information fields MC and MS if the mate is aligned.

6.
Insert the MQ (mate mapping quality) aux field.
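Steps 2, 5 and 6 can be sketched on plain dictionaries. This is a simplified illustration, not the tool's code: the record layout (keys unmapped, cigar, mapq) and the function names are hypothetical, the MS field and the adapter reinsertion of step 3 are left out, and the sketch follows the list literally by adding MC only for an aligned mate and MQ in every case.

```python
def merge_aux(mapped_aux: dict, unmapped_aux: dict) -> dict:
    """Step 2: copy aux fields from the unmapped record unless already present."""
    merged = dict(mapped_aux)
    for tag, value in unmapped_aux.items():
        # Fields already set on the aligner's record take precedence.
        merged.setdefault(tag, value)
    return merged


def add_mate_fields(aux: dict, mate: dict) -> dict:
    """Steps 5 and 6 (simplified): add MC and MQ taken from the mate record."""
    out = dict(aux)
    if not mate["unmapped"]:
        out["MC"] = mate["cigar"]  # mate CIGAR, only for an aligned mate
    out["MQ"] = mate["mapq"]       # mapping quality of the mate
    return out


# Hypothetical aux fields of one read in the two input files.
mapped_aux = {"NM": 0, "RG": "from_aligner"}
unmapped_aux = {"RG": "rg1", "BC": "ACGT"}
# Hypothetical mate record, reduced to the fields used here.
mate = {"unmapped": False, "cigar": "50M", "mapq": 60}

merged_aux = merge_aux(mapped_aux, unmapped_aux)
final_aux = add_mate_fields(merged_aux, mate)
print(final_aux)
```

The RG value of the mapped record survives because the field was already present, while BC is copied over from the unmapped record.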

The following key=value pairs can be given:

zz=<0|1>: replace read name by content of nn aux field. Valid values are

1:
replace read name
0:
do not replace read name

calmdnm=<0|1>: recompute MD and NM aux fields. Valid values are

1:
recompute MD and NM aux fields. This requires the calmdnmreference key to be set to the name of an appropriate FastA file.
0:
do not recompute MD and NM aux fields

calmdnmreference=<>: reference FastA file.

replacecigar=<0|1>: replace M cigar operations by the appropriate = and X operations. Valid values are

1:
replace cigar operations. This requires the calmdnmreference key to be set on invocation.
0:
do not replace cigar operations

hash=<crc32prod>: hash used for computing sequence checksums. See bamseqchksum for further information.

filehash=<md5>: hash used for computing output file checksums.

chksumfn=<>: file name used for storing sequence checksum information. By default this information is not saved.

filehashfn=<>: file name used for storing file checksum information. By default this information is not saved.

level=<-1|0|1|9>: set compression level of the output BAM file. Valid values are

-1:
zlib/gzip default compression level
0:
uncompressed
1:
zlib/gzip level 1 (fast) compression
9:
zlib/gzip level 9 (best) compression

verbose=<1>: Valid values are

1:
print progress report on standard error
0:
do not print progress report

tmpfile=<filename>: prefix for temporary files. By default the temporary files are created in the current directory

md5=<0|1>: md5 checksum creation for output file. Valid values are

0:
do not compute checksum. This is the default.
1:
compute checksum. If the md5filename key is set, then the checksum is written to the given file. If md5filename is unset, then no checksum will be computed.

md5filename=<filename>: file name for md5 checksum if md5=1.

threads=<[1]>: number of threads used for processing. By default 1 thread is used. Set to 0 for using as many threads as CPU cores detected.
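As a sketch of how the key=value pairs combine with the synopsis, the following helper assembles an argument list and checks the one dependency stated above: calmdnm=1 and replacecigar=1 both need calmdnmreference. The helper name build_command and the file names are assumptions; the sketch only builds the list and does not run the program.

```python
def build_command(in_unmapped: str, in_mapped: str, **options) -> list:
    """Build an argument list following the synopsis: options, then both inputs."""
    needs_reference = (str(options.get("calmdnm", 0)) == "1"
                       or str(options.get("replacecigar", 0)) == "1")
    if needs_reference and "calmdnmreference" not in options:
        raise ValueError("calmdnm=1 and replacecigar=1 require calmdnmreference")
    args = ["bamauxmerge2"]
    # Each option is passed as a single key=value argument.
    args += [f"{key}={value}" for key, value in options.items()]
    args += [in_unmapped, in_mapped]
    return args


# Hypothetical file names; the keys are taken from the list above.
cmd = build_command("unmapped.bam", "mapped.bam",
                    calmdnm=1, calmdnmreference="ref.fa", level=1, threads=0)
print(" ".join(cmd))
```

The resulting list could be handed to subprocess.run; where the merged BAM data ends up is not stated in this page, so the sketch leaves output handling open.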

AUTHOR

Written by German Tischler.

REPORTING BUGS

Report bugs to <germant@miltenyibiotec.de>

COPYRIGHT

Copyright © 2009-2019 German Tischler, © 2011-2013 Genome Research Limited. License GPLv3+: GNU GPL version 3 <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.

July 2019 BIOBAMBAM