'\" t
.\" Title: gt-fingerprint
.\" Author: [FIXME: author] [see http://www.docbook.org/tdg5/en/html/author]
.\" Generator: DocBook XSL Stylesheets vsnapshot
.\" Date: 07/22/2020
.\" Manual: GenomeTools Manual
.\" Source: GenomeTools 1.6.1
.\" Language: English
.\"
.TH "GT\-FINGERPRINT" "1" "07/22/2020" "GenomeTools 1\&.6\&.1" "GenomeTools Manual"
.\" -----------------------------------------------------------------
.\" * Define some portability stuff
.\" -----------------------------------------------------------------
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.\" http://bugs.debian.org/507673
.\" http://lists.gnu.org/archive/html/groff/2009-02/msg00013.html
.\" ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\" -----------------------------------------------------------------
.\" * set default formatting
.\" -----------------------------------------------------------------
.\" disable hyphenation
.nh
.\" disable justification (adjust text to left margin only)
.ad l
.\" -----------------------------------------------------------------
.\" * MAIN CONTENT STARTS HERE *
.\" -----------------------------------------------------------------
.SH "NAME"
gt-fingerprint \- Compute MD5 fingerprints for each sequence given in a set of sequence files\&.
.SH "SYNOPSIS"
.sp
\fBgt fingerprint\fR [option \&...] sequence_file [\&...]
.SH "DESCRIPTION"
.PP
\fB\-check\fR [\fIfilename\fR]
.RS 4
compare all fingerprints contained in the given checklist file with the fingerprints of the sequences in the given sequence_file(s)\&. The comparison is successful if every fingerprint in the checklist file occurs in the sequence_file(s) the same number of times, and vice versa\&. (default: undefined)
.RE
.PP
\fB\-duplicates\fR [\fIyes|no\fR]
.RS 4
show duplicate fingerprints from given sequence_file(s)\&.
(default: no)
.RE
.PP
\fB\-extract\fR [\fIstring\fR]
.RS 4
extract the sequence(s) with the given fingerprint from sequence file(s) and show them on stdout\&. (default: undefined)
.RE
.PP
\fB\-width\fR [\fIvalue\fR]
.RS 4
set output width for FASTA sequence printing (0 disables formatting) (default: 0)
.RE
.PP
\fB\-o\fR [\fIfilename\fR]
.RS 4
redirect output to specified file (default: undefined)
.RE
.PP
\fB\-gzip\fR [\fIyes|no\fR]
.RS 4
write gzip compressed output file (default: no)
.RE
.PP
\fB\-bzip2\fR [\fIyes|no\fR]
.RS 4
write bzip2 compressed output file (default: no)
.RE
.PP
\fB\-force\fR [\fIyes|no\fR]
.RS 4
force writing to output file (default: no)
.RE
.PP
\fB\-help\fR
.RS 4
display help and exit
.RE
.PP
\fB\-version\fR
.RS 4
display version information and exit
.RE
.sp
If neither option \fI\-check\fR nor option \fI\-duplicates\fR is used, the fingerprints for all sequences are shown on stdout\&.
.sp
The fingerprint of a sequence is case insensitive\&. Thus the MD5 fingerprints of two identical sequences are the same even if one of them is soft\-masked\&.
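.sp
The case insensitivity described above can be illustrated with a minimal Python sketch\&. It is an illustration only, not the GenomeTools implementation: the helper name fingerprint is hypothetical, and upper\-casing the sequence before hashing is an assumption about how case is normalized, so the digests are not guaranteed to match those printed by gt fingerprint\&.
.sp
.if n \{\
.RS 4
.\}
.nf
```python
import hashlib


def fingerprint(sequence: str) -> str:
    # Assumption: case is normalized by upper-casing before hashing.
    normalized = sequence.upper()
    # MD5 hex digest of the normalized sequence (32 hex characters).
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


plain = "ACGTACGTTTGA"
soft_masked = "ACGTacgtTTGA"  # same bases, middle part in lower case

# Identical sequences share one fingerprint regardless of masking.
assert fingerprint(plain) == fingerprint(soft_masked)
```
.fi
.if n \{\
.RE
.\}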
.SH "EXAMPLES"
.sp
Compute a deduplicated list of fingerprints:
.sp
.if n \{\
.RS 4
.\}
.nf
$ gt fingerprint U89959_ests\&.fas | sort | uniq > U89959_ests\&.checklist_uniq
.fi
.if n \{\
.RE
.\}
.sp
Compare fingerprints:
.sp
.if n \{\
.RS 4
.\}
.nf
$ gt fingerprint \-check U89959_ests\&.checklist_uniq U89959_ests\&.fas
950b7715ab6cc030a8c810a0dba2dd33 only in sequence_file(s)
.fi
.if n \{\
.RE
.\}
.sp
Make sure a sequence file contains no duplicates (not the case here):
.sp
.if n \{\
.RS 4
.\}
.nf
$ gt fingerprint \-duplicates U89959_ests\&.fas
950b7715ab6cc030a8c810a0dba2dd33 2
gt fingerprint: error: duplicates found: 1 out of 200 (0\&.500%)
.fi
.if n \{\
.RE
.\}
.sp
Extract the sequence with a given fingerprint:
.sp
.if n \{\
.RS 4
.\}
.nf
$ gt fingerprint \-extract 6d3b4b9db4531cda588528f2c69c0a57 U89959_ests\&.fas
>SQ;8720010
TTTTTTTTTTTTTTTTTCCTGACAAAACCCCAAGACTCAATTTAATCAATCCTCAAATTTACATGATAC
CAACGTAATGGGAGCTTAAAAATA
.fi
.if n \{\
.RE
.\}
.SH "RETURN VALUES"
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
0 everything went fine (\fI\-check\fR: the comparison was successful; \fI\-duplicates\fR: no duplicates found)
.RE
.sp
.RS 4
.ie n \{\
\h'-04'\(bu\h'+03'\c
.\}
.el \{\
.sp -1
.IP \(bu 2.3
.\}
1 an error occurred (\fI\-check\fR: the comparison was not successful; \fI\-duplicates\fR: duplicates found)
.RE
.SH "REPORTING BUGS"
.sp
Report bugs to https://github\&.com/genometools/genometools/issues\&.