.\" Automatically generated by Pandoc 2.2.1
.\"
.TH "alignment\-thin" "1" "Feb 2018" "" ""
.hy
.SH NAME
.PP
\f[B]alignment\-thin\f[] \- Remove sequences or columns from an alignment.
.SH SYNOPSIS
.PP
\f[B]alignment\-thin\f[] \f[I]alignment\-file\f[] [OPTIONS]
.SH DESCRIPTION
.PP
Remove sequences or columns from an alignment.
.SH GENERAL OPTIONS:
.TP
.B \f[B]\-h\f[], \f[B]\-\-help\f[]
Print usage information.
.RS
.RE
.TP
.B \f[B]\-v\f[], \f[B]\-\-verbose\f[]
Output more log messages on stderr.
.RS
.RE
.SH SEQUENCE FILTERING OPTIONS:
.TP
.B \f[B]\-\-protect\f[] \f[I]arg\f[]
Sequences that cannot be removed (comma\-separated).
.RS
.RE
.TP
.B \f[B]\-\-remove\f[] \f[I]arg\f[]
Remove sequences in comma\-separated list \f[I]arg\f[].
.RS
.RE
.TP
.B \f[B]\-\-longer\-than\f[] \f[I]arg\f[]
Remove sequences not longer than \f[I]arg\f[].
.RS
.RE
.TP
.B \f[B]\-\-shorter\-than\f[] \f[I]arg\f[]
Remove sequences not shorter than \f[I]arg\f[].
.RS
.RE
.TP
.B \f[B]\-\-cutoff\f[] \f[I]arg\f[]
Remove similar sequences with #mismatches < cutoff.
.RS
.RE
.TP
.B \f[B]\-\-down\-to\f[] \f[I]arg\f[]
Remove similar sequences down to \f[I]arg\f[] sequences.
.RS
.RE
.TP
.B \f[B]\-\-remove\-crazy\f[] \f[I]arg\f[]
Remove \f[I]arg\f[] outlier sequences \-\- defined as sequences that are
missing too many conserved sites.
.RS
.RE
.TP
.B \f[B]\-\-conserved\f[] \f[I]arg\f[] (=0.75)
Fraction of sequences that must contain a letter for it to be considered
conserved.
.RS
.RE
.SH COLUMN FILTERING OPTIONS:
.TP
.B \f[B]\-\-min\-letters\f[] \f[I]arg\f[]
Remove columns with fewer than \f[I]arg\f[] letters.
.RS
.RE
.TP
.B \f[B]\-\-remove\-unique\f[] \f[I]arg\f[]
Remove insertions in a single sequence if longer than \f[I]arg\f[]
letters.
.RS
.RE
.SH OUTPUT OPTIONS:
.TP
.B \f[B]\-\-sort\f[]
Sort partially ordered columns to group similar gaps.
.RS
.RE
.TP
.B \f[B]\-\-show\-lengths\f[]
Just print out sequence lengths.
.RS
.RE
.TP
.B \f[B]\-\-find\-dups\f[] \f[I]arg\f[]
For each sequence, find the closest other sequence.
.RS
.RE
.SH EXAMPLES:
.PP
Remove columns without a minimum number of letters:
.IP
.nf
\f[C]
%\ alignment\-thin\ \-\-min\-letters=5\ file.fasta\ >\ file\-thinned.fasta
\f[]
.fi
.PP
Remove sequences by name:
.IP
.nf
\f[C]
%\ alignment\-thin\ \-\-remove=seq1,seq2\ file.fasta\ >\ file2.fasta
\f[]
.fi
.PP
Remove short sequences:
.IP
.nf
\f[C]
%\ alignment\-thin\ \-\-longer\-than=250\ file.fasta\ >\ file\-long.fasta
\f[]
.fi
.PP
Remove sequences with <= 5 differences from the closest other sequence:
.IP
.nf
\f[C]
%\ alignment\-thin\ \-\-cutoff=5\ file.fasta\ >\ more\-than\-5\-differences.fasta
\f[]
.fi
.PP
Like \-\-cutoff, but stop when we have the right number of sequences:
.IP
.nf
\f[C]
%\ alignment\-thin\ \-\-down\-to=30\ file.fasta\ >\ file\-30taxa.fasta
\f[]
.fi
.PP
Remove dissimilar sequences that are missing conserved columns:
.IP
.nf
\f[C]
%\ alignment\-thin\ \-\-remove\-crazy=10\ file.fasta\ >\ file2.fasta
\f[]
.fi
.PP
Protect some sequences from being removed:
.IP
.nf
\f[C]
%\ alignment\-thin\ \-\-down\-to=30\ file.fasta\ \-\-protect=seq1,seq2\ >\ file2.fasta
\f[]
.fi
.SH REPORTING BUGS:
.PP
BAli\-Phy online help: .
.PP
Please send bug reports to .
.SH AUTHORS
Benjamin Redelings.