
ROCKHOPPER(1) General Commands Manual ROCKHOPPER(1)

NAME

rockhopper - system for analyzing bacterial RNA-seq data (command line tool)

SYNOPSIS

rockhopper [options]

DESCRIPTION

rockhopper is a comprehensive and user-friendly system for computational analysis of bacterial RNA-seq data. As input, it takes RNA sequencing reads output by high-throughput sequencing technology (FASTQ, QSEQ, FASTA, SAM, or BAM files).

REQUIRED ARGUMENTS

A comma-separated list of sequencing files (in FASTQ, QSEQ, FASTA, SAM, or BAM format) for replicate experiments, one list per experimental condition. The two mate-pair files of a paired-end experiment should be delimited by '%'.
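As a minimal sketch of how these lists fit together, the snippet below builds the arguments for two conditions with two paired-end replicates each. All file names are hypothetical placeholders, and the command is printed rather than executed.

```shell
# Hypothetical file names; substitute your own sequencing files.
# The two mate files of one paired-end replicate are joined with '%'.
rep1='aerobic_rep1_R1.fastq%aerobic_rep1_R2.fastq'
rep2='aerobic_rep2_R1.fastq%aerobic_rep2_R2.fastq'

# Replicates of the same condition are joined with ','.
aerobic="$rep1,$rep2"
anaerobic='anaerobic_rep1_R1.fastq%anaerobic_rep1_R2.fastq,anaerobic_rep2_R1.fastq%anaerobic_rep2_R2.fastq'

# One list per condition, separated by spaces. Printed here, not run.
printf '%s\n' "rockhopper $aerobic $anaerobic"
```

Quoting each list keeps the shell from treating any character in the file names specially.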

REFERENCE BASED ASSEMBLY VS. DE NOVO ASSEMBLY

If the -g option is used, then rockhopper aligns reads to one or more reference genomes; otherwise, rockhopper performs de novo transcript assembly.

The -g option takes a comma-separated list of directories, each containing a genome file (*.fna), gene file (*.ptt), and rna file (*.rnt).
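Before starting a reference based run, it may help to confirm that each genome directory holds all three file types. The function below is a hypothetical helper written for illustration; it is not part of rockhopper.

```shell
# check_genome_dir DIR: succeed only if DIR holds at least one
# *.fna, one *.ptt, and one *.rnt file (hypothetical helper).
check_genome_dir() {
    dir=$1
    for ext in fna ptt rnt; do
        # Expand the glob into the positional parameters. If nothing
        # matches, the pattern stays literal and the -e test fails.
        set -- "$dir"/*."$ext"
        if [ ! -e "$1" ]; then
            echo "missing *.$ext file in $dir" >&2
            return 1
        fi
    done
    return 0
}
```

It could be called once for every directory in the list before that list is passed to -g.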

OPTIONAL ARGUMENTS FOR EITHER REFERENCE BASED ASSEMBLY OR DE NOVO ASSEMBLY

reverse complement single-end reads (default is false)
orientation of the two mate reads for paired-end reads, f=forward and r=reverse_complement (default is fr)
maximum number of bases between mate pairs for paired-end reads (default is 500)
identify 1 alignment (true) or identify all optimal alignments (false) (default is true)
number of processors (default is the number of processors detected automatically)
compute differential expression for transcripts in pairs of experimental conditions (default is true)
RNA-seq experiments are strand specific (true) or strand ambiguous (false) (default is true)
labels for each condition
directory where output files are written (default is Rockhopper_Results/)
verbose output including raw/normalized counts aligning to each gene (default is false)
output a SAM format file
output time taken to execute program

OPTIONAL ARGUMENTS FOR REFERENCE BASED ASSEMBLY ONLY

allowed mismatches as percent of read length (default is 0.15)
minimum seed as percent of read length (default is 0.33)
compute operons (default is true)
identify transcript boundaries including UTRs and ncRNAs (default is true)
minimum expression of UTRs and ncRNAs, a number in range [0.0, 1.0] (default is 0.5)

OPTIONAL ARGUMENTS FOR DE NOVO ASSEMBLY ONLY

size of k-mer, range of values is 15 to 31 (default is 25)
minimum length required to use a sequencing read after trimming/processing (default is 35)
size of k-mer hashtable is ~ 2^n (default is 25). HINT: should normally be 25 or, if more memory is available, 26. WARNING: if increased above 25 then more than 1.2M of memory must be allocated
minimum number of full length reads required to map to a de novo assembled transcript (default is 20)
minimum length of de novo assembled transcripts (default is 2*k)
minimum count of k-mer to use it to seed a new de novo assembled transcript (default is 50)
minimum count of k-mer to use it to extend an existing de novo assembled transcript (default is 5)
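The arithmetic behind two of these defaults can be checked with a short sketch. The function name is a hypothetical illustration; only the range 15 to 31, the 2*k default, and the ~2^n table size come from the descriptions above.

```shell
# default_min_transcript_len K: print 2*K, the default minimum length of
# a de novo assembled transcript, or fail if K is outside 15..31.
default_min_transcript_len() {
    k=$1
    if [ "$k" -lt 15 ] || [ "$k" -gt 31 ]; then
        echo "k must be between 15 and 31" >&2
        return 1
    fi
    echo $((2 * k))
}

# With the default k of 25, the default minimum length is 50.
default_min_transcript_len 25

# Approximate k-mer hashtable size for the default n of 25: 33554432.
echo $((1 << 25))
```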

EXAMPLES

reference based assembly with single-end reads
% rockhopper <options> -g genome_DIR1,genome_DIR2 aerobic_replicate1.fastq,aerobic_replicate2.fastq anaerobic_replicate1.fastq,anaerobic_replicate2.fastq

de novo assembly with single-end reads
% rockhopper <options> aerobic_replicate1.fastq,aerobic_replicate2.fastq anaerobic_replicate1.fastq,anaerobic_replicate2.fastq

SEE ALSO

https://cs.wellesley.edu/~btjaden/Rockhopper/

rockhoppergui(1)

February 2022