SHASTA(1) User Commands SHASTA(1)

NAME

shasta - nanopore whole genome assembly tool

DESCRIPTION

Options allowed only on the command line:

Write a help message.
Identify the Shasta version.
Configuration file name.
Names of input files containing reads. Specify at least one.
Name of the output directory. If command is assemble, this directory must not exist.
Command to run. Must be one of: assemble, saveBinaryData, cleanupBinaryData, explore, createBashCompletionScript
Specify whether allocated memory is anonymous or backed by a filesystem. Allowed values: anonymous, filesystem.
Specify the type of pages used to back memory. Allowed values: disk, 4K, 2M (for best performance). All combinations (memoryMode, memoryBacking) are allowed except for (anonymous, disk). Some combinations require root privilege, which is obtained using sudo and may result in a password prompt, depending on your sudo setup.
Number of threads, or 0 to use one thread per virtual processor.
Specify allowed access for --command explore. Allowed values: user, local, unrestricted. DO NOT CHANGE FROM DEFAULT VALUE WITHOUT UNDERSTANDING THE SECURITY IMPLICATIONS.
Port to be used by the http server (--command explore).
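
The rule for memory mode and memory backing combinations stated above can be expressed as a small check. This is a minimal sketch in Python, not Shasta code: the function name is_allowed_memory_combination is hypothetical, and only the allowed values and the single excluded pair come from the option descriptions.

```python
# Allowed values as listed in the option descriptions.
MEMORY_MODES = {"anonymous", "filesystem"}
MEMORY_BACKINGS = {"disk", "4K", "2M"}


def is_allowed_memory_combination(memory_mode: str, memory_backing: str) -> bool:
    """Hypothetical helper: True if the (mode, backing) pair is permitted.

    Every pair of listed values is allowed except (anonymous, disk).
    """
    if memory_mode not in MEMORY_MODES or memory_backing not in MEMORY_BACKINGS:
        return False
    return (memory_mode, memory_backing) != ("anonymous", "disk")
```

The check says nothing about which combinations require root privilege; that depends on the system and is not modeled here.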

Options allowed on the command line and in the config file:

Read length cutoff. Shorter reads are discarded.
If set, skip the Linux cache when loading reads. This is done by specifying the O_DIRECT flag when opening input files containing reads.
Skip flagging palindromic reads. Oxford Nanopore reads should be flagged for better results.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Used for palindromic read detection.
Method to generate marker k-mers: 0 = random, 1 = random, excluding globally overenriched, 2 = random, excluding overenriched even in a single read, 3 = read from file.
Length of marker k-mers (in run-length space).
Fraction of k-mers used as markers.
Enrichment threshold for Kmers.generationMethod 1 and 2.
The absolute path of a file containing the k-mers to be used as markers, one per line. A relative path is not accepted. Only used if Kmers.generationMethod is 3.
Controls the version of the LowHash algorithm to use. Can be 0 (default) or 1 (experimental).
The number of consecutive markers that define a MinHash/LowHash feature.
Defines how low a hash has to be to be used with the LowHash algorithm.
The number of MinHash/LowHash iterations, or 0 to let --MinHash.alignmentCandidatesPerRead control the number of iterations.
If --MinHash.minHashIterationCount is 0, MinHash iteration is stopped when the average number of alignment candidates that each read is involved in reaches this value. If --MinHash.minHashIterationCount is not 0, this is not used.
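
The interaction between --MinHash.minHashIterationCount and --MinHash.alignmentCandidatesPerRead can be sketched as a stopping rule. The following Python function is an illustration only: its name and parameters are hypothetical, and it assumes the average number of candidates per read is recomputed after each iteration.

```python
def should_continue_minhash(
    iterations_done: int,
    min_hash_iteration_count: int,
    average_candidates_per_read: float,
    alignment_candidates_per_read: float,
) -> bool:
    """Hypothetical sketch of the iteration stopping rule.

    A nonzero iteration count fixes the number of iterations and the
    candidates-per-read target is ignored. With a count of 0, iteration
    stops once the average reaches the target.
    """
    if min_hash_iteration_count != 0:
        return iterations_done < min_hash_iteration_count
    return average_candidates_per_read < alignment_candidates_per_read
```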
The minimum bucket size to be used by the LowHash algorithm.
The maximum bucket size to be used by the LowHash algorithm.
The minimum number of times a pair of reads must be found by the MinHash/LowHash algorithm in order to be considered a candidate alignment.
Skip the MinHash algorithm and mark all pairs of reads as alignment candidates with both orientations. This should only be used for experimentation on very small runs because it is very time consuming.
The alignment method to be used to create the read graph & the marker graph. 0 = old Shasta method, 1 = SeqAn (slow), 3 = banded SeqAn.
The maximum number of markers that an alignment is allowed to skip.
The maximum amount of marker drift that an alignment is allowed to tolerate between successive markers.
The maximum number of unaligned markers tolerated at the beginning and end of an alignment.
Marker frequency threshold. Markers more frequent than this value in either of two oriented reads being aligned are discarded and not used to compute the alignment.
The minimum number of aligned markers for an alignment to be used.
The minimum fraction of aligned markers for an alignment to be used.
Match score for marker alignments (only used for alignment methods 1 and 3).
Mismatch score for marker alignments (only used for alignment methods 1 and 3).
Gap score for marker alignments (only used for alignment methods 1 and 3).
Downsampling factor (only used for alignment method 3).
Amount to extend the downsampled band (only used for alignment method 3).
If not zero, alignments between reads from the same nanopore channel and close in time are suppressed. The "read" metadata fields from the FASTA or FASTQ headers of the two reads are compared. If their difference, in absolute value, is less than the value of this option, the alignment is suppressed. This can help avoid assembly artifacts. This check is only done if the two reads have identical metadata fields "runid", "sampleid", and "ch". If any of these metadata fields are missing, this check is suppressed and this option has no effect.
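
The same-channel check described above can be sketched in Python. This is an illustration, not Shasta code: the function name is hypothetical, the header metadata is assumed to be already parsed into a dictionary per read, and a missing "read" field is assumed to disable the check in the same way as a missing "runid", "sampleid", or "ch".

```python
def suppress_same_channel_alignment(meta0: dict, meta1: dict, threshold: int) -> bool:
    """Hypothetical sketch: True if the alignment would be suppressed."""
    # A value of zero turns the check off.
    if threshold == 0:
        return False

    # Assumption: "read" is required alongside the three identity fields.
    required = ("runid", "sampleid", "ch", "read")
    if any(key not in meta0 or key not in meta1 for key in required):
        return False

    # The check only applies to reads from the same run, sample and channel.
    if any(meta0[key] != meta1[key] for key in ("runid", "sampleid", "ch")):
        return False

    # Suppress when the read numbers are closer than the threshold.
    return abs(int(meta0["read"]) - int(meta1["read"])) < threshold
```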
Suppress containment alignments, that is, alignments in which one read is entirely contained in another read, except possibly for up to maxTrim markers at the beginning and end.
The method used to create the read graph (0 = undirected, default, 1 = directed, experimental).
The maximum number of alignments to be kept for each read.
The minimum size (number of oriented reads) of a connected component of the read graph to be kept. This is currently ignored.
Used for chimeric read detection.
Maximum distance (edges) for flagCrossStrandReadGraphEdges. Set this to zero to entirely suppress flagCrossStrandReadGraphEdges.
Maximum number of alignments to be kept for each contained read (only used when creationMethod is 1).
Maximum number of alignments to be kept in each direction (forward, backward) for each uncontained read (only used when creationMethod is 1).
Remove conflicts from the read graph. Experimental - do not use.
Minimum number of markers for a marker graph vertex.
Maximum number of markers for a marker graph vertex.
Used during approximate transitive reduction. Marker graph edges with coverage lower than this value are always marked as removed regardless of reachability.
Used during approximate transitive reduction. Marker graph edges with coverage higher than this value are never marked as removed regardless of reachability.
Used during approximate transitive reduction.
Used during approximate transitive reduction.
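
The two coverage thresholds used during approximate transitive reduction can be read as a three-way rule. The Python sketch below uses hypothetical names. The handling of edges between the two thresholds (removal decided by reachability) is inferred from the phrase "regardless of reachability" and is an assumption, not a statement of how Shasta implements it.

```python
def transitive_reduction_rule(
    coverage: int,
    low_coverage_threshold: int,
    high_coverage_threshold: int,
) -> str:
    """Hypothetical sketch classifying a marker graph edge by coverage."""
    if coverage < low_coverage_threshold:
        return "remove"  # always marked as removed
    if coverage > high_coverage_threshold:
        return "keep"  # never marked as removed
    # Assumption: in between, the outcome depends on reachability.
    return "check_reachability"
```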
Number of prune iterations.
Maximum lengths (in markers) used at each iteration of simplifyMarkerGraph.
Experimental. Cross edge coverage threshold. If this is not zero, assembly graph cross-edges with average edge coverage less than this value are removed, together with the corresponding marker graph edges. A cross edge is defined as an edge v0->v1 with out-degree(v0)>1, in-degree(v1)>1.
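
The cross edge definition above (an edge v0->v1 with out-degree(v0)>1 and in-degree(v1)>1) can be illustrated on a plain edge list. This Python sketch uses a hypothetical function name and a hypothetical graph representation, and it covers only the degree condition, not the coverage threshold.

```python
from collections import Counter


def find_cross_edges(edges):
    """Hypothetical sketch: return the edges (v0, v1) that are cross edges.

    edges is a list of (source, target) pairs of a directed graph.
    """
    out_degree = Counter(v0 for v0, _ in edges)
    in_degree = Counter(v1 for _, v1 in edges)
    return [
        (v0, v1)
        for v0, v1 in edges
        if out_degree[v0] > 1 and in_degree[v1] > 1
    ]
```

For the edges a->c, a->d, b->d, only a->d qualifies: a has two outgoing edges and d has two incoming edges.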
Experimental. Length threshold, in markers, for the marker graph refinement step, or 0 to turn off the refinement step.
Perform approximate reverse transitive reduction of the marker graph.
Maximum average edge coverage for a cross edge of the assembly graph to be removed.
Controls assembly of long marker graph edges.
Selects the consensus caller for repeat counts. See the documentation for available choices.
Used to request storing coverage data in binary format.
Used to specify the minimum length of an assembled segment for which coverage data in csv format should be stored. If 0, no coverage data in csv format is stored.
Used to request writing the reads that contributed to assembling each segment.
Experimental. Specify the method used to detangle the assembly graph. 0 = no detangling, 1 = basic detangling.
August 2020 shasta