'\" t
.\"     Title: load2sqlitedb
.\"    Author: [see the "AUTHOR(S)" section]
.\" Generator: Asciidoctor 2.0.20
.\"      Date: 
.\"    Manual: \ \&
.\"    Source: \ \&
.\"  Language: English
.\"
.TH "LOAD2SQLITEDB" "1" "" "\ \&" "\ \&"
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.ss \n[.ss] 0
.nh
.ad l
.de URL
\fI\\$2\fP <\\$1>\\$3
..
.als MTO URL
.if \n[.g] \{\
.  mso www.tmac
.  am URL
.    ad l
.  .
.  am MTO
.    ad l
.  .
.  LINKSTYLE blue R < >
.\}
.SH "NAME"
load2sqlitedb \- load genome sequences and extrinsic evidence hints into a SQLite database
.SH "SYNOPSIS"
.sp
\fBload2sqlitedb\fP [OPTIONS] \-\-species=\fISPECIES\fP \-\-dbaccess=\fIdatabase.db\fP \fIinputfilename\fP
.SH "DESCRIPTION"
.sp
\fBload2sqlitedb\fP is a tool to load genome sequences and extrinsic evidence hints into a SQLite database.
.br
When storing genomes/hints of multiple organisms call this program repeatedly for each one.
.SH "OPTIONS"
.sp
\fIinputfilename\fP refers to a genome file in FASTA format or a hints file in GFF format.
.SS "Mandatory options"
.sp
\fB\-s\fP, \fB\-\-species\fP=\fISPECIES\fP
.RS 4
SPECIES is a name for your species and the same identifier as is used in the treefile and alnfile parameters to augustus.
.RE
.sp
\fB\-d\fP, \fB\-\-dbaccess\fP=\fIdatabase.db\fP
.RS 4
The file name of the SQLite database that will be opened or created if it does not already exist.
.RE
.SS "Additional options"
.sp
\fB\-c\fP, \fB\-\-chunksize\fP=\fIsize\fP
.RS 4
This option is only relevant when loading a sequence file.
.br
The sequences in the input genome are split into chunks of this size so
that subsequent retrievals of small sequence ranges do not require to read
the complete \- potentially much longer \- chromosome. (\(lA 1000000). Default Value: 50000
.RE
.sp
\fB\-i\fP, \fB\-\-noIdx\fP
.RS 4
Use this flag to suppress the building of indices on the database tables.
.br
If you are going to load several genomes and/or hint files in a row, this option
is recommended to speed up the loading. But make sure to build indices with
\-\-makeIdx after all genomes/hints are loaded. Otherwise, data retrieval operations
can be very slow.
.RE
.sp
\fB\-m\fP, \fB\-\-makeIdx\fP
.RS 4
Use this flag to build the indices on the database tables after loading several genomes and/or hint files with \-\-noIdx.
.br
Only call this once for all species, e.g. load2sqlitedb \-\-makeIdx \-\-dbaccess=database.db
.RE
.sp
\fB\-r\fP, \fB\-\-clean\fP
.RS 4
Makes a clean load deleting existing hints/genome for the species from the database.+
When called with a gff file, only the hints for the species are delete, but not the genome.
.br
When called with a fasta file, both hints and genome for the species are deleted.
.RE
.sp
\fB\-h\fP, \fB\-\-help\fP
.RS 4
Produce help message.
.RE
.SH "EXAMPLE"
.sp
.if n .RS 4
.nf
.fam C
  load2sqlitedb \-\-species=mouse \-\-dbaccess=vertebrates.db mouse.fa
  load2sqlitedb \-\-species=mouse \-\-dbaccess=vertebrates.db mouse.hints.gff
  load2sqlitedb \-\-species=human \-\-dbaccess=vertebrates.db human.fa
  load2sqlitedb \-\-species=human \-\-dbaccess=vertebrates.db human.hints.gff
.fam
.fi
.if n .RE
.SH "AUTHORS"
.sp
AUGUSTUS was written by M. Stanke, O. Keller, S. König, L. Gerischer and L. Romoth.