.\" Automatically generated by Pod::Man 4.09 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.if !\nF .nr F 0
.if \nF>0 \{\
.    de IX
.    tm Index:\\$1\t\\n%\t"\\$2"
..
.    if !\nF==2 \{\
.        nr % 0
.        nr F 2
.    \}
.\}
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "WIGGLE2GFF3 1p"
.TH WIGGLE2GFF3 1p "2018-10-03" "perl v5.26.2" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
wiggle2gff3.pl \- Converts UCSC WIG format files into gff3 files
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\& wiggle2gff3.pl [options] WIG_FILE > load_data.gff3
.Ve
.PP
Converts \s-1UCSC WIG\s0 format files into \s-1GFF3\s0 files suitable for
loading into GBrowse databases. It is intended for high\-density
quantitative data such as \s-1CNV, SNP\s0 and expression arrays.
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Use this converter when you have dense quantitative data to display
using the xyplot, density, or heatmap glyphs, and too many data items
(thousands) to load into GBrowse directly. It creates one or more
space\-efficient binary files containing the quantitative data, as well
as a small \s-1GFF3\s0 file that can be loaded into Chado or other GBrowse
databases.
.PP
Typical usage is as follows:
.PP
.Vb 1
\& % wiggle2gff3.pl \-\-method=microarray_oligo my_data.wig > my_data.gff3
.Ve
.SS "Options"
.IX Subsection "Options"
The following options are accepted:
.PP
.Vb 3
\&   \-\-method=         Set the method for the GFF3 lines representing
\&                      each quantitative data point in the track.
\&                      The default is "microarray_oligo".
\&
\&   \-\-source=         Set the source field for the GFF3 file. The default is
\&                      none.
\&
\&   \-\-gff3            Create a GFF3\-format file (the default).
\&
\&   \-\-featurefile     Create a "featurefile" format file, the simplified
\&                      format used for GBrowse uploads. This option is
\&                      incompatible with the \-\-gff3 option.
\&
\&   \-\-sample          If set, very large files (>5 MB) will be sampled
\&                      to obtain minimum, maximum and standard deviation; otherwise
\&                      the entire file will be scanned to obtain these statistics.
\&                      Sampling processes the files faster but may miss outlier
\&                      values.
\&
\&   \-\-path=           Specify the directory in which to place the binary wiggle
\&                      files. The default is the current temporary directory
\&                      (/tmp or whatever is appropriate for your operating system).
\&
\&   \-\-base=           Same as "\-\-path".
\&
\&   \-\-trackname       Specify the trackname base used when creating the wigfiles.
\&
\&   \-\-help            Print this documentation.
.Ve
.PP
This script will accept a variety of option styles, including
abbreviated options (\*(L"\-\-meth=foo\*(R"), single character options
(\*(L"\-m foo\*(R"), and other common variants.
.SS "Binary wiggle files"
.IX Subsection "Binary wiggle files"
The binary \*(L"wiggle\*(R" files created by this utility are readable using
the Bio::Graphics::Wiggle module. The quantitative data is scaled to
the range 1\-255 (losing a lot of precision, but still more than
enough for data visualization), and stored in a packed format in which
each file corresponds to the length of a single chromosome or contig.
.PP
Once created, the binary files should not be moved or renamed unless
you make the corresponding changes to the pathnames given by the
\&\*(L"wigfile\*(R" attribute in the \s-1GFF3\s0 file feature lines. Also take
care when using the cp command to copy the binary files: they are
formatted with \*(L"holes\*(R" so that missing data does not take up any
space on disk. If you cp them, the holes will fill up with zeroes and
the space savings will be lost. It is better to use the \*(L"tar\*(R" command
with its \-\-sparse option to move the files from one place to another.
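.PP
The advice above about sparse files can be tried out with a short shell sketch. It assumes GNU tar and GNU coreutils (truncate, cmp, du) are available. The scratch directory and the file name track001.chr19.wig are hypothetical stand\-ins, and the file is created by hand here rather than by wiggle2gff3.pl.

```shell
#!/bin/sh
# Sketch: move a sparse file with tar --sparse and check the result.
# Assumes GNU tar and GNU coreutils; every path and file name below is a
# hypothetical stand-in, not real output of wiggle2gff3.pl.
set -eu

work=$(mktemp -d)
mkdir "$work/src" "$work/dest"

# Stand-in for a binary wiggle file: an 8 MiB hole followed by 4 data bytes.
truncate -s 8M "$work/src/track001.chr19.wig"
printf 'data' >> "$work/src/track001.chr19.wig"

# Archive with --sparse so that holes are recorded as holes, then unpack
# the archive into the destination directory.
tar --sparse -cf "$work/wig.tar" -C "$work/src" track001.chr19.wig
tar -xf "$work/wig.tar" -C "$work/dest"

# The contents must be identical; du shows the space allocated on disk.
cmp "$work/src/track001.chr19.wig" "$work/dest/track001.chr19.wig"
du -k "$work/src/track001.chr19.wig" "$work/dest/track001.chr19.wig"
```

.PP
On a filesystem that supports holes, du should report far less than the 8 MiB apparent size for both copies. If the second figure is much larger than the first, the holes were filled in during the move. GNU cp also documents a \-\-sparse=always option; check your own cp manual before relying on it.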
.SS "Example \s-1WIG\s0 File"
.IX Subsection "Example WIG File"
This example is from :
.PP
.Vb 10
\& # filename: example.wig
\& #
\& # 300 base wide bar graph, autoScale is on by default == graphing
\& # limits will dynamically change to always show full range of data
\& # in viewing window, priority = 20 positions this as the second graph
\& # Note, zero\-relative, half\-open coordinate system in use for bed format
\& track type=wiggle_0 name="Bed Format" description="BED format" \e
\&     visibility=full color=200,100,0 altColor=0,100,200 priority=20
\& chr19 59302000 59302300 \-1.0
\& chr19 59302300 59302600 \-0.75
\& chr19 59302600 59302900 \-0.50
\& chr19 59302900 59303200 \-0.25
\& chr19 59303200 59303500 0.0
\& chr19 59303500 59303800 0.25
\& chr19 59303800 59304100 0.50
\& chr19 59304100 59304400 0.75
\& chr19 59304400 59304700 1.00
\& # 150 base wide bar graph at arbitrarily spaced positions,
\& # threshold line drawn at y=11.76
\& # autoScale off viewing range set to [0:25]
\& # priority = 10 positions this as the first graph
\& # Note, one\-relative coordinate system in use for this format
\& track type=wiggle_0 name="variableStep" description="variableStep format" \e
\&     visibility=full autoScale=off viewLimits=0.0:25.0 color=255,200,0 \e
\&     yLineMark=11.76 yLineOnOff=on priority=10
\& variableStep chrom=chr19 span=150
\& 59304701 10.0
\& 59304901 12.5
\& 59305401 15.0
\& 59305601 17.5
\& 59305901 20.0
\& 59306081 17.5
\& 59306301 15.0
\& 59306691 12.5
\& 59307871 10.0
\& # 200 base wide points graph at every 300 bases, 50 pixel high graph
\& # autoScale off and viewing range set to [0:1000]
\& # priority = 30 positions this as the third graph
\& # Note, one\-relative coordinate system in use for this format
\& track type=wiggle_0 name="fixedStep" description="fixed step" visibility=full \e
\&     autoScale=off viewLimits=0:1000 color=0,200,100 maxHeightPixels=100:50:20 \e
\&     graphType=points priority=30
\& fixedStep chrom=chr19 start=59307401 step=300 span=200
\& 1000
\& 900
\& 800
\& 700
\& 600
\& 500
\& 400
\& 300
\& 200
\& 100
.Ve
.PP
You can convert this into a loadable \s-1GFF3\s0 file with the following
command:
.PP
.Vb 2
\& wiggle2gff3.pl \-\-meth=example \-\-so=example \-\-path=/var/gbrowse/db example.wig \e
\&     > example.gff3
.Ve
.PP
The output will look like this:
.PP
.Vb 1
\& ##gff\-version 3
\&
\& chr19 example example 59302001 59304700 . . . Name=Bed Format;wigfile=/var/gbrowse/db/track001.chr19.1199828298.wig
\& chr19 example example 59304701 59308020 . . . Name=variableStep;wigfile=/var/gbrowse/db/track002.chr19.1199828298.wig
\& chr19 example example 59307401 59310400 . . . Name=fixedStep;wigfile=/var/gbrowse/db/track003.chr19.1199828298.wig
.Ve
.SH "PROBLEMS"
.IX Header "PROBLEMS"
This script has trouble with wig files from very fragmented genomes
(>100K scaffolds). In this case, you may wish to run split_wig.pl,
which splits the original wig file into a series of smaller files with
a maximum of 900 scaffolds each. It then runs wiggle2gff3.pl on each
subfile and stores the results in separate folders.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Bio::DB::GFF,
bp_bulk_load_gff.pl,
bp_fast_load_gff.pl,
bp_load_gff.pl,
bp_seqfeature_load.pl
.SH "AUTHOR"
.IX Header "AUTHOR"
Lincoln Stein .
.PP
Copyright (c) 2008 Cold Spring Harbor Laboratory
.PP
This package is free software; you can redistribute it and/or modify
it under the terms of the \s-1GPL\s0 (either version 1, or at your option,
any later version) or the Artistic License 2.0. Refer to \s-1LICENSE\s0
for the full license text. See \s-1DISCLAIMER\s0.txt for
disclaimers of warranty.