.\" Automatically generated by Pod::Man 4.11 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\"  diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "BP_PROCESS_GADFLY 1p"
.TH BP_PROCESS_GADFLY 1p "2020-10-28" "perl v5.30.3" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.
.\" Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
bp_process_gadfly.pl \- Massage Gadfly/FlyBase GFF files into a version suitable for the Generic Genome Browser
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
.Vb 1
\&  % bp_process_gadfly.pl na_whole\-genome_genomic_dmel_RELEASE3.FASTA whole\-genome_annotation\-feature\-region_dmel_RELEASE3.GFF > fly.gff
.Ve
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
This script massages the \s-1RELEASE 3\s0 FlyBase/Gadfly \s-1GFF\s0 files located at
http://www.fruitfly.org/sequence/release3download.shtml into the
\&\*(L"correct\*(R" version of the \s-1GFF\s0 format.
.PP
To use this script, download the whole genome \s-1FASTA\s0 file and save it
to disk.  (The downloaded file will be called something like
\&\*(L"na_whole\-genome_genomic_dmel_RELEASE3.FASTA\*(R", but the link on the
\&\s-1HTML\s0 page doesn't give the filename.)  Do the same for the whole
genome \s-1GFF\s0 annotation file (the saved file will be called something
like \*(L"whole\-genome_annotation\-feature\-region_dmel_RELEASE3.GFF\*(R").
If you wish, you can download the \s-1ZIP\s0\-compressed versions of these
files instead.
.PP
Next, run this script on the two files, giving the name of the
downloaded \s-1FASTA\s0 file first, followed by the \s-1GFF\s0 file:
.PP
.Vb 1
\&  % bp_process_gadfly.pl na_whole\-genome_genomic_dmel_RELEASE3.FASTA whole\-genome_annotation\-feature\-region_dmel_RELEASE3.GFF > fly.gff
.Ve
.PP
The resulting fly.gff file and the \s-1FASTA\s0 file can now be loaded into a
Bio::DB::GFF database using the following command:
.PP
.Vb 1
\&  % bulk_load_gff.pl \-d fly \-fasta na_whole\-genome_genomic_dmel_RELEASE3.FASTA fly.gff
.Ve
.PP
(Here \*(L"fly\*(R" is the name of the database; change it as appropriate.
The database must already exist and be writable by you.)
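.PP
Before loading, it can be useful to check which feature types the
converted file actually contains.  The following is a minimal shell
sketch, not part of this script: the function name count_feature_types
is hypothetical, and it assumes the output is standard nine\-column \s-1GFF\s0
(source in column 2, method in column 3) with no whitespace inside the
first eight columns.
.PP
.Vb 8
```shell
# Sketch only: count_feature_types is a hypothetical helper, not part of BioPerl.
# Assumes whitespace-free values in the first eight GFF columns.
count_feature_types() {
    # Skip comment lines, print method:source (column 3, column 2), then tally.
    awk '!/^#/ && NF >= 8 { print $3 ":" $2 }' "$1" | sort | uniq -c
}
```
.Ve
.PP
Running count_feature_types fly.gff prints one line per feature type,
preceded by the number of records of that type.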
.PP
The resulting database will have the following feature types
(represented as \*(L"method:source\*(R"):
.PP
.Vb 10
\&  Component:arm              A chromosome arm
\&  Component:scaffold         A chromosome scaffold (accession #)
\&  Component:gap              A gap in the assembly
\&  clone:clonelocator         A BAC clone
\&  gene:gadfly                A gene accession number
\&  transcript:gadfly          A transcript accession number
\&  translation:gadfly         A translation
\&  codon:gadfly               Significance unknown
\&  exon:gadfly                An exon
\&  symbol:gadfly              A classical gene symbol
\&  similarity:blastn          A BLASTN hit
\&  similarity:blastx          A BLASTX hit
\&  similarity:sim4            EST\->genome using SIM4
\&  similarity:groupest        EST\->genome using GROUPEST
\&  similarity:repeatmasker    A repeat
.Ve
.PP
\&\s-1IMPORTANT NOTE:\s0 This script will *only* work with the \s-1RELEASE3\s0 gadfly
files and will not work with earlier releases.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
Bio::DB::GFF, bulk_load_gff.pl, load_gff.pl
.SH "AUTHOR"
.IX Header "AUTHOR"
Lincoln Stein, lstein@cshl.org
.PP
Copyright (c) 2002 Cold Spring Harbor Laboratory
.PP
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.  See \s-1DISCLAIMER\s0.txt for
disclaimers of warranty.