.\" Automatically generated by Pod::Man 4.11 (Pod::Simple 3.35)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el       .ds Aq '
.\"
.\" If the F register is >0, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{\
.    if \nF \{\
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{\
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\" ========================================================================
.\"
.IX Title "Bio::DB::GFF::Feature 3pm"
.TH Bio::DB::GFF::Feature 3pm "2020-01-13" "perl v5.30.0" "User Contributed Perl Documentation"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
Bio::DB::GFF::Feature \-\- A relative segment identified by a feature type
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
See Bio::DB::GFF.
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Bio::DB::GFF::Feature is a stretch of sequence that corresponds to a
single annotation in a \s-1GFF\s0 database.  It inherits from
Bio::DB::GFF::RelSegment, and so has all the support for relative
addressing of this class and its ancestors.  It also inherits from
Bio::SeqFeatureI and so has the familiar \fBstart()\fR, \fBstop()\fR,
\&\fBprimary_tag()\fR and \fBlocation()\fR methods (it implements
Bio::LocationI too, if needed).
.PP
Bio::DB::GFF::Feature adds new methods to retrieve the annotation
type, group, and other \s-1GFF\s0 attributes.  Annotation types are
represented by Bio::DB::GFF::Typename objects, a simple class that has
two methods called \fBmethod()\fR and \fBsource()\fR.  These correspond to the
method and source fields of a \s-1GFF\s0 file.
.PP
Annotation groups serve the dual purpose of giving the annotation a
human-readable name, and providing higher-order groupings of
subfeatures into features.  The groups returned by this module are
objects of the Bio::DB::GFF::Featname class.
.PP
Bio::DB::GFF::Feature inherits from and implements the abstract methods
of Bio::SeqFeatureI, allowing it to interoperate with other Bioperl
modules.
.PP
Generally, you will not create or manipulate Bio::DB::GFF::Feature
objects directly, but use those that are returned by the
Bio::DB::GFF::RelSegment\->\fBfeatures()\fR method.
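.PP
As a minimal sketch of the workflow described above, the following
fetches features from a segment and reads their type and group.  It
assumes the in-memory adaptor and a hypothetical file named
example.gff; the reference name, the feature type, and the use of the
Featname \fBname()\fR accessor are likewise assumptions made for
illustration.
.PP
.Vb 10
\& use strict;
\& use warnings;
\& use Bio::DB::GFF;
\&
\& # Assumption: \*(Aqexample.gff\*(Aq is a hypothetical GFF file that holds
\& # \*(Aqtranscript\*(Aq features on a reference sequence named CHROMOSOME_I.
\& my $db = Bio::DB::GFF\->new(\-adaptor => \*(Aqmemory\*(Aq,
\&                            \-gff     => \*(Aqexample.gff\*(Aq);
\& my $segment = $db\->segment(\*(AqCHROMOSOME_I\*(Aq)
\&     or die "no such segment\en";
\&
\& for my $f ($segment\->features(\*(Aqtranscript\*(Aq)) {
\&     my $type  = $f\->type;     # a Bio::DB::GFF::Typename object
\&     my $group = $f\->group;    # a Bio::DB::GFF::Featname object, if any
\&     # name() is assumed to be the Featname accessor for the group name
\&     my $label = defined $group ? $group\->name : \*(Aq.\*(Aq;
\&     printf "%s\et%s\et%s\et%d..%d\en",
\&         $type\->method, $type\->source, $label, $f\->start, $f\->end;
\& }
.Ve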
.SS "Important note about \fBstart()\fP vs \fBend()\fP"
.IX Subsection "Important note about start() vs end()"
If features are derived from segments that use relative addressing
(which is the default), then \fBstart()\fR will be greater than \fBend()\fR if
the feature is on the opposite strand from the reference sequence.
This breaks Bio::SeqI compliance, but is necessary to avoid having the
real genomic locations designated by \fBstart()\fR and \fBend()\fR swap places
when changing reference points.
.PP
To avoid this behavior, call \f(CW$segment\fR\->\fBabsolute\fR\|(1) before fetching
features from it.  This will force everything into absolute
coordinates.
.PP
For example:
.PP
.Vb 3
\& my $segment = $db\->segment(\*(AqCHROMOSOME_I\*(Aq);
\& $segment\->absolute(1);
\& my @features = $segment\->features(\*(Aqtranscript\*(Aq);
.Ve
.SH "API"
.IX Header "API"
The remainder of this document describes the public and private
methods implemented by this module.
.SS "new_from_parent"
.IX Subsection "new_from_parent"
.Vb 6
\& Title   : new_from_parent
\& Usage   : $f = Bio::DB::GFF::Feature\->new_from_parent(@args);
\& Function: create a new feature object
\& Returns : new Bio::DB::GFF::Feature object
\& Args    : see below
\& Status  : Internal
.Ve
.PP
This method is called by Bio::DB::GFF to create a new feature using
information obtained from the \s-1GFF\s0 database.  It is one of two similar
constructors.  This one is called when the feature is generated from a
RelSegment object, and should inherit the coordinate system of that
object.
.PP
The 13 arguments are positional (sorry):
.PP
.Vb 10
\&  $parent       a Bio::DB::GFF::RelSegment object (or descendent)
\&  $start        start of this feature
\&  $stop         stop of this feature
\&  $method       this feature\*(Aqs GFF method
\&  $source       this feature\*(Aqs GFF source
\&  $score        this feature\*(Aqs score
\&  $fstrand      this feature\*(Aqs strand (relative to the source
\&                sequence, which has its own strandedness!)
\&  $phase        this feature\*(Aqs phase
\&  $group        this feature\*(Aqs group (a Bio::DB::GFF::Featname object)
\&  $db_id        this feature\*(Aqs internal database ID
\&  $group_id     this feature\*(Aqs internal group database ID
\&  $tstart       this feature\*(Aqs target start
\&  $tstop        this feature\*(Aqs target stop
.Ve
.PP
tstart and tstop are not used for anything at the moment, since the
information is embedded in the group object.
.SS "new"
.IX Subsection "new"
.Vb 6
\& Title   : new
\& Usage   : $f = Bio::DB::GFF::Feature\->new(@args);
\& Function: create a new feature object
\& Returns : new Bio::DB::GFF::Feature object
\& Args    : see below
\& Status  : Internal
.Ve
.PP
This method is called by Bio::DB::GFF to create a new feature using
information obtained from the \s-1GFF\s0 database.  It is one of two similar
constructors.  This one is called when the feature is generated
without reference to a RelSegment object, and should therefore use its
default coordinate system (relative to itself).
.PP
The 11 arguments are positional:
.PP
.Vb 12
\&  $factory      a Bio::DB::GFF adaptor object (or descendent)
\&  $srcseq       the source sequence
\&  $start        start of this feature
\&  $stop         stop of this feature
\&  $method       this feature\*(Aqs GFF method
\&  $source       this feature\*(Aqs GFF source
\&  $score        this feature\*(Aqs score
\&  $fstrand      this feature\*(Aqs strand (relative to the source
\&                sequence, which has its own strandedness!)
\&  $phase        this feature\*(Aqs phase
\&  $group        this feature\*(Aqs group
\&  $db_id        this feature\*(Aqs internal database ID
.Ve
.SS "type"
.IX Subsection "type"
.Vb 6
\& Title   : type
\& Usage   : $type = $f\->type([$newtype])
\& Function: get or set the feature type
\& Returns : a Bio::DB::GFF::Typename object
\& Args    : a new Typename object (optional)
\& Status  : Public
.Ve
.PP
This method gets or sets the type of the feature.  The type is a
Bio::DB::GFF::Typename object, which encapsulates the feature method
and source.
.PP
The \fBmethod()\fR and \fBsource()\fR methods described next provide shortcuts to
the individual fields of the type.
.SS "method"
.IX Subsection "method"
.Vb 6
\& Title   : method
\& Usage   : $method = $f\->method([$newmethod])
\& Function: get or set the feature method
\& Returns : a string
\& Args    : a new method (optional)
\& Status  : Public
.Ve
.PP
This method gets or sets the feature method.  It is a convenience
feature that delegates the task to the feature's type object.
.SS "source"
.IX Subsection "source"
.Vb 6
\& Title   : source
\& Usage   : $source = $f\->source([$newsource])
\& Function: get or set the feature source
\& Returns : a string
\& Args    : a new source (optional)
\& Status  : Public
.Ve
.PP
This method gets or sets the feature source.  It is a convenience
feature that delegates the task to the feature's type object.
.SS "score"
.IX Subsection "score"
.Vb 6
\& Title   : score
\& Usage   : $score = $f\->score([$newscore])
\& Function: get or set the feature score
\& Returns : a string
\& Args    : a new score (optional)
\& Status  : Public
.Ve
.PP
This method gets or sets the feature score.
.SS "phase"
.IX Subsection "phase"
.Vb 6
\& Title   : phase
\& Usage   : $phase = $f\->phase([$phase])
\& Function: get or set the feature phase
\& Returns : a string
\& Args    : a new phase (optional)
\& Status  : Public
.Ve
.PP
This method gets or sets the feature phase.
.SS "strand"
.IX Subsection "strand"
.Vb 6
\& Title   : strand
\& Usage   : $strand = $f\->strand
\& Function: get the feature strand
\& Returns : +1, 0 or \-1
\& Args    : none
\& Status  : Public
.Ve
.PP
Returns the strand of the feature.  Unlike the other methods, the
strand cannot be changed once the object is created (due to coordinate
considerations).
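.PP
The accessors described so far can be combined as in the sketch below,
which formats one feature as a tab-separated summary.  The helper name
describe_feature is hypothetical and not part of this module; the use
of "." for an undefined score or phase follows the \s-1GFF\s0 file
convention.
.PP
.Vb 10
\& use strict;
\& use warnings;
\&
\& # describe_feature() is a hypothetical helper, not part of this module.
\& sub describe_feature {
\&     my $f = shift;                  # a Bio::DB::GFF::Feature object
\&     my $score  = defined $f\->score ? $f\->score : \*(Aq.\*(Aq;
\&     my $phase  = defined $f\->phase ? $f\->phase : \*(Aq.\*(Aq;
\&     my $strand = $f\->strand || 0;   # +1, 0 or \-1
\&     my $symbol = $strand > 0 ? \*(Aq+\*(Aq
\&                : $strand < 0 ? \*(Aq\-\*(Aq
\&                :               \*(Aq.\*(Aq;
\&     return join "\et", $f\->method, $f\->source, $score, $symbol, $phase;
\& }
.Ve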
.SS "group"
.IX Subsection "group"
.Vb 6
\& Title   : group
\& Usage   : $group = $f\->group([$new_group])
\& Function: get or set the feature group
\& Returns : a Bio::DB::GFF::Featname object
\& Args    : a new group (optional)
\& Status  : Public
.Ve
.PP
This method gets or sets the feature group.  The group is a
Bio::DB::GFF::Featname object, which has an \s-1ID\s0 and a class.
.SS "display_id"
.IX Subsection "display_id"
.Vb 6
\& Title   : display_id
\& Usage   : $display_id = $f\->display_id([$display_id])
\& Function: get or set the feature display id
\& Returns : a Bio::DB::GFF::Featname object
\& Args    : a new display_id (optional)
\& Status  : Public
.Ve
.PP
This method is an alias for \fBgroup()\fR.  It is provided for
Bio::SeqFeatureI compatibility.
.SS "info"
.IX Subsection "info"
.Vb 6
\& Title   : info
\& Usage   : $info = $f\->info([$new_info])
\& Function: get or set the feature group
\& Returns : a Bio::DB::GFF::Featname object
\& Args    : a new group (optional)
\& Status  : Public
.Ve
.PP
This method is an alias for \fBgroup()\fR.  It is provided for AcePerl
compatibility.
.SS "target"
.IX Subsection "target"
.Vb 6
\& Title   : target
\& Usage   : $target = $f\->target([$new_target])
\& Function: get or set the feature target
\& Returns : a Bio::DB::GFF::Homol object
\& Args    : a new group (optional)
\& Status  : Public
.Ve
.PP
This method works like \fBgroup()\fR, but only returns the group if it
implements the \fBstart()\fR method.  This is typical for
similarity/assembly features, where the target encodes the start and
stop location of the alignment.
.PP
The returned object is of type Bio::DB::GFF::Homol, which is a
subclass of Bio::DB::GFF::Segment.
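.PP
A short sketch of how \fBtarget()\fR might be used to report both sides
of an alignment.  The helper name describe_alignment is hypothetical,
and the sketch assumes that the returned Homol object offers the
\fBstart()\fR and \fBend()\fR methods of its Segment ancestors.
.PP
.Vb 10
\& use strict;
\& use warnings;
\&
\& # describe_alignment() is a hypothetical helper, not part of this module.
\& sub describe_alignment {
\&     my $f = shift;                  # a Bio::DB::GFF::Feature object
\&     my $target = $f\->target;        # false unless the group has start()
\&     return unless $target;
\&     return sprintf "query %d..%d, target %d..%d",
\&         $f\->start, $f\->end, $target\->start, $target\->end;
\& }
.Ve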
.SS "flatten_target"
.IX Subsection "flatten_target"
.Vb 6
\& Title   : flatten_target
\& Usage   : $target = $f\->flatten_target($f\->target)
\& Function: flatten a target object
\& Returns : a string (GFF2), an array [GFF2.5] or an array ref [GFF3]
\& Args    : a target object (required), GFF version (optional)
\& Status  : Public
.Ve
.PP
This method flattens a target object into text for
\&\s-1GFF\s0 dumping.  If a second argument is provided, version-specific
vocabulary is used for the flattened target.
.SS "hit"
.IX Subsection "hit"
.Vb 6
\& Title   : hit
\& Usage   : $hit = $f\->hit([$new_hit])
\& Function: get or set the feature hit
\& Returns : a Bio::DB::GFF::Featname object
\& Args    : a new group (optional)
\& Status  : Public
.Ve
.PP
This is the same as \fBtarget()\fR, for compatibility with
Bio::SeqFeature::SimilarityPair.
.SS "id"
.IX Subsection "id"
.Vb 6
\& Title   : id
\& Usage   : $id = $f\->id
\& Function: get the feature ID
\& Returns : a database identifier
\& Args    : none
\& Status  : Public
.Ve
.PP
This method retrieves the database identifier for the feature.  It
cannot be changed.
.SS "group_id"
.IX Subsection "group_id"
.Vb 6
\& Title   : group_id
\& Usage   : $id = $f\->group_id
\& Function: get the feature group ID
\& Returns : a database identifier
\& Args    : none
\& Status  : Public
.Ve
.PP
This method retrieves the database group identifier for the feature.
It cannot be changed.  Often the group identifier is more useful than
the feature identifier, since it is used to refer to a complex object
containing subparts.
.SS "clone"
.IX Subsection "clone"
.Vb 6
\& Title   : clone
\& Usage   : $feature = $f\->clone
\& Function: make a copy of the feature
\& Returns : a new Bio::DB::GFF::Feature object
\& Args    : none
\& Status  : Public
.Ve
.PP
This method returns a copy of the feature.
.SS "compound"
.IX Subsection "compound"
.Vb 6
\& Title   : compound
\& Usage   : $flag = $f\->compound([$newflag])
\& Function: get or set the compound flag
\& Returns : a boolean
\& Args    : a new flag (optional)
\& Status  : Public
.Ve
.PP
This method gets or sets a flag indicating that the feature is not a
primary one from the database, but the result of aggregation.
.SS "sub_SeqFeature"
.IX Subsection "sub_SeqFeature"
.Vb 6
\& Title   : sub_SeqFeature
\& Usage   : @feat = $feature\->sub_SeqFeature([$method])
\& Function: get subfeatures
\& Returns : a list of Bio::DB::GFF::Feature objects
\& Args    : a feature method (optional)
\& Status  : Public
.Ve
.PP
This method returns a list of any subfeatures that belong to the main
feature.  For those features that contain heterogeneous subfeatures,
you can retrieve a subset of the subfeatures by providing a method
name to filter on.
.PP
This method may also be called as \fBsegments()\fR or \fBget_SeqFeatures()\fR.
.SS "add_subfeature"
.IX Subsection "add_subfeature"
.Vb 6
\& Title   : add_subfeature
\& Usage   : $feature\->add_subfeature($feature)
\& Function: add a subfeature to the feature
\& Returns : nothing
\& Args    : a Bio::DB::GFF::Feature object
\& Status  : Public
.Ve
.PP
This method adds a new subfeature to the object.  It is used
internally by aggregators, but is available for public use as well.
.SS "attach_seq"
.IX Subsection "attach_seq"
.Vb 8
\& Title   : attach_seq
\& Usage   : $sf\->attach_seq($seq)
\& Function: Attaches a Bio::Seq object to this feature. This
\&           Bio::Seq object is for the *entire* sequence: ie
\&           from 1 to 10000
\& Example :
\& Returns : TRUE on success
\& Args    : a Bio::PrimarySeqI compliant object
.Ve
.SS "location"
.IX Subsection "location"
.Vb 6
\& Title   : location
\& Usage   : my $location = $seqfeature\->location()
\& Function: returns a location object suitable for identifying location
\&           of feature on sequence or parent feature
\& Returns : Bio::LocationI object
\& Args    : none
.Ve
.SS "entire_seq"
.IX Subsection "entire_seq"
.Vb 7
\& Title   : entire_seq
\& Usage   : $whole_seq = $sf\->entire_seq()
\& Function: gives the entire sequence that this seqfeature is attached to
\& Example :
\& Returns : a Bio::PrimarySeqI compliant object, or undef if there is no
\&           sequence attached
\& Args    : none
.Ve
.SS "merged_segments"
.IX Subsection "merged_segments"
.Vb 6
\& Title   : merged_segments
\& Usage   : @segs = $feature\->merged_segments([$method])
\& Function: get merged subfeatures
\& Returns : a list of Bio::DB::GFF::Feature objects
\& Args    : a feature method (optional)
\& Status  : Public
.Ve
.PP
This method acts like sub_SeqFeature, except that it merges
overlapping segments of the same type into contiguous features.  For
those features that contain heterogeneous subfeatures, you can
retrieve a subset of the subfeatures by providing a method name to
filter on.
.PP
A side-effect of this method is that the features are returned in
sorted order by their start position.
.SS "sub_types"
.IX Subsection "sub_types"
.Vb 6
\& Title   : sub_types
\& Usage   : @methods = $feature\->sub_types
\& Function: get methods of all sub\-seqfeatures
\& Returns : a list of method names
\& Args    : none
\& Status  : Public
.Ve
.PP
For those features that contain subfeatures, this method will return a
unique list of method names of those subfeatures, suitable for use
with \fBsub_SeqFeature()\fR.
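.PP
The following sketch shows one way \fBsub_types()\fR,
\fBsub_SeqFeature()\fR and \fBmerged_segments()\fR might be used
together to summarize the parts of a compound feature.  The helper
name summarize_parts is hypothetical; only methods documented above
are called.
.PP
.Vb 10
\& use strict;
\& use warnings;
\&
\& # summarize_parts() is a hypothetical helper, not part of this module.
\& sub summarize_parts {
\&     my $feature = shift;            # a Bio::DB::GFF::Feature object
\&     for my $method ($feature\->sub_types) {
\&         my @parts  = $feature\->sub_SeqFeature($method);
\&         my @merged = $feature\->merged_segments($method);
\&         printf "%s: %d subfeatures, %d after merging\en",
\&             $method, scalar @parts, scalar @merged;
\&     }
\& }
.Ve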
.SS "attributes"
.IX Subsection "attributes"
.Vb 6
\& Title   : attributes
\& Usage   : @attributes = $feature\->attributes($name)
\& Function: get the "attributes" on a particular feature
\& Returns : an array of strings
\& Args    : attribute name (optional)
\& Status  : public
.Ve
.PP
Some \s-1GFF\s0 version 2 files use the groups column to store a series of
attribute/value pairs.  In this interpretation of \s-1GFF,\s0 the first such
pair is treated as the primary group for the feature; subsequent pairs
are treated as attributes.  Two attributes have special meaning:
\&\*(L"Note\*(R" is for backward compatibility and is used for unstructured text
remarks.  \*(L"Alias\*(R" is treated as a synonym for the feature name.
.PP
.Vb 2
\& @gene_names = $feature\->attributes(\*(AqGene\*(Aq);
\& @aliases    = $feature\->attributes(\*(AqAlias\*(Aq);
.Ve
.PP
If no name is provided, then \fBattributes()\fR returns a flattened hash of
attribute=>value pairs.  This lets you do:
.PP
.Vb 1
\& %attributes = $feature\->attributes;
.Ve
.SS "notes"
.IX Subsection "notes"
.Vb 6
\& Title   : notes
\& Usage   : @notes = $feature\->notes
\& Function: get the "notes" on a particular feature
\& Returns : an array of strings
\& Args    : none
\& Status  : public
.Ve
.PP
Some \s-1GFF\s0 version 2 files use the groups column to store various notes
and remarks.  Adaptors can elect to store the notes in the database,
or just ignore them.  For those adaptors that store the notes, the
\&\fBnotes()\fR method will return them as a list.
.SS "aliases"
.IX Subsection "aliases"
.Vb 6
\& Title   : aliases
\& Usage   : @aliases = $feature\->aliases
\& Function: get the "aliases" on a particular feature
\& Returns : an array of strings
\& Args    : none
\& Status  : public
.Ve
.PP
This method will return a list of attributes of type 'Alias'.
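.PP
Assigning the flattened list straight to a hash keeps only one value
per attribute name.  If a feature may carry the same attribute more
than once (an assumption about the data, not something this page
states), a sketch like the following collects every value.  The helper
name attributes_by_name is hypothetical.
.PP
.Vb 10
\& use strict;
\& use warnings;
\&
\& # attributes_by_name() is a hypothetical helper, not part of this module.
\& sub attributes_by_name {
\&     my $feature = shift;            # a Bio::DB::GFF::Feature object
\&     my @flat = $feature\->attributes;   # (name, value, name, value, ...)
\&     my %by_name;
\&     while (@flat) {
\&         my ($name, $value) = splice @flat, 0, 2;
\&         push @{ $by_name{$name} }, $value;
\&     }
\&     return \e%by_name;               # name => [ values in list order ]
\& }
.Ve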
.SS "Autogenerated Methods"
.IX Subsection "Autogenerated Methods"
.Vb 6
\& Title   : AUTOLOAD
\& Usage   : @subfeat = $feature\->Method
\& Function: Return subfeatures using autogenerated methods
\& Returns : a list of Bio::DB::GFF::Feature objects
\& Args    : none
\& Status  : Public
.Ve
.PP
Any method that begins with an initial capital letter will be passed
to \s-1AUTOLOAD\s0 and treated as a call to sub_SeqFeature with the method
name used as the method argument.  For instance, this call:
.PP
.Vb 1
\& @exons = $feature\->Exon;
.Ve
.PP
is equivalent to this call:
.PP
.Vb 1
\& @exons = $feature\->sub_SeqFeature(\*(Aqexon\*(Aq);
.Ve
.SS "SeqFeatureI methods"
.IX Subsection "SeqFeatureI methods"
The following Bio::SeqFeatureI methods are implemented:
.PP
\&\fBprimary_tag()\fR, \fBsource_tag()\fR, \fBall_tags()\fR, \fBhas_tag()\fR,
\fBeach_tag_value()\fR [renamed \fBget_tag_values()\fR].
.SS "adjust_bounds"
.IX Subsection "adjust_bounds"
.Vb 6
\& Title   : adjust_bounds
\& Usage   : $feature\->adjust_bounds
\& Function: adjust the bounds of a feature
\& Returns : ($start,$stop,$strand)
\& Args    : none
\& Status  : Public
.Ve
.PP
This method adjusts the boundaries of the feature to enclose all its
subfeatures.  It returns the new start, stop and strand of the
enclosing feature.
.SS "sort_features"
.IX Subsection "sort_features"
.Vb 6
\& Title   : sort_features
\& Usage   : $feature\->sort_features
\& Function: sort features
\& Returns : nothing
\& Args    : none
\& Status  : Public
.Ve
.PP
This method sorts subfeatures in ascending order by their start
position.  For reverse strand features, it sorts subfeatures in
descending order.  After this is called, sub_SeqFeature will return the
features in order.
.PP
This method is called internally by \fBmerged_segments()\fR.
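.PP
As a sketch of how \fBadd_subfeature()\fR, \fBadjust_bounds()\fR and
\fBsort_features()\fR might be combined when assembling a compound
feature by hand.  The helper name attach_parts is hypothetical, and
the order of the calls is an illustrative choice rather than a
requirement stated by this module.
.PP
.Vb 10
\& use strict;
\& use warnings;
\&
\& # attach_parts() is a hypothetical helper, not part of this module.
\& sub attach_parts {
\&     my ($parent, @parts) = @_;      # all Bio::DB::GFF::Feature objects
\&     $parent\->add_subfeature($_) for @parts;
\&     # Stretch the parent so that it encloses every subfeature.
\&     my ($start, $stop, $strand) = $parent\->adjust_bounds;
\&     # Put the subfeatures in order for later sub_SeqFeature calls.
\&     $parent\->sort_features;
\&     return ($start, $stop, $strand);
\& }
.Ve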
.SS "asString"
.IX Subsection "asString"
.Vb 6
\& Title   : asString
\& Usage   : $string = $feature\->asString
\& Function: return human\-readable representation of feature
\& Returns : a string
\& Args    : none
\& Status  : Public
.Ve
.PP
This method returns a human-readable representation of the feature and
is called by the overloaded "" operator.
.SS "gff_string"
.IX Subsection "gff_string"
.Vb 6
\& Title   : gff_string
\& Usage   : $string = $feature\->gff_string
\& Function: return GFF2 or GFF2.5 representation of feature
\& Returns : a string
\& Args    : none
\& Status  : Public
.Ve
.SS "gff3_string"
.IX Subsection "gff3_string"
.Vb 7
\& Title   : gff3_string
\& Usage   : $string = $feature\->gff3_string([$recurse])
\& Function: return GFF3 representation of feature
\& Returns : a string
\& Args    : An optional flag, which if true, will cause the feature to recurse over
\&           subfeatures.
\& Status  : Public
.Ve
.SS "version"
.IX Subsection "version"
.Vb 6
\& Title   : version
\& Usage   : $feature\->version()
\& Function: get/set the GFF version to be returned by gff_string
\& Returns : the GFF version (default is 2)
\& Args    : the GFF version (2, 2.5 or 3)
\& Status  : Public
.Ve
.SS "\fBcmap_link()\fP"
.IX Subsection "cmap_link()"
.Vb 6
\& Title   : cmap_link
\& Usage   : $link = $feature\->cmap_link
\& Function: returns a URL link to the corresponding feature in cmap
\& Returns : a string
\& Args    : none
\& Status  : Public
.Ve
.PP
In an integrated cmap/gbrowse installation, this method returns a link
to the map; otherwise it returns a link to a feature search on the
feature name.  See the cmap documentation for more information.
.PP
This function is intended primarily to be used in gbrowse conf files.
For example:
.PP
.Vb 1
\&  link = sub {my $self = shift; return $self\->cmap_viewer_link(data_source);}
.Ve
.SH "A Note About Similarities"
.IX Header "A Note About Similarities"
The current default aggregator for \s-1GFF\s0 \*(L"similarity\*(R" features creates
a composite Bio::DB::GFF::Feature object of type \*(L"gapped_alignment\*(R".
The \fBtarget()\fR method for the feature as a whole will return a
RelSegment object that is as long as the extremes of the similarity
hit target, but will not necessarily be the same length as the query
sequence.  The length of each \*(L"similarity\*(R" subfeature will be exactly
the same length as its \fBtarget()\fR.  These subfeatures are essentially
the HSPs of the match.
.PP
The following illustrates this:
.PP
.Vb 2
\&  @similarities = $segment\->features(\*(Aqsimilarity:BLASTN\*(Aq);
\&  $sim          = $similarities[0];
\&
\&  print $sim\->type;        # yields "gapped_similarity:BLASTN"
\&
\&  $query_length  = $sim\->length;
\&  $target_length = $sim\->target\->length;  # $query_length != $target_length
\&
\&  @matches = $sim\->Similarity;   # use autogenerated method
\&  $query1_length  = $matches[0]\->length;
\&  $target1_length = $matches[0]\->target\->length; # $query1_length == $target1_length
.Ve
.PP
If you merge segments by calling \fBmerged_segments()\fR, then the length of
the query sequence segments will no longer necessarily equal the
length of the targets, because the alignment information will have
been lost.  Nevertheless, the targets are adjusted so that the first
and last base pairs of the query match the first and last base pairs
of the target.
.SH "BUGS"
.IX Header "BUGS"
This module is still under development.
.SH "SEE ALSO"
.IX Header "SEE ALSO"
bioperl, Bio::DB::GFF, Bio::DB::GFF::RelSegment
.SH "AUTHOR"
.IX Header "AUTHOR"
Lincoln Stein.
.PP
Copyright (c) 2001 Cold Spring Harbor Laboratory.
.PP
This library is free software; you can redistribute it and/or modify
it under the same terms as Perl itself.