NAME¶
TFBS::Matrix::ICM - class for information content matrices of nucleotide
patterns
SYNOPSIS¶
* creating a TFBS::Matrix::ICM object manually:
my $matrixref = [ [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
[ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
[ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
[ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ]
];
my $icm = TFBS::Matrix::ICM->new(-matrix => $matrixref,
-name => "MyProfile",
-ID => "M0001"
);
# or
my $matrixstring = <<ENDMATRIX
2.00 0.30 0.00 0.00 0.24 0.00
0.00 0.00 0.00 1.45 0.42 0.00
0.00 0.89 2.00 0.00 0.00 0.00
0.00 0.00 0.00 0.13 0.06 2.00
ENDMATRIX
;
my $icm = TFBS::Matrix::ICM->new(-matrixstring => $matrixstring,
-name => "MyProfile",
-ID => "M0001"
);
* retrieving a TFBS::Matrix::ICM object from a database:
(See documentation of individual TFBS::DB::* modules to learn how to connect
to different types of pattern databases and retrieve TFBS::Matrix::*
objects from them.)
my $db_obj = TFBS::DB::JASPAR2->new
(-connect => ["dbi:mysql:JASPAR2:myhost",
"myusername", "mypassword"]);
my $icm = $db_obj->get_Matrix_by_ID("M0001", "ICM");
# or
my $icm = $db_obj->get_Matrix_by_name("MyProfile", "ICM");
* retrieving a list of individual TFBS::Matrix::ICM objects from a
TFBS::MatrixSet object
(see documentation of TFBS::MatrixSet to learn how to create objects for
storage and manipulation of multiple matrices)
my @icm_list = $matrixset->all_patterns(-sort_by=>"name");
* drawing a sequence logo
$icm->draw_logo(-file=>"logo.png",
-full_scale =>2.25,
-xsize=>500,
-ysize =>250,
-graph_title=>"C/EBPalpha binding site logo",
-x_title=>"position",
-y_title=>"bits");
DESCRIPTION¶
TFBS::Matrix::ICM is a class whose instances are objects representing
information content matrices (ICMs). An ICM is normally calculated from a raw
position frequency matrix (see TFBS::Matrix::PFM for the explanation of
position frequency matrices). For example, given the following position
frequency matrix,
A:[ 12 3 0 0 4 0 ]
C:[ 0 0 0 11 7 0 ]
G:[ 0 9 12 0 0 0 ]
T:[ 0 0 0 1 1 12 ]
the information content of each column is computed and divided among the four
nucleotides in proportion to their frequencies, which gives the following
information content matrix:
A:[2.00 0.30 0.00 0.00 0.24 0.00]
C:[0.00 0.00 0.00 1.45 0.42 0.00]
G:[0.00 0.89 2.00 0.00 0.00 0.00]
T:[0.00 0.00 0.00 0.13 0.06 2.00]
which contains the information content, in bits, associated with the
occurrence of each nucleotide at the given position in a pattern.
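The conversion in the example above can be reproduced with a few lines of plain Perl. This is a minimal sketch, not the code TFBS itself uses: it assumes a uniform background, no pseudocounts and no small-sample correction, and the helper name pfm_to_icm is hypothetical. Under those assumptions it yields the values shown above when rounded to two decimals.

```perl
use strict;
use warnings;

# Position frequency matrix from the example above; rows are A, C, G, T.
my @pfm = (
    [ 12, 3,  0,  0, 4,  0 ],
    [  0, 0,  0, 11, 7,  0 ],
    [  0, 9, 12,  0, 0,  0 ],
    [  0, 0,  0,  1, 1, 12 ],
);

# Hypothetical helper, not part of TFBS: plain Shannon information content,
# assuming a uniform background and no pseudocounts or corrections.
sub pfm_to_icm {
    my @counts = @_;
    my $ncol   = scalar @{ $counts[0] };
    my @icm    = map { [ (0) x $ncol ] } @counts;
    for my $col (0 .. $ncol - 1) {
        my $total = 0;
        $total += $_->[$col] for @counts;
        next unless $total > 0;

        # Column entropy in bits; zero counts contribute nothing.
        my $entropy = 0;
        for my $row (@counts) {
            my $p = $row->[$col] / $total;
            $entropy -= $p * log($p) / log(2) if $p > 0;
        }

        # log2(4) = 2 bits is the maximum for a four-letter alphabet.
        my $info = 2 - $entropy;

        # Each nucleotide receives a share proportional to its frequency.
        for my $r (0 .. $#counts) {
            $icm[$r][$col] = ($counts[$r][$col] / $total) * $info;
        }
    }
    return @icm;
}

my @icm    = pfm_to_icm(@pfm);
my @labels = qw(A C G T);
for my $r (0 .. $#icm) {
    printf "%s:[%s]\n", $labels[$r],
        join(" ", map { sprintf "%.2f", $_ } @{ $icm[$r] });
}
```

For the second column, for instance, the frequencies are 0.25 (A) and 0.75 (G), the column entropy is about 0.81 bits, and the remaining 1.19 bits are split into 0.30 for A and 0.89 for G.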
A TFBS::Matrix::PWM object (see to_PWM below) is equipped with methods to
search nucleotide sequences and pairwise alignments of nucleotide sequences
for the pattern it represents, and to return the set of sites found (a
TFBS::SiteSet object for a single-sequence search, and a TFBS::SitePairSet
object for an alignment search).
FEEDBACK¶
Please send bug reports and other comments to the author.
AUTHOR - Boris Lenhard¶
Boris Lenhard <Boris.Lenhard@cgb.ki.se>
APPENDIX¶
The rest of the documentation details each of the object methods. Internal
methods are preceded with an underscore.
new¶
Title : new
Usage : my $icm = TFBS::Matrix::ICM->new(%args)
Function: constructor for the TFBS::Matrix::ICM object
Returns : a new TFBS::Matrix::ICM object
Args : # you must specify exactly one of the following three:
-matrix, # reference to an array of arrays of numbers
#or
-matrixstring,# a string containing four lines
# of tab- or space-delimited numbers
#or
-matrixfile, # the name of a file containing four lines
# of tab- or space-delimited numbers
#######
-name, # string, OPTIONAL
-ID, # string, OPTIONAL
-class, # string, OPTIONAL
-tags # an array reference, OPTIONAL
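The -matrix and -matrixstring forms carry the same data in two shapes. The sketch below shows one way a four-line matrix string maps onto an array of arrays, with rows in the order A, C, G, T as in the DESCRIPTION. The helper parse_matrix_string is hypothetical and only illustrates the format; the constructor does its own parsing, which may differ in details such as validation.

```perl
use strict;
use warnings;

my $matrixstring = <<'ENDMATRIX';
2.00 0.30 0.00 0.00 0.24 0.00
0.00 0.00 0.00 1.45 0.42 0.00
0.00 0.89 2.00 0.00 0.00 0.00
0.00 0.00 0.00 0.13 0.06 2.00
ENDMATRIX

# Hypothetical helper, not part of TFBS: one row per non-empty line,
# values separated by runs of spaces or tabs.
sub parse_matrix_string {
    my ($string) = @_;
    my @rows;
    for my $line (split /\n/, $string) {
        next unless $line =~ /\S/;      # skip blank lines
        $line =~ s/^\s+|\s+$//g;        # trim leading and trailing whitespace
        push @rows, [ split /\s+/, $line ];
    }
    die "expected 4 rows (A, C, G, T), got " . scalar(@rows) . "\n"
        unless @rows == 4;
    my $width = @{ $rows[0] };
    for my $row (@rows) {
        die "rows differ in length\n" unless @$row == $width;
    }
    return \@rows;
}

my $matrixref = parse_matrix_string($matrixstring);
printf "%d rows x %d columns\n",
    scalar @$matrixref, scalar @{ $matrixref->[0] };
```

The resulting $matrixref has the same structure as the array reference passed to -matrix in the SYNOPSIS.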
to_PWM¶
Title : to_PWM
Usage : my $pwm = $icm->to_PWM()
Function: converts an information content matrix (a TFBS::Matrix::ICM object)
to a position weight matrix. At present it assumes a uniform
background distribution of nucleotide frequencies.
Returns : a new TFBS::Matrix::PWM object
Args : none; in future releases, it should be able to accept
a user-defined background probability of the four
nucleotides
draw_logo¶
Title : draw_logo
Usage : my $gdImageObj = $icm->draw_logo(%args)
Function: Draws a "sequence logo", a graphical representation
of a possibly degenerate fixed-width nucleotide
sequence pattern, from the information content matrix
Returns : a GD::Image object;
if you only need the image file you can ignore it
Args : -file, # the name of the output PNG image file
# OPTIONAL: default none
-xsize # width of the image in pixels
# OPTIONAL: default 600
-ysize # height of the image in pixels
# OPTIONAL: default 5/8 of -xsize
-startpos # start position in the logo for x axis
# OPTIONAL: default is 1
-margin # size of image margins in pixels
# OPTIONAL: default 15% of -ysize
-full_scale # the maximum value on the y-axis, in bits
# OPTIONAL: default 2.25
-graph_title,# the graph title
# OPTIONAL: default none
-x_title, # x-axis title; OPTIONAL: default none
-y_title # y-axis title; OPTIONAL: default none
-error_bars # reference to an array of S.D. values for each column; OPTIONAL
-ps # if true, produces a PostScript string instead of a GD::Image object
-pdf # if true AND the -file argument is used, produces an output PDF file
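By the usual sequence logo convention, each letter is drawn with a height equal to its information content in bits, and the letters in a column are stacked with the most informative on top. The entries of an ICM are therefore the letter heights, and their column sums are the stack heights, which never exceed 2 bits (hence the headroom in the default -full_scale of 2.25). The sketch below derives that layout from the example matrix. How draw_logo arranges letters internally is not described here, so treat the helper logo_stacks as a hypothetical illustration of the convention, not of the method's implementation.

```perl
use strict;
use warnings;

my @labels = qw(A C G T);

# Information content matrix from the DESCRIPTION; rows are A, C, G, T.
my @icm = (
    [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
    [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
    [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
    [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ],
);

# Hypothetical helper, not part of TFBS: for each column, return the total
# stack height and the visible letters ordered from bottom to top.
sub logo_stacks {
    my @matrix = @_;
    my @stacks;
    for my $col (0 .. $#{ $matrix[0] }) {
        my @letters =
            sort { $a->[1] <=> $b->[1] }    # smallest letter at the bottom
            grep { $_->[1] > 0 }            # zero-height letters are not drawn
            map  { [ $labels[$_], $matrix[$_][$col] ] } 0 .. $#matrix;
        my $height = 0;
        $height += $_->[1] for @letters;
        push @stacks, { height => $height, letters => \@letters };
    }
    return @stacks;
}

my @stacks = logo_stacks(@icm);
for my $i (0 .. $#stacks) {
    printf "position %d: %.2f bits, bottom to top: %s\n",
        $i + 1, $stacks[$i]{height},
        join("", map { $_->[0] } @{ $stacks[$i]{letters} });
}
```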
_draw_ps_logo¶
Title : _draw_ps_logo
Usage : my $postscript_string = $icm->_draw_ps_logo(%args)
Internal method, should be accessed using draw_logo()
Function: Draws a "sequence logo", a graphical representation
of a possibly degenerate fixed-width nucleotide
sequence pattern, from the information content matrix
Returns : a postscript string;
if you only need the image file you can ignore it
Args : -file, # the name of the output PNG image file
# OPTIONAL: default none
-xsize # width of the image in pixels
# OPTIONAL: default 600
-ysize # height of the image in pixels
# OPTIONAL: default 5/8 of -xsize
-full_scale # the maximum value on the y-axis, in bits
# OPTIONAL: default 2.25
-graph_title,# the graph title
# OPTIONAL: default none
-x_title, # x-axis title; OPTIONAL: default none
-y_title # y-axis title; OPTIONAL: default none
_draw_svg_logo¶
name¶
class¶
matrix¶
length¶
revcom¶
rawprint¶
prettyprint¶
The above methods are common to all matrix objects. Please consult TFBS::Matrix
to find out how to use them.
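As a rough illustration of what a reverse complement means for a matrix with rows in the order A, C, G, T: the columns are reversed, and the A and T rows and the C and G rows swap places. The sketch below applies this to a raw array of arrays. It is not the revcom method itself, which is documented in TFBS::Matrix and operates on matrix objects; the helper revcom_matrix is hypothetical.

```perl
use strict;
use warnings;

# Information content matrix from the DESCRIPTION; rows are A, C, G, T.
my @icm = (
    [ 2.00, 0.30, 0.00, 0.00, 0.24, 0.00 ],
    [ 0.00, 0.00, 0.00, 1.45, 0.42, 0.00 ],
    [ 0.00, 0.89, 2.00, 0.00, 0.00, 0.00 ],
    [ 0.00, 0.00, 0.00, 0.13, 0.06, 2.00 ],
);

# Hypothetical helper, not part of TFBS. Reversing the row order swaps
# A with T and C with G; reversing each row reverses the positions.
sub revcom_matrix {
    my @matrix = @_;
    return map { [ reverse @$_ ] } reverse @matrix;
}

my @rc     = revcom_matrix(@icm);
my @labels = qw(A C G T);
for my $r (0 .. $#rc) {
    printf "%s:[%s]\n", $labels[$r],
        join(" ", map { sprintf "%.2f", $_ } @{ $rc[$r] });
}
```

Applying the helper twice returns the original matrix, which is a quick sanity check for any reverse-complement routine.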