.\" Automatically generated by Pod::Man 2.28 (Pod::Simple 3.28)
.\"
.\" Standard preamble:
.\" ========================================================================
.de Sp \" Vertical space (when we can't use .PP)
.if t .sp .5v
.if n .sp
..
.de Vb \" Begin verbatim text
.ft CW
.nf
.ne \\$1
..
.de Ve \" End verbatim text
.ft R
.fi
..
.\" Set up some character translations and predefined strings.  \*(-- will
.\" give an unbreakable dash, \*(PI will give pi, \*(L" will give a left
.\" double quote, and \*(R" will give a right double quote.  \*(C+ will
.\" give a nicer C++.  Capital omega is used to do unbreakable dashes and
.\" therefore won't be available.  \*(C` and \*(C' expand to `' in nroff,
.\" nothing in troff, for use with C<>.
.tr \(*W-
.ds C+ C\v'-.1v'\h'-1p'\s-2+\h'-1p'+\s0\v'.1v'\h'-1p'
.ie n \{\
.    ds -- \(*W-
.    ds PI pi
.    if (\n(.H=4u)&(1m=24u) .ds -- \(*W\h'-12u'\(*W\h'-12u'-\" diablo 10 pitch
.    if (\n(.H=4u)&(1m=20u) .ds -- \(*W\h'-12u'\(*W\h'-8u'-\" diablo 12 pitch
.    ds L" ""
.    ds R" ""
.    ds C` ""
.    ds C' ""
'br\}
.el\{\
.    ds -- \|\(em\|
.    ds PI \(*p
.    ds L" ``
.    ds R" ''
.    ds C`
.    ds C'
'br\}
.\"
.\" Escape single quotes in literal strings from groff's Unicode transform.
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
.\"
.\" If the F register is turned on, we'll generate index entries on stderr for
.\" titles (.TH), headers (.SH), subsections (.SS), items (.Ip), and index
.\" entries marked with X<> in POD.  Of course, you'll have to process the
.\" output yourself in some meaningful fashion.
.\"
.\" Avoid warning from groff about undefined register 'F'.
.de IX
..
.nr rF 0
.if \n(.g .if rF .nr rF 1
.if (\n(rF:(\n(.g==0)) \{
.    if \nF \{
.        de IX
.        tm Index:\\$1\t\\n%\t"\\$2"
..
.        if !\nF==2 \{
.            nr % 0
.            nr F 2
.        \}
.    \}
.\}
.rr rF
.\"
.\" Accent mark definitions (@(#)ms.acc 1.5 88/02/08 SMI; from UCB 4.2).
.\" Fear.  Run.  Save yourself.  No user-serviceable parts.
.    \" fudge factors for nroff and troff
.if n \{\
.    ds #H 0
.    ds #V .8m
.    ds #F .3m
.    ds #[ \f1
.    ds #] \fP
.\}
.if t \{\
.    ds #H ((1u-(\\\\n(.fu%2u))*.13m)
.    ds #V .6m
.    ds #F 0
.    ds #[ \&
.    ds #] \&
.\}
.    \" simple accents for nroff and troff
.if n \{\
.    ds ' \&
.    ds ` \&
.    ds ^ \&
.    ds , \&
.    ds ~ ~
.    ds /
.\}
.if t \{\
.    ds ' \\k:\h'-(\\n(.wu*8/10-\*(#H)'\'\h"|\\n:u"
.    ds ` \\k:\h'-(\\n(.wu*8/10-\*(#H)'\`\h'|\\n:u'
.    ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'^\h'|\\n:u'
.    ds , \\k:\h'-(\\n(.wu*8/10)',\h'|\\n:u'
.    ds ~ \\k:\h'-(\\n(.wu-\*(#H-.1m)'~\h'|\\n:u'
.    ds / \\k:\h'-(\\n(.wu*8/10-\*(#H)'\z\(sl\h'|\\n:u'
.\}
.    \" troff and (daisy-wheel) nroff accents
.ds : \\k:\h'-(\\n(.wu*8/10-\*(#H+.1m+\*(#F)'\v'-\*(#V'\z.\h'.2m+\*(#F'.\h'|\\n:u'\v'\*(#V'
.ds 8 \h'\*(#H'\(*b\h'-\*(#H'
.ds o \\k:\h'-(\\n(.wu+\w'\(de'u-\*(#H)/2u'\v'-.3n'\*(#[\z\(de\v'.3n'\h'|\\n:u'\*(#]
.ds d- \h'\*(#H'\(pd\h'-\w'~'u'\v'-.25m'\f2\(hy\fP\v'.25m'\h'-\*(#H'
.ds D- D\\k:\h'-\w'D'u'\v'-.11m'\z\(hy\v'.11m'\h'|\\n:u'
.ds th \*(#[\v'.3m'\s+1I\s-1\v'-.3m'\h'-(\w'I'u*2/3)'\s-1o\s+1\*(#]
.ds Th \*(#[\s+2I\s-2\h'-\w'I'u*3/5'\v'-.3m'o\v'.3m'\*(#]
.ds ae a\h'-(\w'a'u*4/10)'e
.ds Ae A\h'-(\w'A'u*4/10)'E
.    \" corrections for vroff
.if v .ds ~ \\k:\h'-(\\n(.wu*9/10-\*(#H)'\s-2\u~\d\s+2\h'|\\n:u'
.if v .ds ^ \\k:\h'-(\\n(.wu*10/11-\*(#H)'\v'-.4m'^\v'.4m'\h'|\\n:u'
.    \" for low resolution devices (crt and lpr)
.if \n(.H>23 .if \n(.V>19 \
\{\
.    ds : e
.    ds 8 ss
.    ds o a
.    ds d- d\h'-1'\(ga
.    ds D- D\h'-1'\(hy
.    ds th \o'bp'
.    ds Th \o'LP'
.    ds ae ae
.    ds Ae AE
.\}
.rm #[ #] #H #V #F C
.\" ========================================================================
.\"
.IX Title "PROF 1"
.TH PROF 1 "2015-07-31" "1.0.42" "User Commands"
.\" For nroff, turn off justification.  Always turn off hyphenation; it makes
.\" way too many mistakes in technical documents.
.if n .ad l
.nh
.SH "NAME"
prof \- secondary structure and solvent accessibility predictor
.SH "SYNOPSIS"
.IX Header "SYNOPSIS"
prof [\fI\s-1INPUTFILE\s0\fR+] [\s-1OPTIONS\s0]
.SH "DESCRIPTION"
.IX Header "DESCRIPTION"
Secondary structure is predicted by a system of neural networks with an
expected average accuracy of more than 72% for the three states helix,
strand and loop (Rost & Sander, \s-1PNAS, 1993, 90, 7558\-7562\s0; Rost &
Sander, \s-1JMB, 1993, 232, 584\-599\s0; and Rost & Sander, Proteins, 1994,
19, 55\-72; evaluation of accuracy). Evaluated on the same data set, PROFsec
achieves a three-state accuracy ten percentage points higher than methods
using only single sequence information, and more than six percentage points
higher than, e.g., a method using alignment information based on statistics
(Levin, Pascarella, Argos & Garnier, Prot. Engng., 6, 849\-54, 1993).
PHDsec predictions have three main features:
.IP "1. improved accuracy through evolutionary information from multiple sequence alignments" 4
.IX Item "1. improved accuracy through evolutionary information from multiple sequence alignments"
.PD 0
.IP "2. improved beta-strand prediction through a balanced training procedure" 4
.IX Item "2. improved beta-strand prediction through a balanced training procedure"
.IP "3. more accurate prediction of secondary structure segments by using a multi-level system" 4
.IX Item "3. more accurate prediction of secondary structure segments by using a multi-level system"
.PD
.PP
Solvent accessibility is predicted by a neural network method that reaches
a correlation coefficient of 0.54 (correlation between experimentally
observed and predicted relative solvent accessibility), cross-validated on
a set of 238 globular proteins (Rost & Sander, Proteins, 1994, 20,
216\-226; evaluation of accuracy). The output of the neural network codes
for 10 states of relative accessibility.
Expressed in units of the difference between prediction by homology
modelling (best method) and prediction at random (worst method), PROFacc is
some 26 percentage points better than a comparable neural network using
three output states (buried, intermediate, exposed) and using no
information from multiple alignments.
.PP
Transmembrane helices in integral membrane proteins are predicted by a
system of neural networks. A shortcoming of the network system is that it
often predicts helices that are too long; these are cut by an empirical
filter. The final prediction (Rost et al., Protein Science, 1995, 4,
521\-533; evaluation of accuracy) has an expected per-residue accuracy of
about 95%. The number of false positives, i.e., transmembrane helices
predicted in globular proteins, is about 2%. The neural network prediction
of transmembrane helices (PHDhtm) is refined by a dynamic programming-like
algorithm. This method resulted in correct predictions of all transmembrane
helices for 89% of the 131 proteins used in a cross-validation test; more
than 98% of the transmembrane helices were correctly predicted. The output
of this method is used to predict topology, i.e., the orientation of the
N\-terminus with respect to the membrane. The expected accuracy of the
topology prediction is more than 86%. Prediction accuracy is higher than
average for eukaryotic proteins and lower than average for prokaryotes.
PHDtopology is more accurate than all other methods tested on identical
data sets.
.PP
If no output file option (such as \fB\-\-fileRdb\fR or \fB\-\-fileOut\fR)
is given, the \s-1RDB\s0\-formatted output is written to
\fI./INPUTFILENAME.prof\fR, where 'prof' replaces the extension of the input
file. If the input file has no extension, '.prof' is appended to the input
file name.
.SS "Output format"
.IX Subsection "Output format"
The \s-1RDB\s0 format is self-annotating; see the example outputs in
\fI/share/profphd/prof/exa\fR.
.SH "REFERENCES"
.IX Header "REFERENCES"
.IP "Rost, B. and Sander, C. (1994a). Combining evolutionary information and neural networks to predict protein secondary structure. Proteins, 19(1), 55\-72." 4
.IX Item "Rost, B. and Sander, C. (1994a). Combining evolutionary information and neural networks to predict protein secondary structure. Proteins, 19(1), 55-72."
.PD 0
.IP "Rost, B. and Sander, C. (1994b). Conservation and prediction of solvent accessibility in protein families. Proteins, 20(3), 216\-26." 4
.IX Item "Rost, B. and Sander, C. (1994b). Conservation and prediction of solvent accessibility in protein families. Proteins, 20(3), 216-26."
.IP "Rost, B., Casadio, R., Fariselli, P., and Sander, C. (1995). Transmembrane helices predicted at 95% accuracy. Protein Sci, 4(3), 521\-33." 4
.IX Item "Rost, B., Casadio, R., Fariselli, P., and Sander, C. (1995). Transmembrane helices predicted at 95% accuracy. Protein Sci, 4(3), 521-33."
.PD
.SH "OPTIONS"
.IX Header "OPTIONS"
See each keyword for more help. Most of these are likely to be broken.
.IP "a" 4
.IX Item "a"
alternative connectivity patterns (default=3)
.IP "3" 4
.IX Item "3"
predict sec + acc + htm
.IP "acc" 4
.IX Item "acc"
predict solvent accessibility, only
.IP "ali" 4
.IX Item "ali"
add alignment to 'human\-readable' \s-1PROF\s0 output file(s)
.IP "arch" 4
.IX Item "arch"
system architecture (e.g.: SGI64|SGI5|SGI32|SUNMP|ALPHA)
.IP "ascii" 4
.IX Item "ascii"
write 'human\-readable' \s-1PROF\s0 output file(s)
.IP "best" 4
.IX Item "best"
\&\s-1PROF\s0 with best accuracy and longest run-time
.IP "both" 4
.IX Item "both"
predict secondary structure and solvent accessibility
.IP "data" 4
.IX Item "data"
data= for \s-1HTML\s0 out: only those parts of predictions written
.IP "debug" 4
.IX Item "debug"
keep most intermediate files, print debugging messages
.IP "dirWork" 4
.IX Item "dirWork"
work directory, default: a temporary directory from File::Temp::tempdir.
Must be a fully qualified path.
.Sp
Known to work.
.IP "doEval" 4
.IX Item "doEval"
\&\s-1DO\s0 evaluation for list (only for known structures and lists)
.IP "doFilterHssp" 4
.IX Item "doFilterHssp"
filter the input \s-1HSSP\s0 file (excluding some pairs)
.IP "doHtmfil" 4
.IX Item "doHtmfil"
\&\s-1DO\s0 filter the membrane prediction (default)
.IP "doHtmisit" 4
.IX Item "doHtmisit"
\&\s-1DO\s0 check strength of predicted membrane helix (default)
.IP "doHtmref" 4
.IX Item "doHtmref"
\&\s-1DO\s0 refine the membrane prediction (default)
.IP "doHtmtop" 4
.IX Item "doHtmtop"
\&\s-1DO\s0 membrane helix topology (default)
.IP "dssp" 4
.IX Item "dssp"
convert \s-1PROF\s0 into \s-1DSSP\s0 format
.IP "expand" 4
.IX Item "expand"
expand insertions when converting output to \s-1MSF\s0 format
.IP "fast" 4
.IX Item "fast"
\&\s-1PROF\s0 with lowest accuracy and highest speed
.IP "fileCasp" 4
.IX Item "fileCasp"
name of \s-1PROF\s0 output in \s-1CASP\s0 format (file.caspProf)
.IP "fileDssp" 4
.IX Item "fileDssp"
name of \s-1PROF\s0 output in \s-1DSSP\s0 format (file.dsspProf)
.IP "fileHtml" 4
.IX Item "fileHtml"
name of \s-1PROF\s0 output in \s-1HTML\s0 format (file.htmlProf)
.IP "fileMsf" 4
.IX Item "fileMsf"
name of \s-1PROF\s0 output in \s-1MSF\s0 format (file.msfProf)
.IP "fileNotHtm" 4
.IX Item "fileNotHtm"
name of file flagging that no membrane helix was found
.IP "fileOut" 4
.IX Item "fileOut"
name of \s-1PROF\s0 output in \s-1RDB\s0 format (file.rdbProf)
.Sp
Known to work.
.IP "fileProf" 4
.IX Item "fileProf"
name of \s-1PROF\s0 output in human readable format (file.prof)
.Sp
Broken.
.IP "fileRdb" 4
.IX Item "fileRdb"
name of \s-1PROF\s0 output in \s-1RDB\s0 format (file.rdbProf)
.Sp
Known to work.
.IP "fileSaf" 4
.IX Item "fileSaf"
name of \s-1PROF\s0 output in \s-1SAF\s0 format (file.safProf)
.IP "filter" 4
.IX Item "filter"
filter the input \s-1HSSP\s0 file (excluding some pairs)
.IP "good" 4
.IX Item "good"
\&\s-1PROF\s0 with good accuracy and moderate speed
.IP "graph" 4
.IX Item "graph"
add \s-1ASCII\s0 graph to 'human\-readable' \s-1PROF\s0 output file(s)
.IP "htm" 4
.IX Item "htm"
use: 'htm=' gives the minimal transmembrane helix detected; the default is
\&'htm=8' (resp. htm=0.8); smaller numbers give more false positives and
fewer false negatives!
.IP "html argument" 4
.IX Item "html argument"
\&'html' or 'html=' write \s-1HTML\s0 format of prediction;
\&'html' converts the \s-1PROF\s0 output to \s-1HTML\s0;
\&'html=body' restricts the \s-1HTML\s0 file to the \s-1HTML_BODY\s0 tag part;
\&'html=head' restricts the \s-1HTML\s0 file to the \s-1HTML_HEADER\s0 tag part;
\&'html=all' gives both \s-1HEADER\s0 and \s-1BODY\s0
.IP "keepConv" 4
.IX Item "keepConv"
keep the conversion of the input file to \s-1HSSP\s0 format
.IP "keepFilter argument" 4
.IX Item "keepFilter argument"
<*|doKeepFilter=1> keep the filtered \s-1HSSP\s0 file
.IP "keepHssp argument" 4
.IX Item "keepHssp argument"
<*|doKeepHssp=1> keep the intermediate \s-1HSSP\s0 file
.IP "keepNetDb argument" 4
.IX Item "keepNetDb argument"
<*|doKeepNetDb=1> keep the intermediate DbNet file(s)
.IP "list argument" 4
.IX Item "list argument"
<*|isList=1> input file is list of files
.IP "msf" 4
.IX Item "msf"
convert \s-1PROF\s0 into \s-1MSF\s0 format
.IP "nice" 4
.IX Item "nice"
give 'nice\-D' to set the nice value (priority) of the job
.IP "noProfHead" 4
.IX Item "noProfHead"
do \s-1NOT\s0 copy file with tables into local directory
.IP "noSearch" 4
.IX Item "noSearch"
short for doSearchFile=0, i.e. no searching of \s-1DB\s0 files
.IP "noascii" 4
.IX Item "noascii"
suppress writing \s-1ASCII \s0(i.e.
human readable) result files
.IP "nohtml" 4
.IX Item "nohtml"
suppress writing \s-1HTML\s0 result files
.IP "nonice" 4
.IX Item "nonice"
job will not be niced, i.e. not run with lower priority
.IP "notEval" 4
.IX Item "notEval"
\&\s-1DO NOT\s0 check accuracy even for known structures
.IP "notHtmfil" 4
.IX Item "notHtmfil"
do \s-1NOT\s0 filter the membrane prediction
.IP "notHtmisit" 4
.IX Item "notHtmisit"
do \s-1NOT\s0 check whether or not membrane helix strong enough
.IP "notHtmref" 4
.IX Item "notHtmref"
do \s-1NOT\s0 refine the membrane prediction
.IP "notHtmtop" 4
.IX Item "notHtmtop"
do \s-1NOT\s0 predict membrane helix topology
.IP "nresPerLineAli" 4
.IX Item "nresPerLineAli"
Number of characters used for \s-1MSF\s0 file. Default: 50.
.IP "numresMin" 4
.IX Item "numresMin"
Minimal number of residues to run network, otherwise prd=symbolPrdShort.
Default: 9.
.IP "optJury" 4
.IX Item "optJury"
Adds \s-1PHD\s0 to jury. Default: `normal,usePHD'.
.Sp
Many other parameters change the default for this one as a side effect;
this list is not comprehensive:
.Sp
phd, nophd, /^para(3|Both|Sec|Acc|Htm|CapH|CapE|CapHE)/, /^para?/, jct
.IP "para3" 4
.IX Item "para3"
Parameter file for sec+acc+htm. Default: `<\s-1DIRPROF\s0>/net/PROFboth_best.par'.
.IP "paraAcc" 4
.IX Item "paraAcc"
Parameter file for acc. Default: `<\s-1DIRPROF\s0>/net/PROFacc_best.par'.
.IP "paraBoth" 4
.IX Item "paraBoth"
Parameter file for sec+acc. Default: `<\s-1DIRPROF\s0>/net/PROFboth_best.par'.
.IP "paraSec" 4
.IX Item "paraSec"
Parameter file for sec. Default: `<\s-1DIRPROF\s0>/net/PROFsec_best.par'.
.IP "riSubAcc" 4
.IX Item "riSubAcc"
Minimal reliability index (\s-1RI\s0) for subset PROFacc. Default: 4.
.IP "riSubSec" 4
.IX Item "riSubSec"
Minimal reliability index (\s-1RI\s0) for subset PROFsec. Default: 5.
.IP "riSubSym" 4
.IX Item "riSubSym"
Symbol for residues predicted with \s-1RI\s0 < riSubSec/Acc. Default: `.'.
.IP "s_k_i_p" 4
.IX Item "s_k_i_p"
problems, manual, hints, notation, txt, known, \s-1DONE,\s0 Date, date, aa, Lhssp, numaa, code
.IP "saf" 4
.IX Item "saf"
convert \s-1PROF\s0 into \s-1SAF\s0 format
.IP "scrAddHelp" 4
.IX Item "scrAddHelp"
.PD 0
.IP "scrGoal" 4
.IX Item "scrGoal"
.PD
neural network switching
.IP "scrHelpTxt" 4
.IX Item "scrHelpTxt"
Input file formats accepted: hssp,dssp,msf,saf,fastamul,pirmul,fasta,pir,gcg,swiss
.IP "scrIn" 4
.IX Item "scrIn"
list_of_files (or single file) parameter_file
.IP "scrName" 4
.IX Item "scrName"
prof
.IP "scrNarg" 4
.IX Item "scrNarg"
2
.IP "sec" 4
.IX Item "sec"
predict secondary structure, only
.IP "silent" 4
.IX Item "silent"
no information written to screen \- this is the default
.IP "skipMissing" 4
.IX Item "skipMissing"
do not abort if input file missing!
.IP "sourceFile" 4
.IX Item "sourceFile"
prof
.IP "test" 4
.IX Item "test"
is just a test (faster)
.IP "translate-jobid-in-param-values" 4
.IX Item "translate-jobid-in-param-values"
String 'jobid' gets substituted with \f(CW$par\fR{jobid}
.IP "tst" 4
.IX Item "tst"
quick run through program, low accuracy
.IP "user" 4
.IX Item "user"
user name
.IP "\-\-version" 4
.IX Item "--version"
Print version
.SH "AUTHOR"
.IX Header "AUTHOR"
B. Rost, Sander C, Fariselli P, Casadio R, Liu J, Yachdav G, Kajan L.
.SH "EXAMPLES"
.IX Header "EXAMPLES"
.IP "Prediction from alignment in \s-1HSSP\s0 file for best results" 4
.IX Item "Prediction from alignment in HSSP file for best results"
.Vb 1
\& prof /share/profphd/prof/exa/1ppt.hssp fileRdb=/tmp/1ppt.hssp.prof
.Ve
.IP "Prediction from a single sequence" 4
.IX Item "Prediction from a single sequence"
.Vb 1
\& prof /share/profphd/prof/exa/1ppt.f fileRdb=/tmp/1ppt.f.rdbProf
.Ve
.IP "phd.pl invocation" 4
.IX Item "phd.pl invocation"
.Vb 1
\& /share/profphd/prof/embl/phd.pl /share/profphd/prof/exa/1ppt.hssp htm fileOutPhd=/tmp/query.phdPred fileOutRdb=/tmp/query.phdRdb fileNotHtm=/tmp/query.phdNotHtm
.Ve
.SH "ENVIRONMENT"
.IX Header "ENVIRONMENT"
.IP "\s-1PROFPHDDIR\s0" 4
.IX Item "PROFPHDDIR"
Override the prof package directory \fI/share/profphd\fR.
.IP "\s-1RGUTILSDIR\s0" 4
.IX Item "RGUTILSDIR"
Override the location of librg-utils-perl \fI/share/librg\-utils\-perl\fR.
.SH "FILES"
.IX Header "FILES"
.IP "\fI*.rdbProf\fR" 4
.IX Item "*.rdbProf"
default output file extension
.IP "\fI/share/profphd/prof\fR" 4
.IX Item "/share/profphd/prof"
default data directory
.SH "BUGS"
.IX Header "BUGS"
Please report bugs at .
.IP "Prediction from \s-1HSSP\s0 file fails when residue lines with exclamation marks `!' are present:" 4
.IX Item "Prediction from HSSP file fails when residue lines with exclamation marks `!' are present:"
Use 'optJury=normal' and 'both' like this:
.Sp
.Vb 1
\& prof /tmp/1a3q.hssp fileRdb=/tmp/1a3q.hssp.profRdb optJury=normal both
.Ve
.SH "SEE ALSO"
.IX Header "SEE ALSO"
.IP "Main website" 4
.IX Item "Main website"
.IP "Documentation" 4
.IX Item "Documentation"
.IP "Community website" 4
.IX Item "Community website"
.IP "\s-1FTP\s0" 4
.IX Item "FTP"
.IP "Newsgroups" 4
.IX Item "Newsgroups"