NAME
g2p-sk - phonetic transcription for Slovak
SYNOPSIS
g2p-sk [--color] [--dl <debug level>] [--help] [--stats]
[--ofile <file_name>] [<input file>]
DESCRIPTION
Phonetic transcription is essential for some linguistic or speech
recognition applications. Depending on the language, either a rule-based or a
statistical approach is used. g2p-sk implements the rule-based approach,
but in the future it may be replaced by a statistical one.
Each input word, consisting of a sequence of graphemes, is transcribed into a
sequence of phones in SAMPA coding. If no input file is specified, input is
read from standard input. If an input file is used, the output is also written
to a file. The output filename is the input filename with the extension
"_trans.txt".
The input and output code page is ISO 8859-2. To use a different code page,
use a code page converter and pipes. For example, to have input and output in
UTF-8, use (for interactive use):
    filterm UTF8-iso2 iso2-UTF8 g2p-sk
or (for batch processing):
    iconv -f UTF-8 -t ISO_8859-2 | g2p-sk | iconv -f ISO_8859-2 -t UTF-8
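The batch pipeline above can be wrapped in a small shell function for repeated use. This is a minimal sketch, not something shipped with g2p-sk: the function name g2p_utf8 and the G2P_CMD variable are hypothetical, and it assumes that iconv is installed and that g2p-sk reads standard input when no input file is given.

```shell
#!/bin/sh
# Hypothetical helper, not part of the g2p-sk package.
# G2P_CMD names the transcriber to run. It defaults to g2p-sk and can be
# overridden (for example with "cat") to try the conversion steps alone.
G2P_CMD="${G2P_CMD:-g2p-sk}"

# Read UTF-8 text on standard input and write UTF-8 on standard output,
# converting to and from ISO 8859-2 around the transcriber.
g2p_utf8() {
    iconv -f UTF-8 -t ISO_8859-2 | $G2P_CMD | iconv -f ISO_8859-2 -t UTF-8
}
```

After sourcing the snippet, a call such as g2p_utf8 < words_utf8.txt > words_sampa.txt would transcribe a UTF-8 word list; both filenames are placeholders. Characters that have no ISO 8859-2 equivalent will make the first iconv step fail.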
The performance of the phonetic transcription depends on the morphematic
segmentation. To improve the quality of the morphematic segmentation, it is
possible to replace the small version of the simple morphematic dictionary in
/usr/share/g2p_sk/Exceptions/morfemy.ddat with a better one. The
syllabic segmentation is as important as the morphematic one. The syllabic
segmentation is provided by the sylseg-sk package.
The design of g2p-sk is language dependent. To use it for another language,
all the rules need to be rewritten.
OPTIONS
- --color
- Enable color output.
- --dl 1..5
- Set the debug level, which controls the amount of displayed information.
Debug level 0 displays nothing. The maximum level 5 displays a full debugging
report. The default debug level is 1.
- --help
- Display a short help text.
- --ofile <file_name>
- Also write the output to the given file.
- --stats
- Count and display statistics for each phone.
EXAMPLES
- Use standard input and debug level 3:
- g2p-sk --dl 3
- Process all the words from the file aaa.txt:
- g2p-sk aaa.txt
EXIT STATUS
g2p-sk returns zero if it succeeds in processing all the input words.
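In a script, the exit status can be tested directly. The sketch below is an illustration only: the function name transcribe_or_report and the G2P_CMD variable are hypothetical. Because this page only states that zero means success, any non-zero status is treated as a failure without interpreting its value.

```shell
#!/bin/sh
# Hypothetical helper, not part of the g2p-sk package.
# G2P_CMD names the transcriber to run and defaults to g2p-sk.
G2P_CMD="${G2P_CMD:-g2p-sk}"

# Run the transcriber on one input file. On a non-zero exit status,
# print a message to standard error. Returns the transcriber's status.
transcribe_or_report() {
    $G2P_CMD "$1"
    status=$?
    if [ "$status" -ne 0 ]; then
        echo "transcription of $1 failed (exit status $status)" >&2
    fi
    return "$status"
}
```

For example, transcribe_or_report aaa.txt || exit 1 would stop a larger script as soon as one input file cannot be processed completely.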
AUTHOR
Jozef Ivanecky (dodo (at) kanoistika.sk)
SEE ALSO
sylseg-sk(1),
filterm(1),
iconv(1),
konwert(1)