
PHONETISAURUS(1) User Commands PHONETISAURUS(1)

NAME

phonetisaurus-calculateER - estimates grapheme-to-phoneme error rate

SYNOPSIS

phonetisaurus-calculateER --hyp "hypseq or file" --ref "refseq or file" --usep "" [OPTIONS]

DESCRIPTION

phonetisaurus-calculateER evaluates the performance of grapheme-to-phoneme tools by comparing hypothesis sequences against reference transcriptions.
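
As an illustration of what an error-rate estimate of this kind usually involves, the following is a minimal sketch that aligns each hypothesis with its reference using Levenshtein edit distance and reports a unit-level and a sequence-level error rate. It is not taken from the tool's source; the function names edit_distance and error_rates are hypothetical, and the exact metrics the tool prints may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (lists or strings)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(pairs):
    """Return (unit error rate, sequence error rate) for (ref, hyp) pairs.

    Hypothetical helper: the unit rate is total edits divided by total
    reference length; the sequence rate is the fraction of pairs that
    contain at least one error.
    """
    total_edits = total_ref = wrong = 0
    for ref, hyp in pairs:
        d = edit_distance(ref, hyp)
        total_edits += d
        total_ref += len(ref)
        wrong += 1 if d > 0 else 0
    unit_rate = total_edits / total_ref if total_ref else 0.0
    seq_rate = wrong / len(pairs) if pairs else 0.0
    return unit_rate, seq_rate
```

For example, with one wrong phoneme in one of two three-phoneme words, the sketch gives a unit error rate of 1/6 and a sequence error rate of 1/2.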

OPTIONS

-h, --help

show this help message and exit

--hyp HYP, -w HYP

The file/string containing G2P/ASR hypotheses.

--ref REF, -r REF

The file/string containing G2P/ASR reference transcriptions.

--usep USEP, -u USEP

Character or regex separating units in a sequence. Defaults to ' '.

--fsep FSEP, -s FSEP

Character or regex separating fields in a sequence. Defaults to '\t'.
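
The interaction of the two separators can be pictured with a short sketch: a line is first split into fields, then each field into units. This assumes both separators are applied with Python's re.split, which the help text suggests but does not confirm. The name split_line is hypothetical, and treating an empty unit separator as "one unit per character" is an assumption.

```python
import re


def split_line(line, fsep=r"\t", usep=" "):
    """Split one input line into fields, then each field into units.

    Hypothetical helper; the defaults mirror the documented defaults of
    --fsep and --usep.
    """
    fields = re.split(fsep, line.rstrip("\n"))
    result = []
    for field in fields:
        if usep == "":
            # Assumption: an empty separator means every character is a unit.
            units = list(field)
        else:
            # Drop empty strings produced by repeated separators.
            units = [u for u in re.split(usep, field) if u]
        result.append(units)
    return result
```

Called on "cat\tK AE T", this yields two fields: one holding the word and one holding the three phoneme units.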

--format FORMAT, -f FORMAT

Input format. One of 'cmu', 'htk', 'g2p'. Defaults to 'g2p'.

--ignore IGNORE, -i IGNORE

Ignore the specified characters when encountered in a hypothesis. A space-separated list.

--regex_ignore REGEX_IGNORE, -n REGEX_IGNORE

Ignore characters matching the given regular expression when encountered in a hypothesis.

--ignore_both, -b

Apply --ignore and --regex_ignore to both the hypothesis and the reference files. Useful for analysis.
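
The effect of the three ignore options can be sketched as a filter applied to a unit sequence before scoring. This is only an illustration under stated assumptions: the function name filter_units is hypothetical, and matching the regular expression against whole units (rather than substituting inside the raw string) is a guess about the tool's behaviour.

```python
import re


def filter_units(units, ignore="", regex_ignore=None):
    """Drop units named in a space-separated list or matching a regex.

    Hypothetical helper. With --ignore_both, the same filter would be
    applied to the reference sequence as well as the hypothesis.
    """
    ignored = set(ignore.split(" ")) if ignore else set()
    pattern = re.compile(regex_ignore) if regex_ignore else None
    kept = []
    for unit in units:
        if unit in ignored:
            continue
        # Assumption: the regex must match the whole unit.
        if pattern is not None and pattern.fullmatch(unit):
            continue
        kept.append(unit)
    return kept
```

Filtering only the hypothesis removes symbols the model emits but the reference never contains; filtering both sides lets the two be compared on a reduced symbol set.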

--testfile TESTFILE, -t TESTFILE

The test file in dictionary format: one word and one pronunciation per line, separated by '\t'.
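
The dictionary format described here can be read with a few lines of code. The sketch below splits each line into a word and a reference pronunciation; the name read_test_dictionary is hypothetical, and splitting the pronunciation on single spaces assumes the default unit separator.

```python
def read_test_dictionary(lines):
    """Parse 'word<TAB>pronunciation' lines into parallel lists.

    Hypothetical helper: returns (words, references), where each
    reference is a list of units split on spaces.
    """
    words, references = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        word, pronunciation = line.split("\t", 1)
        words.append(word)
        references.append(pronunciation.split(" "))
    return words, references
```

The word list is what a model would be asked to pronounce, and the references are what its hypotheses would be scored against.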

--prefix PREFIX, -p PREFIX

Prefix used to generate the wordlist, hypothesis and reference files. Defaults to 'test'.

--modelfile MODELFILE, -m MODELFILE

Path to the phoneticizer model.

--mbrdecode, -e

Use the LMBR decoder.

--alpha ALPHA, -a ALPHA

Alpha for the LMBR decoder.

--order ORDER, -o ORDER

N-gram order for the LMBR decoder.

--precision PRECISION, -x PRECISION

Average N-gram precision factor for the LMBR decoder. (.85)

--ratio RATIO, -y RATIO

N-gram ratio factor for the LMBR decoder. (.72)

--beam BEAM, -z BEAM

LMBR/N-best search beam. Larger values are slower but give better results. (1500)

--verbose, -v

Verbose mode.
February 2013 phonetisaurus 0.7.8