HFST-TOKENIZE(1) | User Commands | HFST-TOKENIZE(1) |
NAME
hfst-tokenize - perform matching/lookup on text streams
SYNOPSIS
hfst-tokenize [--segment | --xerox | --cg] [OPTIONS...] RULESET
DESCRIPTION
perform matching/lookup on text streams
Common options:
- -h, --help
- Print help message
- -V, --version
- Print version info
- -v, --verbose
- Print verbosely while processing
- -q, --quiet
- Only print fatal errors and requested output
- -s, --silent
- Alias of --quiet
- -n, --newline
- Newline as input separator (default is blank line)
- -a, --print-all
- Print nonmatching text
- -w, --print-weight
- Print weights
- --tokenize-multichar
- Tokenize multicharacter symbols (by default, only one UTF-8 character is tokenized at a time, regardless of what is present in the alphabet)
- -t, --time-cutoff=S
- Limit search after having used S seconds per input
- --segment
- Segmenting / tokenization mode (default)
- --xerox
- Xerox output
- --cg
- CG (Constraint Grammar) output
- --finnpos
- FinnPos output
Use standard streams for input and output (for now).
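As a rough sketch of how the options above combine with standard-stream input and output, the invocations below pipe text into hfst-tokenize and read the result from stdout. The ruleset file name tokeniser.pmhfst is a hypothetical placeholder, not a file shipped with HFST; substitute your own compiled ruleset. The guard simply skips the commands when the tool or the ruleset is not present.

```shell
# Hypothetical ruleset path (an assumption for illustration only).
RULESET="tokeniser.pmhfst"

if command -v hfst-tokenize >/dev/null 2>&1 && [ -f "$RULESET" ]; then
    # Default segmenting mode: text on stdin, output on stdout.
    # Inputs are separated by a blank line unless --newline is given.
    printf 'A short sentence.\n\n' | hfst-tokenize "$RULESET"

    # Treat every line as a separate input; request CG output with weights.
    printf 'First line.\nSecond line.\n' \
        | hfst-tokenize --newline --cg --print-weight "$RULESET"

    # Limit the search to 2 seconds per input.
    printf 'Another sentence.\n' \
        | hfst-tokenize --newline --time-cutoff=2 "$RULESET"
else
    echo "hfst-tokenize or $RULESET not found; nothing to run" >&2
fi
```

Because both input and output go through the standard streams, the command fits into ordinary pipelines and its output can be redirected to a file with the usual shell operators.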
REPORTING BUGS
Report bugs to <hfst-bugs@helsinki.fi> or directly to our bug tracker at: <https://sourceforge.net/tracker/?atid=1061990&group_id=224521&func=browse>
hfst-tokenize home page:
<https://kitwiki.csc.fi/twiki/bin/view/KitWiki//HfstTokenize>
General help using HFST software:
<https://kitwiki.csc.fi/twiki/bin/view/KitWiki//HfstHome>
COPYRIGHT
Copyright © 2010 University of Helsinki, License GPLv3: GNU GPL version 3 <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
January 2016 | HFST |