NAME
sampler - manual page for pynlpl-sampler 0.7.7
DESCRIPTION
usage: pynlpl-sampler [-h] [-t TESTSETSIZE] [-d DEVSETSITE] [-T TRAINSETSITE]
                      [-S SEED]
                      files [files ...]
Extracts random samples from datasets. Multiple parallel datasets (such as
parallel corpora) are supported, provided that corresponding data is on the
same line in each file.
positional arguments:
- files
- The data sets to sample from; all must be of equal size (i.e., the same
number of lines)
optional arguments:
- -h, --help
- show this help message and exit
- -t TESTSETSIZE, --testsetsize TESTSETSIZE
- Test set size (lines) (default: 0)
- -d DEVSETSITE, --devsetsite DEVSETSITE
- Development set size (lines) (default: 0)
- -T TRAINSETSITE, --trainsetsite TRAINSETSITE
- Training set size (lines); leave unassigned (0) to automatically use all
of the remaining data (default: 0)
- -S SEED, --seed SEED
- Seed for random number generator (default: 0)
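
EXAMPLE
The following Python sketch illustrates the idea behind the options above: one seeded shuffle of line indices is shared by all datasets, so corresponding lines end up in the same subset. It is not the tool's actual implementation. The function names split_indices and sample_parallel are hypothetical, and keeping the sampled lines in their original order is an assumption; the manual does not state how the tool orders or writes its output.

```python
import random


def split_indices(n_lines, test_size=0, dev_size=0, train_size=0, seed=0):
    """Partition line indices into test, dev and train sets (hypothetical helper).

    A train_size of 0 means "use all remaining lines", mirroring the
    behaviour described for the -T option.
    """
    if test_size + dev_size + train_size > n_lines:
        raise ValueError("requested set sizes exceed the number of lines")

    indices = list(range(n_lines))
    rng = random.Random(seed)  # seeded generator, so the split is reproducible
    rng.shuffle(indices)

    test = indices[:test_size]
    dev = indices[test_size:test_size + dev_size]
    rest = indices[test_size + dev_size:]
    train = rest if train_size == 0 else rest[:train_size]

    # Assumption: sampled lines keep their original relative order.
    return {"test": sorted(test), "dev": sorted(dev), "train": sorted(train)}


def sample_parallel(datasets, **sizes):
    """Apply one shared index split to several equally long datasets."""
    n_lines = len(datasets[0])
    if any(len(data) != n_lines for data in datasets):
        raise ValueError("all datasets must have the same number of lines")

    parts = split_indices(n_lines, **sizes)
    return [
        {name: [data[i] for i in idx] for name, idx in parts.items()}
        for data in datasets
    ]


if __name__ == "__main__":
    source = ["src line %d" % i for i in range(10)]
    target = ["tgt line %d" % i for i in range(10)]
    src_sets, tgt_sets = sample_parallel([source, target], test_size=2, dev_size=3)
    # Corresponding lines land in the same subset of both datasets.
    for s, t in zip(src_sets["test"], tgt_sets["test"]):
        print(s, "|", t)
```

Because the indices are drawn once and reused for every dataset, alignment between parallel files is preserved in each subset.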