NAME
pymvpa2-mkds - create a PyMVPA dataset from various sources
SYNOPSIS
pymvpa2 mkds [--version] [-h] [-i [dataset [dataset ...]]]
[--txt-data VALUE [VALUE ...] | --npy-data VALUE [VALUE ...] | --mri-data IMAGE [IMAGE ...]]
[--add-sa VALUE [VALUE ...]] [--add-fa VALUE [VALUE ...]]
[--add-sa-txt VALUE [VALUE ...]] [--add-fa-txt VALUE [VALUE ...]]
[--add-sa-attr FILENAME]
[--add-sa-npy VALUE [VALUE ...]] [--add-fa-npy VALUE [VALUE ...]]
[--mask IMAGE] [--add-vol-attr ARG ARG] [--add-fsl-mcpar FILENAME]
-o OUTPUT [--hdf5-compression TYPE]
DESCRIPTION
Create a PyMVPA dataset from various sources.
This command converts data from various sources, such as text files, NumPy's NPY
files, and MR (magnetic resonance) images into a PyMVPA dataset that gets
stored in HDF5 format. An arbitrary number of sample and feature attributes
can be added to a dataset, and individual attributes can be read from
heterogeneous sources (e.g. they do not have to be all from text files).
For datasets from MR images this command also supports automatic conversion of
additional images into (volumetric) feature attributes. This can be useful for
describing features with, for example, atlas labels.
COMPOSE ATTRIBUTES ON THE COMMAND LINE
Options --add-sa and --add-fa can be used to compose dataset
attributes directly on the command line. The syntax is:
... --add-sa <attribute name> <comma-separated values> [DTYPE]
where the optional 'DTYPE' is any identifier of a NumPy data type (e.g. 'int'
or 'float32'). If no data type is specified, the attribute values will be
strings.
If only one attribute value is given, it will be copied and assigned to all entries
in the dataset.
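The value handling described above can be sketched in Python with NumPy. This only illustrates the documented semantics and is not PyMVPA's actual implementation; the helper name `compose_attr`, its arguments, and the length check are assumptions made for the example.

```python
import numpy as np


def compose_attr(values, n_entries, dtype=None):
    """Turn a comma-separated string into an attribute array (hypothetical helper)."""
    arr = np.array(values.split(","))
    # Without a DTYPE the values stay strings; otherwise NumPy converts them.
    if dtype is not None:
        arr = arr.astype(dtype)
    # A single value is copied and assigned to every entry of the dataset.
    if len(arr) == 1:
        arr = np.repeat(arr, n_entries)
    # Assumption: any other number of values has to match the dataset length.
    if len(arr) != n_entries:
        raise ValueError("expected %d values, got %d" % (n_entries, len(arr)))
    return arr


targets = compose_attr("rest,face,house,rest", 4)
chunks = compose_attr("1", 4, dtype="int")
print(targets.dtype.kind, chunks.tolist())
```

Here `targets` corresponds to `--add-sa targets rest,face,house,rest` and `chunks` to `--add-sa chunks 1 int` on a dataset with four samples.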
LOAD DATA FROM TEXT FILES
All options for loading data from text files support optional parameters to
tweak the conversion:
... --add-sa-txt <mandatory values> [DELIMITER [DTYPE [SKIPROWS [COMMENTS]]]]
where 'DELIMITER' is the string that is used to separate values in the input
file, 'DTYPE' is any identifier of a NumPy data type (e.g. 'int', or
'float32'), 'SKIPROWS' is an integer indicating how many lines at the
beginning of the respective file shall be ignored, and 'COMMENTS' is a string
indicating how to-be-ignored comment lines are prefixed in the file.
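The four optional parameters share their names with arguments of NumPy's `loadtxt` function, so their effect can be previewed there. The sketch below assumes the values are interpreted the same way, which this page does not state explicitly; the file content is made up for illustration.

```python
import io

import numpy as np

# Made-up file content: one header line, one comment line, comma-separated values.
text = io.StringIO(
    "run,trial\n"
    "% recorded during session 1\n"
    "1,10\n"
    "1,11\n"
    "2,12\n"
)

# Equivalent of DELIMITER=',', DTYPE='int', SKIPROWS=1, COMMENTS='%'
data = np.loadtxt(text, delimiter=",", dtype="int", skiprows=1, comments="%")
print(data.shape)
print(data)
```

The header line is dropped by SKIPROWS, the line starting with '%' is ignored as a comment, and the remaining rows are converted to integers.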
LOAD DATA FROM NUMPY NPY FILES
All options for loading data from NumPy NPY files support an optional parameter:
... --add-fa-npy <mandatory values> [MEMMAP]
where 'MEMMAP' is a flag that controls whether the respective file is read
by memory-mapping, i.e. not read (immediately) into memory. Enable it with one
of: yes|1|true|enable|on.
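What memory-mapping means for an NPY file can be tried directly with NumPy. This is a minimal sketch, assuming the flag maps onto the `mmap_mode` argument of `numpy.load`; the helper `load_npy` is hypothetical and not part of PyMVPA.

```python
import os
import tempfile

import numpy as np

# Flag values listed above that switch memory-mapping on.
TRUE_VALUES = {"yes", "1", "true", "enable", "on"}


def load_npy(filename, memmap="no"):
    """Load an NPY file, memory-mapped if the flag is enabled (hypothetical helper)."""
    mmap_mode = "r" if memmap in TRUE_VALUES else None
    return np.load(filename, mmap_mode=mmap_mode)


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "attr.npy")
    np.save(path, np.arange(6))
    eager = load_npy(path)         # whole array is read into memory
    lazy = load_npy(path, "yes")   # data stays on disk until accessed
    lazy_is_memmap = isinstance(lazy, np.memmap)
    values = lazy.tolist()
    print(type(eager).__name__, type(lazy).__name__)
    del lazy  # release the memory map before the directory is removed
```

Memory-mapping is mainly useful for large arrays of which only a part is needed at a time.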
OPTIONS
- --version
- show program's version and license information and exit
- -h, --help, --help-np
- show this help message and exit. --help-np forcefully disables the
use of a pager for displaying the help.
- -i [dataset [dataset ...]], --input [dataset [dataset ...]]
- path(s) to one or more PyMVPA dataset files. All datasets will be merged
into a single dataset (vstack'ed) in order of specification. In some cases
this option may need to be specified more than once if multiple, but
separate, input datasets are required.
- --txt-data VALUE [VALUE ...]
- load samples from a text file. The first value is the filename the data
will be loaded from. Additional values modifying the way the data is
loaded are described in the section "Load data from text files".
- --npy-data VALUE [VALUE ...]
- load samples from a NumPy .npy file. Compressed files (i.e. .npy.gz) are
supported as well. The first value is the filename the data will be loaded
from. Additional values modifying the way the data is loaded are described
in the section "Load data from NumPy NPY files".
- --mri-data IMAGE [IMAGE ...]
- load data from an MR image, such as a NIfTI file. This can either be a
single 4D image, or a list of 3D images, or a combination of both.
Options for attributes from the command line:
- --add-sa VALUE [VALUE ...]
- compose a sample attribute from the command line input. The first value is
the desired attribute name, the second value is a comma-separated list
(appropriately quoted) of actual attribute values. An optional third value
can be given to specify a data type. Additional information on defining
dataset attributes on the command line is given in the section
"Compose attributes on the command line".
- --add-fa VALUE [VALUE ...]
- compose a feature attribute from the command line input. The first value
is the desired attribute name, the second value is a comma-separated list
(appropriately quoted) of actual attribute values. An optional third value
can be given to specify a data type. Additional information on defining
dataset attributes on the command line is given in the section
"Compose attributes on the command line".
Options for attributes from text files:
- --add-sa-txt VALUE [VALUE ...]
- load sample attribute from a text file. The first value is the desired
attribute name, the second value is the filename the attribute will be
loaded from. Additional values modifying the way the data is loaded are
described in the section "Load data from text files".
- --add-fa-txt VALUE [VALUE ...]
- load feature attribute from a text file. The first value is the desired
attribute name, the second value is the filename the attribute will be
loaded from. Additional values modifying the way the data is loaded are
described in the section "Load data from text files".
- --add-sa-attr FILENAME
- load sample attribute values from a legacy 'attributes file'. Column data
is read as "literal". Only two-column files ('targets' + 'chunks')
without headers are supported. This option allows for reading
attributes files from early PyMVPA versions.
Options for attributes from stored NumPy arrays:
- --add-sa-npy VALUE [VALUE ...]
- load sample attribute from a NumPy .npy file. Compressed files (i.e.
.npy.gz) are supported as well. The first value is the desired attribute
name, the second value is the filename the data will be loaded from.
Additional values modifying the way the data is loaded are described in
the section "Load data from NumPy NPY files".
- --add-fa-npy VALUE [VALUE ...]
- load feature attribute from a NumPy .npy file. Compressed files (i.e.
.npy.gz) are supported as well. The first value is the desired attribute
name, the second value is the filename the data will be loaded from.
Additional values modifying the way the data is loaded are described in
the section "Load data from NumPy NPY files".
- --mask IMAGE
- mask image file with the same dimensions as an input data sample. All
voxels corresponding to non-zero mask elements will be permitted into the
dataset.
- --add-vol-attr ARG ARG
- attribute name (1st argument) and image file with the same dimensions as
an input data sample (2nd argument). The image data will be added as a
feature attribute under the specified name.
- --add-fsl-mcpar FILENAME
- 6-column motion parameter file in FSL's McFlirt format. Six additional
sample attributes will be created: mc_{x,y,z} and mc_rot{1-3}, for
translation and rotation estimates respectively.
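The effect of --mask and --add-vol-attr on volumetric data can be illustrated with plain NumPy arrays. This is a sketch under stated assumptions, not PyMVPA's implementation: image loading is left out, the arrays are made up, and the axis layout (x, y, z, time) as well as the resulting feature order are assumptions.

```python
import numpy as np

# Made-up example volumes: a 4D time series (x, y, z, time), a mask and an atlas.
bold = np.arange(2 * 2 * 2 * 3, dtype="float32").reshape(2, 2, 2, 3)
mask = np.zeros((2, 2, 2), dtype="int")
mask[0, 0, 1] = 1
mask[1, 1, 0] = 5  # any non-zero value admits the voxel into the dataset
atlas = np.arange(8).reshape(2, 2, 2)

selected = mask != 0
# Samples are volumes (time points), features are the selected voxels.
samples = bold[selected].T
# The atlas volume is reduced to one label per selected voxel,
# i.e. a feature attribute aligned with the columns of `samples`.
area = atlas[selected]
print(samples.shape, area.tolist())
```

Because the mask and the attribute image have the same dimensions as one input volume, the same boolean selection can be applied to both, which keeps each attribute value aligned with its feature.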
Output options:
- -o OUTPUT, --output OUTPUT
- output filename ('.hdf5' extension is added automatically if necessary).
NOTE: The output format is suitable for data exchange between PyMVPA
commands, but is not recommended for long-term storage or exchange as its
specific content may vary depending on the actual software environment.
For long-term storage consider conversion into other data formats (see
'dump' command).
- --hdf5-compression TYPE
- compression type for HDF5 storage. Available values depend on the specific
HDF5 installation. Typical values are: 'gzip', 'lzf', 'szip', or integers
from 1 to 9 indicating gzip compression levels.
EXAMPLES
Load a 4D MRI image, assign atlas labels to a feature attribute, and attach class
labels from a text file. The resulting dataset is stored as 'ds.hdf5' in the
current directory.
- $ pymvpa2 mkds -o ds --mri-data bold.nii.gz --add-vol-attr area harvox.nii.gz --add-sa-txt targets labels.txt
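For scripted pipelines the same call can be assembled as an argument list in Python. This is a minimal sketch, assuming `pymvpa2` is on the PATH and using the option names listed under OPTIONS; the file names are those of the example and must exist for a real run.

```python
import os
import shlex
import shutil
import subprocess

# File names are taken from the example above.
cmd = [
    "pymvpa2", "mkds",
    "-o", "ds",
    "--mri-data", "bold.nii.gz",
    "--add-vol-attr", "area", "harvox.nii.gz",
    "--add-sa-txt", "targets", "labels.txt",
]
print(shlex.join(cmd))

# Only attempt the call when the tool and all input files are available.
inputs = ["bold.nii.gz", "harvox.nii.gz", "labels.txt"]
if shutil.which("pymvpa2") and all(os.path.exists(f) for f in inputs):
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems, for example when a comma-separated attribute value contains spaces.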
AUTHOR
Written by Michael Hanke & Yaroslav Halchenko, and numerous other
contributors.
COPYRIGHT
Copyright © 2006-2014 PyMVPA developers
Permission is hereby granted, free of charge, to any person obtaining a copy of
this software and associated documentation files (the "Software"),
to deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO
EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES
OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS IN THE SOFTWARE.