
datapm(1) Data package manager datapm(1)

NAME

datapm - data packaging system and utilities

SYNOPSIS

datapm COMMAND [OPTIONS]

DESCRIPTION

datapm (data package manager) is a command-line tool and Python library for working with Data Packages and interacting with data hubs such as those powered by CKAN.

COMMANDS

about
About datapm
clone src-spec path [format-pattern] [url-pattern]
Download a package (i.e. metadata and resources) specified by src-spec to path.
Resources to retrieve are selected interactively if no format-pattern is given. If provided, the optional glob-style format-pattern and url-pattern arguments are matched against the format and url of each resource to determine whether it should be retrieved.
download src-spec path [format-pattern] [url-pattern]
Download a package (i.e. metadata and resources) specified by src-spec to path.
Resources to retrieve are selected interactively if no format-pattern is given. If provided, the optional glob-style format-pattern and url-pattern arguments are matched against the format and url of each resource to determine whether it should be retrieved.
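The pattern matching described for clone and download can be illustrated with a minimal Python sketch. This is not datapm's actual implementation: the function name select_resources, the dictionary layout of a resource, and the case-insensitive comparison of formats are all assumptions made for illustration. It only shows how glob-style patterns could be applied to the format and url of each resource.

```python
from fnmatch import fnmatchcase


def select_resources(resources, format_pattern="*", url_pattern="*"):
    """Return the resources whose format and url both match the glob patterns.

    Hypothetical helper: each resource is assumed to be a dict with
    'format' and 'url' keys. Formats are compared case-insensitively,
    which is an assumption, not documented datapm behaviour.
    """
    selected = []
    for res in resources:
        fmt = res.get("format", "").lower()
        url = res.get("url", "")
        if fnmatchcase(fmt, format_pattern.lower()) and fnmatchcase(url, url_pattern):
            selected.append(res)
    return selected


# Made-up resource list for illustration only.
resources = [
    {"format": "CSV", "url": "http://example.org/data/spending.csv"},
    {"format": "XLS", "url": "http://example.org/data/spending.xls"},
    {"format": "csv", "url": "http://mirror.example.net/spending.csv"},
]

csv_only = select_resources(resources, "csv")
csv_from_example_org = select_resources(resources, "csv", "*example.org*")
```

A resource is kept only when both patterns match, so omitting url-pattern (here defaulting to "*") filters on format alone.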
dump pkg-spec path-of-resource-within-pkg
Dump contents of specified resource in specified package to stdout.
help
Show available commands
info package-spec [manifest]
Get information about a package (print package metadata). If manifest is specified, show manifest information rather than package metadata.
WARNING: if you change the metadata for a Python distribution, you may need to rebuild the egg-info for the changes to show up here.
init [path-or-name]
Initialize a data package at path. The package name is taken from the last portion of path. If path is simply a name, the package is created in the current directory.
license
Show the license
list [index-spec]
List registered packages. If index-spec is not provided, the default index is used.
man
Show the manual
push [source-file] [webstore-url]
Push the local package in the current directory to the remote repository specified in .dpm/config. Alternatively, push a single file to the webstore.
register src-spec dest-spec
Register the package at src-spec in the index at dest-spec.
search index-spec query
Search registered packages in index-spec.
setup action
config [location]: Create a configuration file at location. If no location is specified, use the default (see --config).
index [location]: Set up an index at the location specified in config.
repo: Set up a repository. The repository is created at the location given by the --repository option or, failing that, at the default location specified in config.
update src-spec dest-spec
As for register.
upload path upload-spec
Upload a file or package at path to upload-spec. An upload-spec has the form:

upload-dest-id://BUCKET/LABEL
For example:

## default ckan upload
ckan://BUCKET/LABEL

## an s3 upload destination
my-s3://BUCKET/LABEL

## local pairtree
my-pairtree://BUCKET/LABEL

## google storage
my-google-storage://BUCKET/LABEL
Upload destinations are specified in your datapm config file and are of the form:

[upload:dest-id]
ofs.backend = s3|google|archive.org|...
## see OFS documentation for a given backend
config-option = config-value
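As an illustration of the upload-spec form upload-dest-id://BUCKET/LABEL, the following Python sketch splits a spec into its three parts and derives the name of the matching [upload:dest-id] config section. The function names parse_upload_spec and config_section_for are hypothetical and this is not how datapm itself necessarily parses the spec; it relies only on the standard library's urlsplit.

```python
from urllib.parse import urlsplit


def parse_upload_spec(spec):
    """Split 'upload-dest-id://BUCKET/LABEL' into (dest_id, bucket, label).

    Hypothetical helper for illustration; raises ValueError when the
    destination id or the bucket is missing.
    """
    parts = urlsplit(spec)
    if not parts.scheme or not parts.netloc:
        raise ValueError("expected upload-dest-id://BUCKET/LABEL, got %r" % spec)
    label = parts.path.lstrip("/")
    return parts.scheme, parts.netloc, label


def config_section_for(dest_id):
    """Name of the config section that describes an upload destination."""
    return "upload:" + dest_id


dest_id, bucket, label = parse_upload_spec("my-s3://BUCKET/LABEL")
section = config_section_for(dest_id)
```

Note that urlsplit lowercases the scheme, so this sketch assumes destination ids are written in lowercase, as in the examples above.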

OPTIONS

--version
show program's version number and exit
-h, --help
show this help message and exit
-v, --verbose
Give more output
-d, --debug
Print debug output
-q, --quiet
Give less output
--log=FILENAME
Log file where a complete (maximum verbosity) record will be kept
-c CONFIG, --config=CONFIG
Path to config file (if any) - defaults to $HOME/.dpmrc
-r REPOSITORY, --repository=REPOSITORY
Path to repository - overrides value in config
-k API_KEY, --api-key=API_KEY
CKAN API Key (overrides value in config)

CONFIGURATION FILE


[dpm]
repo.default_path = $HOME/.dpm/repository
index.default = file

[index:ckan]
ckan.url = http://thedatahub.org/api/
ckan.api_key =

[index:db]
db.dburi = sqlite://$HOME/.datapm/repository/index.db

[upload:ckan]
ofs.backend = reststore
host = http://storage.ckan.net
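
The sample configuration above is in INI format, so it can be inspected with Python's standard configparser module. The sketch below is only an illustration of the file's structure, not datapm's own loading code; in particular, expanding $HOME with os.path.expandvars is an assumption about how such values might be resolved.

```python
import configparser
import os

# The sample configuration from this manual page, inlined for illustration.
SAMPLE_CONFIG = """\
[dpm]
repo.default_path = $HOME/.dpm/repository
index.default = file

[index:ckan]
ckan.url = http://thedatahub.org/api/
ckan.api_key =

[index:db]
db.dburi = sqlite://$HOME/.datapm/repository/index.db

[upload:ckan]
ofs.backend = reststore
host = http://storage.ckan.net
"""


def load_config(text):
    """Parse INI text; interpolation is disabled so values are read verbatim."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return parser


config = load_config(SAMPLE_CONFIG)

# Assumption: $HOME is expanded from the environment when the value is used.
repo_path = os.path.expandvars(config["dpm"]["repo.default_path"])

# Collect the destination ids of all [upload:dest-id] sections.
upload_dests = [s.split(":", 1)[1] for s in config.sections() if s.startswith("upload:")]
```

To read a real file instead of the inlined text, parser.read(os.path.expanduser("~/.dpmrc")) could be used in place of read_string.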

FILES

~/.dpmrc
Per user datapm configuration file.

EXAMPLES

Grabbing some data from an index
 

datapm index-add file:///....
datapm update
datapm search "military spending"
some-id Military Spending 1890-1914
some-id-2 Military Spending 1890-1914 (normalized)
datapm install some-id
datapm plot some-id
 
Get two different datasets and use them together
 

datapm install pkg-a
datapm install pkg-b
datapm create merged
# manual merge
# e.g. PPP, GDP
datapm register my-merged-package
 

SEE ALSO

For more information visit the documentation at: http://readthedocs.org/docs/dpm
February 6, 2012