datalad clone(1) datalad datalad clone(1)

NAME

datalad clone - obtain a dataset copy from a URL or local source (path)

SYNOPSIS

datalad clone [-h] [-d DATASET] [-D DESCRIPTION] [--reckless] [--alternative-sources SOURCE [SOURCE ...]] SOURCE [PATH]

DESCRIPTION

The purpose of this command is to obtain a new clone (copy) of a dataset and place it into a not-yet-existing or empty directory. As such, CLONE provides a strict subset of the functionality offered by INSTALL. Only a single dataset can be obtained; recursion is not supported. However, once a dataset is installed, arbitrary dataset components can be obtained via a subsequent GET command.

The primary differences from a direct `git clone` call are: 1) automatic initialization of a dataset annex (pure Git repositories are equally supported); 2) automatic registration of the newly obtained dataset as a subdataset (submodule), if a parent dataset is specified; 3) support for DataLad's resource identifiers and automatic generation of alternative access URLs for common cases (such as appending '.git' to the URL if accessing the base URL failed); and 4) the ability to take additional alternative source locations as an argument.
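The fallback behaviour in points 3) and 4) can be pictured as an ordered list of candidate URLs. The following is a minimal sketch only, not DataLad's actual implementation: the function name `candidate_urls`, the ordering of candidates, and the decision to add a '.git' variant only for the main source are all assumptions made for illustration.

```python
def candidate_urls(source, alternative_sources=()):
    """Return the URLs to try, in order, without duplicates.

    Hypothetical helper: the real candidate generation in DataLad
    may differ in both the variants produced and their order.
    """
    candidates = [source]

    # Assumed heuristic from the description above: if the base URL
    # does not already end in '.git', also try that form.
    stripped = source.rstrip("/")
    if not stripped.endswith(".git"):
        candidates.append(stripped + ".git")

    # User-supplied alternatives are tried after the main source.
    candidates.extend(alternative_sources)

    # Drop duplicates while preserving the order of first appearance.
    seen = set()
    ordered = []
    for url in candidates:
        if url not in seen:
            seen.add(url)
            ordered.append(url)
    return ordered


if __name__ == "__main__":
    for url in candidate_urls(
        "https://example.com/ds",
        alternative_sources=["https://mirror.example.com/ds"],
    ):
        print(url)
```

A caller would attempt each candidate in turn and stop at the first one that can be cloned. The example.com URLs are placeholders.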

OPTIONS

SOURCE
URL, DataLad resource identifier, local path or instance of dataset to be cloned. Constraints: value must be a string
PATH
path to clone into. If no PATH is provided, a destination path will be derived from the source URL, similar to git clone. [Default: None]
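To illustrate what "derived from the source URL, similar to git clone" might look like, here is a rough sketch. It approximates the familiar git convention (take the last path component and drop a trailing '.git'); the function name `derive_destination` is hypothetical, and the exact rules DataLad applies are not specified here and may differ.

```python
import posixpath
from urllib.parse import urlparse


def derive_destination(source):
    """Guess a destination directory name from a source URL or path.

    Hypothetical approximation of the git clone convention; it does
    not handle Windows paths or every URL form.
    """
    parsed = urlparse(source)
    # Use the URL path when a scheme is present, otherwise treat the
    # whole string as a (local or scp-style) path.
    path = parsed.path if parsed.scheme else source
    path = path.rstrip("/")

    # A source pointing at '<repo>/.git' names the repository itself.
    if path.endswith("/.git"):
        path = path[: -len("/.git")]

    name = posixpath.basename(path)
    if name.endswith(".git"):
        name = name[: -len(".git")]
    return name


if __name__ == "__main__":
    print(derive_destination("https://example.com/data/myds.git"))
```

When the derived name is not what is wanted, passing PATH explicitly avoids the guesswork.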

-h, --help, --help-np
show this help message. --help-np forcefully disables the use of a pager for displaying the help message
-d DATASET, --dataset DATASET
(parent) dataset to clone into. If given, the newly cloned dataset is registered as a subdataset of the parent. Also, if given, relative paths are interpreted as being relative to the parent dataset, and not relative to the working directory. Constraints: value must be a Dataset or a valid identifier of a Dataset (e.g. a path) [Default: None]
-D DESCRIPTION, --description DESCRIPTION
short description to use for a dataset location. Its primary purpose is to help humans to identify a dataset copy (e.g., "mike's dataset on lab server"). Note that when a dataset is published, this information becomes available on the remote side. Constraints: value must be a string [Default: None]
--reckless
Set up the dataset to be able to obtain content in the cheapest/fastest possible way, even if this poses a potential risk to data integrity (e.g. hardlinking files from a local clone of the dataset). Use with care, and limit to "read-only" use cases. With this flag, the installed dataset will be marked as untrusted. [Default: False]
--alternative-sources SOURCE [SOURCE ...]
Alternative sources to be tried if a dataset cannot be obtained from the main SOURCE. Constraints: value must be a string [Default: None]
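The options above can be combined into a single invocation. The sketch below assembles an argument list from Python using only the flags listed in the SYNOPSIS. The helper name `build_clone_argv` is hypothetical. Placing the variadic --alternative-sources option after the positional arguments is an assumption, chosen so that SOURCE and PATH cannot be mistaken for additional alternative sources. The sketch only builds and prints the command; it does not run it.

```python
import shlex


def build_clone_argv(source, path=None, dataset=None, description=None,
                     reckless=False, alternative_sources=()):
    """Assemble a `datalad clone` argument list (hypothetical helper)."""
    argv = ["datalad", "clone"]
    if dataset is not None:
        argv += ["-d", dataset]
    if description is not None:
        argv += ["-D", description]
    if reckless:
        argv.append("--reckless")

    # Positional arguments: SOURCE is required, PATH is optional.
    argv.append(source)
    if path is not None:
        argv.append(path)

    # Assumed ordering: the variadic option goes last so that it does
    # not swallow SOURCE and PATH.
    if alternative_sources:
        argv.append("--alternative-sources")
        argv.extend(alternative_sources)
    return argv


if __name__ == "__main__":
    argv = build_clone_argv(
        "https://example.com/ds",
        path="inputs/ds",
        dataset=".",
        description="copy on laptop",
        alternative_sources=["https://mirror.example.com/ds"],
    )
    # Print a shell-quoted form for inspection; pass argv to
    # subprocess.run(argv, check=True) to actually execute it.
    print(shlex.join(argv))
```

Building a list rather than a single string keeps values such as a multi-word description intact without manual quoting.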

AUTHORS

datalad is developed by The DataLad Team and Contributors <team@datalad.org>.
2019-02-08