READSTAT(1) General Commands Manual READSTAT(1)

NAME

readstat - read and write data set files from SAS, SPSS, and Stata

SYNOPSIS

readstat input-file

readstat [-f] input-file output-file

readstat [-f] input-file metadata-file output-file

DESCRIPTION

readstat converts data set files from popular statistics packages, whether they are stored in plain-text or binary formats.

In the first invocation style, readstat displays metadata from input-file, including the row count, column count, text encoding, and timestamp. input-file should be a file with one of the following extensions:

sas7bdat  SAS binary file, created with SAS version 7 or newer
xpt       SAS portable file, version 5 or version 8, created with the SAS XPORT command
sav       SPSS uncompressed binary file
zsav      SPSS compressed binary file
por       SPSS portable file
dta       Stata binary file, version 104 or newer

If the row count cannot be determined from the file header, which is sometimes the case with SPSS binary files and always the case with SPSS portable files, readstat will report a value of -1.
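A script that consumes the reported row count may want to treat -1 as "unknown" rather than as a number. The following is a minimal Python sketch of that idea; the function name known_row_count is hypothetical, and it assumes the caller has already extracted the reported value as an integer by some means not shown here.

```python
from typing import Optional


def known_row_count(reported: int) -> Optional[int]:
    """Return the row count, or None when the -1 sentinel was reported."""
    if reported == -1:
        # The header did not record a row count (e.g. SPSS portable files).
        return None
    if reported < -1:
        # Only -1 is documented as a sentinel; anything lower is unexpected.
        raise ValueError(f"unexpected row count: {reported}")
    return reported
```

Returning None forces callers to handle the unknown case explicitly instead of doing arithmetic on -1.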

In the second invocation style, readstat converts input-file to output-file, e.g. a SAS portable file to a Stata binary file. In addition to the preceding extension list, output-file may have extension csv or xlsx, which creates a CSV or Excel file, respectively.
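As an illustration of the second invocation style, the Python sketch below assembles the argument list for a conversion without running it. The helper name convert_command and the file names are hypothetical; the only thing taken from this manual is the argument order shown in the SYNOPSIS.

```python
import shlex
from typing import List


def convert_command(input_file: str, output_file: str) -> List[str]:
    """Build argv for: readstat input-file output-file."""
    return ["readstat", input_file, output_file]


# Hypothetical file names: a SAS portable file converted to a Stata binary file.
argv = convert_command("survey.xpt", "survey.dta")
command_line = shlex.join(argv)  # "readstat survey.xpt survey.dta"
```

The list form can be passed directly to subprocess.run(argv, check=True), which avoids shell quoting problems with file names that contain spaces.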

The third invocation style is used when additional metadata about the input file, such as value labels or column widths, is stored in a separate file. Several types of metadata file are supported:

sas7bcat  SAS binary "catalog" file, created with SAS version 7 or newer, containing value labels
json      JavaScript Object Notation (JSON) file, containing column metadata that cannot be gleaned from the input CSV. For details, see the manual page for the extract_metadata command.
dct       Stata dictionary file, containing the data layout and column metadata for a plain-text input file.
sps       SPSS command file, describing the data layout and column metadata for a plain-text input file.
sas       SAS command file, describing the data layout and column metadata for a plain-text input file.

The last three formats can be used for both fixed-width and delimiter-separated (e.g. tab-separated) input files. Dictionary and command files of this kind are commonly distributed along with plain-text ASCII data sets.
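The third invocation style can be sketched in Python as follows. The helper name convert_with_metadata and the file names are hypothetical, and the extensions in METADATA_TYPES (sas7bcat, json, dct, sps, sas) are the ones conventionally used for these metadata files, listed here as an assumption rather than as an exhaustive statement of what readstat accepts.

```python
from pathlib import Path
from typing import List

# Assumed extension for each metadata file type described above.
METADATA_TYPES = {
    "sas7bcat": "SAS binary catalog file (value labels)",
    "json": "JSON column metadata for a CSV input",
    "dct": "Stata dictionary file",
    "sps": "SPSS command file",
    "sas": "SAS command file",
}


def convert_with_metadata(input_file: str, metadata_file: str,
                          output_file: str) -> List[str]:
    """Build argv for: readstat input-file metadata-file output-file."""
    ext = Path(metadata_file).suffix.lstrip(".")
    if ext not in METADATA_TYPES:
        raise ValueError(f"unrecognized metadata extension: {ext!r}")
    return ["readstat", input_file, metadata_file, output_file]


# Hypothetical file names: a CSV input described by a JSON metadata file.
argv = convert_with_metadata("survey.csv", "survey.json", "survey.dta")
```

Checking the metadata extension before building the command gives an early, readable error instead of relying on the tool's own diagnostics.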

Both input and output formats are implied by the file extension.
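Because the format is chosen by extension alone, a wrapper script can predict it before invoking the program. The lookup below is a rough Python sketch: the function name implied_format is hypothetical, the extensions are the ones conventionally associated with each format, and the exact-case matching is a choice made for the sketch, not a statement about how readstat compares extensions.

```python
from pathlib import Path

# Binary formats that can be read (and, per the text above, written).
INPUT_FORMATS = {
    "sas7bdat": "SAS binary file",
    "xpt": "SAS portable file",
    "sav": "SPSS uncompressed binary file",
    "zsav": "SPSS compressed binary file",
    "por": "SPSS portable file",
    "dta": "Stata binary file",
}

# Extensions described above as additional choices for output-file.
EXTRA_OUTPUT_FORMATS = {"csv": "CSV file", "xlsx": "Excel file"}


def implied_format(path: str, for_output: bool = False) -> str:
    """Return the format description implied by the extension of path."""
    ext = Path(path).suffix.lstrip(".")
    table = dict(INPUT_FORMATS)
    if for_output:
        table.update(EXTRA_OUTPUT_FORMATS)
    if ext not in table:
        raise ValueError(f"unrecognized extension: {ext!r}")
    return table[ext]
```

Plain-text inputs used with a metadata file are deliberately left out of INPUT_FORMATS, since they are only meaningful in the three-argument form.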

OPTIONS

-f      Overwrite any existing output-file.
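A calling script may prefer to add -f only when it is actually needed and explicitly permitted. The Python sketch below shows one way to decide; the helper name overwrite_args is hypothetical, and it assumes that without -f an existing output file is left untouched, which the option description implies but does not state.

```python
import os
from typing import List


def overwrite_args(output_file: str, allow_overwrite: bool) -> List[str]:
    """Return ["-f"] only when output_file exists and overwriting is allowed."""
    if not os.path.exists(output_file):
        return []  # nothing to overwrite, so no flag is needed
    if not allow_overwrite:
        raise FileExistsError(f"{output_file} exists; refusing to overwrite")
    return ["-f"]


def convert_command(input_file: str, output_file: str,
                    allow_overwrite: bool = False) -> List[str]:
    """Build argv for: readstat [-f] input-file output-file."""
    flags = overwrite_args(output_file, allow_overwrite)
    return ["readstat", *flags, input_file, output_file]
```

Raising before the command is built keeps an accidental overwrite from depending on how the tool itself reacts to an existing file.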

BUGS

SAS binary files created by readstat do not open with current versions of SAS.

The finer details of format strings (e.g. "%8.2g") are not properly converted between file formats.

AUTHOR

Copyright (C) 2012-2019 Evan Miller, and others where indicated.

23 January 2019