readstat - Man Page

read and write data set files from SAS, SPSS, and Stata

Synopsis

readstat input-file

readstat [-f] input-file output-file

readstat [-f] input-file metadata-file output-file

Description

readstat converts data set files produced by popular statistics packages, in both plain-text and binary formats.

In the first invocation style, readstat displays metadata from input-file, including the row count, column count, text encoding, and timestamp. input-file should be a file with one of the following extensions:

sas7bdat

SAS binary file, created with SAS version 7 or newer

xpt

SAS portable file, version 5 or version 8, created with the SAS XPORT command

sav

SPSS uncompressed binary file

zsav

SPSS compressed binary file

por

SPSS portable file

dta

Stata binary file, version 104 or newer

If the row count cannot be determined from the file header, which is sometimes the case with SPSS binary files and always the case with SPSS portable files, readstat will report a value of -1.
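As a minimal sketch of the first invocation style, the wrapper below checks the input extension against the list above before asking readstat for metadata. The function name readstat_info is hypothetical, and the sketch assumes lower-case extensions; it only illustrates the calling convention.

```shell
# Sketch: display metadata only for files whose extension appears in the
# list of supported input extensions. readstat_info is a hypothetical
# helper, not part of readstat. Only lower-case extensions are matched.
readstat_info() {
    case "${1##*.}" in
        sas7bdat|xpt|sav|zsav|por|dta)
            # First invocation style: a single input-file argument.
            readstat "$1"
            ;;
        *)
            echo "unsupported input extension: $1" >&2
            return 2
            ;;
    esac
}
```

For instance, readstat_info survey.sav (a hypothetical file name) would run readstat survey.sav, while readstat_info notes.txt would return status 2 without invoking readstat.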

In the second invocation style, readstat converts input-file to output-file, e.g. a SAS portable file to a Stata binary file. In addition to the extensions listed above, output-file may have the extension csv or xlsx, which produces a CSV or Excel file, respectively.
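The second invocation style lends itself to batch conversion. The following sketch converts every SPSS sav file in a directory to CSV; the function name sav_to_csv and the directory layout are assumptions made for illustration.

```shell
# Sketch: convert each .sav file in a directory to a .csv file with the
# same base name. sav_to_csv is a hypothetical helper name.
sav_to_csv() {
    dir=${1:-.}                          # default to the current directory
    for src in "$dir"/*.sav; do
        [ -e "$src" ] || continue        # the glob matched nothing
        # Second invocation style: input-file followed by output-file.
        readstat "$src" "${src%.sav}.csv"
    done
}
```

Calling sav_to_csv data would run readstat once per file, e.g. readstat data/wave1.sav data/wave1.csv for a hypothetical data/wave1.sav.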

The third invocation style is used when additional metadata about the input file, such as value labels or column widths, is stored in a separate file. Several types of metadata file are supported:

sas7bcat

SAS binary "catalog" file, created with SAS version 7 or newer, containing value labels

json

JavaScript Object Notation (JSON) file, containing column metadata that cannot be gleaned from the input CSV. For details, see the manual page for the extract_metadata command.

dct

Stata dictionary file, containing the data layout and column metadata for a plain-text input file.

sps

SPSS command file, describing the data layout and column metadata for a plain-text input file.

sas

SAS command file, describing the data layout and column metadata for a plain-text input file.

The last three formats can be used for both fixed-width and delimiter-separated (e.g. tab-separated) input files. Such metadata files are commonly distributed along with plain-text ASCII data sets.
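A sketch of the third invocation style, using a SAS catalog as the metadata file. It assumes the catalog sits next to the data file and shares its base name, which is a convention chosen for this example and not something readstat requires; convert_with_catalog is a hypothetical name.

```shell
# Sketch: convert a SAS binary file, passing a sibling .sas7bcat catalog
# as the metadata-file when one exists. convert_with_catalog is a
# hypothetical helper; the shared base name is an assumption.
convert_with_catalog() {
    data=$1                               # e.g. survey.sas7bdat
    out=$2                                # e.g. survey.dta
    catalog="${data%.sas7bdat}.sas7bcat"
    if [ -f "$catalog" ]; then
        # Third invocation style: input-file metadata-file output-file.
        readstat "$data" "$catalog" "$out"
    else
        # No catalog found: fall back to the second invocation style.
        readstat "$data" "$out"
    fi
}
```

The same argument order applies to the other metadata types, with the metadata file always placed between the input file and the output file.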

Both input and output formats are implied by the file extension.

Options

-f

Overwrite any existing output-file.
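Because -f replaces an existing output-file, a cautious script may want to keep a copy first. The wrapper below is one possible sketch; the name convert_keep_backup and the .bak suffix are assumptions, not readstat features.

```shell
# Sketch: back up an existing output file, then convert with -f.
# convert_keep_backup and the ".bak" suffix are hypothetical choices.
convert_keep_backup() {
    if [ -e "$2" ]; then
        # Preserve the previous output before it is overwritten.
        cp -p "$2" "$2.bak" || return 1
    fi
    readstat -f "$1" "$2"
}
```

With hypothetical file names, convert_keep_backup survey.sav survey.csv would copy an existing survey.csv to survey.csv.bak and then run readstat -f survey.sav survey.csv.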

Bugs

SAS binary files created by readstat do not open with current versions of SAS.

The finer details of format strings (e.g. "%8.2g") are not properly converted between file formats.

Author

Copyright (C) 2012-2019 Evan Miller, and others where indicated.

Info

23 January 2019