readstat - Man Page
read and write data set files from SAS, SPSS, and Stata
Synopsis
readstat input-file
readstat [-f] input-file output-file
readstat [-f] input-file metadata-file output-file
Description
readstat converts data set files from popular statistics packages. Both plain-text and binary formats are supported.
In the first invocation style, readstat displays metadata from input-file, including the row count, column count, text encoding, and timestamp. input-file should be a file with one of the following extensions:
- sas7bdat
SAS binary file, created with SAS version 7 or newer
- xpt
SAS portable file, version 5 or version 8, created with the SAS XPORT command
- sav
SPSS uncompressed binary file
- zsav
SPSS compressed binary file
- por
SPSS portable file
- dta
Stata binary file, version 104 or newer
If the row count cannot be determined from the file header, which is sometimes the case with SPSS binary files and always the case with SPSS portable files, readstat will report a value of -1.
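Because the format is chosen from the extension, a wrapper script may want to check a file name before passing it to readstat. The following is a minimal sketch in POSIX shell. The function name readstat_input_kind is hypothetical and not part of readstat, and it matches only the lower-case extensions listed above; whether readstat itself accepts other spellings is not stated here.

```shell
# Hypothetical helper (not part of readstat): classify a file name by the
# input extensions listed above. Returns non-zero for anything else.
readstat_input_kind() {
    case "$1" in
        *.sas7bdat) echo "SAS binary" ;;
        *.xpt)      echo "SAS portable" ;;
        *.sav)      echo "SPSS uncompressed binary" ;;
        *.zsav)     echo "SPSS compressed binary" ;;
        *.por)      echo "SPSS portable" ;;
        *.dta)      echo "Stata binary" ;;
        *)          return 1 ;;
    esac
}
```

For example, readstat_input_kind survey.zsav prints "SPSS compressed binary", while a name such as notes.txt produces no output and a non-zero status.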
In the second invocation style, readstat converts input-file to output-file, e.g. a SAS portable file to a Stata binary file. In addition to the extensions listed above, output-file may have the extension csv or xlsx, which creates a CSV or Excel file, respectively.
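The two-argument form can be wrapped in a small shell function, sketched here. The function name readstat_convert_to and the file names are hypothetical; only the readstat input-file output-file call itself comes from the synopsis.

```shell
# Hypothetical helper (not part of readstat): convert a file to another
# format by swapping its extension, e.g. survey.xpt -> survey.dta.
# $1 = input file, $2 = target extension (dta, sav, csv, xlsx, ...)
readstat_convert_to() {
    src=$1
    dst="${src%.*}.$2"
    # The extensions of both names tell readstat which formats to use.
    readstat "$src" "$dst"
}

# Example calls (hypothetical file names):
#   readstat_convert_to survey.xpt dta    # SAS portable -> Stata binary
#   readstat_convert_to survey.xpt csv    # SAS portable -> CSV
```

The helper assumes the input name contains an extension; a name without one is left for the caller to handle.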
The third invocation style is used when additional metadata about the input file, such as value labels or column widths, is stored in a separate file. Several types of metadata file are supported:
- sas7bcat
SAS binary "catalog" file, created with SAS version 7 or newer, containing value labels
- json
JavaScript Object Notation (JSON) file, containing column metadata that cannot be gleaned from the input CSV. For details, see the manual page for the extract_metadata command.
- dct
Stata dictionary file, containing the data layout and column metadata for a plain-text input file.
- sps
SPSS command file, describing the data layout and column metadata for a plain-text input file.
- sas
SAS command file, describing the data layout and column metadata for a plain-text input file.
The last three formats can be used for both fixed-width and delimiter-separated (e.g. tab-separated) input files. Dictionary and command files of this kind are commonly distributed along with plain-text ASCII data sets.
Both input and output formats are implied by the file extension.
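The three-argument form might be wrapped as follows. This is a sketch only: readstat_with_metadata is a hypothetical name, and the file names in the comments are made up for illustration.

```shell
# Hypothetical wrapper (not part of readstat) around the three-argument form.
# $1 = input file, $2 = metadata file, $3 = output file
readstat_with_metadata() {
    if [ "$#" -ne 3 ]; then
        echo "usage: readstat_with_metadata input metadata output" >&2
        return 2
    fi
    readstat "$1" "$2" "$3"
}

# Example calls (hypothetical file names):
#   readstat_with_metadata survey.csv survey.json survey.dta
#   readstat_with_metadata survey.sas7bdat formats.sas7bcat survey.sav
```

The argument-count check is a convenience of the wrapper, not a behavior of readstat.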
Options
- -f
Overwrite any existing output-file.
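A script that should overwrite only on request could guard the -f flag as sketched below. The function name readstat_convert and the FORCE variable are hypothetical. This page does not describe what readstat does without -f when output-file already exists, so the sketch checks for the file itself.

```shell
# Hypothetical wrapper (not part of readstat): pass -f only when FORCE=1.
# $1 = input file, $2 = output file
readstat_convert() {
    src=$1
    dst=$2
    if [ "${FORCE:-0}" = 1 ]; then
        # Caller asked for it: overwrite any existing output file.
        readstat -f "$src" "$dst"
    elif [ -e "$dst" ]; then
        echo "refusing to overwrite $dst (set FORCE=1)" >&2
        return 1
    else
        readstat "$src" "$dst"
    fi
}

# Example calls (hypothetical file names):
#   readstat_convert survey.xpt survey.dta
#   FORCE=1 readstat_convert survey.xpt survey.dta
```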
Bugs
SAS binary files created by readstat do not open with current versions of SAS.
The finer details of format strings (e.g. "%8.2g") are not properly converted between file formats.
Author
Copyright (C) 2012-2019 Evan Miller, and others where indicated.