SAS Data
Overview
SAS can read data in many formats, including text (ASCII) files, SPSS
portable files, database files, and Excel spreadsheets. The formats
allowed vary across operating systems.
Once the data are read by the SAS program, they are converted into a SAS
data set, which contains information about the names, values, formats, and
labels of the variables. The data set may be stored temporarily, for the
duration of the program, or it may be stored permanently for re-use in
other SAS programs.
The first section of this document shows how to prepare a raw (text)
data set. The second section explains how to create a permanent SAS data
set.
Raw Data
- The raw data file contains the values of variables in text (ASCII)
format. If the data are to be read in with a SAS data step, the raw data
file contains only the data. If PROC IMPORT is used to read raw data, the
first line in the data file may contain the names of the variables to be
read on subsequent lines.
- Data may be read inline (in the body of the SAS command file) or
from an external data file. If you plan to write several programs to
analyze the data, or if the data file is more than a few lines, the data
should be stored in an external file.
- The values for each unit of analysis (subject) appear on one or more
(contiguous) lines.
- If possible, the data should be prepared in fixed format, with the
values of each variable occupying the same columns for each unit of
analysis (e.g., AGE appears in columns 23-24). Spaces may be used to
separate the values of the variables, but they are not required.
- If data are not read in fixed format, the values of the variables
must be separated with a delimiter. The most commonly used delimiter
is the space, but other characters, such as the comma, ampersand
(&), or tab, may be used.
- Missing data may be left blank if the columns or format are
specified in the data step. If data are read using free-field (list) input
(scanning for blanks), each missing value should be indicated with a
period (.) so that the remaining values are not shifted to the wrong variables.
- The data file may be created using a word processor, spreadsheet, or
text editor, but it must be output in text format.
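The points above about fixed format, delimiters, and missing values can be sketched with three short data steps. This is a minimal illustration, not a template for any particular study: the data set names (fixed, freefield, commas), the variable names (id, sex, age, score), and the data values are all invented for the example, and the data are read inline only to keep the sketch self-contained.

```sas
/* Fixed format (column input): each variable occupies the same   */
/* columns on every line, so a missing value may be left blank.   */
/* All names and values here are made up for illustration.        */
data fixed;
   input id 1-3 sex $ 5 age 7-8 score 10-13;
   datalines;
001 F 34 87.5
002 M    91.0
003 F 29
;
run;

/* Free-field (list) input: values are separated by blanks, so    */
/* each missing value is marked with a period.                    */
data freefield;
   input id sex $ age score;
   datalines;
1 F 34 87.5
2 M . 91.0
3 F 29 .
;
run;

/* Comma-delimited input: DLM= names the delimiter, and DSD makes */
/* two consecutive delimiters count as a missing value.           */
data commas;
   infile datalines dlm=',' dsd;
   input id sex $ age score;
   datalines;
1,F,34,87.5
2,M,,91.0
3,F,29,
;
run;
```

To read the same values from an external file, the DATALINES block would be dropped and the INFILE statement would name the file instead of DATALINES.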
SAS Data Sets
- A SAS data set consists of the variable names, formats, labels, and
data values stored in machine-readable format. The data set may be created
from a raw data set in a SAS data step (see the SAS syntax page) or
as output from a SAS procedure such as PROC REG.
- SAS data sets may be temporary (existing for
the duration of the SAS run) or permanent (stored on tape or
disk for later use).
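The difference between temporary and permanent data sets can be sketched as follows. The libref (mylib), the directory path, and the data set and variable names are hypothetical and chosen only for illustration; the path must be replaced with a directory that exists on your system.

```sas
/* Hypothetical libref and directory: point this at a real folder. */
libname mylib '/home/user/sasdata';

/* Temporary: a one-level name is stored in the WORK library and   */
/* is deleted when the SAS run ends. Values are made up.           */
data survey;
   input id age;
   datalines;
1 34
2 29
;
run;

/* Permanent: a two-level name (libref.name) is written to the     */
/* directory assigned to the libref and survives the run.          */
data mylib.survey;
   set survey;
run;

/* A later program can issue the same LIBNAME statement and use    */
/* the permanent data set without rereading the raw data.          */
proc means data=mylib.survey;
   var age;
run;
```

Only the two-level name distinguishes the permanent data set from the temporary one; the data step itself is written the same way in both cases.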
How to Create SAS Data Sets in a Data Step
For examples of SAS programs using permanent SAS data sets, please refer
to the Sample SAS Command Files Page.