Fixed-Column, ASCII Data Files

Reading: SPSS Base 9.0 User's Guide, Chapter 3: Data Files
Homework: Read a Fixed-Column, ASCII Data File
Download: fixed.dat    Download Tips

  1. Overview
  2. Fixed-Column, ASCII Format
  3. How to Read a Fixed-Column Datafile into SPSS
  4. Completing the Data Definition Process
  5. How to Use the Syntax File to Read the ASCII Data.
  6. How to Use the TextWizard Predefined Format file to Read the ASCII Data 
  7. Types of Files Used in SPSS

1. Overview

It is becoming increasingly unlikely that you will ever use a word processor to create an ASCII data file, but you may come across ASCII (or text) data files that you want to analyze.  Text files are created not only by word processors, but also by form pages on the Internet (e.g., the skills survey you filled out at the beginning of the course) and by some computer programs used by psychologists to collect data. So you may find it useful to be able to read text files into SPSS.

ASCII or text files can be structured in three basic ways: fixed-column format, freefield format, and delimited format.  In a fixed-column data file the values for a given variable are always placed in the same columns, e.g., the values of gender are always, say, in column 6. The values of variables in a freefield data file must be separated by one or more spaces or commas. The values of variables in a delimited data file must be separated by a single delimiter.  Freefield and delimited data files are similar in that the values for a particular variable do not have to reside in the same columns for each case in the datafile. This set of notes describes the fixed-column format datafile and how to read it into SPSS.  The next set of notes describes the delimited format and how to read it.

SPSS Version note: The Data Editor in SPSS for Windows 8.0 and earlier could read fixed-column and freefield formats. The Data Editor in SPSS for Windows 9.0 switched from reading freefield format data files to reading delimited format data files.  

If you are using version 8.0, you may want to read the sections on freefield files: see SPSS80/ascii.htm and SPSS80/asciifree.htm.

If you want a set of guidelines about how to create ASCII files using a word processor then see:  SPSS80/ascii_fixed.htm .



2. Fixed-Column ASCII Format

Fixed-column format means that the values for a variable are always located in the same columns. Let's consider a dataset with four variables.  The codebook is shown in Table 1 and the data from five cases are shown in Table 2. The data are stored in the file called fixed.dat. Id is always located in columns 1-2, firstnam is always located in columns 3-12, age is always located in columns 15-16, and gender is always located in column 18. In this example the case with id = 03 is a 10-year-old female whose first name is Suzanne.

Table 1. Codebook for a Fixed-Column, ASCII Datafile
Variable Name   Columns   Variable Type   Variable Labels/Value Labels
id              1-2       numeric2.0
firstnam        3-12      string10        First name of respondent
age             15-16     numeric2.0      Age of respondent
gender          18        numeric1.0      Gender of respondent / 1 = "Female", 2 = "Male", 9 = "No Gender Information"

 

Table 2. Data for a Fixed-Column ASCII Datafile
01Martha      18 1
02            53 9
03Suzanne     10 1
04Debbie         1
07Fernandez   21 2

The values can be right next to each other (e.g., id and firstnam) or they can be separated by one or more spaces (e.g., age and gender). The basic rule for fixed-column ASCII files is that the values for a variable must always be located in the same columns.
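To make the basic rule concrete, here is a minimal sketch in Python (not SPSS) of how a program could pull the four variables out of one line using only the column positions from Table 1. The names COLUMNS and split_fixed are made up for this illustration, and this is not a description of how SPSS itself reads the file.

```python
# Column positions are taken from the codebook (Table 1).
# They are 1-based and inclusive, as in the codebook.
COLUMNS = {
    "id": (1, 2),
    "firstnam": (3, 12),
    "age": (15, 16),
    "gender": (18, 18),
}

def split_fixed(line, columns=COLUMNS):
    """Return the raw text found in each variable's columns (hypothetical helper)."""
    # Pad short lines with blanks so every slice has its full width.
    width = max(end for _, end in columns.values())
    padded = line.ljust(width)
    # Convert 1-based inclusive columns to Python's 0-based, end-exclusive slices.
    return {name: padded[start - 1:end] for name, (start, end) in columns.items()}

# Rebuild the line for case 03 column by column:
# id (1-2), firstnam (3-12), two blank columns (13-14), age (15-16),
# one blank column (17), gender (18).
line = "03" + "Suzanne".ljust(10) + "  " + "10" + " " + "1"

record = split_fixed(line)
print(record["id"], record["firstnam"].rstrip(), record["age"], record["gender"])
```

Note that the string value comes back padded with trailing blanks to its full width of 10 columns, which is why the example strips it before printing.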

Variable types. A wide array of data types is available in fixed-column ASCII format, including numeric, several date types, string, dollar, comma, and dot.

Missing values. Missing values can be either system missing or user-defined missing when you use a fixed-column ASCII data file. A system missing value is defined by a set of blanks across the entire field. For example, the age for the case with id = 04 is system missing. The gender for the case with id = 02 is user missing. When you define your variables you would need to specifically assign 9 as missing for the gender variable.
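The difference between the two kinds of missing values can be sketched in Python (again, an illustration only, not SPSS). In this sketch a blank field becomes None, which plays the role of system missing, while 9 is treated as missing for gender only because we declare it so. The names USER_MISSING, parse_numeric, and is_missing are invented for the example.

```python
# User-defined missing codes, declared per variable.
# Per the codebook, 9 = "No Gender Information" for gender.
USER_MISSING = {"gender": {9}}

def parse_numeric(field):
    """Blank field -> None (stands in for system missing); otherwise an int."""
    text = field.strip()
    return int(text) if text else None

def is_missing(name, value, user_missing=USER_MISSING):
    """True if the value is system missing or a declared user-missing code."""
    return value is None or value in user_missing.get(name, set())

age_04 = parse_numeric("  ")    # age field for case 04 is all blanks
gender_02 = parse_numeric("9")  # gender for case 02 is coded 9
gender_01 = parse_numeric("1")  # gender for case 01 is a valid value

print(is_missing("age", age_04))
print(is_missing("gender", gender_02))
print(is_missing("gender", gender_01))
```

If 9 were not declared as a user missing code, it would be treated as an ordinary value, which is why the assignment has to be made explicitly.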

String variables. String variables can be used in fixed-column format data files. By default SPSS expects the values of string variables to be left justified, as they are in this example.



3.  How to Read a Fixed-Column Datafile into SPSS

Open a fixed-column, ASCII data file in SPSS with the following sequence of commands:

File
   Read Text Data

Step 1 - 

Does your data match a predefined format?

The answer here is no.  A predefined format means that you have the formatting information stored in a file somewhere.  Press NEXT to go to Step 2. 

Step 2 - 

The first question asks whether the data file is in delimited format (variables are delimited by specific characters) or in fixed-column format (variables are aligned in fixed-width columns).  Select fixed-width format.

The second question asks whether or not the variable names are included at the top of the file. In our dataset the first row of data is the first case, not a list of variable names, so select no.

Step 3 -

The first case begins on line 1. 

All the data for a case are presented on one line, so enter 1 in response to the question, "How many lines represent a case?"  If you have a large dataset with many variables, then the variables for a case may be entered on more than one line.

We want to read (import)  all the cases. 
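The "lines per case" setting in this step can be pictured with a short Python sketch (an illustration, not SPSS): consecutive lines are grouped into cases of a fixed size. The function name group_records and the two-line sample data are hypothetical; fixed.dat itself uses one line per case.

```python
def group_records(lines, lines_per_case=1):
    """Group consecutive lines into cases of `lines_per_case` lines each."""
    if len(lines) % lines_per_case != 0:
        raise ValueError("number of lines is not a multiple of lines_per_case")
    return [
        lines[i:i + lines_per_case]
        for i in range(0, len(lines), lines_per_case)
    ]

# Hypothetical file in which each case occupies two lines.
two_line_file = ["01 first line", "01 second line",
                 "02 first line", "02 second line"]
cases = group_records(two_line_file, lines_per_case=2)
print(len(cases))

# With one line per case (as in fixed.dat) every line is its own case.
single = group_records(["01Martha", "02"], lines_per_case=1)
print(len(single))
```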

Step 4 -

This step defines the columns in which each variable is stored.  The Text Wizard will try to make the best guess about how the variables are entered.  It apparently makes the assumption that variables are separated by one or more spaces.  This results in the lumping together of the id and firstnam variables because they occur in consecutive columns without a break.  Drag the break line so that id and firstnam are separated.

Drag the remaining break line to the beginning of the age variable (column 15).

Insert a new break line at the beginning of the gender variable (column 18).

Click NEXT to go to the next step. 

Step 5 -

The values are now shown in Data Editor (spreadsheet) format.  Check the values in the spreadsheet against the values in Table 2 to make sure that the data have been read correctly.

You should note that the dataset has an error.  The gender value for case 04, Debbie, has been entered in the wrong column in the original datafile, fixed.dat.  The value of gender is being read as the value of age. There are two ways of fixing this problem.  You could exit the Text Wizard process, edit the fixed.dat file with a text editor (move the gender value to the correct column), and then begin the Text Wizard process again. Or you could note the error and fix it once the actual SPSS datafile has been created.  You cannot change any data from within the Text Wizard.

Let's fix it later.

The next step is to enter the variable names and the data format for each variable. Click on V1 and enter "id" as the variable name and make sure that the variable format is numeric.  You cannot choose the width of the variable at this time, only the basic format of the variable (numeric, string, dollar, etc.).

Note that there is an option of not importing a variable.  If you do not wish to include a particular variable within the datafile you are creating, then click "Do not import" as the data format.

Go ahead and enter the names and formats for the other three variables.

When you have finished, click Next. 

Step 6 -

If at any time in the future you might want to read the text file again, then you should answer yes to either of the questions in this step. For example, you may be in the middle of the process of collecting data from your web page or computer program and you will want to reread the text data file after more data has been collected.  

There are two tools for making the rereading process easier. The first saves all the information you just entered in a file called a "TextWizard Predefined Format".  The default extension of this type of file is ".tpf".  Just to see how this works, click yes in response to the question "Would you like to save this file format for future use?"  Then press the Save As button, select the directory in which to save the file, and enter a file name.  I recommend using the same file name as the original ASCII file, in this case "fixed".  The extension ".tpf" will be appended automatically.

The other tool is to create a syntax file out of the information that you entered.  To try this out click yes  in response to the question "Would you like to paste the syntax?"  

Press Finish to continue the process.

A syntax window will open.  The syntax commands will look like those in the next table.

data list file='D:\My Documents\UCCS\COURSES\spss\data\fixed.dat' fixed
records=1 /1 id 1-2(F) firstnam 3-14(A) age 15-17(A) gender 18-18(A).
EXECUTE .

The SPSS command "data list" is used to read a text file.  The data list command defines:
(a) the location of the ASCII file ( file='D:\My Documents\UCCS\COURSES\spss\data\fixed.dat' );
(b) the format (fixed or freefield) of the file (fixed);
(c) the number of records used to store the data for each case (records=1);
(d) a record number (/1); and
(e) each of the variables, their location on the record, and the basic data type
(id 1-2(F) firstnam 3-14(A) age 15-17(A) gender 18-18(A)).

For example, the variable id is stored in columns 1-2 and the data type is numeric.  The data list command uses Fortran formatting notation for identifying data types.  The Fortran format "F" refers to a numeric variable; the Fortran format "A" stands for a string (or alphanumeric) variable.
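To see how the variable list is put together, the following Python sketch (not SPSS) breaks the list shown above into its parts: name, starting column, ending column, and format letter. The regular expression and the names PATTERN, TYPE_NAMES, and parse_spec are my own assumptions for this illustration and only cover the simple "name start-end(letter)" form used here, not the full data list syntax.

```python
import re

# The variable list exactly as it appears in the pasted syntax.
SPEC = "id 1-2(F) firstnam 3-14(A) age 15-17(A) gender 18-18(A)"

# name, start column, end column, one-letter format code
PATTERN = re.compile(r"(\w+)\s+(\d+)-(\d+)\((\w)\)")

# Only the two format letters discussed in these notes.
TYPE_NAMES = {"F": "numeric", "A": "string"}

def parse_spec(spec):
    """Return (name, start, end, type) tuples for a simple variable list."""
    return [
        (name, int(start), int(end), TYPE_NAMES.get(code, code))
        for name, start, end, code in PATTERN.findall(spec)
    ]

variables = parse_spec(SPEC)
for name, start, end, kind in variables:
    print(f"{name}: columns {start}-{end}, {kind}")
```

Reading the list this way makes it easy to compare the pasted syntax against the codebook in Table 1, both for the column ranges and for the data types.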

The EXECUTE command will cause the data to be read and entered into the Data Editor.

Save the syntax commands.  The reason for creating the syntax commands is to use them in the future.  To do that you will need to save the commands as a syntax file. I recommend saving the syntax commands in a file that has the same name as the ASCII file, e.g., fixed in this example.  SPSS will automatically assign the extension ".sps" to a syntax file.

Run the syntax commands. At this point no data have been transferred to the SPSS Data Editor. Run all the syntax commands to read the data into the Data Editor.



4. Completing the Data Definition Process

The variable names, basic data type, and values now appear in the Data Editor.  There are still a few data definition tasks to be done:
(a) define variable labels
(b) define value labels where necessary
(c) define any user missing values
(d) check the data type definitions.

Finally, clean up any data problems that were encountered.  In this example the gender value for case 04 was misplaced. Delete the "1" from age and enter "1" for gender for case 04. 

You can now save the SPSS system file.  Again, I recommend using the same file name, e.g., fixed, as was used for the ASCII data file.



5. How to Use the Syntax File to Read the ASCII Data.

To use the syntax file to read the ASCII data:
(a) Open the SPSS data editor.
(b) Click File, then Open..., and find and open the syntax file that you want to use, e.g., fixed.sps.  By default, SPSS assumes that you want to open an SPSS system file, so it limits the display of files to those with a .sav extension.  Change the extension to .sps to display syntax files.
(c) Run the syntax commands. 

The commands should run, assuming that you have not moved the ASCII data file.  That is, the ASCII data file must reside in the directory that is identified in the file='...' part of the data list command.  You may have to edit the directory name if the location of the ASCII file has been changed.


6. How to Use the TextWizard Predefined Format file to Read the ASCII Data 

To use the TextWizard Predefined Format file to read the ASCII data:
(a) Open the SPSS data editor.
(b) Click File, then Read Text Data.
(c) Find the ASCII data file and open it.
(d) Answer yes to the question "Does your text match a predefined format?"  Click Browse to find the .tpf file and open it.

At this point you can either check each of the steps in the Text Wizard, or you can press Finish to skip to the last step of the process.



7. Types of Files Used in SPSS

At this point in our exploration of SPSS we have encountered several different types of files. Here is a summary of the types of files and the default extensions for those files.

File type                              Default extension   Content (file type)
SPSS data file (or SPSS system file)   .sav                Data file created by the SPSS Editor (ASCII + binary)
Syntax file                            .sps                SPSS syntax commands (ASCII)
Raw data file                          .dat or .txt        Data values (ASCII)
TextWizard Predefined Format           .tpf                Data definition information that is used to read in an ASCII file using the SPSS Text Wizard. (binary? It is not readable by a word processor.)

It is recommended that the name of the file should be the same for each set of data.  For example, in this set of notes we are working with a set of data that we called "fixed."   The system file for that data is fixed.sav, the relevant syntax file is fixed.sps, the ASCII or raw data file is fixed.dat, and the TextWizard file is fixed.tpf.  When we look at the files on our disk we can easily see that those four files are related to each other. You should be able to recognize the type of file from its extension.
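The naming convention can be illustrated with a few lines of Python (not part of SPSS): starting from the raw data file name, the other three file names follow by swapping the extension. The names EXTENSIONS and related_files are invented for this sketch.

```python
from pathlib import Path

# Default extensions from the summary table above.
EXTENSIONS = {
    ".sav": "SPSS data (system) file",
    ".sps": "syntax file",
    ".dat": "raw data file",
    ".tpf": "TextWizard Predefined Format",
}

def related_files(data_file):
    """Map each file type to the file name that shares the data file's base name."""
    base = Path(data_file)
    return {kind: base.with_suffix(ext).name for ext, kind in EXTENSIONS.items()}

files = related_files("fixed.dat")
for kind, name in files.items():
    print(f"{name}: {kind}")
```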



© Lee A. Becker, 1999 - revised 07/13/00