Homework: Recode

Value: 20 pts.
Date Due: the beginning of the next class.

Name __________________________________

Here are the guidelines:
1) Make a hard copy of this file.
2) Enter your name and the answers on your copy and turn the copy in to me.
3) After doing the problems by hand, verify your answers whenever possible by writing a command file for these problems and running it through SPSS.

NOTE: DO THE HOMEWORK BY HAND FIRST.
If you just plug in the data, you won't learn enough to do well on the test.

Here is the data for all the questions in this homework:

Table 1. The Data
The Data
 ID  X1  X2  X3  X4  AGE STRESS    

  1   4   9   4   5  -99  High
  2   9   3   1   6   19   Low
  3   6   4   7   .    .  Med
  4   3   2   .   2   51  Low
  Note: Assume that 9 has been identified as a user-missing value for X1, X2, X3, and X4, and that -99 is a user-defined missing value for AGE.

When you fill in the blanks in the questions below:
4) Use a period (.) to indicate a system-missing value (SYSMIS).
5) Pay particular attention to what happens to system- and user-missing values.


Set (1)  Scale reversal.

Assume that variables X1, X2, X3, and X4 are attitude items that are responded to on 7-point Likert-type scales with the following scale values:

1 = strongly disagree
2 = disagree
3 = slightly disagree
4 = neither agree nor disagree
5 = slightly agree
6 = agree
7 = strongly agree

You wish to reverse the scoring for variables X2 and X3 by recoding them into variables X2R and X3R. Assume (a) that you have run the syntax commands shown in Table 2, and (b) that you have made no other changes to the data after running them. Make the appropriate changes to Table 2 (fill in all the blanks).

Table 2. Set (1) - Reverse Scoring
Command Syntax
RECODE X2 X3 (1 = 7)(2 = 6)(3 = 5)(5 = 3)(6 = 2)(7 = 1)
    INTO X2R X3R.
EXECUTE.
The Data
 ID  X1  X2  X3  X4  AGE STRESS  X2R  X3R  

  1   4  ___ ___  5  -99  High   ___  ___
  2   9  ___ ___  6   19   Low   ___  ___
  3   6  ___ ___  .    .  Med    ___  ___
  4   3  ___ ___  2   51  Low    ___  ___

___T  ___ F   After running the commands in Table 2, 9s will be defined as user-missing values in variables X2R and X3R.

What changes, if any, would you make to the recode command in Table 2?

___I would rewrite the command as follows:

RECODE  __________________________________________________________________
                 __________________________________________________________________

___I would make no changes.


What are the data types for X2R and X3R?  _________________ 

 


Set (2)  Creating a new Independent Variable

For this question, assume that there are 200 cases in the data file shown in Table 1. Cases numbered 1 to 100 are children who have been diagnosed with posttraumatic stress disorder (PTSD); cases numbered 101 to 200 are children diagnosed with attention-deficit/hyperactivity disorder (ADHD). You want to run an analysis to see whether the children with PTSD score differently on the attitude scale than the children with ADHD. To do this, you need a new variable, diagnos (diagnosis of disorder), that has a value of 1 if the child has ADHD and a value of 2 if the child has PTSD. Write the command(s) that would yield the new variable, diagnos, with the correct values. You get points for this question only if the command(s) you write would actually produce the desired output. Please include the proper variable label and value label commands.

Table 3. Set (2): Creating a Diagnosis Independent Variable

Command Syntax

 

 

 

Variable Label
Value labels 

EXECUTE.

 

__ T  __ F   It would be necessary to add a MISSING VALUES command to this set of commands.


Set (3)  Missing Values

Assume you are starting over with the data in Table 1.

Table 1. The Data
The Data
 ID  X1  X2  X3  X4  AGE STRESS    

  1   4   9   4   5  -99  High
  2   9   3   1   6   19   Low
  3   6   4   7   .    .  Med
  4   3   2   .   2   51  Low
  Note: Assume that 9 has been identified as a user-missing value for X1, X2, X3, and X4, and that -99 is a user-defined missing value for AGE.

You want to change all the user-missing values for the variables X1, X2, X3, and X4 to system-missing values. That is, you do not wish to have to deal with the user-missing values in those variables. Write the command(s) that will accomplish that goal. Do not create any new variables. Fill in the blanks in Table 4 to indicate what the data will look like if your command(s) work as expected.

Table 4. Set (3): Missing Values

Command Syntax

 

 

EXECUTE.
Output
  ID  X1  X2  X3  X4  AGE STRESS    

  __  __  __  __  __  -99  High
  __  __  __  __  __   19   Low
  __  __  __  __  __    .  Med
  __  __  __  __  __   51  Low

Set (4) String Values

You want to use the information in the STRESS variable to create a numeric variable that has three levels of stress (1 = Low,  2 = Med, and 3 = High).   Write the command(s) that will accomplish that goal.  Assume that the STRESS values in Table 1 represent all the different possible values for that variable. Fill in the blanks in Table 5 to indicate what the data would look like if your command(s) work as expected.

Table 5. Set (4): String Values
Command Syntax
 

 

 

EXECUTE.
Output
ID  X1  X2  X3  X4  AGE   STRESS  _____    

  1   4   9   4   5  -99  _____    ___ 
  2   9   3   1   6   19  _____    ___ 
  3   6   4   7   .    .  _____    ___
  4   3   2   .   2   51  _____    ___    


Suppose that there are 2000 cases in the data file represented by Table 1. The data were entered by a number of different people, and you don't know how consistent they were in entering the string values for the STRESS variable (e.g., some values may be in all upper case, some in all lower case, etc.). What would be an efficient way of determining all the possible string values for STRESS prior to recoding them?

 

 

 

 


Checking your answers.

You can build your own answer key for most of these problems by actually creating a data file in the Data Editor and running each of these problems. Make sure you have correctly entered the values for each variable.

As I noted earlier, you should fill in the blanks by hand prior to running your answer key. Then create the data file to check your answers. If the answers provided by SPSS do not match your own answers, go back and think about what happened in that transformation. Think about why there is a discrepancy between your original thoughts about the problem and the answers provided by your SPSS answer key.
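
If you would rather build the data file from a command file than type it into the Data Editor, something like the following sketch should reproduce Table 1. It is only a starting point and makes two assumptions: the formats (F4.0 for the numeric variables, A4 for STRESS) are illustrative choices, and a lone period in list-format data is expected to be read as system-missing (SPSS may print a warning when it does so). Compare the LIST output with Table 1 before relying on it.

```spss
* Sketch: enter the Table 1 data from syntax instead of the Data Editor.
* The formats F4.0 and A4 are illustrative assumptions.
DATA LIST LIST
  / ID (F4.0) X1 (F4.0) X2 (F4.0) X3 (F4.0) X4 (F4.0) AGE (F4.0) STRESS (A4).
BEGIN DATA
1 4 9 4 5 -99 High
2 9 3 1 6 19 Low
3 6 4 7 . . Med
4 3 2 . 2 51 Low
END DATA.

* Declare the user-missing values described in the note to Table 1.
MISSING VALUES X1 X2 X3 X4 (9) / AGE (-99).

* Print the cases so they can be compared with Table 1.
LIST.
```

Rerun this block before each set so that every problem starts from the original Table 1 data.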



© Lee A. Becker, 1998-1999. Revised 09/27/99.