In terms of measurement theory, "validity" refers to whether or not you are measuring what you intend to measure. If you intend to measure anxiety but your items tap into social desirability, then you do not have a valid measure of anxiety. Reliability refers to whether you can measure it consistently.
These same concepts apply to the research design. Internal validity refers to whether or not the effects you obtain in your study are due to your conceptual variable. If there are plausible alternative explanations for your data, then the study does not have internal validity. External validity refers to whether or not the results can be generalized to people and situations outside of the specific participants and situations of the research.
The reliability of a study is related to whether or not the findings can be replicated. But that's another story.
We look at a set of case studies to see if we can seriously question the internal validity. We will be looking for reasonable alternative explanations of the findings.
| The University of Georgia studied the effects of dormitory hours on the GPA of 787 resident freshmen women. Of that group 371 women were required to observe dormitory hours, while the remaining (n = 416) were given permission by their parents to ignore closing hours. At the end of the academic term there was no significant difference in GPA between the two groups. Would you be willing to conclude from this study that dorm hours have no effect on GPA? | What is the design of this study? What alternative explanations are there? |
| The U.S. Navy has developed alcohol treatment centers for its personnel. In an effort to evaluate the effectiveness of these centers they collected pre- and posttest measures on 404 alcoholics who completed treatment. They found positive changes on level of trust, emotional stability, and extroversion. These positive changes were accompanied by significant decreases in both pathology (depression, hysteria) and anxiety. Ratings by the (former) alcoholics' commanding officers indicated the short-term success rate to be over 80%. This stands in marked contrast to the 45% rate of success reported before development of the special treatment centers. | What is the design of this study? What alternative explanations are there? |
| Groups For Parents is a packaged program for dealing with behavior problems in children. It offers a support group of other parents and an integrated humanistic behavior modification approach. The authors reported the following evaluation of their program. Thirteen groups of parents (n = 277) met once a week for 2.5 hours over an 8 week period. About one half of the parents had been referred by various community agencies; the other half had heard about it from friends or other informed sources. The pre- and posttest measures included: (a) a problem behavior checklist; Approximately two thirds (n = 180) of those enrolled completed the entire eight week course. Differences between the pre- and posttest scores were analyzed. Significant, positive results were found on all the measures. In addition, a very high rate of client satisfaction at the end of the study was reported. | What is the design of this study? What alternative explanations are there? |
| In 1953, Dr. J. N. Morris of London Hospital's Medical Research Council conducted what turned out to be a classic study of exercise and heart disease. His participants were drivers and conductors of London's double-decker buses, and he found that the drivers had 1.5 times the incidence of heart disease as the conductors and 2 times the coronary death rate. (Was this an ethical study?) Since the drivers simply sat in their seats all day while the conductors ran up and down the stairs to collect the fares, Dr. Morris asserted that exercise was the causal variable that brought about the observed health differences. | What is the design of this study? What alternative explanations are there? |
Authoritarian personality types are commonly thought to be more punishing towards criminals than nonauthoritarians. For example, studies have shown that authoritarians are more likely to vote guilty and to recommend harsher sentences in murder cases than are nonauthoritarians. As a result of consulting on jury selection in a self-defense murder case, I began to wonder if there might be some circumstances in which that generalization might not hold.
| We presented authoritarian and nonauthoritarian "jurors" a trial summary that described a "self-defense" murder or a "standard" murder. Participants were randomly assigned to one of the two murder conditions. In the self-defense case the death took place at the home of Mr. X, the assailant. A few minutes earlier Mrs. Y, the next door neighbor, had appeared at the door asking for help, indicating that her husband was drunk and threatening her with a gun. She was invited in and was talking with Mr. X's wife when Mr. Y appeared at the door, hand behind his back, demanding to see his wife. They argued for a short time. Mr. X asked Mr. Y to calm down and go home. When Mr. Y reached to open the door Mr. X took a shotgun and blasted it through the door, killing Mr. Y. In the "standard" murder the assailant, Mr. X, was robbing a convenience store. During the robbery he shot and killed a clerk, Mr. Y. The results for the standard murder replicated earlier findings that authoritarians were more likely to vote guilty than were nonauthoritarians. The results for the self-defense murder were the opposite: authoritarians were less likely to vote guilty than were the nonauthoritarians. | What is the design of this study? What alternative explanations are there? |
These artifacts are related to the general reactivity of the participant to the study and the demand characteristics of the study. Demand characteristics are cues that govern the participant's perceptions of the purpose of the study.
A. Reactivity Effects
-Evaluation apprehension
-Novelty effects
B. Demand Characteristics
-The good participant / the negative participant / the apathetic participant
-The obedient participant
1. Observer bias: The observer overestimates or underestimates the responses during the observation and/or the recording phase. Controlled by automating the data collection and recording. Also controlled by independent replication.
2. Interpreter bias: An error in the interpretation of data. Controlled by access to data by other scientists.
3. Intentional bias: Fabrication or fraudulent manipulation of data. Controlled by independent replication and by access to data by other scientists.
1. Biosocial effect: An error attributable to biosocial attributes of the researcher (e.g., sex, age, ethnicity, and race). Controlled by independent replication.
2. Psychosocial effect: An error attributable to psychosocial attributes of the researcher (e.g., personality). Controlled by independent replication.
3. Modeling effect: An error that is a function of the example set by the researcher. Controlled by independent replication.
4. Experimenter expectancy bias: An error that results when the researcher's hypothesis leads unintentionally to behavior towards the participants that increases the likelihood that the hypothesis will be confirmed.
| Research Area | Proportion of results that reached p < .05 in the predicted direction | Mean effect size (Cohen's d) |
|---|---|---|
| Laboratory interviews | .38 | 0.14 |
| Reaction time | .22 | 0.17 |
| Learning and ability | .29 | 0.54 |
| Person perception | .27 | 0.55 |
| Inkblot tests | .44 | 0.84 |
| Everyday situations | .40 | 0.88 |
| Psychophysical judgments | .43 | 1.05 |
| Animal learning | .73 | 1.73 |
| Median (n = 345 studies) | .39 | 0.70 |
-from Rosnow and Rosenthal (1997)
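For readers unfamiliar with the effect size column, the following is a minimal sketch of how Cohen's d can be computed for a single two-group comparison: the difference between the group means divided by the pooled standard deviation. The function name `cohens_d` and the scores are hypothetical, invented purely for illustration; they are not data from Rosnow and Rosenthal (1997), and the table above reports averages across many studies rather than a single comparison like this one.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_1, group_2):
    """Standardized mean difference using the pooled sample standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_var)


# Hypothetical scores (made up for this example): participants run by
# experimenters who expected high scores vs. experimenters who expected low scores.
high_expectancy = [6, 7, 8, 9, 10]
low_expectancy = [4, 5, 6, 7, 8]

d = cohens_d(high_expectancy, low_expectancy)
print(round(d, 2))
```

A d of 0 would mean the experimenter's expectation made no difference to the group means; larger positive values mean the groups differ in the expected direction by more pooled standard deviations.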
1. Increase the number of experimenters.
2. Monitor the behavior of the experimenters.
3. Maintain blind contact with the participants.
4. Minimize experimenter-participant contact.
5. Use expectancy control groups.
e.g., crossing the treatment condition with the expectancy (e.g. role playing):

| Tx Condition | Expectancy: Occurrence | Expectancy: Nonoccurrence |
|---|---|---|
| Occurrence | A | B |
| Nonoccurrence | C | D |
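One way to read the four cells is as a 2 x 2 factorial: comparing rows estimates the treatment effect, and comparing columns estimates the expectancy effect. The sketch below assumes each cell (A through D) holds the mean outcome for that group. The function name `expectancy_control_effects` and the cell means are hypothetical, chosen only to show the arithmetic, and are not results from any study.

```python
def expectancy_control_effects(a, b, c, d):
    """Estimate effects from the four cell means of an expectancy control design.

    a: treatment given,    experimenter expects the effect
    b: treatment given,    experimenter does not expect the effect
    c: treatment withheld, experimenter expects the effect
    d: treatment withheld, experimenter does not expect the effect
    """
    return {
        # Row comparison: treated cells vs. untreated cells.
        "treatment": (a + b) / 2 - (c + d) / 2,
        # Column comparison: expectancy-present cells vs. expectancy-absent cells.
        "expectancy": (a + c) / 2 - (b + d) / 2,
        # Does the expectancy effect differ between treated and untreated groups?
        "interaction": (a - b) - (c - d),
    }


# Hypothetical cell means, invented for illustration only.
effects = expectancy_control_effects(a=8.0, b=6.0, c=5.0, d=3.0)
print(effects)
```

With these made-up means, part of the apparent advantage of cell A over cell D comes from the experimenter's expectation rather than the treatment itself, which is the confound the design is meant to separate out.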
Campbell, D. T., & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand-McNally.
Rosnow, R. L., & Rosenthal, R. (1997). People studying people: Artifacts and ethics in behavioral research. New York: W. H. Freeman.
-03/16/98