Overview: Simpson's Paradox is the phenomenon in which a relationship observed between groups reverses when the groups are divided into subgroups according to a lurking variable.
Illustrated example
Components:
Data Table: The data are displayed in the table at the top of the applet.
The greater percentages are boxed (in orange if Simpson's Paradox is observed and in green otherwise).
Plot: For each of the comparison groups, the plot shows the percentage of observations in the specified category of the outcome variable as a function of the percentage in a category of the lurking variable. Colored dots on the lines indicate the percentages in the lurking variable category for each of the comparison groups. The left extreme of each line indicates the percentage that would be in the specified outcome category if 0% of the observations were in the indicated lurking variable category, and the right extreme indicates the percentage if 100% were in that category.
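Because the subgroup percentages are held fixed, each line in the plot can be read as a weighted average of its two extremes. The following is a minimal sketch of that relationship, not the applet's actual code; the function name, parameter names, and the percentages used are hypothetical and chosen only for illustration.

```python
def combined_percent(pct_in_category, outcome_pct_outside, outcome_pct_inside):
    """Percent in the outcome category for one comparison group.

    pct_in_category: percent of the group's observations that fall in the
        lurking-variable category (0 to 100).
    outcome_pct_outside: outcome percent among observations outside that
        category (the left extreme of the line).
    outcome_pct_inside: outcome percent among observations inside that
        category (the right extreme of the line).
    """
    share = pct_in_category / 100.0
    # Weighted average of the two subgroup percentages.
    return (1.0 - share) * outcome_pct_outside + share * outcome_pct_inside


# Hypothetical subgroup percentages (assumed, not taken from any data set).
left, right = 10.0, 30.0

print(combined_percent(0, left, right))    # 10.0 (left extreme)
print(combined_percent(100, left, right))  # 30.0 (right extreme)
print(combined_percent(25, left, right))   # 15.0 (a point along the line)
```

Moving along the line from 0% to 100% simply shifts the weight from one subgroup's percentage to the other's.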
The Sliders: The sliders allow the user to adjust the percentage in the lurking variable category for each of the comparison groups and to see how this affects the observed relationships. As a slider is adjusted, a circle on the corresponding line (the same color as the slider's dot) moves. Dashed lines from the circles to the axes highlight the relationship between the variable values. The data in the table are updated as the sliders are adjusted. The combined counts for the comparison groups and the percentages in the outcome category for the subgroups remain fixed.
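One way to experiment with the slider behavior outside the applet is sketched below. It assumes the paradox is flagged when one comparison group has the higher percentage in both subgroups but the lower combined percentage; that rule, the function names, and all of the numbers are assumptions made for illustration, not the applet's own logic or data.

```python
def combined_percent(slider_pct, pct_outside, pct_inside):
    """Combined outcome percent when slider_pct percent of a group is in the category."""
    share = slider_pct / 100.0
    return (1.0 - share) * pct_outside + share * pct_inside


def paradox_observed(slider_a, slider_b, rates_a, rates_b):
    """Return True when the subgroup ordering is reversed in the combined data.

    rates_a and rates_b are (pct_outside, pct_inside) pairs: the fixed outcome
    percentages for the two subgroups of each comparison group.
    """
    combined_a = combined_percent(slider_a, *rates_a)
    combined_b = combined_percent(slider_b, *rates_b)
    a_leads_subgroups = rates_a[0] > rates_b[0] and rates_a[1] > rates_b[1]
    b_leads_subgroups = rates_b[0] > rates_a[0] and rates_b[1] > rates_a[1]
    return (a_leads_subgroups and combined_a < combined_b) or (
        b_leads_subgroups and combined_b < combined_a
    )


# Hypothetical fixed subgroup percentages: group A is higher in both subgroups.
rates_a = (12.0, 32.0)
rates_b = (10.0, 30.0)

print(paradox_observed(50, 50, rates_a, rates_b))  # False
print(paradox_observed(10, 80, rates_a, rates_b))  # True
```

Only the slider values change between the two calls; the subgroup percentages stay fixed, mirroring the constraint described above.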
Points to Ponder:
Describe what you see in the table when Simpson's Paradox is observed.
Describe what you see in the plot when Simpson's Paradox is observed.
When adjusting the sliders, at what point does Simpson's Paradox appear?
What do the conditions you observe in the applet tell you about what's going on in the data when Simpson’s Paradox occurs?
Comparison groups: Alaska and America West Airlines
Outcome variable: Percent of flights delayed
Lurking variable: Flight origination
1964 Civil Rights Act Data:
Source: Simpson's Paradox. Wikipedia.
Comparison groups: Democrats and Republicans
Outcome variable: Percent in favor
Lurking variable: Origin of representative (northern v. southern)
20-Year Smoker Survival:
Source: Vanderpump, M.P.J., Tunbridge, W.M.G., French, J.M., Appleton, D., Bates, D., Clark, F., Grimley Evans, J., Rodgers, H., Tunbridge, F., and Young, E.T. (1996). The Development of Ischemic Heart Disease in Relation to Autoimmune Thyroid Disease in a 20-Year Follow-up Study of an English Community. Thyroid 6(3):155-160.
Comparison groups: Smokers and non-smokers
Outcome variable: Percent alive at 20-year follow-up
Lurking variable: Age of subject (under 65 v. 65 and older)
House pet data:
Source: Schneiter (2012) Hypothetical study data.