Simpson's Paradox

Investigate Simpson's Paradox

Overview: Simpson's Paradox is the name given to the phenomenon in which relationships observed between groups reverse when the groups are divided into subgroups based on a lurking variable.
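
As a minimal sketch of the reversal, the Python snippet below uses made-up counts (they are not taken from any of the datasets listed on this page). The group labels A and B, the subgroup labels, and the helper names are all hypothetical. Group A has the higher percentage in each subgroup, yet group B has the higher percentage once the subgroups are combined, because most of A's observations fall in the low-percentage subgroup.

```python
# Hypothetical counts: (number in the outcome category, subgroup total).
# "sub1" and "sub2" are the two categories of the lurking variable.
counts = {
    "A": {"sub1": (90, 100), "sub2": (300, 1000)},
    "B": {"sub1": (800, 1000), "sub2": (20, 100)},
}


def pct(hits, total):
    """Percentage of observations in the outcome category."""
    return 100.0 * hits / total


def combined_pct(group):
    """Percentage for a group with its subgroups pooled together."""
    hits = sum(h for h, _ in counts[group].values())
    total = sum(t for _, t in counts[group].values())
    return pct(hits, total)


# Which group has the greater percentage within each subgroup?
subgroup_winner = {
    sub: max(counts, key=lambda g: pct(*counts[g][sub]))
    for sub in ("sub1", "sub2")
}

# Which group has the greater percentage overall?
overall_winner = max(counts, key=combined_pct)

# Simpson's Paradox: the overall winner loses in every subgroup.
paradox = all(w != overall_winner for w in subgroup_winner.values())

print(subgroup_winner, overall_winner, paradox)
```

With these counts, A leads in both subgroups (90% vs. 80% and 30% vs. 20%), while B leads overall (820 of 1100 vs. 390 of 1100).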



Data Table: The first row of the table shows the counts for the two comparison groups, the number in each group that falls in a specified category of the outcome variable, and the percentage in that category.

In the second and third rows of the table, the data for each comparison group is divided into subgroups based on the lurking variable.

The table values that are shown in blue are fixed and do not update as the plot changes.

The greater percentages are boxed (in orange if Simpson's Paradox is observed and in blue otherwise).

Plot: For each of the comparison groups, the plot shows the percentage of observations in the specified category of the outcome variable as a function of the percentage in a category of the lurking variable. Colored dots on the lines indicate the percentages in the lurking variable category for each of the comparison groups. The left extreme of each line indicates the percentage that would be in the specified outcome category if 0% of the observations were in the indicated lurking variable category, and the right extreme corresponds to 100% in the lurking variable category.
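
Each line in the plot can be read as a weighted average of the two subgroup percentages. The sketch below assumes exactly that straight-line relationship. The function name, the parameter names, and the percentages are hypothetical and chosen only for illustration.

```python
def outcome_pct(x, pct_outside, pct_inside):
    """Height of a group's line at x.

    x           -- percentage of the group in the lurking variable category
    pct_outside -- outcome percentage for observations outside that category
                   (the left extreme of the line, x = 0)
    pct_inside  -- outcome percentage for observations inside that category
                   (the right extreme of the line, x = 100)
    """
    return pct_outside + (pct_inside - pct_outside) * x / 100.0


# Hypothetical subgroup percentages for two comparison groups.
line_a = {"pct_outside": 30.0, "pct_inside": 90.0}
line_b = {"pct_outside": 20.0, "pct_inside": 80.0}

# The extremes of A's line are its two subgroup percentages.
left_a = outcome_pct(0, **line_a)
right_a = outcome_pct(100, **line_a)

# Dots: A has 10% in the lurking variable category, B has 90%.
dot_a = outcome_pct(10, **line_a)
dot_b = outcome_pct(90, **line_b)

print(left_a, right_a, dot_a, dot_b)
```

In this sketch A's line lies above B's line at every x, but A's dot sits near the low left end of its line and B's dot sits near the high right end of its line, so B's dot is higher.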

The Sliders: The sliders allow the user to adjust the percentage in the lurking variable category for each of the comparison groups and to see how this affects the observed relationships. As a slider is adjusted, a circle on the corresponding line (circle color the same as slider dot color) moves. Dashed lines from the circles to the axes highlight the relationship between the variable values. The data in the table is updated as the sliders are adjusted. The combined counts for the comparison groups and the percentages in the outcome category for the subgroups are fixed.
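
The table update described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the rounding of counts to whole numbers, and all of the numbers are hypothetical, and the applet's own calculation may differ. The combined count and the subgroup percentages are held fixed, and only the slider value changes.

```python
def table_row(total, slider, pct_inside, pct_outside):
    """Recompute one comparison group's table entries for a slider value.

    total       -- fixed combined count for the group
    slider      -- percentage of the group in the lurking variable category
    pct_inside  -- fixed outcome percentage inside that category
    pct_outside -- fixed outcome percentage outside that category
    """
    n_inside = round(total * slider / 100.0)  # assumption: whole counts
    n_outside = total - n_inside
    hits_inside = round(n_inside * pct_inside / 100.0)
    hits_outside = round(n_outside * pct_outside / 100.0)
    combined = 100.0 * (hits_inside + hits_outside) / total
    return {
        "n_inside": n_inside,
        "n_outside": n_outside,
        "hits_inside": hits_inside,
        "hits_outside": hits_outside,
        "combined_pct": combined,
    }


def paradox_observed(row_a, row_b, rates_a, rates_b):
    """True when one group leads in both subgroups but trails overall."""
    a_leads = rates_a[0] > rates_b[0] and rates_a[1] > rates_b[1]
    b_leads = rates_b[0] > rates_a[0] and rates_b[1] > rates_a[1]
    a_trails = row_a["combined_pct"] < row_b["combined_pct"]
    b_trails = row_b["combined_pct"] < row_a["combined_pct"]
    return (a_leads and a_trails) or (b_leads and b_trails)


# Hypothetical fixed values: (pct_inside, pct_outside) per group.
rates_a = (90.0, 30.0)
rates_b = (80.0, 20.0)

# Sliders far apart: A at 10%, B at 90%.
apart_a = table_row(1000, 10, *rates_a)
apart_b = table_row(1000, 90, *rates_b)
paradox_apart = paradox_observed(apart_a, apart_b, rates_a, rates_b)

# Sliders at the same value: both at 50%.
same_a = table_row(1000, 50, *rates_a)
same_b = table_row(1000, 50, *rates_b)
paradox_same = paradox_observed(same_a, same_b, rates_a, rates_b)

print(paradox_apart, paradox_same)
```

In this sketch the reversal appears when the two sliders are far apart and disappears when both groups have the same percentage in the lurking variable category.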

Points to Ponder:


Baker-Kramer Data:
Source: Wainer, H. (2002) The BK plot: Making Simpson's Paradox Clear to the Masses. Chance 15(3).

Berkeley Admissions Data:
Source: Hammel, W., Bickel, P., and O'Connell, J.W. (1975) Is There a Sex Bias in Graduate Admissions? Science 187.

Florida Death Penalty Data:
Source: Radelet, M. L. and Pierce, G. L. (1991). Florida Law Review.


Airlines Data:
Source: Moore, McCabe, Craig.

1964 Civil Rights Act Data:
Source: Simpson's Paradox. Wikipedia.

20 Year Smoker Survival:
Source: Vanderpump, M.P.J., Tunbridge, W.M.G., French, J.M., Appleton, D., Bates, D., Clark, F., Grimley Evans, J., Rodgers, H., Tunbridge, F., and Young, E.T. (1996) The Development of Ischemic Heart Disease in Relation to Autoimmune Thyroid Disease in a 20-Year Follow-up Study of an English Community. Thyroid 6(3):155-160.

House pet data:
Source: Schneiter (2012) Hypothetical study data.