Stat 6560 - Statistical Graphics and Visualization

3 Credits
MW 3:30-4:45
FAV 264

Instructor:
Michael Minnotte
Office: Lund 201-C
Phone: 797-2844
E-mail: minnotte@math.usu.edu
Office Hours: TH 10:00 - 11:30 or by appointment

Texts:
Tufte, Edward R., (1983), The Visual Display of Quantitative Information, Cheshire, CT, Graphics Press.
Cleveland, William S., (1993), Visualizing Data, Summit, NJ, Hobart Press.
S-plus scripts for figures
Data in S-plus format


Statistical graphics and data visualization are critical elements of modern data analysis and presentation. From initial exploration of a data set to the final presentation of results to the end user, graphics play a vital role in shaping our understanding of our data. Through proper use of graphics, we can make critical discoveries, and communicate them clearly. Conversely, poor use or misuse of graphics can seriously mislead (by accident or design).

In this course, we will start with presentation graphics, including discussion of both tools and principles which lead to clear communication and those which serve only to confuse or mislead. We will spend most of the semester in exploratory graphics and data analysis, broken down largely by the dimension of the applicable data. One- and two-dimensional datasets require and allow far different methods than those of more than three dimensions. Categorical and regression data call for their own specialized methods.

Even more than most aspects of statistics, graphics and visualization involve art as well as science. In most cases, there are many reasonable approaches. But an understanding of the options available and the underlying principles will lead to successful analysis and presentation.


Prerequisites:
I will not enforce any specific prerequisites. A course or two in traditional statistical analysis would be very helpful, as would prior computing experience, but neither is required. Previous experience with S-plus (a statistical computing package) will give you a head start, but again is not necessary.

Other Sources:
Beyond the required texts, material will be drawn from a number of sources. Some additional useful references are:

Cleveland, William S., (1994), The Elements of Graphing Data, Summit, NJ, Hobart Press.
Cleveland, William S. and McGill, Marylyn E., eds., (1988), Dynamic Graphics for Statistics, Belmont, CA, Wadsworth & Brooks/Cole.
du Toit, S.H.C., Steyn, A.G.W., and Stumpf, R.H., (1986), Graphical Exploratory Data Analysis, New York, Springer-Verlag.
Henry, Gary T., (1995), Graphing Data: Techniques for Display and Analysis, Thousand Oaks, CA, SAGE Publications.
Tufte, Edward R., (1990), Envisioning Information, Cheshire, CT, Graphics Press.
Tufte, Edward R., (1997), Visual Explanations: Images and Quantities, Evidence and Narrative, Cheshire, CT, Graphics Press.
Tukey, John W., (1977), Exploratory Data Analysis, Reading, MA, Addison-Wesley.
Wainer, Howard, (1997), Visual Revelations: Graphical Tales of Fate and Deception from Napoleon Bonaparte to Ross Perot, New York, Springer-Verlag.
Wallgren, A., Wallgren, B., Persson, R., Jorner, U., and Haaland, J., (1996), Graphing Statistics and Data: Creating Better Charts, Newbury Park, CA, SAGE Publications.


There will be a variety of assignments throughout the semester. Each assignment will be scored out of a stated point value (typically 20-100 points). Your final grade will be determined by the sum of your points on all assignments. Assignments will involve some combination of analysis of existing graphics, creation of your own, computer work (mostly in S-plus, but some other packages/languages as well), and short oral presentations. The value of each assignment should be roughly proportional to its importance and the amount of work involved.


Assignments

Assignment 1 (20 points, due September 16, 1998)
Assignment 2 (30 points, due September 23, 1998)
Assignment 3 (30 points, due September 30, 1998) - data3.1, data3.2, data3.3.
Assignment 4 (30 points, due October 14, 1998) - iris data.
Assignment 5 (20 points, due November 2, 1998)
Assignment 6 (20 points, due November 23, 1998)
Final Project 1 (Written, 50 points; Oral, 25 points; due November 30, 1998)
Final Project 2 (Oral, 25 points, due December 7, 1998; Written, 100 points, due December 17, 1998)
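
Since the final grade is determined by the sum of points, the values listed above can be totaled as in the following minimal sketch. It assumes every listed item counts toward the total. The names POINTS, total_points, and percent_earned are hypothetical, chosen only for illustration, and no letter-grade cutoffs are computed because none are stated here.

```python
# Point values copied from the assignment list above.
# All names here are hypothetical, introduced only for this sketch.
POINTS = {
    "Assignment 1": 20,
    "Assignment 2": 30,
    "Assignment 3": 30,
    "Assignment 4": 30,
    "Assignment 5": 20,
    "Assignment 6": 20,
    "Final Project 1 (written)": 50,
    "Final Project 1 (oral)": 25,
    "Final Project 2 (oral)": 25,
    "Final Project 2 (written)": 100,
}


def total_points(points):
    """Return the total number of points available."""
    return sum(points.values())


def percent_earned(earned, points=POINTS):
    """Return earned points as a percentage of the points available."""
    return 100.0 * earned / total_points(points)


if __name__ == "__main__":
    # Assumes all listed items count toward the final grade.
    print("Points available:", total_points(POINTS))
```

Under that assumption, the two final projects together carry 200 of the available points, more than the six numbered assignments combined.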


Software

Univariate empirical cdf and kernel density estimate S-plus functions
Mode tree S-plus functions
Mode forest S-plus functions
SiZer S-plus functions
Multiple kde's broken down by a factor S-plus function
Pairwise quantile-quantile plot S-plus function
Tukey mean-difference plot S-plus function
Spread-location plot S-plus function
Multiple scatterplots broken down by a factor S-plus function
Jittering S-plus function
Bivariate kde S-plus function
Local polynomial regression S-plus functions
Multiple perspective trivariate cloud S-plus function
Trivariate local polynomial regression S-plus functions
Andrews curve S-plus function
Color histogram / data image S-plus functions


If a student has a disability that will likely require some accommodation by the instructor, the student must contact the instructor and document the disability through the Disability Resource Center, preferably during the first week of the course. Any requests for special considerations relating to attendance, pedagogy, taking of examinations, etc. must be discussed with and approved by the instructor. In cooperation with the Disability Resource Center, course materials can be provided in alternative formats -- large print, audio, diskette or Braille.

Disclaimer: This is a new course, and the instructor is pretty much making it up as he goes. (The course, that is. Not the material.) He reserves the right to alter anything about this course, pretty much on whim.


Last updated: November 9, 1998