Stat 6510/6810: Modern Nonparametric Statistics
Fall, 2005
TR 1:30-2:45
Geology 405


Instructor: Michael Minnotte
Office: Lund 201-C
Phone: 797-2844 (office)
E-mail: Mike dot Minnotte at usu dot edu
Office Hours: TR 9:30 - 10:20, W 10:30 - 11:20, or by appointment.



Announcements

Reminder: No class Tuesday, November 22!

The schedule for the final project presentations has been determined. You may trade with another student if you both agree, but otherwise, the order will be as follows:

I'll put announcements here. Please check regularly, especially if you have to miss class for any reason.


Assignments


R Examples



Traditional statistics has focused on parametric models, which make strong assumptions about the distributions of the random variables under study and the relationships between them. Examples include the normality assumptions behind t- and F-tests, and the assumed form of the relationship in simple linear regression. In many cases, these assumptions are made not because they are inherently plausible, but because they lead to the only known solutions. Nonparametric statistics works to minimize these assumptions. Typically, a small amount of power or accuracy is sacrificed when the parametric assumptions happen to be true, in exchange for much greater applicability under most other conditions. We will consider theoretical, applied, and implementation issues in three broad classes of methods.

  1. Testing procedures which use ranks or the empirical cumulative distribution function to make distributional comparisons.
  2. Resampling methods such as the bootstrap and jackknife, which allow estimation of features such as bias, standard error, and confidence intervals largely independently of the statistic measured and the population distribution.
  3. Smoothing methods such as histograms, kernel density estimates, and local polynomial regression, which allow estimation of probability density functions and regression functions without limitation to specific mathematical models.
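As a rough illustration of the second class of methods, here is a minimal sketch of a bootstrap standard error estimate. It is written in Python purely for illustration (course work itself will be done in R). The data values and the function name `bootstrap_se` are made up for this example and are not taken from any course material.

```python
import random
import statistics


def bootstrap_se(data, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by the nonparametric bootstrap.

    `bootstrap_se` is a hypothetical helper for this sketch, not a library function.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        # Draw n observations from the data with replacement.
        resample = rng.choices(data, k=n)
        replicates.append(statistic(resample))
    # The spread of the replicates estimates the standard error.
    return statistics.stdev(replicates)


# Made-up sample, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 6.1, 3.3, 4.9]

se_mean = bootstrap_se(sample, statistics.mean)
se_median = bootstrap_se(sample, statistics.median)

print("bootstrap SE of the mean:  ", round(se_mean, 3))
print("bootstrap SE of the median:", round(se_median, 3))
```

The same function handles the mean and the median without modification, which is the sense in which the bootstrap works largely independently of the statistic. For the mean, the result should land close to the usual formula-based standard error, while the median has no comparably simple formula.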

Prerequisites: A good foundation in probability through expectation and variance of continuous random variables, such as Math 5710 or the equivalent, will make things much easier. You may take this concurrently with 5710, but come talk to me if this is necessary. You should also have some prior statistics experience, at least at the introductory level.

Assignments: There will be a brief homework assignment every week. In addition, there will be a number of computer projects during the semester (approximately one every other week). These will generally require some computer work (to be begun in class, and completed later if necessary) and a write-up, and will be worth more than the weekly assignments. Finally, each student will also be expected to produce a written and oral report on a related topic of his or her choosing during the last two weeks of class. Assignments will be handed out in class and posted to the web site.

Grades: Each assignment will carry a point value. Most homework assignments will be worth 20 points, and the projects and reports will generally be worth around 50 points. Your final grade will be determined by your total points across all assignments.


Software: We will use the R computer package. R is free software, released under the GNU General Public License, that implements the S language and is very similar to the commercial S-Plus package. It is available for free download (Windows and Unix) from the Comprehensive R Archive Network (below). I will spend some time in class going over the use of R, and you will have the opportunity to do some work in class to gain experience while I can help you.

R Sites
The Comprehensive R Archive Network
Windows R Setup Executable Download - click on rwXXXX.exe, where XXXX gives the version number
R Frequently Asked Questions (FAQ) List
R for Beginners (58 page pdf file)
An Introduction to R (100 page pdf file)
Data Analysis and Graphics Using R -- An Introduction (112 page pdf file)



Texts:

These texts provide good surveys of resampling and smoothing methods, respectively. Be aware that a lot of material in class will come from additional sources listed below, especially those in bold.

Other Sources: Beyond the required texts, additional material will be drawn from a number of sources. Some additional useful references are:

  1. Nonparametric Testing
  2. Resampling
  3. Smoothing


Disability Statement: If a student has a disability that will likely require some accommodation by the instructor, the student must contact the instructor and document the disability through the Disability Resource Center, preferably during the first week of the course. Any requests for special considerations relating to attendance, pedagogy, taking of examinations, etc. must be discussed with and approved by the instructor. In cooperation with the Disability Resource Center, course materials can be provided in alternative formats: large print, audio, diskette, or Braille.

Late Adds: The last day to add this class is September 19. Attending this class beyond that date without being officially registered will not be approved by the Dean's Office.

Disclaimer: The instructor reserves the right to alter anything about this course, pretty much on whim (but he probably won't).


Last updated: August 29, 2005