Web-based Statistics:
Lecture 12
Friday, May 26, 2000

An Introduction to XploRe

http://www.XploRe-stat.de


XploRe is a combination of classical and modern statistical procedures. It is the basis for statistical analysis, research, and teaching. Its purpose lies in the exploration and analysis of data, as well as in the development of new techniques.

Now we want to see how XploRe can be used as a statistical package to perform statistical analyses.

Net Based Statistics exists, but it is inconvenient because the server takes a long time to respond. So we give it up and use the older version of XploRe.

First, we must download the XploRe package from the web.

  1. Click Try XploRe on the homepage and get the download information.
  2. Fill out the form on the page.
  3. Click the Submit All Given Information & Download XploRe button.
  4. Choose all files to download.
  5. Then install it according to the directions.
  6. After the files are installed, click on the XploRe icon, and three windows open up on the screen.

These windows are:

Read Data

We can open the editor-window in XploRe, and load the data from a file.

(1) Use READ to read the Numerical Data.

Now we will show you some simple examples using the data "pullover" within the XploRe datasets.

Example 1. How to read numerical data only (using the built-in dataset "pullover")

Solution: copy the following program into the editor-window:

          
x = read("pullover")   // "pullover" is the file name of the data
x                      // lists the data in the output-window

(2) Use READM to read the Mixed Data that contains Numerical Data and Text Data.

Example 2. How to read mixed data (using a user-entered dataset)

Open the following file:

http://www.math.usu.edu/~vukasino/teaching/spring2000/complab/student_data1.prn

Copy the data into a new editor-window in XploRe and delete the first line, so that only the data matrix remains. Then save the data as user.dat.

Let's open another new editor-window by clicking the program again. Since this data set includes both Numerical and Text Data, we use READM to read it.

Then copy the following lines into a new editor area:

library("XploRe")
x = readm("user")
x

Click Execute to get the output. The output-window shows 'x.type', 'x.double' and 'x.text'. The original data set has been divided into two parts: the numerical data are stored as double type, and the character data as text type. 'x.double' has 5 variables, in this order: nr, age, sibling, height, weight.
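To make the idea of dividing a mixed data set into a numerical part and a text part concrete, here is a minimal sketch in Python (not XploRe). It is only an illustration of the principle, not a description of how READM works internally. The rows and the helper names is_number and split_mixed are made up for this example; they are not the contents of user.dat.

```python
# Hypothetical rows in the spirit of a mixed data file.
# The values are invented for illustration only.
rows = [
    "1 21 2 170 65 Anna",
    "2 23 0 182 80 Ben",
]

def is_number(token):
    """Return True if the token can be parsed as a floating-point number."""
    try:
        float(token)
        return True
    except ValueError:
        return False

def split_mixed(lines):
    """Split whitespace-separated rows into a numeric part and a text part."""
    numeric, text = [], []
    for line in lines:
        tokens = line.split()
        numeric.append([float(t) for t in tokens if is_number(t)])
        text.append([t for t in tokens if not is_number(t)])
    return numeric, text

numeric_part, text_part = split_mixed(rows)
print(numeric_part)  # the part that would play the role of x.double
print(text_part)     # the part that would play the role of x.text
```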

Compute statistics

Let's pick the last four numerical variables (age, sibling, height, weight) as the matrix y and compute statistics for them using the functions mean, median and var. Copy the following lines into the editor-window.

y = x.double[ , 2:5]
library("stats")
mean(y)
median(y)
var(y)

We can also use the function summarize to obtain the statistical summary.

library("stats")
summarize(y, "age"|"sibling"|"height"|"weight")
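For readers who want to check such results by hand, the following Python sketch (not XploRe) computes the mean, median and variance of each column of a small matrix. The numbers are made up and are not the course data. The variance here is the sample variance with divisor n-1; whether the XploRe function var uses the same divisor is an assumption that should be checked against the XploRe help.

```python
import statistics

# Made-up rows (age, sibling, height, weight); not the course data.
data = [
    [21, 2, 170, 65],
    [23, 0, 182, 80],
    [22, 1, 165, 58],
    [30, 3, 175, 72],
]
names = ["age", "sibling", "height", "weight"]

# Transpose the rows so that each tuple holds one variable (one column).
columns = list(zip(*data))

summary = {}
for name, col in zip(names, columns):
    summary[name] = {
        "mean": statistics.mean(col),
        "median": statistics.median(col),
        "variance": statistics.variance(col),  # sample variance, divisor n-1
    }

for name in names:
    print(name, summary[name])
```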

Plot histogram and boxplot

Now let's plot a histogram and a boxplot of the fourth column, height, using the functions plothist and plotbox. Note that plothist and plotbox can only plot data in a single column. Copy the following short program into a new editor-window.

library("XploRe")
x = readm("user")
z = x.double[ , 4]
library("plot")
plothist(z)
plotbox(z)

Then click Execute.

Exercise 1. Write a program to plot the histogram and the boxplot for age by yourselves.

Simple Linear Regression

Now let's fit the linear regression of weight on age. The program first shows the ANOVA table, and then plots the scatterplot and the regression line. Copy the following program into a new editor-window.

library("XploRe")
x = readm("user")
y = x.double[ , 2|5]
y1 = y[ , 1]
y2 = y[ , 2]
library("stats")
{b, bse, bstan, bpval} = linreg(y1, y2)
library("plot")
plot(y)
regy=grlinreg(y)
plot(y, regy)
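The regression line drawn above is a least-squares line. As a rough illustration of what such a fit computes, here is a minimal Python sketch (not XploRe) of simple linear regression with one predictor. The function name fit_line and the data values are assumptions made for this example; the sketch does not reproduce the ANOVA table or the other output of linreg.

```python
def fit_line(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up values standing in for age (predictor) and weight (response).
age = [21, 23, 22, 30]
weight = [65, 80, 58, 72]

intercept, slope = fit_line(age, weight)
print("intercept:", intercept)
print("slope:", slope)
```

The predictor goes first and the response second, matching the order of y1 (age) and y2 (weight) in the program above.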

Exercise 2. Write a program to obtain the ANOVA table and plot the regression line of height on age.

Basic features of XploRe

  1. XploRe is developed for Windows 95 / NT, Solaris 2, HP-UX, SGI, and Linux 1.2. The Java Client/Server engines are available for these platforms as well. XploRe may be downloaded for a free trial period.

  2. The user writes procedures or functions, as in Pascal or C/C++. In contrast to these languages, variables do not need to be declared. Furthermore, variables can be collected in list structures, so that the common information of a data set can be held in a single data object.

  3. XploRe is case sensitive.

Homework:

  1. Write a program to plot the histogram and the boxplot for sibling and for weight, respectively.

  2. Write a program to perform the regression analysis and plot the regression line of weight on height.