Project 03: Data Analysis and Normality

Upload two files to Moodle:
(1) Data due: Monday April 20th at 10:00am
Excel file containing the data you use

(2) Write-up due: Tuesday, 5.May at 1pm
No late write-ups will be accepted

If you work with a partner, make sure both your names are in the file.
40 points


Many naturally occurring phenomena, including physical characteristics of humans (such as height and weight), turn out to follow normal distributions. In fact, the normal distribution occurs so frequently that it is often referred to as the "cornerstone of modern statistics," and it is a useful model for many things. Techniques for determining whether a data set is normally distributed were presented in class on Monday, 24.Nov. You might also find the linked notes on determining whether data is normally distributed helpful.

In this project, you will choose some data, describe it mathematically (including the 5-number summary, mean, and skewness) and with some graphs, and analyze whether it is normally distributed. In your write-up, where you will detail your data and analysis, you should write as though you were explaining them to someone unfamiliar with them. It might help to write as though your parent, friend, or roommate is the reader.
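If you want to double-check the numbers you get from Excel, the short Python sketch below computes the same kind of summary. It is only an illustration, not the required method: the function names `describe` and `share_within` and the sample heights are made up for this example, quartile conventions differ slightly between textbooks and software (so your values may not match exactly), and the "68-95" comparison at the end is just one common informal check, which may or may not be among the techniques presented in class.

```python
import statistics


def describe(values):
    """Return the five-number summary, mean, standard deviation, and skewness."""
    data = sorted(values)
    n = len(data)
    mean = statistics.mean(data)
    # Quartiles by linear interpolation; other conventions give slightly
    # different answers for small data sets.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # Skewness as the average cubed z-score (population form).
    # Roughly 0 for symmetric data, positive when the right tail is longer.
    pop_sd = statistics.pstdev(data)
    skewness = sum(((x - mean) / pop_sd) ** 3 for x in data) / n
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "mean": mean,
        "sd": statistics.stdev(data),  # sample standard deviation
        "skewness": skewness,
    }


def share_within(values, k):
    """Fraction of values within k sample standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return sum(1 for x in values if abs(x - mean) <= k * sd) / len(values)


# Hypothetical heights in inches, invented purely for illustration.
heights = [61, 63, 64, 65, 66, 66, 67, 68, 68, 69, 70, 72]

for name, value in describe(heights).items():
    print(f"{name}: {value:.3f}")

# For normally distributed data, about 68% of values fall within 1 standard
# deviation of the mean and about 95% within 2.
print(f"within 1 sd: {share_within(heights, 1):.0%}")
print(f"within 2 sd: {share_within(heights, 2):.0%}")
```

A mean close to the median, a skewness near 0, and shares near 68% and 95% are consistent with a normal distribution, but they do not prove it; your graphs should support the same conclusion.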

You should find some published, existing data. Possible sources include: almanacs, magazine and journal articles (Consumer Reports is a good source), textbooks, athletic teams, newspapers, reference materials, campus organizations, or professors with experimental data. The library can help you find many excellent data sources. If you use the web, be sure to evaluate the website you used, following the guidelines discussed in class with Julie Woodling, the instructional librarian. Do not use any data from our textbook or any data that we have already discussed in class. The website http://it.stlawu.edu/~rlock/datasurf.html might be a good place to start. Most importantly, you should choose data that interests you.

The data you use should have at least 30 cases, at least one categorical variable, and at least one quantitative variable. The categorical variable should divide your data into two or more groups. For example, if you have data on the fifty states, you might include a variable to keep track of geographic region (East, West, North, South). It may be useful to record the name of each state, but the name would NOT work as a categorical variable, because every state would form its own group. Upload your data by Mon, 08.Dec at 10:00am.
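The requirements above can also be checked mechanically before you upload. The following Python sketch is one possible way to do so, under stated assumptions: the helper names `group_by_category` and `check_requirements`, the column names, and the generated rows are all hypothetical and stand in for your own data.

```python
from collections import defaultdict


def group_by_category(rows, categorical, quantitative):
    """Collect the quantitative values for each level of the categorical variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[categorical]].append(float(row[quantitative]))
    return dict(groups)


def check_requirements(rows, categorical, quantitative, min_cases=30):
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if len(rows) < min_cases:
        problems.append(f"only {len(rows)} cases; need at least {min_cases}")
    groups = group_by_category(rows, categorical, quantitative)
    if len(groups) < 2:
        problems.append("the categorical variable has fewer than two groups")
    if len(rows) > 1 and len(groups) == len(rows):
        # A label that is unique to each case (such as a state name)
        # identifies cases but does not divide them into groups.
        problems.append("every case has its own category, so it is an identifier")
    return problems


# Hypothetical rows, generated only to illustrate the expected shape of the data.
regions = ["East", "West", "North", "South"]
rows = [
    {"state": f"State {i}", "region": regions[i % 4], "value": 10 + i}
    for i in range(32)
]

print(check_requirements(rows, "region", "value"))
print(check_requirements(rows, "state", "value"))
for region, values in group_by_category(rows, "region", "value").items():
    print(region, len(values))
```

Using "region" as the categorical variable passes the checks, while using "state" is flagged, which mirrors the fifty-states example above.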

Some of the metrics used to grade your paper include: