Project 03: Normal Distributions and Sampling

due: "Last Day of Classes," Tues, 29.Apr at 5:00pm
upload your file in Moodle, with the name ma103Project03YOURNAME
(saving in MS Word should automatically give the file the proper extension of .doc or .docx)
If you work with a partner, make sure both your names are in the file.
20 points


Many naturally occurring phenomena, including physical characteristics of humans (such as height and weight), turn out to follow normal distributions. In fact, the normal distribution occurs so frequently that it is often referred to as the "cornerstone of modern statistics," and it is a useful model for many things.

In this project, you will first analyze a given data set to determine (or confirm) that it is normally distributed. Then you will look more closely at a smaller sample of similar data and assess whether this sample is representative of the larger data set.

You may use the obituary data below, or you are welcome (and encouraged) to find data that interests you. For instance, most averages in baseball statistics, such as batting average, ERA, and fielding, are approximately normally distributed. If you choose to use your own data, make sure to get approval from the instructor by the end of class on Friday, April 18.
You will earn 3 extra-credit points for finding your own data. You would be expected to perform an analysis similar to what you are asked to do below, as well as an analysis of the validity of your data source.

If you are interested in analyzing the age data, you should use Project03AgeDataS08sec01.xls. While ages are not usually normally distributed (as opposed to the characteristics given above), some portion of age data might be. So, do a careful analysis of this data to determine whether it is normally distributed. Directions and a detailed example were discussed in class on Friday, 11.Apr. You should produce at least one bar chart/histogram as part of this analysis.
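One common way to check for normality, which may differ from the method shown in class, is to compare the data against the 68-95-99.7 rule and to look at the shape of a histogram. The Python sketch below is only an illustration under stated assumptions: the function names are hypothetical, the ages are invented placeholders rather than values from the course file, and the population standard deviation is used (your class may use the sample version instead).

```python
from collections import Counter
from statistics import mean, pstdev


def empirical_rule_shares(values):
    """Return the fraction of values within 1, 2, and 3 standard deviations of the mean."""
    m = mean(values)
    s = pstdev(values)  # population standard deviation (an assumption)
    shares = {}
    for k in (1, 2, 3):
        inside = sum(1 for v in values if abs(v - m) <= k * s)
        shares[k] = inside / len(values)
    return shares


def decade_histogram(values):
    """Count how many values fall in each 10-year bin (60-69, 70-79, ...)."""
    counts = Counter((v // 10) * 10 for v in values)
    return {decade: counts[decade] for decade in sorted(counts)}


# Placeholder ages invented for illustration; replace with the real data.
ages = [62, 68, 71, 74, 75, 77, 78, 79, 80, 81,
        82, 83, 84, 85, 86, 88, 90, 92, 95, 99]

# Roughly normal data should give shares near 68%, 95%, and 99.7%.
for k, share in empirical_rule_shares(ages).items():
    print(f"within {k} SD of the mean: {share:.0%}")

# A crude text histogram; a normal shape is roughly symmetric with one peak.
for decade, count in decade_histogram(ages).items():
    print(f"{decade}-{decade + 9}: {'#' * count}")
```

Shares far from 68%, 95%, and 99.7%, or a histogram that is strongly skewed or has several peaks, would suggest the data (or that portion of it) is not well modeled by a normal distribution.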

Finally, find 20 obituaries (Prof. Kruse has saved over 100 from the local paper, and Prof. Roth has some of them) and record:
(1) the name of the deceased
(2) their age at death
(3) their gender
(4) two other characteristics that interest you, such as highest level of education attained, military service, etc.

You may also search online for obituaries; just provide the URL(s) you used.

Analyze this smaller sample, drawing on our class discussion of Sampling in Topic 20, and conclude whether it could be representative of the larger sample of 200+ ages. Include the data you used (with names) in your write-up, along with some type of graph or chart for each characteristic you tracked.
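One possible way to judge representativeness, offered as a sketch rather than as the method from Topic 20, is to ask how many standard errors the sample mean lies from the mean of the larger data set. The function name and all numbers below are hypothetical placeholders; substitute the mean and standard deviation you computed from the larger data set and the ages from your own 20 obituaries.

```python
from math import sqrt
from statistics import mean


def sample_mean_z(sample, large_mean, large_sd):
    """z-score of the sample mean, treating the larger data set as the population."""
    standard_error = large_sd / sqrt(len(sample))
    return (mean(sample) - large_mean) / standard_error


# Placeholder values invented for illustration only.
large_mean = 78.0   # hypothetical mean age in the larger data set
large_sd = 12.0     # hypothetical standard deviation of the larger data set
sample_ages = [62, 68, 71, 74, 75, 77, 78, 79, 80, 81,
               82, 83, 84, 85, 86, 88, 90, 92, 95, 99]

z = sample_mean_z(sample_ages, large_mean, large_sd)
print(f"sample mean = {mean(sample_ages):.1f}, z = {z:.2f}")
```

A z-score between about -2 and 2 is consistent with the sample having come from the larger data set; a value well outside that range suggests the sample may not be representative. A similar mean alone does not settle the question, so also compare the spread and the shape of the two distributions.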

 

Some of the metrics used to grade your paper will include: