My son has a statistics test tomorrow and is stuck on variances and deviations. Does anyone know a website that might help him? Thanks! He's at his frustration level............42
What level of course is your son taking? Here is a useful site which contains information about a lot of topics.
http://www.statsoftinc.com/textbook/stathome.html
Here is an elementary discussion.
http://math.about.com/gi/dynamic/off...criptive2.html
If your son is just beginning his study, and wants to understand what the variance and standard deviation are all about, that's easy.
Here is a "population" of five numbers: 9, 10, 10, 10, 11. The average (arithmetic mean) is 10. So far so good.
Here is another population: 0, 5, 10, 15, 20. The average of this population is 10, too.
So how are the two populations different, if they both have the same average? Well, in the first population, all the numbers are close together. They are all very close to the mean. In the second population they are spread apart. Some of them are a long way from the average. The variance measures, "how far from the mean are these numbers?"
So for each number, we look and see how far from average it is. The numbers 9, 10, 10, 10, 11 are, respectively, -1 unit, 0 units, 0 units, 0 units, and 1 unit away from the average of 10. Because we don't really care about the + or - sign, we square these differences, then take the average. That's the variance:
variance = [(-1)^2 + 0^2 + 0^2 + 0^2 + 1^2] / 5 = (1+0+0+0+1)/5 = 0.4
Now to make up for having squared these numbers, we have to take the square root to get the units to come out right (inches instead of square inches, etc.) This gives the standard deviation:
Standard deviation (sigma) = square root (0.4) = 0.63.
We say informally that the typical number in this list is about 10 +/- 0.63.
So the formula is
st. dev. (denoted by "sigma") = Square root [((x1-mean)^2 + (x2 - mean)^2 + ... + (xn-mean)^2) / N]
where N is the size of the population, and the variance is the same thing without the square root.
For the second population, the numbers are 0, 5, 10, 15, 20. Comparing them to the mean, they are off by -10, -5, 0, 5, and 10, respectively. Square these numbers, add them up, divide by 5, take the square root and we get a standard deviation of 7.07.
So now we have quantified the difference between the two populations. Although they both have the same average (10), the standard deviation is 0.63 for the first, showing that they are clustered closely about the mean, while for the second population the standard deviation is 7.07. This says that the typical member of the second population is about 7 units away from the average.
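Here is a minimal Python sketch of the population formula above, in case it helps to see the arithmetic run. The function names are my own, chosen for illustration only.

```python
import math

def population_variance(values):
    """Average squared distance from the mean, dividing by N."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values) / len(values)

def population_std_dev(values):
    """Square root of the population variance (sigma)."""
    return math.sqrt(population_variance(values))

first = [9, 10, 10, 10, 11]
second = [0, 5, 10, 15, 20]

print(round(population_variance(first), 2))  # 0.4
print(round(population_std_dev(first), 2))   # 0.63
print(round(population_std_dev(second), 2))  # 7.07
```

Both lists have a mean of 10, so the only thing separating the two results is the spread.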
Application: If the data are normally distributed then we expect about 68% of the data to be within plus or minus one standard deviation of the mean, and we expect about 95% of the data to be within +/- 2 standard deviations of the mean. So for our first population we expect about 68% of the data to be in the range 10 +/- 0.63 -- that is, between 9.37 and 10.63 -- and we expect 95% of the data to be in the range 10 +/- (2*0.63), or within the interval 8.74 and 11.26.
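The 68% and 95% figures, and the two ranges quoted for the first population, can be checked with a short sketch. It assumes an exactly normal distribution and uses the standard library's error function; the helper name is hypothetical.

```python
import math

def normal_fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

mean = 10
sigma = math.sqrt(0.4)  # standard deviation of 9, 10, 10, 10, 11

for k in (1, 2):
    low, high = mean - k * sigma, mean + k * sigma
    print(f"within {k} sd: {normal_fraction_within(k):.1%}, range {low:.2f} to {high:.2f}")
# within 1 sd: 68.3%, range 9.37 to 10.63
# within 2 sd: 95.4%, range 8.74 to 11.26
```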
All of this supposes that we have the whole population of numbers before us. In practice, we do not have access to every number in the whole population, and even if we did, that would be too many numbers to work with. So we estimate the statistical features of the whole population by taking a sample. Like when we take a political poll of 500 likely voters to try to predict how the whole population of 40 million are going to vote.
To compute the mean, variance and standard deviation of a sample everything is the same except that we divide by n-1 instead of by n when we compute the variance and standard deviation. The reason for this is quite subtle and has to do with the concept of bias and with "degrees of freedom" -- basically, we lose a little accuracy whenever we use a sample statistic to estimate a real population statistic in the real world, so we have to take that into account in our formulas.
So if the second set of numbers, {0, 5, 10, 15, 20}, is not the whole population but just a small sample of five things chosen at random from a larger set, then the standard deviation is
s = square root (((x1-mean)^2 + (x2 - mean)^2 + ... + (xn-mean)^2) / (n-1))
= square root ((100+25+0+25+100)/4)
= 7.9,
instead of 7.07. In practice it is almost always a sample that we are working with, but on the other hand except for very small samples the difference between the two formulas is not worth bothering about.
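For anyone who wants to check the two versions side by side, Python's standard `statistics` module has both: `pstdev` divides by n and `stdev` divides by n-1. The second half of this sketch shows how fast the gap closes as the sample grows, since the two results always differ by a factor of square root (n/(n-1)).

```python
import math
import statistics

sample = [0, 5, 10, 15, 20]

sigma = statistics.pstdev(sample)  # population formula, divides by n
s = statistics.stdev(sample)       # sample formula, divides by n - 1
print(round(sigma, 2), round(s, 2))  # 7.07 7.91

# Ratio of the sample formula to the population formula for two sample sizes.
for n in (5, 500):
    print(n, round(math.sqrt(n / (n - 1)), 3))
# 5 1.118
# 500 1.001
```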
The standard deviation has many uses besides just as a descriptive measure of variation. For instance, if you want to estimate the mean of a population by knowing the mean of a sample, a 95% confidence interval for the true mean is given by the formula,
true mean = sample mean +/- 1.96 sigma/(square root of n),
where sigma is the standard deviation.
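A rough sketch of that interval in Python follows. One assumption to flag loudly: it plugs the sample standard deviation (n-1 version) in for sigma, and it keeps the 1.96 multiplier from the formula above. With a sample as small as five, a t-distribution multiplier would normally replace 1.96, so treat the numbers as an illustration of the formula only. The function name is made up for this example.

```python
import math
import statistics

def confidence_interval_95(sample):
    """sample mean +/- 1.96 * s / sqrt(n), with s standing in for sigma (an assumption)."""
    n = len(sample)
    mean = statistics.mean(sample)
    s = statistics.stdev(sample)  # sample standard deviation, divides by n - 1
    half_width = 1.96 * s / math.sqrt(n)
    return mean - half_width, mean + half_width

low, high = confidence_interval_95([0, 5, 10, 15, 20])
print(round(low, 2), round(high, 2))  # 3.07 16.93
```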
Last edited by Mathman; 09-19-2004 at 03:53 PM.
Wow Mathman, you are a lifesaver........I am sure this will help him. He's in his third year of college and this is a course in basic stats for business and economics. He got an A on his first test, but this section had him stumped. I'll pass this info along to him........Thanks again......... 42
Mathman,
Why weren't you my freshman calculus prof? I still remember his opening greeting to us: "I care as much about you students as you care about my salary". It was the only course in which I got a C in both semesters, and I wore them like a badge of honor.
ok that was so over my head... so glad I am not taking THAT ever!
seriously though, show, sending luck your son's way!
Thanks Toni.......he's been studying up a storm and feels much more confident. Here's crossing my fingers.......42
Math, I am totally, totally impressed!!!!
Show's son-----Good Luck.
I hated all that stuff.
Dee
Okey, dokey, that's as far as I got before I had to say, "HUH?" :D
Hey RGal, he lost me at "Here"........but luckily my son understood it..... 42
When I first saw the title of the thread I thought that Show was offering to do a little math tutoring here at GS. Being highly allergic to math I headed in the opposite direction.
Originally Posted by Mathman:
So the formula is
st. dev. (denoted by "sigma") = Square root [((x1-mean)^2 + (x2 - mean)^2 + ... + (xn-mean)^2) / N]
MM, why do you use [(())]? I learnt from 5th grade to use {[()]}.
Originally Posted by Mathman: "Application: If the data are normally distributed then we expect about 68% of the data to be within plus or minus one standard deviation of the mean, and we expect about 95% of the data to be within +/- 2 standard deviations of the mean. So for our first population we expect about 68% of the data to be in the range 10 +/- 0.63 -- that is, between 9.37 and 10.63 -- and we expect 95% of the data to be in the range 10 +/- (2*0.63), or within the interval 8.74 and 11.26."
I have a better application for you: why don't you time MK's free skate for her next 7 competitions? (I assume one cheesefest at the beginning and end of the season, 2 GP events, one GPF, nationals and worlds. I am not trying to tempt the skating gods; I can reasonably assume that she is injury free and skates well enough to qualify in all these competitions.) Let's see what her standard deviation is. Now compare her standard deviation to another skater of your choice, e.g. Fumie, or an American skater, e.g. ......
Originally Posted by Mathman: "degrees of freedom"
I am sure MK will skate with a high degree of freedom.
Originally Posted by Mathman: "the concept of bias"
You, biased re: MK? Nah, never.
Originally Posted by Mathman: "The standard deviation has many uses besides just as a descriptive measure of variation. For instance, if you want to estimate the mean of a population by knowing the mean of a sample, a 95% confidence interval for the true mean is given by the formula,
true mean = sample mean +/- 1.96 sigma/(square root of n),
where sigma is the standard deviation."
An even better application: if you take a few clocks and use them to time the same piece of music, say, 1000 times, then compare the sigma for the different clocks, I bet the clock used at Worlds 04 SP will give you a sigma that translates to > 2 seconds.
One of my scientific friends did a similar experiment, and the result? The clock with the biggest sigma is a Citizen's. BTW, the true mean of the Citizen's was truly mean.
Last edited by gezando; 09-22-2004 at 12:03 AM.
Hells Bells Mathman!
I did basic stats in Uni and just about managed to get my head round it!
Nowadays SPSS is my best friend
Dear 42, Glad your son made it to statistics! Here's a site that helps with homework questions in this area (and many others, as well)
http://www.netskool.com/netskool/postQuestion.jsp
There may be a membership or fee, however, from my own personal experience, it may be worth it if it helps one to grasp the processes and eventually leads to a good or passing grade.
Good luck,
Your sk8m8, Mark
No, no, everybody. This standard deviation thing is a snap.
Michelle skates her SP at three different events. I time the performances with my Citizens' watch. The times are
2:38, 2:40, and 2:42.
So the average time is 2:40, but once she went under by 2 seconds and once she went over by 2 seconds.
This 2 is the standard deviation of this sample, and that's all it means. Sometimes she was under and sometimes she was over, and the amount that she was off was typically about 2 seconds.
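If you want to check that 2 for yourself, here is a small Python sketch. It converts the clock times to seconds and uses the sample (n-1) standard deviation from the standard library; the `to_seconds` helper is just a name invented for this example.

```python
import statistics

def to_seconds(clock):
    """Convert a 'm:ss' string such as '2:38' to whole seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)

times = [to_seconds(t) for t in ("2:38", "2:40", "2:42")]

print(statistics.mean(times))   # 160, i.e. 2:40
print(statistics.stdev(times))  # 2.0
```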
Everything else is just your math professor trying to impress you.
Mathman
PS. Gezando, actually there is a reason (in fact, two) why I don't like to use that convention about nested parentheses. The first is that I want to reserve braces { } to enclose the elements of a set, and I want to reserve brackets [ ] for html commands, like bold, color, etc. I don't want to confuse the GS vBoard software, LOL.
The second reason is that not all programming languages read these symbols as parentheses in mathematical or logical statements. But they do all handle nested parentheses ((((((((...)))))))) very easily. It's just we poor humans that need help in matching up which pairs of parentheses go together.
Exercise. In seven performances, Michelle's SP went
2:45, 2:47, 2:47, 2:48, 2:46, 2:47, 2:47
for a mean of 2:47 and a standard deviation of 1.3 seconds. What is the probability that she will go over the new time limit of 2:50 in Moscow?
Answer: About 1 per cent.
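One way to arrive at that answer, sketched in Python. It takes the mean (2:47, or 167 seconds) and the standard deviation (1.3 seconds) exactly as stated in the exercise, and assumes her times are normally distributed; the function name is hypothetical.

```python
import math

def normal_tail_above(x, mean, sd):
    """P(X > x) for a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return 0.5 * math.erfc(z / math.sqrt(2))

# 2:50 is 170 seconds; mean 167 s and sd 1.3 s are taken from the exercise as given.
p = normal_tail_above(170, 167, 1.3)
print(round(p, 2))  # 0.01
```

The limit sits about 2.3 standard deviations above the mean, which is why the chance of going over is so small.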
Last edited by Mathman; 09-22-2004 at 03:24 PM.