A Pedagogical Moment

I have a bone to pick with individuals who teach statistics.

A standard deviation is a measure of the average distance between data points and their mean. Thus, if one’s dataset contained the numbers 2, 4, 6, and 8, the mean would be 20/4 = 5 (the sum divided by the number of data points). A standard deviation would be a measure of the average difference between each point - 2, 4, 6, and 8 - and the mean, 5.

The formula is, therefore, as follows:

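$$ s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}} $$

(where s is the standard deviation, x_i is each data point, \bar{x} is the mean, and n is the number of data points)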

Seems straightforward, right? But there are a couple of oddities: First, why “n-1” as opposed to “n”? I have seen proofs which justify this peculiarity, and am quite confident that no amount of time, energy, and chemical stimulants could ever allow me to understand what the hell is going on. So I’m content to let that one go. And to their credit, every stats prof. I’ve had has always been fairly upfront with the class on this question: “it’s just better” is a pretty standard summation. (The only variation I’ve ever heard is that the -1 “stands in for the mean” …that is, if every number in the set were the same, there could be no standard deviation, and the -1 exists to account for this… It didn’t make any sense when I first heard it, and it doesn’t now, either.)

My beef is with the second oddity: the entire formula is tucked under a radical; that is, the final step in the calculatory process is to take the final figure and get its square root. The question “why?” naturally follows, and the answer seems fine at first glance: it’s because the distance of each data point from the mean will necessarily produce negative figures, and squaring those differences gets rid of that pesky negative sign. Thus, one must calculate the square root of the final figure in order to “undo” the squaring of each previously squared distance from the mean. (Incidentally, the figure prior to this final step is referred to as the “variance,” which is useful in other scenarios, but represents essentially the same thing as the standard deviation… and if you’re a statistician who wants to e-mail me and tell me why that’s incorrect… please don’t).
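To see why the sign is a nuisance in the first place, here is a minimal Python sketch using the 2, 4, 6, 8 example from earlier. The variable names are mine, purely for illustration. The signed distances from the mean cancel out to zero when summed, which is the usual motivation for squaring them first.

```python
import math

data = [2, 4, 6, 8]
mean = sum(data) / len(data)                 # 20 / 4 = 5.0

# Signed distances from the mean: points below the mean come out negative.
deviations = [x - mean for x in data]        # [-3.0, -1.0, 1.0, 3.0]
raw_sum = sum(deviations)                    # negatives cancel the positives

# Squaring removes the sign, so the distances no longer cancel.
squared = [d ** 2 for d in deviations]       # [9.0, 1.0, 1.0, 9.0]

variance = sum(squared) / (len(data) - 1)    # the figure before the final step
std_dev = math.sqrt(variance)                # the final step: the square root

print("deviations:", deviations, "sum:", raw_sum)
print("squared:", squared)
print("variance:", variance, "std dev:", std_dev)
```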

Spot the trouble yet? I’ll go on…

My occasionally stubborn, often bizarre, 10-year-homeschooled brain immediately jumps to a question: rather than going through the whole squaring/square root rigamarole, why not just drop the sign? Take the absolute value of each distance from the mean and be done with it. In addition to being simpler, this corrects for the fact that the square root of the sum of squares is not equal to the sum of the square root of each individual squared distance from the mean. Math just doesn’t work that way (more on this later).

Think of it this way: let’s say we have four numbers, 2, 4, 5, and 9. Their mean is 5. The sum of their squared distances from the mean (9, 1, 0, and 16) is 26. Twenty-six divided by 3 (remember, we’re using n-1) is 8.67, the square root of which is 2.94, which is the standard deviation of 2, 4, 5 and 9 using the above formula. However, if we use the absolute values of each data point’s distance from the mean, as mentioned above, things change. The sum of those values (3, 1, 0, and 4) is 8. Dividing 8 by 3 (remember, n-1…) gives us 2.67. Like I said: this is not the same thing as the standard deviation. So that whole “undoing the squaring” thing is just plain wrong.
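For anyone who would rather not take my arithmetic on faith, the same comparison can be sketched in a few lines of Python. The variable names are my own, and dividing the absolute distances by n-1 is simply the choice made in this post so the two figures are comparable (textbook "mean deviation" is often divided by n instead). The standard library's `statistics.stdev` also uses n-1, so it serves as a cross-check.

```python
import math
import statistics

data = [2, 4, 5, 9]
n = len(data)
mean = sum(data) / n                                  # 20 / 4 = 5.0

# Route 1: square each distance, sum, divide by n-1, take the square root.
sum_of_squares = sum((x - mean) ** 2 for x in data)   # 9 + 1 + 0 + 16 = 26.0
std_dev = math.sqrt(sum_of_squares / (n - 1))

# Route 2: drop the sign instead of squaring, then divide by n-1.
sum_of_abs = sum(abs(x - mean) for x in data)         # 3 + 1 + 0 + 4 = 8.0
mean_dev = sum_of_abs / (n - 1)

print("standard deviation:", round(std_dev, 2))
print("mean deviation (n-1):", round(mean_dev, 2))
print("statistics.stdev agrees:", math.isclose(std_dev, statistics.stdev(data)))
```

The two routes give different figures, which is the whole point: the square root at the end does not hand back the plain distances.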

It is my understanding that Sir Ronald Fisher, in his infinite wisdom, has produced a lengthy piece explaining, in great detail, why standard deviation is superior to “mean deviation” - what we got by using absolute values (2.67). Seeing as Fisher is widely considered to be the father of modern statistics, I’m willing to accept that he’s right and that years of review by people much smarter than I have essentially proven this to be the case.

…or this could just be because Fisher was an insufferable asshole with whom nobody cared to argue (I think Ward Churchill and Noam Chomsky have also adopted this tactic). Either way, I don’t care to question him.

My point - and yes, I do have one - is related to the way we teach statistics (thus, the title of this post): Statistics instructors should stop telling their students that taking the square root of the sum of squares divided by n-1 “undoes” the squaring of each data point’s distance from the mean! It’s confusing! And statistics is confusing enough, dammit!

Moreover, I should like to point out that when I have inquired about this discrepancy in the past, I was invariably met with a look which can only be described as “huh?” Not only does the question seem to be rejected prima facie simply because it’s the accepted method (knowing the inner workings of statistical techniques is not always a good thing, and is almost never necessary), but the notion that taking the square root of a sum of squares does not produce the same figure as the sum of the individual numbers has, on multiple occasions, been met with disbelief.
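A quick sketch for the skeptics, using the four distances from the 2, 4, 5, 9 example (3, 1, 0, and 4). The names here are illustrative only; the aim is just to put the square root of the sum of squares next to the sum of the individual square roots.

```python
import math

distances = [3, 1, 0, 4]
squares = [d ** 2 for d in distances]               # [9, 1, 0, 16]

# Square root taken once, after summing the squares.
root_of_sum = math.sqrt(sum(squares))               # sqrt(26)

# Square root taken on each square individually, then summed.
sum_of_roots = sum(math.sqrt(s) for s in squares)   # 3 + 1 + 0 + 4

print("root of the sum:", round(root_of_sum, 2))
print("sum of the roots:", sum_of_roots)
```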

Don’t believe me? 25+25+25+25 = 100. The square root of 100 is 10. However, 5+5+5+5 = 20.

And 20 is not 10. At least not yet.

…but ask me again on Nov. 5th.

