STDEV problem

Question

STDEV problem

Anonymous

Hello!

This is maybe a minor problem but I have noticed something off about the STDEV function.

A friend of mine showed me a problem where he had a list of numbers and used the STDEV function over it.

Now the problem I noticed is that the STDEV has incorrect answers sometimes. Minor difference but still obvious.

The problem is reproduced in the following manner:

Make a column of the number "0.1". Say about a column of 6 "0.1"s. For example A1:A6 = 0.1
Let another cell calculate the Standard Deviation by calling e.g. =STDEV(A1:A6)

The result of this will be:

1.52024E-17

Which is incorrect since I did not have a deviation. If I use A1:A5 instead, it does give me the right answer meaning 0.

This is indeed a trivial number and can be seen as zero, but it really is not zero. I am worried that this will affect larger calculations.

This happens for other number as well btw like e.g. 0.2.

So obviously my question is, can this be fixed? Also can others tell me if they can reproduce this issue too and confirm?

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments

Answer accepted by question author

5 additional answers

Answer 1

SidhartaPrahladsingh wrote:

The problem is reproduced in the following manner:

Make a column of the number "0.1". Say about a column of 6 "0.1"s. For example A1:A6 = 0.1

Let another cell calculate the Standard Deviation by calling e.g. =STDEV(A1:A6)

The result of this will be:

1.52024E-17

Which is incorrect since I did not have a deviation. If I use A1:A5 instead, it does give me the right answer meaning 0.

[....]

So obviously my question is, can this be fixed?

In general, the work-around is: if you expect accuracy to a specific number of decimal places, explicitly round. For example, =ROUND(STDEV(A1:A6),6).

That applies to all arithemtic with non-integers.

The problem is: Excel uses binary to represent numbers, specifically 64-bit binary floating-point, an industry standard. So most non-integers cannot be represented exactly.

In particular, 0.1 cannot be represented exactly by any sum of powers of 2. Instead, its internal representation is exactly 0.100000000000000,0055511151231257827021181583404541015625.

(I use period for the decimal point and comma to demarcate 15 significant digits, the most that Excel will format.)

Because of the approximate representation, the sum and subsequent division in computing the average and the variance incur varying degrees of numerical error (difference from expectations mathematically).

In particular, AVERAGE(A1:A5) =

0.100000000000000,0055511151231257827021181583404541015625

which is the same representation as 0.1. But AVERAGE(A1:A6) =

0.0999999999999999,9167332731531132594682276248931884765625

which Excel displays as 0.100000000000000, but which is not the same as the constant 0.1. So when calculating the variance, 0.1 - average might not be exactly zero.

You might notice that =0.1-AVERAGE(A1:A6) is exactly zero, nevertheless.

But =0.1-AVEARGE(A1:A6)-0 is not exactly zero, as we would expect. This reflects the true arithmetic result, which Excel uses internally (usually).

Excel returns exactly zero in the first formula because of a dubious heuristic whereby under very specific conditions, Excel replaces a non-zero result with zero if it deems the non-zero result to be "close enough" to zero.

[EDIT] Of course, mathematically, the "- 0" operation at the end should not make any difference. The difference is due to the capriciousness of the implementation of the heuristic; that is, the "very specific conditions". Even adding parentheses around the expression -- that is, =(0.1 - AVERAGE(A1:A6)) -- is sufficient to violate the "very specific conditions" and thwart the heuristic. Klunk!

Answer 2

Hi. I get the same answer with other programs. Excel is pretty consistent. It's due to the nature of machine precision.

Note that =DEVSQ(0.1, 0.1, 0.1, 0.1, 0.1, 0.1) returns 1.155E-33.

Which when used as =SQRT(DEVSQ(0.1, 0.1, 0.1, 0.1, 0.1, 0.1)/(6-1))

returns the same value: 1.52E-17.

Answer 3

Anonymous

Thank you for the in depth answer! Very insightful.

I will take your advice and use the workaround.

0 comments

Answer 4

So obviously my question is, can this be fixed? Also can others tell me if they can reproduce this issue too and confirm?

Hi,

It's easily reproducible and not just with STDEV and It's to do with the way Excel stores floating point numbers. See the KB article; link below. There is no solution as such but as the KB article describes there are workarounds using ROUND like this which will return zero for any number of 0.1s

=ROUND(STDEV(A1:A6),5)

https://support.microsoft.com/kb/78113/en-gb?wa=wsignin1.0

Answer 5

Anonymous

Hi Dana,

Same results here. Fully agree about machine precision but there is a work around.

with xnumbers.xla, =xstdev(A1:A6,100) returns 0

0 comments

Share via

STDEV problem

5 additional answers