A family of Microsoft spreadsheet software with tools for analyzing, charting, and communicating data.
SidhartaPrahladsingh wrote:
The problem is reproduced in the following manner:
- Make a column of the number "0.1". Say about a column of 6 "0.1"s. For example A1:A6 = 0.1
- Let another cell calculate the Standard Deviation by calling e.g. =STDEV(A1:A6)
The result of this will be:
1.52024E-17 Which is incorrect since I did not have a deviation. If I use A1:A5 instead, it does give me the right answer meaning 0.
[....]
So obviously my question is, can this be fixed?
In general, the work-around is: if you expect accuracy to a specific number of decimal places, explicitly round. For example, =ROUND(STDEV(A1:A6),6).
That applies to all arithemtic with non-integers.
The problem is: Excel uses binary to represent numbers, specifically 64-bit binary floating-point, an industry standard. So most non-integers cannot be represented exactly.
In particular, 0.1 cannot be represented exactly by any sum of powers of 2. Instead, its internal representation is exactly 0.100000000000000,0055511151231257827021181583404541015625.
(I use period for the decimal point and comma to demarcate 15 significant digits, the most that Excel will format.)
Because of the approximate representation, the sum and subsequent division in computing the average and the variance incur varying degrees of numerical error (difference from expectations mathematically).
In particular, AVERAGE(A1:A5) =
0.100000000000000,0055511151231257827021181583404541015625
which is the same representation as 0.1. But AVERAGE(A1:A6) =
0.0999999999999999,9167332731531132594682276248931884765625
which Excel displays as 0.100000000000000, but which is not the same as the constant 0.1. So when calculating the variance, 0.1 - average might not be exactly zero.
You might notice that =0.1-AVERAGE(A1:A6) is exactly zero, nevertheless.
But =0.1-AVEARGE(A1:A6)-0 is not exactly zero, as we would expect. This reflects the true arithmetic result, which Excel uses internally (usually).
Excel returns exactly zero in the first formula because of a dubious heuristic whereby under very specific conditions, Excel replaces a non-zero result with zero if it deems the non-zero result to be "close enough" to zero.
[EDIT] Of course, mathematically, the "- 0" operation at the end should not make any difference. The difference is due to the capriciousness of the implementation of the heuristic; that is, the "very specific conditions". Even adding parentheses around the expression -- that is, =(0.1 - AVERAGE(A1:A6)) -- is sufficient to violate the "very specific conditions" and thwart the heuristic. Klunk!