A family of Microsoft spreadsheet software with tools for analyzing, charting, and communicating data.
Tim wrote:
Why do user specified bins become ignored by the histogram function?
Example input data is just an incrementing sequence of numbers
[.... incrementing by 0.02 from 0 to 0.5 .....]
Result from histogram function is not evenly distributed at all - there is always an anomaly in the fifth bin location:
Bin Frequency 0 1 0.1 5 0.2 5 0.3 5 0.4 4 0.5 5 More 1 The bin 0,4 is missing one value and it is then placed into more.
The problem is with your data, not the FREQUENCY function or Histogram feature.
You do not say exactly how you set up your data. My guess: you enter 0 into A1 (e.g.), then enter =A1+0.02 into A2 and copy down.
The remedy: change the formula in A2 to:
=ROUND(A1+0.02, 2)
if you expect accuracy to 2 decimal places. Copy A2 down.
The root cause of the problem is infinitesimal differences that arise in computer binary arithmetic because most non-integers cannot be represented exactly in binary.
You can see these differences and get some insight into the histogram results if you enter the following into B1 and copy down, assuming your data is in A1:A26:
=IF(B1-TEXT(B1,"0.00")=0, "", B1-TEXT(B1,"0.00")-0)
The redundant -0 at the end is needed to work around a dubious of feature of Excel, whereby the true difference is arbitrary replaced (but only sometimes!) with exactly zero if the true difference is "close enough".
You will see that what appears to be 0.4 is actually about 5.55E-17 larger, whereas what appears to be 0.3 is indeed 0.3. That is why the 0.4 bin has a lower count.
Also, what appears to be 0.5 is actually about 1.11E-16 larger. That is why the "more" bin has a count of 1.