Share via

Excel histogram bin anomaly

Anonymous
2015-02-06T15:34:20+00:00

Why do user specified bins become ignored by the histogram function?

Example input data   is just an incrementing  sequence of numbers 

0
0.02
0.04
0.06
0.08
0.1
0.12
0.14
0.16
0.18
0.2
0.22
0.24
0.26
0.28
0.3
0.32
0.34
0.36
0.38
0.4
0.42
0.44
0.46
0.48
0.5

Example bins: are just an incrementing sequence of numbers.

0
0.1
0.2
0.3
0.4
0.5

Result from histogram function is not evenly distributed at all - there is always an anomaly in the fifth bin location:

Bin Frequency
0 1
0.1 5
0.2 5
0.3 5
0.4 4
0.5 5
More 1

The bin 0,4 is missing one value and it is then placed into more.

Microsoft 365 and Office | Excel | For home | Windows

Locked Question. This question was migrated from the Microsoft Support Community. You can vote on whether it's helpful, but you can't add comments or replies or follow the question.

0 comments No comments

Answer accepted by question author

Anonymous
2015-02-06T16:00:45+00:00

Tim wrote:

Why do user specified bins become ignored by the histogram function?

Example input data   is just an incrementing  sequence of numbers

[.... incrementing by 0.02 from 0 to 0.5 .....]

Result from histogram function is not evenly distributed at all - there is always an anomaly in the fifth bin location:

Bin Frequency
0 1
0.1 5
0.2 5
0.3 5
0.4 4
0.5 5
More 1

The bin 0,4 is missing one value and it is then placed into more.

The problem is with your data, not the FREQUENCY function or Histogram feature.

You do not say exactly how you set up your data.  My guess:  you enter 0 into A1 (e.g.), then enter =A1+0.02 into A2 and copy down.

The remedy:  change the formula in A2 to:

=ROUND(A1+0.02, 2)

if you expect accuracy to 2 decimal places.  Copy A2 down.

The root cause of the problem is infinitesimal differences that arise in computer binary arithmetic because most non-integers cannot be represented exactly in binary.

You can see these differences and get some insight into the histogram results if you enter the following into B1 and copy down, assuming your data is in A1:A26:

=IF(B1-TEXT(B1,"0.00")=0, "", B1-TEXT(B1,"0.00")-0)

The redundant -0 at the end is needed to work around a dubious of feature of Excel, whereby the true difference is arbitrary replaced (but only sometimes!) with exactly zero if the true difference is "close enough".

You will see that what appears to be 0.4 is actually about 5.55E-17 larger, whereas what appears to be 0.3 is indeed 0.3.  That is why the 0.4 bin has a lower count.

Also, what appears to be 0.5 is actually about 1.11E-16 larger.  That is why the "more" bin has a count of 1.

Was this answer helpful?

0 comments No comments

0 additional answers

Sort by: Most helpful