2.3.1.1.3 Segment Size Limitations for .idf Files

Segments in .idf files MUST follow certain size rules. First, all in-memory segments that belong to the same column (that is, the same .idf file when persisted to disk) MUST be of equal size. Second, all in-memory segments have a minimum size of 16,384 rows and a maximum size of 16,777,216 rows. Third, all in-memory segments MUST have a row count that is a power of two.
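As an illustration of the second and third rules, the following minimal Python sketch checks a single row count. The constant and function names are hypothetical and are not defined by this specification. Because 16,384 is 2^14 and 16,777,216 is 2^24, exactly eleven row counts satisfy both rules.

```python
# Illustrative sketch only; names are hypothetical, not part of the file format.
MIN_SEGMENT_ROWS = 16_384        # 2**14
MAX_SEGMENT_ROWS = 16_777_216    # 2**24


def is_power_of_two(n: int) -> bool:
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


def is_valid_full_segment_rows(rows: int) -> bool:
    """Apply the range rule and the power-of-two rule to one row count."""
    return MIN_SEGMENT_ROWS <= rows <= MAX_SEGMENT_ROWS and is_power_of_two(rows)


# Every row count that satisfies both rules: 2**14, 2**15, ..., 2**24.
VALID_FULL_SEGMENT_SIZES = [2**k for k in range(14, 25)]
```

This check covers only the general rules; it does not apply to segments that fall under the exceptions to these rules.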

Only two exceptions to these segment size requirements exist.

First, the last segment of a partition does not need to fall within the minimum and maximum row counts, nor does its row count need to be a power of two. The last segment can even be empty (zero rows).

Second, when using hybrid compression, both a primary segment and a subsegment that is associated with the primary segment exist. These two segments MUST be considered one unit when applying these rules because the two segments represent data from the same column.

The last segment can be an empty segment when an empty table (that is, a table with no rows) exists. Every column belonging to that table MUST have at least one segment, and every column is required to have a column data storage file (.idf file). The first segment is therefore also the last segment, so it is exempt from the size restrictions and can have zero rows. With hybrid compression, because the two segments are treated as one unit, both the primary (RLE) segment and the subsegment (bit packing subsegment) are zero (empty).
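The rules and exceptions in this section can be sketched as a column-level check. This is a non-normative Python sketch, and the function name and messages are hypothetical. It assumes that each entry in `row_counts` is the row count of one unit in partition order (for hybrid compression, the caller supplies one combined count for the primary segment and its subsegment; how that count is derived is outside this sketch). It also assumes that the last segment is exempt from the equal-size rule, which follows from its exemption from the range rule but is not stated explicitly.

```python
from typing import List, Sequence

# Illustrative sketch only; names are hypothetical, not part of the file format.
MIN_SEGMENT_ROWS = 16_384        # 2**14
MAX_SEGMENT_ROWS = 16_777_216    # 2**24


def check_column_segments(row_counts: Sequence[int]) -> List[str]:
    """Return a list of rule violations; an empty list means none were found."""
    if not row_counts:
        return ["a column must have at least one segment"]

    problems: List[str] = []
    *full_segments, last = row_counts

    # Every segment except the last must satisfy the range and power-of-two rules.
    for index, rows in enumerate(full_segments):
        if not MIN_SEGMENT_ROWS <= rows <= MAX_SEGMENT_ROWS:
            problems.append(f"segment {index}: {rows} rows is out of range")
        elif rows & (rows - 1) != 0:
            problems.append(f"segment {index}: {rows} rows is not a power of two")

    # Assumption: the equal-size rule is applied to the non-final segments only.
    if len(set(full_segments)) > 1:
        problems.append("non-final segments are not of equal size")

    # The last segment may hold any number of rows, including zero.
    if last < 0:
        problems.append("last segment has a negative row count")

    return problems
```

For example, `check_column_segments([0])` models a column of an empty table, whose single segment is both first and last, and reports no violations.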

Note again that this limitation for segments is measured in rows, not in 8-byte units. The size of a row is variable because the particular column might be a column of floating point values, integers, strings, or BLOBs. However, if these row count requirements are adhered to, the compressed segments (which are persisted to the Spreadsheet Data Model file as streamed-in .idf files) will be correct and will not generate any errors or undefined behavior when the file is read.