Counting loading

This algorithm approximates the number of unique items (cardinality) of a multiset. This is achieved by using a hash function that is applied to every element that is to be counted. The algorithm observes the maximum number of leading zeros that occur for all hash values, where intuitively hash values with more leading zeros are less likely and indicate a larger cardinality.


Inputs

Insert 1,000,000 random records with a given cardinality of...

Precision

Smaller will give a worse estimate

less error, more memory → ←greater error, less memory
{{precision}}

Result

Estimate () {{Math.round(result.estimate, 0)}} * ({{(((( Math.round(result.estimate, 0) - result.actual_cardinality)/result.actual_cardinality) * 100.0)).toFixed(2)}}% error)
Registers () {{result.m}} The number of registers used.
Memory used {{result.m*8}} bytes
* Estimate has been corrected using {{result.correction_used.name}}

Calculation

The cardinality () is estimated by the formula

{{result.alpha_m}} A constant used to correct a systematic multiplicative bias
{{Math.pow(result.m, 2.0)}} The number of registers () squared
{{result.z}} The harmonic mean of the registers

Corrections

For certain cardinalities, the above calculation yields an incorrect result, so HyperLogLog applies a measure to correct this

Original estimate () {{result.originalEstimate}}
Correction Method {{result.correction_used.name}} (load factor = {{result.correction_used.metadata.FillFactor}})
Linear Count

Linear Count is a simple counting algorithm that considers the load factor of the registers. A low load factor indicates the probability of collisions is low, and the cardinality can be estimated by counting the number of registers that are empty (), and applying the following formula

As you can see from the graph below, this gives a fairly good estimate of the cardinality when the registers are not full, but becomes more prone to error as approaches 0.

This is why HyperLogLog only uses LinearCount for smaller cardinalities.


Registers {{result.m}}

There are too many registers to display here, but you can imagine it! Try setting the precision to a lower value if you want to see the registers in action!

Each register represents the maximum number of leading zeroes + 1 seen.

{{value}}