Adding
The core of a HyperLogLog instance consists of:
- a series of registers (), sometimes implemented as an array
- a precision () argument, usually between 4 and 16, specified on creation
- The precision determines how many registers there are available
- More registers reduce the error in the count, at the expense of more memory
- a hash function () which hashes the data into a 32-bit integer
The aim of HyperLogLog is to estimate the cardinality () based on the current state of the registers.
Adding an item might ultimately end up changing the state of the registers, as you will see below.
Diagram
Hopefully this diagram will explain the steps that occur during an add operation. If not, maybe the playground below might give you some intuition.
Playground
Use the tool below to see what happens when you add an item into HyperLogLog
Insert ()
Precision ()
Register #{{result.b}} was updated with value {{result.w}}
Value in register #{{result.b}} is > {{result.w}}, no update occurred.
Try inserting some data above, or click "Random" to insert a random value
Calculation
{{result.data}} | ||
{{toBinary(result.h, 32)}} ({{result.h}})
|
The hash of the value | |
{{toBinary(result.b, result.p)}} ({{result.b}}) |
Top {{result.p}} bits of | |
{{result.w}} | The number of leading zeroes + 1 of the remaining {{32-result.p}} bits |
|
✅ ❌ | ||
{{result.E}} | The approximate cardinality |
Registers {{result.m}}
Each register represents the maximum number of leading zeroes + 1 seen.
There are too many registers to display here, but you can imagine it! Try setting the precision to a lower value if you want to see the registers in action!
{{value}} |