JumpBackHash: Say Goodbye to the Modulo Operation to Distribute Keys Uniformly to Buckets

Data Structures and Algorithms, arXiv:2403.18682, 2024

The distribution of keys to a given number of buckets is a fundamental task in distributed data processing and storage. A simple, fast, and therefore popular approach is to map the hash values of keys to buckets based on the remainder after dividing by the number of buckets. Unfortunately, these mappings are not stable when the number of buckets changes, which can lead to severe spikes in system resource utilization, such as network or database requests. Consistent hash algorithms can minimize remappings, but are either significantly slower than the modulo-based approach, require floating-point arithmetic, or are based on a family of hash functions rarely available in standard libraries. This paper introduces JumpBackHash, which uses only integer arithmetic and a standard pseudorandom generator. Due to its speed and simple implementation, it can safely replace the modulo-based approach to improve assignment and system stability. A production-ready Java implementation of JumpBackHash has been released as part of the Hash4j open source library.
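The instability of the modulo-based approach can be made concrete with a small experiment. The sketch below is illustrative only and does not reproduce JumpBackHash or the Hash4j API. It compares the modulo mapping with JumpHash (Lamping and Veach), used here as a familiar consistent hash algorithm that relies on floating-point arithmetic. The class name `RemapDemo`, the `BucketMapper` interface, the seed, and the bucket counts are all assumptions chosen for the example.

```java
import java.util.SplittableRandom;

public class RemapDemo {

    /** Hypothetical helper type: maps a 64-bit hash to a bucket index in [0, numBuckets). */
    interface BucketMapper {
        int bucket(long hash, int numBuckets);
    }

    // Modulo-based assignment: remainder of the unsigned hash divided by the bucket count.
    static int moduloBucket(long hash, int numBuckets) {
        return (int) Long.remainderUnsigned(hash, numBuckets);
    }

    // JumpHash (Lamping and Veach), shown only for comparison.
    // This is NOT JumpBackHash; note the floating-point division it requires.
    static int jumpBucket(long key, int numBuckets) {
        long b = -1;
        long j = 0;
        while (j < numBuckets) {
            b = j;
            key = key * 2862933555777941757L + 1;
            j = (long) ((b + 1) * ((double) (1L << 31) / (double) ((key >>> 33) + 1)));
        }
        return (int) b;
    }

    // Fraction of numKeys pseudorandom hash values whose bucket changes when
    // the number of buckets goes from oldBuckets to newBuckets.
    static double movedFraction(BucketMapper mapper, int oldBuckets, int newBuckets, int numKeys) {
        SplittableRandom random = new SplittableRandom(42L); // arbitrary fixed seed
        int moved = 0;
        for (int i = 0; i < numKeys; i++) {
            long hash = random.nextLong();
            if (mapper.bucket(hash, oldBuckets) != mapper.bucket(hash, newBuckets)) {
                moved++;
            }
        }
        return (double) moved / numKeys;
    }

    public static void main(String[] args) {
        System.out.printf("modulo: %.3f of keys moved%n",
                movedFraction(RemapDemo::moduloBucket, 10, 11, 100_000));
        System.out.printf("jump:   %.3f of keys moved%n",
                movedFraction(RemapDemo::jumpBucket, 10, 11, 100_000));
    }
}
```

When growing from 10 to 11 buckets, a key keeps its modulo bucket only if its hash leaves the same remainder for both divisors. For uniformly distributed hashes this is true for about 1 in 11 keys, so roughly 10/11 of all keys should move. A consistent hash algorithm should move only about 1/11 of the keys, namely those reassigned to the new bucket.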

