GliderJan Varho

Hash Prioritized Treaps

  • algorithms
  • C
  • data structures
  • hash
  • random
  • treap

In looking for a data structure for a memory management problem I tried to find a self-balancing tree that would use as little space as possible. The only one I found from Wikipedia that didn't require any additional information at nodes was the scapegoat tree.

Unfortunately, it's not as easy to implement as the intro suggests. As a further complication, a scapegoat tree has some additional information at the root, which means one cannot use a subtree node as a tree for certain operations. Similar reasons made rb-trees unattractive.

Since I had already used treaps before, I thought I could create one without storing the priority, by basing it on a hash of the key. Since the data I will use it on is fairly nonuniform and I don't need any kind of security guarantees, I used a very simple hash function by taking essentially a single round of MurmurHash on an integer key.

The result is a fairly fast self-balancing tree that is very easy to implement without storing any additional information anywhere.

Due to the space constraints I mentioned above, I left out parent pointers by trading for O(log(n)*log(n)) expected insert performance with (tail) recursive bubble up. In practice it's not very noticeable, since the data is in caches after the first descent and the number of hashes does not increase. I was going to work around that by using a stack, but didn't see a need to.

By tagging pointers I can further drop child pointers where not needed to get a total space use of n*(sizeof(key)+sizeof(void*)) for the whole tree. That's still TODO, though.