*Jon Maiga, 2020-08-10*

As I am already completely off track of what I was going to do (b-spline noise) I might just as well round up this little hashing side project with a hash function based on the previous findings.

This is the third post in the series, here are the two earlier the construct of a bit mixer and tuning bit mixers.

- Not sensitive to the choice of multiplier constant
- Native c++ code (no machine specific instructions for now)
- A counter based pseudo random number generator that passes PractRand $2^{44}$ (>16TB)
- For hashing pass all SMHasher tests

After reviewing the findings in my earlier experiments I decided to try using the `mxmxmx`

construction, hence the name `mx3`

, multiply and xor-shift, repeated three times. The compelling argument to use this is

- It’s the simplest construction I found that is possible to push pass PractRand $>2^{44}$ bytes.
- For hashing, the
`mxm`

part alone of the construction looks promising for mixing the data stream. Also in my earlier speed tests it seems efficient.

All constants (except deliberate bad baselines `c_base`

, `c_found`

) passed PractRand $2^{44}$ bytes (>16TB) for the `mxmxmx`

construction. This is very reassuring that we aren’t too dependent on finding a needle in a haystack. However it’s noticeable that the `random`

does have worse avalanche behaviour than the others (see `xmxmx`

and `mxmxmx`

in the previous post), so one has to pay some attention in the selection.

To obtain an multiplier I employed the SFFS algorithm starting without any seed. After running for a couple of hours it came back with

```
C = 0xbea225f9eb34556d
```

To verify this is a good constant, I ranked it for the mx3 construction against about 25 other constants with respect to avalanche behavior - and sure enough it ranked among the top ones. Finally I verified it had expected PractRand and SMHasher results.

Here is the 64-bit mixer

```
inline uint64_t mix(uint64_t x) {
x *= 0xbea225f9eb34556d;
x ^= x >> 33;
x *= 0xbea225f9eb34556d;
x ^= x >> 29;
x *= 0xbea225f9eb34556d;
x ^= x >> 39;
return x;
}
```

Nothing revolutionary here, it uses the familiar multiply and xor-shift as seen in Murmur3 and SplitMix, but with an extra multiplication on beforehand, different xor-shift constants and only one multiplier constant, `C`

. The xor-shift constants 33/29/39 are the ones found in my earlier experiment.

The PRNG just uses the mixer and feeds it with a counter like this:

```
class random {
public:
explicit random(uint64_t seed) : _counter(seed) {}
uint64_t operator()() { return mix(_counter++); }
private:
uint64_t _counter;
};
```

This PRNG has a cycle of $2^{64}$.

The hash would look something like this (pseudo code)

```
uint64_t hash(data,end) {
uint64_t h = 0;
while (data != end)
h += mxm(data++);
return mix(h);
}
```

For performance reasons, I think we don’t need the full `mix`

for every 8 bytes of data in the stream. Here `mxm`

refers to the first three operations in the `mix`

function. But we needed to make a modification in `mxm`

to make this work, more on this a bit down.

SMHasher is a testsuite designed especially for testing hash functions. I went ahead and cloned a fork by Reini Urban. To get started quickly I used `fast-hash`

as a template to implement the new hash function.

At first, the idea I outlined above did not work as it failed some differential and permutation tests (among other). However, once I started to get a feeling for why the tests failed I it was pretty straightforward to fix. For example in the stream mixer I had to not only sum up the hash but also mix it up bit

```
while (data != end) {
h += mxm(*data++);
h *= C;
}
```

One thing that I needed to make the stream mixer `mxm`

a bit stronger. In my previous experiments I had made the observation that the avalanche bias doesn’t imply a good PractRand result and vice-versa. The first xor-constant 33, chosen in `mxmxmx`

was one that favored the PractRand test, while the optimal constant to minimize avalanche bias is 57 (found by SFFS in previous article `mxm`

construction). By exchanging 33 with 57 it passes the BIC test, but be worsen the PractRand result. I was not able to find any reasonable middleground which was very discouraging. Then I realized: why not use both? Here is the modified hash mixer.

```
uint64_t mxm(uint64_t x) {
x *= C;
x ^= (x >> 57) ^ (x >> 33);
x *= C;
return x;
}
```

With this it passed all the SMHasher tests (but for performance reasons I don’t use the modification in the mx3::random).

Here is the mx3 code on GitHub.

**Earlier posts in series**