Conntrack hash function comparisons III

The results and graphs were generated by an upadted cttest package which supports the new candidate hash functions: jenkins3 (i.e. lookup3(), see Bob Jenkins hash page), murmur2 and superfasthash.

[The results of the second hash function tests can be found here with the descriptions of the different graphs below. Here just the minimally required stuff is repeated.]

The key of the hash functions for conntrack consists of the network parameters identifying a connection:

The input datasets consist 50000 connections (100000 entry to hash), each "connecting to" one fixed IP address at a given port. The clients are connecting from 50000 different ports. In the different datasets the number of clients (and thus the number of different source IP addresses) is varied, thus we have got a fixed server IP address, a fixed server port, 50000 different client ports and client IP addresses between 50000 and 1 in the different input data sets:
	50000, 45000, 40000, ... 15000,
	10000, 9000, 8000, ... 2000,
	1000, 900, 800, ... 200,
	100, 99, 98, ... 1
That means 126 input datasets for every hash, at every hash size. The x axis below (except the speed graph) parametrized in the 126 different data sets, starting at 1 client and ending at 50000.

All hashes were tested at the following odd:

	8191 8209
	16383 16411
	32767 32771
	65535 65537
	131071 131101
	262143 262147
and even hash sizes:
	8192 16384
	32768 65536
	131072 262144

Hashes jenkins2, jenkins3, superfast

The superfasthash by Paul Hsieh might be tempting at the first glance, just when checking it's speed. However, it does not behave properly: it has got trouble when there are a few clients (1-20) or many (1000-50000) and produces just too much clashes.

Hashes jenkins2, jenkins3, murmur2

The murmur2 hash by Austin Appleby is the fastest among the tested ones. However, checking for example the maximum list length at hash size 32767 we can see that it cannot produce there good results. The chi-square graph (and the walking time graph too) reveals that it has got trouble at hash size 32771 too.

Hashes jenkins2, jenkins3

The Jenkins hashes are just good. One cannot spot any weakness at the different hash sizes or at the different input data.

Bit tests

cttest contains the test from lookup2()/lookup3() to probe wether every input bit affects every output bit half the time.

Only the jenkins3 (lookup3) and the murmur2 hashes pass the test.

Speed comparisons

At the speed test, the jenkins3() hash is the second fastest function when compiling with the -Os gcc flag. If compiled with -O2, it's the third one (first is murmur2, then superfasthash and then comes the jenkins3).

In my opinion we should replace the current jhash implementation of lookup2 by the faster and better lookup3.

Any suggestion for improvements are welcomed.

Jozsef Kadlecsik