Faster string merging

* use power-of-two hash table
* use a better hash function (hashing 32 bits at once and with better
  mixing characteristics)
* use input-offset-to-entry maps instead of retaining the full input
  contents until lookup time
* don't reread SEC_MERGE sections multiple times
* take care of cache behaviour in the hot lookup routine
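
To illustrate the first two points and the cache-related one, here is a
minimal sketch of a power-of-two, open-addressing string table with a
hash that consumes 32 bits per step.  It is not the code from this
change: the names (str_entry, str_table, hash_bytes, table_intern), the
multiplier constant, linear probing and the 50% load limit are all
assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry: one unique string.  The hash is stored so that
   most mismatches are rejected without touching the string bytes.  */
struct str_entry {
  const char *str;          /* not owned; NULL marks an empty slot */
  uint32_t len;
  uint32_t hash;
};

struct str_table {
  struct str_entry *slots;
  size_t nslots;            /* always a power of two */
  size_t count;
};

/* Hash LEN bytes, four at a time, with a multiply/shift mix.
   The constant is illustrative, not the one used by the linker.  */
static uint32_t hash_bytes(const char *s, size_t len)
{
  uint32_t h = (uint32_t) len;
  for (; len >= 4; s += 4, len -= 4) {
    uint32_t w;
    memcpy(&w, s, 4);       /* safe unaligned 32-bit load */
    h = (h ^ w) * 0x9E3779B1u;
    h ^= h >> 15;
  }
  for (; len > 0; s++, len--)   /* remaining 0..3 tail bytes */
    h = (h ^ (unsigned char) *s) * 0x9E3779B1u;
  return h ^ (h >> 16);
}

/* NSLOTS must be a power of two.  Returns 0 on allocation failure.  */
static int table_init(struct str_table *t, size_t nslots)
{
  t->slots = calloc(nslots, sizeof *t->slots);
  t->nslots = nslots;
  t->count = 0;
  return t->slots != NULL;
}

/* Double the table and reinsert entries using their stored hashes.  */
static int table_grow(struct str_table *t)
{
  size_t n = t->nslots * 2, mask = n - 1;
  struct str_entry *fresh = calloc(n, sizeof *fresh);
  if (fresh == NULL)
    return 0;
  for (size_t i = 0; i < t->nslots; i++) {
    if (t->slots[i].str == NULL)
      continue;
    size_t j = t->slots[i].hash & mask;
    while (fresh[j].str != NULL)
      j = (j + 1) & mask;
    fresh[j] = t->slots[i];
  }
  free(t->slots);
  t->slots = fresh;
  t->nslots = n;
  return 1;
}

/* Return the canonical pointer for the string S of length LEN,
   inserting S if it is new.  Returns NULL on allocation failure.
   The table may grow before the lookup, even if S is then found.  */
static const char *table_intern(struct str_table *t, const char *s,
                                uint32_t len)
{
  if ((t->count + 1) * 2 > t->nslots && !table_grow(t))
    return NULL;
  uint32_t h = hash_bytes(s, len);
  size_t mask = t->nslots - 1;  /* power of two: mask replaces modulo */
  for (size_t i = h & mask;; i = (i + 1) & mask) {
    struct str_entry *e = &t->slots[i];
    if (e->str == NULL) {
      e->str = s;
      e->len = len;
      e->hash = h;
      t->count++;
      return s;
    }
    if (e->hash == h && e->len == len && memcmp(e->str, s, len) == 0)
      return e->str;
  }
}
```

With a power-of-two size the slot index is a single AND, and linear
probing keeps successive probes within neighbouring cache lines, which
is one plausible reading of the cache-behaviour point above.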

The overall effect is less time spent in libz and much faster string
merging itself.  On a debug-info-enabled cc1, on the machine I used at
the time of this writing, the effect was going from 14400 perf samples
to 9300 perf samples, or from 3.7 seconds to 2.4 seconds, i.e. about
33%.
5 files changed