Ai Dreams Forum

Member's Experiments & Projects => AI Programming => Topic started by: frankinstien on July 25, 2021, 03:03:37 am

Title: Improving lookup performance for Hashtables
Post by: frankinstien on July 25, 2021, 03:03:37 am
I read a few articles about hashtables and the problem with collisions (https://preshing.com/20110504/hash-collision-probabilities/). So I came up with a key algorithm using power functions to create unique keys from strings using only alphanumeric characters. So by precomputing the keys as doubles I get a lookup improvement of anywhere from 43% to 90%!

Below are some tests where the dictionary has 370,102 unique words and 100,000 lookups were performed for each test:

Code
Performce Test with NET string key dictionary

Total Time for all look ups: 0.04 seconds!

__________________________________
Performce Test with NET double key dictionary

Total Time for all look ups: 0.029 seconds!

__________________________________
Performce Test with NET string key dictionary

Total Time for all look ups: 0.053 seconds!

__________________________________
Performce Test with NET double key dictionary

Total Time for all look ups: 0.028 seconds!

__________________________________
Performce Test with NET string key dictionary

Total Time for all look ups: 0.041 seconds!

__________________________________
Performce Test with NET double key dictionary

Total Time for all look ups: 0.028 seconds!

Ultimately the key generator will be modified to work with numerical ranges so fuzzy states can be looked up using a hashtable.
Title: Re: Improving lookup performance for Hashtables
Post by: MagnusWootton on July 25, 2021, 03:15:16 am
Dont understand,  are u getting a lookup speed increase?    and its just looking for an exact match or is it looking for a similarity match?

Ive got an optimization for nearest neighbour matching myself,   but I dont understand what  "precomputing the keys as doubles"  means!   why wouldnt an integer suffice - why need floating point unit?
Title: Re: Improving lookup performance for Hashtables
Post by: frankinstien on July 25, 2021, 03:25:45 am
Right now it's an exact match but I just tried using byte arrays and I'm getting a 408% improvement over just using string keys! :D

Of course, there are data sources for partial matches or usual entry mistakes and this approach will shine with those kinds of resources.
Title: Re: Improving lookup performance for Hashtables
Post by: frankinstien on July 25, 2021, 11:02:21 pm
Oops, that claim about byte arrays was due to an error in the code! But the use of the power function key(double) performance boost is consistent. O0
Title: Re: Improving lookup performance for Hashtables
Post by: MagnusWootton on July 26, 2021, 04:25:39 am
Ai requires a really excellent optimization,  if it turns out your unlucky this time,   you've only failed once you've given up completely.

If you want to know about my KNN op,  just ask,   unless u want your own original invention, which I highly recommend over copying others!  :)