[sugar] python hash function limited to specific table size?

Mike C. Fletcher mcfletch at vrplumber.com
Thu Mar 15 14:24:39 EDT 2007


Dan Williams wrote:
> Hmm; right.  I was thinking that would reduce the spread/effectiveness
> of the hash algorithm, rather than allowing the algorithm to understand
> the size required and possibly incorporate that in a better manner.  I
> appear to be overthinking.  Bad Dan.
>   
You already got here, but just to amplify your conclusion:

    Python's dictionaries are extremely well optimised for most
    reasonable hash table sizes (i.e. below millions, and they're even
    reasonable at those sizes in my experience).  Python uses
    dictionaries *everywhere* internally, so they are given a lot of
    attention by python-dev.  Trying to code around them, e.g. by
    computing hashes yourself, is almost always going to slow you down
    compared to the highly optimised C code implementing the dictionaries.

    That is, if you have strings and want to index by them, just use the
    strings as keys and move onto the next task.  That isn't to say the
    dictionary implementation is perfect, but it's pretty hard to beat
    *in Python* without some rare and significant application-specific
    restrictions.

    The same largely holds true for lists.  There are cases where you
    might want to use a different implementation, but most of the time
    you want to assume "use a list" and only replace with a different
    implementation if you discover a problem with the list-based
    implementation.

Have fun,
Mike

-- 
________________________________________________
  Mike C. Fletcher
  Designer, VR Plumber, Coder
  http://www.vrplumber.com
  http://blog.vrplumber.com



More information about the Sugar mailing list