[sugar] python hash function limited to specific table size?
Bert Freudenberg
bert at freudenbergs.de
Thu Mar 15 15:11:52 EDT 2007
On Mar 15, 2007, at 19:24 , Mike C. Fletcher wrote:
> Dan Williams wrote:
>> Hmm; right. I was thinking that would reduce the spread/
>> effectiveness
>> of the hash algorithm, rather than allowing the algorithm to
>> understand
>> the size required and possibly incorporate that in a better
>> manner. I
>> appear to be overthinking. Bad Dan.
>>
> You already got here, but just to amplify your conclusion:
>
> Python's dictionaries are extremely well optimised for most
> reasonable hash table sizes (i.e. below millions, and they're even
> reasonable at those sizes in my experience). Python uses
> dictionaries *everywhere* internally, so they are given a lot of
> attention by python-dev. Trying to code around them, e.g. by
> computing hashes yourself, is almost always going to slow you down
> compared to the highly optimised C code implementing the
> dictionaries.
>
> That is, if you have strings and want to index by them, just use
> the
> strings as keys and move onto the next task. That isn't to say the
> dictionary implementation is perfect, but it's pretty hard to beat
> *in Python* without some rare and significant application-specific
> restrictions.
>
> The same largely holds true for lists. There are cases where you
> might want to use a different implementation, but most of the time
> you want to assume "use a list" and only replace with a different
> implementation if you discover a problem with the list-based
> implementation.
Well, what I read between Dan's lines was that he does not actually
have a dictionary, but wants to select one of 500 values based on a
string hash. Another use would be, for example, to make a seed for a
random generator that then would make a repeatable sequence, perhaps
to assign a color for a nickname.
- Bert -
More information about the Sugar
mailing list