Interesting, thanks for the responses. And yeah, I meant 1/3; I always mix up negatives.<br><br>I agree that, as you point out, the biggest slowdown would be on classes that define their own __hash__. However, since instances use instance dicts and this would reduce the dict size from 96 to 64 bytes, we could blow 4 bytes to cache the hash on the object.
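<br><br>A minimal Python-level sketch of the "cache the hash on the object" idea, assuming a hypothetical Point class whose fields are never mutated after creation. The proposal itself concerns C-level changes inside CPython, which this does not implement; it only shows that a cached value lets repeated dict operations skip the real hash computation.
<pre>
```python
class Point:
    """Hypothetical key type whose hash is computed once and then reused."""

    hash_calls = 0  # counts how often the real hash computation runs

    def __init__(self, x, y):
        self.x = x
        self.y = y
        self._hash = None  # cached hash value, filled in on first use

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        # Assumption: x and y are treated as immutable, otherwise the
        # cached value could go stale.
        if self._hash is None:
            Point.hash_calls += 1
            self._hash = hash((self.x, self.y))
        return self._hash


p = Point(1, 2)
d = {p: "a"}   # the insert triggers the one real hash computation
value = d[p]   # the lookup calls __hash__ again but gets the cached value
```
</pre>
The cost is one extra field per object, which is the space trade-off described above.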
<br><br>In fact, PyObject_Hash could 'intern' the result of __hash__ into a __hashvalue__ member of the class dict. This might be the best of both worlds, since it would only use space for the hash value if it's needed.<br><br>Oh, and the reason I brought up strings was that one can grab ob_shash from the string object in lookupdict_string to avoid even the function call to get the hash for a string, so it's just the same as storing the hash on the entry for strings.
<br><br>The reason I looked into this to begin with was that my code used up a bunch of memory which was traceable to lots of little objects with instance dicts, so it seemed that if instancedicts took less memory I wouldn't have to go and add __slots__ to a bunch of my classes, or rewrite things as tuples/lists, etc.
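<br><br>As a rough illustration of the memory trade-off mentioned here, the sketch below compares an instance that carries an instance dict with one that uses __slots__. The class names WithDict and WithSlots are hypothetical, and the reported sizes depend on the Python version and platform, so no figures are claimed.
<pre>
```python
import sys


class WithDict:
    """Hypothetical class whose instances carry a per-instance __dict__."""

    def __init__(self, a, b):
        self.a = a
        self.b = b


class WithSlots:
    """Same fields, stored in fixed slots with no per-instance __dict__."""

    __slots__ = ("a", "b")

    def __init__(self, a, b):
        self.a = a
        self.b = b


plain = WithDict(1, 2)
slotted = WithSlots(1, 2)

# getsizeof is shallow, so the instance dict has to be added separately.
plain_total = sys.getsizeof(plain) + sys.getsizeof(plain.__dict__)
slotted_total = sys.getsizeof(slotted)
print(plain_total, slotted_total)
```
</pre>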
<br><br>thanks!<br>-Kirat<br><br><div><span class="gmail_quote">On 4/23/06, <b class="gmail_sendername">Tim Peters</b> &lt;<a href="mailto:tim.peters@gmail.com">tim.peters@gmail.com</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
[Kirat Singh]<br>> Hi, this is my first python dev post, so please forgive me if this topic has<br>> already been discussed.<br><br>It's hard to find one that hasn't -- but it's even harder to find the<br>old discussions ;-)
<br><br>> It seemed to me that removing me_hash from a dict entry would save 2/3 of<br>> the space<br><br>1/3, right?<br><br>> used by dictionaries and also improve alignment of the entries<br>> since they'd be 8 bytes instead of 12.
<br><br>How would that help? On 32-bit boxes, we have 3 4-byte members in<br>PyDictEntry, and they're all 4-byte aligned. In what respect related<br>to alignment is that sub-optimal?<br><br>> And sets end up having just 4 byte entries.
<br>><br>> I'm guessing that string dicts are the most common (hence the specialized<br>> lookupdict_string routine),<br><br>String dicts were the only kind at first, and their speed is critical<br>because Python itself makes heavy use of them (
e.g., to implement<br>instance and module namespaces, and keyword arguments).<br><br>> and since strings already contain their hash, this would probably mitigate<br>> the performance impact.<br><br>No slowdown in string dicts would be welcome, but since strings
<br>already cache their own hash, they seem unaffected by this.<br><br>It's the speed of other key types that would suffer, and for classes<br>that define their own __hash__ method that could well be deadly (see<br>Martin's reply for more detail).
<br><br>> One could also add a hash to Tuples since they are immutable.<br><br>A patch to do that was recently rejected. You can read its comments<br>for some of the reasons:<br><br> <a href="http://www.python.org/sf/1462796">
http://www.python.org/sf/1462796</a><br><br>More reasons were given in a python-dev thread about the same thing<br>earlier this month:<br><br> <a href="http://mail.python.org/pipermail/python-dev/2006-April/063275.html">
http://mail.python.org/pipermail/python-dev/2006-April/063275.html</a><br><br>> If this isn't a totally stupid idea, I'd be happy to volunteer to try the<br>> experiment and run any suggested tests.<br><br>I'd be -1 if it slowed dict operations for classes that define their
<br>own __hash__. I do a lot of that ;-)<br><br>> PS any opinion on making _Py_StringEq a macro?<br><br>Yes: don't bother unless it provably speeds something "important" :-)<br> It's kinda messy for a macro otherwise, macros always make debugging
<br>harder (can't step through the source expansion in a debugger w/o a<br>lot of pain), etc.<br><br>> inline function would be nice but I hesitate to bring up the C/C++ debate, both<br>> languages suck in their own special way ;-)
<br><br>Does the Python source even compile as C++ now? People have been<br>working toward that, but my last impression was that it's not there<br>yet.<br></blockquote></div><br>