<div dir="ltr"><div>tl;dr <b>.iteritems() is more memory efficient than .items() in python2, and faster in the xrange() test (though not in the range() one)</b></div><div><br></div><div><br></div><div>I'm using xrange() in python2 instead of range() because it's more memory efficient and behaves like range() in python 3...</div><div><br></div><div># xrange() + .items()</div><div>python -m timeit -n 20 for\ i\ in\ dict(enumerate(xrange(1000000))).items():\ pass</div><div>20 loops, best of 3: 729 msec per loop</div><div>peak memory usage: 203 megabytes</div><div><br></div><div># xrange() + .iteritems()</div><div>python -m timeit -n 20 for\ i\ in\ dict(enumerate(xrange(1000000))).iteritems():\ pass</div><div>20 loops, best of 3: 644 msec per loop</div><div>peak memory usage: 176 megabytes</div><div><br></div><div># python 3</div><div>python3 -m timeit -n 20 for\ i\ in\ dict(enumerate(range(1000000))).items():\ pass</div><div>20 loops, best of 3: 826 msec per loop</div><div>peak memory usage: 198 megabytes</div><div><br></div><div><br></div><div>And if you really want to see the results with range() in python2...</div><div><br></div><div><div># range() + .items()</div><div>python -m timeit -n 20 for\ i\ in\ dict(enumerate(range(1000000))).items():\ pass</div><div>20 loops, best of 3: 851 msec per loop</div><div>peak memory usage: 254 megabytes</div><div><br></div><div># range() + .iteritems()</div><div>python -m timeit -n 20 for\ i\ in\ dict(enumerate(range(1000000))).iteritems():\ pass</div><div>20 loops, best of 3: 919 msec per loop</div><div>peak memory usage: 184 megabytes</div></div><div><br></div><div><br></div><div>To benchmark memory consumption, I used the following on bare metal:</div><div>


<p class=""><span class="">$ valgrind --tool=massif --pages-as-heap=yes </span>--massif-out-file=massif.out $COMMAND_FROM_ABOVE<br>$ cat massif.out | grep mem_heap_B | sort -u</p><p class=""><span class="">$ </span><span class="">python2 --version<br></span>Python 2.7.9</p><p class=""><span class="">$ </span><span class="">python3 --version<br></span>Python 3.4.3</p></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Jun 10, 2015 at 8:36 PM, gordon chung <span dir="ltr">&lt;<a href="mailto:gord@live.ca" target="_blank">gord@live.ca</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">


<div><div dir="ltr">


<div dir="ltr"><div>> Date: Wed, 10 Jun 2015 21:33:44 +1200<br>> From: <a href="mailto:robertc@robertcollins.net" target="_blank">robertc@robertcollins.net</a><br>> To: <a href="mailto:openstack-dev@lists.openstack.org" target="_blank">openstack-dev@lists.openstack.org</a><br>> Subject: Re: [openstack-dev] [all][python3] use of six.iteritems()<span class=""><br>> <br>> On 10 June 2015 at 17:22, gordon chung &lt;<a href="mailto:gord@live.ca" target="_blank">gord@live.ca</a>&gt; wrote:<br>> > maybe the suggestion should be "don't blindly apply six.iteritems or items" rather than don't apply iteritems at all. admittedly, it's a massive eyesore, but it's a very real use case that some projects deal with large data results and to enforce the latter policy can have negative effects[1]. one "million item dictionary" might be negligible but in a multi-user, multi-* environment that can have a significant impact on the amount of memory required to store everything.<br>> <br>> > [1] disclaimer: i have no real world results but i assume memory management was the reason for the switch in logic from py2 to py3<br>> <br>> I wouldn't make that assumption.<br>> <br>> And no, memory isn't an issue. If you have a million item dict,<br>> ignoring the internal overheads, the dict needs 1 million object<br>> pointers. The size of a list with those pointers in it is 1M (pointer<br>> size in bytes). E.g. 4M or 8M. Nothing to worry about given the<br>> footprint of such a program :)</span></div><div><br></div><div>iiuc, items() (in py2) builds a full copy of the dictionary's key/value pairs in memory before the loop starts. this is useful for cases such as concurrency where you want to ensure consistency, but doing a quick test i noticed a massive spike in memory usage with items() compared to iteritems().</div><div><br></div><div>'for i in dict(enumerate(range(1000000))).items(): pass' consumes significantly more memory than 'for i in dict(enumerate(range(1000000))).iteritems(): pass'. 
on my system, items() used double the memory of iteritems() and cpu utilisation was significantly higher as well... let me know if there's anything that stands out as inaccurate.</div><div><br></div><div>unless there's something wrong with my ignorant testing above, i think it's something projects should consider when mass applying any iteritems/items patch.</div><div><br></div><div><span style="font-size:12pt">cheers,</span></div><div><span style="font-size:12pt">gord</span></div></div>

                                          </div></div>
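<div>A rough sketch of the size arithmetic discussed above. The pointer-only estimate covers the list's pointer array, but Python 2's .items() returns a list of (key, value) tuples, so the copy also holds one tuple object per entry. This is written for Python 3, where list(d.items()) stands in for Python 2's .items(); that substitution, the helper name estimate_items_copy_bytes, and the smaller n are assumptions made for illustration, and the byte counts will differ on a real Python 2.7 build.</div>

```python
import struct
import sys


def estimate_items_copy_bytes(d):
    """Rough size of the list that Python 2's dict.items() would build.

    Counts the list object itself plus one 2-tuple per entry. The keys and
    values are shared with the dict, so they are not counted again.
    """
    items = list(d.items())  # Python 3 stand-in for Python 2's .items()
    list_bytes = sys.getsizeof(items)
    tuple_bytes = sum(sys.getsizeof(pair) for pair in items)
    return list_bytes, tuple_bytes


n = 100000  # smaller than the 1,000,000 used in the thread, to keep this quick
d = dict(enumerate(range(n)))
list_bytes, tuple_bytes = estimate_items_copy_bytes(d)
pointer_bytes = struct.calcsize("P")  # 4 or 8, depending on the build

print("pointers only: %d bytes" % (n * pointer_bytes))
print("list object:   %d bytes" % list_bytes)
print("2-tuples:      %d bytes" % tuple_bytes)
```

<div>The gap between the first line and the other two gives a sense of how far the pointer-only estimate is from the full cost of the copy on a given interpreter.</div>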


<br></blockquote></div><br></div>
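<div>As a rough cross-check on the massif figures at the top of this message, the same comparison can be sketched inside one Python 3 process with tracemalloc. This is not the method used above: list(d.items()) and d.items() are assumed stand-ins for Python 2's .items() and .iteritems(), the measure() helper is hypothetical, and tracemalloc counts only Python allocations made during the loop, whereas massif with --pages-as-heap=yes sees the whole process. Timings taken while tracing are inflated, so treat them as indicative at best.</div>

```python
import time
import tracemalloc


def measure(make_iterable, d):
    """Return (seconds, peak_bytes) for one pass over make_iterable(d)."""
    tracemalloc.start()
    started = time.perf_counter()
    for _ in make_iterable(d):
        pass
    elapsed = time.perf_counter() - started
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
    tracemalloc.stop()
    return elapsed, peak


d = dict(enumerate(range(200000)))  # built before tracing starts

# list(d.items()) copies every pair up front, like Python 2's .items();
# d.items() is a lazy view, closer to Python 2's .iteritems().
copy_time, copy_peak = measure(lambda m: list(m.items()), d)
view_time, view_peak = measure(lambda m: m.items(), d)

print("copy: %.3f s, peak %d bytes" % (copy_time, copy_peak))
print("view: %.3f s, peak %d bytes" % (view_time, view_peak))
```

<div>Because the dict is created before tracing begins, the peak reported for each case reflects only what the iteration itself allocates.</div>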