<html>
<head>
<meta content="text/html; charset=windows-1252"
http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<br>
<br>
<div class="moz-cite-prefix">On 6/10/15 11:48 PM, Dolph Mathews
wrote:<br>
</div>
<blockquote
cite="mid:CAC=h7gU9tJvMqJd0JRu0i+o=eZ6J_W+dZFevJ0uwm8uNBxOSXA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>tl;dr <b>.iteritems() is faster and more memory efficient
than .items() in python2</b></div>
<div><br>
</div>
<div><br>
</div>
<div>Using xrange() in python2 instead of range() because it's
more memory efficient and consistent between python 2 and 3...</div>
<div><br>
</div>
<div># xrange() + .items()</div>
<div>python -m timeit -n 20 for\ i\ in\
dict(enumerate(xrange(1000000))).items():\ pass</div>
<div>20 loops, best of 3: 729 msec per loop</div>
<div>peak memory usage: 203 megabytes</div>
<div><br>
</div>
<div># xrange() + .iteritems()</div>
<div>python -m timeit -n 20 for\ i\ in\
dict(enumerate(xrange(1000000))).iteritems():\ pass</div>
<div>20 loops, best of 3: 644 msec per loop</div>
<div>peak memory usage: 176 megabytes</div>
<div><br>
</div>
<div># python 3</div>
<div>python3 -m timeit -n 20 for\ i\ in\
dict(enumerate(range(1000000))).items():\ pass</div>
<div>20 loops, best of 3: 826 msec per loop</div>
<div>peak memory usage: 198 megabytes</div>
</div>
</blockquote>
Is it just me, or are these differences pretty negligible,
considering this is the "1 million item dictionary", which is itself
a unicorn in OpenStack code, or really in most code anywhere? <br>
<br>
As was stated before, if we have million-item dictionaries floating
around, that code has problems. I already have to wait full
seconds for responses to come back when I play around with Neutron +
Horizon in a devstack VM, and that's with no data at all. An extra
100ms for a hypothetical million-item structure would only matter
long after the whole app has fallen over from having just ten
thousand of anything, much less a million.<br>
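<br>
To put rough numbers on that, here is a quick sketch using the
xrange() figures quoted above. The linear scaling down to ten
thousand entries is purely my assumption, and the quoted timings also
include building the dict, so treat the output as a ballpark only:<br>

```python
# Figures copied from the xrange() runs quoted above (Python 2.7.9).
items_msec, iteritems_msec = 729, 644
items_mb, iteritems_mb = 203, 176
n_items = 1000000

# Absolute savings of iteritems() over items() for the 1M-item dict.
time_saved_msec = items_msec - iteritems_msec
mem_saved_mb = items_mb - iteritems_mb

# Assumption: cost scales linearly with the number of entries.
# Real behavior need not be linear; this is only an estimate.
scale = 10000 / float(n_items)

print("per 1M items:  %d msec, %d MB" % (time_saved_msec, mem_saved_mb))
print("per 10k items: %.2f msec, %.2f MB"
      % (time_saved_msec * scale, mem_saved_mb * scale))
```

In other words, the gap in the quoted run is 85 msec and 27 MB at a
million entries, and under the linear assumption well under a
millisecond at ten thousand.<br>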
<br>
My only concern with items() is that it is semantically different
between Py2k and Py3k: py2k items() returns a list copy, while py3k
items(), like py2k iteritems(), iterates over the live dictionary.
Code that would hit a "dictionary changed size during iteration"
error under iteritems() / py3k items() would succeed under py2k
items(). If such a coding mistake is not covered by tests (as this
is a data-dependent error condition), it would manifest as a sudden
error on Py3k only.<br>
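<br>
A minimal sketch of the kind of mistake I mean. The function names
and sample data are made up for illustration, and it is meant to be
run under Python 3:<br>

```python
def drop_empty(d):
    """Delete falsy values while looping over items().

    Works on py2k, where items() is a list copy; breaks on a live view.
    """
    for key, value in d.items():
        if not value:
            del d[key]
    return d


def drop_empty_safe(d):
    """Same cleanup, but loop over a snapshot so every version agrees."""
    for key, value in list(d.items()):
        if not value:
            del d[key]
    return d


# The failure only shows up when the data actually contains a falsy value.
try:
    drop_empty({"a": 1, "b": None, "c": 3})
    outcome = "no error"
except RuntimeError as exc:
    outcome = str(exc)

print(outcome)
print(drop_empty_safe({"a": 1, "b": None, "c": 3}))
```

Under Python 3 the first call raises RuntimeError once it has deleted
the None entry; under py2k the same items() loop would finish
quietly. With no falsy values in the dict neither version complains,
which is what makes this easy to miss in tests.<br>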
<br>
<br>
<br>
<blockquote
cite="mid:CAC=h7gU9tJvMqJd0JRu0i+o=eZ6J_W+dZFevJ0uwm8uNBxOSXA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div><br>
</div>
<div><br>
</div>
<div>And if you really want to see the results with range() in
python2...</div>
<div><br>
</div>
<div>
<div># range() + .items()</div>
<div>python -m timeit -n 20 for\ i\ in\
dict(enumerate(range(1000000))).items():\ pass</div>
<div>20 loops, best of 3: 851 msec per loop</div>
<div>peak memory usage: 254 megabytes</div>
<div><br>
</div>
<div># range() + .iteritems()</div>
<div>python -m timeit -n 20 for\ i\ in\
dict(enumerate(range(1000000))).iteritems():\ pass</div>
<div>20 loops, best of 3: 919 msec per loop</div>
<div>peak memory usage: 184 megabytes</div>
</div>
<div><br>
</div>
<div><br>
</div>
<div>To benchmark memory consumption, I used the following on
bare metal:</div>
<div>
<p class=""><span class="">$ valgrind --tool=massif
--pages-as-heap=yes </span>--massif-out-file=massif.out
$COMMAND_FROM_ABOVE<br>
$ cat massif.out | grep mem_heap_B | sort -u</p>
<p class=""><span class="">$ </span><span class="">python2
--version<br>
</span>Python 2.7.9</p>
<p class=""><span class="">$ </span><span class="">python3
--version<br>
</span>Python 3.4.3</p>
</div>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Wed, Jun 10, 2015 at 8:36 PM, gordon
chung <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:gord@live.ca" target="_blank">gord@live.ca</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div>
<div dir="ltr">
<div dir="ltr">
<div>> Date: Wed, 10 Jun 2015 21:33:44 +1200<br>
> From: <a moz-do-not-send="true"
href="mailto:robertc@robertcollins.net"
target="_blank">robertc@robertcollins.net</a><br>
> To: <a moz-do-not-send="true"
href="mailto:openstack-dev@lists.openstack.org"
target="_blank">openstack-dev@lists.openstack.org</a><br>
> Subject: Re: [openstack-dev] [all][python3] use
of six.iteritems()<span class=""><br>
> <br>
> On 10 June 2015 at 17:22, gordon chung &lt;<a
moz-do-not-send="true"
href="mailto:gord@live.ca" target="_blank">gord@live.ca</a>&gt;
wrote:<br>
> > maybe the suggestion should be "don't
blindly apply six.iteritems or items" rather than
don't apply iteritems at all. admittedly, it's a
massive eyesore, but it's a very real use case
that some projects deal with large data results
and to enforce the latter policy can have negative
effects[1]. one "million item dictionary" might be
negligible but in a multi-user, multi-*
environment that can have a significant impact on
the amount of memory required to store everything.<br>
> <br>
> > [1] disclaimer: i have no real world
results but i assume memory management was the
reason for the switch in logic from py2 to py3<br>
> <br>
> I wouldn't make that assumption.<br>
> <br>
> And no, memory isn't an issue. If you have a
million item dict,<br>
> ignoring the internal overheads, the dict
needs 1 million object<br>
> pointers. The size of a list with those
pointers in it is 1M (pointer<br>
> size in bytes). E.g. 4M or 8M. Nothing to
worry about given the<br>
> footprint of such a program :)</span></div>
<div><br>
</div>
<div>iiuc, items() (in py2) will create a copy of the
dictionary in memory to be processed. this is useful
for cases such as concurrency where you want to
ensure consistency but doing a quick test i noticed
a massive spike in memory usage between items() and
iteritems.</div>
<div><br>
</div>
<div>'for i in
dict(enumerate(range(1000000))).items(): pass'
consumes significantly more memory than 'for i in
dict(enumerate(range(1000000))).iteritems(): pass'.
on my system, the difference in memory consumption
was double when using items() vs iteritems() and the
cpu util was significantly more as well... let me
know if there's anything that stands out as
inaccurate.</div>
<div><br>
</div>
<div>unless there's something wrong with my ignorant
testing above, i think it's something projects
should consider when mass applying any
iteritems/items patch.</div>
<div><br>
</div>
<div><span style="font-size:12pt">cheers,</span></div>
<div><span style="font-size:12pt">gord</span></div>
</div>
</div>
</div>
<br>
__________________________________________________________________________<br>
OpenStack Development Mailing List (not for usage questions)<br>
Unsubscribe: <a moz-do-not-send="true"
href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe"
target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>
<a moz-do-not-send="true"
href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev"
target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
<br>
</blockquote>
<br>
</body>
</html>