[openstack-dev] [all][python3] use of six.iteritems()

Dolph Mathews dolph.mathews at gmail.com
Thu Jun 11 03:48:43 UTC 2015


tl;dr *.iteritems() is faster and more memory efficient than .items() in
python2*


I'm using xrange() in python2 instead of range() because it's more memory
efficient and behaves like range() in python 3...

# xrange() + .items()
python -m timeit -n 20 'for i in dict(enumerate(xrange(1000000))).items(): pass'
20 loops, best of 3: 729 msec per loop
peak memory usage: 203 megabytes

# xrange() + .iteritems()
python -m timeit -n 20 'for i in dict(enumerate(xrange(1000000))).iteritems(): pass'
20 loops, best of 3: 644 msec per loop
peak memory usage: 176 megabytes

# python 3
python3 -m timeit -n 20 'for i in dict(enumerate(range(1000000))).items(): pass'
20 loops, best of 3: 826 msec per loop
peak memory usage: 198 megabytes


And if you really want to see the results with range() in python2...

# range() + .items()
python -m timeit -n 20 'for i in dict(enumerate(range(1000000))).items(): pass'
20 loops, best of 3: 851 msec per loop
peak memory usage: 254 megabytes

# range() + .iteritems()
python -m timeit -n 20 'for i in dict(enumerate(range(1000000))).iteritems(): pass'
20 loops, best of 3: 919 msec per loop
peak memory usage: 184 megabytes
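
For anyone who wants to rerun the timing comparison from a single script on
either interpreter, here is a minimal sketch. The names iter_items, copy_items
and bench are made up for illustration (iter_items only approximates what
six.iteritems() does), and unlike the command lines above it builds the dict
once, outside the timed loop, so absolute numbers will not match the figures
above.

```python
import sys
import timeit


def iter_items(d):
    """Hypothetical helper: lazy (key, value) iteration on Python 2 and 3."""
    if sys.version_info[0] == 2:
        return d.iteritems()   # Python 2: iterator, no intermediate list
    return iter(d.items())     # Python 3: items() is already a lazy view


def copy_items(d):
    """Materialize every pair in a list, as Python 2's dict.items() does."""
    return list(d.items())


def bench(func, size=100000, repeat=3, number=5):
    """Return the best per-loop time in seconds for iterating func(d)."""
    d = dict(enumerate(range(size)))  # built once, not part of the timing

    def loop():
        for _ in func(d):
            pass

    return min(timeit.repeat(loop, repeat=repeat, number=number)) / number


if __name__ == "__main__":
    print("lazy:   %.4f s per loop" % bench(iter_items))
    print("copied: %.4f s per loop" % bench(copy_items))
```

Taking the minimum over the repeats mirrors the "best of 3" that the timeit
command line reports.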


To benchmark memory consumption, I used the following on bare metal:

$ valgrind --tool=massif --pages-as-heap=yes --massif-out-file=massif.out \
    $COMMAND_FROM_ABOVE
$ cat massif.out | grep mem_heap_B | sort -u
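
Note that sort -u compares the lines as text, so the largest value is not
necessarily the last line printed. A small sketch that picks the numeric peak
out of the massif output instead; the function name peak_mem_heap_bytes and
the sample values are invented for illustration, and it assumes the usual
one-per-snapshot "mem_heap_B=<bytes>" lines.

```python
import re


def peak_mem_heap_bytes(massif_text):
    """Return the largest mem_heap_B value found in massif output text."""
    # Anchoring on the start of the line skips mem_heap_extra_B entries.
    values = [int(v) for v in
              re.findall(r"^mem_heap_B=(\d+)$", massif_text, re.MULTILINE)]
    if not values:
        raise ValueError("no mem_heap_B lines found")
    return max(values)


# Made-up sample in the shape of massif snapshots, not real measurements.
sample = "\n".join([
    "snapshot=0", "mem_heap_B=4096", "mem_heap_extra_B=0",
    "snapshot=1", "mem_heap_B=1048576", "mem_heap_extra_B=0",
    "snapshot=2", "mem_heap_B=999999", "mem_heap_extra_B=0",
])
print(peak_mem_heap_bytes(sample))
```

To use it on a real run, read massif.out into a string and pass that in.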

$ python2 --version
Python 2.7.9

$ python3 --version
Python 3.4.3
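
As a rough cross-check that does not need valgrind, the sketch below uses
tracemalloc (Python 3.4+ only) to compare iterating the items() view with
iterating a list copy of it, which stands in for what items() does on python2.
It also prints the size of a bare list of n pointers, the quantity estimated
in the quoted thread below. The helper name peak_bytes and the smaller n are
my assumptions, and tracemalloc only sees allocations made through Python's
allocator, so these numbers are not comparable with the massif figures.

```python
import struct
import sys
import tracemalloc  # in the standard library since Python 3.4


def peak_bytes(make_iterable, d):
    """Peak traced allocation while iterating over make_iterable(d)."""
    tracemalloc.start()
    try:
        for _ in make_iterable(d):
            pass
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


n = 200000  # assumption: smaller than the 1000000 used above, to keep it quick
d = dict(enumerate(range(n)))  # built before tracing starts

view_peak = peak_bytes(lambda m: m.items(), d)        # lazy view
copy_peak = peak_bytes(lambda m: list(m.items()), d)  # list of n tuples

pointer_size = struct.calcsize("P")     # 4 or 8 bytes
list_bytes = sys.getsizeof([None] * n)  # the list object only, not its items

print("view peak: %d bytes" % view_peak)
print("copy peak: %d bytes" % copy_peak)
print("list of %d pointers: %d bytes" % (n, list_bytes))
```

The copy has to hold one (key, value) tuple per entry in addition to the list
of pointers, so its peak can be well above the pointer-only estimate.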

On Wed, Jun 10, 2015 at 8:36 PM, gordon chung <gord at live.ca> wrote:

> > Date: Wed, 10 Jun 2015 21:33:44 +1200
> > From: robertc at robertcollins.net
> > To: openstack-dev at lists.openstack.org
> > Subject: Re: [openstack-dev] [all][python3] use of six.iteritems()
> >
> > On 10 June 2015 at 17:22, gordon chung <gord at live.ca> wrote:
> > > maybe the suggestion should be "don't blindly apply six.iteritems or
> > > items" rather than don't apply iteritems at all. admittedly, it's a
> > > massive eyesore, but it's a very real use case that some projects deal
> > > with large data results and to enforce the latter policy can have
> > > negative effects[1]. one "million item dictionary" might be negligible
> > > but in a multi-user, multi-* environment that can have a significant
> > > impact on the amount of memory required to store everything.
> > >
> > > [1] disclaimer: i have no real world results but i assume memory
> > > management was the reason for the switch in logic from py2 to py3
> >
> > I wouldn't make that assumption.
> >
> > And no, memory isn't an issue. If you have a million item dict,
> > ignoring the internal overheads, the dict needs 1 million object
> > pointers. The size of a list with those pointers in it is 1M (pointer
> > size in bytes). E.g. 4M or 8M. Nothing to worry about given the
> > footprint of such a program :)
>
> iiuc, items() (in py2) will create a copy of the dictionary in memory to
> be processed. this is useful for cases such as concurrency where you want
> to ensure consistency but doing a quick test i noticed a massive spike in
> memory usage between items() and iteritems.
>
> 'for i in dict(enumerate(range(1000000))).items(): pass' consumes
> significantly more memory than 'for i in
> dict(enumerate(range(1000000))).iteritems(): pass'. on my system, memory
> consumption was double when using items() vs iteritems() and cpu
> utilization was significantly higher as well... let me know if there's
> anything that stands out as inaccurate.
>
> unless there's something wrong with my ignorant testing above, i think
> it's something projects should consider when mass applying any
> iteritems/items patch.
>
> cheers,
> gord