[openstack-dev] [all][python3] use of six.iteritems()
Victor Stinner
vstinner at redhat.com
Thu Jun 11 15:00:54 UTC 2015
Hi,
On 10/06/2015 02:15, Robert Collins wrote:
> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.items(): pass'
> 10 loops, best of 3: 76.6 msec per loop
> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in d.iteritems(): pass'
> 100 loops, best of 3: 22.6 msec per loop
Here .items() is about 3x as slow as .iteritems(). Hmm, I don't get the
same results. Try the attached benchmark. I'm using my own wrapper on top
of timeit, because timeit is bad at calibrating the benchmark :-/ and
gives unreliable results.
Results with CPU model Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz:
[ 10 keys ]
713 ns: iteritems
922 ns (+29%): items
[ 10^3 keys ]
42.1 us: iteritems
59.4 us (+41%): items
[ 10^6 keys (1 million) ]
89.3 ms: iteritems
442 ms (+395%): items
In my benchmark, .items() is 5x as slow as .iteritems(). The code to
iterate over 1 million items takes almost half a second. IMO adding more
than 300 ms to each request is not negligible for an application. If this
delay is added multiple times (multiple loops iterating over 1 million
items), we may reach up to 1 second on a user request :-/
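The kind of measurement discussed above can be sketched roughly as
follows. This is not the attached bench_iteritems.py; the helper names
(iter_items, best_time, bench) are made up for illustration, and it only
takes the best of several timeit runs to reduce noise. On Python 3, where
dict.iteritems() does not exist, both loops end up iterating over the same
lazy view, so the difference only shows on Python 2.

```python
import timeit


def iter_items(d):
    # Hypothetical helper: Python 2 dicts have iteritems(); on Python 3
    # items() is already a lazy view, so we just iterate over it.
    if hasattr(d, "iteritems"):
        return d.iteritems()
    return iter(d.items())


def best_time(func, repeat=5, number=10):
    # Take `repeat` samples of `number` calls each and keep the fastest
    # sample; return the time in seconds for a single call.
    timer = timeit.Timer(func)
    return min(timer.repeat(repeat=repeat, number=number)) / number


def bench(nkeys):
    d = dict(enumerate(range(nkeys)))

    def loop_items():
        for _ in d.items():      # builds a full list on Python 2
            pass

    def loop_iteritems():
        for _ in iter_items(d):  # lazy on both Python 2 and 3
            pass

    return best_time(loop_iteritems), best_time(loop_items)


if __name__ == "__main__":
    for nkeys in (10, 10 ** 3, 10 ** 5):
        t_iter, t_items = bench(nkeys)
        print("[%d keys] iteritems: %.3g s, items: %.3g s"
              % (nkeys, t_iter, t_items))
```

The absolute numbers depend on the machine and the Python version, so
they should only be compared within a single run.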
Anyway, when I write patches to port a project to Python 3, I don't want
to change *anything* on Python 2. The API, the performance, the
behaviour, etc. must not change.
I don't want to be responsible for a slowdown, and I don't feel able to
estimate whether replacing dict.iteritems() with dict.items() has a cost
in a real application.
As Ihar wrote: it must be done in a separate patch, by developers who
know the project well.
Currently, most developers writing Python 3 patches are not heavily
involved in each ported project.
There is also dict.itervalues(), not only dict.iteritems().
"for key in dict.iterkeys()" can simply be written "for key in dict:".
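As an illustration of the three cases, here is a small sketch that works
on both Python 2 and Python 3. The iteritems() and itervalues() helpers
below are simplified stand-ins written for this example so that it has no
dependencies; they mimic what six.iteritems() and six.itervalues() do, and
the quotas dict is made-up data.

```python
def iteritems(d):
    # Stand-in for six.iteritems(): lazy (key, value) pairs everywhere.
    if hasattr(d, "iteritems"):   # Python 2 dict
        return d.iteritems()
    return iter(d.items())        # Python 3: items() is a lazy view


def itervalues(d):
    # Stand-in for six.itervalues(): lazy values everywhere.
    if hasattr(d, "itervalues"):  # Python 2 dict
        return d.itervalues()
    return iter(d.values())       # Python 3: values() is a lazy view


quotas = {"cores": 20, "ram": 51200, "instances": 10}  # made-up data

total = sum(itervalues(quotas))
pairs = sorted(iteritems(quotas))
# Iterating over the dict itself replaces dict.iterkeys() on both versions.
keys = sorted(key for key in quotas)
```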
There is also xrange() vs range(); the debate is similar:
https://review.openstack.org/#/c/185418/
For Python 3, I suggest using "from six.moves import range" to get the
Python 3 behaviour on Python 2: range() is then always lazy and never
creates a temporary list. IMO it makes the code more readable because
"for i in xrange(n):" becomes "for i in range(n):". six only appears in
the import line, and "range()" is better than "xrange()" for developers
starting to learn Python.
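A small sketch of the effect. To keep it free of dependencies, the
try/except below is a stand-in for "from six.moves import range" (an
assumption made for this example, not what the patches use); in real code
the six import is the one-line equivalent. The sum_of_squares() function
is hypothetical.

```python
try:
    range = xrange  # Python 2: rebind range to the lazy xrange
except NameError:
    pass            # Python 3: range is already lazy


def sum_of_squares(n):
    # Reads the same on both versions; no temporary list is built.
    total = 0
    for i in range(n):
        total += i * i
    return total


big = range(10 ** 6)  # cheap: no million-element list is allocated
print(len(big), sum_of_squares(4))
```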
Victor
-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench_iteritems.py
Type: text/x-python
Size: 959 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20150611/98fbd47e/attachment.py>