[openstack-dev] [all][python3] use of six.iteritems()

Victor Stinner vstinner at redhat.com
Thu Jun 11 15:00:54 UTC 2015


On 10/06/2015 02:15, Robert Collins wrote:
> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in
> d.items(): pass'
> 10 loops, best of 3: 76.6 msec per loop

> python2.7 -m timeit -s 'd=dict(enumerate(range(1000000)))' 'for i in
> d.iteritems(): pass'
> 100 loops, best of 3: 22.6 msec per loop

In your numbers, .items() is about 3x as slow as .iteritems(). Hmm, I
don't get the same results. Try the attached benchmark. I'm using my
own wrapper on top of timeit, because timeit is bad at calibrating the
benchmark :-/ and gives unreliable results.

Results with CPU model Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz:

[ 10 keys ]
713 ns: iteritems
922 ns (+29%): items

[ 10^3 keys ]
42.1 us: iteritems
59.4 us (+41%): items

[ 10^6 keys (1 million) ]
89.3 ms: iteritems
442 ms (+395%): items
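
The attachment is scrubbed from the archive, so the script itself is not
shown here. As a rough sketch of this kind of measurement (not the
attached bench_iteritems.py), the code below calibrates the loop count
before timing and keeps the best of several runs. The names make_cases,
bench and report are made up for illustration. On Python 3, where
iteritems() does not exist, list(d.items()) stands in for the Python 2
list-building items().

```python
import timeit


def make_cases(nkeys):
    """Build a dict and return (lazy_loop, eager_loop) callables over it."""
    d = dict(enumerate(range(nkeys)))
    # Python 2 has iteritems(); on Python 3 items() is already a lazy view.
    lazy = getattr(d, "iteritems", d.items)
    if hasattr(d, "iteritems"):
        eager = d.items                      # Python 2: builds a list
    else:
        eager = lambda: list(d.items())      # Python 3: emulate the list copy

    def lazy_loop():
        for _ in lazy():
            pass

    def eager_loop():
        for _ in eager():
            pass

    return lazy_loop, eager_loop


def bench(func, repeat=5, min_time=0.2):
    """Return the best time per call, in seconds."""
    timer = timeit.Timer(func)
    # Calibration: grow the loop count until one run lasts at least min_time.
    number = 1
    while timer.timeit(number) < min_time:
        number *= 10
    return min(timer.repeat(repeat=repeat, number=number)) / number


def report(sizes=(10, 10 ** 3)):
    """Print lazy versus list-building timings for each dict size."""
    for nkeys in sizes:
        lazy_loop, eager_loop = make_cases(nkeys)
        t_lazy, t_eager = bench(lazy_loop), bench(eager_loop)
        print("[ %d keys ] lazy: %.3g s, list: %.3g s (+%.0f%%)"
              % (nkeys, t_lazy, t_eager, (t_eager / t_lazy - 1) * 100))
```

Calling report((10, 10 ** 3, 10 ** 6)) prints one line per size; the
absolute numbers depend on the machine and the Python version.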

In my benchmark, .items() is 5x as slow as .iteritems() on 1 million
keys. The code to iterate on 1 million items takes almost half a
second. IMO adding about 350 ms (442 ms instead of 89.3 ms) to each
request is not negligible for an application. If this delay is added
multiple times (multiple loops iterating on 1 million items), we may
reach up to 1 second on a user request :-/
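
To make the arithmetic explicit, here is a small sketch that recomputes
the ratios from the timings listed above. The function name overhead is
only illustrative.

```python
# Timings copied from the results above, converted to seconds:
# (label, iteritems time, items time)
results = [
    ("10 keys", 713e-9, 922e-9),
    ("10^3 keys", 42.1e-6, 59.4e-6),
    ("10^6 keys", 89.3e-3, 442e-3),
]


def overhead(lazy, eager):
    """Return (ratio, percent increase, extra seconds) of eager vs lazy."""
    ratio = eager / lazy
    return ratio, (ratio - 1.0) * 100.0, eager - lazy


for label, lazy, eager in results:
    ratio, percent, extra = overhead(lazy, eager)
    print("%-10s %.2fx  +%.0f%%  extra %.3g s" % (label, ratio, percent, extra))
```

For the 1 million keys case this gives a ratio of about 4.95 and roughly
0.35 s of extra time per loop.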

Anyway, when I write patches to port a project to Python 3, I don't want
to change *anything* on Python 2. The API, the performance, the
behaviour, etc. must not change.

I don't want to be responsible for a slowdown, and I don't feel able to
estimate whether replacing dict.iteritems() with dict.items() has a cost
in a real application.

As Ihar wrote: it must be done in a separate patch, by developers who
know the project well.

Currently, most developers writing Python 3 patches are not heavily 
involved in each ported project.

There is also dict.itervalues(), not only dict.iteritems().
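
For illustration, here is a minimal sketch of helpers that keep the lazy
Python 2 behaviour and also work on Python 3. six provides
six.iteritems() and six.itervalues() for this purpose; the two functions
below are simplified stand-ins written for this example, not the six
implementation, and the sample dict is made up.

```python
def iteritems(d):
    """Iterate over (key, value) pairs without building a list."""
    if hasattr(d, "iteritems"):
        return d.iteritems()     # Python 2 dict
    return iter(d.items())       # Python 3: items() is a lazy view


def itervalues(d):
    """Iterate over values without building a list."""
    if hasattr(d, "itervalues"):
        return d.itervalues()    # Python 2 dict
    return iter(d.values())      # Python 3: values() is a lazy view


quota = {"cores": 20, "instances": 10}   # hypothetical sample data
for name, limit in iteritems(quota):
    print("%s=%d" % (name, limit))
print(sum(itervalues(quota)))
```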

"for key in dict.iterkeys()" can simply be written "for key in dict:".

There is also xrange() vs range(), where the debate is similar:

For Python 3, I suggest using "from six.moves import range" to get the
Python 3 behaviour on Python 2: range() never creates a temporary list,
it always returns a lazy object that produces values on demand. IMO it
makes the code more readable because "for i in xrange(n):" becomes
"for i in range(n):". six only appears in the import line, and
"range()" is better than "xrange()" for developers starting to learn
Python.
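
A small sketch of the effect, written without six so that it runs on its
own: the try/except below is an assumed stand-in for what
"from six.moves import range" does, namely binding the name range to the
lazy type on both Python versions.

```python
try:
    range = xrange   # Python 2: rebind range to the lazy xrange type
except NameError:
    pass             # Python 3: the builtin range is already lazy

# No list of one million integers is created here.
numbers = range(10 ** 6)
print(isinstance(numbers, list))

total = 0
for i in range(5):   # reads the same on Python 2 and Python 3
    total += i
print(total)
```

The rest of the module keeps writing range(n), which is the readability
point made above.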

-------------- next part --------------
A non-text attachment was scrubbed...
Name: bench_iteritems.py
Type: text/x-python
Size: 959 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20150611/98fbd47e/attachment.py>
