[openstack-dev] [all] warning about __init__ importing modules - fast CLIs

Boris Pavlovic bpavlovic at mirantis.com
Tue Mar 17 00:15:59 UTC 2015


Robert,


Thanks for sharing this!
Now I know how to speed up Rally's startup ;)


Best regards,
Boris Pavlovic

On Tue, Mar 17, 2015 at 2:23 AM, Robert Collins <robertc at robertcollins.net>
wrote:

> So, one of the things that we sometimes do in an __init__.py is this:
>
> __all__ = ["submodule"]
> from . import submodule
>
> This means users can do
>
> import mymodule
> mymodule.submodule
>
> and it works.
>
> This is actually a bit of an anti-pattern in the Python space, because
> to import mymodule.othersubmodule we'll always pay the import cost of
> mymodule.submodule whether or not any code from it is used.
>
> And the import cost can be substantial.
>
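
A minimal sketch of the effect described above, assuming Python 3 and a throwaway package written to a temporary directory. The names demo_pkg, build_demo_package and modules_loaded_by are made up for illustration and do not come from any OpenStack library. It shows that asking only for demo_pkg.othersubmodule still loads demo_pkg.submodule, because the package __init__ imports it eagerly.

```python
import importlib
import os
import sys
import tempfile


def build_demo_package(root):
    """Create a throwaway package whose __init__ eagerly imports a submodule."""
    pkg = os.path.join(root, "demo_pkg")
    os.mkdir(pkg)
    files = {
        "__init__.py": "__all__ = ['submodule']\nfrom . import submodule\n",
        "submodule.py": "VALUE = 1\n",
        "othersubmodule.py": "VALUE = 2\n",
    }
    for name, body in files.items():
        with open(os.path.join(pkg, name), "w") as handle:
            handle.write(body)


def modules_loaded_by(import_name, root):
    """Import import_name from root and list the demo_pkg modules now loaded."""
    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        importlib.import_module(import_name)
        return sorted(m for m in sys.modules if m.startswith("demo_pkg"))
    finally:
        sys.path.remove(root)


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(tmp)
    # Only othersubmodule is requested, yet submodule is loaded as well.
    loaded = modules_loaded_by("demo_pkg.othersubmodule", tmp)
    print(loaded)
```

In a real library the unwanted submodule may in turn import many other modules, which is where the cost adds up.
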
> Take for instance http://pad.lv/1431649 which is about osc being slow,
> and some of the slowness is likely due to the cost of importing unused
> modules from python-keystoneclient.
>
> In general, it is important for snappy short-lived processes that only
> the needed code is imported, and that has a few implications for the
> library code they consume. CLIs are the most prevalent example of such
> short-lived processes (including rootwrap's CLI thunk still).
> https://files.bemusement.org/talks/OSDC2008-FastPython/ is a nice
> summary of this btw by one of the other bzr cores back in the day -
> and not much has changed since then. We'll likely want to port the
> profile-imports facility over to our tooling to really track things
> down, since the default Python tools don't give us timestamps (hey,
> someone want to add that to python -v ?).
>
> So - the constraints I'd propose for libraries we use from CLIs,
> including our python-*client:
>  - import libraryname should be fast - no more than a ms or so. Timing
> with .pyc files is ok.
>    To time it (hot cache) - something like the following
>      python -m timeit -s 'import sys; o=dict(sys.modules)' 'import keystoneclient; sys.modules.clear(); sys.modules.update(o)'
>    right now keystoneclient is somewhat slow: 10 loops, best of 3: 220
> msec per loop
>    Timing cold cache is harder, something like:
>     import datetime
>     import subprocess
>     subprocess.call('echo 3 | sudo tee /proc/sys/vm/drop_caches',
> shell=True)
>     start = datetime.datetime.now()
>     import keystoneclient
>     stop = datetime.datetime.now()
>     print(stop - start)
>
>    should get a decent approximation. Right now I see 0:00:00.506059 -
> or 500ms. On an SSD. Try it on spinning rust and I think you'll cry.
>
>  - as a corollary, __init__ should not import things unless *every use
> of the library ever* will need it.
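
As an alternative to the timing recipes above, here is a hedged sketch (Python 3) that measures the import cost the way a CLI pays it: by starting a fresh interpreter. It compares an interpreter that does nothing with one that imports a module and reports the difference. This is a different technique from the sys.modules trick in the quoted command, the helper name startup_seconds is made up, and json stands in for whichever library you actually want to measure. It measures a warm disk cache only.

```python
import subprocess
import sys
import time


def startup_seconds(statement, repeats=5):
    """Best wall-clock time to run statement in a fresh interpreter."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.check_call([sys.executable, "-c", statement])
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


baseline = startup_seconds("pass")
with_import = startup_seconds("import json")  # substitute the library under test
print("interpreter only: %.1f ms" % (baseline * 1000.0))
print("with import:      %.1f ms" % (with_import * 1000.0))
print("import overhead:  %.1f ms" % ((with_import - baseline) * 1000.0))
```

Taking the best of several runs reduces noise from other processes; for small modules the difference can still be within measurement noise.
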
>
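
One possible way to follow the corollary while keeping "import mymodule; mymodule.submodule" working is to load the submodule on first attribute access. The sketch below assumes Python 3.7 or later, where a module-level __getattr__ is supported (PEP 562). That feature did not exist when the quoted mail was written, so this is an illustration and not what the author proposed. The package name lazy_demo_pkg is hypothetical and is created in a temporary directory.

```python
import importlib
import os
import sys
import tempfile

# Body of the demo package's __init__.py: nothing is imported eagerly.
LAZY_INIT = '''\
import importlib

__all__ = ["submodule"]


def __getattr__(name):
    # Called only when normal attribute lookup on the package fails.
    if name in __all__:
        return importlib.import_module("." + name, __name__)
    raise AttributeError("module %r has no attribute %r" % (__name__, name))
'''

with tempfile.TemporaryDirectory() as root:
    pkg = os.path.join(root, "lazy_demo_pkg")
    os.mkdir(pkg)
    for filename, body in (("__init__.py", LAZY_INIT),
                           ("submodule.py", "VALUE = 1\n")):
        with open(os.path.join(pkg, filename), "w") as handle:
            handle.write(body)

    sys.path.insert(0, root)
    try:
        importlib.invalidate_caches()
        package = importlib.import_module("lazy_demo_pkg")
        loaded_before = "lazy_demo_pkg.submodule" in sys.modules
        value = package.submodule.VALUE  # first access triggers the import
        loaded_after = "lazy_demo_pkg.submodule" in sys.modules
    finally:
        sys.path.remove(root)

print(loaded_before, loaded_after, value)
```

After the first access the import system binds the submodule as an attribute of the package, so later lookups no longer go through __getattr__.
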
> -Rob
>
> --
> Robert Collins <rbtcollins at hp.com>
> Distinguished Technologist
> HP Converged Cloud