removing use of pkg_resources to improve command line app performance

Sean Mooney smooney at redhat.com
Mon Jul 6 18:54:05 UTC 2020


On Mon, 2020-07-06 at 14:37 -0400, Doug Hellmann wrote:
> We have had a long-standing issue with the performance of the openstack command line tool. At least part of the
> startup cost is the time taken in scanning for all of the plugins that are installed, which is a side-effect of
> importing pkg_resources. To fix that, we need to eliminate all use of pkg_resources in code that would be used by a
> command line application (long-running services are candidates, too, but the benefit is bigger in short-lived command
> line apps).
> 
> Python 3.8 added a new library importlib.metadata, which also has an entry points API. It is more efficient, and
> produces data in a format that can be cached to make it even faster. I have started adding support for that caching to
> stevedore [0], which is the Oslo library for managing application plugins. For versions of Python earlier than 3.8, the
> same library is available on PyPI as “importlib_metadata”.
Based on https://opendev.org/openstack/governance/src/branch/master/reference/runtimes/victoria.rst we still need
to support 3.6 for Victoria. Is there a backport lib, like mock, for this on older Python releases?
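
For code that still has to run on 3.6, a minimal sketch of choosing between the stdlib module and the
"importlib_metadata" package on PyPI mentioned above might look like this. The helper name entry_points_for is made
up for illustration, and the sketch assumes the PyPI package is installed on interpreters older than 3.8:

```python
try:
    # Python 3.8 and later ship the library in the stdlib.
    import importlib.metadata as importlib_metadata
except ImportError:
    # Older interpreters (3.6, 3.7): assumes the PyPI package is installed.
    import importlib_metadata


def entry_points_for(group):
    """Return the entry points registered under ``group`` as a list.

    Hypothetical helper. The return type of entry_points() changed
    between releases, so both shapes are handled here.
    """
    eps = importlib_metadata.entry_points()
    if hasattr(eps, "select"):
        # Newer releases return an object with a select() method.
        return list(eps.select(group=group))
    # Older releases return a plain dict keyed by group name.
    return list(eps.get(group, []))
```
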
> 
> A big part of the implementation work will actually be removing the use of pkg_resources in places other than
> stevedore. We have a couple of different use patterns to consider and replace in different ways.
> 
> First, anything using iter_entry_points() should use a stevedore extension manager instead. There are a few of them to
> choose from, based on how the plugins will be used. The stevedore docs [1] include a tutorial and documentation for
> all of the classes and their uses. Most calls to iter_entry_points() can be replaced with a stevedore.ExtensionManager
> directly, but the other managers are meant to implement common access patterns like selecting a subset (or just one)
> of the available plugins by name.
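
As a rough illustration of that first replacement, here is a sketch of a loop over iter_entry_points() rewritten
around stevedore.ExtensionManager. The function name load_plugins and the namespace "example.plugins" are
placeholders, not names from any of the patches, and stevedore is imported inside the function only so the module
itself stays cheap to import:

```python
def load_plugins(namespace="example.plugins"):
    """Return a {name: plugin} dict for every plugin in ``namespace``.

    Sketch only: the namespace is a placeholder, and stevedore must be
    installed for the call to succeed.
    """
    # Imported lazily so that importing this module does not trigger
    # any plugin scanning.
    from stevedore import extension

    mgr = extension.ExtensionManager(
        namespace=namespace,
        invoke_on_load=False,  # keep the loaded objects, do not call them
    )
    # Each item is a stevedore Extension; .plugin is the loaded object,
    # i.e. what entry_point.load() would have returned.
    return {ext.name: ext.plugin for ext in mgr}
```

When only one plugin, or a named subset, is wanted, stevedore.driver.DriverManager or
stevedore.named.NamedExtensionManager would probably be the better fit than filtering this dict by hand.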
> 
> Second, we have a few places where pkg_resources.get_distribution(name).version is used to discover a package’s
> installed version. Those can be changed to use importlib.metadata.version() instead, as in [2]. This is *much* faster
> because importlib goes directly to the metadata file for the named package instead of looking through all of the
> installed packages.
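
For the version lookup, the change is close to a one-liner. A small sketch, where installed_version is a made-up
wrapper name and the import shown is the 3.8+ stdlib one (3.6 and 3.7 would need the PyPI package instead):

```python
from importlib import metadata  # on 3.6/3.7: import importlib_metadata as metadata


def installed_version(name, default=None):
    """Return the installed version string of ``name``, or ``default``.

    Stands in for pkg_resources.get_distribution(name).version.
    """
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        # Raised when no distribution with that name is installed.
        return default
```
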
> 
> Finally, any code using any properties of the EntryPoint returned by stevedore other than “name” and “load()” may need
> to be updated. The new EntryPoint class in importlib.metadata is not 100% compatible with the one from pkg_resources.
> The same data is there, but sometimes it is named differently. If we need a compatibility layer we could put that in
> stevedore, but it is unusual to need access to any of the internals of EntryPoint and it’s typically better to use the
> manager abstractions in stevedore instead of manipulating EntryPoint instances directly.
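
To make the naming difference concrete: as far as I can tell, pkg_resources exposes module_name and attrs on its
EntryPoint, while the importlib.metadata one carries the same information in a single value string. A sketch of
reading it in a way that should work across releases, with module_and_attr being a hypothetical helper and the
entry point built by hand purely for illustration:

```python
import json
from importlib import metadata  # on 3.6/3.7: import importlib_metadata as metadata


def module_and_attr(ep):
    """Split an importlib.metadata EntryPoint value into (module, attr).

    Hypothetical helper. Any trailing "[extras]" part is dropped.
    """
    module, _, attr = ep.value.partition(":")
    return module.strip(), attr.split("[")[0].strip()


# Hand-built entry point pointing at json.dumps, for illustration only.
ep = metadata.EntryPoint(name="dumps", value="json:dumps", group="example.plugins")

module, attr = module_and_attr(ep)
loaded = ep.load()  # name and load() behave the same as before
```
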
> 
> I have started making some of the changes [3], but I’m doing this in my quarantine-induced spare time so it’s likely
> to take a while. If you want to pitch in, I would appreciate it. I am using the topic “osc-performance”, since the
> work is related to making python-openstackclient faster. Feel free to tag me for reviews on your patches.
> 
> Doug
> 
> [0] https://review.opendev.org/#/c/739306/
> [1] https://docs.openstack.org/stevedore/latest/
> [2] https://review.opendev.org/#/c/739379/2
> [3] https://review.opendev.org/#/q/topic:osc-performance
> 



