removing use of pkg_resources to improve command line app performance

Doug Hellmann doug at doughellmann.com
Mon Jul 6 19:33:05 UTC 2020



> On Jul 6, 2020, at 3:30 PM, Sean Mooney <smooney at redhat.com> wrote:
> 
> On Mon, 2020-07-06 at 15:03 -0400, Doug Hellmann wrote:
>>> On Jul 6, 2020, at 2:54 PM, Sean Mooney <smooney at redhat.com> wrote:
>>> 
>>> On Mon, 2020-07-06 at 14:37 -0400, Doug Hellmann wrote:
>>>> We have had a long-standing issue with the performance of the openstack command line tool. At least part of the
>>>> startup cost is the time taken in scanning for all of the plugins that are installed, which is a side-effect of
>>>> importing pkg_resources. To fix that, we need to eliminate all use of pkg_resources in code that would be used by
>>>> a command line application (long-running services are candidates, too, but the benefit is bigger in short-lived
>>>> command line apps).
>>>> 
>>>> Python 3.8 added a new library, importlib.metadata, which also has an entry points API. It is more efficient, and
>>>> produces data in a format that can be cached to make it even faster. I have started adding support for that
>>>> caching to stevedore [0], which is the Oslo library for managing application plugins. For versions of Python
>>>> earlier than 3.8, the same library is available on PyPI as “importlib_metadata”.
>>> 
>>> based on https://opendev.org/openstack/governance/src/branch/master/reference/runtimes/victoria.rst we still need
>>> to support 3.6 for victoria. is there a backport lib, like mock, for this on older python releases?
>> 
>> Yes, importlib_metadata is on PyPI and available all the way back to 2.7. It is already in the requirements list, and
>> if applications switch to using stevedore instead of scanning plugins themselves the implementation details of which
>> version of the library is invoked will be hidden.
> cool i will need to check os-vif more closely but i think we do everything via the stevedore extension manager
> https://github.com/openstack/os-vif/blob/master/os_vif/__init__.py#L38-L49
> maybe some plugins are doing some things they should not but the intent was to rely only on stevedore and its apis.
> so it sounds like this should just work for os-vif at least.

That’s definitely the goal of putting the cache behind the stevedore API.

>> 
>>>> 
>>>> A big part of the implementation work will actually be removing the use of pkg_resources in places other than
>>>> stevedore. We have a couple of different use patterns to consider and replace in different ways.
>>>> 
>>>> First, anything using iter_entry_points() should use a stevedore extension manager instead. There are a few of
>>>> them to choose from, based on how the plugins will be used. The stevedore docs [1] include a tutorial and
>>>> documentation for all of the classes and their uses. Most calls to iter_entry_points() can be replaced with a
>>>> stevedore.ExtensionManager directly, but the other managers are meant to implement common access patterns like
>>>> selecting a subset (or just one) of the available plugins by name.
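
A minimal sketch of that first replacement, assuming stevedore is installed. The helper name load_plugins and the namespace string in the usage note are made up for illustration; only ExtensionManager and the name/plugin attributes of its extensions come from stevedore. The import is deferred into the function so nothing is scanned until plugins are actually requested.

```python
def load_plugins(namespace):
    """Return {entry point name: loaded plugin object} for one namespace.

    Hypothetical helper; it stands in for a loop over
    pkg_resources.iter_entry_points(namespace).
    """
    # Deferred import: stevedore is only needed once plugins are requested.
    from stevedore import ExtensionManager

    # invoke_on_load=False loads each plugin without calling it.
    manager = ExtensionManager(namespace=namespace, invoke_on_load=False)

    # Each extension exposes the entry point name and the loaded object.
    return {ext.name: ext.plugin for ext in manager}
```

Calling load_plugins("example.plugins") (a placeholder namespace) would then return whatever plugins are registered under that group, and the caller never touches an EntryPoint directly.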
>>>> 
>>>> Second, we have a few places where pkg_resources.get_distribution(name).version is used to discover a package’s
>>>> installed version. Those can be changed to use importlib.metadata.version() instead, as in [2]. This is *much*
>>>> faster because importlib goes directly to the metadata file for the named package instead of looking through all
>>>> of the installed packages.
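
A rough sketch of the version lookup, not the exact change in [2]. The helper name installed_version is hypothetical, and returning None for a missing package is a choice made here for illustration. The try/except import falls back to the importlib_metadata backport on Python older than 3.8.

```python
try:
    # Standard library on Python 3.8 and later.
    from importlib import metadata as importlib_metadata
except ImportError:
    # Backport from PyPI for older interpreters.
    import importlib_metadata


def installed_version(dist_name):
    """Return the installed version string for dist_name, or None.

    Stands in for pkg_resources.get_distribution(dist_name).version.
    """
    try:
        return importlib_metadata.version(dist_name)
    except importlib_metadata.PackageNotFoundError:
        # Illustrative choice: report "not installed" as None.
        return None
```

For example, installed_version("stevedore") returns the version string when stevedore is installed and None otherwise.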
>>>> 
>>>> Finally, any code using any properties of the EntryPoint returned by stevedore other than “name” and “load()” may
>>>> need to be updated. The new EntryPoint class in importlib.metadata is not 100% compatible with the one from
>>>> pkg_resources. The same data is there, but sometimes it is named differently. If we need a compatibility layer
>>>> we could put that in stevedore, but it is unusual to need access to any of the internals of EntryPoint and it’s
>>>> typically better to use the manager abstractions in stevedore instead of manipulating EntryPoint instances
>>>> directly.
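
To illustrate the naming difference: pkg_resources exposes module_name and attrs on its EntryPoint, while the importlib.metadata one carries the same information in a single value string. The sketch below shows one way code could recover the old fields; split_entry_point and the "demo.plugins" group are hypothetical, and this is not an existing stevedore compatibility layer.

```python
try:
    from importlib import metadata as importlib_metadata
except ImportError:
    import importlib_metadata  # PyPI backport for Python < 3.8


def split_entry_point(ep):
    """Return (module_name, attrs) in the style of pkg_resources.

    Hypothetical helper: parses ep.value, e.g. "pkg.mod:Class.attr [extra]".
    """
    # Drop any trailing extras such as "[extra]".
    target = ep.value.split("[", 1)[0].strip()
    module_name, _, attr_path = target.partition(":")
    attr_path = attr_path.strip()
    attrs = tuple(attr_path.split(".")) if attr_path else ()
    return module_name.strip(), attrs


# Build an entry point by hand so the example needs no installed plugins.
ep = importlib_metadata.EntryPoint(
    name="dumps", value="json:dumps", group="demo.plugins"
)
module_name, attrs = split_entry_point(ep)
```

Code that only uses ep.name and ep.load() needs none of this, which is the case the manager abstractions are meant to cover.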
>>>> 
>>>> I have started making some of the changes [3], but I’m doing this in my quarantine-induced spare time so it’s
>>>> likely to take a while. If you want to pitch in, I would appreciate it. I am using the topic “osc-performance”,
>>>> since the work is related to making python-openstackclient faster. Feel free to tag me for reviews on your
>>>> patches.
>>>> 
>>>> Doug
>>>> 
>>>> [0] https://review.opendev.org/#/c/739306/
>>>> [1] https://docs.openstack.org/stevedore/latest/
>>>> [2] https://review.opendev.org/#/c/739379/2
>>>> [3] https://review.opendev.org/#/q/topic:osc-performance



