[openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

Morgan Fainberg morgan.fainberg at gmail.com
Thu Sep 18 00:22:47 UTC 2014



-----Original Message-----
From: Ian Cordasco <ian.cordasco at rackspace.com>
Reply: OpenStack Development Mailing List (not for usage questions) <openstack-dev at lists.openstack.org>
Date: September 17, 2014 at 16:28:57
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev at lists.openstack.org>
Subject:  Re: [openstack-dev] Please do *NOT* use "vendorized" versions of anything (here: glanceclient using requests.packages.urllib3)

> On 9/17/14, 5:39 PM, "Mike Bayer" wrote:
>  
> >
> >On Sep 17, 2014, at 4:31 PM, Ian Cordasco  
> >wrote:
> >
> >> Project X pins a version of requests. Alice doesn’t know anything about
> >> requests and does pip install X. Until Alice takes a more active role in
> >> the development of Project X and looks into requests, she will never
> >>know
> >> she’s installed software that has exposures in it.
> >
> >If a vulnerability is reported in urllib3 1.9.1, Alice, as well as me and
> >everyone else who is not a novice, will at least know we need to run:
> >
> >$ pip show urllib3
> >---
> >Name: urllib3
> >Version: 1.9.1
> >Location: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages
> >Requires:
> >
> >
> >and we know right there we have to upgrade. We upgrade, and we’re done.
> > If we see that some library is pinning it, we will know. We will
> >complain loudly to that library’s author and/or replace that library.
> >The tools are there to give us what we need to be aware and to escalate
> >the problem.
>  
> And when the library is unmaintained, you’ll be yelling into the echo
> chamber. If it is popular enough, it’ll continue to be used regardless
> because every recommendation that will be highest ranked by :search_engine
> will tell people looking for related libraries to do so. In all
> likelihood, most users who stumble across these projects will also be
> ignorant of the fact that :some_dependency has a CVE and needs to be
> upgraded.
>  
> >When a library silently bundles the source code and bypasses any normal
> >means of us knowing it’s present unless we read the source code or scour
> >the documentation, we have no way to know we’re affected. Some
> >applications, particularly pip, have to do this, however, it should only
> >be for technical reasons. It should not be because you don’t want novice
> >users to have to learn something, or because you’re angling to have lots
> >of downloads on pypi.
>  
> It is for technical reasons, but this is not the appropriate place to
> discuss them. There are at least 3 other closed issues on requests’ issue
> tracker that discuss most (if not all) of them. This is the list to
> discuss the development of OpenStack. Finally, for the last time, the fact
> that we vendor these libraries exists in multiple places (beyond the
> source).
>  
> >>> People make sure to upgrade their Requests libraries locally, but for
> >>>all
> >>> those poor saps who have *no idea* they have widely used apps that are
> >>> bundling it silently, they remain totally open to vulnerabilities and
> >>>the
> >>> black hats have disneyland at their disposal.
> >>
> >> I think more applications bundle it than you realize. You’re likely
> >>using
> >> one daily that does it.
> >
> >
> >SQLAlchemy itself vendorizes Queue and some fragments of six, but that is
> >of a much smaller scale, and is for technical reasons, rather than
> >appeasing-newbie reasons. But HTTP has a lot of security-critical
> >surface area. If I were to just bundle my own fork of an HMAC library
> >with a few of my own special enhancements, that would be seen as a
> >problem.
>  
> It would be seen as a problem. Except we don’t do anything even remotely
> as security related. The majority of what we do is certificate
> verification. We don’t bundle ssl. We don’t bundle pyOpenSSL. We don’t
> bundle backports.ssl. We don’t have custom TLS handlers that we wrote
> from scratch. Your analogy is way out of proportion. And for what it’s
> worth, pyCrypto (if I remember correctly) has never been audited and yet
> it is used:
> https://github.com/openstack/requirements/blob/master/global-requirements.txt#L86.
> That seems like a bigger issue than whether requests vendors an
> implementation detail.
>  
> >
> >> And yeah, we’ll continue to take the blame for the mistake that was made
> >> for those two exposures. As for “Is that how things should be done?”
> >> that’s not for me to answer. More than enough projects do it and do it
> >>out
> >> of necessity. The reality is that by vendoring its dependencies,
> >>requests
> >> allows its users more flexibility than other projects.
> >
> >I haven’t seen the technical reason for Requests doing this, I’ve only
> >seen this reason: "I want my users to be free to not use packaging if
> >they don't want to. They can just grab the tarball and go.”. If that’s
> >really the only reason, then I fail to see how that reason has anything
> >to do with flexibility, other than the flexibility to remain lazy and
> >ignorant of basic computer programming skills - and Requests is a library
> >*for programmers*. It doesn’t do anything without typing code.
>  
> Perhaps I wasn’t clear enough. If I wasn’t, I apologize. What I meant to
> say above is that requests gives the users the ability to vendor requests
> into their libraries or applications as well. It does not advocate it. It
> does not require it (although I’m starting to wonder if we should change
> our license to make it a requirement :P). It gives the user the ability to
> use requests as a vendored dependency without having to edit a single line
> of the source. No imports need to be mangled and no edits are necessary to
> vendor requests. It’s on PyPI so you can use it as a dependency in your
> setup.py or just vendor it. That’s the flexibility I’m referencing. It’s a
> healthy flexibility too because it follows the mantra of Python that
> “We’re all consenting adults” and so users of requests can do whatever
> they want with the code.


I think all of the conversation to this point has been valuable. The general consensus is that vendoring a library is not as desirable as using it strictly as a dependency. It would be nice, in a perfect world, if vendoring weren’t an issue, but in this case I think the root of the matter is that Debian un-vendors urllib3, while we have referenced the vendored urllib3 (requests.packages.urllib3) instead of installing and using urllib3 directly.

This poses at least one problem for us: we cannot guarantee that we’re using the same urllib3 library that requests is using. I am unsure how big a deal this ends up being, but it is a concern, and it raises the question of how to handle this in the most appropriate and consistent way across all of the distributions that we, as OpenStack, support.

Does this make requests a bad library we should toss aside for something else? Instead of being concerned with the reasons for vendoring urllib3 (or un-vendoring it) we should shift the conversation towards two questions:

1. Is it a real issue if the version of urllib3 is mismatched between our client libraries and requests?
2. If it is a real issue, how do we solve it?
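
As a starting point for question 1, something like the following could report whether two import paths resolve to the same code at runtime. This is only a rough sketch: the helper names (module_info, same_module) are made up for illustration, and it assumes that comparing __version__ and __file__ is a good enough proxy for "the same library".

```python
import importlib


def module_info(name):
    """Return (version, file) for an importable module, or None if it is missing.

    Hypothetical helper for illustration; not part of requests or urllib3.
    """
    try:
        mod = importlib.import_module(name)
    except ImportError:
        return None
    # Either attribute may be absent (e.g. built-in modules), so default to None.
    return getattr(mod, "__version__", None), getattr(mod, "__file__", None)


def same_module(first, second):
    """Compare two import paths.

    Returns True if both resolve to the same version and file, False if they
    differ, and None if either one cannot be imported.
    """
    a, b = module_info(first), module_info(second)
    if a is None or b is None:
        return None
    return a == b


if __name__ == "__main__":
    # Assumption: these are the two import paths under discussion in this thread.
    print(same_module("requests.packages.urllib3", "urllib3"))
```

Run on a given distribution, this prints True when both paths resolve to one copy of urllib3, False when they point at different copies, and None when one of the two is not importable.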

Obviously we can work with the requests team to figure out the best approach. We should focus on the solution here rather than continuing to debate whether requests should or shouldn’t be vendoring its dependencies, since it is clear that the team has its reasons and does not want to switch back to the dependency model.

—Morgan