[openstack-dev] [tc][infra][release][security][stable][kolla][loci][tripleo][docker][kubernetes] do we want to be publishing binary container images?

Michał Jastrzębski inc007 at gmail.com
Wed May 17 18:39:26 UTC 2017

On 17 May 2017 at 11:36, Michał Jastrzębski <inc007 at gmail.com> wrote:
> On 17 May 2017 at 11:04, Doug Hellmann <doug at doughellmann.com> wrote:
>> Excerpts from Michał Jastrzębski's message of 2017-05-17 07:47:31 -0700:
>>> On 17 May 2017 at 04:14, Chris Dent <cdent+os at anticdent.org> wrote:
>>> > On Wed, 17 May 2017, Thierry Carrez wrote:
>>> >
>>> >> Back to container image world, if we refresh those images daily and they
>>> >> are not versioned or archived (basically you can only use the latest and
>>> >> can't really access past dailies), I think we'd be in a similar
>>> >> situation?
>>> >
>>> >
>>> > Yes, this.
>>> I think it's not a bad idea to message "you are responsible for
>>> archiving your containers". Do that, combine it with a good toolset that
>>> helps users determine versions of packages and other metadata, and
>>> we'll end up with something that would itself be greatly appreciated.
>>> A few potential user stories:
>>> I have an OpenStack cloud of <100 nodes and need every single one of them,
>>> hence no CI. At the same time I want to have fresh packages to avoid CVEs.
>>> I deploy kolla with tip-of-the-stable-branch and set up a cron job that
>>> will upgrade it every week. Because my scenario is quite typical and the
>>> containers already ran through gates that test my scenario, I'm good.
>>> Another one:
>>> I have a 300+ node cloud, heavy CI and a security team examining every
>>> container. While I could build containers locally, downloading them is
>>> just simpler and effectively the same (after all, it's the containers
>>> being tested, not the build process). On every download our security team
>>> scrutinizes the containers and uses the toolset Kolla provides to help
>>> them. An additional benefit is that on top of our CI these images went
>>> through Kolla CI, which is nice; more testing is always good.
>>> And another one:
>>> We are the Kolla community. We want to provide testing for full release
>>> upgrades every day in gates, to make sure OpenStack and Kolla are
>>> upgradable and to improve the general user experience of upgrades. Because
>>> infra is resource constrained, we cannot afford to build 2 sets of
>>> containers (stable and master) and do deploy->test->upgrade->test.
>>> However, because we have these cached containers, which are fresh and
>>> passed CI for deploy, we can just use them! Now effectively we're not
>>> only testing the correctness of Kolla's upgrade procedure but also all the
>>> other project teams' upgrades! Oh, it seems Nova merged something that
>>> negatively affects upgrades, let's make sure they are aware!
>>> And the last one, which should not be underestimated:
>>> I am the CTO of some company and I've heard OpenStack is no longer hard
>>> to deploy, so I'll just download kolla-ansible and try. I'll follow this
>>> guide that deploys a simple OpenStack with 2 commands and a few small
>>> configs, and it's done! Super simple! We're moving to OpenStack and
>>> will start contributing tomorrow!
>>> Please, let's solve the messaging problems, put the burden of archiving
>>> on users, whatever it takes to protect our community from wrong
>>> expectations, but let's not kill this effort. There are very real and
>>> immediate benefits to OpenStack as a whole if we do this.
>>> Cheers,
>>> Michal
>> You've presented some positive scenarios. Here's a worst case
>> situation that I'm worried about.
>> Suppose in a few months the top several companies contributing to
>> kolla decide to pull out of or reduce their contributions to
>> OpenStack.  IBM, Intel, Oracle, and Cisco either lay folks off or
>> redirect their efforts to other projects.  Maybe they start
>> contributing directly to kubernetes. The kolla team is hit badly,
>> and all of the people from that team who know how the container
>> publishing jobs work are gone.
> There are only 2 ways to defend against that: diverse community, which
> we have. If Intel, Red Hat, Oracle, Cisco and IBM back out of
> OpenStack, we'd still have almost 50% of contributors. I think we're
> much more likely to survive than most other Big Tent projects. In
> fact, I'd think that with our current diversity we'll survive for as
> long as OpenStack survives.

Diverse community and off-by-one errors ;) I meant to say a diverse
community and involvement.

> Also, all the more reason why *we shouldn't build images personally*;
> we should have an autonomous process to do it for us.
>> The day after everyone says goodbye, the build breaks. Maybe a bad
>> patch lands, or maybe some upstream assumption changes. The issue
>> isn't with the infra jobs themselves. The break means no new container
>> images are being published. Since there's not much of a kolla team
>> any more, it looks like it will be a while before anyone has time
>> to figure out how to fix the problem.
>> Later that same day, a new zero-day exploit is announced in a
>> component included in all or most of those images. Something that
>> isn't developed in the community, such as OpenSSL or glibc. The
>> exploit allows a complete breach of any app running with it. All
>> existing published containers include the bad bits and need to be
>> updated.
> I guess this is a problem of all software ever written. If the community
> around it dies, the people who use it are in a lot of trouble. One way to
> make sure it won't happen is to get involved yourself, so that you
> can fix what is broken for you. This is how open source works. In
> Kolla most of our contributors are actually operators who run these
> very containers in their own infrastructure. This is where our
> diversity comes from. We aren't a distro, and that makes us, and our
> users, more protected from this scenario.
> If nova loses all of its community, and someone finds a critical bug in
> nova that allows hackers to gain access to VM data, there will be
> nobody to fix it. That's bad, right? The same argument can be made. We
> aren't discussing deleting Nova though, right?
>> We now have an unknown number of clouds running containers built
>> by the community with major security holes. The team responsible
>> for maintaining those images is a shambles, but even if they weren't
>> the automation isn't working, so no new images can be published.
>> The consumers of the existing containers haven't bothered to set
>> up build pipelines of their own, because why bother? Even though
>> we've clearly said the images "we" publish are for our own testing,
>> they have found it irresistibly convenient to use them and move on
>> with their lives.
>> When the exploit is announced, they start clamoring for new container
>> images, and become understandably irate when we say we didn't think
>> they would be using them in production and they *shouldn't have*
>> and their problems are not our problems because we told them not
>> to do that. Some of them point to this mailing list thread, and the
>> promises made.  When we tell them those images were really being
>> built by the kolla team and that they're gone and none of the rest
>> of us know how to build new images or fix the problem with the build
>> system, panic ensues.  The community gets a bad reputation for
>> overreaching and not supporting what "we" produce.
> Automated builds and involvement in the community are ways to avoid this.
> Diversity already keeps us relatively safe.
>> Contrast that with a scenario in which consumers either take
>> responsibility for their systems by building their own images, by
>> collaborating directly with other consumers to share the resources
>> needed to build those images, or by paying a third-party a sustainable
>> amount of money to build images for them. In any of those cases,
>> there is an incentive for the responsible party to be ready and
>> able to produce new images in a timely manner. Consumers of the
>> images know exactly where to go for support when they have problems.
>> Issues in those images don't reflect on the community in any way,
>> because we were not involved in producing them.
> Unless, as you said, the build system breaks; then they are equally
> screwed locally, unless someone fixes it, and then they can fix it for
> OpenStack infra too. The difference is that for OpenStack infra it's the
> whole community that can fix it, whereas locally it's just you. That's
> the strength of open source.
>> As I said at the start of this thread, we've long avoided building
>> and supporting simple operating system style packages of the
>> components we produce. I am still struggling to understand how
>> building more complex artifacts, including bits over which we have
>> little or no control, is somehow more sustainable than those simple
>> packages.
> Binaries are built as standalone projects. Nova-api has no
> dependencies built into the .rpm. If the issue you just described happened
> in any of the projects OpenStack uses as a dependency, in any of these [1],
> the same argument applies. We pin specific versions in upper constraints.
> I'm willing to bet money that if one of these libs had a CVE announced
> today, there is a good chance we wouldn't find out.
> Bottom line: yes, we do *package* today, with pip. Exactly the same issues
> apply to pip packages if the versions of dependencies are fixed, which
> they are in large part. I guess that's a larger issue which we
> should address, and it has nothing to do whatsoever with container
> publishing.
> [1] https://github.com/openstack/requirements/blob/master/upper-constraints.txt
>> Doug
