[openstack-dev] [tc][infra][release][security][stable][kolla][loci][tripleo][docker][kubernetes] do we want to be publishing binary container images?

Michał Jastrzębski inc007 at gmail.com
Wed May 17 18:36:40 UTC 2017


On 17 May 2017 at 11:04, Doug Hellmann <doug at doughellmann.com> wrote:
> Excerpts from Michał Jastrzębski's message of 2017-05-17 07:47:31 -0700:
>> On 17 May 2017 at 04:14, Chris Dent <cdent+os at anticdent.org> wrote:
>> > On Wed, 17 May 2017, Thierry Carrez wrote:
>> >
>> >> Back to container image world, if we refresh those images daily and they
>> >> are not versioned or archived (basically you can only use the latest and
>> >> can't really access past dailies), I think we'd be in a similar situation
>> >> ?
>> >
>> >
>> > Yes, this.
>>
>> I think it's not a bad idea to message "you are responsible for
>> archiving your containers". Do that, combine it with a good toolset that
>> helps users determine versions of packages and other metadata, and
>> we'll end up with something that itself would be greatly appreciated.
>>
>> A few potential user stories.
>>
>> I have OpenStack <100 nodes and need every single one of them, hence
>> no CI. At the same time I want to have fresh packages to avoid CVEs. I
>> deploy kolla with tip-of-the-stable-branch and set up a cronjob that will
>> upgrade it every week. Because my scenario is quite typical and the
>> containers already ran through gates that test my scenario, I'm good.
>>
>> Another one:
>>
>> I have a 300+ node cloud, heavy CI and a security team examining every
>> container. While I could build containers locally, downloading them is
>> just simpler and effectively the same (after all, it's the containers
>> being tested, not the build process). On every download our security team
>> scrutinizes the containers and uses the toolset Kolla provides to help them.
>> An additional benefit is that on top of our CI these images went through
>> Kolla CI, which is nice; more testing is always good.
>>
>> And another one
>>
>> We are the Kolla community. We want to provide testing for full release
>> upgrades every day in the gates, to make sure OpenStack and Kolla are
>> upgradable and to improve the general user experience of upgrades. Because
>> infra is resource constrained, we cannot afford building 2 sets of
>> containers (stable and master) and doing deploy->test->upgrade->test.
>> However, because we have these cached containers, which are fresh and
>> passed CI for deploy, we can just use them! Now effectively we're not
>> only testing the correctness of Kolla's upgrade procedure but also all the
>> other project teams' upgrades! Oh, it seems Nova merged something that
>> negatively affects upgrades, let's make sure they are aware!
>>
>> And the last one, which should not be underestimated
>>
>> I am CTO of some company and I've heard OpenStack is no longer hard to
>> deploy, I'll just download kolla-ansible and try. I'll follow this
>> guide that deploys simple OpenStack with 2 commands and few small
>> configs, and it's done! Super simple! We're moving to OpenStack and
>> start contributing tomorrow!
>>
>> Please, let's solve the messaging problems, put the burden of archiving on
>> users, whatever it takes to protect our community from wrong
>> expectations, but let's not kill this effort. There are very real and
>> immediate benefits to OpenStack as a whole if we do this.
>>
>> Cheers,
>> Michal
>
> You've presented some positive scenarios. Here's a worst case
> situation that I'm worried about.
>
> Suppose in a few months the top several companies contributing to
> kolla decide to pull out of or reduce their contributions to
> OpenStack.  IBM, Intel, Oracle, and Cisco either lay folks off or
> redirect their efforts to other projects.  Maybe they start
> contributing directly to kubernetes. The kolla team is hit badly,
> and all of the people from that team who know how the container
> publishing jobs work are gone.

There are only two ways to defend against that. The first is a diverse
community, which we have. If Intel, Red Hat, Oracle, Cisco and IBM back
out of OpenStack, we'd still have almost 50% of our contributors. I think
we're much more likely to survive than most other Big Tent projects. In
fact, I'd think that with our current diversity we'll survive for as
long as OpenStack survives.

The second is automation, which is all the more reason why *we shouldn't
build images personally*: we should have an autonomous process do it for us.

> The day after everyone says goodbye, the build breaks. Maybe a bad
> patch lands, or maybe some upstream assumption changes. The issue
> isn't with the infra jobs themselves. The break means no new container
> images are being published. Since there's not much of a kolla team
> any more, it looks like it will be a while before anyone has time
> to figure out how to fix the problem.

> Later that same day, a new zero-day exploit is announced in a
> component included in all or most of those images. Something that
> isn't developed in the community, such as OpenSSL or glibc. The
> exploit allows a complete breach of any app running with it. All
> existing published containers include the bad bits and need to be
> updated.

I guess this is a problem for all software ever written. If the community
around it dies, the people who use it are in a lot of trouble. One way to
make sure that doesn't happen is to get involved yourself, so that you
can fix what is broken for you. This is how open source works. In
Kolla, most of our contributors are actually operators who run these
very containers in their own infrastructure. This is where our
diversity comes from. We aren't a distro, and that makes us, and our
users, more protected from this scenario.

If nova loses all of its community and someone finds a critical bug in
nova that allows hackers to gain access to VM data, there will be
nobody to fix it. That's bad, right? The same argument can be made there.
We aren't discussing deleting Nova though, right?

> We now have an unknown number of clouds running containers built
> by the community with major security holes. The team responsible
> for maintaining those images is a shambles, but even if they weren't
> the automation isn't working, so no new images can be published.
> The consumers of the existing containers haven't bothered to set
> up build pipelines of their own, because why bother? Even though
> we've clearly said the images "we" publish are for our own testing,
> they have found it irresistibly convenient to use them and move on
> with their lives.
>
> When the exploit is announced, they start clamoring for new container
> images, and become understandably irate when we say we didn't think
> they would be using them in production and they *shouldn't have*
> and their problems are not our problems because we told them not
> to do that. Some of them point to this mailing list thread, and the
> promises made.  When we tell them those images were really being
> built by the kolla team and that they're gone and none of the rest
> of us know how to build new images or fix the problem with the build
> system, panic ensues.  The community gets a bad reputation for
> overreaching and not supporting what "we" produce.

Automated builds and involvement in the community are ways to avoid this.
Diversity already keeps us relatively safe.

> Contrast that with a scenario in which consumers either take
> responsibility for their systems by building their own images, by
> collaborating directly with other consumers to share the resources
> needed to build those images, or by paying a third-party a sustainable
> amount of money to build images for them. In any of those cases,
> there is an incentive for the responsible party to be ready and
> able to produce new images in a timely manner. Consumers of the
> images know exactly where to go for support when they have problems.
> Issues in those images don't reflect on the community in any way,
> because we were not involved in producing them.

Unless, as you said, the build system breaks; then they are equally screwed
locally. Unless someone fixes it, and then they can fix it for OpenStack
infra too. The difference is that for OpenStack infra the whole community
can fix it, whereas locally it's just you. That's the strength of open
source.

> As I said at the start of this thread, we've long avoided building
> and supporting simple operating system style packages of the
> components we produce. I am still struggling to understand how
> building more complex artifacts, including bits over which we have
> little or no control, is somehow more sustainable than those simple
> packages.

Binaries are built as standalone projects. Nova-api has no
dependencies built into its .rpm. If the issue you just described happened
in any of the projects OpenStack uses as a dependency, in any of these [1],
the same argument applies. We pin specific versions in upper constraints.
I'm willing to bet money that if a CVE were announced today for one of
these libs, there is a good chance we wouldn't find out.

Bottom line: yes, we do *package* today, with pip. Exactly the same issues
apply to pip packages if the versions of dependencies are fixed, which
a large portion of them are. I guess that's a larger issue that we
should address, and it has nothing whatsoever to do with container
publishing.

[1] https://github.com/openstack/requirements/blob/master/upper-constraints.txt
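
To make the pinning point a bit more concrete, here is a rough sketch of
the kind of check I mean. It assumes constraint lines of the form
"name===version", optionally followed by ";marker", and it takes the
advisory data as a plain dict that you would have to feed from whatever
source you trust. The function names, the package names and the sample
data are all made up for illustration; nothing like this runs in our
gates today as far as I know.

```python
from typing import Dict, Iterable, List, Set, Tuple


def parse_constraints(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Map each package name to the set of versions pinned for it."""
    pins: Dict[str, Set[str]] = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()      # drop comments
        if "===" not in line:
            continue                             # skip blanks and non-pins
        requirement = line.split(";", 1)[0]      # drop environment markers
        name, version = requirement.split("===", 1)
        pins.setdefault(name.strip().lower(), set()).add(version.strip())
    return pins


def find_affected(
    pins: Dict[str, Set[str]],
    advisories: Dict[str, Iterable[str]],
) -> List[Tuple[str, str]]:
    """Return (package, version) pairs that are pinned and known bad."""
    affected: List[Tuple[str, str]] = []
    for name, bad_versions in advisories.items():
        key = name.lower()
        for version in pins.get(key, set()) & set(bad_versions):
            affected.append((key, version))
    return sorted(affected)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the real constraints file.
    sample_lines = [
        "examplelib===1.2.3",
        "otherlib===2.0.0;python_version=='2.7'",
        "otherlib===2.1.0;python_version=='3.5'",
    ]
    sample_advisories = {"examplelib": ["1.2.3"]}
    print(find_affected(parse_constraints(sample_lines), sample_advisories))
```

A pin only gets flagged on an exact version match, which is the whole
point: once a version is fixed in the constraints, nobody notices a bad
release unless something actively compares the two lists.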

> Doug


