[Openstack-operators] [nova] Can we bump MIN_LIBVIRT_VERSION to 1.2.2 in Liberty?

John Garbutt john at johngarbutt.com
Fri May 15 17:14:08 UTC 2015


On 15 May 2015 at 17:41, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Fri, May 15, 2015 at 05:27:35PM +0100, John Garbutt wrote:
>> On 15 May 2015 at 11:51, Daniel P. Berrange <berrange at redhat.com> wrote:
>> > On Thu, May 14, 2015 at 02:23:25PM -0500, Matt Riedemann wrote:
>> >> The minimum required version of libvirt in the driver is 0.9.11 still [1].
>> >> We've been gating against 1.2.2 in Ubuntu Trusty 14.04 since Juno.
>> >>
>> >> The libvirt distro support matrix is here: [2]
>> >>
>> >> Can we safely assume that people aren't going to be running libvirt compute
>> >> nodes on RHEL < 7.1 or Ubuntu Precise?
>> >
>> > I don't really think so - at the very least Fedora 20 and RHEL 7.0 are still
>> > actively supported platforms by their vendors, which both have older libvirt
>> > versions (1.1.3 and 1.1.1 respectively).
>> >
>> > I'm not sure whether the SUSE team considers any of the 12.x versions to
>> > still be actively supported platforms or not, likewise which 13.x versions
>> > are under active support.
>>
>> As I understand it, RHEL 5.0 is supported until March 31, 2017.
>> https://access.redhat.com/support/policy/updates/errata
>>
>> Where should we draw the line? It's a tricky question.
>
> No one is seriously deploying KVM on RHEL-5 any more though, nor has done for
> quite a while, as RHEL-5 was very immature for KVM.  RHEL-6.x though is a solid
> platform for KVM that is still a widely deployed platform for virtualization,
> even for brand new deployments.
>
> The cut off line is difficult to set in stone - it takes a bit of analysis
> to understand what the real-world popularity of the platform is.

Yeah, makes sense.

>> > In this case, I just don't see compelling benefits to Nova libvirt maint
>> > to justify increasing the minimum version to the level you suggest, and
>> > it has a clear negative impact on our users which they will not be able
>> > to easily deal with. They will be left with three options, all of which are
>> > unsatisfactory:
>> >
>> >  - Upgrade from RHEL-6 to RHEL-7 - a major undertaking for most organizations
>> >  - Upgrade libvirt on RHEL-6 - they essentially take on the support burden
>> >    for the hypervisor themselves, losing support from the vendor
>> >  - Re-add the code we removed from Nova - we've given users a maint burden,
>> >    to rid ourselves of code that was posing no real maint burden on ourselves.
>>
>> That's assuming vendors do not backport libvirt and QEMU and support that.
>
> In early updates we do rebase to newer libvirt & QEMU in RHEL, but RHEL-6
> has passed the point in its lifecycle where we do that, so the versions
> are effectively fixed now for the remainder of RHEL-6.x.

Ah, OK.

It seems I also slightly misread the options; you covered them there.

>> > As a more general point, I think we are lacking clear guidance on our
>> > policies around hypervisor platform support and thus have difficulty
>> > in deciding when it is reasonable for us to drop support for platforms.
>>
>> +1
>>
>> Let's try to fix this in Liberty.
>>
>> I see this as part of "feature classification":
>> http://libertydesignsummit.sched.org/event/7ee9be7e3b005880706a914f44b296ed
>>
>> Honestly, I want us to call out the exact combinations we test. Be
>> that in logs, or in the release notes, or docs, or some combination
>> of all of those.
>>
>> Partly so we are honest about what we know works, partly so folks
>> step up and help us test more combinations users care about.
>>
>> This should really be true for everything, VMware, XenServer, Hyper-V
>> and others.
>
> Yes indeed. We need to give our users clear information about what is
> tested, so that they can undertake their own testing (with tempest
> or equiv) when they deploy on platforms that don't 100% match what
> upstream tested, or can at least ask their vendors to confirm they
> have tested on their behalf.

Cool, we agree there.

>> > The hypervisor platform is very different. While OpenStack does achieve
>> > some level of testing coverage of the hypervisor platform version used
>> > in the gate, this testing is inconsequential compared to the level of
>> > testing that vendors put into their hypervisor platforms.
>>
>> That depends on the scope right...
>>
>> Sure, we don't really test the "data plane"; that is left to the
>> hypervisor "vendor".
>>
>> But we do heavily test the "control plane", and that's useful.
>
> I can't speak for other vendors, but when we test virtualization
> in RHEL, we specifically test the combination of libvirt + QEMU + kernel,
> so changing any part of that invalidates testing and certification
> to some degrees. There are also sometimes fixes to libvirt's RHEL
> code to deal with RHEL-specific problems, so users deploying a new
> upstream libvirt would lose some of those workarounds.

Agreed.

I am just trying to be clear I don't expect OpenStack testing to test
the hypervisor "data plane".

That feels like something the upstream projects or vendors have to do themselves.

>> > Even if users decide they want to upgrade their hypervisor platform to
>> > a new version provided officially by the vendor, this is not always a
>> > quick or easy task. Many organizations have non-trivial internal
>> > testing and certification requirements before upgrading OS and/or hypervisor
>> > platforms. The hardware they own has to be certified as compatible and
>> > tested. They often have 3rd party monitoring and security auditing tools
>> > that need to be upgraded and integrated with the new platform. They may
>> > need to do long-term stress tests to prove the new platform / hardware
>> > combination is reliable at meeting their uptime requirements, and so on.
>> > So even with an active desire to upgrade their platform, it may take
>> > anywhere from 3-6 months to actually put that plan into practice. It may
>> > seem strange, but at the same time they can be perfectly ok upgrading
>> > openstack in a matter of weeks, or even doing continuous deployment,
>> > simply because that is a layer above that does not directly impact on
>> > the hardware or their base platform certification.
>>
>> Sadly, this is the reality at the moment.
>>
>> > I could see us increase libvirt to 0.10.2 and qemu to 0.12.1.2 since,
>> > assuming openSUSE 12.2 is end of life, RHEL-6 is the next oldest platform
>> > that is still supported. For that I would make Liberty print a warning on
>> > start if libvirt was in the range 0.9.9 to 0.10.2, likewise for QEMU, and
>> > then change the min version in Mxxxx.
>>
>> Sounds good:
>> https://review.openstack.org/#/c/183220/
>>
>> I feel we should add warnings for libvirt < 1.2.2 to say it's untested.
>
> I don't think that it is really acceptable for nova to log for every single
> version that is not matching upstream CI. That is going to end up with
> the majority of nova deployments spamming their logs with warning messages
> that users will just learn to ignore. The various distros will also likely
> just patch out the warnings, otherwise users will file an endless stream
> of bug reports about them.
>
> We need to be clear to our users about the different cases
>
>  1. What is explicitly not going to work (ie old versions that are dropped)
>  2. What is supported by the code (ie anything newer than min version)
>  3. What is formally tested (ie the version(s) covered by CI)

Yes, agreed we should be clear about that distinction.
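A purely illustrative way to make that three-way distinction mechanical, so release notes or docs could state it consistently (nothing here is real Nova code; the names and structure are hypothetical):

```python
# Illustrative sketch only: encode the three cases as tiers.
# libvirt versions as (major, minor, micro) tuples compare naturally.
MIN_VERSION = (0, 10, 2)        # below this: explicitly not going to work
CI_VERSIONS = {(1, 2, 2)}       # formally tested by upstream CI (gate)

def classify(version):
    """Place a libvirt version into one of the three cases above."""
    if version < MIN_VERSION:
        return "dropped"        # case 1: old versions that are dropped
    if version in CI_VERSIONS:
        return "tested"         # case 3: covered by CI
    return "supported"          # case 2: newer than min, but untested upstream
```

So, for example, RHEL 7.0's libvirt 1.1.1 would classify as "supported" but not "tested", which is exactly the distinction worth surfacing to users.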

> The logs are appropriate when we need to alert people about stuff moving
> from item 2, to item 1 in a future release. I don't think we should be
> warning about everything in item 2 for the reason mentioned above. That
> is a task for release notes and/or documentation.

That's fair, mostly due to the vendor situation you describe.

We are just historically bad at doing that properly. We need to find a
good way of doing that.
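For the record, the kind of start-up check being discussed might look roughly like this (a sketch only, not Nova's actual driver code; the constant names and warning text are assumptions). It only warns when the running version will fall below the *next* planned minimum, i.e. when support is about to move from item 2 to item 1:

```python
# Sketch: refuse to run below the hard minimum, warn only on versions
# that are deprecated for removal in a future release.
MIN_LIBVIRT_VERSION = (0, 10, 2)       # hard floor for this release
NEXT_MIN_LIBVIRT_VERSION = (1, 2, 2)   # planned floor for a future release

def version_to_int(ver):
    """Pack (major, minor, micro) the way libvirt's getLibVersion() does."""
    major, minor, micro = ver
    return major * 1000000 + minor * 1000 + micro

def check_version(running_int, warn):
    """running_int: integer from the hypervisor, e.g. 1002002 for 1.2.2."""
    if running_int < version_to_int(MIN_LIBVIRT_VERSION):
        raise RuntimeError("libvirt is older than the minimum supported version")
    if running_int < version_to_int(NEXT_MIN_LIBVIRT_VERSION):
        warn("running libvirt version is deprecated and will be "
             "unsupported in a future release")
```

On this scheme libvirt 1.1.1 (RHEL 7.0) would start fine but log one deprecation warning, while anything below 0.10.2 would refuse to start.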

>> It's tempting to say it's scheduled for removal in N, so we have time to
>> work out if that's possible.
>
> I think that at the start of each dev cycle, we look at the distros we
> wish to target and identify if any can be dropped or have been EOLd.
> Then use that to decide what to deprecate in that cycle, for deletion
> in the next cycle. Trying to second-guess too many cycles in advance
> is probably counter-productive.

Well, as you mentioned, it's not just about being EOLed, right?
It's more about adoption curves vs maintenance costs?

My idea was to start the conversation about when most folks are likely
to move. They have already said no to deprecating in Liberty and removing
in M. I am curious if they still see themselves on RHEL 6.x in N?
Maybe the answer is still yes?

Thanks,
John



More information about the OpenStack-operators mailing list