Open Stack

Tue Oct 21 13:56:02 UTC 2014

We experienced the same with Open vSwitch versions newer than 2.0.0.
The version that works for us is openvswitch-2.0.0-1.fc20.x86_64 (the
initial fc20 package).

Kind Regards
Oliver

On 21.10.2014 at 07:56, Daniele Venzano wrote:
> The version we are using is:
> 1.10.2-0ubuntu2~cloud0
> 
> The version that was not working for us is:
> 2.0.1+git20140120-0ubuntu2~cloud1
> 
> Network:
> Intel Corporation I350 Gigabit Network Connection (igb module)
> 
> We were seeing the problem, strangely enough, at the application level,
> inside the VMs, where Hadoop was reporting corrupted data on TCP
> connections. No other messages on the hypervisor or in the VM kernel.
> Hadoop makes lots of connections to lots of different VMs moving lots
> (terabytes) of data as fast as possible. Also, it was
> non-deterministic: Hadoop would try several times to transfer the data,
> sometimes successfully, sometimes giving up. I tried some quick iperf
> tests, but they worked fine.
> 
> Daniele
> 
> On 10/20/14 18:46, Manish Godara wrote:
>> > We had to do the same downgrade with openvswitch: the newest
>> > version, under heavy load, corrupts packets in transit, but we do not
>> > have the time to investigate the issue further.
>>
>> Daniele, what was the openvswitch version before and after the
>> upgrade?  And which ethernet drivers do you have?  The corruption
>> may be related to the drivers you have (the issues may be triggered by
>> the way openvswitch flows are configured in Icehouse vs Havana).
>>
>> Thanks.
>>
>> From: Daniele Venzano <daniele.venzano at eurecom.fr
>> <mailto:daniele.venzano at eurecom.fr>>
>> Organization: Eurecom
>> Date: Sunday, October 19, 2014 11:46 PM
>> To: "openstack-operators at lists.openstack.org
>> <mailto:openstack-operators at lists.openstack.org>"
>> <openstack-operators at lists.openstack.org
>> <mailto:openstack-operators at lists.openstack.org>>
>> Subject: Re: [Openstack-operators] qemu 1.x to 2.0
>>
>> We have the same setup (Icehouse on Ubuntu 12.04) and had similar
>> issues. We downgraded qemu from 2.x to 1.x, as we cannot terminate all
>> VMs for all users. We had non-resumable VMs also in the middle of the
>> 1.x series and nothing was documented in the changelog.
>> We had to do the same downgrade with openvswitch: the newest version,
>> under heavy load, corrupts packets in transit, but we do not have the
>> time to investigate the issue further.
>>
>> We plan to warn our users in time for the next major upgrade to Juno
>> that all VMs need to be terminated, probably during the Christmas
>> holidays. I do not think they will be happy.
>> Seeing also all the problems we had upgrading Neutron from OVS to ML2,
>> terminating all VMs is probably the best policy anyway during an
>> OpenStack upgrade. Or you do lots of migrations and upgrade qemu one
>> compute host at a time, but if something goes wrong you end up with
>> an angry user and a stuck VM.
>>
>> It certainly is a big deal.
>>
>> On 10/20/14 00:59, Joe Topjian wrote:
>>> Hello,
>>>
>>> We recently upgraded an OpenStack Grizzly environment to Icehouse
>>> (doing a quick stop-over at Havana). This environment is still
>>> running Ubuntu 12.04.
>>>
>>> The Ubuntu 14.04 release notes
>>> <https://wiki.ubuntu.com/TrustyTahr/ReleaseNotes#Ubuntu_Server> make
>>> mention of incompatibilities with 12.04 and moving to 14.04 and qemu
>>> 2.0. I didn't think that this would apply for upgrades staying on
>>> 12.04, but it indeed does.
>>>
>>> We found that existing instances could not be live migrated (as per
>>> the release notes). Additionally, instances that were hard-rebooted
>>> and had the libvirt xml file rebuilt could no longer start, either.
>>>
>>> The exact error message we saw was:
>>>
>>> "Length mismatch: vga.vram: 1000000 in != 800000"
>>>
>>> I found a few bugs that are related to this, but I don't think
>>> they're fully relevant to the issue I ran into:
>>>
>>> https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1308756
>>> https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1291321
>>> https://bugs.launchpad.net/nova/+bug/1312133
>>>
>>> We ended up downgrading to the stock Ubuntu 12.04 qemu 1.0 packages
>>> and everything is working nicely.
>>>
>>> I'm wondering if anyone else has run into this issue and how they
>>> dealt with it or plan to deal with it.
>>>
>>> Also, I'm curious as to why exactly qemu 1.x and 2.0 are incompatible
>>> with each other. Is this just an Ubuntu issue? Or is this native to qemu?
>>>
>>> Unless I'm missing something, this seems like a big deal. If we
>>> continue to use Ubuntu's OpenStack packages, we're basically stuck at
>>> 12.04 and Icehouse unless we have all users snapshot their instance
>>> and re-launch in a new cloud.
>>>
>>> Thanks,
>>> Joe
>>>
>>>
>>>
>>> _______________________________________________
>>> OpenStack-operators mailing list
>>> OpenStack-operators at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>
>>
> 
> 
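A note on the error quoted in Joe's message ("Length mismatch: vga.vram: 1000000 in != 800000"): the two numbers appear to be hexadecimal byte counts, so the message would be comparing a 16 MiB video RAM block with an 8 MiB one. The sketch below decodes such a line under that assumption. The function name and the reading of "in" as the size recorded in the incoming state are illustrative guesses, not something confirmed in the thread.

```python
import re

# Pattern for the error text quoted in the thread.
# Assumption: both sizes are hexadecimal byte counts printed without a 0x prefix.
LENGTH_MISMATCH = re.compile(
    r"Length mismatch: (?P<block>\S+): "
    r"(?P<incoming>[0-9a-fA-F]+) in != (?P<local>[0-9a-fA-F]+)"
)


def decode_length_mismatch(message):
    """Return (block name, incoming size in MiB, local size in MiB), or None."""
    match = LENGTH_MISMATCH.search(message)
    if match is None:
        return None

    def to_mib(hex_bytes):
        # Interpret the hex string as a byte count and convert to MiB.
        return int(hex_bytes, 16) / (1024 * 1024)

    return match["block"], to_mib(match["incoming"]), to_mib(match["local"])


print(decode_length_mismatch('"Length mismatch: vga.vram: 1000000 in != 800000"'))
# ('vga.vram', 16.0, 8.0)
```

If that reading is right, the saved or migrated instance state carries a 16 MiB VGA memory block while the qemu that tries to load it allocates only 8 MiB, which would fit the incompatibility described above.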
