[Openstack-operators] attaching network cards to VMs taking a very long time

Saverio Proto zioproto at gmail.com
Tue May 22 13:30:24 UTC 2018


Sorry, the previous email went out incomplete.
Read this:
https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/

Make sure that OpenStack rootwrap is configured to work in daemon mode.
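
For reference, here is a minimal sketch of how one might check this on the
Neutron side. It assumes the agent options `root_helper` and
`root_helper_daemon` in the `[agent]` section. The sample config content, the
paths in it, and the helper name `rootwrap_daemon_enabled` are illustrative
assumptions, not taken from the article, so adjust them to your distribution.

```python
import configparser

# Assumed example content of a Neutron agent config file, for instance
# openvswitch_agent.ini. The real location and paths vary by distribution.
SAMPLE_AGENT_INI = """
[agent]
root_helper = sudo neutron-rootwrap /etc/neutron/rootwrap.conf
root_helper_daemon = sudo neutron-rootwrap-daemon /etc/neutron/rootwrap.conf
"""


def rootwrap_daemon_enabled(ini_text):
    """Return True if [agent] root_helper_daemon has a non-empty value."""
    # strict=False tolerates repeated options; interpolation=None keeps
    # literal '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(ini_text)
    value = parser.get("agent", "root_helper_daemon", fallback="")
    return bool(value.strip())


if __name__ == "__main__":
    print("rootwrap daemon mode:", rootwrap_daemon_enabled(SAMPLE_AGENT_INI))
```

To check a real node, read the agent config file into a string and pass it to
the function. If only `root_helper` is set, every privileged command spawns a
new rootwrap process, which can become slow when many ports are wired at once.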

Thank you

Saverio


2018-05-22 15:29 GMT+02:00 Saverio Proto <zioproto at gmail.com>:
> Hello Radu,
>
> Do you have OpenStack rootwrap configured to work in daemon mode?
>
> Please read this article:
>
> 2018-05-18 10:21 GMT+02:00 Radu Popescu | eMAG, Technology
> <radu.popescu at emag.ro>:
>> Hi,
>>
>> so, nova says the VM is ACTIVE and it actually boots with no network. We are
>> setting some metadata that we use later on and have cloud-init for different
>> tasks.
>> So, the VM is up and the OS is running, but the network only starts working
>> after a random amount of time, which can reach around 45 minutes. The thing
>> is, it is not happening to all VMs in that test (around 300), but it is
>> happening to a fair amount, around 25%.
>>
>> I can see the callback coming a few seconds after the neutron openvswitch
>> agent says it has completed the setup. My question is, why is it taking so
>> long for the nova openvswitch agent to configure the port? I can see the port
>> up in both the host OS and openvswitch. I would assume it's doing the whole
>> namespace and iptables setup. But still, 30 minutes? Seems a lot!
>>
>> Thanks,
>> Radu
>>
>> On Thu, 2018-05-17 at 11:50 -0400, George Mihaiescu wrote:
>>
>> We have other scheduled tests that perform end-to-end checks (assign floating
>> IP, ssh, ping outside) and never had an issue.
>> I think we turned it off because the callback code was initially buggy and
>> nova would wait forever while things were in fact ok, but I'll set
>> "vif_plugging_is_fatal = True" and "vif_plugging_timeout = 300" and run
>> another large test, just to confirm.
>>
>> We usually run these large tests after a version upgrade to test the APIs
>> under load.
>>
>>
>>
>> On Thu, May 17, 2018 at 11:42 AM, Matt Riedemann <mriedemos at gmail.com>
>> wrote:
>>
>> On 5/17/2018 9:46 AM, George Mihaiescu wrote:
>>
>> and large rally tests of 500 instances complete with no issues.
>>
>>
>> Sure, except you can't ssh into the guests.
>>
>> The whole reason the vif plugging is_fatal, timeout, and callback code exists
>> is that the upstream CI was unstable without it. The server would report as
>> ACTIVE but the ports weren't wired up, so ssh would fail. Having an ACTIVE
>> guest that you can't actually do anything with is kind of pointless.
>>
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>