[openstack-dev] [nova][neutron]Fail to communicate new host when the first host for a new instance fails
Neil Jerram
Neil.Jerram at metaswitch.com
Thu May 14 12:16:43 UTC 2015
Hi Rossella,
Many thanks for your quick reply!
On 14/05/15 11:08, Rossella Sblendido wrote:
> Hi Neil,
>
> what's the status of the port after the migration? You might be hitting
> [1]. See also the patch that fixes the issue [2].
Thanks, but that is definitely not the cause of the problem in my case,
because my agent does not call get_device_details.
(BTW - it seems obviously wrong to me for an API named
get_device_details to change the port status to BUILD, even if the call
is coming from the correct host. I would expect that an agent could
safely call get_device_details at any time without having any effect on
the port state.)
> If you wait a bit longer, is the host_id updated by Nova?
No, it isn't.
I've now been able to reproduce this and look directly at the Neutron
DB, and I think what I see indicates that this is an OpenStack bug (as
opposed to a problem in my mechanism driver).
My hosts are named calico-vm13 and calico-vm15, and calico-vm13 is set
up so that libvirt will fail to launch any instances. When I use the
Horizon UI to create an instance, Nova tries calico-vm13 first - which
fails - and then calico-vm15, which succeeds.
Horizon then shows that the instance is on calico-vm15:
admin | calico-vm15 | dltst | cirros-0.3.2-x86_64 | 10.28.29.214, 2001:db8:c41:2::1d9a | m1.tiny | Active | None | Running | 24 minutes
The port for that instance is the cc80291c one here:
mysql> select * from ports;
+------------+-------------+------+-------------+-------------------+----------------+--------+-------------+--------------+
| tenant_id  | id          | name | network_id  | mac_address       | admin_state_up | status | device_id   | device_owner |
+------------+-------------+------+-------------+-------------------+----------------+--------+-------------+--------------+
| b2d9f70... | 79fd9d6c... |      | 1fca4aa4... | fa:16:3e:d3:1a:62 |              1 | DOWN   | dhcpea9f... | network:dhcp |
| b2d9f70... | cc80291c... |      | 1fca4aa4... | fa:16:3e:bc:df:f0 |              1 | ACTIVE | e2b61f5f... | compute:None |
| b2d9f70... | d9f7d1d0... |      | 1fca4aa4... | fa:16:3e:0b:29:3e |              1 | DOWN   | dhcp2ffe... | network:dhcp |
+------------+-------------+------+-------------+-------------------+----------------+--------+-------------+--------------+
And the ml2_port_bindings table shows that Neutron/ML2 thinks that port
is still on calico-vm13:
mysql> select * from ml2_port_bindings;
+-------------+-------------+----------+--------+-------------+-----------+-----------------------+---------+
| port_id     | host        | vif_type | driver | segment     | vnic_type | vif_details           | profile |
+-------------+-------------+----------+--------+-------------+-----------+-----------------------+---------+
| 79fd9d6c... | calico-vm13 | tap      | calico | fdc5ef44... | normal    | {"port_filter": true} |         |
| cc80291c... | calico-vm13 | tap      | calico | fdc5ef44... | normal    | {"port_filter": true} |         |
| d9f7d1d0... | calico-vm15 | tap      | calico | fdc5ef44... | normal    | {"port_filter": true} |         |
+-------------+-------------+----------+--------+-------------+-----------+-----------------------+---------+
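The mismatch between the two queries above can also be checked
mechanically. This is only a minimal sketch: it assumes the rows from both
tables, plus the host that Nova reports for each instance, have already
been fetched into plain Python dicts. The function name
find_stale_bindings and the instance_hosts mapping are hypothetical and
are not part of Nova or Neutron.

```python
def find_stale_bindings(ports, bindings, instance_hosts):
    """Return (port_id, bound_host, actual_host) tuples for compute ports
    whose ML2 binding host differs from the host Nova reports.

    ports          -- rows from the ports table (id, device_id, device_owner)
    bindings       -- rows from ml2_port_bindings (port_id, host)
    instance_hosts -- hypothetical mapping: instance id -> host shown by Nova
    """
    bound_host = {b["port_id"]: b["host"] for b in bindings}
    stale = []
    for port in ports:
        # Only compute ports belong to instances; skip DHCP and other ports.
        if not port["device_owner"].startswith("compute:"):
            continue
        actual = instance_hosts.get(port["device_id"])
        bound = bound_host.get(port["id"])
        if actual is not None and bound != actual:
            stale.append((port["id"], bound, actual))
    return stale


# Values are abbreviated exactly as in the query output above.
ports = [
    {"id": "79fd9d6c...", "device_id": "dhcpea9f...", "device_owner": "network:dhcp"},
    {"id": "cc80291c...", "device_id": "e2b61f5f...", "device_owner": "compute:None"},
    {"id": "d9f7d1d0...", "device_id": "dhcp2ffe...", "device_owner": "network:dhcp"},
]
bindings = [
    {"port_id": "79fd9d6c...", "host": "calico-vm13"},
    {"port_id": "cc80291c...", "host": "calico-vm13"},
    {"port_id": "d9f7d1d0...", "host": "calico-vm15"},
]
# The host Horizon shows for the instance that owns port cc80291c.
instance_hosts = {"e2b61f5f...": "calico-vm15"}

stale = find_stale_bindings(ports, bindings, instance_hosts)
print(stale)  # [('cc80291c...', 'calico-vm13', 'calico-vm15')]
```

With the data shown here, the only port reported is cc80291c, bound to
calico-vm13 while the instance runs on calico-vm15.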
Where should I start looking, to see where Nova / Neutron _should_ be
updating the port binding, in this scenario?
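For reference, the notification being looked for can be sketched as
follows. This is a rough illustration rather than the Calico driver's
actual code: it assumes the ML2 port context exposes the updated and
previous port dicts as `current` and `original`, and that the host is
carried under the binding:host_id key. FakePortContext,
binding_host_changed and ExampleDriver are hypothetical names that exist
only so the snippet runs without Neutron installed.

```python
HOST_KEY = "binding:host_id"


class FakePortContext:
    """Hypothetical stand-in for ML2's port context, for illustration only."""

    def __init__(self, current, original=None):
        self.current = current    # port dict after the update
        self.original = original  # port dict before the update, if any


def binding_host_changed(context):
    """Return (old_host, new_host) if the port moved hosts, else None."""
    if context.original is None:
        return None
    old = context.original.get(HOST_KEY)
    new = context.current.get(HOST_KEY)
    return (old, new) if old != new else None


class ExampleDriver:
    """Records host moves seen in update_port_postcommit."""

    def __init__(self):
        self.moves = []

    def update_port_postcommit(self, context):
        change = binding_host_changed(context)
        if change is not None:
            old, new = change
            self.moves.append((context.current["id"], old, new))


driver = ExampleDriver()
driver.update_port_postcommit(
    FakePortContext(
        current={"id": "cc80291c...", HOST_KEY: "calico-vm15"},
        original={"id": "cc80291c...", HOST_KEY: "calico-vm13"},
    )
)
print(driver.moves)  # [('cc80291c...', 'calico-vm13', 'calico-vm15')]
```

In the failing scenario no such call arrives with calico-vm15 as the new
host, which is consistent with the stale row in ml2_port_bindings.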
Many thanks,
Neil
> cheers,
>
> Rossella
>
> [1] https://bugs.launchpad.net/neutron/+bug/1439857
> [2] https://review.openstack.org/#/c/163178/
>
> On 05/14/2015 11:29 AM, Neil Jerram wrote:
>> Hi all, this is about a problem I'm seeing with my Neutron ML2 mechanism
>> driver [1]. I'm expecting to see an update_port_postcommit call to
>> signal that the binding:host_id for a port is changing, but I don't see
>> that.
>>
>> The scenario is launching a new instance in a cluster with two compute
>> hosts, where we've rigged things so that one of the compute hosts will
>> always be chosen first, but libvirt isn't correctly configured there and
>> hence the instance launching attempt will fail. Then Nova tries to use
>> the other compute host instead, and that mostly works - except that my
>> mechanism driver thinks that the new instance's port is still
>> bound to the first compute host.
>>
>> Is anyone aware of a known problem in this area (in Juno-level code), or
>> where I could look to start pinning this down in more detail?
>>
>> Many thanks,
>> Neil
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev