---- On Sat, 26 Jan 2019 03:17:44 +0900 Sean Mooney <smooney@redhat.com> wrote ----
On 01/25/2019 12:04 PM, Terry Wilson wrote:
On Thu, Jan 24, 2019 at 7:34 PM Ghanshyam Mann <gmann@ghanshyammann.com> wrote:
As Sean also pointed out in the patch, we should go for the approach of "making sure all interfaces attached to the server are active and the server is sshable before the server can be used in a test" [1]. This is something we agreed on at the Denver PTG for afazekas's proposal [2].
If we look at this from the user's perspective, a user can have an active VM with an active port that can flip to down while that port is in use. This seems like a bug to me.
To me, this ignores real-world situations where a port status *can* change w/o user interaction.
How is this ignoring that scenario?
On Fri, 2019-01-25 at 12:26 -0500, Jay Pipes wrote:
the only case i know of for certain would be if the admin state is down, which should not prevent the vm from booting, but neutron should not allow network connectivity in this case.
Can it also happen while the connection is in use? I mean, when the VM is active and SSHable, a down admin state can then cause the port to go down.
It seems weird to ignore a status change if it is detected. In the case that we hit, it was a change to os-vif where it was recreating a port.
Which was a bug, right?
yes, kind of. we could have fixed it by merging the nova change i had or reverting the os-vif change. i reverted the os-vif change as the nova change was hitting a different bug in neutron. but only one entity, os-vif or the hypervisor, should have been creating the port on ovs, so it was a bug when both were.
But it could just as easily be some vendor-specific "that port just died" kind of thing.
In which case, the test waiting for SSH to be available would timeout because connectivity would be broken anyway, no?
if it did not recover yes it would.
Why not update the status of the port if you know it has changed?
Sorry, I don't see where anyone is suggesting not changing the status of the port if some non-bug real scenario changes the status of the port?
Also, the patch itself (outside the ironic case) just adds a window for the status to bounce.
Unless I'm mistaken, the patch is simply changing the condition that the tempest test uses to identify broken VM connectivity. It will use the SSH connectivity test instead of looking at the port status test.
The SSH test was determined to be a more stable test of VM network connectivity than relying on the Neutron port status indicator which can be a little flaky.
ssh is more reliable for hotplug as we need to wait for the guest os to process the hotplug event. waiting for the vm to be pingable or sshable is more reliable in that specific case. the port status being active simply means that the port is currently configured by neutron. that gives you no knowledge of whether the guest has processed the hotplug event.
+1, I agree on the hotplug event case, and yes, Tempest tests should treat the test VM as usable only after the sshable/pingable check succeeds. afazekas updated a few tests for that, and it is a reasonable thing to do.
in general i'm not sure if ssh connectivity would be more reliable, but if that is what the test requires to work, it's better to explicitly validate it than to use the port status as a proxy.
Or am I missing something?
it's a valid question. i think port status and vm connectivity are two different things.
if you are writing an api test then port status should be sufficient. if you need to connect to the vm in any way it becomes a scenario test, in which case waiting for sshable or pingable might be more suitable.
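To make that distinction concrete, here is a minimal sketch. It is not Tempest code: `wait_for`, `tcp_reachable`, `server_ready` and the `get_port_status` callable are hypothetical names, and a plain TCP connect to the SSH port is only a stand-in for a real SSH login check.

```python
import socket
import time


def wait_for(predicate, timeout=60.0, interval=1.0):
    """Poll predicate() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def tcp_reachable(host, port=22, connect_timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=connect_timeout):
            return True
    except OSError:
        return False


def server_ready(get_port_status, host, needs_guest_access,
                 ssh_port=22, timeout=60.0, interval=1.0):
    """Decide whether a server can be used by a test.

    get_port_status is a hypothetical callable returning the port status
    string as reported by the networking service.
    """
    # API-style check: the port is reported as ACTIVE.
    if not wait_for(lambda: get_port_status() == "ACTIVE", timeout, interval):
        return False
    if not needs_guest_access:
        return True
    # Scenario-style check: the guest itself answers on the SSH port,
    # which an ACTIVE port status alone does not guarantee.
    return wait_for(lambda: tcp_reachable(host, ssh_port), timeout, interval)
```

An API-style test would call `server_ready(..., needs_guest_access=False)` and stop at the port status, while a scenario-style test would pass `True` and also wait for the guest to respond.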
Yeah, scenario tests expect end-to-end connectivity, both internal and external to tenants. Tempest API tests rarely perform SSH verification. -gmann
not sure if i answered your question, however.
-jay