On 5/23/19 3:11 AM, Matthew Booth wrote:
On Thu, 23 May 2019 at 03:02, melanie witt <melwittt@gmail.com> wrote:
Hey all,
I'm looking for feedback around whether we can improve how we show server status in server list and server show when the compute host it resides on is down.
When a compute host goes down while a server on it was previously running, the server status continues to show as ACTIVE in a server list. This is because the power state and status are adjusted by a periodic task run by nova-compute, so if nova-compute is down, it cannot update those states.
So, for an end user, when they do a server list, they see their server as ACTIVE when it's actually powered off.
We have another field called 'host_status', available since API microversion 2.16 [1], which is controlled by policy and defaults to admin. It is capable of showing the server status as UNKNOWN if the field is specified, for example:
nova list --fields id,name,status,task_state,power_state,networks,host_status
This is cool, but it is only available to admin by default, and it requires that the end user adds the field to their CLI command in the --fields option.
Question: do people think we should make the server status field reflect UNKNOWN as well, if the 'host_status' is UNKNOWN? And if so, should it be controlled by policy or no?
Normally, we do not expose compute host details to non-admin in the API by default, but I noticed recently that our "down cells" support will show server status as UNKNOWN if a server is in a down cell [2]. So I wondered if it would be considered OK to show UNKNOWN if a host is down as well, without defaulting it to admin-only.
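To make the question above concrete, here is a minimal sketch of the proposed behaviour in Python. It is not nova code: the function name `displayed_status` and the "UP" value used in the usage note are assumptions for illustration only. The one rule it encodes comes from the proposal itself: if 'host_status' is UNKNOWN, report the server status as UNKNOWN instead of the last recorded value.

```python
from typing import Optional

UNKNOWN = "UNKNOWN"


def displayed_status(status: str, host_status: Optional[str]) -> str:
    """Return the server status to show in server list / server show.

    Hypothetical helper, not part of nova. `status` is the last status
    recorded for the server; `host_status` is the status of the compute
    host it resides on, or None if it is not available.
    """
    # If the host cannot be reached, the recorded status may be stale,
    # so report UNKNOWN rather than the last known value.
    if host_status == UNKNOWN:
        return UNKNOWN
    # Otherwise the recorded status is shown unchanged.
    return status
```

With this sketch, `displayed_status("ACTIVE", "UNKNOWN")` gives "UNKNOWN", while `displayed_status("ACTIVE", "UP")` still gives "ACTIVE". Whether the override applies to every user or only where policy allows it is exactly the open question, so the sketch leaves policy out.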
+1 from me. This seems to have confused users in the past and honest is better than potentially wrong, imho. I can't think of a reason why this information 'leak' would cause any problems. Can anybody else?
Agreed. I don't think that a server status of "UNKNOWN" really constitutes "exposing compute host details". It's not sharing anything about *why* the server status is unknown - it's just not pretending that the last known status is still valid, when that may or may not actually be true. Or is the proposal to expose host_status where it would not normally be visible? It seems that the down-host scenario is basically the same as down-cell, as far as being able to ascertain server status, so it seems to make sense to use the same indicator.
~iain