[nova][dev][ops] server status when compute host is down
melanie witt
melwittt at gmail.com
Tue Jun 18 22:40:13 UTC 2019
On 5/23/19 1:08 PM, melanie witt wrote:
> On Thu, 23 May 2019 11:56:34 -0700, Iain Macdonnell
> <iain.macdonnell at oracle.com> wrote:
>>
>>
>> On 5/23/19 11:32 AM, Matt Riedemann wrote:
>>> As I said elsewhere in this thread, if you're proposing to add a new
>>> policy rule to change the 'status' field based on host_status, why not
>>> just tell people to open up the policy rule we already have for the
>>> host_status field so non-admins can see it in their server details? This
>>> sounds like an education problem more than a technical problem to me.
>>
>> Because *that* implies revealing infrastructure status details to
>> end-users, which is probably not desirable in a lot of cases.
>
> This is a good point. If an operator were to enable 'host_status' via
> policy, end users would also get to see host_status UP and DOWN, which
> is typically not desired by cloud admins. There's currently no option
> for exposing only UNKNOWN, as a small but helpful bit of info for end
> users.
>
>> Isn't this as simple as not lying to the user about the *server* status
>> when it cannot be ascertained for any reason? In that case, the user
>> should be given (only) that information, but not any "dirty laundry"
>> about what caused it....
>>
>> Even if the admin doesn't care about revealing infrastructure status,
>> the end-user shouldn't have to know that server_status can't be trusted,
>> and that they have to check other fields to figure out if it's reliable
>> or not at any given time.
>
> And yes, I was thinking about it more simply, and the replies on this
> thread have led me to think that if we could show the cosmetic-only
> status of UNKNOWN for nova-compute communication interruptions, similar
> to what we do for down cells, we would not put a policy control on it
> (since UNKNOWN is not leaking infra details). And not make any changes
> to notifications etc, just a cosmetic-only UNKNOWN status implemented at
> the REST API layer if host_status is UNKNOWN. I was thinking maybe we'd
> leave server status alone if host_status is UP or DOWN since its status
> should be reflected in those cases as-is.
>
> Assuming we could move forward without a policy control on it, I think
> the only remaining concern would be the collision with the UNKNOWN status
> already used for down cells, where some server attributes are not
> available. Personally, this doesn't seem like a major problem to me
> since UNKNOWN implies an uncertain state, in general. But maybe I'm
> wrong. How important is the difference?
>
> Finally, it sounds like the consensus is that if we do decide to make
> this change, we would need a new microversion to account for server
> status being able to be UNKNOWN if host_status is UNKNOWN.

FYI, I've proposed a spec here: https://review.opendev.org/666181
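
For anyone skimming the thread, the behavior discussed above could be sketched roughly as follows. This is only an illustration of the idea, not code from nova or from the spec: the function name, the version tuple, and the HYPOTHETICAL_MIN_VERSION value are placeholders I made up, and the real microversion would be whatever gets assigned in review.

```python
# Illustrative sketch only; none of these names come from nova itself.
# HYPOTHETICAL_MIN_VERSION is a placeholder, not a real microversion.
HYPOTHETICAL_MIN_VERSION = (2, 99)


def displayed_server_status(server_status, host_status, requested_version):
    """Return the server status to show in the REST API response.

    The override is cosmetic: nothing is changed in the database or in
    notifications, only in the value returned to the caller.
    """
    # Older microversions keep the existing behavior untouched.
    if requested_version < HYPOTHETICAL_MIN_VERSION:
        return server_status

    # If the compute host cannot be reached, the stored server status
    # cannot be trusted, so report UNKNOWN instead.
    if host_status == "UNKNOWN":
        return "UNKNOWN"

    # For host_status UP or DOWN, the server status is left as-is.
    return server_status


if __name__ == "__main__":
    print(displayed_server_status("ACTIVE", "UNKNOWN", (2, 99)))
    print(displayed_server_status("ACTIVE", "UP", (2, 99)))
    print(displayed_server_status("ACTIVE", "UNKNOWN", (2, 1)))
```

Note that only the UNKNOWN case is surfaced, so the end user never sees whether the host is UP or DOWN, and no policy change on host_status is needed.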
-melanie