[openstack-dev] [nova] should we have a stale data indication in "nova list/show"?

Joe Gordon joe.gordon0 at gmail.com
Wed Jun 25 00:12:18 UTC 2014


On Tue, Jun 24, 2014 at 4:16 PM, Ahmed RAHAL <arahal at iweb.com> wrote:

> On 2014-06-24 17:38, Joe Gordon wrote:
>
>>
>> On Jun 24, 2014 2:31 PM, "Russell Bryant" <rbryant at redhat.com> wrote:
>>
>>  > There be dragons here.  Just because Nova doesn't see the node reporting
>>  > in, doesn't mean the VMs aren't actually still running.  I think this
>>  > needs to be left to logic outside of Nova.
>>  >
>>  > For example, if your deployment monitoring really does think the host is
>>  > down, you want to make sure it's *completely* dead before taking further
>>  > action such as evacuating the host.  You certainly don't want to risk
>>  > having the VM running on two different hosts.  This is just a business I
>>  > don't think Nova should be getting in to.
>>
>> I agree nova shouldn't take any actions. But I don't think leaving an
>> instance as 'active' is right either.  I was thinking of moving the
>> instance to an error state (maybe an unknown state would be more accurate)
>> and letting the user deal with it, versus just letting the user deal with
>> everything. Since nova knows something *may* be wrong, shouldn't we convey
>> that to the user? (I'm not 100% sure we should, myself.)
>>
>
> I have seen compute nodes go down from a management perspective (say,
> nova-compute disappeared) while the VMs were just fine. Reporting on the
> state may be misleading. The 'unknown' state would fit, but nothing lets us
> presume the VMs are non-functional or impacted.
>

Nothing lets us presume the opposite either. We don't know if the instance
is still up.


>
> As far as an operator is concerned, a compute node not responding is
> reason enough to check the situation.
>
> To go further on the other comments about customer feedback: there are
> many reasons a customer may think his VM is down, so showing him 'useful
> information' will in some cases only trigger more anxiety.
> Besides, people will start hammering the API to check 'state' instead of
> using proper monitoring.
> But state is already reported if the customer shuts down a VM, so ...
>
> Currently, compute node state reporting is done by the nova-compute
> process itself, reporting back with a time stamp to the database (through
> conductor, if I recall correctly). It's more like a watchdog than a
> reporting system.
> For VMs (assuming we find it useful) the same kind of process could occur:
> nova-compute reporting back all states with time stamps for all VMs it
> hosts. This should then be optional, as I already sense scaling/performance
> issues here (ceilometer anyone ?).
>
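
A minimal sketch of the watchdog idea described above, to make the
discussion concrete. The function names and the 60-second threshold are
assumptions for illustration only; this is not Nova's code. The point is
that only the value shown to the user changes, and no action is taken.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; a real deployment would make this configurable.
DOWN_TIME_SECONDS = 60


def is_stale(last_heartbeat, now, max_age_seconds=DOWN_TIME_SECONDS):
    """Return True if the last heartbeat is older than the allowed age."""
    return (now - last_heartbeat) > timedelta(seconds=max_age_seconds)


def display_status(recorded_status, host_last_heartbeat, now):
    """Report UNKNOWN instead of the recorded status when the host is stale.

    Nothing is written back and nothing is evacuated; the recorded status
    is simply not trusted while the host has stopped reporting in.
    """
    if is_stale(host_last_heartbeat, now):
        return "UNKNOWN"
    return recorded_status
```
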
> Finally, assuming the customer had access to this 'unknown' state
> information, what would he be able to do with it ? Usually he has no lever
> to 'evacuate' or 'recover' the VM. All he could do is spawn another
> instance to replace the lost one, but only if the VM really is currently
> unavailable, which is information he must get from other sources.
>

If I were a user and my instance went to an 'UNKNOWN' state, I would check
whether it's still operating, and if not, delete it and start another instance.
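
Roughly, the user-side decision would look like the sketch below. The
function name, the returned labels, and the `is_reachable` callable are all
hypothetical; the reachability check is something the user supplies (for
example an application-level health check), not something nova provides.

```python
def handle_instance(status, is_reachable):
    """Decide what a user might do, given the API status and their own probe."""
    if status != "UNKNOWN":
        return "nothing"   # status is trusted; no special handling
    if is_reachable():
        return "keep"      # instance still works; wait for the host to report in
    return "replace"       # delete it and start another instance
```
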


>
> So, I see how the state reporting could be useful information, but I am
> not sure that the nova status is the right place for it.
>
> Ahmed.
>
>