[Openstack] [Orchestration] Handling error events ... explicit vs. implicit
mark.washenberger at rackspace.com
Wed Dec 7 16:36:03 UTC 2011
So the way this might work is, for example, when a run_instance fails on a compute node, the node would publish a "run_instance for uuid=<blah> failed" event. There would be a subscriber associated with the scheduler listening for such events; when it receives one, it would check the capacity table and update it to reflect the failure. Does that sound about right?
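
To make the idea concrete, here is a rough sketch of that flow. It is not nova code: EventBus, CapacityTable, and the event name "compute.run_instance.failed" are all made up for illustration, and an in-process bus stands in for the real message queue. It assumes the scheduler records which host each uuid was claimed against, so the failure handler knows what to give back.

```python
from collections import defaultdict


class EventBus:
    """Tiny in-process stand-in for the message queue (hypothetical)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Deliver the event to every handler registered for this type.
        for handler in self._subscribers[event_type]:
            handler(payload)


class CapacityTable:
    """Hypothetical capacity table: free slots per host plus open claims."""

    def __init__(self, free_slots):
        self.free_slots = dict(free_slots)
        self.claims = {}  # instance uuid -> host the claim was made on

    def claim(self, uuid, host):
        # The scheduler claims immediately when it selects the host.
        if self.free_slots[host] <= 0:
            raise RuntimeError("no capacity left on %s" % host)
        self.free_slots[host] -= 1
        self.claims[uuid] = host

    def release(self, uuid):
        # Give the slot back; a uuid with no open claim is a no-op,
        # so a duplicate failure event cannot over-credit a host.
        host = self.claims.pop(uuid, None)
        if host is not None:
            self.free_slots[host] += 1
        return host


def make_failure_handler(table):
    """Build the scheduler-side subscriber for failure events."""

    def on_run_instance_failed(payload):
        table.release(payload["uuid"])

    return on_run_instance_failed


bus = EventBus()
table = CapacityTable({"host-a": 2})
bus.subscribe("compute.run_instance.failed", make_failure_handler(table))

# Scheduler picks host-a and claims a slot up front.
table.claim("uuid-1", "host-a")

# Compute node reports the failure; the subscriber reclaims the slot.
bus.publish("compute.run_instance.failed", {"uuid": "uuid-1"})
```

Making release() a no-op for unknown uuids is a deliberate choice in this sketch: if the same failure event is delivered twice, the capacity count stays correct.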
"Sandy Walsh" <sandy.walsh at RACKSPACE.COM> said:
> Sure, the problem I'm immediately facing is reclaiming resources from the Capacity
> table when something fails. (we claim them immediately in the scheduler when the
> host is selected to lessen the latency).
> The other situation is Orchestration needs it for retries, rescheduling, rollbacks
> and cross-service timeouts.
> I think it's needed core functionality. I like Fail-Fast for the same reasons, but
> it can get in the way.
> From: openstack-bounces+sandy.walsh=rackspace.com at lists.launchpad.net
> [openstack-bounces+sandy.walsh=rackspace.com at lists.launchpad.net] on behalf of
> Mark Washenberger [mark.washenberger at rackspace.com]
> Sent: Wednesday, December 07, 2011 11:53 AM
> To: openstack at lists.launchpad.net
> Subject: Re: [Openstack] [Orchestration] Handling error events ... explicit
> vs. implicit
> Can you talk a little more about how you want to apply this failure notification?
> That is, what is the case where you are going to use the information that an
> operation failed? In my head I have an idea of getting code simplicity dividends
> from an "everything succeeds" approach to some of our operations. But it might not
> really apply to the case you're working on.