[Openstack] RPC Semantics

Johannes Erdfelt johannes at erdfelt.com
Tue Jun 12 20:52:58 UTC 2012


On Tue, Jun 12, 2012, Eric Windisch <eric at cloudscaling.com> wrote:
> > For instance, an instance migration can take a while since we need to
> > copy many gigabytes of disks to another host. If we want to do a
> > software upgrade, we either need to wait a long time for the migration
> > to finish, or we need to restart the service and then restart the
> > processing of the message.
> 
> You wait a long time, period. If you wait a long time and it fails,
> you're restarting. Having it do so automatically on the consumer-side
> isn't necessarily a good thing. 

I'm not sure I understand what you're saying.

If a migration takes a few hours, are you willing to wait that long to
restart the software?

If so, are you willing to wait indefinitely if a steady stream of
migrations take place?

Or are you willing to delay unrelated actions while waiting for the one
long migration to finish (because you don't want to wait forever if
there's a steady stream of actions)?

> > If all software gets restarted, then persistence is important.
>
> Again, I see an argument in having callers have limited persistence, but
> not consumers.

It's not clear to me why that is so.

> > > All calls have a timeout (TTL). The ZeroMQ driver also implements a TTL
> > > on the casts, and I'm quite sure we should support this in Kombu/Qpid
> > > as well to avoid a thundering-herd.
> > 
> > What thundering herd problems exist in Openstack?
> 
> Say we have one api service, one scheduler.  If the scheduler fails, API
> requests to create an instance will pile up, until the scheduler returns.
> The returning scheduler will get all of those instance creation requests
> and will launch those instances. (This would also be applicable for
> messages between the scheduler and a compute service)

This isn't the thundering herd problem.

http://en.wikipedia.org/wiki/Thundering_herd_problem

> The end-user will see the run-instance command as potentially failing and
> may attempt to launch again. The queue will hold all of these requests
> and they will all get processed when the scheduler returns.

That assumes a downtime that is long enough for the user to get
impatient, right?

> This is especially problematic with auto-scaling. How well will
> Rightscale or Enstratus run against a system that takes hours and hours
> to launch instances?  They'll just retry and retry. You don't want
> these to just queue up.

I agree this is a problem, but it's a bit out of scope for my original
email. I think we use the RPC layer incorrectly for some services that
would be better served by something that isn't persisted.

> > I do know there are problems with queuing combined with timeouts. It
> > makes less sense to process a get_nw_info request if the requestor has
> > already timed out and will ignore the response. Is that what you're
> > referring to with TTLs?
> 
> That is important too, in the case of calls, but not all that important.
> I'm not so concerned about machines sending useless replies, we can
> ignore them.

The useless replies aren't the problem; the useless processing is. If a
service is overloaded, then we're just making it worse by making it do
processing that has no value.
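
To make that concrete, here's a rough sketch of what I have in mind (the
message format and names are made up, not the actual driver code): the
caller stamps each message with an expiry derived from its own timeout,
and the consumer drops anything that has already expired instead of
doing work nobody will read.

import time

def make_message(method, args, timeout):
    # The caller stamps the message with its own RPC timeout.
    return {'method': method,
            'args': args,
            'expires_at': time.time() + timeout}

def dispatch(message, handlers):
    # Skip expired messages instead of burning cycles on a reply the
    # caller will never look at.
    if time.time() > message['expires_at']:
        return None
    return handlers[message['method']](**message['args'])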

> > Idempotent actions want persistence so it will actually complete the
> > action requested in the message. For instance, if nova-compute is
> > stopped in the middle of an instance-create, we want to actually finish
> > the create after the process is restarted.
> 
> Only if it hasn't timed out. Otherwise, you'd only be asking for a
> thundering herd.
> 
> What has happened on the caller side? Has it timed out and given the user
> an error?  What about manager methods (rpc methods) that call RPC; how
> deep does that stack go?

I'd argue that you will never be able to decide what the "correct"
timeout is for operations like creating a server. Only the user can make
that determination. However, I don't want to get too off-topic.

> Perhaps it is better that if nova-compute is stopped in the middle of
> an instance-create that it can *cleanup* on a restore, rather than
> attempting to continue an arguably pointless and potentially dangerous
> path of actually creating that instance?

I fail to see what is pointless and potentially dangerous about creating
the instance.

Consider the scenario of continuous deployment of software. You may
deploy software multiple times a day. A service is down for only as long
as it takes to update the software, a few seconds.

If the create action is idempotent, then it's safe with respect to
restarts. But it does need the previous messages to be redelivered so it
can pick them up again.
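
As a sketch of what I mean (the in-memory store here is a toy stand-in,
not actual nova code): a redelivered create message should either resume
the work or become a no-op, keyed on the same instance UUID.

# Toy in-memory "database" standing in for the real instance store.
instances = {}

def create_instance(instance_uuid, image_ref):
    instance = instances.get(instance_uuid)
    if instance is not None and instance['vm_state'] == 'active':
        # The create finished before the restart; redelivery is a no-op.
        return instance
    # Otherwise run (or re-run) the remaining steps keyed on the same
    # UUID, so a redelivered message resumes instead of duplicating.
    instance = {'uuid': instance_uuid, 'image_ref': image_ref,
                'vm_state': 'building'}
    instances[instance_uuid] = instance
    # ... allocate network, fetch image, boot ...
    instance['vm_state'] = 'active'
    return instance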

> > There is no process waiting for a return value, but we certainly would
> > like for the message to be persisted so we can restart it.
> 
> I'm not sure about that.

Why not?

> > > Anyway, in the ZeroMQ driver, we could have a local queue to track
> > > casts and remove them when the send() coroutine completes. This would
> > > provide restart protection for casts. 
> > 
> > Assuming the requesting process remains running the entire time?
> 
> I meant ONLY persisting in the requesting process. If the requesting
> process fails before that message is consumed, the requesting
> process can attempt to resubmit that message for consumption upon
> relaunch.  The requesting process would track the amount of time
> waiting for the message to be consumed and would subtract that time
> from the remaining timeout. 
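
If I'm reading that right, the caller side would look roughly like this
(purely a sketch; the journal file and function names are invented, not
the actual driver):

import json
import os
import time
import uuid

JOURNAL = 'outstanding_casts.json'  # hypothetical local journal

def _load():
    if not os.path.exists(JOURNAL):
        return {}
    with open(JOURNAL) as f:
        return json.load(f)

def _save(entries):
    with open(JOURNAL, 'w') as f:
        json.dump(entries, f)

def cast(send, topic, message, timeout):
    # Journal the cast before sending so a crashed caller can find it
    # again when it relaunches.
    entries = _load()
    key = str(uuid.uuid4())
    entries[key] = {'topic': topic, 'message': message,
                    'sent_at': time.time(), 'timeout': timeout}
    _save(entries)
    send(topic, message)
    return key

def ack(key):
    # Called once the consumer has picked the message up.
    entries = _load()
    entries.pop(key, None)
    _save(entries)

def resubmit_on_relaunch(send):
    # Resend anything still journaled, subtracting the time already
    # spent waiting from the original timeout; drop expired entries.
    for key, entry in list(_load().items()):
        remaining = entry['timeout'] - (time.time() - entry['sent_at'])
        ack(key)
        if remaining > 0:
            cast(send, entry['topic'], entry['message'], remaining)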

When will the sending service know that it should resend a message?
Wouldn't this be best done on a pull basis by the receiving service?

If the requesting service never restarts (because of hardware failure),
who resends the message again?

In some cases, I can see the argument that the receiving side shouldn't
persist (because of the hardware failure scenario), but the same
argument applies to persisting on the sending side too.

To me, this makes an argument for persisting in another service, but
the downside is added complexity and yet another service.

JE




