Open Stack

Thu Sep 1 13:52:42 UTC 2016

On Wed, Aug 31, 2016 at 3:30 PM, Ian Wells <ijw.ubuntu at cack.org.uk> wrote:
> On 31 August 2016 at 10:12, Clint Byrum <clint at fewbar.com> wrote:
>>
>> Excerpts from Duncan Thomas's message of 2016-08-31 12:42:23 +0300:
>> > On 31 August 2016 at 11:57, Bogdan Dobrelya <bdobrelia at mirantis.com>
>> > wrote:
>> >
>> > > I agree that the RPC design pattern, as it is implemented now, is a
>> > > major blocker for OpenStack in general. It requires a major redesign,
>> > > including handling of corner cases, on both sides, *especially* RPC
>> > > call clients. Or maybe it just has to be abandoned and replaced by a
>> > > more cloud-friendly pattern.
>> >
>> >
>> > Is there a writeup anywhere on what these issues are? I've heard this
>> > sentiment expressed multiple times now, but without a writeup of the
>> > issues and the design goals of the replacement, we're unlikely to make
>> > progress on a replacement - even if somebody takes the heroic approach
>> > and writes a full replacement themselves, the odds of getting community
>> > buy-in are very low.
>>
>> Right, this is exactly the sort of thing I'd like to gather a group of
>> design-minded folks around in an Architecture WG. Oslo is busy with the
>> implementations we have now, but I'm sure many oslo contributors would
>> like to come up for air and talk about the design issues, and come up
>> with a current design, and some revisions to it, or a whole new one,
>> that can be used to put these summit hallway rumors to rest.
>
>
> I'd say the issue is comparatively easy to describe.  In a call sequence:
>
> 1. A sends a message to B
> 2. B receives message
> 3. B acts upon message
> 4. B responds to message
> 5. A receives response
> 6. A acts upon response
>
> ... you can have a fault at any point in that message flow (consider crashes
> or program restarts).  If you ask for something to happen, you wait for a
> reply, and you don't get one, what does it mean?  The operation may have
> happened, with or without success, or it may not have gotten to the far end.
> If you send the message, does that mean you'd like it to cause an action
> tomorrow?  A year from now?  Or perhaps you'd like it to just not happen?
> Do you understand what Oslo promises you here, and do you think every person
> who ever wrote an RPC call in the whole OpenStack solution also understood
> it?
>

Precisely - IMHO it's a shortcoming of the current o.m. RPC (and
Notification) API that it does not let the API user explicitly set
the desired delivery guarantee when publishing.  Right now the
implied delivery guarantee is "At Most Once", but even that is not
defined in any precise, meaningful way.
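
To make the ambiguity concrete, here is a minimal sketch of the six-step call sequence quoted above. It is a toy model, not oslo.messaging code: `simulate_call`, `Outcome` and the `fault_at` parameter are hypothetical names, and a fault at step N is assumed to mean that step N never completes.

```python
from enum import Enum


class Outcome(Enum):
    REPLY = "reply"
    TIMEOUT = "timeout"


def simulate_call(fault_at=None):
    """Simulate the six-step call sequence with an optional fault.

    fault_at is the 1-based step that fails to complete (None = no fault).
    Returns (what A observes, whether B completed the action).
    """
    # B completes the action only if steps 1-3 all finished.
    action_done = fault_at is None or fault_at > 3
    # A sees a reply only if steps 1-5 all finished.
    reply_seen = fault_at is None or fault_at > 5
    observed = Outcome.REPLY if reply_seen else Outcome.TIMEOUT
    return observed, action_done


if __name__ == "__main__":
    for step in (None, 1, 2, 3, 4, 5):
        observed, done = simulate_call(step)
        print(f"fault_at={step}: A observes {observed.value}, action done={done}")
```

In this model a fault at any of steps 1 through 5 looks like the same timeout to A, yet B has acted only when the fault came at step 4 or 5. The timeout alone cannot tell the caller which case it is in.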

Any messaging API should be explicit regarding what delivery
guarantee(s) are possible.  In addition, an API should allow the user
to designate the importance of a message on a per-send basis, for
example:  Can this message be dropped?  Can this message be
duplicated?  At what point in time does the message become invalid?
(Expiry is already offered for RPC via the timeout, but not for
Notifications IIRC.)

And the failure modes should be well understood... things always fail.
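
As a rough illustration of what per-send options could look like, here is a sketch under stated assumptions. None of these names exist in oslo.messaging: `Guarantee`, `SendOptions` and `Envelope` are hypothetical, and expiry is modeled as a simple time-to-live in seconds.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Guarantee(Enum):
    AT_MOST_ONCE = "at-most-once"    # may be dropped, never duplicated
    AT_LEAST_ONCE = "at-least-once"  # never dropped, may be duplicated


@dataclass(frozen=True)
class SendOptions:
    """Hypothetical per-send options chosen by the API user."""
    guarantee: Guarantee = Guarantee.AT_MOST_ONCE
    ttl_seconds: Optional[float] = None  # None means the message never expires

    @property
    def may_drop(self) -> bool:
        return self.guarantee is Guarantee.AT_MOST_ONCE

    @property
    def may_duplicate(self) -> bool:
        return self.guarantee is Guarantee.AT_LEAST_ONCE


@dataclass(frozen=True)
class Envelope:
    """A message together with the options it was sent with."""
    payload: dict
    options: SendOptions
    sent_at: float  # timestamp in seconds, supplied by the sender

    def expired(self, now: float) -> bool:
        # A message becomes invalid once its time-to-live has elapsed.
        ttl = self.options.ttl_seconds
        return ttl is not None and now - self.sent_at >= ttl
```

A sender would then state its intent at the call site, e.g. `Envelope({"op": "reboot"}, SendOptions(Guarantee.AT_LEAST_ONCE, ttl_seconds=30), sent_at=100.0)`, and a consumer could discard the message once `expired(now)` is true instead of acting on it arbitrarily late.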

> I have opinions about other patterns we could use, but I don't want to push
> my solutions here, I want to see if this is really as much of a problem as
> it looks and if people concur with my summary above.  However, the right
> approach is most definitely to create a new and more fitting set of oslo
> interfaces for communication patterns, and then to encourage people to move
> to the new ones from the old.  (Whether RabbitMQ is involved is neither here
> nor there, as this is really a question of Oslo APIs, not their
> implementation.)
>

Hmmmmmm... maybe.   Message bus technology is varied, and so is its
behavior.  There are brokerless, point-to-point backends supported by
oslo.messaging [1],[2] which will exhibit different
capabilities/behaviors from the traditional broker-based
store-and-forward backend (e.g. message acking end-to-end vs to the
intermediary).

All the more reason to have explicit delivery guarantees and well
understood failure scenarios defined by the API.
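
One way an API could cope with backends that differ in capability is to have each backend declare the guarantees it can honor, and to fail fast on a request it cannot meet. The sketch below is only that: `Backend`, `UnsupportedGuarantee` and the backend names are hypothetical, and the capability sets are invented for illustration, not a statement about any real driver.

```python
from enum import Enum


class Guarantee(Enum):
    AT_MOST_ONCE = "at-most-once"
    AT_LEAST_ONCE = "at-least-once"


class UnsupportedGuarantee(Exception):
    """Raised when a backend cannot honor the requested guarantee."""


class Backend:
    def __init__(self, name, supported):
        self.name = name
        self.supported = frozenset(supported)

    def publish(self, message, guarantee):
        # Refuse up front rather than silently downgrading the guarantee.
        if guarantee not in self.supported:
            raise UnsupportedGuarantee(
                f"{self.name} cannot provide {guarantee.value}"
            )
        return f"{self.name}: sent with {guarantee.value}"


# Hypothetical capability sets, for illustration only.
backend_a = Backend("backend_a", {Guarantee.AT_MOST_ONCE})
backend_b = Backend(
    "backend_b", {Guarantee.AT_MOST_ONCE, Guarantee.AT_LEAST_ONCE}
)
```

With this shape, `backend_a.publish(msg, Guarantee.AT_LEAST_ONCE)` raises immediately, so the caller learns about the mismatch at send time rather than discovering it after a failure.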

[1] http://docs.openstack.org/developer/oslo.messaging/zmq_driver.html
[2] http://docs.openstack.org/developer/oslo.messaging/AMQP1.0.html

> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

-- 
Ken Giusti  (kgiusti at gmail.com)

[openstack-dev] [all][massively distributed][architecture]Coordination between actions/WGs
