[openstack-dev] [all][massively distributed][architecture]Coordination between actions/WGs

Alec Hothan (ahothan) ahothan at cisco.com
Thu Sep 1 15:56:30 UTC 2016


This topic of oslo messaging issues has been going on for a long time. The main issue is not the transport itself (each transport has its own limitations) but the code using oslo messaging (e.g. pieces of almost every OpenStack service). It is relatively easy to write code using oslo messaging that works with devstack or a small-scale deployment. It is much harder to write code that works under the conditions of operations at scale: an appropriate test platform is frequently lacking, existing testing tools are limited and, to top it all, the "fuzzy" oslo messaging API definition makes the handling of abnormal conditions and load conditions very unpredictable and inconsistent across components.

You can't solve this by just "fixing" the oslo messaging layer or by swapping to another transport (you'll just open up another can of worms).

As suggested by Ian below, the only practical way to fix this is to define a new set of APIs that is much more strictly defined, have OpenStack code migrate to these new APIs and test adequately.
That is clearly very difficult to do while resources are moving away from "stable and mature" services, attracted by the latest buzzwords (such as containers).

On the original topic of this thread, having geographical distribution will certainly introduce a new set of issues at scale.


  Alec

 






On 9/1/16, 6:52 AM, "Ken Giusti" <kgiusti at gmail.com> wrote:

>On Wed, Aug 31, 2016 at 3:30 PM, Ian Wells <ijw.ubuntu at cack.org.uk> wrote:
>> On 31 August 2016 at 10:12, Clint Byrum <clint at fewbar.com> wrote:
>>>
>>> Excerpts from Duncan Thomas's message of 2016-08-31 12:42:23 +0300:
>>> > On 31 August 2016 at 11:57, Bogdan Dobrelya <bdobrelia at mirantis.com>
>>> > wrote:
>>> >
>>> > > I agree that the RPC design pattern, as it is implemented now, is a
>>> > > major blocker for OpenStack in general. It requires a major redesign,
>>> > > including handling of corner cases, on both sides, *especially* RPC
>>> > > call clients. Or maybe it just has to be abandoned and replaced by a
>>> > > more cloud-friendly pattern.
>>> >
>>> >
>>> > Is there a writeup anywhere on what these issues are? I've heard this
>>> > sentiment expressed multiple times now, but without a writeup of the
>>> > issues and the design goals of the replacement, we're unlikely to make
>>> > progress on a replacement - even if somebody takes the heroic approach
>>> > and writes a full replacement themselves, the odds of getting community
>>> > buy-in are very low.
>>>
>>> Right, this is exactly the sort of thing I'd like to gather a group of
>>> design-minded folks around in an Architecture WG. Oslo is busy with the
>>> implementations we have now, but I'm sure many oslo contributors would
>>> like to come up for air and talk about the design issues, and come up
>>> with a current design, and some revisions to it, or a whole new one,
>>> that can be used to put these summit hallway rumors to rest.
>>
>>
>> I'd say the issue is comparatively easy to describe.  In a call sequence:
>>
>> 1. A sends a message to B
>> 2. B receives message
>> 3. B acts upon message
>> 4. B responds to message
>> 5. A receives response
>> 6. A acts upon response
>>
>> ... you can have a fault at any point in that message flow (consider crashes
>> or program restarts).  If you ask for something to happen, you wait for a
>> reply, and you don't get one, what does it mean?  The operation may have
>> happened, with or without success, or it may not have gotten to the far end.
>> If you send the message, does that mean you'd like it to cause an action
>> tomorrow?  A year from now?  Or perhaps you'd like it to just not happen?
>> Do you understand what Oslo promises you here, and do you think every person
>> who ever wrote an RPC call in the whole OpenStack solution also understood
>> it?
>>
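
To make the ambiguity in the six-step sequence above concrete, here is a minimal sketch in plain Python. It does not use oslo.messaging at all; the names Fault, Server, call and observe are hypothetical and exist only for this illustration. It shows that A sees the same "timeout" whether the request was lost, B crashed before acting, or B acted and the reply was lost.

```python
import enum


class Fault(enum.Enum):
    """Where the hypothetical failure is injected in the A -> B -> A flow."""
    NONE = "no fault"
    REQUEST_LOST = "request never reached B (steps 1-2)"
    CRASH_BEFORE_ACTING = "B crashed before acting (step 3)"
    REPLY_LOST = "B acted, but the reply never reached A (steps 4-5)"


class Server:
    """Stands in for B; `applied` counts how many times the action ran."""

    def __init__(self):
        self.applied = 0

    def handle(self, fault):
        # Faults before step 3 completes: nothing happens on B's side.
        if fault in (Fault.REQUEST_LOST, Fault.CRASH_BEFORE_ACTING):
            return None
        self.applied += 1  # step 3: B acts upon the message
        if fault is Fault.REPLY_LOST:
            return None  # the action happened, but A never hears about it
        return "ok"


def call(server, fault):
    """Stands in for A: returns the reply, or "timeout" if none arrives."""
    reply = server.handle(fault)
    return "timeout" if reply is None else reply


def observe(fault):
    """Return (what A sees, how many times B actually acted)."""
    server = Server()
    return call(server, fault), server.applied


if __name__ == "__main__":
    for fault in Fault:
        seen, applied = observe(fault)
        print(f"{fault.name:20} A sees {seen!r:10} B applied {applied}x")
```

From A's side, three of the four cases are indistinguishable, yet B's state differs between them. That is the question an API contract would have to answer: what a caller is entitled to assume after a timeout.
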
>
>Precisely - IMHO it's a shortcoming of the current o.m. RPC (and
>Notification) API in that it does not let the API user explicitly set
>the desired delivery guarantee when publishing.  Right now it's
>implied that the delivery guarantee is "At Most Once" but that's
>mostly not precisely defined in any meaningful way.
>
>Any messaging API should be explicit regarding what delivery
>guarantee(s) are possible.  In addition, an API should allow the user
>to designate the importance of a message on a per-send basis:  can
>this message be dropped?  can this message be duplicated?  At what
>point in time does the message become invalid (already offered for RPC
>via timeout, but not Notifications IIRC), etc....
>
>And well-understood failure modes... things always fail...
>
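
As a rough illustration of what "explicit on a per-send basis" could look like, here is a sketch in plain Python. None of this is the oslo.messaging API: Delivery, SendOptions, Envelope and make_envelope are hypothetical names, and the two-value guarantee is an assumption chosen to mirror the "can it be dropped / can it be duplicated" questions above.

```python
import enum
import time
from dataclasses import dataclass
from typing import Optional


class Delivery(enum.Enum):
    """Hypothetical per-send delivery guarantee."""
    AT_MOST_ONCE = "at-most-once"    # may be dropped, never duplicated
    AT_LEAST_ONCE = "at-least-once"  # retried until acked, may be duplicated


@dataclass(frozen=True)
class SendOptions:
    """What the sender declares about one message (assumed shape)."""
    delivery: Delivery
    ttl_seconds: Optional[float] = None  # None: the message never expires

    @property
    def may_drop(self) -> bool:
        return self.delivery is Delivery.AT_MOST_ONCE

    @property
    def may_duplicate(self) -> bool:
        return self.delivery is Delivery.AT_LEAST_ONCE


@dataclass(frozen=True)
class Envelope:
    """A payload together with the guarantees it was sent under."""
    payload: dict
    options: SendOptions
    sent_at: float

    def expired(self, now: float) -> bool:
        # A receiver (or an intermediary) can discard stale messages
        # instead of acting on them long after the sender gave up.
        ttl = self.options.ttl_seconds
        return ttl is not None and now - self.sent_at > ttl


def make_envelope(payload, options, now=None):
    """Stamp the message with its send time; `now` is injectable for tests."""
    return Envelope(payload, options, time.time() if now is None else now)


if __name__ == "__main__":
    opts = SendOptions(Delivery.AT_MOST_ONCE, ttl_seconds=30.0)
    env = make_envelope({"op": "resize"}, opts)
    print(opts.may_drop, opts.may_duplicate, env.expired(time.time()))
```

The point of the sketch is only that the guarantee and the expiry travel with each message, so both sender and receiver can reason about a failure the same way regardless of the backend.
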
>
>> I have opinions about other patterns we could use, but I don't want to push
>> my solutions here, I want to see if this is really as much of a problem as
>> it looks and if people concur with my summary above.  However, the right
>> approach is most definitely to create a new and more fitting set of oslo
>> interfaces for communication patterns, and then to encourage people to move
>> to the new ones from the old.  (Whether RabbitMQ is involved is neither here
>> nor there, as this is really a question of Oslo APIs, not their
>> implementation.)
>>
>
>Hmmmmmm... maybe.   Message bus technology is varied, and so is its
>behavior.  There are brokerless, point-to-point backends supported by
>oslo.messaging [1],[2] which will exhibit different
>capabilities/behaviors from the traditional broker-based
>store-and-forward backend (e.g. message acking end-to-end vs to the
>intermediary).
>
>All the more reason to have explicit delivery guarantees and well
>understood failure scenarios defined by the API.
>
>[1] http://docs.openstack.org/developer/oslo.messaging/zmq_driver.html
>[2] http://docs.openstack.org/developer/oslo.messaging/AMQP1.0.html
>
>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>
>
>-- 
>Ken Giusti  (kgiusti at gmail.com)

