Open Stack

Thu Jul 10 14:31:04 UTC 2014

On 7/10/2014 5:52 AM, Eoghan Glynn wrote:
> TL;DR: do we need to stabilize notifications behind a versioned
>        and discoverable contract?

Thanks for dusting this off. Versioning and published schemas for
notifications are important to the StackTach team.  It would be nice to
get this resolved. We're happy to help out.

> Folks,
>
> One of the issues that has been raised in the recent discussions with
> the QA team about branchless Tempest relates to some legacy defects
> in the OpenStack notification system.
>
> Now, I don't personally subscribe to the PoV that ceilometer, or
> indeed any other consumer of these notifications (e.g. StackTach), was
> at fault for going ahead and depending on this pre-existing mechanism
> without first fixing it.
>
> But be that as it may, we have a shortcoming here that needs to be
> called out explicitly, and possible solutions explored.
>
> In many ways it's akin to the un-versioned RPC that existed in nova
> before the versioned-rpc-apis BP[1] was landed back in Folsom IIRC,
> except that notification consumers tend to be at arms-length from the
> producer, and the effect of a notification is generally more advisory
> than actionable.
>
> A great outcome would include some or all of the following:
>
>  1. more complete in-tree test coverage of notification logic on the
>     producer side

Ultimately this is the core problem. A breaking change in the
notifications caused tests to fail in other systems. Should we add
more tests, or simply add version checking at the lower levels (as the
first pass of RPC versioning did)?

(more on this below)

>  2. versioned notification payloads to protect consumers from breaking
>     changes in payload format

Yep. As with RPC, the biggies are:
1. removal of fields from notifications
2. change in semantics of a particular field
3. addition of new fields (not a biggie)

The urgency for notifications is a little different from RPC, where there
is a method on the other end expecting a certain format. Notification
consumers have to be a little more forgiving when things don't come in
as expected.

This isn't a justification for breaking changes. Just stating that we
have some leeway.
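
To make the three cases above concrete, here is a rough sketch (not
anything that exists in nova or StackTach today) of how a producer-side
test could classify a payload change by comparing field names, and how a
consumer could read a field forgivingly. The function names and the
example fields are made up for illustration. Note that a change in the
semantics of a field cannot be caught this way; that case needs an
explicit version bump.

```python
def classify_payload_change(old_fields, new_fields):
    """Compare the field names of two versions of a notification payload.

    Removed fields are treated as breaking; added fields are not.
    Semantic changes to an existing field are invisible to this check.
    """
    old, new = set(old_fields), set(new_fields)
    removed = sorted(old - new)
    added = sorted(new - old)
    return {"removed": removed, "added": added, "breaking": bool(removed)}


def read_field(payload, name, default=None):
    """Forgiving consumer-side read: fall back to a default if absent."""
    return payload.get(name, default)


# Hypothetical field sets, for illustration only.
old_payload_fields = {"instance_id", "state", "memory_mb"}
new_payload_fields = {"instance_id", "state", "disk_gb"}
change = classify_payload_change(old_payload_fields, new_payload_fields)
```

Here `change["breaking"]` is true because `memory_mb` was dropped, while
the new `disk_gb` field on its own would have been harmless.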

I guess it really comes down to systems that use notifications for
critical synchronization vs. those that use them purely for information.

>  
>  3. external discoverability of which event types a service is emitting

These questions can be saved for later, but ...

Is the use-case that a downstream system can learn which queue to
subscribe to programmatically?

Is this a nice-to-have?

Would / should this belong in a metadata service?

>  4. external discoverability of which event types a service is consuming

Isn't this what the topic queues are for? Consumers should only
subscribe to the topics they're interested in.

> If you're thinking that sounds like a substantial chunk of cross-project
> work & co-ordination, you'd be right :)

Perhaps notification schemas should be broken out into one or more
separate repos? That way we can test them independently of the publishing
system. For example, our notigen event simulator [5] could use them.

These could just be dependent libraries/plugins to oslo.messaging.
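
As a sketch of what testing against a standalone schema might look like,
assuming a deliberately simple schema format (a dict of field name to
expected Python type) that I've made up here. A real schema repo would
more likely use CADF or a proper schema language, and nothing below is an
oslo.messaging API.

```python
# Hypothetical schema: top-level field name -> expected Python type.
NOTIFICATION_SCHEMA = {
    "event_type": str,
    "timestamp": str,
    "payload": dict,
}


def validate(notification, schema):
    """Return a list of problems; an empty list means the notification fits."""
    errors = []
    for field, expected_type in schema.items():
        if field not in notification:
            errors.append("missing field: %s" % field)
        elif not isinstance(notification[field], expected_type):
            errors.append("wrong type for %s: expected %s"
                          % (field, expected_type.__name__))
    return errors


# A simulated notification, as an event generator might emit it.
simulated = {
    "event_type": "compute.instance.create.end",
    "timestamp": "2014-07-10 14:31:04",
    "payload": {"instance_id": "abc"},
}
problems = validate(simulated, NOTIFICATION_SCHEMA)
```

Because the check only needs the schema and a dict, it can run against
simulated events with no publisher or message bus involved.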

>
> So the purpose of this thread is simply to get a read on the appetite
> in the community for such an effort. At the least it would require:
>
>  * trashing out the details in say a cross-project-track session at
>    the K* summit
>
>  * buy-in from the producer-side projects (nova, glance, cinder etc.)
>    in terms of stepping up to make the changes
>
>  * acquiescence from non-integrated projects that currently consume
>    these notifications
>
>    (we shouldn't, as good citizens, simply pull the rug out from under
>    projects such as StackTach without discussion upfront)

We'll adapt StackTach.v2 accordingly. StackTach.v3 is far less impacted
by notification changes since they are offloaded and processed in a
secondary step. Breaking changes will just stall the processing. I
suspect .v3 will be in place before .v2 is affected.

Adding version handling to Stack-Distiller (our notification->event
translator) should be pretty easy (and useful) [6].
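
Roughly what I have in mind, as a sketch only. This is not the current
stackdistiller code, and the "version" key, the "major.minor" format and
the per-version handlers are all assumptions, since notifications carry
no such version today.

```python
def parse_version(text):
    """Split a 'major.minor' string into a tuple of ints."""
    major, minor = text.split(".")
    return int(major), int(minor)


def is_compatible(producer_version, consumer_version):
    """Same major version, and the producer's minor is at least the
    consumer's (minor bumps are assumed to only add fields)."""
    p_major, p_minor = parse_version(producer_version)
    c_major, c_minor = parse_version(consumer_version)
    return p_major == c_major and p_minor >= c_minor


def translate(notification, handlers, default_version="1.0"):
    """Pick a handler by major version; unversioned input is treated
    as default_version."""
    version = notification.get("version", default_version)
    major = parse_version(version)[0]
    handler = handlers.get(major)
    if handler is None:
        raise ValueError("no handler for notification version %s" % version)
    return handler(notification)


# Hypothetical handlers: v2 is assumed to rename instance_id to uuid.
HANDLERS = {
    1: lambda n: {"instance": n["payload"]["instance_id"]},
    2: lambda n: {"instance": n["payload"]["uuid"]},
}
event = translate({"version": "2.0", "payload": {"uuid": "abc"}}, HANDLERS)
```

An unknown major version raises instead of guessing, which matches the
"breaking changes will just stall the processing" behaviour above.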

>  * dunno if the TC would need to give their imprimatur to such an
>    approach, or whether we could simply self-organize and get it done
>    without the need for governance resolutions etc.
>
> Any opinions on how desirable or necessary this is, and how the
> detailed mechanics might work, would be welcome.

A published set of schemas would be very useful for StackTach, and we'd
love to help out in any way possible. In the near term we have to press
on under the assumption that notification definitions are fragile.

> Apologies BTW if this has already been discussed and rejected as
> unworkable. I see a stalled versioned-notifications BP[2] and some
> references to the CADF versioning scheme in the LP fossil-record.
> Also an inconclusive ML thread from 2012[3], and a related grizzly
> summit design session[4], but it's unclear to me whether these
> aspirations got much traction in the end.

I'd really like to see the CADF work adopted vs. other, hand-rolled,
schema solutions.

> Cheers,
> Eoghan

Thanks again for gearing this up. Look forward to next steps.

-Sandy

[5] https://github.com/StackTach/notigen
[6] https://github.com/StackTach/stackdistiller

[openstack-dev] [all] Treating notifications as a contract