[Openstack] Push vs Polling (from Versioning Thread)

Bryan Taylor btaylor at rackspace.com
Fri Oct 28 07:16:38 UTC 2011


On 10/27/2011 02:14 PM, Monsyne Dragon wrote:
>> The web was not designed to deal with a bunch of clients needing to
>> know about infrastructure changes the instant they happen.
> True.  This whole issue is the reason Nova's existing notification system is designed as a push system.  Currently it's used to push error notifications and usage info, but there is no reason it could not eventually also provide notifications to end users.  After watching demos of large cloud users where they were polling APIs to see when their instances were ready (and often spinning up new ones when that didn't happen fast enough),
> I kept that use case in mind when coding the notifications system.
You are really doing ATOM + PSH, so you offer both push and pull models 
to clients, right?

The standard way to do long running operations is to POST and get back a 
request resource that you poll to see the status and preferably some 
kind of progress metric. It will eventually link to the outputs. 
Contrast this with a system where they submit their request and wait to 
hear back when it's done.  In both cases, let's say their request takes 
45 seconds to complete. After 30 seconds they get impatient. In the
first scenario they reload the request resource hoping it's done and see
their request is at 67%. They refresh 3 times in 6 seconds and see it's 
80% done. In the push scenario all they know is they haven't gotten word 
it's done. There's no way for them to take action to find out anything 
or ask for progress.
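
To make the first scenario concrete, here is a minimal sketch in Python.
It simulates the request resource in memory rather than talking to a
real service, and every name in it (RequestResource, poll_until_done,
the "status"/"progress"/"output" fields, the /servers/1234 link) is made
up for illustration, not taken from any Nova API.

```python
class RequestResource:
    """In-memory stand-in for the resource returned by the initial POST.

    Hypothetical: a real service would serve this over HTTP.
    """

    def __init__(self, total_seconds):
        self.total_seconds = total_seconds

    def get(self, elapsed):
        """Return what a GET would show `elapsed` seconds after the POST."""
        percent = min(100, round(100 * elapsed / self.total_seconds))
        body = {
            "status": "done" if percent == 100 else "running",
            "progress": percent,
        }
        if percent == 100:
            # Once finished, the request resource links to its output.
            body["output"] = "/servers/1234"  # hypothetical link
        return body


def poll_until_done(resource, interval, max_polls=100):
    """Poll at a fixed interval; return the (elapsed, body) pairs seen."""
    history = []
    for n in range(max_polls):
        elapsed = n * interval
        body = resource.get(elapsed)
        history.append((elapsed, body))
        if body["status"] == "done":
            break
    return history


if __name__ == "__main__":
    for elapsed, body in poll_until_done(RequestResource(45), interval=15):
        print(elapsed, body)
```

The point is that the client decides when to look and always gets a
progress figure back, which is exactly what the push-only model can't
offer.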

Expecting the requester to watch a feed until their item shows up is
bad, because they really want a high-resolution look at their request,
not a coarse-grained look across all requests, so give them a resource
suitable for their use case. It's like asking a sports fan to watch the
news to find out who wins. They want to watch their team play, not just
learn who won. Other people just want to know who wins, for all the
games. These are different views into the same events.
>> No, it doesn't. You push changes as they occur to a message queue. A
>> separate system tracks subscribers and sends them out. There is no
>> conversational state if done right.
> Indeed. This is how the notifications system works (or can work) right now. If you turn notifications on, nova pushes them to a rabbit queue.  A separate app, namely a hub using the standard PubSubHubbub protocol (plus an external rabbit queue -> feed generator app we wrote, called Yagi), manages the subscriptions and pings the subscribers.
I'll be one of your clients, actually. I don't think PubSubHubbub helps 
me much. Pitch me on why it's worth my time to go understand its 
registration protocol and why I should implement a server and poke holes 
in firewalls for you to call me as opposed to just running a cron job 
that unrolls the new pages of the feed to the end and sleeps. There's a 
reason this model won out.

An arguable advantage to a client is that I get slightly better data 
latency. But I'm polling within my data latency needs already.
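
For reference, the cron job I have in mind is roughly the sketch below.
It assumes the feed is paged oldest to newest with a "next" link on each
page, and that the client remembers the last page it read and the id of
the last entry it processed (and that this entry sits on that page).
The page layout, the unroll_new_entries name and the fake FEED dict are
all assumptions for illustration; a real client would fetch and parse
Atom over HTTP instead.

```python
def unroll_new_entries(fetch, page_url, last_seen_id=None):
    """Follow "next" links to the end of the feed.

    fetch(url) must return a dict with "entries" (oldest first) and an
    optional "next" URL. Entries on the starting page up to and including
    last_seen_id are skipped. Returns (new_entries, last_page_url) so the
    caller can store both as the marker for the next run.
    """
    new_entries = []
    last_page_url = page_url
    skipping = last_seen_id is not None
    while page_url is not None:
        page = fetch(page_url)
        for entry in page["entries"]:
            if skipping:
                if entry["id"] == last_seen_id:
                    skipping = False
                continue
            new_entries.append(entry)
        skipping = False  # the marker only applies to the starting page
        last_page_url = page_url
        page_url = page.get("next")
    return new_entries, last_page_url


# Fake two-page feed standing in for the real HTTP endpoint.
FEED = {
    "/feed?page=1": {"entries": [{"id": "e1"}, {"id": "e2"}],
                     "next": "/feed?page=2"},
    "/feed?page=2": {"entries": [{"id": "e3"}], "next": None},
}

if __name__ == "__main__":
    entries, marker_page = unroll_new_entries(FEED.__getitem__,
                                              "/feed?page=1", "e1")
    print([e["id"] for e in entries], marker_page)
```

Run it from cron at whatever interval matches your latency needs, save
the returned page URL and last entry id, and go back to sleep. No
inbound server, no firewall holes.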

>>>> Push notifications are the only mechanism for solving the scaling issue.
>>>> You push any changes to a message queue. Agents pick up the changes and
>>>> send them on to subscriber endpoints. Not that hard.
>>> Not that hard with a few fairly reliable clients. Very hard with a web scale set of unreliable clients while I simultaneously need to scale the back end.
> Actually, we are already implementing this at 200+ node scale in nova, since this is how we are handling the collection of usage data for billing.  At the moment it seems to be working reasonably well. We are not using the PSH hubs atm, since we are pushing to a few internal consumers via AtomPub and don't need the complex subscription management, but from our point of view, pinging a hub is no different from what we already do, and the hub handles the subscription details.
You advocate ATOM + PSH and I advocate ATOM + cache. The big difference 
is that clients have to know and care about PSH to use it, whereas with 
a cache, they don't.

> It would be nice to one day support notifications to end users, as I think it would be of great benefit to them. There's work that would need to be done around hub/Yagi auth, and I think there are some bigger fish to fry, nova functionality-wise, at the moment, but it is something to keep in mind.
>
Let Repose handle auth for you and add a Varnish cache in front of your
Atom feed and let them poll you. Done. You might want rate limiting that
favors conditional GETs, but the Repose team should handle that.
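
As a rough illustration of why conditional GETs make polling cheap,
here is a small Python simulation. It is not Repose or Varnish
configuration; FeedOrigin and PollingClient are hypothetical stand-ins
for the feed server and a client, and the only real-world behaviour
assumed is standard HTTP: the server sends an ETag, the client echoes it
in If-None-Match, and an unchanged feed comes back as 304 with no body.

```python
import hashlib


class FeedOrigin:
    """Hypothetical feed server that honours If-None-Match."""

    def __init__(self, body=b"<feed/>"):
        self.body = body

    def etag(self):
        # Derive the validator from the current feed content.
        return '"' + hashlib.sha256(self.body).hexdigest()[:16] + '"'

    def get(self, if_none_match=None):
        """Return (status, headers, body) as an HTTP GET would."""
        tag = self.etag()
        if if_none_match == tag:
            return 304, {"ETag": tag}, b""  # nothing new, empty body
        return 200, {"ETag": tag}, self.body


class PollingClient:
    """Hypothetical client that remembers the last ETag it saw."""

    def __init__(self, origin):
        self.origin = origin
        self.etag = None
        self.body = None

    def poll(self):
        status, headers, body = self.origin.get(self.etag)
        if status == 200:
            self.etag = headers["ETag"]
            self.body = body
        return status


if __name__ == "__main__":
    origin = FeedOrigin()
    client = PollingClient(origin)
    print(client.poll(), client.poll())  # first fetch, then revalidation
    origin.body = b"<feed><entry/></feed>"
    print(client.poll())                 # feed changed, full fetch again
```

A cache in front of the feed can answer most of these revalidations
itself, so the polling load rarely reaches the origin.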





