[Openstack] Push vs Polling (from Versioning Thread)

Bryan Taylor btaylor at rackspace.com
Fri Oct 28 03:20:32 UTC 2011


Just to be clear we are talking about APIs fit for customer consumption 
here, not internal integrations where both ends are under our control.

On 10/27/2011 11:38 AM, George Reese wrote:
>> I disagree. The web was designed specifically to solve the distributed scaling problem and it's based on HTTP polling. It scales pretty well. The argument that polling doesn't scale inevitably neglects using caching properly.
> The web was not designed to deal with a bunch of clients needing to
> know about infrastructure changes the instant they happen.
Physics and math weren't designed for that either. The CAP theorem 
simply doesn't allow a distributed system with an uptime guarantee to 
communicate changes "the instant they happen". The sooner you realize 
that the best your clients can hope for is eventual consistency, the 
sooner you'll realize that polling is just fine.
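
To make "polling with caching done properly" concrete, here is a minimal 
sketch of a client polling with HTTP-style conditional GETs. It is an 
in-process simulation, not a real HTTP client or any actual Rackspace or 
OpenStack API: the Resource and PollingClient classes, the state strings, 
and the ETag scheme are all assumptions made up for illustration.

```python
import hashlib


class Resource:
    """Hypothetical server-side resource whose ETag is derived from its state."""

    def __init__(self, state):
        self.state = state

    def etag(self):
        return hashlib.sha256(self.state.encode()).hexdigest()[:16]

    def get(self, if_none_match=None):
        """Return (status, etag, body), mimicking an HTTP conditional GET."""
        tag = self.etag()
        if if_none_match == tag:
            return 304, tag, None  # Not Modified: no body is sent
        return 200, tag, self.state


class PollingClient:
    """Hypothetical client that remembers the last ETag and body it saw."""

    def __init__(self, resource):
        self.resource = resource
        self.cached_etag = None
        self.cached_body = None
        self.full_fetches = 0

    def poll(self):
        status, tag, body = self.resource.get(self.cached_etag)
        if status == 200:
            # Only a changed representation costs a full transfer.
            self.cached_etag, self.cached_body = tag, body
            self.full_fetches += 1
        return self.cached_body


server = Resource("BUILD")
client = PollingClient(server)
for _ in range(5):
    client.poll()          # first poll is a 200, the next four are 304s
server.state = "ACTIVE"    # the change the client is waiting for
latest = client.poll()     # next poll picks it up
```

Four of the six polls above come back as 304 with no body. In a real 
deployment those are the responses an intermediary cache could answer 
without touching the origin, and the server keeps no per-client state.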

BTW, here's Roy Fielding's article on this subject of poll vs push.
http://roy.gbiv.com/untangled/2008/paper-tigers-and-hidden-dragons
> And API data should not be cached. The Rackspace API used to do that,
> and it created a mess.
I'm not sure what you are referring to, but this is a classic strawman. 
Somebody implemented a "mess" using caching, so caching is bad!? You 
didn't say what the mess was, so there's no way to even evaluate your 
statement.
>> Push doesn't scale because it requires the server to know about every client and track conversational state with them.
> No, it doesn't. You push changes as they occur to a message queue. A
> separate system tracks subscribers and sends them out. There is no
> conversational state if done right.

A "separate" system? That's why you think it's simple -- you push the 
hard part outside of your box and claim victory. It's not a separate 
system, it's all one big cloud. If there are N interested clients the 
process you described requires O(N) resources. Moving it to another tier 
means it's somebody else's O(N) resources. You are illustrating 
Fielding's point in the article above: "People need to understand that 
general-purpose PubSub is not a solution to scalability problems — it 
simply moves the problem somewhere else, and usually to a place that is 
inversely supported by the economics."

How exactly does this separate system know where to "send them out" to? 
Each client has to tell it and you have to store it and look it up on a 
per outbound message basis. And keep it accurate. Customers just love 
keeping you informed of where they want to send their messages. Do you 
know what happens when they forget to tell you they moved and they don't 
get the message? They blame you and ask for a credit memo. And do you 
know what happens when you tell them no? They go to your competitor. If 
there is no conversational state, then you aren't waiting for an 
acknowledgement from the other side for each message and you can't prove 
that it was delivered or even try again.
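
The bookkeeping described above can be sketched roughly as follows. This 
is a toy model under stated assumptions, not any real message queue or 
notification service: SubscriberRegistry, fan_out, fire_and_forget, and 
the client*.example URLs are all hypothetical names invented for the 
example.

```python
class SubscriberRegistry:
    """Hypothetical 'separate system' that tracks where to send events."""

    def __init__(self):
        self.endpoints = {}  # client_id -> callback address: O(N) stored state

    def subscribe(self, client_id, endpoint):
        self.endpoints[client_id] = endpoint

    def fan_out(self, event, deliver):
        """Send one event to every subscriber; return the number of attempts."""
        attempts = 0
        for endpoint in self.endpoints.values():
            deliver(endpoint, event)  # O(N) work for every single event
            attempts += 1
        return attempts


reachable = set()  # addresses where a client is actually listening
received = []      # deliveries that really arrived


def fire_and_forget(endpoint, event):
    # No acknowledgement: the sender learns nothing about success or failure.
    if endpoint in reachable:
        received.append((endpoint, event))


registry = SubscriberRegistry()
for i in range(1000):
    address = f"https://client{i}.example/hook"
    registry.subscribe(i, address)
    reachable.add(address)

# Client 7 moves and forgets to update its subscription.
reachable.discard("https://client7.example/hook")
reachable.add("https://client7-new.example/hook")

attempts = registry.fan_out("server 42 is ACTIVE", fire_and_forget)
missed = attempts - len(received)
```

The registry makes 1000 send attempts for one event and one of them 
silently goes nowhere. The sender only knows that here because the 
simulation can inspect both sides; without per-message acknowledgements 
(that is, conversational state) it has no way to detect or retry the 
miss.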




