Open Stack

Fri Oct 28 13:11:17 UTC 2011

When you look at the scalability issue solely from the perspective of the cloud provider, requiring polling is the lazier solution, but not really the more scalable one. If you go nuts with caching, it might even be a bit more scalable.

But when you look at the distributed systems use cases, it's terrible for the system as a whole. People are polling your API because they want to know whether or not the state of the system is different from what they "remember" it to be. The more critical it is for them to rapidly know about changes, the more often they poll. Yes, there are ways to determine when changes are more likely to occur and thus optimize the polling interval. Some clients may be, in your eyes as the provider, overly aggressive. But who the hell are you to judge their use case and throttle them?

But here's the bottom line: The vast majority of work is completely wasted when polling is the change propagation method.

Push notifications don't make your core system any more complex. You push the change to a message queue and rely on another system to do the work.

The other system is scalable. It has no need to be stateless and can be run in an on-demand format using agents to handle the growing/shrinking notification needs.
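A minimal sketch of that split, in Python. Everything here is hypothetical: `record_state_change`, `notifier`, and the in-process `queue.Queue` are stand-ins for whatever the core system and a real message queue would look like. The point is only that the core publishes an event and moves on, while a separate worker does the fan-out.

```python
import queue
import threading

# Stand-in for a real message queue (assumption: any durable queue would do).
changes = queue.Queue()

def record_state_change(resource_id, new_state):
    """Core system: publish the change and return immediately."""
    changes.put({"resource": resource_id, "state": new_state})

def notifier(subscribers, stop):
    """Separate worker: drain the queue and fan each event out to subscribers."""
    while not stop.is_set() or not changes.empty():
        try:
            event = changes.get(timeout=0.1)
        except queue.Empty:
            continue
        for deliver in subscribers:
            deliver(event)  # e.g. an HTTP POST in a real system
        changes.task_done()
```

You can run more `notifier` workers as notification load grows and fewer as it shrinks, without touching the core system.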

Bryan brings up the point that some of these subscription endpoints may go away. That's a total red herring. You have mechanisms in place to detect failed deliveries and unsubscribe after a time (among other strategies).
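One such strategy, sketched in Python: count consecutive failed deliveries per endpoint and drop the subscription once it crosses a threshold. The `Subscriptions` class, the `send` callable, and the default of 3 failures are all assumptions for illustration, not anything an existing API provides.

```python
class Subscriptions:
    """Hypothetical registry that unsubscribes endpoints that keep failing."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = {}  # endpoint -> consecutive failed deliveries

    def subscribe(self, endpoint):
        self.failures[endpoint] = 0

    def notify_all(self, event, send):
        """send(endpoint, event) must return True on success, False on failure."""
        for endpoint in list(self.failures):
            if send(endpoint, event):
                self.failures[endpoint] = 0  # any success resets the count
            else:
                self.failures[endpoint] += 1
                if self.failures[endpoint] >= self.max_failures:
                    del self.failures[endpoint]  # endpoint is gone; unsubscribe
```

A dead endpoint costs you a handful of failed deliveries and then nothing, which is a lot cheaper than a live client polling forever.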

The bottom line: When pushing is the change propagation method, the vast majority of your work is meaningful work.

Let's do the math. Let's say I am interested in any change in VM state so I can auto-scale/auto-recover. Let's use a fairly simplistic polling strategy, but do it efficiently (and assume the API enables me to make a single API call to get state for all VMs). Let's pick 1 query/minute (in reality, you wouldn't pick a flat polling rate like this, but it is useful for this thought experiment).

Now multiply that by 1,000 customers. Or 100,000. Or 1,000,000.
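Here is that arithmetic as a rough Python sketch. The 1 query/minute rate and the customer counts come from the thought experiment above; the helper names and the `wasted_fraction` estimate (which assumes at most one poll per change is useful) are mine.

```python
MINUTES_PER_DAY = 24 * 60

def polls_per_day(customers, polls_per_minute=1):
    """Total API calls per day if every customer polls at a flat rate."""
    return customers * polls_per_minute * MINUTES_PER_DAY

def wasted_fraction(polls, changes):
    """Share of polls that found nothing new (at most one useful poll per change)."""
    return max(0.0, 1 - changes / polls)

for customers in (1_000, 100_000, 1_000_000):
    print(f"{customers:>9,} customers -> {polls_per_day(customers):>13,} polls/day")
```

One customer polling once a minute makes 1,440 calls a day. If you assume, purely for illustration, that their VMs change state 10 times a day, `wasted_fraction(1440, 10)` comes out to roughly 0.993: more than 99% of those calls told them nothing.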

Now let's say that the client is going through a cloud management service. And that service is serving 20% of your customer base. They are likely making queries across a wide range of resources, not just VMs. And they have to scale the polling from their end.

Both sides are thus trying to figure out how to scale work that is almost entirely pointless.

There's a reason you see the cloud management tools "pushing push". We've seen this IaaS polling across a bunch of clouds. It sucks.

-George

--
George Reese - Chief Technology Officer, enStratus
e: george.reese at enstratus.com    t: @GeorgeReese    p: +1.207.956.0217    f: +1.612.338.5041
enStratus: Governance for Public, Private, and Hybrid Clouds - @enStratus - http://www.enstratus.com
To schedule a meeting with me: http://tungle.me/GeorgeReese


[Openstack] Push vs Polling (from Versioning Thread)
