[openstack-dev] [all][api][tc][performance] API for getting only status of resources

Clint Byrum clint at fewbar.com
Wed Nov 4 06:51:26 UTC 2015


Excerpts from John Griffith's message of 2015-11-03 21:45:12 -0800:
> On Tue, Nov 3, 2015 at 3:20 PM, Boris Pavlovic <boris at pavlovic.me> wrote:
> 
> > Hi stackers,
> >
> > Projects such as Heat, Tempest, Rally, Scalar, and other tools that
> > work with OpenStack usually handle resources (e.g. VMs, Volumes,
> > Images, ...) in the following way:
> >
> > >>> resource = api.resource_do_some_stuff()
> > >>> while api.resource_get(resource["uuid"])["status"] != expected_status:
> > >>>    sleep(a_bit)
> >
> > For each async operation they poll, calling resource_get() many times,
> > which creates significant load on the API and DB layers due to the
> > nature of this request. (Usually getting full information about a resource
> > produces SQL requests that contain multiple JOINs, e.g. for a nova VM it's 6
> > joins.)
> >
> > What if we add a new API method that just returns the resource status by
> > UUID? Or even just extend the get request with a new argument that returns
> > only the status?
> >
> > Thoughts?
> >
> >
> > Best regards,
> > Boris Pavlovic
> >
> >
> Hey Boris,
> 
> As I asked in IRC, I'm kinda curious what the difference is here in terms
> of API and DB calls.  Currently we do a get by ID in the loop that you
> mention, and the only difference I see in what you're suggesting is a
> reduced payload maybe?  A response that only includes the status?
> 
> I may be missing an important idea here, but it seems to me that you would
> still have the same number of API calls and DB requests, just possibly a
> slightly smaller payload.  Let me know if that's not the case.

This is a scaling optimization. Reading fewer columns from the DB will
result in a leaner query. Even if the time difference for a single query
is indiscernible by humans, doing 1000 "SELECT status" vs. "SELECT *"
queries concurrently will show just how much faster this can be. There's
also the issue of ORM objects. If you can avoid building a whole object,
and just grab one field with one direct query, you'll save overall on RAM,
CPU, wire traffic, cache space, etc. It only makes sense to optimize at
this level if you expect many tight loops polling many resources.
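
To make the difference concrete, here is a rough sketch using Python's
built-in sqlite3 module. The schema, the table and column names, and the
helpers get_full(), get_status() and wait_for_status() are all made up for
illustration; none of this is actual Nova code, and a real full fetch would
involve more joins plus ORM object construction.

```python
import sqlite3
import time

# Hypothetical two-table schema standing in for a real resource model.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE flavors (id INTEGER PRIMARY KEY, name TEXT,
                          vcpus INTEGER, ram_mb INTEGER);
    CREATE TABLE instances (uuid TEXT PRIMARY KEY, status TEXT,
                            name TEXT, flavor_id INTEGER);
    INSERT INTO flavors VALUES (1, 'm1.small', 1, 2048);
    INSERT INTO instances VALUES ('abc-123', 'BUILD', 'vm-1', 1);
""")


def get_full(conn, uuid):
    """Full fetch: all columns plus a JOIN, assembled into a dict
    (a stand-in for building a whole ORM object)."""
    row = conn.execute(
        "SELECT i.uuid, i.status, i.name, f.name, f.vcpus, f.ram_mb "
        "FROM instances i JOIN flavors f ON f.id = i.flavor_id "
        "WHERE i.uuid = ?",
        (uuid,),
    ).fetchone()
    if row is None:
        return None
    keys = ("uuid", "status", "name", "flavor", "vcpus", "ram_mb")
    return dict(zip(keys, row))


def get_status(conn, uuid):
    """Status-only fetch: one column, no JOIN, no object assembly."""
    row = conn.execute(
        "SELECT status FROM instances WHERE uuid = ?", (uuid,)
    ).fetchone()
    return row[0] if row else None


def wait_for_status(conn, uuid, expected, interval=0.01, timeout=1.0):
    """Poll with the cheap query until the status matches or we time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_status(conn, uuid) == expected:
            return True
        time.sleep(interval)
    return False
```

Both helpers cost one round trip per poll, so the number of API calls and
DB requests stays the same. The saving is in what each request has to read,
join, build and send back, which only adds up when many pollers run at once.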
