[openstack-dev] [TripleO] Tuskar CLI after architecture changes

Jay Pipes jaypipes at gmail.com
Sat Dec 14 15:51:18 UTC 2013


On Thu, 2013-12-12 at 15:22 +0100, Hugh O. Brock wrote:
> On Thu, Dec 12, 2013 at 03:11:14PM +0100, Ladislav Smola wrote:
> > Agree with this.
> > 
> > Though I am an optimist, I believe that this time, we can avoid
> > calling multiple services in one request that depend on each other.
> > About the multiple users at once, this should be solved inside the
> > API calls of the services.
> > 
> > So I think we should forbid building these complex composite API
> > calls in the Tuskar-API. If we want something like this, we should
> > implement it properly inside the services themselves. If we are not
> > able to convince the community about it, maybe it's just not that
> > good a feature. :-D
> > 
> 
> It's worth adding that in the particular case Radomir cites (the
> "Deploy" button), even with all the locks in the world, the resources
> that we have supposedly requisitioned in the undercloud for the user may
> have already been allocated to someone else by Nova -- because Nova
> currently doesn't allow reservation of resources. (There is work under
> way to allow this but it is quite a way off.) So we could find ourselves
> claiming for the user that we're going to deploy an overcloud at a
> certain scale and then find ourselves unable to do so.
> 
> Frankly I think the whole multi-user case for Tuskar is far enough off
> that I would consider wrapping a single-login restriction around the
> entire thing and calling it a day... except that that would be
> crazy. I'm just trying to make the point that making these operations
> really safe for multiple users is way harder than just putting a lock on
> the tuskar API.

That's actually not that crazy, Hugh :) We've deployed more than a half
dozen availability zones, and I've never seen anyone trample over each
other trying to deploy OpenStack to the same set of bare-metal machines
at the same time... it simply doesn't happen in the real world -- or at
least, it would be so exceedingly rare that trying to deal with this
kind of thing is more of an academic exercise than anything else.

Instead of focusing on locking issues -- which I agree are very
important in the virtualized side of things where resources are
"thinner" -- I believe that in the bare-metal world, a more useful focus
would be to ensure that the Tuskar API service treats related group
operations (like "deploy an undercloud on these nodes") in a way that
can handle failures in a graceful and/or atomic way.
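
To make "atomic" a bit more concrete, here is a minimal sketch of an
all-or-nothing group operation. It is not taken from Tuskar: the names
deploy_group, deploy_node, undeploy_node and DeployError are hypothetical,
and the per-node callables stand in for whatever the real service would do.

```python
class DeployError(Exception):
    """Raised when a group deployment fails and has been rolled back."""


def deploy_group(nodes, deploy_node, undeploy_node):
    """Deploy to every node, or undo the finished ones and raise.

    deploy_node and undeploy_node are hypothetical per-node callables.
    """
    done = []
    current = None
    try:
        for current in nodes:
            deploy_node(current)
            done.append(current)
    except Exception as exc:
        # Undo in reverse order so the group is left as we found it.
        for node in reversed(done):
            undeploy_node(node)
        raise DeployError(
            "deploy failed on %r; rolled back %d node(s)"
            % (current, len(done))
        ) from exc
    return done
```

The caller sees either every node deployed or none of them, which is
easier to reason about than a half-finished group. A real service would
also have to cope with the rollback itself failing, which this sketch
does not attempt.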

For example, if the construction or installation of one compute worker
failed, adding some retry or retry-after-wait-for-event logic would be
more useful than trying to put locks in a bunch of places to prevent
multiple sysadmins from trying to deploy on the same bare-metal nodes
(since it's just not gonna happen in the real world, and IMO, if it did
happen, the sysadmins/deployers should be punished and have to clean up
their own mess ;)
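
As a rough illustration of the retry idea, assuming nothing about how
Tuskar is actually structured: the helper below (the name retry and its
parameters are made up for this example) calls an operation again after
a wait, doubling the wait each time, and gives up after a fixed number
of attempts.

```python
import time


def retry(operation, attempts=3, wait=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or attempts run out.

    Waits `wait` seconds before the first retry and doubles the wait
    each time. The last failure is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(wait)
            wait *= 2
```

The sleep argument is injectable so that a "wait for event" variant
could pass in something that blocks until, say, a node reports ready,
instead of sleeping for a fixed time. A production version would likely
retry only on specific exception types rather than on every Exception.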

Best,
-jay



