[openstack-dev] [TripleO] Is Swift a good choice of database for the TripleO API?

Ben Nemec openstack at nemebean.com
Wed Dec 23 15:28:59 UTC 2015

On 12/23/2015 03:19 AM, Dougal Matthews wrote:
> On 22 December 2015 at 17:59, Ben Nemec <openstack at nemebean.com
> <mailto:openstack at nemebean.com>> wrote:
>     Can we just do git like I've been suggesting all along? ;-)
>     More serious discussion inline. :-)
>     On 12/22/2015 09:36 AM, Dougal Matthews wrote:
>     > Hi all,
>     >
>     > This topic came up in the 2015-12-15 meeting[1], and again briefly
>     > today. After working with the code that came out of the deployment
>     > library spec[2] I had some concerns with how we are storing the
>     > templates.
>     >
>     > Simply put, when we are dealing with 100+ files from
>     > tripleo-heat-templates, how can we ensure consistency in Swift
>     > without any atomicity or transactions? I think this is best
>     > explained with a couple of examples.
>     >
>     >  - When we create a new deployment plan (upload all the templates to
>     >    Swift), how do we handle the case where there is an error? For
>     >    example, if we are uploading 10 files, what do we do if the 5th
>     >    one fails for some reason? There is a patch to do a manual
>     >    rollback[3], but I have concerns about doing this in Python. If
>     >    Swift is completely inaccessible for a short period, the rollback
>     >    won't work either.
>     >
>     >  - When deploying to Heat, we need to download all the YAML files
>     >    from Swift. This can take a couple of seconds. What happens if
>     >    somebody starts to upload a new version of the plan in the
>     >    middle? We could end up trying to deploy half old and half new
>     >    files. We wouldn't have a consistent view of the database.
>     >
>     > We had a few suggestions in the meeting:
>     >
>     >  - Add a locking mechanism. I would be concerned about deadlocks or
>     >    having to lock for the full duration of a deploy.
>     There should be no need to lock the plan for the entire deploy.  It's
>     not like we're re-reading the templates at the end of the deploy today.
>      It's a one-shot read and then the plan could be unlocked, at least as
>     far as I know.
> Good point. That would be holding the lock for longer than we need.
>     The only option where we wouldn't need locking at all is the
>     read-copy-update model Clint mentions, which might be a valid option as
>     well.  Whatever we do, there are going to be concurrency issues though.
>      For example, what happens if two users try to make updates to the plan
>     at the same time?  If you don't either merge the changes or disallow one
>     of them completely then one user's changes might be lost.
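
A minimal sketch of the "disallow one of them completely" variant of
read-copy-update, assuming an in-memory stand-in for the template store.
PlanStore, StaleVersion and the file names are hypothetical and only
illustrate the idea; nothing here talks to Swift.

```python
import threading


class StaleVersion(Exception):
    """Raised when an update was based on an out-of-date plan version."""


class PlanStore:
    """Hypothetical in-memory plan store that keeps whole-plan snapshots."""

    def __init__(self, files):
        self._versions = [dict(files)]   # one complete snapshot per version
        self._lock = threading.Lock()    # held only while reading/swapping

    def read(self):
        """Return (version, copy of snapshot); never a half-written plan."""
        with self._lock:
            version = len(self._versions) - 1
            return version, dict(self._versions[version])

    def update(self, base_version, changes):
        """Copy the base snapshot, apply changes, publish if still current."""
        with self._lock:
            current = len(self._versions) - 1
            if base_version != current:
                raise StaleVersion(
                    "plan is at version %d, update was based on %d"
                    % (current, base_version))
            new_files = dict(self._versions[current])
            new_files.update(changes)
            self._versions.append(new_files)
            return current + 1


store = PlanStore({"overcloud.yaml": "v1"})
version, files = store.read()
store.update(version, {"overcloud.yaml": "v2"})       # first writer wins
try:
    store.update(version, {"overcloud.yaml": "v3"})   # second writer rejected
except StaleVersion as exc:
    print(exc)
```

A reader that already holds a snapshot keeps a consistent set of files
even if a new version is published afterwards, and the second writer has
to re-read and retry rather than silently overwrite the first.
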
>     TBH, this is further convincing me that we should just make this git
>     backed and let git handle the merging and conflict resolution (never
>     mind the fact that it gets us a well-understood version control system
>     for "free").  For updates that don't conflict with other changes, git
>     can merge them automatically, but for merge conflicts you just return a
>     rebase error to the user and make them resolve it.  I have a feeling
>     this is the behavior we'll converge on eventually anyway, and rather
>     than reimplement git, let's just use the real thing.
> I'd be curious to hear more about how you would go about doing this with
> git. I've never automated git to this level, so I am concerned about what
> issues we might hit.

TBH I haven't thought it through to that extent yet.  I'm mostly
suggesting it because it seems like a fit for the template storage
requirements - we know we want version control, we want to be able to
merge changes from multiple sources, and we want some way to handle
merge conflicts.  Git does all of this already.
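
As a rough illustration of the merge behaviour only (not a design), here
is a sketch that shells out to "git merge-file" for a single template.
It assumes the git binary is installed on the host; the helper name
three_way_merge is made up for this example and is not part of any
TripleO code.

```python
import os
import subprocess
import tempfile


def three_way_merge(base, ours, theirs):
    """Merge two edits of one file; return (merged_text, conflict_count)."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = {}
        for name, text in (("ours", ours), ("base", base), ("theirs", theirs)):
            paths[name] = os.path.join(tmp, name)
            with open(paths[name], "w") as handle:
                handle.write(text)
        # -p prints the merge result to stdout instead of rewriting "ours".
        # The exit status is the number of conflicts (capped at 127);
        # anything outside 0..127 is treated as a failure here.
        proc = subprocess.run(
            ["git", "merge-file", "-p",
             paths["ours"], paths["base"], paths["theirs"]],
            stdout=subprocess.PIPE, stderr=subprocess.PIPE,
            universal_newlines=True)
        if not 0 <= proc.returncode <= 127:
            raise RuntimeError(proc.stderr)
        return proc.stdout, proc.returncode
```

When the two edits touch different parts of the file the conflict count
is 0 and the merged text can be stored directly. Otherwise the returned
text contains the usual <<<<<<< / >>>>>>> markers.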

That said, I'm not sure about everything here.  For example, how would
you expose merge conflicts to the user?  I don't know that I would want
to force a user to learn git in order to use TripleO (although that
would be the devops-y thing to do), but maybe just passing them back the
files with the merge conflict markers and having them resolve those
locally and retry the update would work.  I'm not sure how that would
map to the current version of the API though.  Do we provide any way to
pass templates back to the user?  I feel like that was kind of a one-way

> Do you happen to know of any good sources where I can find out more?
>     /2 cents
>     >  - Store the files in a tarball (so we only deal with one file). I think we
>     >    could still hit issues with multiple operations unless we guarantee that
>     >    only one instance of the API is ever running.
>     >
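
A small sketch of the tarball idea, using only the Python standard
library. pack_plan and unpack_plan are hypothetical helper names, and the
sketch stops at producing the bytes; uploading them as one object is left
out. As noted above, this would not by itself stop two API instances from
overwriting each other's tarball.

```python
import io
import tarfile


def pack_plan(files):
    """Bundle {name: text} templates into one gzipped tarball (bytes)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        for name in sorted(files):
            data = files[name].encode("utf-8")
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def unpack_plan(blob):
    """Inverse of pack_plan: return {name: text} from tarball bytes."""
    files = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r:gz") as tar:
        for member in tar.getmembers():
            if member.isfile():
                data = tar.extractfile(member).read()
                files[member.name] = data.decode("utf-8")
    return files


plan = {"overcloud.yaml": "heat_template_version: 2015-04-30\n",
        "puppet/controller.yaml": "resources: {}\n"}
blob = pack_plan(plan)
print(sorted(unpack_plan(blob)))
```
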
>     > I think these could potentially be made to work, but at this point are we
>     > using the best tool for the job?
>     >
>     > For alternatives, I can think of a couple of options:
>     >
>     > - Use a database, like we did for Tuskar and most OpenStack APIs do.
>     I kind of like this, in particular because it would allow us to handle
>     migrations between versions the same way as other OpenStack services.
> Yeah, I feel like this is the "safe" option. It's a well trodden and
> tested path.
>     I'm not entirely sure how it maps to our template configuration method
>     though.  Storing a bunch of template blobs in the database feels a
>     little square peg, round hole to me, and might undo a lot of the
>     benefits of using the database in the first place.
> I don't follow this point. In the way we access Swift, everything is
> essentially a blob of text also.

True.  I guess I was still comparing to Git here, which does know about
the contents of the files it contains.

>     Now, on the other hand, having a database to store basic data like
>     metadata on plans and such might be a useful thing regardless of where
>     we store the templates themselves.  We could also use it for locking
>     just fine IMHO - TripleO isn't a tool targeted to have cloud-scale
>     number of users.  It's a deployer tool with a relatively limited number
>     of users per installation, so the scaling of a locking mechanism isn't a
>     problem I necessarily think we need to solve.  We have way bigger
>     scaling issues than that to tackle before I think we would hit the limit
>     of a database-locking scheme.
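
One way a database-backed lock could look, sketched with sqlite3 from the
standard library purely so that it runs anywhere. The plan_locks table
and the acquire/release helpers are assumptions made for illustration,
not an existing schema. The primary key on the plan name is what makes a
second acquire fail.

```python
import sqlite3


def init_locks(conn):
    """Create the (hypothetical) lock table if it does not exist yet."""
    with conn:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS plan_locks ("
            "plan TEXT PRIMARY KEY, owner TEXT NOT NULL)")


def acquire(conn, plan, owner):
    """Try to take the lock; True on success, False if it is already held."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO plan_locks (plan, owner) VALUES (?, ?)",
                (plan, owner))
        return True
    except sqlite3.IntegrityError:
        return False


def release(conn, plan, owner):
    """Drop the lock, but only if this owner actually holds it."""
    with conn:
        cur = conn.execute(
            "DELETE FROM plan_locks WHERE plan = ? AND owner = ?",
            (plan, owner))
    return cur.rowcount == 1


conn = sqlite3.connect(":memory:")
init_locks(conn)
if acquire(conn, "overcloud", "api-1"):
    try:
        pass  # one-shot read of the templates would go here
    finally:
        release(conn, "overcloud", "api-1")
```

The sketch has no expiry, so a crashed lock holder would leave the row
behind; a real scheme would need some timeout or cleanup on top of this.
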
>     > - Invest time in building something on Swift.
>     > - Glance was transitioning to be an Artifact store. I don't know the
>     >   status of this or if it would meet our needs.
>     >
>     > Any input, ideas or suggestions would be great!
>     >
>     > Thanks,
>     > Dougal
>     >
>     >
>     > [1]:
>     > http://eavesdrop.openstack.org/meetings/tripleo/2015/tripleo.2015-12-15-14.03.log.html#l-89
>     > [2]:
>     > https://specs.openstack.org/openstack/tripleo-specs/specs/mitaka/tripleo-overcloud-deployment-library.html
>     > [3]: https://review.openstack.org/#/c/257481/
>     >
>     >
>     >
>     __________________________________________________________________________
>     > OpenStack Development Mailing List (not for usage questions)
>     > Unsubscribe:
>     OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>     <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>     > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>     >
