[openstack-dev] [TripleO] Is Swift a good choice of database for the TripleO API?

Jiri Tomasek jtomasek at redhat.com
Wed Jan 6 11:27:12 UTC 2016


On 01/06/2016 11:48 AM, Dougal Matthews wrote:
>
>
> On 5 January 2016 at 17:09, Jiri Tomasek <jtomasek at redhat.com> wrote:
>
>     On 12/23/2015 07:40 PM, Steven Hardy wrote:
>
>         On Wed, Dec 23, 2015 at 11:05:05AM -0600, Ben Nemec wrote:
>
>             On 12/23/2015 10:26 AM, Steven Hardy wrote:
>
>                 On Wed, Dec 23, 2015 at 09:28:59AM -0600, Ben Nemec wrote:
>
>                     On 12/23/2015 03:19 AM, Dougal Matthews wrote:
>
>
>                         On 22 December 2015 at 17:59, Ben Nemec
>                         <openstack at nemebean.com> wrote:
>
>                              Can we just do git like I've been
>                         suggesting all along? ;-)
>
>                              More serious discussion inline. :-)
>
>                              On 12/22/2015 09:36 AM, Dougal Matthews
>                         wrote:
>                              > Hi all,
>                              >
>                              > This topic came up in the 2015-12-15
>                         meeting[1], and again briefly
>                              today.
>                              > After working with the code that came
>                         out of the deployment library
>                              > spec[2] I
>                              > had some concerns with how we are
>                         storing the templates.
>                              >
>                              > Simply put, when we are dealing with
>                         100+ files from
>                              tripleo-heat-templates
>                              > how can we ensure consistency in Swift
>                         without any atomicity or
>                              > transactions.
>                              > I think this is best explained with a
>                         couple of examples.
>                              >
>                              >  - When we create a new deployment plan
>                         (upload all the templates
>                              to swift)
>                              >    how do we handle the case where
>                         there is an error? For example,
>                              if we are
>                              >    uploading 10 files - what do we do
>                         if the 5th one fails for
>                              some reason?
>                              >    There is a patch to do a manual
>                         rollback[3], but I have
>                              concerns about
>                              >    doing this in Python. If Swift is
>                         completely inaccessible for a
>                              short
>                              >    period the rollback won't work either.
>                              >
>                              >  - When deploying to Heat, we need to
>                         download all the YAML files from
>                              > Swift.
>                              >    This can take a couple of seconds.
>                         What happens if somebody
>                              starts to
>                              >    upload a new version of the plan in
>                         the middle? We could end up
>                              trying to
>                              >    deploy half old and half new files.
>                         We wouldn't have a
>                              consistent view of
>                              >    the database.
>                              >
>                              > We had a few suggestions in the meeting:
>                              >
>                              >  - Add a locking mechanism. I would be
>                         concerned about deadlocks or
>                              > having to
>                              >    lock for the full duration of a deploy.
>
>                              There should be no need to lock the plan
>                         for the entire deploy.  It's
>                              not like we're re-reading the templates
>                         at the end of the deploy today.
>                               It's a one-shot read and then the plan
>                         could be unlocked, at least as
>                              far as I know.
>
>
>                         Good point. That would be holding the lock for
>                         longer than we need.
>
>                              The only option where we wouldn't need
>                         locking at all is the
>                              read-copy-update model Clint mentions,
>                         which might be a valid option as
>                              well.  Whatever we do, there are going to
>                         be concurrency issues though.
>                               For example, what happens if two users
>                         try to make updates to the plan
>                              at the same time?  If you don't either
>                         merge the changes or disallow one
>                              of them completely then one user's
>                         changes might be lost.
>
>                              TBH, this is further convincing me that
>                         we should just make this git
>                              backed and let git handle the merging and
>                         conflict resolution (never
>                              mind the fact that it gets us a
>                         well-understood version control system
>                              for "free").  For updates that don't
>                         conflict with other changes, git
>                              can merge them automatically, but for
>                         merge conflicts you just return a
>                              rebase error to the user and make them
>                         resolve it.  I have a feeling
>                              this is the behavior we'll converge on
>                         eventually anyway, and rather
>                              than reimplement git, let's just use the
>                         real thing.
>
>
>                         I'd be curious to hear more how you would go
>                         about doing this with git. I've
>                         never automated git to this level, so I am
>                         concerned about what issues we
>                         might hit.
>
>                     TBH I haven't thought it through to that extent
>                     yet.  I'm mostly
>                     suggesting it because it seems like a fit for the
>                     template storage
>                     requirements - we know we want version control, we
>                     want to be able to
>                     merge changes from multiple sources, and we want
>                     some way to handle
>                     merge conflicts.  Git does all of this already.
>
>                     That said, I'm not sure about everything here. 
>                     For example, how would
>                     you expose merge conflicts to the user?  I don't
>                     know that I would want
>                     to force a user to learn git in order to use
>                     TripleO (although that
>                     would be the devops-y thing to do), but maybe just
>                     passing them back the
>                     files with the merge conflict markers and having
>                     them resolve those
>                     locally and retry the update would work.  I'm not
>                     sure how that would
>                     map to the current version of the API though. Do
>                     we provide any way to
>                     pass templates back to the user?  I feel like that
>                     was kind of a one-way
>                     street.
>
>                 What part of the deployment API workflow could result
>                 in merge conflicts?
>
>                 My understanding was that it's something like:
>
>                 1. Take copy of reference templates tree
>                 2. Introspect templates, expose required parameters
>                 so user can be
>                 prompted for them
>                 3. Create environment file(s) derived from the user input
>                 4. Validate the combination of (1) and (3)
>                 5. Deploy the templates+environments
>
>                 On update, (1) would be "overwrite existing version of
>                 templates"
>
>             This update policy means you may have just blown away
>             someone else's
>             work, unless you rebase on the plan's templates
>             immediately before
>             updating (and even then there's a race if two people
>             submit updates at
>             the same time).
>
>         What has been proposed to date is somewhat more limited in
>         scope than what
>         you're hinting at (which I think is more of a
>         collaborate-on-templates
>         requirement?)
>
>         https://github.com/openstack/tripleo-specs/blob/master/specs/mitaka/tripleo-overcloud-deployment-library.rst
>
>         Here, you would expect any template collaboration to happen
>         outside of the
>         scope of the actual deployment workflow, so e.g step 1 above
>         consumes
>         either a packaged version of tripleo-heat-templates (which we
>         don't expect
>         to be routinely modified), or another location on the local
>         filesystem
>         (such as a repository managed by e.g git, outside of the
>         deployment
>         workflow).
>
>         The "plan" then takes a copy of the golden tree, prompts for
>         additional
>         inputs, validates and deploys it.
>
>         You are right though, if we allow concurrent update of the
>         plan, it's
>         possible that environments added to two versions of the plan
>         would have to
>         be merged, which could mean either conflicts or validation
>         errors (if two
>         operators select mutually exclusive configurations for example).
>
>             Possible example: Two operators are working on enabling
>             separate
>             features in their cloud, and need to make configuration
>             changes to the
>             plan to do so.  Let's say one decides they need to enable
>             the Storage
>             network, while the other decides to enable the Tenant
>             network.  The
>             first operator makes their changes, sends the update and
>             thinks their
>             work is done.  The second operator, working from the same
>             base set of
>             templates as the first, makes their changes and sends the
>             update.  Using
>             the "overwrite" method of conflict resolution the first
>             operator's
>             changes have just been silently destroyed with no
>             indication to either
>             user that anything bad happened.
>
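
The "disallow one of them" option can be sketched with a version check on update. This is only an in-memory illustration: `PlanStore`, `StaleUpdate` and the version counter are made-up names, nothing in the current API works this way, and a real store would need an atomic compare-and-swap, which is exactly what we are missing with Swift.

```python
class StaleUpdate(Exception):
    """Raised when an update is based on an outdated version of the plan."""


class PlanStore:
    """Hypothetical in-memory plan store with a version counter."""

    def __init__(self, files):
        self.files = dict(files)
        self.version = 0

    def read(self):
        # Hand out a copy together with the version it was read at.
        return self.version, dict(self.files)

    def update(self, base_version, files):
        # Reject the write if somebody else updated the plan in between.
        if base_version != self.version:
            raise StaleUpdate(
                "plan is at version %d, update was based on %d"
                % (self.version, base_version)
            )
        self.files = dict(files)
        self.version += 1
        return self.version


store = PlanStore({"overcloud.yaml": "base"})

# Both operators start from the same base version.
v_first, files_first = store.read()
v_second, files_second = store.read()

files_first["storage-network"] = "enabled"
store.update(v_first, files_first)

files_second["tenant-network"] = "enabled"
try:
    store.update(v_second, files_second)
except StaleUpdate as exc:
    print("second update rejected:", exc)
```

With this, the second operator gets an error instead of silently overwriting the first operator's change, and has to re-read the plan before retrying.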
>         Ok, so separating the two requirements alluded to here may
>         help improve
>         clarity:
>
>         1. Multiple users collaborating on the t-h-t tree as a whole.
>
>         2. Enabling multiple features via updates and avoiding
>         mid-air-collisions
>
>         I think (2) may be the simpler problem to consider, particularly
>         if a lock
>         of some sort is considered acceptable, e.g we explicitly do not
>         allow multiple
>         operators actively modifying the cloud concurrently.
>
>         That would also be consistent with the current heat behavior,
>         e.g even if
>         you did allow multiple operators to concurrently change a
>         plan, they cannot
>         concurrently update the overcloud via heat anyway (this will
>         change
>         eventually with convergence).
>
>         (1) is a much harder problem, and I can't help thinking it'd
>         be better
>         solved with existing tools (e.g document how to use git,
>         gerrit, jenkins &
>         CI test your own t-h-t tree, potentially allowing for
>         semi-automated
>         promotion of things between environments, a staging workflow).
>
>             I guess you could tell users "don't do that", but unless
>             you have
>             exactly one person making updates to the templates there's
>             going to be
>             the possibility of conflicts, and in the Swift case all it
>             takes is two
>             people editing the same file, even in completely different
>             areas, for
>             someone's changes to be lost.
>
>         Ok, good point, I think I'd been assuming more of a serialized
>         workflow as
>         a given, so it's definitely something to consider, thanks for
>         clarifying.
>
>         Steve
>
>         __________________________________________________________________________
>         OpenStack Development Mailing List (not for usage questions)
>         Unsubscribe:
>         OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>         <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>         http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>     To add some information here and (hopefully) clear things up a
>     bit: the current workflow does not manipulate the content of the
>     templates and environments.
>     We only set metadata on certain templates/environments and
>     create a single temporary environment file:
>
>     1. Upload files (using git, this means providing a git URL),
>     identify the capabilities-map file (capabilities_map.yaml) and
>     set its 'type' metadata to 'capabilities-map'.
>
>
> I think we have multiple ideas related to git floating around - using 
> git as an external input source, or using git as a data store that we 
> update and manage and store on the undercloud. Both seem valid.
>
>     2. Based on the capabilities-map information, identify the
>     'root-template' (overcloud.yaml), 'root-environment'
>     (overcloud-resource-registry-puppet.yaml) and 'environment'
>     (environments/*.yaml) files and store this information in those
>     files' 'type' metadata.
>
>
> I don't think we need to set this metadata. We can use the 
> capabilities-map as an index and look up that file each time we need 
> this information.

Good point, that saves us from having to store those.
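
A minimal sketch of using the capabilities-map as an index, assuming it parses into a dict with `root_template`, `root_environment` and nested `topics` / `environment_groups` / `environments` keys. That layout, the example file names and the helper `classify_files` are assumptions for illustration, not a settled format:

```python
# Hypothetical parsed capabilities_map.yaml; the key layout is an assumption.
CAPABILITIES_MAP = {
    "root_template": "overcloud.yaml",
    "root_environment": "overcloud-resource-registry-puppet.yaml",
    "topics": [
        {
            "title": "Networking",
            "environment_groups": [
                {
                    "environments": [
                        {"file": "environments/network-isolation.yaml"},
                        {"file": "environments/net-single-nic-with-vlans.yaml"},
                    ]
                }
            ],
        }
    ],
}


def classify_files(capabilities_map):
    """Derive each file's role from the map instead of per-file metadata."""
    roles = {
        capabilities_map["root_template"]: "root-template",
        capabilities_map["root_environment"]: "root-environment",
    }
    for topic in capabilities_map.get("topics", []):
        for group in topic.get("environment_groups", []):
            for env in group.get("environments", []):
                roles[env["file"]] = "environment"
    return roles


roles = classify_files(CAPABILITIES_MAP)
for path, role in sorted(roles.items()):
    print(role, path)
```

The roles are recomputed from the map each time they are needed, so nothing has to be written back to the stored files.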

>     3. Let the user select from the optional environments ('type' is
>     'environment') based on the constraints defined in the
>     capabilities-map. Store the information about selected
>     environments in 'enabled' metadata.
>
>
> The metadata for enabled environments is important, but I'll come 
> back to this below.
>
>     4. Generate a list of parameters by sending the templates,
>     root-environment and _enabled_ optional environments to
>     heat-validate (nested). Let the user set values for those
>     parameters and store the values in the parameter_defaults block
>     of a newly created temporary environment. Upload this
>     environment file to Swift and set its 'type' meta to
>     'temp-environment'.
>     5. Deploy - take everything from Swift, process the templates (to
>     resolve the urls in get_file etc.) and merge the environments in
>     order: root environment < enabled optional environments <
>     temporary environment. Then send this to the Heat API's Stack
>     Create.
>
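
The merge order in step 5 can be sketched roughly as below. The environments are shown as already-parsed dicts, the parameter and resource names are only illustrative, and `deep_merge` / `merge_environments` are hypothetical helpers, not the code we actually run (Heat's own environment handling may differ in detail):

```python
import copy


def deep_merge(base, override):
    """Return a new dict in which values from `override` win over `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def merge_environments(root_env, enabled_envs, temp_env):
    """Merge in precedence order: root < enabled optional < temporary."""
    result = {}
    for env in [root_env, *enabled_envs, temp_env]:
        result = deep_merge(result, env)
    return result


# Illustrative inputs only.
root = {
    "resource_registry": {"OS::TripleO::Compute": "puppet/compute.yaml"},
    "parameter_defaults": {"ControllerCount": 1},
}
optional = [{"parameter_defaults": {"NeutronNetworkType": "vxlan"}}]
temp = {"parameter_defaults": {"ControllerCount": 3}}

merged = merge_environments(root, optional, temp)
print(merged["parameter_defaults"])
```

The temporary environment is merged last, so the values the user entered override anything set by the root or optional environments.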
>     So you can see that we don't really manipulate the template
>     files, we just add metadata and create a single temporary
>     environment that holds the parameter values,
>
>
> Don't we allow users to upload new template files or update them? If 
> users need to delete a plan and create a new one for each version that 
> sounds painful.



>     although this is not really necessary and can be replaced by
>     storing the parameter values in a DB and then sending them as the
>     'parameters' param to Heat.  I think that storing files in Git is
>     a good idea as it is what we already have (t-h-t), but we probably
>     need to use a DB to store the metadata because the metadata is
>     plan-specific, whereas the Git repository is not (or is it meant
>     to be? That would mean creating a separate git repo for every
>     deployment attempt.)
>
>
> I think we need to be careful how we store any metadata. The key 
> advantage (AFAICT) with storing the files in git is that operators can 
> easily access and deploy them manually. However, if they need to 
> understand our bespoke metadata or extract it from a database to 
> understand the deploy then that advantage is lost. Maybe rather than 
> metadata we can update a file (or users can add this file) that 
> defines the deployment, this would be similar to one that has been 
> proposed to python-tripleoclient[1]. If we can then support this file 
> in python-heatclient it would mean a deploy could easily be understood 
> from the API, python-tripleoclient and python-heatclient. Even without 
> heatclient supporting this file, it is easy to look at and see how you 
> would call heatclient.
>
> [1]: https://review.openstack.org/#/c/249222/
>
> When we make a deploy, we will want to store the sha that we have 
> deployed, I am not sure where we want to store this information.

OK, so this approach involves branching the git repo on Plan creation, 
with the Plan metadata stored in the answers file that gets committed 
to that branch. Sounds good.
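
To illustrate how such a file maps onto a heatclient call, here is a rough sketch. It assumes the answers file parses into a dict with `templates` and `environments` keys (the format is still under review, so treat the layout and paths as assumptions) and uses the python-heatclient `heat stack-create` command with its `-f` and `-e` options. The helper name is hypothetical:

```python
# Hypothetical answers-file content, shown as an already-parsed dict.
ANSWERS = {
    "templates": "/usr/share/openstack-tripleo-heat-templates/",
    "environments": [
        "overcloud-resource-registry-puppet.yaml",
        "environments/network-isolation.yaml",
    ],
}


def heat_stack_create_args(answers, stack_name="overcloud",
                           root_template="overcloud.yaml"):
    """Build the heat CLI argument list that the answers file describes."""
    templates_dir = answers["templates"].rstrip("/")
    args = ["heat", "stack-create",
            "-f", "%s/%s" % (templates_dir, root_template)]
    # Environments are passed in order, so later ones take precedence.
    for env in answers.get("environments", []):
        args += ["-e", "%s/%s" % (templates_dir, env)]
    args.append(stack_name)
    return args


print(" ".join(heat_stack_create_args(ANSWERS)))
```

The point is only that the answers file is enough to reconstruct the deploy command by hand, without looking anything up in a database.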

Regarding uploading/updating templates, this sounds somewhat 
counterproductive to me. Is there a use case for adding/changing a 
template as part of Plan design? IMO if we want to add a template, it is 
usually done globally in t-h-t and not in a Plan-specific branch. I don't 
see when we would need to do this. Adding an environment is probably more 
valid, but that would also involve updating the capabilities map. We 
currently have the feature to add additional files to a plan because we 
use Swift and uploading files is a step in plan creation. Using Git, 
Plan creation is just a matter of pointing to a git repo.

This is why I tend to not touch the files and just store the metadata. 
Tying the metadata to the git repo (using an answers file and branching 
the repo on Plan creation) is a totally valid point.

>
>
>     To make sure that the Plan is in sync with the Git repo (t-h-t),
>     we can tie the Plan not just to a specific repository, but also
>     to a specific tag or commit. This way, if the user updates the
>     templates repository with changes he wants to use, he needs to
>     create a new Plan and start the deployment process over.
>
>     Correct me if I am wrong, but I think this approach resolves the
>     problems with merge conflicts. The Files and the Plan (Deployment)
>     are separate things - Files are stored in Git, and the Plan is
>     stored in the DB, holds the files' metadata and is tied to a Git
>     repo commit/tag.
>
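
As a sketch of what "tied to a commit" could look like, assuming a simple record per Plan. The `Plan` fields, the repository URL and the commit ids are all hypothetical, and this says nothing about where the record is actually stored:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Plan metadata kept apart from the template files (hypothetical)."""
    name: str
    repo_url: str
    commit: str  # tag or commit SHA the Plan was created from
    enabled_environments: tuple = ()


def is_in_sync(plan, current_commit):
    """A Plan is only valid for the exact commit it was created from."""
    return plan.commit == current_commit


def replan(plan, new_commit):
    """Templates changed: start over with a fresh Plan, do not mutate."""
    return Plan(name=plan.name, repo_url=plan.repo_url, commit=new_commit)


plan = Plan(
    name="overcloud",
    repo_url="https://example.com/tripleo-heat-templates.git",
    commit="abc123",
    enabled_environments=("environments/network-isolation.yaml",),
)

if not is_in_sync(plan, "def456"):
    plan = replan(plan, "def456")

print(plan)
```

Because the Plan never changes the files and is never merged with another Plan, there is nothing for a merge conflict to arise from. A new commit simply means a new Plan.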
>     Any changes to the templates themselves should be done in the
>     Git repo, and I am not convinced that we want to introduce
>     anything like that in the GUI/CLI deployment workflow since, as
>     was agreed before, Git is the best tool for making/tracking such
>     changes.
>
>     Jirka
>

Jirka