[openstack-dev] [tripleo] service validation during deployment steps

Steven Hardy shardy at redhat.com
Wed Jul 27 08:25:35 UTC 2016


Hi Emilien,

On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
> I would love to hear some feedback about $topic, thanks.

Sorry for the slow response; we did discuss this on IRC, but here is that
feedback along with some other comments:

> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi <emilien at redhat.com> wrote:
> > Hi,
> >
> > Some people on the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after a resource creation failure during a step of the deployment and
> > be able to easily understand what service or resource failed to be
> > installed".
> >
> > Example:
> > If during step 4 Puppet tries to deploy Neutron and OVS, but OVS fails
> > to start for some reason, the deployment should stop at the end of the
> > step.

I don't think anyone will argue against this use-case; we absolutely want
to enable a better "fail fast" for deployment problems, as well as better
surfacing of why a deployment failed.

> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a deployment step.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).

+1, now that we have composable services we need any validations to be
associated with the services, not the roles.

That said, it's fairly easy to imagine an interface like
step_config/config_settings being used to wire in composable service
validations on a per-role basis, e.g. similar to what we do here, but
per step:

https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144

Similar to what was proposed (but never merged) here:

https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml
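
To make that more concrete, the per-role wiring could look something like
this (purely a sketch - the resource, parameter, and step names below are
hypothetical, and how the per-service validation scripts get aggregated
would need to mirror whatever we do for step_config):

  # Hypothetical: collect each service's step 4 validation script into one
  # script config and run it on the role's nodes once step 4 completes.
  ControllerValidationConfigStep4:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config:
        list_join:
          - "\n"
          - {get_param: ServiceValidationScriptsStep4}  # hypothetical aggregated input

  ControllerValidationStep4:
    type: OS::Heat::SoftwareDeploymentGroup
    depends_on: ControllerDeployment_Step4  # hypothetical step resource name
    properties:
      name: ControllerValidationStep4
      servers: {get_param: servers}
      config: {get_resource: ControllerValidationConfigStep4}

A failing validation script would then fail the stack before later steps
run (assuming the next step depends on it), which gives the "fail fast"
behaviour described above.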

> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by running
> > 'ovs-vsctl show' during the Puppet catalog run, and if it's working, it
> > will create a Puppet anchor. This anchor is currently not useful but
> > could be in the future if we want to rely on it for orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing Spinal
> > Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily; it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's composable by
> > design, and we don't need to wire it up to our current services with
> > any magic. Validation will reside within Puppet manifests.
> > If you look at my PoC, this code could even live in puppet-vswitch itself
> > (we already have this code for puppet-nova, and some others).

I think having the validations inside the puppet implementation is OK, but
ideally we want them to be part of the puppet modules themselves
(not part of the puppet-tripleo abstraction layer).

The issue I'd have with putting it in puppet-tripleo is that if we're going
to do this in a TripleO-specific way, it should probably be done via a
method that's more config-tool agnostic.  Otherwise we'll have to recreate
the same validations for future implementations (I'm thinking specifically
about containers here, and possibly ansible[1]).

So, in summary, I'm +1 on getting this integrated if it can be done with
little overhead and it's something we can leverage via the puppet modules
rather than puppet-tripleo; a rough sketch of what that might look like is
below.
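
For reference, the kind of per-service check in the PoC is roughly the
following (a minimal sketch using the openstacklib::service_validation
define linked above, not the exact patch; the retry values are
illustrative):

  # Run 'ovs-vsctl show' during the catalog run; if it never succeeds the
  # underlying exec fails, which fails the Puppet run and therefore the step.
  openstacklib::service_validation { 'openvswitch':
    command   => 'ovs-vsctl show',
    tries     => 5,
    try_sleep => 2,
  }

If a check like that lived in puppet-vswitch (and equivalents in the other
modules) rather than in puppet-tripleo, any deployer consuming the modules
would get the same validation for free.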

> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
> > openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at the logs that the openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs that the deployment stopped at step 4 since OVS is not considered
> > to be running.
> > It's only partially addressing 2) because we still need to make it more
> > explicit and readable. Dan Prince had the idea to use
> > https://github.com/ripienaar/puppet-reportprint to print a nice report
> > of the Puppet catalog result (we haven't tried it yet). We could also
> > use Operational Tools later to monitor Puppet logs and find service
> > validation failures.

This all sounds good, but we do need to think beyond the puppet
implementation, e.g. how will we enable similar validations in a
container-based deployment?

I remember SpinalStack also used serverspec; can you describe how this
approach differs from using that tool (was serverspec only used for
post-deploy validation of the whole server, not per-step validation)?

I'm just wondering whether the overhead of integrating per-service
validations via a more generic tool would be worth it - not necessarily
serverspec but something like it, e.g. I've been looking at testinfra,
which is a Python-based tool aiming to do similar things[2].

Maybe this is complementary to any per-step validation done inside the
puppet modules, but we could do something like:

outputs:
  role_data:
    description: Role data for the Heat API role.
    value:
      service_name: heat-api
      config_settings:
        heat::<settings ...>: foo
      step_config: |
        include ::tripleo::profile::base::heat::api
      validation:
        group: serverspec
        config:
          step_4: |
            Package "openstack-heat-api"
              should be installed
            Service "openstack-heat-api"
              should be enabled
              should be running
            Port "8004"
              should be listening


Looking at the WIP container composable services patch[3], this can
probably be directly reused:

outputs:
  role_data:
    description: Role data for the Keystone API role.
    value:
      config_settings: <the config settings>
      step_config: <stepconfig>
      puppet_tags: keystone_config
      docker_config:
        keystone:
          container_step_config: 1
          image:
            list_join:
              - '/'
              - [ {get_param: DockerNamespace}, {get_param: DockerKeystoneImage} ]
          net: host
          privileged: false
          restart: always
          volumes:
            - /run:/run
            - /var/lib/etc-data/json-config/keystone.json:/var/lib/kolla/config_files/keystone.json
          environment:
            - KOLLA_CONFIG_STRATEGY=COPY_ALWAYS
      validation:
        group: serverspec
        config:
          step_1: |
            Service "httpd"
              should be enabled
              should be running
            Port "5000"
              should be listening
            Port "35357"
              should be listening

Anyway, just some ideas there - I'm not opposed to what you suggest re the
puppet validations, but I'm very aware that we'll be left with a feature
gap if we *only* do that and then (pretty soon) enable fully containerized
deployments.

Thanks,

Steve

[1] http://lists.openstack.org/pipermail/openstack-dev/2016-July/099564.html
[2] https://testinfra.readthedocs.io/en/latest/
[3] https://review.openstack.org/#/c/330659/12/docker/services/keystone.yaml


> >
> >
> > So this email is a bootstrap for discussion, and it's open for feedback.
> > Don't take my PoC as something we'll implement. It's an idea and I
> > think it's worth looking at.
> > I like it for 2 reasons:
> > - the validation code resides within our profiles, so it's composable by design.
> > - it's flexible and allows us to test everything. It can be a bash
> > script, a shell command, a Puppet resource (provider, service, etc).
> >
> > Thanks for reading so far,
> > --
> > Emilien Macchi
> 
> 
> 
> -- 
> Emilien Macchi
> 

-- 
Steve Hardy
Red Hat Engineering, Cloud


