[openstack-dev] [tripleo] service validation during deployment steps
Steven Hardy
shardy at redhat.com
Wed Jul 27 08:25:35 UTC 2016
Hi Emilien,
On Tue, Jul 26, 2016 at 03:59:33PM -0400, Emilien Macchi wrote:
> I would love to hear some feedback about $topic, thanks.
Sorry for the slow response. We did discuss this on IRC, but I'm providing
that feedback and some other comments below:
> On Fri, Jul 15, 2016 at 11:31 AM, Emilien Macchi <emilien at redhat.com> wrote:
> > Hi,
> >
> > Some people on the field brought interesting feedback:
> >
> > "As a TripleO User, I would like the deployment to stop immediately
> > after an resource creation failure during a step of the deployment and
> > be able to easily understand what service or resource failed to be
> > installed".
> >
> > Example:
> > If during step4 Puppet tries to deploy Neutron and OVS, but OVS fails
> > to start for some reasons, deployment should stop at the end of the
> > step.
I don't think anyone will argue against this use-case; we absolutely want
to enable better "fail fast" behaviour for deployment problems, as well as
better surfacing of why a deployment failed.
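To make the intended behaviour concrete, here is a minimal Python sketch of
"stop at the end of the failing step". It is not TripleO code: run_steps and
the shape of its arguments are made up purely for illustration.

```python
def run_steps(steps, validations):
    """Apply steps in order and stop after the first step that fails validation.

    steps: list of callables, one per deployment step (hypothetical).
    validations: dict mapping step number -> list of (name, check) pairs,
                 where check() returns True when the service is healthy.
    Returns (failed_step, failed_names), or (None, []) if every step passed.
    """
    for number, apply_step in enumerate(steps, start=1):
        apply_step()
        # Run every validation for this step so all failures are reported.
        failed = [name for name, check in validations.get(number, [])
                  if not check()]
        if failed:
            return number, failed
    return None, []
```

Later steps are never applied once a validation fails, and the return value
names the services that failed, which is the "what failed" half of the story.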
> > So there are 2 things in this user story:
> >
> > 1) Be able to run some service validation within a step deployment.
> > Note about the implementation: make the validation composable per
> > service (OVS, nova, etc) and not per role (compute, controller, etc).
+1, now that we have composable services, any validations need to be
associated with the services, not the roles.
That said, it's fairly easy to imagine that an interface like
step_config/config_settings could be used to wire in composable service
validations on a per-role basis, e.g. similar to what we do here, but
per-step:
https://github.com/openstack/tripleo-heat-templates/blob/master/overcloud.yaml#L1144
Similar to what was proposed (but never merged) here:
https://review.openstack.org/#/c/174150/15/puppet/controller-post-puppet.yaml
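Roughly, the wiring could look like the following Python sketch. The
"validation" key, the service names and the commands are assumptions for
illustration only; none of this is an existing tripleo-heat-templates
interface.

```python
from collections import defaultdict

# Hypothetical per-service data: each service declares its own per-step checks.
SERVICES = {
    "keystone": {"validation": {1: ["systemctl is-active httpd"]}},
    "heat-api": {"validation": {4: ["systemctl is-active openstack-heat-api"]}},
    "neutron-ovs-agent": {"validation": {4: ["ovs-vsctl show"]}},
}

# Hypothetical roles, each just a list of composable services.
ROLES = {
    "Controller": ["keystone", "heat-api", "neutron-ovs-agent"],
    "Compute": ["neutron-ovs-agent"],
}


def validations_for_role(role, roles=ROLES, services=SERVICES):
    """Merge the per-service validations of a role into a per-step mapping."""
    per_step = defaultdict(list)
    for service in roles[role]:
        for step, commands in services[service].get("validation", {}).items():
            per_step[step].extend(commands)
    return dict(per_step)
```

The validations stay defined per service, and the role only aggregates them,
so moving a service to another role moves its checks along with it.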
> > 2) Make this information readable and easy to access and understand
> > for our users.
> >
> > I have a proof-of-concept for 1) and partially 2), with the example of
> > OVS: https://review.openstack.org/#/c/342202/
> > This patch will make sure OVS is actually usable at step 4 by running
> > 'ovs-vsctl show' during the Puppet catalog and if it's working, it
> > will create a Puppet anchor. This anchor is currently not useful but
> > could be in future if we want to rely on it for orchestration.
> > I wrote the service validation in Puppet 2 years ago when doing Spinal
> > Stack with eNovance:
> > https://github.com/openstack/puppet-openstacklib/blob/master/manifests/service_validation.pp
> > I think we could re-use it very easily, it has been proven to work.
> > Also, the code is within our Puppet profiles, so it's by design
> > composable and we don't need to make any connection with our current
> > services with some magic. Validation will reside within Puppet
> > manifests.
> > If you look my PoC, this code could even live in puppet-vswitch itself
> > (we already have this code for puppet-nova, and some others).
I think having the validations inside the puppet implementation is OK, but
ideally I think we do want it to be part of the puppet modules themselves
(not part of the puppet-tripleo abstraction layer).
The issue I'd have with putting it in puppet-tripleo is that if we're going
to do this in a tripleo specific way, it should probably be done via a
method that's more config tool agnostic. Otherwise we'll have to recreate
the same validations for future implementations (I'm thinking specifically
about containers here, and possibly ansible[1]).
So, in summary, I'm +1 on getting this integrated if it can be done with
little overhead and it's something we can leverage via the puppet modules
vs puppet-tripleo.
> >
> > Ok now, what if validation fails?
> > I'm testing it here: https://review.openstack.org/#/c/342205/
> > If you look at /var/log/messages, you'll see:
> >
> > Error: /Stage[main]/Tripleo::Profile::Base::Neutron::Ovs/Openstacklib::Service_validation[openvswitch]/Exec[execute
> > openvswitch validation]/returns: change from notrun to 0 failed
> >
> > So it's pretty clear by looking at logs that openvswitch service
> > validation failed and something is wrong. You'll also notice in the
> > logs that deployed stopped at step 4 since OVS is not considered to
> > run.
> > It's partially addressing 2) because we need to make it more explicit
> > and readable. Dan Prince had the idea to use
> > https://github.com/ripienaar/puppet-reportprint to print a nice report
> > of Puppet catalog result (we haven't tried it yet). We could also use
> > Operational Tools later to monitor Puppet logs and find Service
> > validation failures.
This all sounds good, but we do need to think beyond the puppet
implementation, e.g. how will we enable similar validations in a container
based deployment?
I remember SpinalStack also used serverspec. Can you describe how using
that tool differed? (Was it only used for post-deploy validation of the
whole server, not per-step validation?)
I'm just wondering if the overhead of integrating per-service validations
via a more generic tool (not necessarily serverspec but something like it,
e.g. I've been looking at testinfra, which is a more python based tool
aiming to do similar things[2]) would be worth it?
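To give a feel for how small such tool-agnostic checks can be, here is a
stdlib-only Python sketch. It deliberately does not use serverspec or
testinfra; the function names are my own, and service_is_running assumes a
systemd host.

```python
import socket
import subprocess


def port_is_listening(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an errno value otherwise.
        return sock.connect_ex((host, port)) == 0


def service_is_running(unit):
    """Return True if systemd reports the unit as active (assumes systemd)."""
    result = subprocess.run(["systemctl", "is-active", "--quiet", unit])
    return result.returncode == 0
```

Checks like these don't care whether the service was configured by puppet,
ansible or a container, which is the property I'm after.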
Maybe this is complementary to any per-step validation done inside the
puppet modules, but we could do something like:
  outputs:
    role_data:
      description: Role data for the Heat Engine role.
      value:
        service_name: heat-api
        config_settings:
          heat::<settings ...>: foo
        step_config: |
          include ::tripleo::profile::base::heat::api
        validation:
          group: serverspec
          config:
            step_4: |
              Package "openstack-heat-api"
                should be installed
              Service "openstack-heat-api"
                should be enabled
                should be running
              Port "8004"
                should be listening
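The step_4 text above is only shorthand, but something would have to turn
it into concrete checks. As a sketch of that, assuming the shorthand is
exactly 'Kind "name"' lines followed by 'should ...' lines (parse_validation
is a hypothetical helper, not part of any existing tool):

```python
def parse_validation(text):
    """Parse 'Kind "name"' headers and 'should ...' lines into tuples.

    Returns a list of (kind, name, expectation) tuples, one per 'should' line.
    """
    checks = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("should "):
            if current is None:
                raise ValueError("expectation before any resource: %r" % line)
            checks.append(current + (line[len("should "):],))
        else:
            # A resource header such as: Service "openstack-heat-api"
            kind, _, name = line.partition(" ")
            current = (kind.lower(), name.strip('"'))
    return checks
```

Each resulting tuple could then be dispatched to whatever backend runs the
check (serverspec, testinfra, or a plain script) for the given step.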
Looking at the WIP container composable services patch[3], this can
probably be directly reused:
  outputs:
    role_data:
      description: Role data for the Keystone API role.
      value:
        config_settings: <the config settings>
        step_config: <stepconfig>
        puppet_tags: keystone_config
        docker_config:
          keystone:
            container_step_config: 1
            image:
              list_join:
              - '/'
              - [ {get_param: DockerNamespace}, {get_param: DockerKeystoneImage} ]
            net: host
            privileged: false
            restart: always
            volumes:
              - /run:/run
              - /var/lib/etc-data/json-config/keystone.json:/var/lib/kolla/config_files/keystone.json
            environment:
              - KOLLA_CONFIG_STRATEGY=COPY_ALWAYS
        validation:
          group: serverspec
          config:
            step_1: |
              Service "httpd"
                should be enabled
                should be running
              Port "5000"
                should be listening
              Port "35357"
                should be listening
Anyway, just some ideas there - I'm not opposed to what you suggest re the
puppet validations, but I'm very aware that we'll be left with a feature
gap if we *only* do that, then (pretty soon) enable fully containerized
deployments.
Thanks,
Steve
[1] http://lists.openstack.org/pipermail/openstack-dev/2016-July/099564.html
[2] https://testinfra.readthedocs.io/en/latest/
[3] https://review.openstack.org/#/c/330659/12/docker/services/keystone.yaml
> >
> >
> > So this email is a bootstrap of discussion, it's open for feedback.
> > Don't take my PoC as something we'll implement. It's an idea and I
> > think it's worth to look at it.
> > I like it for 2 reasons:
> > - the validation code reside within our profiles, so it's composable by design.
> > - it's flexible and allow us to test everything. It can be a bash
> > script, a shell command, a Puppet resource (provider, service, etc).
> >
> > Thanks for reading so far,
> > --
> > Emilien Macchi
--
Steve Hardy
Red Hat Engineering, Cloud