[Openstack] Redhat overcloud deployment failing at post deployment step

Xu, Rongjie (NSB - CN/Hangzhou) rongjie.xu at nokia-sbell.com
Thu Aug 24 07:11:46 UTC 2017


Hi,

What kind of storage are you using? I am also deploy a 1controller + 1compute environment. And I got some errors like follows when I tried to use NFS as Cinder/Glance backend (if disable NFS, I got deployment successful)

And NFS server could be outside of Overcloud, right? Currently I deploy NFS server in Undercloud. (attach my storage environment file)

[stack at rcp ~]$ openstack stack failures list overcloud
overcloud.AllNodesDeploySteps.ControllerDeployment_Step1.0:
  resource_type: OS::Heat::StructuredDeployment
  physical_resource_id: e9fc0409-afca-407b-b43e-40a95cd783ca
  status: CREATE_FAILED
  status_reason: |
    Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
  deploy_stdout: |
    ...
    Notice: /Stage[main]/Pacemaker::Service/Service[pacemaker]/enable: enable changed 'false' to 'true'
    Notice: /Stage[main]/Pacemaker::Service/Service[corosync]/enable: enable changed 'false' to 'true'
    Notice: /Stage[main]/Pacemaker::Corosync/Exec[wait-for-settle]/returns: executed successfully
    Notice: /Stage[main]/Pacemaker::Stonith/Pacemaker::Property[Disable STONITH]/Exec[Creating cluster-wide property stonith-enabled]/returns: executed successfully
    Notice: /Stage[main]/Haproxy/Haproxy::Instance[haproxy]/Haproxy::Config[haproxy]/Concat[/etc/haproxy/haproxy.cfg]/File[/etc/haproxy/haproxy.cfg]/content: content changed '{md5}1f337186b0e1ba5ee82760cb437fb810' to '{md5}90fd221c4698a762b582d08c41b7e124'
    Notice: /File[/etc/haproxy/haproxy.cfg]/seluser: seluser changed 'unconfined_u' to 'system_u'
    Notice: /Stage[main]/Tripleo::Profile::Base::Haproxy/Exec[haproxy-reload]: Triggered 'refresh' from 1 events
    Notice: /Firewall[998 log all]: Dependency Exec[NFS mount for glance file backend] has failures: true
    Notice: /Firewall[999 drop all]: Dependency Exec[NFS mount for glance file backend] has failures: true
    Notice: Finished catalog run in 321.78 seconds
    (truncated, view all with --long)
  deploy_stderr: |
    exception: connect failed
    Warning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.
    Error: mount -t nfs '192.0.2.1:/glance' '/var/lib/glance/images' -o intr,context=system_u:object_r:glance_var_lib_t:s0 returned 32 instead of one of [0]
    Error: /Stage[main]/Tripleo::Glance::Nfs_mount/Exec[NFS mount for glance file backend]/returns: change from notrun to 0 failed: mount -t nfs '192.0.2.1:/glance' '/var/lib/glance/images' -o intr,context=system_u:object_r:glance_var_lib_t:s0 returned 32 instead of one of [0]
    Warning: /Firewall[998 log all]: Skipping because of failed dependencies
Warning: /Firewall[999 drop all]: Skipping because of failed dependencies


Best Regards
Xu Rongjie (Max)

From: Shyam Biradar [mailto:shyambiradarsggsit at gmail.com]
Sent: Thursday, August 24, 2017 14:50
To: Vagner Farias <vfarias at redhat.com>
Cc: openstack <openstack at lists.openstack.org>
Subject: Re: [Openstack] Redhat overcloud deployment failing at post deployment step

Thanks Vagner. Somehow I was able to find this blog for TripleO debugging, it helped me a lot. I am good now, overcloud deployment worked fine. It was network configuration issue in network environment file.





Thanks & Regards,
Shyam Biradar,
Email: shyambiradarsggsit at gmail.com<mailto:shyambiradarsggsit at gmail.com>,
Contact: +91 8600266938.

On Wed, Aug 23, 2017 at 6:28 PM, Vagner Farias <vfarias at redhat.com<mailto:vfarias at redhat.com>> wrote:
Hello Shyam,
As a general rule, I'd recommend using the following command to investigate deployment failures (after sourcing stackrc file). Send back the results to the list if the output still seems confusing.
$ openstack stack failures list --long overcloud

It'd also help the investigation if you could make the storage-environment.yaml and network-environment.yaml files available, together with the results of above command  (http://paste.openstack.org/ or somewhere else).

AllNodesDeploySteps is a huge stack with several nested stacks and the failure could have happened in any of the steps. Although the above command should provide a clue of what happened, if you are curious you may like to run the command below to list all the nested resources:
$ openstack stack resource list -n5
or, to get only the failed resources:

$ openstack stack resource list -n5 | grep FAIL
There a good explanation on how to debug tripleo heat templates at http://hardysteven.blogspot.com.br/2015/04/debugging-tripleo-heat-templates.html, if you want to go further.

--
Vagner Farias

On Wed, Aug 23, 2017 at 3:30 AM, Shyam Biradar <shyambiradarsggsit at gmail.com<mailto:shyambiradarsggsit at gmail.com>> wrote:
Hi,


I am installing Redhat openstack platform 10 on virtual environment (KVM) using pxe_ssh ipmi driver.
Undercloud, compute, controller all three nodes are available on single kvm box. Using single nic config.

Overcloud deployment failing during post deployement step with following error:
-------------------------------------------------------
017-08-22 13:42:55Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step4]: CREATE_FAILED  Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6


-------------------------------------------------------

Corresponding heat resource is

---------------------------------------------------------
[stack at redhat-undercloud ~]$ openstack stack resource list overcloud | grep FAILED
| AllNodesDeploySteps                       | 186d4a53-e171-4184-a8e2-4f5fbc1290ee         | OS::TripleO::PostDeploySteps                    | CREATE_FAILED   | 2017-08-22T13:13:47Z |
[stack at redhat-undercloud ~]$
------------------------------------------------------------



I am using following command to deploy overcloud:
------------------------------------------------------------

openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ~/templates/network-environment.yaml \
-e ~/templates/storage-environment.yaml \
--control-scale 1 --compute-scale 1 --control-flavor control --compute-flavor compute \
--ntp-server 0.north-america.pool.ntp.org<http://0.north-america.pool.ntp.org> --neutron-network-type vxlan --neutron-tunnel-types vxlan \
--validation-errors-fatal --validation-warnings-fatal --timeout 90
-------------------------------------------------------------


No errors I could find in os-collect-config or heat logs except following:

Aug 22 23:52:03 localhost os-collect-config: /var/lib/os-collect-config/local-data not found. Skipping
Aug 22 23:52:03 localhost os-collect-config: No local metadata found (['/var/lib/os-collect-config/local-data'])


I have looked into /var/log/heat/*, os-collect-config logs. Any other log files that I should look into?


Thanks & Regards,
Shyam Biradar,
Email: shyambiradarsggsit at gmail.com<mailto:shyambiradarsggsit at gmail.com>,
Contact: +91 8600266938<tel:+91%2086002%2066938>.

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack at lists.openstack.org<mailto:openstack at lists.openstack.org>
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20170824/ca57d25f/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: storage-environment.yaml
Type: application/octet-stream
Size: 2234 bytes
Desc: storage-environment.yaml
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20170824/ca57d25f/attachment.obj>


More information about the Openstack mailing list