[Openstack] Redhat overcloud deployment failing at post deployment step
Xu, Rongjie (NSB - CN/Hangzhou)
rongjie.xu at nokia-sbell.com
Thu Aug 24 07:11:46 UTC 2017
Hi,
What kind of storage are you using? I am also deploy a 1controller + 1compute environment. And I got some errors like follows when I tried to use NFS as Cinder/Glance backend (if disable NFS, I got deployment successful)
And NFS server could be outside of Overcloud, right? Currently I deploy NFS server in Undercloud. (attach my storage environment file)
[stack at rcp ~]$ openstack stack failures list overcloud
overcloud.AllNodesDeploySteps.ControllerDeployment_Step1.0:
resource_type: OS::Heat::StructuredDeployment
physical_resource_id: e9fc0409-afca-407b-b43e-40a95cd783ca
status: CREATE_FAILED
status_reason: |
Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
deploy_stdout: |
...
Notice: /Stage[main]/Pacemaker::Service/Service[pacemaker]/enable: enable changed 'false' to 'true'
Notice: /Stage[main]/Pacemaker::Service/Service[corosync]/enable: enable changed 'false' to 'true'
Notice: /Stage[main]/Pacemaker::Corosync/Exec[wait-for-settle]/returns: executed successfully
Notice: /Stage[main]/Pacemaker::Stonith/Pacemaker::Property[Disable STONITH]/Exec[Creating cluster-wide property stonith-enabled]/returns: executed successfully
Notice: /Stage[main]/Haproxy/Haproxy::Instance[haproxy]/Haproxy::Config[haproxy]/Concat[/etc/haproxy/haproxy.cfg]/File[/etc/haproxy/haproxy.cfg]/content: content changed '{md5}1f337186b0e1ba5ee82760cb437fb810' to '{md5}90fd221c4698a762b582d08c41b7e124'
Notice: /File[/etc/haproxy/haproxy.cfg]/seluser: seluser changed 'unconfined_u' to 'system_u'
Notice: /Stage[main]/Tripleo::Profile::Base::Haproxy/Exec[haproxy-reload]: Triggered 'refresh' from 1 events
Notice: /Firewall[998 log all]: Dependency Exec[NFS mount for glance file backend] has failures: true
Notice: /Firewall[999 drop all]: Dependency Exec[NFS mount for glance file backend] has failures: true
Notice: Finished catalog run in 321.78 seconds
(truncated, view all with --long)
deploy_stderr: |
exception: connect failed
Warning: Scope(Haproxy::Config[haproxy]): haproxy: The $merge_options parameter will default to true in the next major release. Please review the documentation regarding the implications.
Error: mount -t nfs '192.0.2.1:/glance' '/var/lib/glance/images' -o intr,context=system_u:object_r:glance_var_lib_t:s0 returned 32 instead of one of [0]
Error: /Stage[main]/Tripleo::Glance::Nfs_mount/Exec[NFS mount for glance file backend]/returns: change from notrun to 0 failed: mount -t nfs '192.0.2.1:/glance' '/var/lib/glance/images' -o intr,context=system_u:object_r:glance_var_lib_t:s0 returned 32 instead of one of [0]
Warning: /Firewall[998 log all]: Skipping because of failed dependencies
Warning: /Firewall[999 drop all]: Skipping because of failed dependencies
Best Regards
Xu Rongjie (Max)
From: Shyam Biradar [mailto:shyambiradarsggsit at gmail.com]
Sent: Thursday, August 24, 2017 14:50
To: Vagner Farias <vfarias at redhat.com>
Cc: openstack <openstack at lists.openstack.org>
Subject: Re: [Openstack] Redhat overcloud deployment failing at post deployment step
Thanks Vagner. Somehow I was able to find this blog for TripleO debugging, it helped me a lot. I am good now, overcloud deployment worked fine. It was network configuration issue in network environment file.
Thanks & Regards,
Shyam Biradar,
Email: shyambiradarsggsit at gmail.com<mailto:shyambiradarsggsit at gmail.com>,
Contact: +91 8600266938.
On Wed, Aug 23, 2017 at 6:28 PM, Vagner Farias <vfarias at redhat.com<mailto:vfarias at redhat.com>> wrote:
Hello Shyam,
As a general rule, I'd recommend using the following command to investigate deployment failures (after sourcing stackrc file). Send back the results to the list if the output still seems confusing.
$ openstack stack failures list --long overcloud
It'd also help the investigation if you could make the storage-environment.yaml and network-environment.yaml files available, together with the results of above command (http://paste.openstack.org/ or somewhere else).
AllNodesDeploySteps is a huge stack with several nested stacks and the failure could have happened in any of the steps. Although the above command should provide a clue of what happened, if you are curious you may like to run the command below to list all the nested resources:
$ openstack stack resource list -n5
or, to get only the failed resources:
$ openstack stack resource list -n5 | grep FAIL
There a good explanation on how to debug tripleo heat templates at http://hardysteven.blogspot.com.br/2015/04/debugging-tripleo-heat-templates.html, if you want to go further.
--
Vagner Farias
On Wed, Aug 23, 2017 at 3:30 AM, Shyam Biradar <shyambiradarsggsit at gmail.com<mailto:shyambiradarsggsit at gmail.com>> wrote:
Hi,
I am installing Redhat openstack platform 10 on virtual environment (KVM) using pxe_ssh ipmi driver.
Undercloud, compute, controller all three nodes are available on single kvm box. Using single nic config.
Overcloud deployment failing during post deployement step with following error:
-------------------------------------------------------
017-08-22 13:42:55Z [overcloud.AllNodesDeploySteps.ControllerDeployment_Step4]: CREATE_FAILED Resource CREATE failed: Error: resources[0]: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 6
-------------------------------------------------------
Corresponding heat resource is
---------------------------------------------------------
[stack at redhat-undercloud ~]$ openstack stack resource list overcloud | grep FAILED
| AllNodesDeploySteps | 186d4a53-e171-4184-a8e2-4f5fbc1290ee | OS::TripleO::PostDeploySteps | CREATE_FAILED | 2017-08-22T13:13:47Z |
[stack at redhat-undercloud ~]$
------------------------------------------------------------
I am using following command to deploy overcloud:
------------------------------------------------------------
openstack overcloud deploy --templates -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e ~/templates/network-environment.yaml \
-e ~/templates/storage-environment.yaml \
--control-scale 1 --compute-scale 1 --control-flavor control --compute-flavor compute \
--ntp-server 0.north-america.pool.ntp.org<http://0.north-america.pool.ntp.org> --neutron-network-type vxlan --neutron-tunnel-types vxlan \
--validation-errors-fatal --validation-warnings-fatal --timeout 90
-------------------------------------------------------------
No errors I could find in os-collect-config or heat logs except following:
Aug 22 23:52:03 localhost os-collect-config: /var/lib/os-collect-config/local-data not found. Skipping
Aug 22 23:52:03 localhost os-collect-config: No local metadata found (['/var/lib/os-collect-config/local-data'])
I have looked into /var/log/heat/*, os-collect-config logs. Any other log files that I should look into?
Thanks & Regards,
Shyam Biradar,
Email: shyambiradarsggsit at gmail.com<mailto:shyambiradarsggsit at gmail.com>,
Contact: +91 8600266938<tel:+91%2086002%2066938>.
_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack at lists.openstack.org<mailto:openstack at lists.openstack.org>
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20170824/ca57d25f/attachment.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: storage-environment.yaml
Type: application/octet-stream
Size: 2234 bytes
Desc: storage-environment.yaml
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20170824/ca57d25f/attachment.obj>
More information about the Openstack
mailing list