Need help deploying Openstack
Hi,

I am trying to deploy OpenStack with TripleO using VMs and nested KVM for the compute node. This is for test and learning purposes.

I am using the Train version and following some tutorials. I prepared my different template files and started the deployment, but I got these errors:

Failed to provision instance fc40457e-4b3c-4402-ae9d-c528f2c2ad30: Asynchronous exception: Node failed to deploy. Exception: Agent API for node 6d3724fc-6f13-4588-bbe5-56bc4f9a4f87 returned HTTP status code 404 with error: Not found: Extension with id iscsi not found. for node

and

Got HTTP 409: {"errors": [{"status": 409, "title": "Conflict", "detail": "There was a conflict when trying to complete your request.\n\n Unable to allocate inventory: Unable to create allocation for 'CUSTOM_BAREMETAL' on resource provider '6d3724fc-6f13-4588-bbe5-56bc4f9a4f87'. The requested amount would exceed the capacity. ",

Could you help me understand what these errors mean? I couldn't find anything similar on the net.

Thanks in advance.
Regards.
Hi Wodel,

Yes, it's possible to deploy OpenStack with TripleO using VMs and nested KVM for the compute node. I personally use this tool on my hypervisor to do it:

https://github.com/cjeanner/tripleo-lab

If you try to get TripleO working with nested KVM without a tool like the above, you may eventually end up building your own version of the same tool; using it lets you skip those steps.

The issue you're hitting, judging from the error messages you quoted, looks like the Nova scheduler on the undercloud not finding an Ironic node that satisfies the scheduling criteria. This can be debugged, but you might find it easier to simply not have the problem by letting another tool deal with it for you. Also, with Wallaby and newer, TripleO does not use Nova on the undercloud; instead the recommended deployment process uses metalsmith, as described here:

https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/provisio...

You also have the standalone option of using TripleO on a single VM:

https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployme...

John
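As a concrete illustration of those scheduling prerequisites, a minimal sketch of checks on the undercloud (standard openstackclient/ironicclient commands; the baremetal flavor name is the usual Train default and may differ in your templates):

    # On the undercloud, as the stack user
    source ~/stackrc

    # Nodes should be "available", powered off and not in maintenance
    openstack baremetal node list

    # Each node should expose the resource class the scheduler will ask for
    openstack baremetal node show <node-uuid> -f value -c resource_class -c properties

    # The flavor used for scheduling should request that resource class,
    # e.g. resources:CUSTOM_BAREMETAL='1'
    openstack flavor show baremetal -f value -c properties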
Thanks John,

My idea is to learn how to deploy OpenStack with TripleO the way it would be done with physical nodes. My goal is not just to get OpenStack up and running so I can learn how to use it; my goal is to install it in the same way as if it were a physical implementation.

About Wallaby, I have tried it, but I got other errors. I followed these documents:

https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features...
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features...
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/provisio...

At the end I tried these two commands:

1) openstack overcloud node provision --stack overcloud --network-config --output ~/overcloud-baremetal-deployed.yaml ~/templates/baremetal-node-net.yaml

And I got this error message:
2021-08-15 15:59:32.402435 | 52540075-9baf-2b0b-5fc8-000000000017 | FATAL | Provision instances | localhost | error={"changed": false, "logging": "Deploy attempt failed on node computeHCI2 (UUID 9440e769-9634-4015-82b0-97b8c1921ef5), cleaning up\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/metalsmith/_provisioner.py\", line 392, in provision_node\n nics.validate()\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 60, in validate\n result.append(('network', self._get_network(nic)))\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 128, in _get_network\n 'Unexpected fields for a network: %s' % ', '.join(unexpected))\nmetalsmith.exceptions.InvalidNIC: Unexpected fields for a network: subnet\nDeploy attempt failed on node computeHCI0 (UUID 31eecc38-7d80-4ddd-9cc4-a76edf00ec3a), cleaning up\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/metalsmith/_provisioner.py\", line 392, in provision_node\n nics.validate()\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 60, in validate\n result.append(('network', self._get_network(nic)))\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 128, in _get_network\n 'Unexpected fields for a network: %s' % ', '.join(unexpected))\nmetalsmith.exceptions.InvalidNIC: Unexpected fields for a network: subnet\nDeploy attempt failed on node controller2 (UUID f62f8cb5-40e4-4e40-805d-852c2a2f7e00), cleaning up\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/metalsmith/_provisioner.py\", line 392, in provision_node\n nics.validate()\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 60, in validate\n result.append(('network', self._get_network(nic)))\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 128, in _get_network\n 'Unexpected fields for a network: %s' % ', '.join(unexpected))\nmetalsmith.exceptions.InvalidNIC: Unexpected fields for a network: subnet\nDeploy attempt failed on node computeHCI1 (UUID 2798b208-4842-4083-9b3f-c46953b52928), cleaning up\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/metalsmith/_provisioner.py\", line 392, in provision_node\n nics.validate()\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 60, in validate\n result.append(('network', self._get_network(nic)))\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 128, in _get_network\n 'Unexpected fields for a network: %s' % ', '.join(unexpected))\nmetalsmith.exceptions.InvalidNIC: Unexpected fields for a network: subnet\n Deploy attempt failed on node controller1 (UUID df87e5c7-32b0-4bfa-9ef9-f0a98666e7de), cleaning up\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/metalsmith/_provisioner.py\", line 392, in provision_node\n nics.validate()\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 60, in validate\n result.append(('network', self._get_network(nic)))\n File \"/usr/lib/python3.6/site-packages/metalsmith/_nics.py\", line 128, in _get_network\n 'Unexpected fields for a network: %s' % ', '.join(unexpected))\nmetalsmith.exceptions.InvalidNIC: Unexpected fields for a network: subnet\nDeploy attempt failed on node controller0 (UUID af75122c-a6e8-42d7-afd2-571738c42061), cleaning up\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/metalsmith/_*provisioner.py\", line 392, in provision_node\n nics.validate()\n File 
\"/usr/lib/python3.6/site-**packages/metalsmith/_nics.py\"**, line 60, in validate\n result.append(('network', self._get_network(nic)))\n File \"/usr/lib/python3.6/site-**packages/metalsmith/_nics.py\"**, line 128, in _get_network\n 'Unexpected fields for a network: %s' % ', '.join(unexpected))\**nmetalsmith.exceptions.**InvalidNIC: Unexpected fields for a network: subnet\n", "msg": "Unexpected fields for a network: subnet"}*
2) openstack overcloud node provision --stack overcloud --output ~/overcloud-baremetal-deployed.yaml ~/templates/baremetal-node.yaml

This time I got this error:
2021-08-16 13:47:02.156021 | 52540075-9baf-0672-d7b8-000000000017 | FATAL | Provision instances | localhost | error={"changed": false, "logging": "Created port overcloud-computehci-1-ctlplane (UUID 17e55729-b44b-40a8-9361-9e36e8527de5) for node controller0 (UUID 61bc6512-9c4e-4199-936a-754801f7cffa) with {'network_id': '1c8c5e86-79ac-4ec6-9616-3c965cab6e88', 'name': 'overcloud-computehci-1-ctlplane'}\nCreated port overcloud-controller-0-ctlplane (UUID 0a09bbb5-934d-4d34-ab5b-7c14518d06c6) for node controller1 (UUID 03915e7c-a314-41d9-be8c-46291879692a) with {'network_id': '1c8c5e86-79ac-4ec6-9616-3c965cab6e88', 'name': 'overcloud-controller-0-ctlplane'}\nCreated port overcloud-computehci-2-ctlplane (UUID 76ef56f1-4dbb-453e-a344-968b0da95823) for node computeHCI2 (UUID a5d3552b-ab79-404e-8a52-48dc53a3aa45) with {'network_id': '1c8c5e86-79ac-4ec6-9616-3c965cab6e88', 'name': 'overcloud-computehci-2-ctlplane'}\nCreated port overcloud-computehci-0-ctlplane (UUID 79967f38-90ee-49e6-8716-b09cc9460afe) for node computeHCI1 (UUID 3534309b-c11f-48d4-b23d-3ed9f2dcbf79) with {'network_id': '1c8c5e86-79ac-4ec6-9616-3c965cab6e88', 'name': 'overcloud-computehci-0-ctlplane'}\nCreated port overcloud-controller-1-ctlplane (UUID c70b7c8c-31a6-460e-aa77-ea37b8e332f6) for node controller2 (UUID 83f47771-9f82-4437-8df4-de32bcd6fc63) with {'network_id': '1c8c5e86-79ac-4ec6-9616-3c965cab6e88', 'name': 'overcloud-controller-1-ctlplane'}\nCreated port overcloud-controller-2-ctlplane (UUID 2b74c4e3-705c-4bf1-85de-a5e0a9cdb591) for node computeHCI0 (UUID 3924529b-44a7-4c74-b1e1-2175d6313a3e) with {'network_id': '1c8c5e86-79ac-4ec6-9616-3c965cab6e88', 'name': 'overcloud-controller-2-ctlplane'}\nAttached port overcloud-controller-0-ctlplane (UUID 0a09bbb5-934d-4d34-ab5b-7c14518d06c6) to node controller1 (UUID 03915e7c-a314-41d9-be8c-46291879692a)\nAttached port overcloud-computehci-1-ctlplane (UUID 17e55729-b44b-40a8-9361-9e36e8527de5) to node controller0 (UUID 61bc6512-9c4e-4199-936a-754801f7cffa)\nProvisioning started on node controller1 (UUID 03915e7c-a314-41d9-be8c-46291879692a)\nAttached port overcloud-computehci-2-ctlplane (UUID 76ef56f1-4dbb-453e-a344-968b0da95823) to node computeHCI2 (UUID a5d3552b-ab79-404e-8a52-48dc53a3aa45)\nAttached port overcloud-computehci-0-ctlplane (UUID 79967f38-90ee-49e6-8716-b09cc9460afe) to node computeHCI1 (UUID 3534309b-c11f-48d4-b23d-3ed9f2dcbf79)\nProvisioning started on node controller0 (UUID 61bc6512-9c4e-4199-936a-754801f7cffa)\nAttached port overcloud-controller-1-ctlplane (UUID c70b7c8c-31a6-460e-aa77-ea37b8e332f6) to node controller2 (UUID 83f47771-9f82-4437-8df4-de32bcd6fc63)\nProvisioning started on node computeHCI2 (UUID a5d3552b-ab79-404e-8a52-48dc53a3aa45)\nProvisioning started on node computeHCI1 (UUID 3534309b-c11f-48d4-b23d-3ed9f2dcbf79)\nAttached port overcloud-controller-2-ctlplane (UUID 2b74c4e3-705c-4bf1-85de-a5e0a9cdb591) to node computeHCI0 (UUID 3924529b-44a7-4c74-b1e1-2175d6313a3e)\nProvisioning started on node controller2 (UUID 83f47771-9f82-4437-8df4-de32bcd6fc63)\nProvisioning started on node computeHCI0 (UUID 3924529b-44a7-4c74-b1e1-2175d6313a3e)\n", "msg": "Node *a5d3552b-ab79-404e-8a52-**48dc53a3aa45 reached failure state \"deploy failed\"; the last error is Agent returned error for deploy step {'step': 'write_image', 'priority': 80, 'argsinfo': None, 'interface': 'deploy'} on node a5d3552b-ab79-404e-8a52-**48dc53a3aa45 : Error performing deploy_step write_image: Command execution failed: Failed to check the number of primary partitions present on 
/dev/vda for node a5d3552b-ab79-404e-8a52-48dc53a3aa45. Error: The device /dev/vda does not have a valid MBR partition table."}
Any ideas? Thanks in advance.
Regards.
On Wed, Aug 18, 2021 at 11:29 AM wodel youchi <wodel.youchi@gmail.com> wrote:
Thanks John,
My idea is to learn how to deploy OpenStack with TripleO the way it would be done with physical nodes. My goal is not just to get OpenStack up and running so I can learn how to use it; my goal is to install it in the same way as if it were a physical implementation.
I use tripleo-lab for the same reasons while I'm working on TripleO. Tripleo-lab creates the virtual hardware, adds it to Ironic and then installs the undercloud. That's where I stop using it; from there I deploy the overcloud myself. In this way I can pretend I'm using physical nodes, but I don't have to deal with the environmental issues that come from setting up the simulation with VMs myself (e.g. the second issue you hit, /dev/vda does not have a valid MBR).

My notes on using tripleo-lab to run commands like the ones you ran (though I didn't hit the errors you had) are here, if they're useful to you:

https://github.com/fultonj/xena/tree/main/networkv2

John
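On the /dev/vda point: when the overcloud "nodes" are VMs, that error usually means the virtual disk still carries stale or uninitialized partition data. Two hedged ways to reset it, with placeholder disk path and node name (neither is taken from this thread):

    # Option 1: on the hypervisor, recreate the VM's system disk as a blank image
    qemu-img create -f qcow2 /var/lib/libvirt/images/overcloud-node.qcow2 60G

    # Option 2: on the undercloud, have Ironic wipe the disk metadata
    # (the node must be in the 'manageable' state for manual cleaning)
    openstack baremetal node manage <node-uuid>
    openstack baremetal node clean <node-uuid> \
      --clean-steps '[{"interface": "deploy", "step": "erase_devices_metadata"}]'
    openstack baremetal node provide <node-uuid>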
Hi,

On Wed, Aug 18, 2021 at 4:39 PM wodel youchi <wodel.youchi@gmail.com> wrote:
Failed to provision instance fc40457e-4b3c-4402-ae9d-c528f2c2ad30: Asynchronous exception: Node failed to deploy. Exception: Agent API for node 6d3724fc-6f13-4588-bbe5-56bc4f9a4f87 returned HTTP status code 404 with error: Not found: Extension with id iscsi not found. for node
You somehow ended up using a master (Xena release) deploy ramdisk with Train TripleO. You need to make sure to download the Train images. I hope the TripleO people can point you at the right place.

Dmitry
-- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
On Wed, Aug 18, 2021 at 10:10 AM Dmitry Tantsur <dtantsur@redhat.com> wrote:
You somehow ended up using master (Xena release) deploy ramdisk with Train TripleO. You need to make sure to download Train images. I hope TripleO people can point you at the right place.
Dmitry
http://images.rdoproject.org/centos8/ http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/
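For completeness, a minimal sketch of refreshing the deploy images on a Train undercloud from that location; the tarball names are the ones usually published there and may differ, and the ~/images path is just a convention:

    # On the undercloud, as the stack user
    source ~/stackrc
    mkdir -p ~/images && cd ~/images

    # Download and unpack the Train images (file names may vary on the server)
    curl -O http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/overcloud-full.tar
    curl -O http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/ironic-python-agent.tar
    for f in *.tar; do tar -xf "$f"; done

    # Replace the images on the undercloud and refresh the nodes' deploy kernel/ramdisk
    openstack overcloud image upload --image-path /home/stack/images/ --update-existing
    openstack overcloud node configure --all-manageable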
Hi,

I redid the undercloud deployment for the Train version for now, and I verified the download URL for the images. My overcloud deployment has moved forward, but I still get errors.

This is what I got this time:
"TASK [ceph-grafana : wait for grafana to start] ********************************",
"Monday 23 August 2021 14:55:21 +0100 (0:00:00.961) 0:12:59.319 ********* ",
* "fatal: [overcloud-controller-0]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.151:3100\"}",
"fatal: [overcloud-controller-1]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.155:3100\"}",
"fatal: [overcloud-controller-2]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.165:3100\"}",
"RUNNING HAND*LER [ceph-prometheus : service handler] ****************************", "Monday 23 August 2021 15:00:22 +0100 (0:05:00.767) 0:18:00.087 ********* ", "PLAY RECAP *********************************************************************",
"overcloud-computehci-0 : ok=224 changed=23 unreachable=0 failed=0 skipped=415 rescued=0 ignored=0 ", "overcloud-computehci-1 : ok=199 changed=18 unreachable=0 failed=0 skipped=392 rescued=0 ignored=0 ", "overcloud-computehci-2 : ok=212 changed=23 unreachable=0 failed=0 skipped=390 rescued=0 ignored=0 ", "overcloud-controller-0 : ok=370 changed=52 unreachable=0 failed=1 skipped=539 rescued=0 ignored=0 ", "overcloud-controller-1 : ok=308 changed=43 unreachable=0 failed=1 skipped=495 rescued=0 ignored=0 ", "overcloud-controller-2 : ok=317 changed=45 unreachable=0 failed=1 skipped=493 rescued=0 ignored=0 ",
"INSTALLER STATUS
***************************************************************",
"Install Ceph Monitor : Complete (0:00:52)",
"Install Ceph Manager : Complete (0:05:49)",
"Install Ceph OSD : Complete (0:02:28)",
"Install Ceph RGW : Complete (0:00:27)", "Install Ceph Client : Complete (0:00:33)", "Install Ceph Grafana : In Progress (0:05:54)", "\tThis phase can be restarted by running: roles/ceph-grafana/tasks/main.yml", "Install Ceph Node Exporter : Complete (0:00:28)", "Monday 23 August 2021 15:00:22 +0100 (0:00:00.006) 0:18:00.094 ********* ", "=============================================================================== ", "ceph-grafana : wait for grafana to start ------------------------------ 300.77s", "ceph-facts : get ceph current status ---------------------------------- 300.27s", "ceph-container-common : pulling udtrain.ctlplane.umaitek.dz:8787/ceph-ci/daemon:v4.0.19-stable-4.0-nautilus-centos-7-x86_64 image -- 19.04s", "ceph-mon : waiting for the monitor(s) to form the quorum... ------------ 12.83s", "ceph-osd : use ceph-volume lvm batch to create bluestore osds ---------- 12.13s", "ceph-osd : wait for all osd to be up ----------------------------------- 11.88s", "ceph-osd : set pg_autoscale_mode value on pool(s) ---------------------- 11.00s", "ceph-osd : create openstack pool(s) ------------------------------------ 10.80s", "ceph-grafana : make sure grafana is down ------------------------------- 10.66s", "ceph-osd : customize pool crush_rule ----------------------------------- 10.15s", "ceph-osd : customize pool size ----------------------------------------- 10.15s", "ceph-osd : customize pool min_size ------------------------------------- 10.14s", "ceph-osd : assign application to pool(s) ------------------------------- 10.13s", "ceph-osd : list existing pool(s) ---------------------------------------- 8.59s",
"ceph-mon : fetch ceph initial keys
-------------------------------------- 7.01s",
"ceph-container-common : get ceph version -------------------------------- 6.75s",
"ceph-prometheus : start prometheus services ----------------------------- 6.67s",
"ceph-mgr : wait for all mgr to be up ------------------------------------ 6.66s",
"ceph-grafana : start the grafana-server service ------------------------- 6.33s",
"ceph-mgr : create ceph mgr keyring(s) on a mon node --------------------- 6.26s" ],
"failed_when_result": true
} 2021-08-23 15:00:24.427687 | 525400e8-92c8-47b1-e162-00000000597d | TIMING | tripleo-ceph-run-ansible : print ceph-ansible outpu$ in case of failure | undercloud | 0:37:30.226345 | 0.25s
PLAY RECAP ********************************************************************* overcloud-computehci-0 : ok=213 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-computehci-1 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-computehci-2 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-controller-0 : ok=237 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 overcloud-controller-1 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 overcloud-controller-2 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 undercloud : ok=100 changed=18 unreachable=0 failed=1 skipped=37 rescued=0 ignored=0
2021-08-23 15:00:24.559997 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560328 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1366 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560419 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:37:30.359090 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560490 | UUID | Info | Host | Task Name | Run Time 2021-08-23 15:00:24.560589 | 525400e8-92c8-47b1-e162-00000000597b | SUMMARY | undercloud | tripleo-ceph-run-ansible : run ceph-ans ible | 1082.71s 2021-08-23 15:00:24.560675 | 525400e8-92c8-47b1-e162-000000004d9a | SUMMARY | overcloud-controller-1 | Wait for container-puppet t asks (generate config) to finish | 356.02s 2021-08-23 15:00:24.560763 | 525400e8-92c8-47b1-e162-000000004d6a | SUMMARY | overcloud-controller-0 | Wait for container-puppet t asks (generate config) to finish | 355.74s 2021-08-23 15:00:24.560839 | 525400e8-92c8-47b1-e162-000000004dd0 | SUMMARY | overcloud-controller-2 | Wait for container-puppet t asks (generate config) to finish | 355.68s 2021-08-23 15:00:24.560912 | 525400e8-92c8-47b1-e162-000000003bb1 | SUMMARY | undercloud | Run tripleo-container-image-prepare log ged to: /var/log/tripleo-container-image-prepare.log | 143.03s 2021-08-23 15:00:24.560986 | 525400e8-92c8-47b1-e162-000000004b13 | SUMMARY | overcloud-controller-0 | Wait for puppet host config uration to finish | 125.36s 2021-08-23 15:00:24.561057 | 525400e8-92c8-47b1-e162-000000004b88 | SUMMARY | overcloud-controller-2 | Wait for puppet host config uration to finish | 125.33s 2021-08-23 15:00:24.561128 | 525400e8-92c8-47b1-e162-000000004b4b | SUMMARY | overcloud-controller-1 | Wait for puppet host config uration to finish | 125.25s 2021-08-23 15:00:24.561300 | 525400e8-92c8-47b1-e162-000000001dc4 | SUMMARY | overcloud-controller-2 | Run puppet on the host to a pply IPtables rules | 108.08s 2021-08-23 15:00:24.561374 | 525400e8-92c8-47b1-e162-000000001e4f | SUMMARY | overcloud-controller-0 | Run puppet on the host to a pply IPtables rules | 107.34s 2021-08-23 15:00:24.561444 | 525400e8-92c8-47b1-e162-000000004c8d | SUMMARY | overcloud-computehci-2 | Wait for container-puppet t asks (generate config) to finish | 96.56s 2021-08-23 15:00:24.561514 | 525400e8-92c8-47b1-e162-000000004c33 | SUMMARY | overcloud-computehci-0 | Wait for container-puppet t asks (generate config) to finish | 96.38s 2021-08-23 15:00:24.561580 | 525400e8-92c8-47b1-e162-000000004c60 | SUMMARY | overcloud-computehci-1 | Wait for container-puppet t asks (generate config) to finish | 93.41s 2021-08-23 15:00:24.561645 | 525400e8-92c8-47b1-e162-00000000434d | SUMMARY | overcloud-computehci-0 | Pre-fetch all the container s | 92.70s 2021-08-23 15:00:24.561712 | 525400e8-92c8-47b1-e162-0000000043ed | SUMMARY | overcloud-computehci-2 | Pre-fetch all the container s | 91.90s 2021-08-23 15:00:24.561782 | 525400e8-92c8-47b1-e162-000000004385 | SUMMARY | overcloud-computehci-1 | Pre-fetch all the container s | 91.88s 2021-08-23 15:00:24.561876 | 525400e8-92c8-47b1-e162-00000000491c | SUMMARY | overcloud-computehci-1 | Wait for puppet host config uration to finish | 90.37s 2021-08-23 15:00:24.561947 | 525400e8-92c8-47b1-e162-000000004951 | SUMMARY | overcloud-computehci-2 | Wait for puppet host config uration to finish | 90.37s 2021-08-23 15:00:24.562016 | 525400e8-92c8-47b1-e162-0000000048e6 | SUMMARY | overcloud-computehci-0 | Wait for puppet host config uration to finish | 90.35s 2021-08-23 15:00:24.562080 | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562196 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562311 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562379 | The following node(s) had failures: undercloud
2021-08-23 15:00:24.562456 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Host 10.0.2.40 not found in /home/stack/.ssh/known_hosts
Ansible failed, check log at /var/lib/mistral/overcloud/ansible.log.
Overcloud Endpoint: http://10.0.2.40:5000
Overcloud Horizon Dashboard URL: http://10.0.2.40:80/dashboard
Overcloud rc file: /home/stack/overcloudrc
Overcloud Deployed with error
Overcloud configuration failed.
Could someone help me debug this? The ansible.log is huge and I can't see the origin of the problem. If someone can point me in the right direction it would be appreciated.

Thanks in advance.
Regards.
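One way to narrow a config-download log of that size down to the failing tasks; a sketch using the /var/lib/mistral/overcloud/ansible.log path printed in the output above:

    # On the undercloud: list the failed tasks and the hosts they failed on
    grep -nE 'fatal:|FAILED!' /var/lib/mistral/overcloud/ansible.log | tail -n 40

    # Then open the log at the last failure for context
    less -N /var/lib/mistral/overcloud/ansible.log
    # (inside less: press G for end of file, then ?fatal: to search backwards)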
On Mon, Aug 23, 2021 at 10:52 AM wodel youchi <wodel.youchi@gmail.com> wrote:
"TASK [ceph-grafana : wait for grafana to start] ********************************", "Monday 23 August 2021 14:55:21 +0100 (0:00:00.961) 0:12:59.319 ********* ", "fatal: [overcloud-controller-0]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.151:3100\"}", "fatal: [overcloud-controller-1]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.155:3100\"}", "fatal: [overcloud-controller-2]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.165:3100\"}",
I'm not certain of the ceph-ansible version you're using, but with Train it should be a version 4. ceph-ansible should already be installed on your undercloud judging by this error, and in the latest version 4 this is the task where it failed:

https://github.com/ceph/ceph-ansible/blob/v4.0.64/roles/ceph-grafana/tasks/c...

You can check the status of this service on your three controllers and then debug it directly.

John
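A sketch of what that check could look like on one controller; the grafana-server unit name follows the ceph-ansible output quoted in this thread and heat-admin is the usual overcloud login, both of which may differ in your setup:

    # From the undercloud, as the stack user
    ssh heat-admin@<controller-ctlplane-ip>

    # On the controller: state of the grafana service and its container
    sudo systemctl status grafana-server
    sudo journalctl -u grafana-server --no-pager | tail -n 50
    sudo podman ps -a | grep -i grafana

    # Is anything listening on the port ceph-ansible waited for?
    curl -v http://<controller-storage-ip>:3100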
"RUNNING HANDLER [ceph-prometheus : service handler] ****************************", "Monday 23 August 2021 15:00:22 +0100 (0:05:00.767) 0:18:00.087 ********* ", "PLAY RECAP *********************************************************************", "overcloud-computehci-0 : ok=224 changed=23 unreachable=0 failed=0 skipped=415 rescued=0 ignored=0 ", "overcloud-computehci-1 : ok=199 changed=18 unreachable=0 failed=0 skipped=392 rescued=0 ignored=0 ", "overcloud-computehci-2 : ok=212 changed=23 unreachable=0 failed=0 skipped=390 rescued=0 ignored=0 ", "overcloud-controller-0 : ok=370 changed=52 unreachable=0 failed=1 skipped=539 rescued=0 ignored=0 ", "overcloud-controller-1 : ok=308 changed=43 unreachable=0 failed=1 skipped=495 rescued=0 ignored=0 ", "overcloud-controller-2 : ok=317 changed=45 unreachable=0 failed=1 skipped=493 rescued=0 ignored=0 ",
"INSTALLER STATUS ***************************************************************", "Install Ceph Monitor : Complete (0:00:52)", "Install Ceph Manager : Complete (0:05:49)", "Install Ceph OSD : Complete (0:02:28)", "Install Ceph RGW : Complete (0:00:27)", "Install Ceph Client : Complete (0:00:33)", "Install Ceph Grafana : In Progress (0:05:54)", "\tThis phase can be restarted by running: roles/ceph-grafana/tasks/main.yml", "Install Ceph Node Exporter : Complete (0:00:28)", "Monday 23 August 2021 15:00:22 +0100 (0:00:00.006) 0:18:00.094 ********* ", "=============================================================================== ", "ceph-grafana : wait for grafana to start ------------------------------ 300.77s", "ceph-facts : get ceph current status ---------------------------------- 300.27s", "ceph-container-common : pulling udtrain.ctlplane.umaitek.dz:8787/ceph-ci/daemon:v4.0.19-stable-4.0-nautilus-centos-7-x86_64 image -- 19.04s", "ceph-mon : waiting for the monitor(s) to form the quorum... ------------ 12.83s", "ceph-osd : use ceph-volume lvm batch to create bluestore osds ---------- 12.13s", "ceph-osd : wait for all osd to be up ----------------------------------- 11.88s", "ceph-osd : set pg_autoscale_mode value on pool(s) ---------------------- 11.00s", "ceph-osd : create openstack pool(s) ------------------------------------ 10.80s", "ceph-grafana : make sure grafana is down ------------------------------- 10.66s", "ceph-osd : customize pool crush_rule ----------------------------------- 10.15s", "ceph-osd : customize pool size ----------------------------------------- 10.15s", "ceph-osd : customize pool min_size ------------------------------------- 10.14s", "ceph-osd : assign application to pool(s) ------------------------------- 10.13s", "ceph-osd : list existing pool(s) ---------------------------------------- 8.59s",
"ceph-mon : fetch ceph initial keys -------------------------------------- 7.01s", "ceph-container-common : get ceph version -------------------------------- 6.75s", "ceph-prometheus : start prometheus services ----------------------------- 6.67s", "ceph-mgr : wait for all mgr to be up ------------------------------------ 6.66s", "ceph-grafana : start the grafana-server service ------------------------- 6.33s", "ceph-mgr : create ceph mgr keyring(s) on a mon node --------------------- 6.26s" ], "failed_when_result": true } 2021-08-23 15:00:24.427687 | 525400e8-92c8-47b1-e162-00000000597d | TIMING | tripleo-ceph-run-ansible : print ceph-ansible outpu$ in case of failure | undercloud | 0:37:30.226345 | 0.25s
PLAY RECAP ********************************************************************* overcloud-computehci-0 : ok=213 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-computehci-1 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-computehci-2 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-controller-0 : ok=237 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 overcloud-controller-1 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 overcloud-controller-2 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 undercloud : ok=100 changed=18 unreachable=0 failed=1 skipped=37 rescued=0 ignored=0
2021-08-23 15:00:24.559997 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560328 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1366 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560419 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:37:30.359090 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560490 | UUID | Info | Host | Task Name | Run Time 2021-08-23 15:00:24.560589 | 525400e8-92c8-47b1-e162-00000000597b | SUMMARY | undercloud | tripleo-ceph-run-ansible : run ceph-ans ible | 1082.71s 2021-08-23 15:00:24.560675 | 525400e8-92c8-47b1-e162-000000004d9a | SUMMARY | overcloud-controller-1 | Wait for container-puppet t asks (generate config) to finish | 356.02s 2021-08-23 15:00:24.560763 | 525400e8-92c8-47b1-e162-000000004d6a | SUMMARY | overcloud-controller-0 | Wait for container-puppet t asks (generate config) to finish | 355.74s 2021-08-23 15:00:24.560839 | 525400e8-92c8-47b1-e162-000000004dd0 | SUMMARY | overcloud-controller-2 | Wait for container-puppet t asks (generate config) to finish | 355.68s 2021-08-23 15:00:24.560912 | 525400e8-92c8-47b1-e162-000000003bb1 | SUMMARY | undercloud | Run tripleo-container-image-prepare log ged to: /var/log/tripleo-container-image-prepare.log | 143.03s 2021-08-23 15:00:24.560986 | 525400e8-92c8-47b1-e162-000000004b13 | SUMMARY | overcloud-controller-0 | Wait for puppet host config uration to finish | 125.36s 2021-08-23 15:00:24.561057 | 525400e8-92c8-47b1-e162-000000004b88 | SUMMARY | overcloud-controller-2 | Wait for puppet host config uration to finish | 125.33s 2021-08-23 15:00:24.561128 | 525400e8-92c8-47b1-e162-000000004b4b | SUMMARY | overcloud-controller-1 | Wait for puppet host config uration to finish | 125.25s 2021-08-23 15:00:24.561300 | 525400e8-92c8-47b1-e162-000000001dc4 | SUMMARY | overcloud-controller-2 | Run puppet on the host to a pply IPtables rules | 108.08s 2021-08-23 15:00:24.561374 | 525400e8-92c8-47b1-e162-000000001e4f | SUMMARY | overcloud-controller-0 | Run puppet on the host to a pply IPtables rules | 107.34s 2021-08-23 15:00:24.561444 | 525400e8-92c8-47b1-e162-000000004c8d | SUMMARY | overcloud-computehci-2 | Wait for container-puppet t asks (generate config) to finish | 96.56s 2021-08-23 15:00:24.561514 | 525400e8-92c8-47b1-e162-000000004c33 | SUMMARY | overcloud-computehci-0 | Wait for container-puppet t asks (generate config) to finish | 96.38s 2021-08-23 15:00:24.561580 | 525400e8-92c8-47b1-e162-000000004c60 | SUMMARY | overcloud-computehci-1 | Wait for container-puppet t asks (generate config) to finish | 93.41s 2021-08-23 15:00:24.561645 | 525400e8-92c8-47b1-e162-00000000434d | SUMMARY | overcloud-computehci-0 | Pre-fetch all the container s | 92.70s 2021-08-23 15:00:24.561712 | 525400e8-92c8-47b1-e162-0000000043ed | SUMMARY | overcloud-computehci-2 | Pre-fetch all the container s | 91.90s 2021-08-23 15:00:24.561782 | 525400e8-92c8-47b1-e162-000000004385 | SUMMARY | overcloud-computehci-1 | Pre-fetch all the container s | 91.88s 2021-08-23 15:00:24.561876 | 525400e8-92c8-47b1-e162-00000000491c | SUMMARY | overcloud-computehci-1 | Wait for puppet host config uration to finish | 90.37s 2021-08-23 15:00:24.561947 | 525400e8-92c8-47b1-e162-000000004951 | SUMMARY | overcloud-computehci-2 | Wait for puppet host config uration to finish | 90.37s 2021-08-23 15:00:24.562016 | 525400e8-92c8-47b1-e162-0000000048e6 | SUMMARY | overcloud-computehci-0 | Wait for puppet host config uration to finish | 90.35s 2021-08-23 15:00:24.562080 | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.562196 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.562311 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.562379 | The following node(s) had failures: undercloud 2021-08-23 15:00:24.562456 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Host 10.0.2.40 not found in /home/stack/.ssh/known_hosts Ansible failed, check log at /var/lib/mistral/overcloud/ansible.log.Overcloud Endpoint: http://10.0.2.40:5000 Overcloud Horizon Dashboard URL: http://10.0.2.40:80/dashboard Overcloud rc file: /home/stack/overcloudrc Overcloud Deployed with error Overcloud configuration failed.
Could someone help debug this, the ansible.log is huge, I can't see what's the origin of the problem, if someone can point me to the right direction it will aprecciated. Thanks in advance.
Regards.
Le mer. 18 août 2021 à 18:02, Wesley Hayutin <whayutin@redhat.com> a écrit :
On Wed, Aug 18, 2021 at 10:10 AM Dmitry Tantsur <dtantsur@redhat.com> wrote:
Hi,
On Wed, Aug 18, 2021 at 4:39 PM wodel youchi <wodel.youchi@gmail.com> wrote:
Hi, I am trying to deploy openstack with tripleO using VMs and nested-KVM for the compute node. This is for test and learning purposes.
I am using the Train version and following some tutorials. I prepared my different template files and started the deployment, but I got these errors :
Failed to provision instance fc40457e-4b3c-4402-ae9d-c528f2c2ad30: Asynchronous exception: Node failed to deploy. Exception: Agent API for node 6d3724fc-6f13-4588-bbe5-56bc4f9a4f87 returned HTTP status code 404 with error: Not found: Extension with id iscsi not found. for node
You somehow ended up using master (Xena release) deploy ramdisk with Train TripleO. You need to make sure to download Train images. I hope TripleO people can point you at the right place.
Dmitry
http://images.rdoproject.org/centos8/ http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/
and
Got HTTP 409: {"errors": [{"status": 409, "title": "Conflict", "detail": "There was a conflict when trying to complete your request.\n\n Unable to allocate inventory: Unable to create allocation for 'CUSTOM_BAREMETAL' on resource provider '6d3724fc-6f13-4588-bbe5-56bc4f9a4f87'. The requested amount would exceed the capacity. ",
Could you help understand what those errors mean? I couldn't find anything similar on the net.
Thanks in advance.
Regards.
-- Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
Hello,

Thanks John for your reply here. A few more comments inline.

On Mon, Aug 23, 2021 at 6:16 PM John Fulton <johfulto@redhat.com> wrote:
As John pointed out, ceph-ansible is able to configure, render and start the associated systemd units for all the Ceph monitoring stack components (node-exporter, prometheus, alertmanager and grafana). You can ssh to your controllers and check the associated systemd units, looking at the journal to see why they failed to start (I saw there's a timeout waiting for the container to start).

A potential plan, in this case, could be:

1. Check the systemd unit (I guess you can start with grafana, which is the failed service).
2. Look at the journal logs (feel free to attach here the relevant part of the output).
3. Double check the network where the service is bound (can you attach /var/lib/mistral/<stack>/ceph-ansible/group_vars/all.yaml?).

The grafana process should be run on the storage network, but I see a "Timeout when waiting for 10.200.7.165:3100": is that network the right one?
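A sketch of step 3 of that plan, using the all.yaml path given above; the exact variable names inside the file depend on the ceph-ansible version, so the grep is deliberately broad:

    # On the undercloud: which address/interface is grafana told to bind to?
    sudo grep -iE 'grafana|dashboard' /var/lib/mistral/<stack>/ceph-ansible/group_vars/all.yaml

    # On a controller: is that address actually configured on the storage network?
    ip -br addr | grep '10\.200\.7\.'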
"RUNNING HANDLER [ceph-prometheus : service handler]
****************************",
"Monday 23 August 2021 15:00:22 +0100 (0:05:00.767)
0:18:00.087 ********* ",
"PLAY RECAP
*********************************************************************",
"overcloud-computehci-0 : ok=224 changed=23
unreachable=0 failed=0 skipped=415 rescued=0 ignored=0 ",
"overcloud-computehci-1 : ok=199 changed=18
unreachable=0 failed=0 skipped=392 rescued=0 ignored=0 ",
"overcloud-computehci-2 : ok=212 changed=23
unreachable=0 failed=0 skipped=390 rescued=0 ignored=0 ",
"overcloud-controller-0 : ok=370 changed=52
unreachable=0 failed=1 skipped=539 rescued=0 ignored=0 ",
"overcloud-controller-1 : ok=308 changed=43
unreachable=0 failed=1 skipped=495 rescued=0 ignored=0 ",
"overcloud-controller-2 : ok=317 changed=45
unreachable=0 failed=1 skipped=493 rescued=0 ignored=0 ",
"INSTALLER STATUS
***************************************************************",
"Install Ceph Monitor : Complete (0:00:52)", "Install Ceph Manager : Complete (0:05:49)", "Install Ceph OSD : Complete (0:02:28)", "Install Ceph RGW : Complete (0:00:27)", "Install Ceph Client : Complete (0:00:33)", "Install Ceph Grafana : In Progress (0:05:54)", "\tThis phase can be restarted by running:
roles/ceph-grafana/tasks/main.yml",
"Install Ceph Node Exporter : Complete (0:00:28)", "Monday 23 August 2021 15:00:22 +0100 (0:00:00.006)
0:18:00.094 ********* ",
"=============================================================================== ",
"ceph-grafana : wait for grafana to start
------------------------------ 300.77s",
"ceph-facts : get ceph current status
---------------------------------- 300.27s",
"ceph-container-common : pulling
udtrain.ctlplane.umaitek.dz:8787/ceph-ci/daemon:v4.0.19-stable-4.0-nautilus-centos-7-x86_64
image -- 19.04s", "ceph-mon : waiting for the monitor(s) to form the quorum... ------------ 12.83s", "ceph-osd : use ceph-volume lvm batch to create bluestore osds ---------- 12.13s", "ceph-osd : wait for all osd to be up ----------------------------------- 11.88s", "ceph-osd : set pg_autoscale_mode value on pool(s) ---------------------- 11.00s", "ceph-osd : create openstack pool(s) ------------------------------------ 10.80s", "ceph-grafana : make sure grafana is down ------------------------------- 10.66s", "ceph-osd : customize pool crush_rule ----------------------------------- 10.15s", "ceph-osd : customize pool size ----------------------------------------- 10.15s", "ceph-osd : customize pool min_size ------------------------------------- 10.14s", "ceph-osd : assign application to pool(s) ------------------------------- 10.13s", "ceph-osd : list existing pool(s) ---------------------------------------- 8.59s",
"ceph-mon : fetch ceph initial keys -------------------------------------- 7.01s", "ceph-container-common : get ceph version -------------------------------- 6.75s", "ceph-prometheus : start prometheus services ----------------------------- 6.67s", "ceph-mgr : wait for all mgr to be up ------------------------------------ 6.66s", "ceph-grafana : start the grafana-server service ------------------------- 6.33s", "ceph-mgr : create ceph mgr keyring(s) on a mon node --------------------- 6.26s" ], "failed_when_result": true } 2021-08-23 15:00:24.427687 | 525400e8-92c8-47b1-e162-00000000597d | TIMING | tripleo-ceph-run-ansible : print ceph-ansible outpu$ in case of failure | undercloud | 0:37:30.226345 | 0.25s
PLAY RECAP
overcloud-computehci-0 : ok=213 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-computehci-1 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-computehci-2 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0 overcloud-controller-0 : ok=237 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 overcloud-controller-1 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 overcloud-controller-2 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0 undercloud : ok=100 changed=18 unreachable=0 failed=1 skipped=37 rescued=0 ignored=0
2021-08-23 15:00:24.559997 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560328 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1366 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560419 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:37:30.359090 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.560490 | UUID | Info | Host | Task Name | Run Time 2021-08-23 15:00:24.560589 | 525400e8-92c8-47b1-e162-00000000597b | SUMMARY | undercloud | tripleo-ceph-run-ansible : run ceph-ans ible | 1082.71s 2021-08-23 15:00:24.560675 | 525400e8-92c8-47b1-e162-000000004d9a | SUMMARY | overcloud-controller-1 | Wait for container-puppet t asks (generate config) to finish | 356.02s 2021-08-23 15:00:24.560763 | 525400e8-92c8-47b1-e162-000000004d6a | SUMMARY | overcloud-controller-0 | Wait for container-puppet t asks (generate config) to finish | 355.74s 2021-08-23 15:00:24.560839 | 525400e8-92c8-47b1-e162-000000004dd0 | SUMMARY | overcloud-controller-2 | Wait for container-puppet t asks (generate config) to finish | 355.68s 2021-08-23 15:00:24.560912 | 525400e8-92c8-47b1-e162-000000003bb1 | SUMMARY | undercloud | Run tripleo-container-image-prepare log ged to: /var/log/tripleo-container-image-prepare.log | 143.03s 2021-08-23 15:00:24.560986 | 525400e8-92c8-47b1-e162-000000004b13 | SUMMARY | overcloud-controller-0 | Wait for puppet host config uration to finish | 125.36s 2021-08-23 15:00:24.561057 | 525400e8-92c8-47b1-e162-000000004b88 | SUMMARY | overcloud-controller-2 | Wait for puppet host config uration to finish | 125.33s 2021-08-23 15:00:24.561128 | 525400e8-92c8-47b1-e162-000000004b4b | SUMMARY | overcloud-controller-1 | Wait for puppet host config uration to finish | 125.25s 2021-08-23 15:00:24.561300 | 525400e8-92c8-47b1-e162-000000001dc4 | SUMMARY | overcloud-controller-2 | Run puppet on the host to a pply IPtables rules | 108.08s 2021-08-23 15:00:24.561374 | 525400e8-92c8-47b1-e162-000000001e4f | SUMMARY | overcloud-controller-0 | Run puppet on the host to a pply IPtables rules | 107.34s 2021-08-23 15:00:24.561444 | 525400e8-92c8-47b1-e162-000000004c8d | SUMMARY | overcloud-computehci-2 | Wait for container-puppet t asks (generate config) to finish | 96.56s 2021-08-23 15:00:24.561514 | 525400e8-92c8-47b1-e162-000000004c33 | SUMMARY | overcloud-computehci-0 | Wait for container-puppet t asks (generate config) to finish | 96.38s 2021-08-23 15:00:24.561580 | 525400e8-92c8-47b1-e162-000000004c60 | SUMMARY | overcloud-computehci-1 | Wait for container-puppet t asks (generate config) to finish | 93.41s 2021-08-23 15:00:24.561645 | 525400e8-92c8-47b1-e162-00000000434d | SUMMARY | overcloud-computehci-0 | Pre-fetch all the container s | 92.70s 2021-08-23 15:00:24.561712 | 525400e8-92c8-47b1-e162-0000000043ed | SUMMARY | overcloud-computehci-2 | Pre-fetch all the container s | 91.90s 2021-08-23 15:00:24.561782 | 525400e8-92c8-47b1-e162-000000004385 | SUMMARY | overcloud-computehci-1 | Pre-fetch all the container s | 91.88s 2021-08-23 15:00:24.561876 | 525400e8-92c8-47b1-e162-00000000491c | SUMMARY | overcloud-computehci-1 | Wait for puppet host config uration to finish | 90.37s 2021-08-23 15:00:24.561947 | 525400e8-92c8-47b1-e162-000000004951 | SUMMARY | overcloud-computehci-2 | Wait for puppet host config uration to finish | 90.37s 2021-08-23 15:00:24.562016 | 525400e8-92c8-47b1-e162-0000000048e6 | SUMMARY | overcloud-computehci-0 | Wait for puppet host config uration to finish | 90.35s 2021-08-23 15:00:24.562080 | 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.562196 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.562311 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~ 2021-08-23 15:00:24.562379 | The following node(s) had failures: undercloud 2021-08-23 15:00:24.562456 |
>> Host 10.0.2.40 not found in /home/stack/.ssh/known_hosts >> Ansible failed, check log at /var/lib/mistral/overcloud/ansible.log.Overcloud Endpoint: http://10.0.2.40:5000 >> Overcloud Horizon Dashboard URL: http://10.0.2.40:80/dashboard >> Overcloud rc file: /home/stack/overcloudrc >> Overcloud Deployed with error >> Overcloud configuration failed. >> > > > Could someone help debug this, the ansible.log is huge, I can't see what's the origin of the problem, if someone can point me to the right direction it will aprecciated. > Thanks in advance. > > Regards. > > Le mer. 18 août 2021 à 18:02, Wesley Hayutin <whayutin@redhat.com> a écrit : >> >> >> >> On Wed, Aug 18, 2021 at 10:10 AM Dmitry Tantsur <dtantsur@redhat.com> wrote: >>> >>> Hi, >>> >>> On Wed, Aug 18, 2021 at 4:39 PM wodel youchi <wodel.youchi@gmail.com> wrote: >>>> >>>> Hi, >>>> I am trying to deploy openstack with tripleO using VMs and nested-KVM for the compute node. This is for test and learning purposes. >>>> >>>> I am using the Train version and following some tutorials. >>>> I prepared my different template files and started the deployment, but I got these errors : >>>> >>>> Failed to provision instance fc40457e-4b3c-4402-ae9d-c528f2c2ad30: Asynchronous exception: Node failed to deploy. Exception: Agent API for node 6d3724fc-6f13-4588-bbe5-56bc4f9a4f87 returned HTTP status code 404 with error: Not found: Extension with id iscsi not found. for node >>>> >>> >>> You somehow ended up using master (Xena release) deploy ramdisk with Train TripleO. You need to make sure to download Train images. I hope TripleO people can point you at the right place. >>> >>> Dmitry >> >> >> http://images.rdoproject.org/centos8/ >> http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/ >> >>> >>> >>>> >>>> and >>>> >>>> Got HTTP 409: {"errors": [{"status": 409, "title": "Conflict", "detail": "There was a conflict when trying to complete your request.\n\n Unable to allocate inventory: Unable to create allocation for 'CUSTOM_BAREMETAL' on resource provider '6d3724fc-6f13-4588-bbe5-56bc4f9a4f87'. The requested amount would exceed the capacity. ", >>>> >>>> Could you help understand what those errors mean? I couldn't find anything similar on the net. >>>> >>>> Thanks in advance. >>>> >>>> Regards. >>> >>> >>> >>> -- >>> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn, >>> Commercial register: Amtsgericht Muenchen, HRB 153243, >>> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill
-- Francesco Pantano GPG KEY: F41BD75C
Hi, and thanks for your help.

As for Ceph, here is the container prepare section:

parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      ceph_alertmanager_image: alertmanager
      ceph_alertmanager_namespace: quay.ceph.io/prometheus
      ceph_alertmanager_tag: v0.16.2
      ceph_grafana_image: grafana
      ceph_grafana_namespace: quay.ceph.io/app-sre
      ceph_grafana_tag: 5.4.3
      ceph_image: daemon
      ceph_namespace: quay.ceph.io/ceph-ci
      ceph_node_exporter_image: node-exporter
      ceph_node_exporter_namespace: quay.ceph.io/prometheus
      ceph_node_exporter_tag: v0.17.0
      ceph_prometheus_image: prometheus
      ceph_prometheus_namespace: quay.ceph.io/prometheus
      ceph_prometheus_tag: v2.7.2
      ceph_tag: v4.0.19-stable-4.0-nautilus-centos-7-x86_64
      name_prefix: centos-binary-
      name_suffix: ''
      namespace: quay.io/tripleotraincentos8
      neutron_driver: ovn
      rhel_containers: false
      tag: current-tripleo
      tag_from_label: rdo_version

And yes, the 10.200.7.0/24 network is my storage network. Here is a snippet from my network_data.yaml:

- name: Storage
  vip: true
  vlan: 1107
  name_lower: storage
  ip_subnet: '10.200.7.0/24'
  allocation_pools: [{'start': '10.200.7.150', 'end': '10.200.7.169'}]

I will look into the grafana service to see why it's not starting and get back to you.

Regards.

On Mon, Aug 23, 2021 at 5:28 PM, Francesco Pantano <fpantano@redhat.com> wrote:
Hello, thanks John for your reply here. A few more comments inline:
On Mon, Aug 23, 2021 at 6:16 PM John Fulton <johfulto@redhat.com> wrote:
On Mon, Aug 23, 2021 at 10:52 AM wodel youchi <wodel.youchi@gmail.com> wrote:
Hi,
I redid the undercloud deployment for the Train version for now. And I
verified the download URL for the images.
My overcloud deployment has moved forward but I still get errors.
This is what I got this time :
"TASK [ceph-grafana : wait for grafana to start]
********************************",
"Monday 23 August 2021 14:55:21 +0100 (0:00:00.961)
0:12:59.319 ********* ",
"fatal: [overcloud-controller-0]: FAILED! => {\"changed\":
false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20
0.7.151:3100\"}", "fatal: [overcloud-controller-1]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.155:3100\"}", "fatal: [overcloud-controller-2]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.165:3100\"}",
I'm not certain of the ceph-ansible version you're using but it should be a version 4 with train. ceph-ansible should already be installed on your undercloud judging by this error and in the latest version 4 this task is where it failed:
https://github.com/ceph/ceph-ansible/blob/v4.0.64/roles/ceph-grafana/tasks/c...
You can check the status of this service on your three controllers and then debug it directly.
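For reference, a quick way to confirm both points (which ceph-ansible version is installed on the undercloud, and which address/port the playbook is actually waiting on) could be something like the sketch below; the /var/lib/mistral/overcloud path matches the stack name shown in the logs above, and the grep pattern is only an example:

  # on the undercloud
  rpm -q ceph-ansible

  # grafana-related variables TripleO handed to ceph-ansible
  sudo grep -riE 'grafana|3100' /var/lib/mistral/overcloud/ceph-ansible/group_vars/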
Hello,

After digging into grafana, it seems it needed to download something from the internet, and I hadn't configured a proper gateway on the external network. So I configured a proper gateway, tested it with the half-deployed nodes, then redid the deployment, and again I got this error:

2021-08-24 21:29:29.616805 | 525400e8-92c8-d397-6f7e-000000006133 |
FATAL | Clean up legacy Cinder keystone catalog entries | undercloud | error={"changed": false, "module_stderr": "Fa iled to discover available identity versions when contacting http://10.0.2.40:5000. Attempting to parse version from URL.\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/si te-packages/urllib3/connection.py\", line 162, in _new_conn\n (self._dns_host, self.port), self.timeout, **extra_kw)\n File \"/usr/lib/python3.6/site-packages/urllib3/util/connection.py \", line 80, in create_connection\n raise err\n File \"/usr/lib/python3.6/site-packages/urllib3/util/connection.py\", line 70, in create_connection\n sock.connect(sa)\nTimeoutError: [Errno 110] Connection timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/urll ib3/connectionpool.py\", line 600, in urlopen\n chunked=chunked)\n File \"/usr/lib/python3.6/site-packages/urllib3/connectionpool.py\", line 354, in _make_request\n conn.request(meth od, url, **httplib_request_kw)\n File \"/usr/lib64/python3.6/http/client.py\", line 1269, in request\n self._send_request(method, url, body, headers, encode_chunked)\n File \"/usr/lib6 4/python3.6/http/client.py\", line 1315, in _send_request\n self.endheaders(body, encode_chunked=encode_chunked)\n File \"/usr/lib64/python3.6/http/client.py\", line 1264, in endheaders \n self._send_output(message_body, encode_chunked=encode_chunked)\n File \"/usr/lib64/python3.6/http/client.py\", line 1040, in _send_output\n self.send(msg)\n File \"/usr/lib64/pyt hon3.6/http/client.py\", line 978, in send\n self.connect()\n File \"/usr/lib/python3.6/site-packages/urllib3/connection.py\", line 184, in connect\n conn = self._new_conn()\n File \"/usr/lib/python3.6/site-packages/urllib3/connection.py\", line 171, in _new_conn\n self, \"Failed to establish a new connection: %s\" % e)\nurllib3.exceptions.NewConnectionError: <urll ib3.connection.HTTPConnection object at 0x7f96f7b10cc0>: Failed to establish a new connection: [Errno 110] Connection timed out\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/requests/adapters.py\", line 449, in send\n timeout=timeout\n File \"/usr/lib/python3.6/site-p ackages/urllib3/connectionpool.py\", line 638, in urlopen\n _stacktrace=sys.exc_info()[2])\n File \"/usr/lib/python3.6/site-packages/urllib3/util/retry.py\", line 399, in increment\n raise MaxRetryError(_pool, url, error or ResponseError(cause))\nurllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='10.0.2.40', port=5000): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f96f7b10cc0>: Failed to establish a new connection: [Errno 110] Connection timed out',))\n\nDuring handling of the ab$ ve exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/session.py\", line 997, in _send_request\n resp $ self.session.request(method, url, **kwargs)\n File \"/usr/lib/python3.6/site-packages/requests/sessions.py\", line 533, in request\n resp = self.send(prep, **send_kwargs)\n File \"/u$ r/lib/python3.6/site-packages/requests/sessions.py\", line 646, in send\n r = adapter.send(request, **kwargs)\n File \"/usr/lib/python3.6/site-packages/requests/adapters.py\", line 516$ in send\n raise ConnectionError(e, request=request)\nrequests.exceptions.ConnectionError: 
HTTPConnectionPool(host='10.0.2.40', port=5000): Max retries exceeded with url: / (Caused by N$wConnectionError('<urllib3.connection.HTTPConnection object at 0x7f96f7b10cc0>: Failed to establish a new connection: [Errno 110] Connection timed out',))\n\nDuring handling of the above e$ ception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py\", line 138, in _do_create_plug$ n\n authenticated=False)\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py\", line 610, in get_discovery\n authenticated=authenticated)\n File \"/usr/lib/pyt$ on3.6/site-packages/keystoneauth1/discover.py\", line 1442, in get_discovery\n disc = Discover(session, url, authenticated=authenticated)\n File \"/usr/lib/python3.6/site-packages/keys$ oneauth1/discover.py\", line 526, in __init__\n authenticated=authenticated)\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/discover.py\", line 101, in get_version_data\n r$ sp = session.get(url, headers=headers, authenticated=authenticated)\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/session.py\", line 1116, in get\n return self.request(url, '$ ET', **kwargs)\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/session.py\", line 906, in request\n resp = send(**kwargs)\n File \"/usr/lib/python3.6/site-packages/keystoneaut$ 1/session.py\", line 1013, in _send_request\n raise exceptions.ConnectFailure(msg)\nkeystoneauth1.exceptions.connection.ConnectFailure: Unable to establish connection to http://10.0.2.4$ :5000: HTTPConnectionPool(host='10.0.2.40', port=5000): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f96f7b10cc0>: Failed to establish a new connection: [Errno 110] Connection timed out',))\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"<$ tdin>\", line 102, in <module>\n File \"<stdin>\", line 94, in _ansiballz_main\n File \"<stdin>\", line 40, in invoke_module\n File \"/usr/lib64/python3.6/runpy.py\", line 205, in run_m$ dule\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File \"/usr/lib64/python3.6/runpy.py\", line 96, in _run_module_code\n mod_name, mod_spec, pkg_name, script_$ ame)\n File \"/usr/lib64/python3.6/runpy.py\", line 85, in _run_code\n exec(code, run_globals)\n File \"/tmp/ansible_os_keystone_service_payload_wcyk6h37/ansible_os_keystone_service_p$ yload.zip/ansible/modules/cloud/openstack/os_keystone_service.py\", line 194, in <module>\n File \"/tmp/ansible_os_keystone_service_payload_wcyk6h37/ansible_os_keystone_service_payload.zi$ /ansible/modules/cloud/openstack/os_keystone_service.py\", line 153, in main\n File \"/usr/lib/python3.6/site-packages/openstack/cloud/_identity.py\", line 510, in search_services\n se$ vices = self.list_services()\n File \"/usr/lib/python3.6/site-packages/openstack/cloud/_identity.py\", line 485, in list_services\n if self._is_client_version('identity', 2):\n File \$ /usr/lib/python3.6/site-packages/openstack/cloud/openstackcloud.py\", line 459, in _is_client_version\n client = getattr(self, client_name)\n File \"/usr/lib/python3.6/site-packages/op$ nstack/cloud/_identity.py\", line 32, in _identity_client\n 'identity', min_version=2, max_version='3.latest')\n File \"/usr/lib/python3.6/site-packages/openstack/cloud/openstackcloud.$ y\", line 406, in _get_versioned_client\n if adapter.get_endpoint():\n File 
\"/usr/lib/python3.6/site-packages/keystoneauth1/adapter.py\", line 282, in get_endpoint\n return self.se$ sion.get_endpoint(auth or self.auth, **kwargs)\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/session.py\", line 1218, in get_endpoint\n return auth.get_endpoint(self, **kwarg$ )\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/identity/base.py\", line 380, in get_endpoint\n allow_version_hack=allow_version_hack, **kwargs)\n File \"/usr/lib/python3.6/$ ite-packages/keystoneauth1/identity/base.py\", line 271, in get_endpoint_data\n service_catalog = self.get_access(session).service_catalog\n File \"/usr/lib/python3.6/site-packages/key$ toneauth1/identity/base.py\", line 134, in get_access\n self.auth_ref = self.get_auth_ref(session)\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py\", l$ ne 206, in get_auth_ref\n self._plugin = self._do_create_plugin(session)\n File \"/usr/lib/python3.6/site-packages/keystoneauth1/identity/generic/base.py\", line 161, in _do_create_plu$ in\n 'auth_url is correct. %s' % e)\nkeystoneauth1.exceptions.discovery.DiscoveryFailure: Could not find versioned identity endpoints when attempting to authenticate. Please check that $our auth_url is correct.
*Unable to establish connection to http://10.0.2.40:5000 <http://10.0.2.40:5000>: HTTPConnectionPool(host='10.0.2.40', port=5000): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f96f7b10cc0>: Failed to establish a new connection: [Errno 110] Connection timed out',))\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1} *
2021-08-24 21:29:29.617697 | 525400e8-92c8-d397-6f7e-000000006133 | TIMING | Clean up legacy Cinder keystone catalog entries | undercloud | 1:07:40.666419 | 130.85s
PLAY RECAP *********************************************************************
overcloud-computehci-0 : ok=260 changed=145 unreachable=0 failed=0 skipped=140 rescued=0 ignored=0
overcloud-computehci-1 : ok=258 changed=145 unreachable=0 failed=0 skipped=140 rescued=0 ignored=0
overcloud-computehci-2 : ok=255 changed=145 unreachable=0 failed=0 skipped=140 rescued=0 ignored=0
overcloud-controller-0 : ok=295 changed=181 unreachable=0 failed=0 skipped=151 rescued=0 ignored=0
overcloud-controller-1 : ok=289 changed=177 unreachable=0 failed=0 skipped=152 rescued=0 ignored=0
overcloud-controller-2 : ok=288 changed=177 unreachable=0 failed=0 skipped=152 rescued=0 ignored=0
undercloud : ok=105 changed=21 unreachable=0 failed=1 skipped=45 rescued=0 ignored=0
2021-08-24 21:29:29.730778 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 21:29:29.731007 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1723 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 21:29:29.731098 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 1:07:40.779840 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 21:29:29.731172 | UUID | Info | Host | Task Name | Run Time
2021-08-24 21:29:29.731251 | 525400e8-92c8-d397-6f7e-000000003b9a | SUMMARY | undercloud | Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log | 1762.93s
2021-08-24 21:29:29.731349 | 525400e8-92c8-d397-6f7e-0000000057aa | SUMMARY | undercloud | tripleo-ceph-run-ansible : run ceph-ansible | 990.24s 2021-08-24 21:29:29.731433 | 525400e8-92c8-d397-6f7e-000000005951 | SUMMARY | overcloud-controller-0 | tripleo_ha_wrapper : Run init bundle puppet on the host for haproxy | 133.22s 2021-08-24 21:29:29.731503 | 525400e8-92c8-d397-6f7e-000000006133 | SUMMARY | undercloud | Clean up legacy Cinder keystone catalog entries | 130.85s 2021-08-24 21:29:29.731569 | 525400e8-92c8-d397-6f7e-000000006012 | SUMMARY | overcloud-controller-0 | Wait for containers to start for step 3 using paunch | 103.45s 2021-08-24 21:29:29.731643 | 525400e8-92c8-d397-6f7e-000000004337 | SUMMARY | overcloud-computehci-0 | Pre-fetch all the containers | 94.00s
2021-08-24 21:29:29.731729 | 525400e8-92c8-d397-6f7e-000000004378 | SUMMARY | overcloud-computehci-2 | Pre-fetch all the containers | 92.64s
2021-08-24 21:29:29.731795 | 525400e8-92c8-d397-6f7e-000000004337 | SUMMARY | overcloud-computehci-1 | Pre-fetch all the containers | 86.38s
2021-08-24 21:29:29.731867 | 525400e8-92c8-d397-6f7e-000000004d68 | SUMMARY | overcloud-controller-0 | Wait for container-puppet tasks (generate config) to finish | 84.13s 2021-08-24 21:29:29.731946 | 525400e8-92c8-d397-6f7e-000000004d99 | SUMMARY | overcloud-controller-2 | Wait for container-puppet tasks (generate config) to finish | 80.76s 2021-08-24 21:29:29.732012 | 525400e8-92c8-d397-6f7e-00000000427c | SUMMARY | overcloud-controller-1 | Pre-fetch all the containers | 80.21s
2021-08-24 21:29:29.732073 | 525400e8-92c8-d397-6f7e-00000000427c | SUMMARY | overcloud-controller-0 | Pre-fetch all the containers | 77.03s
2021-08-24 21:29:29.732138 | 525400e8-92c8-d397-6f7e-0000000042f5 | SUMMARY | overcloud-controller-2 | Pre-fetch all the containers | 76.32s
2021-08-24 21:29:29.732202 | 525400e8-92c8-d397-6f7e-000000004dd3 | SUMMARY | overcloud-controller-1 | Wait for container-puppet tasks (generate config) to finish | 74.36s 2021-08-24 21:29:29.732266 | 525400e8-92c8-d397-6f7e-000000005da7 | SUMMARY | overcloud-controller-0 | tripleo_ha_wrapper : Run init bundle puppet on the host for ovn_dbs | 68.39s 2021-08-24 21:29:29.732329 | 525400e8-92c8-d397-6f7e-000000005ce2 | SUMMARY | overcloud-controller-0 | Wait for containers to start for step 2 using paunch | 64.55s 2021-08-24 21:29:29.732398 | 525400e8-92c8-d397-6f7e-000000004b97 | SUMMARY | overcloud-controller-2 | Wait for puppet host configuration to finish | 58.13s 2021-08-24 21:29:29.732463 | 525400e8-92c8-d397-6f7e-000000004c1a | SUMMARY | overcloud-controller-1 | Wait for puppet host configuration to finish | 58.11s 2021-08-24 21:29:29.732526 | 525400e8-92c8-d397-6f7e-000000005bd3 | SUMMARY | overcloud-controller-1 | Wait for containers to start for step 2 using paunch | 58.09s 2021-08-24 21:29:29.732589 | 525400e8-92c8-d397-6f7e-000000005b9b | SUMMARY | overcloud-controller-2 | Wait for containers to start for step 2 using paunch | 58.09s
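For what it's worth, a minimal reachability check from the undercloud toward that keystone endpoint could look like this; 10.0.2.40 and port 5000 are taken from the error above, nothing else is specific to this setup:

  # on the undercloud
  ip route get 10.0.2.40                         # which route/interface is used to reach the VIP?
  ping -c 3 10.0.2.40                            # basic reachability
  curl -sv --max-time 10 http://10.0.2.40:5000/  # does keystone answer at all on :5000?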
Thank you again for your assistance.

Regards.
Hi,

And thank you all for your help, I've managed to deploy my first overcloud. But again I have another problem. I am using an HCI deployment and I did include the Ceph dashboard in my deployment script, but I couldn't find the dashboard. After reviewing the Red Hat documentation, it seems that I have to use the "ControllerStorageDashboard" role. This is what I did, but I got this:
*RESP BODY: {"resources": [{"updated_time": "2021-08-25T20:14:20Z", "creation_time": "2021-08-25T20:14:20Z", "logical_resource_id": "0", "resource_name": "0", "physical_resource_id": "a21b3498-fbdb-4a19-8e23-9dd71232b473", "resource_status": "CREATE_FAILED", "resource_status_reason": "BadRequest: resources[0].resources.OVNMacAddressPort: Invalid input for operation: 'tripleo_ovn_mac_port_name=ControllerStorageDashboard-ovn-mac-0' exceeds maximum length of 60.\nNeutron server returns request_ids: ['req-467b58ef-dfd7-42c5-bb07-4f0f99b77332']", "resource_type": "OS* ::TripleO::OVNMacAddressPort", "links": [{"href": " https://10.200.24.2:13004/v1/4f94deb9a28549c0a78f232756c7599a/stacks/overclo... rollerStorageDashboardOVNChassisMacPorts-ui4dsb2tnkbk/ae81eb26-2f4b-4ae0-8826-af32be18ce14/resources/0", "rel": "self"}, {"href": " https://10.200.24.2:13004/v1/4f94deb9a28549c0a78f232756c75 99a/stacks/overcloud-ControllerStorageDashboard-vtmxtvxpzggi-1-ue2d2riknvna-ControllerStorageDashboardOVNChassisMacPorts-ui4dsb2tnkbk/ae81eb26-2f4b-4ae0-8826-af32be18ce14", "rel": "stack"}, {"href": " https://10.200.24.2:13004/v1/4f94deb9a28549c0a78f232756c7599a/stacks/overclo... -ui4dsb2tnkbk-0-yfuxj4ahxviu/a21b3498-fbdb-4a19-8e23-9dd71232b473", "rel": "nested"}], "required_by": [], "parent_resource": "ControllerStorageDashboardOVNChassisMacPorts"}]} GET call to orchestration for https://10.200.24.2:13004/v1/4f94deb9a28549c0a78f232756c7599a/stacks/overclo... dOVNChassisMacPorts-ui4dsb2tnkbk/ae81eb26-2f4b-4ae0-8826-af32be18ce14/resources used request id req-9609844f-f173-4e80-a3bd-bc287e88b00f REQ: curl -g -i --cacert "/etc/pki/ca-trust/source/anchors/cm-local-ca.pem" -X GET https://10.200.24.2:13004/v1/4f94deb9a28549c0a78f232756c7599a/stacks/a21b349... resources -H "Accept: application/json" -H "Content-Type: application/json" -H "User-Agent: python-heatclient" -H "X-Auth-Token: {SHA256}d296097c7cdf0beb50127e0a1d03cb8a702e18d543600f51b16d ab4987811a6a" -H "X-Region-Name: " https://10.200.24.2:13004 "GET /v1/4f94deb9a28549c0a78f232756c7599a/stacks/a21b3498-fbdb-4a19-8e23-9dd71232b473/resources HTTP/1.1" 302 649 RESP: [302] Content-Length: 649 Content-Type: application/json Date: Wed, 25 Aug 2021 20:15:11 GMT Location: https://10.200.24.2:13004/v1/4f94deb9a28549c0a78f232756c7599a/stacks/overclo... ontrollerStorageDashboard-vtmxtvxpzggi-1-ue2d2rik
...
...
overcloud.ControllerStorageDashboard.0.ControllerStorageDashboardOVNChassisMacPorts.0.OVNMacAddressPort:
resource_type: OS::Neutron::Port physical_resource_id: 259e39f8-9e7b-4494-bb2d-ff7b2cf0ad40 status: CREATE_FAILED status_reason: |
*BadRequest: resources.OVNMacAddressPort: Invalid input for operation: 'tripleo_ovn_mac_port_name=ControllerStorageDashboard-ovn-mac-0' exceeds maximum length of 60. Neutron server returns request_ids: ['req-322ab0aa-0e1c-416f-be81-b48230d3dab1']overcloud.ControllerStorageDashboard.2.ControllerStorageDashboardOVNChassisMacPorts.0.OVNMacAddressPort: resource_type: OS::Neutron::Port* physical_resource_id: c7daf26b-7f96-43cf-8678-11d456b5cdfe status: CREATE_FAILED status_reason: | BadRequest: resources.OVNMacAddressPort: Invalid input for operation: 'tripleo_ovn_mac_port_name=ControllerStorageDashboard-ovn-mac-0' exceeds maximum length of 60. Neutron server returns request_ids: ['req-9e3e19dd-4974-4007-9df0-ee9774369495']
overcloud.ControllerStorageDashboard.1.ControllerStorageDashboardOVNChassisMacPorts.0.OVNMacAddressPort: resource_type: OS::Neutron::Port physical_resource_id: c902e259-f299-457f-8b0d-c37fb40e0d32 status: CREATE_FAILED status_reason: | BadRequest: resources.OVNMacAddressPort: Invalid input for operation: 'tripleo_ovn_mac_port_name=ControllerStorageDashboard-ovn-mac-0' exceeds maximum length of 60. Neutron server returns request_ids: ['req-467b58ef-dfd7-42c5-bb07-4f0f99b77332'] clean_up ListStackFailures: END return value: 0 Instantiating messaging websocket client: wss://10.200.24.2:3000
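As a quick sanity check on the name in that message (just counting characters, with the string copied verbatim from the error above):

  $ echo -n 'tripleo_ovn_mac_port_name=ControllerStorageDashboard-ovn-mac-0' | wc -c
  62

  # the fixed 'tripleo_ovn_mac_port_name=' prefix plus the '-ovn-mac-0' suffix already
  # take 36 characters, so with Neutron's 60-character limit from the error the role
  # name itself can only be about 24 characters; 'ControllerStorageDashboard' is 26.

So unless there is a template override I'm missing, it looks like the generated port name simply cannot fit with a role name this long; maybe a shorter custom role name is the way around it?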
I couldn't find anything on the web about this error.

Regards.
*Unable to establish connection to http://10.0.2.40:5000 <http://10.0.2.40:5000>: HTTPConnectionPool(host='10.0.2.40', port=5000): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f96f7b10cc0>: Failed to establish a new connection: [Errno 110] Connection timed out',))\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1} *
2021-08-24 21:29:29.617697 | 525400e8-92c8-d397-6f7e-000000006133 | TIMING | Clean up legacy Cinder keystone catalog entries | undercloud | 1:07:40.666419 | 130.85s
PLAY RECAP *********************************************************************
overcloud-computehci-0 : ok=260 changed=145 unreachable=0 failed=0 skipped=140 rescued=0 ignored=0
overcloud-computehci-1 : ok=258 changed=145 unreachable=0 failed=0 skipped=140 rescued=0 ignored=0
overcloud-computehci-2 : ok=255 changed=145 unreachable=0 failed=0 skipped=140 rescued=0 ignored=0
overcloud-controller-0 : ok=295 changed=181 unreachable=0 failed=0 skipped=151 rescued=0 ignored=0
overcloud-controller-1 : ok=289 changed=177 unreachable=0 failed=0 skipped=152 rescued=0 ignored=0
overcloud-controller-2 : ok=288 changed=177 unreachable=0 failed=0 skipped=152 rescued=0 ignored=0
undercloud : ok=105 changed=21 unreachable=0 failed=1 skipped=45 rescued=0 ignored=0
2021-08-24 21:29:29.730778 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 21:29:29.731007 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1723 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 21:29:29.731098 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 1:07:40.779840 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 21:29:29.731172 | UUID | Info | Host | Task Name | Run Time
2021-08-24 21:29:29.731251 | 525400e8-92c8-d397-6f7e-000000003b9a | SUMMARY | undercloud | Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log | 1762.93s
2021-08-24 21:29:29.731349 | 525400e8-92c8-d397-6f7e-0000000057aa | SUMMARY | undercloud | tripleo-ceph-run-ansible : run ceph-ansible | 990.24s
2021-08-24 21:29:29.731433 | 525400e8-92c8-d397-6f7e-000000005951 | SUMMARY | overcloud-controller-0 | tripleo_ha_wrapper : Run init bundle puppet on the host for haproxy | 133.22s
2021-08-24 21:29:29.731503 | 525400e8-92c8-d397-6f7e-000000006133 | SUMMARY | undercloud | Clean up legacy Cinder keystone catalog entries | 130.85s
2021-08-24 21:29:29.731569 | 525400e8-92c8-d397-6f7e-000000006012 | SUMMARY | overcloud-controller-0 | Wait for containers to start for step 3 using paunch | 103.45s
2021-08-24 21:29:29.731643 | 525400e8-92c8-d397-6f7e-000000004337 | SUMMARY | overcloud-computehci-0 | Pre-fetch all the containers | 94.00s
2021-08-24 21:29:29.731729 | 525400e8-92c8-d397-6f7e-000000004378 | SUMMARY | overcloud-computehci-2 | Pre-fetch all the containers | 92.64s
2021-08-24 21:29:29.731795 | 525400e8-92c8-d397-6f7e-000000004337 | SUMMARY | overcloud-computehci-1 | Pre-fetch all the containers | 86.38s
2021-08-24 21:29:29.731867 | 525400e8-92c8-d397-6f7e-000000004d68 | SUMMARY | overcloud-controller-0 | Wait for container-puppet tasks (generate config) to finish | 84.13s
2021-08-24 21:29:29.731946 | 525400e8-92c8-d397-6f7e-000000004d99 | SUMMARY | overcloud-controller-2 | Wait for container-puppet tasks (generate config) to finish | 80.76s
2021-08-24 21:29:29.732012 | 525400e8-92c8-d397-6f7e-00000000427c | SUMMARY | overcloud-controller-1 | Pre-fetch all the containers | 80.21s
2021-08-24 21:29:29.732073 | 525400e8-92c8-d397-6f7e-00000000427c | SUMMARY | overcloud-controller-0 | Pre-fetch all the containers | 77.03s
2021-08-24 21:29:29.732138 | 525400e8-92c8-d397-6f7e-0000000042f5 | SUMMARY | overcloud-controller-2 | Pre-fetch all the containers | 76.32s
2021-08-24 21:29:29.732202 | 525400e8-92c8-d397-6f7e-000000004dd3 | SUMMARY | overcloud-controller-1 | Wait for container-puppet tasks (generate config) to finish | 74.36s
2021-08-24 21:29:29.732266 | 525400e8-92c8-d397-6f7e-000000005da7 | SUMMARY | overcloud-controller-0 | tripleo_ha_wrapper : Run init bundle puppet on the host for ovn_dbs | 68.39s
2021-08-24 21:29:29.732329 | 525400e8-92c8-d397-6f7e-000000005ce2 | SUMMARY | overcloud-controller-0 | Wait for containers to start for step 2 using paunch | 64.55s
2021-08-24 21:29:29.732398 | 525400e8-92c8-d397-6f7e-000000004b97 | SUMMARY | overcloud-controller-2 | Wait for puppet host configuration to finish | 58.13s
2021-08-24 21:29:29.732463 | 525400e8-92c8-d397-6f7e-000000004c1a | SUMMARY | overcloud-controller-1 | Wait for puppet host configuration to finish | 58.11s
2021-08-24 21:29:29.732526 | 525400e8-92c8-d397-6f7e-000000005bd3 | SUMMARY | overcloud-controller-1 | Wait for containers to start for step 2 using paunch | 58.09s
2021-08-24 21:29:29.732589 | 525400e8-92c8-d397-6f7e-000000005b9b | SUMMARY | overcloud-controller-2 | Wait for containers to start for step 2 using paunch | 58.09s
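(For what it's worth, the failing task above is the undercloud trying to reach the overcloud Keystone endpoint at http://10.0.2.40:5000 and timing out with Errno 110. A minimal reachability check from the undercloud could look like the sketch below; the 10.0.2.40 VIP comes straight from the log, everything else is an assumption to adapt to your setup.)

    # From the undercloud node: which interface/route is used to reach the overcloud VIP?
    ip route get 10.0.2.40
    # Basic reachability of the VIP itself
    ping -c 3 10.0.2.40
    # Does Keystone answer on the external endpoint, or does it also time out?
    curl -v --max-time 10 http://10.0.2.40:5000/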
Thank you again for your assistance.
Regards.
On Tue, Aug 24, 2021 at 08:59, wodel youchi <wodel.youchi@gmail.com> wrote:
Hi, and thanks for your help
As for Ceph, here is my container prepare file:

parameter_defaults:
  ContainerImagePrepare:
  - push_destination: true
    set:
      ceph_alertmanager_image: alertmanager
      ceph_alertmanager_namespace: quay.ceph.io/prometheus
      ceph_alertmanager_tag: v0.16.2
      ceph_grafana_image: grafana
      ceph_grafana_namespace: quay.ceph.io/app-sre
      *ceph_grafana_tag: 5.4.3*
      ceph_image: daemon
      ceph_namespace: quay.ceph.io/ceph-ci
      ceph_node_exporter_image: node-exporter
      ceph_node_exporter_namespace: quay.ceph.io/prometheus
      ceph_node_exporter_tag: v0.17.0
      ceph_prometheus_image: prometheus
      ceph_prometheus_namespace: quay.ceph.io/prometheus
      ceph_prometheus_tag: v2.7.2
      *ceph_tag: v4.0.19-stable-4.0-nautilus-centos-7-x86_64*
      name_prefix: centos-binary-
      name_suffix: ''
      namespace: quay.io/tripleotraincentos8
      neutron_driver: ovn
      rhel_containers: false
      tag: current-tripleo
      tag_from_label: rdo_version
And yes, the 10.200.7.0/24 network is my storage network. Here is a snippet from my network_data.yaml:
- name: Storage
  vip: true
  vlan: 1107
  name_lower: storage
  ip_subnet: '10.200.7.0/24'
  allocation_pools: [{'start': '10.200.7.150', 'end': '10.200.7.169'}]
I will look into the grafana service to see why it's not booting and get back to you.
Regards.
On Mon, Aug 23, 2021 at 17:28, Francesco Pantano <fpantano@redhat.com> wrote:
Hello, thanks John for your reply here. A few more comments inline:
On Mon, Aug 23, 2021 at 6:16 PM John Fulton <johfulto@redhat.com> wrote:
On Mon, Aug 23, 2021 at 10:52 AM wodel youchi <wodel.youchi@gmail.com> wrote:
Hi,
I redid the undercloud deployment for the Train version for now, and I verified the download URL for the images.
My overcloud deployment has moved forward, but I still get errors.
This is what I got this time:
"TASK [ceph-grafana : wait for grafana to start]
********************************",
"Monday 23 August 2021 14:55:21 +0100 (0:00:00.961)
0:12:59.319 ********* ",
"fatal: [overcloud-controller-0]: FAILED! => {\"changed\":
false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20
0.7.151:3100\"}", "fatal: [overcloud-controller-1]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.155:3100\"}", "fatal: [overcloud-controller-2]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.20 0.7.165:3100\"}",
I'm not certain of the ceph-ansible version you're using, but it should be version 4 with Train. ceph-ansible should already be installed on your undercloud judging by this error, and in the latest version 4 this task is where it failed:
https://github.com/ceph/ceph-ansible/blob/v4.0.64/roles/ceph-grafana/tasks/c...
You can check the status of this service on your three controllers and then debug it directly.
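(A minimal sketch of that check, run on each controller; the grafana-server unit name is taken from the ceph-ansible task names further down in this log, so treat it as an assumption and adjust to whatever `systemctl list-units | grep -i grafana` actually shows.)

    # Status of the containerized grafana service deployed by ceph-ansible
    sudo systemctl status grafana-server
    # Recent journal entries explaining why it did not come up
    sudo journalctl -u grafana-server --no-pager | tail -50
    # Did the container start at all?
    sudo podman ps -a | grep -i grafana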
As John pointed out, ceph-ansible configures, renders and starts the associated systemd units for all the Ceph monitoring stack components (node-exporter, prometheus, alertmanager and grafana). You can ssh to your controllers and check the associated systemd unit, inspecting the journal to see why it failed to start (I saw there's a timeout waiting for the container to start). A potential plan, in this case, could be:
1. check the systemd unit (I guess you can start with grafana, which is the failed service)
2. look at the journal logs (feel free to attach here the relevant part of the output)
3. double check the network where the service is bound (can you attach the /var/lib/mistral/<stack>/ceph-ansible/group_vars/all.yaml; a rough sketch follows below)

* The grafana process should be run on the storage network, but I see a "Timeout when waiting for 10.200.7.165:3100": is that network the right one?
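(A rough sketch for step 3; the all.yaml path is the one mentioned above, and port 3100 comes from the timeout message, so verify both before relying on them.)

    # On the undercloud: how ceph-ansible was told to bind the monitoring stack
    grep -i grafana /var/lib/mistral/<stack>/ceph-ansible/group_vars/all.yaml
    # On a controller: is anything listening on the grafana port, and on which address?
    sudo ss -tlnp | grep 3100
    # Compare with the addresses actually configured on the storage network
    ip -4 addr | grep '10\.200\.7\.'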
John
"RUNNING HANDLER [ceph-prometheus : service handler]
****************************",
"Monday 23 August 2021 15:00:22 +0100 (0:05:00.767)
0:18:00.087 ********* ",
"PLAY RECAP
*********************************************************************",
"overcloud-computehci-0 : ok=224 changed=23
unreachable=0 failed=0 skipped=415 rescued=0 ignored=0 ",
"overcloud-computehci-1 : ok=199 changed=18
unreachable=0 failed=0 skipped=392 rescued=0 ignored=0 ",
"overcloud-computehci-2 : ok=212 changed=23
unreachable=0 failed=0 skipped=390 rescued=0 ignored=0 ",
"overcloud-controller-0 : ok=370 changed=52
unreachable=0 failed=1 skipped=539 rescued=0 ignored=0 ",
"overcloud-controller-1 : ok=308 changed=43
unreachable=0 failed=1 skipped=495 rescued=0 ignored=0 ",
"overcloud-controller-2 : ok=317 changed=45
unreachable=0 failed=1 skipped=493 rescued=0 ignored=0 ",
"INSTALLER STATUS
***************************************************************",
"Install Ceph Monitor : Complete (0:00:52)", "Install Ceph Manager : Complete (0:05:49)", "Install Ceph OSD : Complete (0:02:28)", "Install Ceph RGW : Complete (0:00:27)", "Install Ceph Client : Complete (0:00:33)", "Install Ceph Grafana : In Progress (0:05:54)", "\tThis phase can be restarted by running:
roles/ceph-grafana/tasks/main.yml",
"Install Ceph Node Exporter : Complete (0:00:28)", "Monday 23 August 2021 15:00:22 +0100 (0:00:00.006)
0:18:00.094 ********* ",
"=============================================================================== ",
"ceph-grafana : wait for grafana to start
------------------------------ 300.77s",
"ceph-facts : get ceph current status
---------------------------------- 300.27s",
"ceph-container-common : pulling
udtrain.ctlplane.umaitek.dz:8787/ceph-ci/daemon:v4.0.19-stable-4.0-nautilus-centos-7-x86_64
image -- 19.04s", "ceph-mon : waiting for the monitor(s) to form the quorum... ------------ 12.83s", "ceph-osd : use ceph-volume lvm batch to create bluestore osds ---------- 12.13s", "ceph-osd : wait for all osd to be up ----------------------------------- 11.88s", "ceph-osd : set pg_autoscale_mode value on pool(s) ---------------------- 11.00s", "ceph-osd : create openstack pool(s) ------------------------------------ 10.80s", "ceph-grafana : make sure grafana is down ------------------------------- 10.66s", "ceph-osd : customize pool crush_rule ----------------------------------- 10.15s", "ceph-osd : customize pool size ----------------------------------------- 10.15s", "ceph-osd : customize pool min_size ------------------------------------- 10.14s", "ceph-osd : assign application to pool(s) ------------------------------- 10.13s", "ceph-osd : list existing pool(s) ---------------------------------------- 8.59s",
"ceph-mon : fetch ceph initial keys -------------------------------------- 7.01s", "ceph-container-common : get ceph version -------------------------------- 6.75s", "ceph-prometheus : start prometheus services ----------------------------- 6.67s", "ceph-mgr : wait for all mgr to be up ------------------------------------ 6.66s", "ceph-grafana : start the grafana-server service ------------------------- 6.33s", "ceph-mgr : create ceph mgr keyring(s) on a mon node --------------------- 6.26s" ], "failed_when_result": true } 2021-08-23 15:00:24.427687 | 525400e8-92c8-47b1-e162-00000000597d | TIMING | tripleo-ceph-run-ansible : print ceph-ansible outpu$ in case of failure | undercloud | 0:37:30.226345 | 0.25s
PLAY RECAP
overcloud-computehci-0 : ok=213 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0
overcloud-computehci-1 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0
overcloud-computehci-2 : ok=207 changed=117 unreachable=0 failed=0 skipped=120 rescued=0 ignored=0
overcloud-controller-0 : ok=237 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0
overcloud-controller-1 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0
overcloud-controller-2 : ok=232 changed=145 unreachable=0 failed=0 skipped=128 rescued=0 ignored=0
undercloud : ok=100 changed=18 unreachable=0 failed=1 skipped=37 rescued=0 ignored=0
2021-08-23 15:00:24.559997 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.560328 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1366 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.560419 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:37:30.359090 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.560490 | UUID | Info | Host | Task Name | Run Time
2021-08-23 15:00:24.560589 | 525400e8-92c8-47b1-e162-00000000597b | SUMMARY | undercloud | tripleo-ceph-run-ansible : run ceph-ansible | 1082.71s
2021-08-23 15:00:24.560675 | 525400e8-92c8-47b1-e162-000000004d9a | SUMMARY | overcloud-controller-1 | Wait for container-puppet tasks (generate config) to finish | 356.02s
2021-08-23 15:00:24.560763 | 525400e8-92c8-47b1-e162-000000004d6a | SUMMARY | overcloud-controller-0 | Wait for container-puppet tasks (generate config) to finish | 355.74s
2021-08-23 15:00:24.560839 | 525400e8-92c8-47b1-e162-000000004dd0 | SUMMARY | overcloud-controller-2 | Wait for container-puppet tasks (generate config) to finish | 355.68s
2021-08-23 15:00:24.560912 | 525400e8-92c8-47b1-e162-000000003bb1 | SUMMARY | undercloud | Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log | 143.03s
2021-08-23 15:00:24.560986 | 525400e8-92c8-47b1-e162-000000004b13 | SUMMARY | overcloud-controller-0 | Wait for puppet host configuration to finish | 125.36s
2021-08-23 15:00:24.561057 | 525400e8-92c8-47b1-e162-000000004b88 | SUMMARY | overcloud-controller-2 | Wait for puppet host configuration to finish | 125.33s
2021-08-23 15:00:24.561128 | 525400e8-92c8-47b1-e162-000000004b4b | SUMMARY | overcloud-controller-1 | Wait for puppet host configuration to finish | 125.25s
2021-08-23 15:00:24.561300 | 525400e8-92c8-47b1-e162-000000001dc4 | SUMMARY | overcloud-controller-2 | Run puppet on the host to apply IPtables rules | 108.08s
2021-08-23 15:00:24.561374 | 525400e8-92c8-47b1-e162-000000001e4f | SUMMARY | overcloud-controller-0 | Run puppet on the host to apply IPtables rules | 107.34s
2021-08-23 15:00:24.561444 | 525400e8-92c8-47b1-e162-000000004c8d | SUMMARY | overcloud-computehci-2 | Wait for container-puppet tasks (generate config) to finish | 96.56s
2021-08-23 15:00:24.561514 | 525400e8-92c8-47b1-e162-000000004c33 | SUMMARY | overcloud-computehci-0 | Wait for container-puppet tasks (generate config) to finish | 96.38s
2021-08-23 15:00:24.561580 | 525400e8-92c8-47b1-e162-000000004c60 | SUMMARY | overcloud-computehci-1 | Wait for container-puppet tasks (generate config) to finish | 93.41s
2021-08-23 15:00:24.561645 | 525400e8-92c8-47b1-e162-00000000434d | SUMMARY | overcloud-computehci-0 | Pre-fetch all the containers | 92.70s
2021-08-23 15:00:24.561712 | 525400e8-92c8-47b1-e162-0000000043ed | SUMMARY | overcloud-computehci-2 | Pre-fetch all the containers | 91.90s
2021-08-23 15:00:24.561782 | 525400e8-92c8-47b1-e162-000000004385 | SUMMARY | overcloud-computehci-1 | Pre-fetch all the containers | 91.88s
2021-08-23 15:00:24.561876 | 525400e8-92c8-47b1-e162-00000000491c | SUMMARY | overcloud-computehci-1 | Wait for puppet host configuration to finish | 90.37s
2021-08-23 15:00:24.561947 | 525400e8-92c8-47b1-e162-000000004951 | SUMMARY | overcloud-computehci-2 | Wait for puppet host configuration to finish | 90.37s
2021-08-23 15:00:24.562016 | 525400e8-92c8-47b1-e162-0000000048e6 | SUMMARY | overcloud-computehci-0 | Wait for puppet host configuration to finish | 90.35s
2021-08-23 15:00:24.562080 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562196 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562311 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562379 | The following node(s) had failures: undercloud
2021-08-23 15:00:24.562456 |
>> Host 10.0.2.40 not found in /home/stack/.ssh/known_hosts
>> Ansible failed, check log at /var/lib/mistral/overcloud/ansible.log.
>> Overcloud Endpoint: http://10.0.2.40:5000
>> Overcloud Horizon Dashboard URL: http://10.0.2.40:80/dashboard
>> Overcloud rc file: /home/stack/overcloudrc
>> Overcloud Deployed with error
>> Overcloud configuration failed.
>
> Could someone help debug this? The ansible.log is huge and I can't see the origin of the problem; if someone can point me in the right direction it will be appreciated.
> Thanks in advance.
>
> Regards.
>
> On Wed, Aug 18, 2021 at 18:02, Wesley Hayutin <whayutin@redhat.com> wrote:
>>
>> On Wed, Aug 18, 2021 at 10:10 AM Dmitry Tantsur <dtantsur@redhat.com> wrote:
>>>
>>> Hi,
>>>
>>> You somehow ended up using master (Xena release) deploy ramdisk with Train TripleO. You need to make sure to download Train images. I hope TripleO people can point you at the right place.
>>>
>>> Dmitry
>>
>> http://images.rdoproject.org/centos8/
>> http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/
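(As an illustration of Dmitry's point above, a minimal sketch for refreshing the deploy images with the Train ones from the RDO mirror Wesley linked; the archive names and the ~/images path are assumptions, so check the directory listing and the TripleO image docs first.)

    cd ~/images
    curl -O http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/ironic-python-agent.tar
    curl -O http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/overcloud-full.tar
    for f in *.tar; do tar -xf "$f"; done
    # Re-upload so Ironic uses the Train IPA kernel/ramdisk instead of the master one
    openstack overcloud image upload --image-path ~/images/ --update-existing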
--
Francesco Pantano
GPG KEY: F41BD75C
participants (5)
- Dmitry Tantsur
- Francesco Pantano
- John Fulton
- Wesley Hayutin
- wodel youchi