Re: Need help deploying Openstack

23 Aug 2021


Hello,
Thanks, John, for your reply.
A few more comments inline:

On Mon, Aug 23, 2021 at 6:16 PM John Fulton <johfulto@redhat.com> wrote:
...
On Mon, Aug 23, 2021 at 10:52 AM wodel youchi <wodel.youchi@gmail.com>
wrote:
...
Hi,
I redid the undercloud deployment for the Train version for now. And I
verified the download URL for the images.
...
My overcloud deployment has moved forward but I still get errors.
This is what I got this time :
...
"TASK [ceph-grafana : wait for grafana to start] ********************************",
"Monday 23 August 2021  14:55:21 +0100 (0:00:00.961)       0:12:59.319 ********* ",
"fatal: [overcloud-controller-0]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.200.7.151:3100\"}",
"fatal: [overcloud-controller-1]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.200.7.155:3100\"}",
"fatal: [overcloud-controller-2]: FAILED! => {\"changed\": false, \"elapsed\": 300, \"msg\": \"Timeout when waiting for 10.200.7.165:3100\"}",
I'm not certain which ceph-ansible version you're using, but it should
be a version 4 with Train. Judging by this error, ceph-ansible is
already installed on your undercloud, and in the latest version 4 this
is the task that failed:
https://github.com/ceph/ceph-ansible/blob/v4.0.64/roles/ceph-grafana/tasks/c...
You can check the status of this service on your three controllers and
then debug it directly.
As John pointed out, ceph-ansible configures, renders and starts the
systemd units for all the Ceph monitoring stack components
(node-exporter, prometheus, alertmanager and grafana).
You can ssh to your controllers, check the associated systemd unit, and
look at the journal to see why it failed to start (I saw there's a
timeout waiting for the container to start).
A potential plan, in this case, could be:

1. check the systemd unit (I guess you can start with grafana, which is
the failed service)
2. look at the journal logs (feel free to attach the relevant part of
the output here)
3. double check the network the service is bound to (can you attach
/var/lib/mistral/<stack>/ceph-ansible/group_vars/all.yaml?)
    * The grafana process should run on the storage network, but I see a
"Timeout when waiting for 10.200.7.165:3100": is that network the right one?
...
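To make step 3 a bit more concrete, here is a minimal sketch (plain
Python, my own illustration and not part of ceph-ansible) of the kind of
TCP check the failed task appears to perform: it simply tries to connect
to each address and port quoted in the error messages. The helper names
port_is_open and check_endpoints are hypothetical; the addresses are the
ones from your log, so adjust them if the storage network is different.

```python
import socket

# Addresses and port copied from the "Timeout when waiting for ..."
# messages in this thread; change them to match your environment.
GRAFANA_ENDPOINTS = [
    ("10.200.7.151", 3100),
    ("10.200.7.155", 3100),
    ("10.200.7.165", 3100),
]


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks and timeouts.
        return False


def check_endpoints(endpoints, timeout=3.0):
    """Map "host:port" to a reachability flag for each endpoint."""
    return {f"{host}:{port}": port_is_open(host, port, timeout)
            for host, port in endpoints}
```

Calling check_endpoints(GRAFANA_ENDPOINTS) from one of the controllers
returns a dict of "host:port" to True/False. If every entry is False
while the grafana container is running, the service is probably bound to
a different network than the one the task is waiting on.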

...
John
...
...
"RUNNING HANDLER [ceph-prometheus : service handler] ****************************",
"Monday 23 August 2021  15:00:22 +0100 (0:05:00.767)       0:18:00.087 ********* ",
"PLAY RECAP *********************************************************************",
"overcloud-computehci-0     : ok=224  changed=23   unreachable=0    failed=0    skipped=415  rescued=0    ignored=0   ",
"overcloud-computehci-1     : ok=199  changed=18   unreachable=0    failed=0    skipped=392  rescued=0    ignored=0   ",
"overcloud-computehci-2     : ok=212  changed=23   unreachable=0    failed=0    skipped=390  rescued=0    ignored=0   ",
"overcloud-controller-0     : ok=370  changed=52   unreachable=0    failed=1    skipped=539  rescued=0    ignored=0   ",
"overcloud-controller-1     : ok=308  changed=43   unreachable=0    failed=1    skipped=495  rescued=0    ignored=0   ",
"overcloud-controller-2     : ok=317  changed=45   unreachable=0    failed=1    skipped=493  rescued=0    ignored=0   ",
"INSTALLER STATUS ***************************************************************",
"Install Ceph Monitor           : Complete (0:00:52)",
"Install Ceph Manager           : Complete (0:05:49)",
"Install Ceph OSD               : Complete (0:02:28)",
"Install Ceph RGW               : Complete (0:00:27)",
"Install Ceph Client            : Complete (0:00:33)",
"Install Ceph Grafana           : In Progress (0:05:54)",
"\tThis phase can be restarted by running: roles/ceph-grafana/tasks/main.yml",
"Install Ceph Node Exporter     : Complete (0:00:28)",
"Monday 23 August 2021  15:00:22 +0100 (0:00:00.006)       0:18:00.094 ********* ",
...
"===============================================================================",
"ceph-grafana : wait for grafana to start ------------------------------ 300.77s",
"ceph-facts : get ceph current status ---------------------------------- 300.27s",
"ceph-container-common : pulling udtrain.ctlplane.umaitek.dz:8787/ceph-ci/daemon:v4.0.19-stable-4.0-nautilus-centos-7-x86_64 image -- 19.04s",
"ceph-mon : waiting for the monitor(s) to form the quorum... ------------ 12.83s",
"ceph-osd : use ceph-volume lvm batch to create bluestore osds ---------- 12.13s",
"ceph-osd : wait for all osd to be up ----------------------------------- 11.88s",
"ceph-osd : set pg_autoscale_mode value on pool(s) ---------------------- 11.00s",
"ceph-osd : create openstack pool(s) ------------------------------------ 10.80s",
"ceph-grafana : make sure grafana is down ------------------------------- 10.66s",
"ceph-osd : customize pool crush_rule ----------------------------------- 10.15s",
"ceph-osd : customize pool size ----------------------------------------- 10.15s",
"ceph-osd : customize pool min_size ------------------------------------- 10.14s",
"ceph-osd : assign application to pool(s) ------------------------------- 10.13s",
"ceph-osd : list existing pool(s) ---------------------------------------- 8.59s",
"ceph-mon : fetch ceph initial keys -------------------------------------- 7.01s",
"ceph-container-common : get ceph version -------------------------------- 6.75s",
"ceph-prometheus : start prometheus services ----------------------------- 6.67s",
"ceph-mgr : wait for all mgr to be up ------------------------------------ 6.66s",
"ceph-grafana : start the grafana-server service ------------------------- 6.33s",
"ceph-mgr : create ceph mgr keyring(s) on a mon node --------------------- 6.26s"
   ],
   "failed_when_result": true
}
2021-08-23 15:00:24.427687 | 525400e8-92c8-47b1-e162-00000000597d | TIMING | tripleo-ceph-run-ansible : print ceph-ansible output in case of failure | undercloud | 0:37:30.226345 | 0.25s
PLAY RECAP

...
...
overcloud-computehci-0     : ok=213  changed=117  unreachable=0    failed=0    skipped=120  rescued=0    ignored=0
overcloud-computehci-1     : ok=207  changed=117  unreachable=0    failed=0    skipped=120  rescued=0    ignored=0
overcloud-computehci-2     : ok=207  changed=117  unreachable=0    failed=0    skipped=120  rescued=0    ignored=0
overcloud-controller-0     : ok=237  changed=145  unreachable=0    failed=0    skipped=128  rescued=0    ignored=0
overcloud-controller-1     : ok=232  changed=145  unreachable=0    failed=0    skipped=128  rescued=0    ignored=0
overcloud-controller-2     : ok=232  changed=145  unreachable=0    failed=0    skipped=128  rescued=0    ignored=0
undercloud                 : ok=100  changed=18   unreachable=0    failed=1    skipped=37   rescued=0    ignored=0
2021-08-23 15:00:24.559997 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.560328 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 1366       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.560419 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:37:30.359090 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.560490 |                                 UUID |    Info |       Host |   Task Name |   Run Time
2021-08-23 15:00:24.560589 | 525400e8-92c8-47b1-e162-00000000597b | SUMMARY | undercloud | tripleo-ceph-run-ansible : run ceph-ansible | 1082.71s
2021-08-23 15:00:24.560675 | 525400e8-92c8-47b1-e162-000000004d9a | SUMMARY | overcloud-controller-1 | Wait for container-puppet tasks (generate config) to finish | 356.02s
2021-08-23 15:00:24.560763 | 525400e8-92c8-47b1-e162-000000004d6a | SUMMARY | overcloud-controller-0 | Wait for container-puppet tasks (generate config) to finish | 355.74s
2021-08-23 15:00:24.560839 | 525400e8-92c8-47b1-e162-000000004dd0 | SUMMARY | overcloud-controller-2 | Wait for container-puppet tasks (generate config) to finish | 355.68s
2021-08-23 15:00:24.560912 | 525400e8-92c8-47b1-e162-000000003bb1 | SUMMARY | undercloud | Run tripleo-container-image-prepare logged to: /var/log/tripleo-container-image-prepare.log | 143.03s
2021-08-23 15:00:24.560986 | 525400e8-92c8-47b1-e162-000000004b13 | SUMMARY | overcloud-controller-0 | Wait for puppet host configuration to finish | 125.36s
2021-08-23 15:00:24.561057 | 525400e8-92c8-47b1-e162-000000004b88 | SUMMARY | overcloud-controller-2 | Wait for puppet host configuration to finish | 125.33s
2021-08-23 15:00:24.561128 | 525400e8-92c8-47b1-e162-000000004b4b | SUMMARY | overcloud-controller-1 | Wait for puppet host configuration to finish | 125.25s
2021-08-23 15:00:24.561300 | 525400e8-92c8-47b1-e162-000000001dc4 | SUMMARY | overcloud-controller-2 | Run puppet on the host to apply IPtables rules | 108.08s
2021-08-23 15:00:24.561374 | 525400e8-92c8-47b1-e162-000000001e4f | SUMMARY | overcloud-controller-0 | Run puppet on the host to apply IPtables rules | 107.34s
2021-08-23 15:00:24.561444 | 525400e8-92c8-47b1-e162-000000004c8d | SUMMARY | overcloud-computehci-2 | Wait for container-puppet tasks (generate config) to finish | 96.56s
2021-08-23 15:00:24.561514 | 525400e8-92c8-47b1-e162-000000004c33 | SUMMARY | overcloud-computehci-0 | Wait for container-puppet tasks (generate config) to finish | 96.38s
2021-08-23 15:00:24.561580 | 525400e8-92c8-47b1-e162-000000004c60 | SUMMARY | overcloud-computehci-1 | Wait for container-puppet tasks (generate config) to finish | 93.41s
2021-08-23 15:00:24.561645 | 525400e8-92c8-47b1-e162-00000000434d | SUMMARY | overcloud-computehci-0 | Pre-fetch all the containers | 92.70s
2021-08-23 15:00:24.561712 | 525400e8-92c8-47b1-e162-0000000043ed | SUMMARY | overcloud-computehci-2 | Pre-fetch all the containers | 91.90s
2021-08-23 15:00:24.561782 | 525400e8-92c8-47b1-e162-000000004385 | SUMMARY | overcloud-computehci-1 | Pre-fetch all the containers | 91.88s
2021-08-23 15:00:24.561876 | 525400e8-92c8-47b1-e162-00000000491c | SUMMARY | overcloud-computehci-1 | Wait for puppet host configuration to finish | 90.37s
2021-08-23 15:00:24.561947 | 525400e8-92c8-47b1-e162-000000004951 | SUMMARY | overcloud-computehci-2 | Wait for puppet host configuration to finish | 90.37s
2021-08-23 15:00:24.562016 | 525400e8-92c8-47b1-e162-0000000048e6 | SUMMARY | overcloud-computehci-0 | Wait for puppet host configuration to finish | 90.35s
2021-08-23 15:00:24.562080 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562196 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562311 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~
2021-08-23 15:00:24.562379 |  The following node(s) had failures: undercloud
2021-08-23 15:00:24.562456 |
>> Host 10.0.2.40 not found in /home/stack/.ssh/known_hosts
>> Ansible failed, check log at /var/lib/mistral/overcloud/ansible.log.
>> Overcloud Endpoint: http://10.0.2.40:5000
>> Overcloud Horizon Dashboard URL: http://10.0.2.40:80/dashboard
>> Overcloud rc file: /home/stack/overcloudrc
>> Overcloud Deployed with error
>> Overcloud configuration failed.
>>
>
>
> Could someone help debug this? The ansible.log is huge and I can't see
> the origin of the problem. If someone can point me in the right
> direction, it would be appreciated.
> Thanks in advance.
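
Since the ansible.log is huge, one way to narrow it down is to keep only
the "fatal:" lines and the PLAY RECAP entries that report failures. Below
is a minimal sketch in plain Python, assuming the log uses the same recap
and "fatal: [host]" format as the output quoted in this thread; the
function names failed_hosts and fatal_lines are hypothetical. In your
run, the outer recap only shows the undercloud with failed=1 (the task
that wraps ceph-ansible), while the inner ceph-ansible recap shows the
three controllers with failed=1, which is where the real error is.

```python
import re

# Matches Ansible recap lines such as
#   overcloud-controller-0 : ok=370 changed=52 unreachable=0 failed=1 ...
# The line may be wrapped in quotes, as in the embedded ceph-ansible output.
RECAP_RE = re.compile(
    r"(?P<host>[\w.-]+)\s*:\s*ok=\d+.*?\bfailed=(?P<failed>\d+)"
)


def failed_hosts(lines):
    """Return {host: failed_count} for recap lines with failed > 0."""
    result = {}
    for line in lines:
        match = RECAP_RE.search(line)
        if match and int(match["failed"]) > 0:
            result[match["host"]] = int(match["failed"])
    return result


def fatal_lines(lines):
    """Return the stripped lines where Ansible reported a fatal result."""
    return [line.strip() for line in lines if "fatal: [" in line]
```

Reading /var/lib/mistral/overcloud/ansible.log into a list of lines and
passing it to both functions should give a short list of hosts and
failed tasks to start from, instead of the whole log.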
>
> Regards.
>
> On Wed, Aug 18, 2021 at 18:02, Wesley Hayutin <whayutin@redhat.com>
wrote:
>>
>>
>>
>> On Wed, Aug 18, 2021 at 10:10 AM Dmitry Tantsur <dtantsur@redhat.com>
wrote:
>>>
>>> Hi,
>>>
>>> On Wed, Aug 18, 2021 at 4:39 PM wodel youchi <wodel.youchi@gmail.com>
wrote:
>>>>
>>>> Hi,
>>>> I am trying to deploy OpenStack with TripleO using VMs and nested KVM
for the compute node. This is for test and learning purposes.
>>>>
>>>> I am using the Train version and following some tutorials.
>>>> I prepared my different template files and started the deployment,
but I got these errors :
>>>>
>>>> Failed to provision instance fc40457e-4b3c-4402-ae9d-c528f2c2ad30:
Asynchronous exception: Node failed to deploy. Exception: Agent API for
node 6d3724fc-6f13-4588-bbe5-56bc4f9a4f87 returned HTTP status code 404
with error: Not found: Extension with id iscsi not found. for node
>>>>
>>>
>>> You somehow ended up using master (Xena release) deploy ramdisk with
Train TripleO. You need to make sure to download Train images. I hope
TripleO people can point you at the right place.
>>>
>>> Dmitry
>>
>>
>> http://images.rdoproject.org/centos8/
>> http://images.rdoproject.org/centos8/train/rdo_trunk/current-tripleo/
>>
>>>
>>>
>>>>
>>>> and
>>>>
>>>> Got HTTP 409: {"errors": [{"status": 409, "title": "Conflict",
"detail": "There was a conflict when trying to complete your request.\n\n
Unable to allocate inventory: Unable to create allocation for
'CUSTOM_BAREMETAL' on resource provider
'6d3724fc-6f13-4588-bbe5-56bc4f9a4f87'. The requested amount would exceed
the capacity. ",
>>>>
>>>> Could you help me understand what those errors mean? I couldn't find
anything similar on the net.
>>>>
>>>> Thanks in advance.
>>>>
>>>> Regards.
>>>
>>>
>>>
>>> --
>>> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
>>> Commercial register: Amtsgericht Muenchen, HRB 153243,
>>> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs,
Michael O'Neill


-- 
Francesco Pantano
GPG KEY: F41BD75C