Hi,

I think I found what causes the problem, but I don't understand why.

I reduced the verbosity: instead of -vvv I kept a single -v, and I disabled the ANSIBLE_DEBUG variable. With that, the deployment ran to the end.
At first I suspected the tmux process (some kind of buffer overflow caused by the volume of logs), but then I connected to the VM's console directly and saw the same behavior.

With a single -v the process completes without problems, but with more verbosity (-vvv) it gets stuck somewhere.
Can someone explain this to me?
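
For anyone who wants to see the general mechanism I first suspected, here is a minimal Python sketch. It is not established as the cause of this hang, and it does not involve Ansible or tmux at all; the function name `writer_blocks_on_full_pipe` and its parameters are made up for the illustration. It only shows that a process writing a lot of output to a pipe stays blocked until something reads that output.

```python
import subprocess
import sys
import time


def writer_blocks_on_full_pipe(total_bytes=1_000_000, wait_seconds=1.0):
    """Start a child that writes total_bytes to stdout, without reading at first.

    Returns (blocked, bytes_read, exit_code). 'blocked' is True when the child
    is still running after wait_seconds, i.e. it could not finish its write
    because nobody was draining the pipe.
    """
    child_code = (
        "import sys; "
        f"sys.stdout.write('x' * {total_bytes}); "
        "sys.stdout.flush()"
    )
    proc = subprocess.Popen(
        [sys.executable, "-c", child_code],
        stdout=subprocess.PIPE,
    )

    # Deliberately do not read yet: the pipe buffer is far smaller than
    # total_bytes, so the child cannot complete its write.
    time.sleep(wait_seconds)
    blocked = proc.poll() is None

    # Draining the pipe lets the child finish and exit normally.
    data = proc.stdout.read()
    exit_code = proc.wait()
    return blocked, len(data), exit_code
```

If something similar were happening here, the stall would depend on the amount of output rather than on the task being run, which would at least be consistent with -v working and -vvv not. That remains a guess.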
 


Regards.

On Mon, Oct 24, 2022, 14:00 wodel youchi <wodel.youchi@gmail.com> wrote:
Anyone?

On Mon, Oct 24, 2022, 07:53 wodel youchi <wodel.youchi@gmail.com> wrote:
Hi,

My setup is simple: it's an HCI deployment composed of 3 controller nodes and 6 compute and storage nodes.
I am using ceph-ansible to deploy the storage part, and that deployment goes well.

My base OS is Rocky Linux 8 fully updated.

My network is composed of a 1Gb management network for the OS, application deployment and server management, and a 40Gb data network with LACP (80Gb). I am using VLANs to segregate the OpenStack networks.

I updated both the Xena and Yoga kolla-ansible packages, and I updated the container images several times (I am using a local registry).

No matter how many times I try to deploy, the behavior is the same: the setup gets stuck somewhere.

I tried deploying the core modules without SSL, I tried an older kernel, and I tried deploying over the 40Gb network; nothing works. The problem is the lack of any error: if there were one, it would be a starting point, but I have nothing.

Regards.

On Sun, Oct 23, 2022, 00:42 wodel youchi <wodel.youchi@gmail.com> wrote:
Hi,

Here you can find the kolla-ansible deploy log with ANSIBLE_DEBUG=1

Regards.

On Sat, Oct 22, 2022, 23:55 wodel youchi <wodel.youchi@gmail.com> wrote:
Hi,

I am trying to deploy a new platform using kolla-ansible Yoga, and I am trying to upgrade another platform from Xena to Yoga.

On both platforms the prechecks went well, but when I start the deployment on the first and the upgrade on the second, the process gets stuck.

I tried tail -f /var/log/kolla/*/*.log, but I can't pin down the cause.

On the first platform, some services get deployed, and at some point the script gets stuck, several times in the modprobe phase.

On the second platform, the upgrade gets stuck at:

Escalation succeeded
<20.3.0.28> (0, b'\n{"path": "/etc/kolla/cron", "changed": false, "diff": {"before": {"path": "/etc/kolla/cron"}, "after": {"path": "/etc/kolla/cron"}}, "uid": 0, "gid": 0, "owner": "root", "group": "root", "mode": "0770", "state": "directory", "secontext": "unconfined_u:object_r:etc_t:s0", "size": 70, "invocation": {"module_args": {"path": "/etc/kolla/cron", "owner": "root", "group": "root", "mode": "0770", "recurse": false, "force": false, "follow": true, "modification_time_format": "%Y%m%d%H%M.%S", "access_time_format": "%Y%m%d%H%M.%S", "unsafe_writes": false, "state": "directory", "_original_basename": null, "_diff_peek": null, "src": null, "modification_time": null, "access_time": null, "seuser": null, "serole": null, "selevel": null, "setype": null, "attributes": null}}}\n', b'')
ok: [20.3.0.28] => (item={'key': 'cron', 'value': {'container_name': 'cron', 'group': 'cron', 'enabled': True, 'image': '20.3.0.34:4000/openstack.kolla/centos-source-cron:yoga', 'environment': {'DUMMY_ENVIRONMENT': 'kolla_useless_env', 'KOLLA_LOGROTATE_SCHEDULE': 'daily'}, 'volumes': ['/etc/kolla/cron/:/var/lib/kolla/config_files/:ro', '/etc/localtime:/etc/localtime:ro', '', 'kolla_logs:/var/log/kolla/'], 'dimensions': {}}}) => {
    "ansible_loop_var": "item",
    "changed": false,
    "diff": {
        "after": {
            "path": "/etc/kolla/cron"
        },
        "before": {
            "path": "/etc/kolla/cron"
        }
    },
    "gid": 0,
    "group": "root",

How can I start debugging this situation?
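
One possible starting point, as a minimal sketch rather than a proven recipe: wrap the run in a small Python script that writes every output line to a file with a timestamp and reports the last line seen whenever the output goes silent for too long. The function name `run_with_stall_watch`, the log file name and the 300-second threshold are all arbitrary choices for the illustration, and the inventory name in the usage comment is hypothetical.

```python
import queue
import subprocess
import sys
import threading
import time


def run_with_stall_watch(cmd, log_path, stall_seconds=300.0):
    """Run cmd, log timestamped output to log_path, and note silent periods.

    Returns (exit_code, stalls), where stalls holds the last output line
    seen before each silent period longer than stall_seconds.
    """
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so ordering is preserved
        text=True,
        bufsize=1,
    )
    lines = queue.Queue()

    def pump():
        # Read in a separate thread so the main loop can time out.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)  # sentinel: the child closed its output

    threading.Thread(target=pump, daemon=True).start()

    stalls = []
    last_line = ""
    in_stall = False
    with open(log_path, "w") as log:
        while True:
            try:
                line = lines.get(timeout=stall_seconds)
            except queue.Empty:
                if not in_stall:  # report each silent period only once
                    in_stall = True
                    stalls.append(last_line)
                    print(
                        f"no output for {stall_seconds}s after: {last_line!r}",
                        file=sys.stderr,
                    )
                continue
            if line is None:
                break
            in_stall = False
            last_line = line.rstrip("\n")
            log.write(f"{time.strftime('%H:%M:%S')} {line}")
            log.flush()
    return proc.wait(), stalls


# Example call (the inventory file name "multinode" is a placeholder):
# run_with_stall_watch(["kolla-ansible", "-i", "multinode", "deploy", "-v"], "deploy.log")
```

The timestamps in the log file show how long the run sat on the last task before it stopped, and the same wrapper can be reused with different verbosity levels to compare where each run stalls.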

Regards.