Hello again,

Here is my bond_vlan.j2; is it correct?

---
{% set mtu_list = [ctlplane_mtu] %}
{% for network in role_networks %}
{{ mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) }}
{%- endfor %}
{% set min_viable_mtu = mtu_list | max %}
network_config:
- type: interface
  name: nic1
  mtu: {{ ctlplane_mtu }}
  use_dhcp: false
  addresses:
  - ip_netmask: {{ ctlplane_ip }}/{{ ctlplane_subnet_cidr }}
  routes: {{ ctlplane_host_routes }}
- type: ovs_bridge
  name: br1
  dns_servers: {{ ctlplane_dns_nameservers }}
  domain: {{ dns_search_domains }}
  members:
  - type: ovs_bond
    name: bond1
    mtu: {{ min_viable_mtu }}
    ovs_options: {{ bond_interface_ovs_options }}
    members:
    - type: interface
      name: nic3
      mtu: {{ min_viable_mtu }}
      primary: true
    - type: interface
      name: nic4
      mtu: {{ min_viable_mtu }}
  - type: vlan
    mtu: {{ min_viable_mtu }}
    vlan_id: {{ storage_vlan_id }}
    addresses:
    - ip_netmask: {{ storage_ip }}/{{ storage_cidr }}
    routes: {{ storage_host_routes }}
  - type: vlan
    mtu: {{ min_viable_mtu }}
    vlan_id: {{ storage_mgmt_vlan_id }}
    addresses:
    - ip_netmask: {{ storage_mgmt_ip }}/{{ storage_mgmt_cidr }}
    routes: {{ storage_mgmt_host_routes }}
- type: ovs_bridge
  name: br2
  dns_servers: {{ ctlplane_dns_nameservers }}
  domain: {{ dns_search_domains }}
  members:
  - type: ovs_bond
    name: bond2
    mtu: {{ min_viable_mtu }}
    ovs_options: {{ bond_interface_ovs_options }}
    members:
    - type: interface
      name: nic5
      mtu: {{ min_viable_mtu }}
      primary: true
    - type: interface
      name: nic6
      mtu: {{ min_viable_mtu }}
  - type: vlan
    mtu: {{ min_viable_mtu }}
    vlan_id: {{ internal_api_vlan_id }}
    addresses:
    - ip_netmask: {{ internal_api_ip }}/{{ internal_api_cidr }}
    routes: {{ internal_api_host_routes }}
  - type: vlan
    mtu: {{ min_viable_mtu }}
    vlan_id: {{ tenant_vlan_id }}
    addresses:
    - ip_netmask: {{ tenant_ip }}/{{ tenant_cidr }}
    routes: {{ tenant_host_routes }}
  - type: vlan
    mtu: {{ min_viable_mtu }}
    vlan_id: {{ external_vlan_id }}
    addresses:
    - ip_netmask: {{ external_ip }}/{{ external_cidr }}
    routes: {{ external_host_routes }}
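One way to smoke-test a template like this off-box is to render it with plain Jinja2 while stubbing Ansible's lookup() plugin; with StrictUndefined, any misspelled or missing variable fails loudly instead of being hidden by no_log. Everything below (the variable values and the reduced inline snippet) is invented for illustration, not taken from a real deployment:

```python
# Sketch: render a TripleO-style Jinja2 snippet locally to surface template
# errors before deploying. All variable values here are made-up examples.
import jinja2

template_vars = {
    "ctlplane_mtu": 1500,
    "role_networks": ["InternalApi"],
    "networks_lower": {"InternalApi": "internal_api"},
    "internal_api_mtu": 9000,
    "internal_api_vlan_id": 20,
}

def lookup(kind, name):
    # Minimal stand-in for Ansible's lookup('vars', ...) plugin.
    if kind != "vars":
        raise ValueError("only lookup('vars', ...) is stubbed here")
    return template_vars[name]

# A reduced version of the MTU logic from the template above.
snippet = """\
{%- set mtu_list = [ctlplane_mtu] %}
{%- for network in role_networks %}
{%- set _ = mtu_list.append(lookup('vars', networks_lower[network] ~ '_mtu')) %}
{%- endfor %}
min_viable_mtu: {{ mtu_list | max }}
vlan_id: {{ internal_api_vlan_id }}
"""

# StrictUndefined turns any reference to an undefined variable into an error.
env = jinja2.Environment(undefined=jinja2.StrictUndefined)
rendered = env.from_string(snippet).render(lookup=lookup, **template_vars)
print(rendered)
```

For a real template you would read the .j2 file instead of the inline snippet and feed in the full set of role variables.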



Regards

On Tue, Aug 24, 2021 at 6:07 PM wodel youchi <wodel.youchi@gmail.com> wrote:
Hi and thanks for the help.

My network is simple, I have 5 NICs per node:
- first NIC: provisioning
- second and third NICs as a bond: storage and storage mgmt
- fourth and fifth NICs as a bond: tenant, API and external


I modified the baremetal_deployment.yaml as you suggested,
- name: Controller
  count: 3
  hostname_format: controller-%index%
  defaults:
    profile: control
    networks:
      - network: external
        subnet: external_subnet
      - network: internal_api
        subnet: internal_api_subnet
      - network: storage
        subnet: storage_subnet
      - network: storage_mgmt
        subnet: storage_mgmt_subnet
      - network: tenant
        subnet: tenant_subnet
    network_config:
      template: /home/stack/templates/nic-configs/bonds_vlans.j2
      default_route_network:
        - external
- name: ComputeHCI
  count: 3
  hostname_format: computehci-%index%
  defaults:
    profile: computeHCI
    networks:
      - network: internal_api
        subnet: internal_api_subnet
      - network: tenant
        subnet: tenant_subnet
      - network: storage
        subnet: storage_subnet
      - network: storage_mgmt
        subnet: storage_mgmt_subnet
    network_config:
      template: /home/stack/templates/nic-configs/bonds_vlans.j2



but I still get errors, and this time I have nothing to go on.
Error:
<10.100.4.7> SSH: EXEC ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -o ControlMaster=auto -o ControlPersist=30m -o ServerAliveInterval=64 -o ServerAliveCountMax=[55/1822]
ompression=no -o TCPKeepAlive=yes -o VerifyHostKeyDNS=no -o ForwardX11=no -o ForwardAgent=yes -o PreferredAuthentications=publickey -T -o StrictHostKeyChecking=no -o KbdInteractiveAuthentic
ation=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o 'User="heat-admin"' -o ConnectTimeout=30 -o ControlPath=/home/stack/.an
sible/cp/ca32d1049e 10.100.4.7 '/bin/sh -c '"'"'rm -f -r /home/heat-admin/.ansible/tmp/ansible-tmp-1629824478.3189409-228106-213914436967091/ > /dev/null 2>&1 && sleep 0'"'"''
2021-08-24 18:01:18.371725 | 52540075-9baf-8232-d4fa-0000000000a0 |      FATAL | Render network_config from template | computehci-1 | error={
    "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result",
    "changed": false
}
2021-08-24 18:01:18.372773 | 52540075-9baf-8232-d4fa-0000000000a0 |     TIMING | tripleo_network_config : Render network_config from template | computehci-1 | 0:00:45.837722 | 0.19s      
2021-08-24 18:01:18.373749 | 52540075-9baf-8232-d4fa-0000000000a0 |      FATAL | Render network_config from template | computehci-0 | error={
    "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result",
    "changed": false
}

2021-08-24 18:01:18.374225 | 52540075-9baf-8232-d4fa-0000000000a0 |     TIMING | tripleo_network_config : Render network_config from template | computehci-0 | 0:00:45.839181 | 0.20s      
<10.100.4.13> rc=0, stdout and stderr censored due to no log
<10.100.4.10> rc=0, stdout and stderr censored due to no log
<10.100.4.23> rc=0, stdout and stderr censored due to no log
2021-08-24 18:01:18.385393 | 52540075-9baf-8232-d4fa-0000000000a0 |      FATAL | Render network_config from template | controller-0 | error={
    "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result",
    "changed": false
}
2021-08-24 18:01:18.385962 | 52540075-9baf-8232-d4fa-0000000000a0 |     TIMING | tripleo_network_config : Render network_config from template | controller-0 | 0:00:45.850915 | 0.19s      
2021-08-24 18:01:18.387075 | 52540075-9baf-8232-d4fa-0000000000a0 |      FATAL | Render network_config from template | controller-1 | error={                                              
    "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result",                                                                            
    "changed": false
}
2021-08-24 18:01:18.387597 | 52540075-9baf-8232-d4fa-0000000000a0 |     TIMING | tripleo_network_config : Render network_config from template | controller-1 | 0:00:45.852553 | 0.18s      
2021-08-24 18:01:18.388389 | 52540075-9baf-8232-d4fa-0000000000a0 |      FATAL | Render network_config from template | computehci-2 | error={                                              
    "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result",                                                                            
    "changed": false
}
2021-08-24 18:01:18.388902 | 52540075-9baf-8232-d4fa-0000000000a0 |     TIMING | tripleo_network_config : Render network_config from template | computehci-2 | 0:00:45.853857 | 0.20s
<10.100.4.7> rc=0, stdout and stderr censored due to no log
2021-08-24 18:01:18.399921 | 52540075-9baf-8232-d4fa-0000000000a0 |      FATAL | Render network_config from template | controller-2 | error={
    "censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result",
    "changed": false

Any ideas?
 

On Tue, Aug 24, 2021 at 3:28 PM Sandeep Yadav <sandeepggn93@gmail.com> wrote:
>> Could you please change the subnet names to be the same for your Controller and ComputeHCI roles, say internal_api.

typo: Could you please change the subnet names to be the same for your Controller and ComputeHCI roles, say internal_api_subnet.


On Tue, Aug 24, 2021 at 7:48 PM Sandeep Yadav <sandeepggn93@gmail.com> wrote:
Hello,

To me it looks like the example shared in the documentation[1] is for a leaf-spine architecture.

Currently, you have a different set of subnets under your Controller and ComputeHCI roles.

Taking internal_api as a reference from your baremetal_deployment.yaml:
~~~
     - network: internal_api
       subnet: internal_api_subnet01 >>>
.
.      
     - network: internal_api
       subnet: internal_api_subnet02 >>>>
~~~

If a leaf-spine architecture is not what you want, could you please change the subnet names to be the same for your Controller and ComputeHCI roles, say internal_api.

Also, I am assuming you are following documentation [2]. For the "openstack overcloud network provision" command, also make sure your network/subnet names in network_data.yaml (sample ref [3]) are consistent with what you wish to do in baremetal_deployment.yaml.

[1] https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/network-data-samples/default-network-isolation.yaml
[2] https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/deployment/network_v2.html#provision-baremetal-instances
[3] https://opendev.org/openstack/tripleo-heat-templates/src/branch/master/network-data-samples/default-network-isolation.yaml
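As a concrete illustration of that consistency (the VLAN ID and CIDR values below are placeholders, not values from this thread), a Wallaby-style network_data.yaml entry matching the subnet name internal_api_subnet would look roughly like:

```yaml
- name: InternalApi
  name_lower: internal_api
  vip: true
  subnets:
    internal_api_subnet:
      vlan: 20
      ip_subnet: 172.16.2.0/24
      allocation_pools:
        - start: 172.16.2.4
          end: 172.16.2.250
```

The subnet key here (internal_api_subnet) is what the subnet: field in baremetal_deployment.yaml must reference.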

On Tue, Aug 24, 2021 at 5:20 PM wodel youchi <wodel.youchi@gmail.com> wrote:
Hi again,

Here is the error I am getting when trying to generate the overcloud-baremetal-deployed.yaml file:
CMD: openstack overcloud node provision --stack overcloud --network-config --output ~/templates/overcloud-baremetal-deployed.yaml ~/templates/baremetal_deployment.yaml

Error:
The full traceback is:                                                                                                                                                        
  File "/tmp/ansible_tripleo_overcloud_network_ports_payload_xszb9ooz/ansible_tripleo_overcloud_network_ports_payload.zip/ansible/modules/tripleo_overcloud_network_ports.py",
line 601, in run_module                                                                                                      
  File "/tmp/ansible_tripleo_overcloud_network_ports_payload_xszb9ooz/ansible_tripleo_overcloud_network_ports_payload.zip/ansible/modules/tripleo_overcloud_network_ports.py",
line 494, in manage_instances_ports
  File "/usr/lib64/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/tmp/ansible_tripleo_overcloud_network_ports_payload_xszb9ooz/ansible_tripleo_overcloud_network_ports_payload.zip/ansible/modules/tripleo_overcloud_network_ports.py",
line 385, in _provision_ports
  File "/tmp/ansible_tripleo_overcloud_network_ports_payload_xszb9ooz/ansible_tripleo_overcloud_network_ports_payload.zip/ansible/modules/tripleo_overcloud_network_ports.py",
line 319, in generate_port_defs

    ],
                        "template": "/home/stack/templates/nic-configs/bonds_vlans.j2"
                    },
                    "networks": [
                        {
                            "network": "external",
                            "subnet": "external_subnet"
                        },
                        {
                            "network": "internal_api",
                            "subnet": "internal_api_subnet01"
                        },
                        {
                            "network": "storage",
                            "subnet": "storage_subnet01"
                        },
                        {
                            "network": "storage_mgmt",
                            "subnet": "storage_mgmt_subnet01"
                        },
                        {                                                                                                                                            [215/1899]
                            "network": "tenant",
                            "subnet": "tenant_subnet01"
                        },
                        {
                            "network": "ctlplane",
                            "vif": true
                        }
                    ],
                    "nics": [
                        {
                            "network": "ctlplane"
                        }
                    ],
                    "ssh_public_keys": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDdFv9qwUs3x6egY5Xke3gh2O8CnXTJ2h2jRpWYEFzL1fyZrMKykMBUEfbkQGYzONsE29/BpS265Df4RgZB3eHx4KWcaskSwjl
DaUzxP0ZsSl2MzxtDIqE3UTrsmivNGx0ungcTorOc96V9daqU/Vu2HU8J+YEA6+OjddPX1ngz/w== root@undercloud.umaitek.dz ",
                    "user_name": "heat-admin"
                },
                {
                    "capabilities": {
                        "profile": "computeHCI"
                    },
                    "config_drive": {
                        "meta_data": {
                            "instance-type": "ComputeHCI"
                        }
                    },
                    "hostname": "computehci-0",
                    "image": {
                        "href": "file:///var/lib/ironic/images/overcloud-full.raw",
                        "kernel": "file:///var/lib/ironic/images/overcloud-full.vmlinuz",
                        "ramdisk": "file:///var/lib/ironic/images/overcloud-full.initrd"
                    },
                    "network_config": {
                        "template": "/home/stack/templates/nic-configs/bonds_vlans.j2"
                    },
                    "networks": [
                            "network": "internal_api",
                            "subnet": "internal_api_subnet02"
                        },
                        {
                            "network": "tenant",
                            "subnet": "tenant_subnet02"
                        },
                        {
                            "network": "storage",
                            "subnet": "storage_subnet02"
                        },
                        {
                            "network": "storage_mgmt",
                            "subnet": "storage_mgmt_subnet02"
                   2021-08-24 10:21:18.492374 | 52540075-9baf-0191-8598-000000000019 |      FATAL | Provision instance network ports | localhost | error={
    "changed": true,
    "error": "'internal_api_subnet02'",
    "invocation": {
        "module_args": {
            "api_timeout": null,
            "auth": null,
            "auth_type": null,
            "availability_zone": null,
            "ca_cert": null,
            "client_cert": null,
            "client_key": null,
                        "concurrency": 2,
            "hostname_role_map": {
                "computehci-0": "ComputeHCI",
                "controller-0": "Controller"
            },
...
...
...
            "provisioned_instances": [                                                                                                                                [38/1899]
                {
                    "hostname": "controller-0",
                    "id": "1dff400f-0dd1-4eb0-b4c1-84397d387a4a",
                    "name": "controller0"
                },
                {
                    "hostname": "computehci-0",
                    "id": "3d6c399f-53b7-472b-b784-67193a485e43",
                    "name": "computeHCI0"
                }
            ],
            "region_name": null,
            "stack_name": "overcloud",
            "state": "present",
            "timeout": 180,
            "validate_certs": null,
            "wait": true
        }
    },
    "msg": "Error managing network ports 'internal_api_subnet02'",
    "node_port_map": {},
    "success": false
}

2021-08-24 10:21:18.494473 | 52540075-9baf-0191-8598-000000000019 |     TIMING | Provision instance network ports | localhost | 0:04:58.315416 | 3.72s                        

NO MORE HOSTS LEFT *************************************************************

PLAY RECAP *********************************************************************
localhost                  : ok=10   changed=3    unreachable=0    failed=1    skipped=5    rescued=0    ignored=0
2021-08-24 10:21:18.498948 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 10:21:18.499338 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Total Tasks: 16         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 10:21:18.499755 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Elapsed Time: 0:04:58.320717 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 10:21:18.500105 |                                 UUID |       Info |       Host |   Task Name |   Run Time
2021-08-24 10:21:18.500449 | 52540075-9baf-0191-8598-000000000017 |    SUMMARY |  localhost | Provision instances | 285.25s
2021-08-24 10:21:18.500868 | 52540075-9baf-0191-8598-000000000014 |    SUMMARY |  localhost | Reserve instances | 6.08s
2021-08-24 10:21:18.501228 | 52540075-9baf-0191-8598-000000000019 |    SUMMARY |  localhost | Provision instance network ports | 3.72s
2021-08-24 10:21:18.501588 | 52540075-9baf-0191-8598-000000000013 |    SUMMARY |  localhost | Find existing instances | 1.52s
2021-08-24 10:21:18.501944 | 52540075-9baf-0191-8598-000000000012 |    SUMMARY |  localhost | Expand roles | 0.92s
2021-08-24 10:21:18.502281 | 52540075-9baf-0191-8598-00000000000c |    SUMMARY |  localhost | stat overcloud-full.raw | 0.26s
2021-08-24 10:21:18.502706 | 52540075-9baf-0191-8598-00000000000d |    SUMMARY |  localhost | stat overcloud-full.initrd | 0.19s
2021-08-24 10:21:18.503053 | 52540075-9baf-0191-8598-00000000000e |    SUMMARY |  localhost | Set file based default image | 0.04s
2021-08-24 10:21:18.503419 | 52540075-9baf-0191-8598-000000000018 |    SUMMARY |  localhost | Metalsmith log for provision instances | 0.04s
2021-08-24 10:21:18.503806 | 52540075-9baf-0191-8598-000000000016 |    SUMMARY |  localhost | Set concurrency fact | 0.04s
2021-08-24 10:21:18.504139 | 52540075-9baf-0191-8598-000000000015 |    SUMMARY |  localhost | Metalsmith log for reserve instances | 0.04s
2021-08-24 10:21:18.504469 | 52540075-9baf-0191-8598-00000000000f |    SUMMARY |  localhost | Set whole-disk file based default image | 0.03s
2021-08-24 10:21:18.504849 | 52540075-9baf-0191-8598-000000000010 |    SUMMARY |  localhost | Set glance based default image | 0.03s
2021-08-24 10:21:18.505246 | 52540075-9baf-0191-8598-000000000009 |    SUMMARY |  localhost | fail | 0.03s
2021-08-24 10:21:18.505627 | 52540075-9baf-0191-8598-000000000008 |    SUMMARY |  localhost | fail | 0.03s
2021-08-24 10:21:18.505987 | 52540075-9baf-0191-8598-00000000000a |    SUMMARY |  localhost | fail | 0.02s
2021-08-24 10:21:18.506315 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ End Summary Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 10:21:18.506693 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ State Information ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2021-08-24 10:21:18.507032 | ~~~~~~~~~~~~~~~~~~ Number of nodes which did not deploy successfully: 1 ~~~~~~~~~~~~~~~~~
2021-08-24 10:21:18.507351 |  The following node(s) had failures: localhost
2021-08-24 10:21:18.507720 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Temporary directory [ /tmp/tripleob9lxg9vi ] cleaned up
Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-provision.yaml, Run Status: failed, Return Code: 2
Temporary directory [ /tmp/tripleoyso22wsn ] cleaned up
Exception occured while running the command
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tripleoclient/command.py", line 34, in run
    super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/osc_lib/command/command.py", line 39, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3.6/site-packages/cliff/command.py", line 185, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3.6/site-packages/tripleoclient/v2/overcloud_node.py", line 323, in take_action
    extra_vars=extra_vars,
  File "/usr/lib/python3.6/site-packages/tripleoclient/utils.py", line 724, in run_ansible_playbook
    raise RuntimeError(err_msg)
RuntimeError: Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-provision.yaml, Run Status: failed, Return Code: 2
Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-provision.yaml, Run Status: failed, Return Code: 2
clean_up ProvisionNode: Ansible execution failed. playbook: /usr/share/ansible/tripleo-playbooks/cli-overcloud-node-provision.yaml, Run Status: failed, Return Code: 2
END return value: 1

Here is my baremetal_deployment.yaml file:
- name: Controller
  count: 1
  hostname_format: controller-%index%
  defaults:
    profile: control
    networks:
      - network: external
        subnet: external_subnet
      - network: internal_api
        subnet: internal_api_subnet01
      - network: storage
        subnet: storage_subnet01
      - network: storage_mgmt
        subnet: storage_mgmt_subnet01
      - network: tenant
        subnet: tenant_subnet01
    network_config:
      template: /home/stack/templates/nic-configs/bonds_vlans.j2
      default_route_network:
        - external
- name: ComputeHCI
  count: 1
  hostname_format: computehci-%index%
  defaults:
    profile: computeHCI
    networks:
      - network: internal_api
        subnet: internal_api_subnet02
      - network: tenant
        subnet: tenant_subnet02
      - network: storage
        subnet: storage_subnet02
      - network: storage_mgmt
        subnet: storage_mgmt_subnet02
    network_config:
      template: /home/stack/templates/nic-configs/bonds_vlans.j2


Could someone help me and point me to where to look?
Any help would be appreciated.

Regards.

On Mon, Aug 23, 2021 at 12:48 PM wodel youchi <wodel.youchi@gmail.com> wrote:
Hi,
I am trying to deploy OpenStack Wallaby.
I need some help understanding the "Baremetal Provision Configuration" file.
Here is the example given in the documentation:

- name: Controller
  count: 3
  defaults:
    networks:
    - network: ctlplane
      subnet: ctlplane-subnet
      vif: true
    - network: external
      subnet: external_subnet
    - network: internalapi
      subnet: internal_api_subnet01
    - network: storage
      subnet: storage_subnet01
    - network: storagemgmt
      subnet: storage_mgmt_subnet01
    - network: tenant
      subnet: tenant_subnet01
    network_config:
      template: /home/stack/nic-config/controller.j2
      default_route_network:
      - external
- name: Compute
  count: 100
  defaults:
    networks:
    - network: ctlplane
      subnet: ctlplane-subnet
      vif: true
    - network: internalapi
      subnet: internal_api_subnet02
    - network: tenant
      subnet: tenant_subnet02
    - network: storage
      subnet: storage_subnet02
    network_config:
      template: /home/stack/nic-config/compute.j2
- name: Controller
  count: 1
  hostname_format: controller-%index%
  ansible_playbooks:
    - playbook: bm-deploy-playbook.yaml
  defaults:
    profile: control
    networks:
      - network: external
        subnet: external_subnet
      - network: internal_api
        subnet: internal_api_subnet01
      - network: storage
        subnet: storage_subnet01
      - network: storage_mgmt
        subnet: storage_mgmt_subnet01
      - network: tenant
        subnet: tenant_subnet01
    network_config:
      template: templates/multiple_nics/multiple_nics_dvr.j2
      default_route_network:
        - external
- name: Compute
  count: 1
  hostname_format: compute-%index%
  ansible_playbooks:
    - playbook: bm-deploy-playbook.yaml
  defaults:
    profile: compute-leaf2
    networks:
      - network: internal_api
        subnet: internal_api_subnet02
      - network: tenant
        subnet: tenant_subnet02
      - network: storage
        subnet: storage_subnet02
    network_config:
      template: templates/multiple_nics/multiple_nics_dvr.j2

My questions:
1 - Does the network name have to match the network's name (name_lower) in network_data.yaml? I ask because an underscore is missing in the first example (internalapi vs internal_api).
2 - What is the meaning of the numbers in the subnet names, internal_api_subnet01 for controllers and internal_api_subnet02 for compute nodes? Why a different number?
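For reference, the routed (spine-leaf) network_data samples define several subnets under one network, which seems to be where these numbered names come from; a sketch of that structure, with VLAN IDs and CIDRs invented for illustration:

```yaml
- name: InternalApi
  name_lower: internal_api
  subnets:
    internal_api_subnet01:   # e.g. the leaf where the controllers live
      vlan: 20
      ip_subnet: 172.16.2.0/24
    internal_api_subnet02:   # e.g. the leaf where the compute nodes live
      vlan: 21
      ip_subnet: 172.16.3.0/24
```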

I have tried to create the overcloud-baremetal-deployed.yaml file several times, and every time I get errors.

Regards.