[Kolla][Kolla-Ansible] Ironic Node Cleaning Failed

Anirudh Gupta anyrude10 at gmail.com
Tue Aug 3 15:02:47 UTC 2021


Hi Dmitry,

I might be wrong, but as per my understanding, if there were an issue with
dnsmasq, the IP 20.20.20.10 would not have been assigned to the machine.
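
To make this concrete, here is a minimal Python sketch. It is not taken from
the deployment itself, and the helper names (in_range, ranges_overlap,
classify) are made up for illustration. It checks which pool an address
falls into, using the two ranges quoted further down in this thread.

```python
import ipaddress

# Values copied from this thread.
INSPECTION_RANGE = ("20.20.20.10", "20.20.20.100")  # ironic_dnsmasq_dhcp_range
CLEANING_POOL = ("20.20.20.150", "20.20.20.200")    # subnet1 allocation pool on public1
ORIGINAL_POOL = ("20.20.20.10", "20.20.20.100")     # first subnet1 allocation pool


def in_range(ip, start, end):
    """Return True if ip lies in the inclusive range start..end."""
    addr = ipaddress.ip_address(ip)
    return ipaddress.ip_address(start) <= addr <= ipaddress.ip_address(end)


def ranges_overlap(a, b):
    """Return True if two inclusive (start, end) ranges share any address."""
    a_start, a_end = (ipaddress.ip_address(x) for x in a)
    b_start, b_end = (ipaddress.ip_address(x) for x in b)
    return a_start <= b_end and b_start <= a_end


def classify(ip):
    """Name the pool (hypothetical labels) that an address belongs to."""
    if in_range(ip, *INSPECTION_RANGE):
        return "inspection"
    if in_range(ip, *CLEANING_POOL):
        return "cleaning"
    return "unknown"


if __name__ == "__main__":
    print(classify("20.20.20.10"))
    print(ranges_overlap(INSPECTION_RANGE, CLEANING_POOL))
    print(ranges_overlap(INSPECTION_RANGE, ORIGINAL_POOL))
```

With these values, 20.20.20.10 falls in the ironic_dnsmasq_dhcp_range pool and
not in the cleaning pool, and the two current pools do not overlap.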

The tcpdump output is below:

20:16:58.938089 IP controller.bootps > 255.255.255.255.bootpc: BOOTP/DHCP,
Reply, length 312
20:17:02.765291 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from 98:f2:b3:3f:72:e5 (oui Unknown), length 359
20:17:02.766303 IP controller.bootps > 255.255.255.255.bootpc: BOOTP/DHCP,
Reply, length 312
20:17:26.944378 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from 98:f2:b3:3f:72:e5 (oui Unknown), length 347
20:17:26.944756 IP controller.bootps > 255.255.255.255.bootpc: BOOTP/DHCP,
Reply, length 312
20:17:30.763627 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from 98:f2:b3:3f:72:e5 (oui Unknown), length 359
20:17:30.764620 IP controller.bootps > 255.255.255.255.bootpc: BOOTP/DHCP,
Reply, length 312
20:17:54.938791 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from 98:f2:b3:3f:72:e5 (oui Unknown), length 347
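
For counting requests and replies in a capture like this, a rough Python
sketch follows. It assumes the default tcpdump text format shown above and
only looks at the "Request from <MAC>" and "Reply" markers. The function
names are hypothetical. It also re-joins records that a mail client has
wrapped onto a second line.

```python
import re
from collections import Counter

# A record starts with a tcpdump timestamp such as "20:16:58.938089 ".
TIMESTAMP = re.compile(r"^\d{2}:\d{2}:\d{2}\.\d+ ")
REQUEST = re.compile(r"Request from ([0-9a-f:]{17})")

# First four records of the capture, wrapped as in the mail.
SAMPLE = """\
20:16:58.938089 IP controller.bootps > 255.255.255.255.bootpc: BOOTP/DHCP,
Reply, length 312
20:17:02.765291 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from 98:f2:b3:3f:72:e5 (oui Unknown), length 359
20:17:02.766303 IP controller.bootps > 255.255.255.255.bootpc: BOOTP/DHCP,
Reply, length 312
20:17:26.944378 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP,
Request from 98:f2:b3:3f:72:e5 (oui Unknown), length 347
"""


def join_wrapped(text):
    """Re-join records that were wrapped onto a continuation line."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if TIMESTAMP.match(line) or not records:
            records.append(line)
        else:
            records[-1] += " " + line
    return records


def summarize(text):
    """Return (requests per client MAC, number of server replies)."""
    requests = Counter()
    replies = 0
    for record in join_wrapped(text):
        match = REQUEST.search(record)
        if match:
            requests[match.group(1)] += 1
        elif "BOOTP/DHCP, Reply" in record:
            replies += 1
    return requests, replies


if __name__ == "__main__":
    requests, replies = summarize(SAMPLE)
    print(dict(requests), replies)
```

A client MAC that keeps sending requests while the controller keeps replying
is the pattern to look for when working out where the exchange stalls.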

The neutron dnsmasq logs and ironic-inspector logs are also attached to this
mail.

Regards
Anirudh Gupta


On Tue, Aug 3, 2021 at 7:29 PM Dmitry Tantsur <dtantsur at redhat.com> wrote:

> Hi,
>
> You need to check the dnsmasq logs (there are two dnsmasqs: from neutron
> and from ironic-inspector). tcpdump may also help to determine where the
> packets are lost.
>
> Dmitry
>
> On Fri, Jul 30, 2021 at 10:29 PM Anirudh Gupta <anyrude10 at gmail.com>
> wrote:
>
>> Hi Dmitry
>>
>> Thanks for your time.
>>
>> My system is getting IP 20.20.20.10 which is in the range defined in
>> ironic_dnsmasq_dhcp_range field under globals.yml file.
>>
>> ironic_dnsmasq_dhcp_range: "20.20.20.10,20.20.20.100"
>>
>> And in the cleaning network (public1), the range defined is
>> 20.20.20.150-20.20.20.200
>>
>> As per my understanding, these 2 ranges should be mutually exclusive.
>>
>> Please suggest if my understanding is not correct.
>>
>> Any suggestions on what I should do to resolve this issue?
>>
>> Regards
>> Anirudh Gupta
>>
>>
>> On Sat, 31 Jul, 2021, 12:06 am Dmitry Tantsur, <dtantsur at redhat.com>
>> wrote:
>>
>>>
>>>
>>> On Thu, Jul 29, 2021 at 6:05 PM Anirudh Gupta <anyrude10 at gmail.com>
>>> wrote:
>>>
>>>> Hi Team,
>>>>
>>>> Further to the email below, I have some updated information:
>>>>
>>>> Earlier, the allocation range set in "*ironic_dnsmasq_dhcp_range*"
>>>> in globals.yml overlapped with the cleaning network's range, which
>>>> caused a problem receiving the DHCP request.
>>>>
>>>> After creating a cleaning network with a separate allocation range, an
>>>> IP is now successfully allocated to my bare metal node:
>>>>
>>>>    - openstack subnet create subnet1 --network public1 --subnet-range
>>>>    20.20.20.0/24 --allocation-pool start=20.20.20.150,end=20.20.20.200
>>>>    --ip-version=4  --gateway=20.20.20.1 --dhcp
>>>>
>>>>
>>>> [image: image.png]
>>>>
>>>> After getting the IP, there is no further action on the node. From "
>>>> *clean_wait*", it goes into "*clean_failed*" state after around half
>>>> an hour.
>>>>
>>>
>>> The IP address is not from the cleaning range; it may come from
>>> inspection. You probably need to investigate your network topology, maybe
>>> use tcpdump.
>>>
>>> Unfortunately, I'm not fluent enough in Kolla to say whether it is a bug or not.
>>>
>>> Dmitry
>>>
>>>
>>>>
>>>> On verifying the logs, I could see the below error messages
>>>>
>>>>
>>>>    - In */var/log/kolla/ironic/ironic-conductor.log*, we observed the
>>>>    following error:
>>>>
>>>> ERROR ironic.conductor.utils [-] Cleaning for node
>>>> 3a56748e-a8ca-4dec-a332-ace18e6d494e failed. *Timeout reached while
>>>> cleaning the node. Please check if the ramdisk responsible for the cleaning
>>>> is running on the node. Failed on step {}.*
>>>>
>>>>
>>>> Note : For Cleaning the node, we have used the below images
>>>>
>>>>
>>>>
>>>> https://tarballs.openstack.org/ironic-python-agent/dib/files/ipa-centos8-master.kernel
>>>>
>>>>
>>>> https://tarballs.openstack.org/ironic-python-agent/dib/files/ipa-centos8-master.initramfs
>>>>
>>>>
>>>>    - In /var/log/kolla/nova/nova-compute-ironic.log, we observed the
>>>>    error
>>>>
>>>> ERROR nova.compute.manager [req-810ffedf-3343-471c-94db-85411984e6cc -
>>>> - - - -] No compute node record for host controller-ironic:
>>>> nova.exception_Remote.ComputeHostNotFound_Remote: Compute host
>>>> controller-ironic could not be found.
>>>>
>>>>
>>>> Can someone please help in this regard?
>>>>
>>>> Regards
>>>> Anirudh Gupta
>>>>
>>>>
>>>> On Tue, Jul 27, 2021 at 12:52 PM Anirudh Gupta <anyrude10 at gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Team,
>>>>>
>>>>> We have deployed 2 node kolla ansible *12.0.0* in order to deploy
>>>>> openstack *wallaby* release. We have also enabled ironic in order to
>>>>> provision the bare metal nodes.
>>>>>
>>>>> On each server we have 3 nics
>>>>>
>>>>>    - *eno1* - OAM for external connectivity and endpoint's publicURL
>>>>>    - *eno2* - Mgmt for internal communication between various
>>>>>    openstack services.
>>>>>    - *ens2f0* - Data Interface
>>>>>
>>>>>
>>>>> Corresponding to this we have defined the following fields in
>>>>> globals.yml
>>>>>
>>>>>
>>>>>    - kolla_base_distro: "centos"
>>>>>    - kolla_install_type: "source"
>>>>>    - openstack_release: "wallaby"
>>>>>    - network_interface: "eno2"                   # MGMT interface
>>>>>    - kolla_external_vip_interface: "eno1"        # OAM interface
>>>>>    - kolla_internal_vip_address: "192.168.10.3"  # MGMT subnet free IP
>>>>>    - kolla_external_vip_address: "10.0.1.136"    # OAM subnet free IP
>>>>>    - neutron_external_interface: "ens2f0"        # Data interface
>>>>>    - enable_neutron_provider_networks: "yes"
>>>>>
>>>>> Note: Only relevant fields are being shown in this query
>>>>>
>>>>> Also, for ironic, the following fields have been defined in globals.yml
>>>>>
>>>>>    - enable_ironic: "yes"
>>>>>    - enable_ironic_neutron_agent: "{{ enable_neutron | bool and
>>>>>    enable_ironic | bool }}"
>>>>>    - enable_horizon_ironic: "{{ enable_ironic | bool }}"
>>>>>    - ironic_dnsmasq_interface: "ens2f0"          # Data interface
>>>>>    - ironic_dnsmasq_dhcp_range: "20.20.20.10,20.20.20.100"
>>>>>    - ironic_dnsmasq_boot_file: "pxelinux.0"
>>>>>    - ironic_cleaning_network: "public1"
>>>>>    - ironic_dnsmasq_default_gateway: "20.20.20.1"
>>>>>
>>>>>
>>>>> After successful deployment, a flat provider network named
>>>>> public1 was created in OpenStack using the commands below:
>>>>>
>>>>>
>>>>>    - openstack network create public1 --provider-network-type flat
>>>>>    --provider-physical-network physnet1
>>>>>    - openstack subnet create subnet1 --network public1 --subnet-range
>>>>>    20.20.20.0/24 --allocation-pool start=20.20.20.10,end=20.20.20.100
>>>>>    --ip-version=4  --gateway=20.20.20.1 --dhcp
>>>>>
>>>>>
>>>>> Issue/Queries:
>>>>>
>>>>>
>>>>>    - Is the configuration done in globals.yml correct or is there
>>>>>    anything else that needs to be done in order to separate control and data
>>>>>    plane traffic?
>>>>>
>>>>>
>>>>>    - Also, I have set automated_cleaning to "true" in the ironic-conductor
>>>>>    container settings. After creating the baremetal node, we run the "node
>>>>>    manage" command, which succeeds. Running the "*openstack
>>>>>    baremetal node provide <node id>*" command powers on the machine and
>>>>>    sets the boot mode to Network Boot, but no DHCP request for that
>>>>>    particular MAC is seen on the controller. Is there anything I am missing
>>>>>    that needs to be done in order to make ironic work?
>>>>>
>>>>> Note: I have also verified that the nic is PXE enabled in system
>>>>> configuration setting
>>>>>
>>>>> Regards
>>>>> Anirudh Gupta
>>>>>
>>>>>
>>>>>
>>>
>>> --
>>> Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
>>> Commercial register: Amtsgericht Muenchen, HRB 153243,
>>> Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael
>>> O'Neill
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 38285 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20210803/e9475e90/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ironic-inspector.log
Type: application/octet-stream
Size: 80891 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20210803/e9475e90/attachment-0002.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: neutron_dnsmasq.log
Type: application/octet-stream
Size: 2720 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20210803/e9475e90/attachment-0003.obj>


More information about the openstack-discuss mailing list