[ironic] [infra] Making Glean work with IPA for static IP assignment

Dmitry Tantsur dtantsur at redhat.com
Wed Mar 17 15:52:10 UTC 2021


Hi,

Getting back to this, sorry for the delay. Yes, I'm pretty sure it's
NetworkManager, not something else. Here are relevant parts of boot logs
from a recent runs:

[   63.613821] NetworkManager[244]: <info>  [1615995259.7778]
NetworkManager (version 1.26.0-12.el8_3) is starting... (for the first time)
[   71.637264] systemd[1]: Starting Glean for interface enp1s0 with
NetworkManager...
         Starting Glean for interface enp1s0 with NetworkManager...
[   77.622901] glean.sh[327]: mount: /mnt/config: /dev/sr0 already mounted
on /mnt/config.

!!! As you see, Glean starts quite early, but then... !!!

[   92.699494] NetworkManager[244]: <info>  [1615995288.9848] manager:
(lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
[   93.040232] NetworkManager[244]: <info>  [1615995289.3256] manager:
(enp1s0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
[   94.434450] NetworkManager[244]: <info>  [1615995290.7198] device
(enp1s0): state change: unmanaged -> unavailable (reason 'managed',
sys-iface-state: 'external')
[   94.713545] NetworkManager[244]: <info>  [1615995290.9986] device
(enp1s0): carrier: link connected
[   96.487825] NetworkManager[244]: <info>  [1615995292.7699] device
(enp1s0): state change: unavailable -> disconnected (reason 'none',
sys-iface-state: 'managed')
[   96.712608] NetworkManager[244]: <info>  [1615995292.9979] policy:
auto-activating connection 'Wired connection 1'
(cabef811-9cf9-3d92-9391-95712a3d3481)

!!! This auto-activation triggers DHCP !!!

[   97.789768] NetworkManager[244]: <info>  [1615995294.0750] device
(enp1s0): state change: config -> ip-config (reason 'none',
sys-iface-state: 'managed')
[   98.084735] NetworkManager[244]: <info>  [1615995294.3699] dhcp4
(enp1s0): activation: beginning transaction (timeout in 30 seconds)
[   98.303574] NetworkManager[244]: <info>  [1615995294.5883] dhcp4
(enp1s0): dhclient started with pid 382
[  108.882870] NetworkManager[244]: <info>  [1615995305.0369] dhcp4
(enp1s0):   address 192.168.122.105

!!! 10 seconds later we have the IP address configured !!!

[  126.636082] glean.sh[326]: DEBUG:glean:Starting glean
[  127.885587] glean.sh[326]: DEBUG:glean:Only considering interface enp1s0
from arguments
[  127.908001] glean.sh[326]: DEBUG:glean:Interface matched: enp1s0
(52:54:00:9e:b1:16)
[  127.920045] glean.sh[326]: DEBUG:glean:52:54:00:9e:b1:16 configured via
config-drive
[  128.635484] systemd[1]: Started Glean for interface enp1s0 with
NetworkManager.

!!! 20 seconds later (it's a nested VM, everything is slow) glean actually
kicks in !!!

[  130.752564] systemd[1]: Reached target Network is Online.

At this point the IP address is from DHCP, not from Glean.

Any ideas?

Dmitry

On Thu, Nov 26, 2020 at 2:20 AM Ian Wienand <iwienand at redhat.com> wrote:

> On Wed, Nov 25, 2020 at 11:54:13AM +0100, Dmitry Tantsur wrote:
> >
> > # systemd-analyze critical-chain
> > multi-user.target @2min 6.301s
> > └─tuned.service @1min 32.273s +34.024s
> >   └─network.target @1min 31.590s
> >     └─network-pre.target @1min 31.579s
> >       └─glean at enp1s0.service @36.594s +54.952s
> >         └─system-glean.slice @36.493s
> >           └─system.slice @4.083s
> >             └─-.slice @4.080s
> >
> > # systemd-analyze critical-chain NetworkManager.service
> > NetworkManager.service +9.287s
> > └─network-pre.target @1min 31.579s
> >   └─glean at enp1s0.service @36.594s +54.952s
> >     └─system-glean.slice @36.493s
> >       └─system.slice @4.083s
> >         └─-.slice @4.080s
>
> > It seems that the ordering is correct and the interface service is
> > executed, but the IP address is nonetheless wrong.
>
> I agree, this seems to say to me that NetworkManager should run after
> network.pre-target, and glean at enp1s0 should be running before it.
>
> The glean at enp1s0.service is set as oneshot [1] which should prevent
> network-pre.target being reached until it exits:
>
>  oneshot ... [the] service manager will consider the unit up after the
>  main process exits. It will then start follow-up units.
>
> To the best of my knowledge the dependencies are correct; but if you
> go through the "git log" of the project you can find some history of
> us thinking ordering was correct and finding issues.
>
> > Can it be related to how long glean takes to run in my case (54 seconds
> vs
> > 1 second in your case)?
>
> The glean script doesn't run asynchronously in any way (at least not
> on purpose!).  I can't see any way it could exit before the ifcfg file
> is written out.
>
> > # cat /etc/sysconfig/network-scripts/ifcfg-enp1s0
> ...
>
> The way NM support works is writing out this file which is read by the
> NM ifcfg-rh plugin [2].  AFAIK that's built-in to NM so would not be
> missing, and I think you'd have to go to effort to manually edit
> /etc/NetworkManager/conf.d/99-main-plugins.conf to have it ignored.
>
> I'm afraid that's overall not much help.  Are you sure there isn't an
> errant dhclient running somehow that grabs a different address?  Does
> it get the correct address on reboot; implying the ifcfg- file is read
> correctly but somehow isn't in place before NetworkManager starts?
>
> -i
>
> [1]
> https://opendev.org/opendev/glean/src/branch/master/glean/init/glean-nm@.service#L13
> [2]
> https://developer.gnome.org/NetworkManager/stable/nm-settings-ifcfg-rh.html
>
>

-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael
O'Neill
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20210317/64b0d93c/attachment.html>


More information about the openstack-discuss mailing list