Hi,

Getting back to this, sorry for the delay. Yes, I'm pretty sure it's NetworkManager, not something else. Here are relevant parts of boot logs from a recent runs:

[   63.613821] NetworkManager[244]: <info>  [1615995259.7778] NetworkManager (version 1.26.0-12.el8_3) is starting... (for the first time)
[   71.637264] systemd[1]: Starting Glean for interface enp1s0 with NetworkManager...
         Starting Glean for interface enp1s0 with NetworkManager...
[   77.622901] glean.sh[327]: mount: /mnt/config: /dev/sr0 already mounted on /mnt/config.

!!! As you see, Glean starts quite early, but then... !!!

[   92.699494] NetworkManager[244]: <info>  [1615995288.9848] manager: (lo): new Generic device (/org/freedesktop/NetworkManager/Devices/1)
[   93.040232] NetworkManager[244]: <info>  [1615995289.3256] manager: (enp1s0): new Ethernet device (/org/freedesktop/NetworkManager/Devices/2)
[   94.434450] NetworkManager[244]: <info>  [1615995290.7198] device (enp1s0): state change: unmanaged -> unavailable (reason 'managed', sys-iface-state: 'external')
[   94.713545] NetworkManager[244]: <info>  [1615995290.9986] device (enp1s0): carrier: link connected
[   96.487825] NetworkManager[244]: <info>  [1615995292.7699] device (enp1s0): state change: unavailable -> disconnected (reason 'none', sys-iface-state: 'managed')
[   96.712608] NetworkManager[244]: <info>  [1615995292.9979] policy: auto-activating connection 'Wired connection 1' (cabef811-9cf9-3d92-9391-95712a3d3481)

!!! This auto-activation triggers DHCP !!!

[   97.789768] NetworkManager[244]: <info>  [1615995294.0750] device (enp1s0): state change: config -> ip-config (reason 'none', sys-iface-state: 'managed')
[   98.084735] NetworkManager[244]: <info>  [1615995294.3699] dhcp4 (enp1s0): activation: beginning transaction (timeout in 30 seconds)
[   98.303574] NetworkManager[244]: <info>  [1615995294.5883] dhcp4 (enp1s0): dhclient started with pid 382
[  108.882870] NetworkManager[244]: <info>  [1615995305.0369] dhcp4 (enp1s0):   address 192.168.122.105

!!! 10 seconds later we have the IP address configured !!!

[  126.636082] glean.sh[326]: DEBUG:glean:Starting glean
[  127.885587] glean.sh[326]: DEBUG:glean:Only considering interface enp1s0 from arguments
[  127.908001] glean.sh[326]: DEBUG:glean:Interface matched: enp1s0 (52:54:00:9e:b1:16)
[  127.920045] glean.sh[326]: DEBUG:glean:52:54:00:9e:b1:16 configured via config-drive
[  128.635484] systemd[1]: Started Glean for interface enp1s0 with NetworkManager.

!!! 20 seconds later (it's a nested VM, everything is slow) glean actually kicks in !!!

[  130.752564] systemd[1]: Reached target Network is Online.

At this point the IP address is from DHCP, not from Glean.

Any ideas?

Dmitry

On Thu, Nov 26, 2020 at 2:20 AM Ian Wienand <iwienand@redhat.com> wrote:
On Wed, Nov 25, 2020 at 11:54:13AM +0100, Dmitry Tantsur wrote:
>
> # systemd-analyze critical-chain
> multi-user.target @2min 6.301s
> └─tuned.service @1min 32.273s +34.024s
>   └─network.target @1min 31.590s
>     └─network-pre.target @1min 31.579s
>       └─glean@enp1s0.service @36.594s +54.952s
>         └─system-glean.slice @36.493s
>           └─system.slice @4.083s
>             └─-.slice @4.080s
>
> # systemd-analyze critical-chain NetworkManager.service
> NetworkManager.service +9.287s
> └─network-pre.target @1min 31.579s
>   └─glean@enp1s0.service @36.594s +54.952s
>     └─system-glean.slice @36.493s
>       └─system.slice @4.083s
>         └─-.slice @4.080s

> It seems that the ordering is correct and the interface service is
> executed, but the IP address is nonetheless wrong.

I agree, this seems to say to me that NetworkManager should run after
network.pre-target, and glean@enp1s0 should be running before it.

The glean@enp1s0.service is set as oneshot [1] which should prevent
network-pre.target being reached until it exits:

 oneshot ... [the] service manager will consider the unit up after the
 main process exits. It will then start follow-up units.

To the best of my knowledge the dependencies are correct; but if you
go through the "git log" of the project you can find some history of
us thinking ordering was correct and finding issues.

> Can it be related to how long glean takes to run in my case (54 seconds vs
> 1 second in your case)?

The glean script doesn't run asynchronously in any way (at least not
on purpose!).  I can't see any way it could exit before the ifcfg file
is written out.

> # cat /etc/sysconfig/network-scripts/ifcfg-enp1s0
...

The way NM support works is writing out this file which is read by the
NM ifcfg-rh plugin [2].  AFAIK that's built-in to NM so would not be
missing, and I think you'd have to go to effort to manually edit
/etc/NetworkManager/conf.d/99-main-plugins.conf to have it ignored.

I'm afraid that's overall not much help.  Are you sure there isn't an
errant dhclient running somehow that grabs a different address?  Does
it get the correct address on reboot; implying the ifcfg- file is read
correctly but somehow isn't in place before NetworkManager starts?

-i

[1] https://opendev.org/opendev/glean/src/branch/master/glean/init/glean-nm@.service#L13
[2] https://developer.gnome.org/NetworkManager/stable/nm-settings-ifcfg-rh.html



--
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael O'Neill