[ironic] [infra] Making Glean work with IPA for static IP assignment

Dmitry Tantsur dtantsur at redhat.com
Thu Mar 18 11:18:25 UTC 2021


Ian, Jay,


On Thu, Mar 18, 2021 at 6:12 AM Ian Wienand <iwienand at redhat.com> wrote:

> On Wed, Mar 17, 2021 at 04:52:10PM +0100, Dmitry Tantsur wrote:
> > [   63.613821] NetworkManager[244]: <info>  [1615995259.7778]
> NetworkManager (version 1.26.0-12.el8_3) is starting... (for the first time)
> > [   71.637264] systemd[1]: Starting Glean for interface enp1s0 with
>
> > Any ideas?
>
> That seems to say that the NetworkManager daemon is starting before
> glean.sh.
>
> My NetworkManager /usr/lib/systemd/system/NetworkManager.service has
>
>   [Unit]
>   Description=Network Manager
>   Documentation=man:NetworkManager(8)
>   Wants=network.target
>   After=network-pre.target dbus.service
>

I have this too.


>   Before=network.target network.service
>
> The glean service
>
> https://opendev.org/opendev/glean/src/branch/master/glean/init/glean@.service
> has
>
>  [Unit]
>  Description=Glean for interface %I
>  DefaultDependencies=no
>  Before=network-pre.target
>  Wants=network-pre.target
>  ...
>  [Service]
>  Type=oneshot
>
> It feels like we're really doing our best to tell NetworkManager to
> start after network-pre.target and glean to start before it.
>
> The service is "oneshot", doesn't exit until it is finished, and has
> no timeout, so I don't see how network-pre can become active before
> glean@.service finishes?
>
> Can you run with "debug" on the kernel command-line, to maybe see why
> it chose to start NM?  Can you dump "systemd-analyze" plot maybe?  I
> know we looked at the dependency chain previously and it seemed OK ...
>

I think systemd ordering is of no use here. What I suspect is happening is
that NetworkManager begins starting before udev inserts the glean-nm@
services.

The issue with network-pre is similar. It does not finish before glean-nm@
starts, but it does finish long after NetworkManager. The explanation I can
come up with is the following: network-pre is a passive target, so it does
not fire until something requests it. glean-nm@ requests it with
Wants=network-pre, but at this point NetworkManager is already starting, so
its After=network-pre (without Wants, as intended) has no effect.

This is pure speculation at this point, but it's all I have.

What I'm considering now to fix Glean is an additional systemd service that
will start glean without arguments (i.e. for all interfaces that are
already up) very early, maybe explicitly Before=NetworkManager. Since it
will be a normal service, not one inserted by udev, the ordering will work
correctly.
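
To make the idea concrete, here is a rough sketch of what such a unit could
look like. Nothing here is tested: the unit name (glean-early.service), the
path to glean.sh, the After=local-fs.target line and the [Install] target
are all my assumptions, not something that exists in Glean today.

```ini
# glean-early.service (hypothetical name, sketch only)
[Unit]
Description=Glean for all interfaces that are already up
DefaultDependencies=no
# Assumption: the script needs local filesystems to be mounted.
After=local-fs.target
# Regular (non-udev) unit, so this ordering is part of the boot transaction.
Before=network-pre.target NetworkManager.service
Wants=network-pre.target

[Service]
Type=oneshot
# Assumption: glean.sh is installed here; no arguments means "all interfaces".
ExecStart=/usr/local/bin/glean.sh
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
```

Because the unit is enabled statically, systemd would know about it when it
computes the initial transaction, which is exactly what the udev-inserted
instances are missing.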


>
> As you've seen with
>
>  https://review.opendev.org/c/opendev/glean/+/781133
>  https://review.opendev.org/c/opendev/glean/+/781174
>
> there are certainly ways we can optimise glean more.  But I really
> would have thought these would just slow down the boot, not cause
> ordering issues...
>

Oh, and another thing: Glean has a lock that is interface-agnostic (i.e.
global), which means that while it is processing the loopback interface, it
cannot be processing real interfaces. This forced serialization may
contribute to the slowness.
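
One possible way out would be a lock per interface instead of a global one.
The following is only a minimal sketch of that idea using flock; it is not
how Glean implements its lock, and the interface_lock helper, the lock
directory argument and the lock file naming are all made up for
illustration.

```python
import contextlib
import fcntl
import os


@contextlib.contextmanager
def interface_lock(interface, lock_dir, blocking=True):
    """Hold an exclusive lock that is specific to one interface.

    Hypothetical helper: the lock file name below is an assumption,
    not Glean's actual lock path.
    """
    path = os.path.join(lock_dir, "glean-%s.lock" % interface)
    flags = fcntl.LOCK_EX
    if not blocking:
        # Raise BlockingIOError instead of waiting for the lock.
        flags |= fcntl.LOCK_NB
    with open(path, "w") as lock_file:
        fcntl.flock(lock_file, flags)
        try:
            yield path
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)
```

With this, a process configuring "lo" would only block another process that
is also configuring "lo", while "enp1s0" could be handled in parallel.
Whether that is safe depends on what the global lock protects today, which
I have not checked.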

In the end, we may go down a different path in ironic-python-agent since we
may not really want Glean by default, only when configdrive is present. But
fixing Glean would be nice anyway.

Dmitry


>
> -i
>
>

-- 
Red Hat GmbH, https://de.redhat.com/ , Registered seat: Grasbrunn,
Commercial register: Amtsgericht Muenchen, HRB 153243,
Managing Directors: Charles Cachera, Brian Klemm, Laurie Krebs, Michael
O'Neill