[neutron][openstack-ansible] Instances can only connect to provider-net via tenant-net but not directly

Oliver Wenz oliver.wenz at dhbw-mannheim.de
Wed Nov 25 13:01:45 UTC 2020

Starting an instance that is connected directly to a provider network
fails in my Ussuri cloud. However, when I connect an instance to a
tenant network instead, I can associate floating IPs from the same
provider network with it and everything works fine.
Neither nova-compute nor neutron-linuxbridge-agent logs any errors; the
only error I get is in the instance status:

'code': 500, 'created': '2020-11-25T12:05:42Z', 'message': 'Build of
instance 274e0a7d-fb33-430a-986e-74fceae6a36d aborted: Failed to
allocate the network(s), not rescheduling.', 'details': 'Traceback (most
recent call last):
line 6549, in _create_domain_and_network
  File "/usr/lib/python3.6/contextlib.py", line 88, in __exit__
line 513, in wait_for_instance_event
    actual_event = event.wait()
line 125, in wait
    result = hub.switch()
line 298, in switch
    return self.greenlet.switch()
eventlet.timeout.Timeout: 300 seconds

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
line 2378, in _build_and_run_instance
line 3683, in spawn
line 6572, in _create_domain_and_network
    raise exception.VirtualInterfaceCreateException()
nova.exception.VirtualInterfaceCreateException: Virtual Interface
creation failed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
line 2200, in _do_build_and_run_instance
    filter_properties, request_spec, accel_uuids)
line 2444, in _build_and_run_instance
nova.exception.BuildAbortException: Build of instance
274e0a7d-fb33-430a-986e-74fceae6a36d aborted: Failed to allocate the
network(s), not rescheduling.
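
For context on the traceback: the Timeout is raised while nova-compute
waits for an instance event, and the VirtualInterfaceCreateException
follows from it. Below is a minimal Python sketch of that wait-then-abort
pattern, assuming the awaited event is the "port plugged" notification
from Neutron. The function and class names are hypothetical stand-ins,
not Nova's actual code:

```python
import threading


class VirtualInterfaceCreateError(Exception):
    """Hypothetical stand-in for nova.exception.VirtualInterfaceCreateException."""


def wait_for_vif_plugged(event, timeout, fatal=True):
    """Wait for a (hypothetical) 'vif plugged' notification.

    event   -- threading.Event that is set once the port is reported as up
    timeout -- seconds to wait (300 in the traceback above)
    fatal   -- whether a timeout should abort the build
    """
    if event.wait(timeout):
        return True
    if fatal:
        # Mirrors the second exception in the traceback above.
        raise VirtualInterfaceCreateError("Virtual Interface creation failed")
    return False


if __name__ == "__main__":
    # Case 1: the notification arrives shortly after the wait starts.
    plugged = threading.Event()
    threading.Timer(0.1, plugged.set).start()
    print(wait_for_vif_plugged(plugged, timeout=1.0))

    # Case 2: the notification never arrives, so the build is aborted.
    try:
        wait_for_vif_plugged(threading.Event(), timeout=0.2)
    except VirtualInterfaceCreateError as exc:
        print("build aborted:", exc)
```

In nova.conf the corresponding settings are vif_plugging_timeout and
vif_plugging_is_fatal. A timeout like the one above suggests that the
notification never arrived, so the port on the provider network may be
worth checking first.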

Any ideas what could cause this?

Kind regards,

On 2020-11-25 13:00, openstack-discuss-request at lists.openstack.org wrote:
> Today's Topics:
>    1. Re:
>       [nova][tripleo][rpm-packaging][kolla][puppet][debian][osa] Nova
>       enforces that no DB credentials are allowed for the nova-compute
>       service (Balázs Gibizer)
>    2. Re: [ironic] [infra] Making Glean work with IPA for static IP
>       assignment (Dmitry Tantsur)
> ----------------------------------------------------------------------
> Message: 1
> Date: Wed, 25 Nov 2020 11:13:23 +0100
> From: Balázs Gibizer <balazs.gibizer at est.tech>
> To: Thomas Goirand <zigo at debian.org>
> Cc: openstack maillist <openstack-discuss at lists.openstack.org>
> Subject: Re:
> 	[nova][tripleo][rpm-packaging][kolla][puppet][debian][osa] Nova
> 	enforces that no DB credentials are allowed for the nova-compute
> 	service
> Message-ID: <BEKCKQ.PYGQ12VO6AF23 at est.tech>
> Content-Type: text/plain; charset=iso-8859-1; format=flowed
> On Mon, Nov 23, 2020 at 13:47, Thomas Goirand <zigo at debian.org> wrote:
>> On 11/23/20 11:31 AM, Balázs Gibizer wrote:
>>>  It is still a security problem if nova-compute ignores the config as the
>>>  config still exists on the hypervisor node (in some deployment scenarios)
>> Let's say we apply the patch you're proposing, and that nova-compute
>> isn't loaded anymore with the db credentials, because it's on a separate
>> file, and nova-compute doesn't load it.
>> In such scenario, the /etc/nova/nova-db.conf could still be present with
>> db credentials filled-in. So, the patch you're proposing is still not
>> effective for wrong configuration of nova-compute hosts.
> Obviously we cannot prevent the deployer from storing the DB creds on a
> compute host, as we cannot detect that in general. But we can detect it
> if they are stored in the config that nova-compute reads. I don't see why
> we should not tell the deployer to avoid that, as it is generally
> considered unsafe.
>>>  From the nova-compute perspective we might be able to
>>>  replace the [api_database]connection dependency with some hack. E.g to
>>>  put the service name to the global CONF object at the start of the
>>>  service binary and depend on that instead of other part of the config.
>>>  But I feel pretty bad about this hack.
>> Because of the above, I very much think it'd be the best way to go, but
>> I understand your point of view. Going to the /etc/nova/nova-db.conf and
>> nova-api-db.conf thing is probably good anyways.
>> As for the nova-conductor thing, I very much would prefer if we had a
>> clean and explicit "superconductor=true" directive, with possibly some
>> checks to display big warnings in the nova-conductor.log file in case of
>> a wrong configuration. If we don't have that, then at least things must
>> be extensively documented, because that's really not obvious what's
>> going on.
> I agree that superconductor=true would be a more explicit config option
> than [api_database]connection. However, this would also force deployers
> to use a separate config file for nova-compute, as neither
> superconductor=true nor superconductor=false (meaning it is a cell
> conductor) makes sense there.
>> Cheers,
>> Thomas Goirand (zigo)
> ------------------------------
> Message: 2
> Date: Wed, 25 Nov 2020 11:54:13 +0100
> From: Dmitry Tantsur <dtantsur at redhat.com>
> To: Ian Wienand <iwienand at redhat.com>
> Cc: openstack-discuss <openstack-discuss at lists.openstack.org>
> Subject: Re: [ironic] [infra] Making Glean work with IPA for static IP
> 	assignment
> Message-ID:
> 	<CACNgkFwVhMMxVRK2PkFEkHORRwY4wY49g7G3CpzYwaFzC27Bjw at mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
> Hi,
> Thank you for your input!
> On Wed, Nov 25, 2020 at 3:09 AM Ian Wienand <iwienand at redhat.com> wrote:
>> On Tue, Nov 24, 2020 at 11:54:55AM +0100, Dmitry Tantsur wrote:
>>> The problem is, I cannot make Glean work with any ramdisk I build. The crux
>>> of the problem seems to be that NetworkManager (used by default in RHEL,
>>> CentOS, Fedora and Debian at least) starts very early, creates the default
>>> connection and ignores whatever files Glean happens to write afterwards. On
>>> Debian running `systemctl restart networking` actually helped to pick the
>>> new configuration, but I'm not sure we want to do that in Glean. I haven't
>>> been able to make NetworkManager pick up the changes on RH systems so far.
>> So we do use NetworkManager in the OpenDev images, and we do not see
>> NetworkManager starting before glean.
> Okay, thanks for confirming. Maybe it's related to how IPA is built? It's
> not exactly a normal image after all, although it's pretty close to one.
>> The way it should work is that simple-init in dib installs glean to
>> the image.  That runs the glean install script (use --use-nm argument
>> if DIB_SIMPLE_INIT_NETWORKMANAGER, which is default on centos/fedora)
>> which installs two things; udev rules and a systemd handler.
> I have checked that these are installed, but I don't know how to verify a
> udev rule.
>> The udev is pretty simple [1] and should add a "Wants" for each net
>> device; e.g. eth1 would match and create a Wants glean@eth1.service,
>> which then matches [2] which should write out the ifcfg config file.
>> After this, NetworkManager should start, notice the config file for
>> the interface and bring it up.
> Yeah, I definitely see logging from NetworkManager DHCP before this service
> is run (i.e. before the output from Glean).
>>> Do you maybe have any hints how to proceed? I'd be curious to know how
>>> static IP assignment works in the infra setup. Do you have images with
>>> NetworkManager there? Do you use the simple-init element?
>> As noted yes we use this.  Really only in two contexts; it's Rackspace
>> that doesn't have DHCP so we have to setup the interface statically
>> from the configdrive data.  Other clouds all provide DHCP, which is
>> used when there's no configdrive data.
>> Here is a systemd-analyze from one of our Centos nodes if it helps:
>> graphical.target @18.403s
>> └─multi-user.target @18.403s
>>   └─unbound.service @5.467s +12.934s
>>     └─network.target @5.454s
>>       └─NetworkManager.service @5.339s +112ms
>>         └─network-pre.target @5.334s
>>           └─glean@ens3.service @4.227s +1.102s
>>             └─basic.target @4.167s
>>               └─sockets.target @4.166s
>>                 └─iscsiuio.socket @4.165s
>>                   └─sysinit.target @4.153s
>>                     └─systemd-udev-settle.service @1.905s +2.245s
>>                       └─systemd-udev-trigger.service @1.242s +659ms
>>                         └─systemd-udevd-control.socket @1.239s
>>                           └─system.slice
> # systemd-analyze critical-chain
> multi-user.target @2min 6.301s
> └─tuned.service @1min 32.273s +34.024s
>   └─network.target @1min 31.590s
>     └─network-pre.target @1min 31.579s
>       └─glean@enp1s0.service @36.594s +54.952s
>         └─system-glean.slice @36.493s
>           └─system.slice @4.083s
>             └─-.slice @4.080s
> # systemd-analyze critical-chain NetworkManager.service
> NetworkManager.service +9.287s
> └─network-pre.target @1min 31.579s
>   └─glean@enp1s0.service @36.594s +54.952s
>     └─system-glean.slice @36.493s
>       └─system.slice @4.083s
>         └─-.slice @4.080s
> # cat /etc/sysconfig/network-scripts/ifcfg-enp1s0
> # Automatically generated, do not edit
> DEVICE=enp1s0
> BOOTPROTO=static
> HWADDR=52:54:00:1f:79:7e
> ONBOOT=yes
> # ip addr
> ...
> 2: enp1s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
>     link/ether 52:54:00:1f:79:7e brd ff:ff:ff:ff:ff:ff
>     inet brd scope global dynamic noprefixroute enp1s0
>        valid_lft 42957sec preferred_lft 42957sec
>     inet6 fe80::f182:7fb4:7a39:eb7b/64 scope link noprefixroute
>        valid_lft forever preferred_lft forever
>> At a guess; I feel like the udev bits are probably not happening
>> correctly in your case?  That's important to get the glean@<interface>
>> service in the chain to pre-create the config file
> It seems that the ordering is correct and the interface service is
> executed, but the IP address is nonetheless wrong.
> Can it be related to how long glean takes to run in my case (54 seconds vs
> 1 second in your case)?
> Dmitry
>> -i
>> [1] https://opendev.org/opendev/glean/src/branch/master/glean/init/glean-udev.rules
>> [2] https://opendev.org/opendev/glean/src/branch/master/glean/init/glean-nm@.service
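
The two `systemd-analyze critical-chain` listings quoted in the digest
above can also be compared mechanically. Below is a small Python sketch
that extracts the activation time (@) and startup duration (+) per unit.
The helper names parse_span and parse_chain are made up for this
illustration, and the parser only covers the time formats that appear in
the listings (h, min, s, ms):

```python
import re

# Seconds per time unit as printed by systemd-analyze in the listings above.
_UNITS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 0.001}
_SPAN_PART = re.compile(r"(\d+(?:\.\d+)?)(min|ms|s|h)")
# Optional tree drawing, unit name, optional "@<time>", optional "+<time>".
_LINE = re.compile(
    r"^[\s└─]*(?P<unit>\S+)"
    r"(?:\s+@(?P<at>[^+]+?))?"
    r"(?:\s+\+(?P<took>.+))?\s*$"
)


def parse_span(text):
    """Convert a span such as '1min 31.579s' or '112ms' to seconds."""
    return sum(float(value) * _UNITS[unit]
               for value, unit in _SPAN_PART.findall(text))


def parse_chain(output):
    """Map unit name -> (activated_at, startup_duration) in seconds.

    Either value is None when the listing does not show it.
    """
    units = {}
    for line in output.splitlines():
        match = _LINE.match(line)
        if not match:
            continue
        at, took = match.group("at"), match.group("took")
        units[match.group("unit")] = (
            parse_span(at) if at else None,
            parse_span(took) if took else None,
        )
    return units


# Excerpts copied from the two listings in the thread.
FAST_CHAIN = """\
NetworkManager.service @5.339s +112ms
└─network-pre.target @5.334s
  └─glean@ens3.service @4.227s +1.102s
"""

SLOW_CHAIN = """\
multi-user.target @2min 6.301s
└─tuned.service @1min 32.273s +34.024s
  └─network.target @1min 31.590s
    └─network-pre.target @1min 31.579s
      └─glean@enp1s0.service @36.594s +54.952s
        └─system-glean.slice @36.493s
          └─system.slice @4.083s
            └─-.slice @4.080s
"""

if __name__ == "__main__":
    for label, chain in (("fast", FAST_CHAIN), ("slow", SLOW_CHAIN)):
        for unit, (at, took) in parse_chain(chain).items():
            if unit.startswith("glean@"):
                print(label, unit, "started at", at, "took", took)
```

This only makes the timing difference between the two glean runs
explicit; it does not by itself explain why the address is wrong.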
