Re: [ops][nova][neutron] Routed provider networking and physnet port scheduling

3 Sep 2025

      On 03/09/2025 16:40, Andrew Bonney wrote:
...
We have been using routed provider networks with SRIOV for a while, 
but I believe this works a little 'by accident', so I'm not suggesting 
this is a recommended path. I'll describe as best I can but it's a 
while since I've worked through this so I may not be 100% on why it works.
Like your description, we have a physnet per rack, with a segment and 
associated VLAN for each. A user can create a 'Direct' type port 
associated with the segmented network, and with deferred IP allocation 
so it doesn't tie the port to a rack.
If I'm remembering correctly the accidental 'hack' here is that ALL of 
the hypervisors in all racks have a PCI device spec listing ALL of the 
physical networks, and they all use the same interface name, as 
follows. Only the Neutron SRIOV config differs dependent on the rack 
the host is in.
[pci]
# PCI devices available to VMs
device_spec = { "physical_network": "physnet_media_a1", "devname": "enp129s0f0np0" }
device_spec = { "physical_network": "physnet_media_a2", "devname": "enp129s0f0np0" }
device_spec = { "physical_network": "physnet_media_a3", "devname": "enp129s0f0np0" }
So ^ is not supported from a nova perspective.

You are not meant to be able to list the same device multiple times and
merge the physnets to effectively form a list like that.

It would be entirely supported if each line referenced a different device.

This should actually be a startup config error in nova.
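
As a rough illustration of the kind of check that could reject this at
startup, here is a hypothetical sketch (not nova's actual code; the
function name find_conflicting_device_specs is made up for this example).
It parses the JSON device_spec values quoted above and reports any devname
that is listed under more than one physical network:

```python
import json
from collections import defaultdict


def find_conflicting_device_specs(device_specs):
    """Return {devname: [physnets]} for devices tagged with more than one physnet.

    device_specs is a list of JSON strings, one per device_spec line.
    This is an illustrative helper, not part of nova.
    """
    physnets_by_dev = defaultdict(set)
    for raw in device_specs:
        spec = json.loads(raw)
        dev = spec.get("devname")
        physnet = spec.get("physical_network")
        # Only entries that name both a device and a physnet can conflict.
        if dev is not None and physnet is not None:
            physnets_by_dev[dev].add(physnet)
    return {
        dev: sorted(nets)
        for dev, nets in physnets_by_dev.items()
        if len(nets) > 1
    }


specs = [
    '{ "physical_network": "physnet_media_a1", "devname": "enp129s0f0np0" }',
    '{ "physical_network": "physnet_media_a2", "devname": "enp129s0f0np0" }',
    '{ "physical_network": "physnet_media_a3", "devname": "enp129s0f0np0" }',
]
conflicts = find_conflicting_device_specs(specs)
print(conflicts)
```

With the config above, the single interface shows up under all three
physnets, which is exactly the pattern being described as unsupported. A
config where each line names a different device would return an empty dict.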

On the neutron side, physnets must provide L2 isolation or it will break
multi-tenancy,

i.e. VLAN 100 on physnet_media_a1 __MUST__ never allow communication
with VLAN 100 on physnet_media_a2 unless the two physnets are
interconnected by an L3 router.

So this is also an invalid config on the neutron side, as you do not meet
the requirements for defining separate physnets.

It will break how multi-tenancy is designed to work on their side.

This can work in a private cloud, and it can work if you do not allow
VLAN/flat tenant networks in neutron, but you as the admin take on the
burden of making sure that you do not violate multi-tenancy in this
case, instead of neutron doing it.

This is a variation of the hack that telcos used for NUMA-local networks
before that was actually possible in nova,

i.e. they would have physnet_1_numa_0 and physnet_1_numa_1 and use that
to force PCI devices to come from the relevant NUMA node.

But in reality they would not be separate physical networks, so they
would create a multi-provider network using the same VLAN on both to
interconnect them.

This is well into unsupported land, but if you deeply understand the
security implications and that is acceptable for your private cloud, you
could do this.

If it breaks for any reason, however, it's not an upstream bug, as you are
deliberately misconfiguring nova and neutron to make this work.
...
From Nova's perspective it is always selecting the first physical 
network in the list, but this means that when the scheduler picks a 
hypervisor in a rack other than '1', it will still be happy to proceed.
------------------------------------------------------------------------
*From:* Sean Mooney <smooney@redhat.com>
*Sent:* Wednesday, September 03, 2025 16:08
*To:* openstack-discuss@lists.openstack.org 
<openstack-discuss@lists.openstack.org>; nathanh@graphcore.ai 
<nathanh@graphcore.ai>
*Cc:* bens@graphcore.ai <bens@graphcore.ai>
*Subject:* Re: [ops][nova][neutron] Routed provider networking and 
physnet port scheduling
Just adding bens back in case they are not on the list.
I selected the wrong reply type before.
On 03/09/2025 16:07, Sean Mooney wrote:
...
On 03/09/2025 15:10, Nathan Harper wrote:
...
Hi all,
We have been looking at building some routed provider networks,
following this documentation:
https://docs.openstack.org/neutron/latest/admin/config-routed-networks.html
...
...
In this scenario we have 4 racks, and have defined physnets for each
rack and assigned SRIOV interfaces for each. We then have created
a multisegment network, with a segment associated with each
physnet. We get the expected resource provider in placement
containing only these hypervisors.
When scheduling instances onto this network, the allocation
candidates are any hypervisors in racks 1-4 (openstack filters the
hypervisors using the aggregates for each segment that neutron
creates).
...
However, during instance build the pci device request sent to
nova-compute always contains the physnet of the same segment.
Debugging the builds, we ended up here:
https://opendev.org/openstack/nova/src/branch/master/nova/network/neutron.py#L2226,
...
with:
# TODO(vladikr): Additional work will be required to handle the
# case of multiple vlan segments associated with different
# physical networks.
Which originates from this commit:
https://opendev.org/openstack/nova/commit/b9d9d96a407db5a2adde3aed81e61cc9589c291a
...
This suggests that despite the documentation describing using
multiple VLAN backed segments in this fashion, this has never worked?
That is correct. nova has never supported, in general, the multiple
physnet extension that was added to neutron:
https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definitions/multiprovidernet.py
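
To illustrate the behaviour being discussed, here is a simplified sketch
(not the actual nova code) of how a single physnet ends up being chosen
for a multi-segment network. It assumes the network is a dict shaped like
a neutron multi-provider network, with a "segments" list whose entries
carry "provider:physical_network". The function name first_physnet and the
physnet_rackN names are hypothetical.

```python
def first_physnet(network):
    """Return the physnet of the first segment that has one, else None.

    Illustrative only: models a lookup that stops at the first segment
    with a physical network, ignoring which rack the host is in.
    """
    segments = network.get("segments")
    if segments:
        for segment in segments:
            physnet = segment.get("provider:physical_network")
            if physnet:
                return physnet
        return None
    # Single-segment networks carry the attribute on the network itself.
    return network.get("provider:physical_network")


network = {
    "name": "routed-net",
    "segments": [
        {
            "provider:network_type": "vlan",
            "provider:physical_network": "physnet_rack1",
            "provider:segmentation_id": 101,
        },
        {
            "provider:network_type": "vlan",
            "provider:physical_network": "physnet_rack2",
            "provider:segmentation_id": 102,
        },
    ],
}
chosen = first_physnet(network)
print(chosen)
```

Whatever host the scheduler picks, a lookup like this yields the same
physnet, so the PCI request would only match devices tagged with that one
physnet.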
...
You can have multiple network segments on the same physnet, but when
routed provider networks were first designed there was
an intention to have a second way to associate hosts with segments
that did not depend on physnets; however, that was never implemented.
There was meant to be a way to associate hosts with segments directly
via API or config that did not use physnets to do that mapping.
...
Or are we missing something? Has anyone successfully used routed
provider networks?
The way that routed provider networks are typically used is that you
do not have a single network that spans physnets.
You have a 1:1 mapping between physnet and segment and create separate
networks for each physnet.
SRIOV has extra complications because nova normally does not have any
awareness of physnets at all, but for SRIOV you have to declare a
single physnet
for them in the nova PCI device spec.
A physnet is intended to be effectively an L2 broadcast domain, which is
more or less what a segment is as well.
Technically, the requirement for L2 isolation between neutron physnets
is stricter than the requirement for isolation between segments.
...
--
Regards,
Nathan Harper
Principal Engineer – Cloud Development
Platform Engineering
nathanh@graphcore.ai