[nova][neutron][cyborg] Bandwidth (and accel) providers are broken if CONF.host is set
smooney at redhat.com
Thu Nov 28 12:57:05 UTC 2019
On Thu, 2019-11-28 at 08:54 +0000, Balázs Gibizer wrote:
> On Wed, Nov 27, 2019 at 17:03, Sean Mooney <smooney at redhat.com> wrote:
> > On Wed, 2019-11-27 at 16:20 +0100, Bence Romsics wrote:
> > > > > resource_provider_hypervisors = br-physnet0:hypervisor0,...
> > > >
> > > > this also won't work, as the same bridge name will exist on
> > > > multiple hosts
> > >
> > > Of course the same bridge/nic name can exist on multiple hosts. And
> > > each report_state message clearly belongs to a single agent and
> > > the configurations field is persisted per agent, so there won't be a
> > > collision ever.
> > >
that is true in the non-ironic smart nic case. in the ironic smart nic
case with the ovs super agent,
which is the only case where there would be multiple hypervisors
managed by the same
agent, the agent will be remote.
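to make the per-agent scoping point concrete, here is a small sketch (a hypothetical parser, not the actual neutron implementation) of how a `resource_provider_hypervisors`-style option, as quoted above, could be read. bridge names like `br-physnet0` can repeat across hosts, but because each agent parses and reports its own config, the keys only need to be unique within one agent:

```python
# Hypothetical sketch: parse "device:hypervisor" pairs from a
# resource_provider_hypervisors-style config value into a dict.
# Duplicate bridge names across *different* hosts are fine because
# this mapping is scoped to a single agent's own configuration.
def parse_rp_hypervisors(value):
    """Parse 'br-physnet0:hypervisor0,br-physnet1:hypervisor1' into a dict."""
    mapping = {}
    for pair in value.split(','):
        device, sep, hypervisor = pair.partition(':')
        if not sep:
            raise ValueError("expected device:hypervisor, got %r" % pair)
        mapping[device.strip()] = hypervisor.strip()
    return mapping

print(parse_rp_hypervisors("br-physnet0:hypervisor0,br-physnet1:hypervisor1"))
# {'br-physnet0': 'hypervisor0', 'br-physnet1': 'hypervisor1'}
```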
> When you say "ironic smart nic case with the ovs super agent", do you
> refer to this abandoned spec?
yes, and https://review.opendev.org/#/c/595512/2
there are two related abandoned specs that came up in train, but i don't think either
is progressing anymore.
> > so in the non ironic case it does not need to be a list.
> > in the smartnic case it might need to be a list
> In that spec the author proposes not to break the 1-1 mapping
> between OVS agent and remote OVS. So as far as I see there is no need
> for a list in this case either.
i agree, although there was an expression of the desire to allow the agent
to manage multiple hosts. it's been a while, but i believe that is what was proposed.
> > but a mapping of bridge or physnet won't be unique,
> > and an agent hostname (CONF.host) to hypervisor host would be 1:N, so
> > it's not clear how you would select
> > from the N RPs if all you know from nova is the binding host, which is
> > the service host, not the hypervisor hostname.
> Are we talking about a problem during binding here?
yes, it would be a problem during binding, as we just pass the
service host as the binding host, so we would also need to add a binding:hypervisor_host
attribute if we wanted port binding to work in that case. otherwise we would be changing
the meaning of binding:host in ml2/ovs. currently it refers to the service host,
which is shared between nova and neutron for both the compute and networking agents.
in the agentless case it is used more like the hypervisor hostname. odl, and i think ovn,
add info to the agents table in neutron, even though they don't have agents, to allow
per-host configuration to be expressed; the binding host is used to select that.

anyway, in the rp case you have a similar problem. today odl and ovn do not support minimum
bandwidth; in the future, if they add it, they would have to create an rp per host based on the
info in the agents table. if ml2/ovs was extended to have a 1:N mapping between the neutron ovs agent
and multiple hosts, the service host set in CONF.host would map to the host the agent is running
on, not the host the vm is being booted on, and you would need some additional mapping, the same way
the ironic driver works. in any case https://review.opendev.org/#/c/595512 is also abandoned,
so i don't think we should try to cater for that case now, especially since we want to backport this
to stein. if we wanted to support 1:N mappings in the ovs agent without requiring changes in nova, we would actually
want to change CONF.host to be a list and have all the bandwidth provider config be keyed off the service host.
you could do this in a number of ways that are not important right now, like dynamic config, but using the device
or physnet is not a good way to approach this.
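to illustrate the 1:N problem above, here is a toy sketch (all hostnames and the `binding:hypervisor_host` hint are hypothetical): when one agent's service host maps to several hypervisors, the binding host nova passes today is not enough to pick the right one:

```python
# Toy illustration (hypothetical data, not real nova/neutron code):
# each agent's service host (CONF.host) maps to the hypervisors it
# manages. Normal ml2/ovs is 1:1; a super-agent would be 1:N, and
# then the binding host alone cannot select the compute node RP.
agent_to_hypervisors = {
    "compute-1": ["compute-1"],                         # normal 1:1 agent
    "super-agent-1": ["hypervisor-a", "hypervisor-b"],  # hypothetical 1:N
}

def resolve_hypervisor(binding_host, hypervisor_hint=None):
    candidates = agent_to_hypervisors[binding_host]
    if len(candidates) == 1:
        return candidates[0]
    # 1:N case: something like a binding:hypervisor_host attribute
    # would be needed to disambiguate.
    if hypervisor_hint in candidates:
        return hypervisor_hint
    raise LookupError("ambiguous: %s maps to %s" % (binding_host, candidates))

print(resolve_hypervisor("compute-1"))                      # compute-1
print(resolve_hypervisor("super-agent-1", "hypervisor-b"))  # hypervisor-b
```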
> As this feels to be
> a different problem than from creating device RPs under the proper
> compute node RP.
> Anyhow my simple understanding is the following:
> * a physical NIC or an OVS integration bridge always belongs to one
> single hypervisor. While a hypervisor might have more than one physical
> NIC or OVS bridge
for the most part, yes. there are ways to make that not true when dealing with
pcie over rdma and other non-production composable infrastructure stuff, but from
an openstack point of view, and with relevance to stein and train, you are correct.
> * the identity (e.g. hypervisor hostname) of such hypervisor is known
> at deployment time
yes, and it's often set by the deployment tool, although not always, as it can also be set via dhcp or manually.
> * the neutron agent config can have a mapping between the device (NIC
> or OVS bridge) and the hypervisor identity and this mapping can be sent
> up to the neutron server via RPC
yes, although i don't think the full mapping is required. i think we just need to pass back
the hypervisor host name and no other information that is not currently in the agent report.
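as a rough sketch of that point (the report shape is approximated; field names other than `host` and `configurations` are illustrative, and `hypervisor_hostname` is the hypothetical addition):

```python
# Approximate shape of an agent report_state payload. The service host
# and configurations blob are already sent today; the only new piece of
# information needed is the hypervisor hostname (hypothetical field).
report = {
    "host": "compute-1",                 # service host (CONF.host), already sent
    "agent_type": "Open vSwitch agent",
    "configurations": {
        "bridge_mappings": {"physnet0": "br-physnet0"},
        "hypervisor_hostname": "hypervisor0",  # the one addition
    },
}
print(report["configurations"]["hypervisor_hostname"])  # hypervisor0
```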
> * the neutron agent already sends up the service host name where the
> agent runs to the neutron server via RPC.
yes, and neutron uses that for the same use case nova does: to determine which host
the agent runs on and match it to the host set in the binding_details:host field we set when binding the port.
> * the neutron server knowing the service host and the device ->
> hypervisor identity mapping can find the compute node RP under which
> the device RP needs to be created.
you don't need the device-to-hypervisor mapping. in the non-sriov case you don't typically
have a device that the port is associated with, just a physnet, which is None in the case of tunneled ports.
so in ovs or linux bridge it is more typical for the port to be associated with a segmentation type that is associated
with a bridge that may have an interface attached, but it is only loosely associated with a device.
> @Sean: Where does my list of reasoning break from your perspective?
your reasoning that it does not need to be a list? if that is your assertion, i agree completely: it should not be a list.
the one case where it could break is if the agent starts to manage multiple hosts the same way the ironic agent does.
however, to support that, nova would have to change the info it sets in the port binding: we would have to set the
hypervisor host name instead of the service host name. that would be a big change and would require a new api extension
in my view, so i don't think we should consider it now.