[nova][neutron][cyborg] Bandwidth (and accel) providers are broken if CONF.host is set
Sean Mooney
smooney at redhat.com
Tue Nov 26 16:44:51 UTC 2019
On Tue, 2019-11-26 at 16:57 +0100, Bence Romsics wrote:
> Hi All,
>
> After gibi opened the neutron bug [1] I assigned it to myself and now
> I got to summarize my plan to fix it. Please comment either in a reply
> to this or in the bug report if you see problems, have questions, etc.
>
> (1) Extend ovs and sriov agents config with
> 'resource_provider_hypervisors' for example:
>
> ml2_conf.ini
> [ovs]
> bridge_mappings = physnet0:br-physnet0,...
> resource_provider_bandwidths = br-physnet0:10000000:10000000,...
> resource_provider_hypervisors = physnet0:hypervisor0,... # this is
im guessing you are adding this for the ironic smart nic usecasue but i dont think this makes sense.
in the non irionc smarnic case where ovs is deployed localy there will be only 1 hypervior
listed for all physnets on the hsot.
in the ironic case i dont think it makes sense as physntes span multiple hyperviors
and we dont use the physnet today as part of the lookup so even in this case mapping physnets to a list of hyperviour
does not make senses.
what would make sense woudl be a mapping between CONF.host and the hypervior hostname but i honestly think we shouyd not
require a config option at all. if we simply report the hypervior_hostname by calling socket.gethostname
that shoudl be sufficent. the nova api will be extended so you can get the compute node uuid by using the value in
CONF.host so this new hypervior hostname is only for backport reasons to have a way to wrokaround the lack of query
support in the /os-hyperviors api in stein and train.
the other config option that might make sense woudl be the root resource provider uuid but we dont want to require
that nova is installed the compute node started before you can generate teh neutron agent conf so i dont think that is a
good approch.
> new, values default to socket.gethostname() for each key in
> resource_provider_bandwidths
>
> sriov_agent.ini
> [sriov_nic]
> physical_device_mappings = physnet1:ens5,...
> resource_provider_bandwidths = ens5:40000000:40000000,...
> resource_provider_hypervisors = physnet1:hypervisor1,... # this is
> new, defaults as above
>
> The values for resource_provider_hypervisors are opaque identifiers
> for neutron. Since each physical device can have its own hypervisor
> associated possible features like ovs-superagent (for smartnics) could
> be supported.
it could be i think this is a rather poor way to facilitate that.
the pyysnet is not a correct key to use for resource_provider_hypervisors
> Use of socket.gethostname() is only hardcoded as a
> default, so non-libvirt hypervisors are taken care of.
the requirement that the conf.host match on both nova and neutron is not livirt sepecific.
nova and neutron need to agree on the name of the host for the port bidnign extention to work correctly.
>
> (2) Extend the report_state message's configurations field alike:
>
> {
> 'bridge_mappings': {'physnet0': 'br-physnet0'},
> 'resource_provider_bandwidths': {'br-physnet0': {'egress': 10000000,
> 'ingress': 10000000}},
> 'resource_provider_hypervisors': {'br-physnet0': 'hypervisor0'},
again i think this should be 'resource_provider_name': 'hypervisor0',
or 'resource_provider_hypervisors': {"value set in CONF.host": 'hypervisor0', ... },
> 'resource_provider_inventory_defaults': {'allocation_ratio': 1.0,
> 'min_unit': 1, 'step_size': 1, 'reserved': 0}
> }
>
> Do not touch the host field of the same message.
>
> Since we always treated the configurations field as free format, IMO
> changes to it should be backportable. Let me know if you think
> otherwise.
yes i think they should be backporatble
>
> (3) In neutron-server
>
> report_state.host is used in binding as now - no change here.
+1
>
> report_state.configurations.resource_provider_hypervisors.PHYSDEV
for a given host all phynest or network interface will have the same
hypervior hostname.
if we had a resource_provider_name config we would only need to use it if its
value did not match the existing host value so we should just do this
name = report_state.configurations.get('resource_provider_name', report_state.host)
> to
> be used in selecting parent resource provider for agent and physdev
> RP-tree. When not available in the message still fall back to using
> report_state.host as today.
>
> (4) At the moment I don't see the need to use the proposed new nova
> API to query hypervisors managed by a nova-compute since as soon as it
> returns 1+ hypervisors neutron cannot do anything with the result.
it would only do that for ironic and there is no current usage of placement via neutron with ironic.
i dont think the config option proposed above is a better solution so i think we need to continue to disucss
this.
>
> Cheers,
> Bence
>
> [1] https://bugs.launchpad.net/neutron/+bug/1853840
>
More information about the openstack-discuss
mailing list