On Tue, 2019-11-26 at 16:57 +0100, Bence Romsics wrote:
> Hi All,
>
> After gibi opened the neutron bug [1] I assigned it to myself and now I got around to summarizing my plan to fix it. Please comment either in a reply to this or in the bug report if you see problems, have questions, etc.
>
> (1) Extend the ovs and sriov agents' config with 'resource_provider_hypervisors', for example:
> ml2_conf.ini:
>
> [ovs]
> bridge_mappings = physnet0:br-physnet0,...
> resource_provider_bandwidths = br-physnet0:10000000:10000000,...
> resource_provider_hypervisors = physnet0:hypervisor0,... # this is new, values default to socket.gethostname() for each key in resource_provider_bandwidths

I'm guessing you are adding this for the ironic smart NIC use case, but I don't think this makes sense.
> sriov_agent.ini:
>
> [sriov_nic]
> physical_device_mappings = physnet1:ens5,...
> resource_provider_bandwidths = ens5:40000000:40000000,...
> resource_provider_hypervisors = physnet1:hypervisor1,... # this is new, defaults as above
> The values for resource_provider_hypervisors are opaque identifiers for neutron. Since each physical device can have its own hypervisor associated, possible features like ovs-superagent (for smartnics) could be supported.

It could be, but I think this is a rather poor way to facilitate that.
> Use of socket.gethostname() is only hardcoded as a default, so non-libvirt hypervisors are taken care of.
In the non-ironic smart NIC case, where OVS is deployed locally, there will be only one hypervisor listed for all physnets on the host. In the ironic case I don't think this makes sense either, as physnets span multiple hypervisors, and we don't use the physnet today as part of the lookup, so mapping physnets to hypervisors does not make sense there either.

What would make sense would be a mapping between CONF.host and the hypervisor hostname, but I honestly think we should not require a config option at all. If we simply report the hypervisor hostname by calling socket.gethostname(), that should be sufficient. The nova API will be extended so you can get the compute node UUID by using the value in CONF.host, so this new hypervisor hostname is only needed for backport reasons, to have a way to work around the lack of query support in the /os-hypervisors API in Stein and Train.

The other config option that might make sense would be the root resource provider UUID, but we don't want to require that nova is installed and the compute node started before you can generate the neutron agent conf, so I don't think that is a good approach.

The physnet is not a correct key to use for resource_provider_hypervisors. The requirement that CONF.host match on both nova and neutron is not libvirt specific: nova and neutron need to agree on the name of the host for the port binding extension to work correctly.
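For illustration, the defaulting behaviour proposed in (1) could be sketched roughly like this. This is a hypothetical helper, not actual neutron agent code, and it keys the mapping by the device names taken from resource_provider_bandwidths (whether the key should be the device, the physnet, or CONF.host is exactly what is under debate above):

```python
import socket


def parse_mapping(opt):
    """Parse a 'key1:value1,key2:value2' style option into a dict."""
    result = {}
    for pair in opt.split(','):
        if not pair.strip():
            continue
        key, _, value = pair.strip().partition(':')
        result[key] = value
    return result


def build_hypervisor_map(resource_provider_bandwidths,
                         resource_provider_hypervisors=''):
    """Return a device -> hypervisor-name mapping.

    Devices listed in resource_provider_bandwidths but absent from
    resource_provider_hypervisors default to socket.gethostname(),
    as proposed in (1).
    """
    explicit = parse_mapping(resource_provider_hypervisors)
    default = socket.gethostname()
    devices = [pair.split(':')[0]
               for pair in resource_provider_bandwidths.split(',')
               if pair.strip()]
    return {dev: explicit.get(dev, default) for dev in devices}
```

For example, `build_hypervisor_map('br-physnet0:10000000:10000000', 'br-physnet0:hypervisor0')` maps br-physnet0 to 'hypervisor0', while a device with no explicit entry falls back to the local hostname.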
> (2) Extend the report_state message's configurations field alike:
> {
>   'bridge_mappings': {'physnet0': 'br-physnet0'},
>   'resource_provider_bandwidths': {'br-physnet0': {'egress': 10000000, 'ingress': 10000000}},
>   'resource_provider_hypervisors': {'br-physnet0': 'hypervisor0'},
>   'resource_provider_inventory_defaults': {'allocation_ratio': 1.0, 'min_unit': 1, 'step_size': 1, 'reserved': 0}
> }

Again, I think this should be 'resource_provider_name': 'hypervisor0', or 'resource_provider_hypervisors': {"value set in CONF.host": 'hypervisor0', ...}.
> Do not touch the host field of the same message.
> Since we always treated the configurations field as free format, IMO changes to it should be backportable. Let me know if you think otherwise.

Yes, I think they should be backportable.
> (3) In neutron-server:
> report_state.host is used in binding as now - no change here.

+1
> report_state.configurations.resource_provider_hypervisors.PHYSDEV
> to be used in selecting the parent resource provider for the agent and physdev RP-tree. When not available in the message, still fall back to using report_state.host as today.

For a given host, all physnets or network interfaces will have the same hypervisor hostname. If we had a 'resource_provider_name' config we would only need to use it if its value did not match the existing host value, so we should just do:

name = report_state.configurations.get('resource_provider_name', report_state.host)
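The server-side fallback being discussed in (3) could be sketched as follows. This is a hypothetical function, not actual neutron-server code; it covers the per-device mapping variant with the plain report_state.host fallback:

```python
def parent_rp_name(report_state_host, configurations, device):
    """Pick the name used to look up the parent resource provider.

    Prefer the per-device hypervisor name from the agent's
    configurations dict; fall back to report_state.host, which
    is what neutron-server uses today.
    """
    hypervisors = configurations.get('resource_provider_hypervisors', {})
    return hypervisors.get(device, report_state_host)
```

With the payload from (2), `parent_rp_name('compute1', configurations, 'br-physnet0')` returns 'hypervisor0'; with an older agent that sends no such key, it returns 'compute1' unchanged.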
> (4) At the moment I don't see the need to use the proposed new nova API to query the hypervisors managed by a nova-compute, since as soon as it returns more than one hypervisor, neutron cannot do anything with the result.
It would only do that for ironic, and there is no current usage of placement via neutron with ironic. I don't think the config option proposed above is a better solution, so I think we need to continue to discuss this.
> Cheers,
> Bence