[neutron][nova][placement] bug 1926693: What would be the reasonable solution?
Hi All,

I've been working on bug 1926693 [1] and am unsure which solution we should aim for. Ideally I would bring this topic to the team meeting, but because of the timezone gap and the complicated background, I'd like to gather some feedback on the ML first.

[1] https://bugs.launchpad.net/neutron/+bug/1926693

TL;DR
Which one (or ones) of the following would be a reasonable solution for this issue?
(1) https://review.opendev.org/c/openstack/neutron/+/763563
(2) https://review.opendev.org/c/openstack/neutron/+/788893
(3) Implement something different

The issue I reported in the bug is that nova and neutron determine the hypervisor name in inconsistent ways. Currently neutron uses socket.gethostname() (which always returns the short name) to determine the hypervisor name it uses to look up the corresponding resource provider. Nova, on the other hand, uses libvirt's getHostname function (if the libvirt driver is used), which returns the canonical name. The canonical name can be either a short name or an FQDN (*1), and if it is an FQDN then neutron and nova never agree.

(*1) IMO this is likely to happen in real deployments. For example, TripleO uses FQDNs for canonical names.

Neutron already provides the resource_provider_hypervisors option to override the hypervisor name used. However, because this option accepts a map between interfaces and hypervisors, setting it requires a very redundant description, especially when a compute node has multiple interfaces/bridges. The following example shows how redundant the current requirement is.

~~~
[OVS]
resource_provider_bandwidths=br-data1:1024:1024,br-data2:1024:1024,\
br-data3:1024:1024,br-data4:1024:1024
resource_provider_hypervisors=br-data1:compute0.mydomain,br-data2:\
compute0.mydomain,br-data3:compute0.mydomain,br-data4:compute0.mydomain
~~~

I've submitted a change proposing a new single parameter to override the base hypervisor name, but it currently has a -2, mainly because I had not analyzed the root cause of the mismatch when I proposed it.
(1) https://review.opendev.org/c/openstack/neutron/+/763563

I also submitted a different change to neutron which implements logic to get the hypervisor name in a way that is fully compatible with libvirt. While this would save users from having to override hypervisor names at all, I'm aware that it might break other virt drivers, which rely on different logic to generate the hypervisor name. IMO the patch is still useful, considering that the libvirt driver is probably the most popular option now, but I'm not fully aware of the impact on the other drivers, especially because I don't know which virt drivers currently support the minimum QoS feature.
(2) https://review.opendev.org/c/openstack/neutron/+/788893/

In the review of (2), Sean mentioned implementing logic to determine the appropriate resource provider even when the host name formats do not match (3), but tbh I'm not sure how I would implement that.

My current thought is to merge (1) as a quick solution first and then discuss whether we should merge (2), but I'd like to ask for feedback on this plan (for example, that we should NOT merge (2)).

I'd appreciate your thoughts about this $topic.

Thank you,
Takashi
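
A minimal sketch of the mismatch described in the message above, assuming the canonical name can be approximated with a getaddrinfo() lookup that sets AI_CANONNAME. This is not libvirt's or neutron's actual code; canonical_hostname, names_agree and the injectable resolver parameter are illustrative names only.

```python
import socket


def canonical_hostname(name: str, resolver=socket.getaddrinfo) -> str:
    """Approximate a canonical-name lookup for a host name.

    Assumption for this sketch: a name that already contains a dot is
    returned unchanged; otherwise the resolver is asked for the canonical
    name, and the input is returned if resolution fails.
    """
    if "." in name:
        return name
    try:
        infos = resolver(name, None, flags=socket.AI_CANONNAME)
    except OSError:  # socket.gaierror is a subclass of OSError
        return name
    for _family, _type, _proto, canonname, _sockaddr in infos:
        if canonname:
            return canonname
    return name


def names_agree(neutron_name: str, nova_name: str) -> bool:
    """The resource provider lookup only works if both names are identical."""
    return neutron_name == nova_name


if __name__ == "__main__":
    short = socket.gethostname()
    canonical = canonical_hostname(short)
    print(short, canonical, names_agree(short, canonical))
```

On a host whose canonical name is an FQDN, the two values differ and the comparison fails, which is the situation reported in the bug.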

Hello Takashi and Neutrinos:

First of all, thank you for working on this.

Currently users can override the host name using "resource_provider_hypervisors". That means this parameter is always configurable; IMO we are safe on this.

The problem we have is how to retrieve this host name if "resource_provider_hypervisors" is not provided. I think the solution could be a combination of:
- A first patch providing the ability to select the hypervisor type. The default one could be "libvirt". Each driver can have its own host name retrieval implementation. The default implementation would be the one we have right now: "socket.gethostname()"
- https://review.opendev.org/c/openstack/neutron/+/788893, providing full compatibility for libvirt.

Those are my two cents.

Regards.

On Thu, Jun 10, 2021 at 4:12 PM Takashi Kajinami <tkajinam@redhat.com> wrote:
> TL;DR
> Which one (or ones) of the following would be a reasonable solution for this issue?
> (1) https://review.opendev.org/c/openstack/neutron/+/763563
> (2) https://review.opendev.org/c/openstack/neutron/+/788893
> (3) Implement something different

Hi,

On Friday, 11 June 2021 09:57:27 CEST, Rodolfo Alonso Hernandez wrote:
> Hello Takashi and Neutrinos:
>
> First of all, thank you for working on this.
>
> Currently users can override the host name using "resource_provider_hypervisors". That means this parameter is always configurable; IMO we are safe on this.
>
> The problem we have is how to retrieve this host name if "resource_provider_hypervisors" is not provided. I think the solution could be a combination of:
> - A first patch providing the ability to select the hypervisor type. The default one could be "libvirt". Each driver can have its own host name retrieval implementation. The default implementation would be the one we have right now: "socket.gethostname()"
> - https://review.opendev.org/c/openstack/neutron/+/788893, providing full compatibility for libvirt.
>
> Those are my two cents.

We can move forward with the patch https://review.opendev.org/c/openstack/neutron/+/763563 to provide the new config option as it is now, and additionally implement https://review.opendev.org/c/openstack/neutron/+/788893, so that users who are using libvirt will not need to change anything, while anyone using another hypervisor will still be able to adjust the name. Wdyt?
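
One way to read this combination, as a rough sketch: a per-bridge override wins, then a single default override, then automatic detection. The function and parameter names below (parse_hypervisors, hypervisor_for, default_name, detect) are hypothetical and are not neutron's implementation or option names.

```python
from typing import Callable, Dict, Optional


def parse_hypervisors(raw: str) -> Dict[str, str]:
    """Parse a 'bridge:hypervisor,bridge:hypervisor' string into a dict."""
    mapping: Dict[str, str] = {}
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue
        bridge, _, hypervisor = item.partition(":")
        mapping[bridge.strip()] = hypervisor.strip()
    return mapping


def hypervisor_for(
    bridge: str,
    per_bridge: Dict[str, str],
    default_name: Optional[str],
    detect: Callable[[], str],
) -> str:
    """Pick the hypervisor name for one bridge.

    Precedence (an assumption made for this sketch):
    1. explicit per-bridge override,
    2. single default override,
    3. automatic detection (for example a libvirt-compatible lookup).
    """
    if bridge in per_bridge:
        return per_bridge[bridge]
    if default_name:
        return default_name
    return detect()


bridges = ["br-data1", "br-data2", "br-data3", "br-data4"]
# With a single default, no per-bridge entries are needed at all.
names = [
    hypervisor_for(bridge, {}, "compute0.mydomain", lambda: "compute0")
    for bridge in bridges
]
```

With a single default in place, the four-bridge example from the original message no longer needs to repeat the same host name for each bridge.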

I agree with this idea, but what https://review.opendev.org/c/openstack/neutron/+/763563 proposes differs from what I'm saying: instead of providing the hostname (that is something we can already do with "resource_provider_hypervisors"), we should provide the hypervisor name (default: libvirt).

On Fri, Jun 11, 2021 at 10:36 AM Slawek Kaplonski <skaplons@redhat.com> wrote:
> We can move forward with the patch https://review.opendev.org/c/openstack/neutron/+/763563 to provide the new config option as it is now, and additionally implement https://review.opendev.org/c/openstack/neutron/+/788893, so that users who are using libvirt will not need to change anything, while anyone using another hypervisor will still be able to adjust the name. Wdyt?

Hi Slawek and Rodolfo,

Thank you for your feedback.

On Fri, Jun 11, 2021 at 5:47 PM Rodolfo Alonso Hernandez <ralonsoh@redhat.com> wrote:
> I agree with this idea, but what https://review.opendev.org/c/openstack/neutron/+/763563 proposes differs from what I'm saying: instead of providing the hostname (that is something we can already do with "resource_provider_hypervisors"), we should provide the hypervisor name (default: libvirt).

The main problem is that the logic to determine the "hypervisor name" is different in each virt driver. For example, the libvirt driver uses the canonical name, while the power driver uses [DEFAULT] host in nova.conf. So if we fix compatibility with one virt driver, we break compatibility with another. Because neutron is not aware of which virt driver is used, it's impossible to avoid that inconsistency completely.

Thank you,
Takashi
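
The concern can be illustrated with a small sketch. The rule table below is hypothetical: its two entries only restate what this message says about the libvirt and power drivers, and none of the names come from nova or neutron.

```python
from typing import Callable, Dict

# Each rule maps (canonical_name, conf_host) to the name that driver reports.
NameRule = Callable[[str, str], str]

NAME_RULES: Dict[str, NameRule] = {
    # Per the message above: libvirt reports the canonical name.
    "libvirt": lambda canonical, conf_host: canonical,
    # Per the message above: the power driver reports [DEFAULT] host.
    "power": lambda canonical, conf_host: conf_host,
}


def expected_hypervisor_name(virt_driver: str, canonical: str, conf_host: str) -> str:
    """Return the hypervisor name a given virt driver would report."""
    try:
        rule = NAME_RULES[virt_driver]
    except KeyError:
        raise ValueError(f"no name rule for virt driver {virt_driver!r}") from None
    return rule(canonical, conf_host)


def agent_matches(agent_name: str, virt_driver: str, canonical: str, conf_host: str) -> bool:
    """True if the name used by the agent equals the driver's reported name."""
    return agent_name == expected_hypervisor_name(virt_driver, canonical, conf_host)
```

An agent that always uses the canonical name matches the first rule and misses the second, so any single built-in rule still needs an override for the drivers it does not cover.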

Hello:

I think I'm not explaining myself correctly. This is what I'm proposing: to provide a "hypervisor_type" variable in Neutron and implement, for each supported hypervisor, a hostname retrieval method.

If we don't support the hypervisor used, the user can always provide the hostname via "resource_provider_hypervisors".

Regards.

On Fri, Jun 11, 2021 at 12:20 PM Takashi Kajinami <tkajinam@redhat.com> wrote:
> The main problem is that the logic to determine the "hypervisor name" is different in each virt driver. For example, the libvirt driver uses the canonical name, while the power driver uses [DEFAULT] host in nova.conf. So if we fix compatibility with one virt driver, we break compatibility with another. Because neutron is not aware of which virt driver is used, it's impossible to avoid that inconsistency completely.

Hi Rodolfo,

Thank you for your clarification, and sorry that I misread what you wrote.

My concern with that approach is that adding the hypervisor_type parameter would mean neutron has to implement, and keep up to date in the future, logic for each virt driver that is currently maintained in nova or in the hypervisor itself (like libvirt), and that would expand the scope of neutron too much.

IIUC, neutron currently doesn't care about which virt driver is used, and I agree with Slawek that it's better to keep that design.

Thank you,
Takashi

On Fri, Jun 11, 2021 at 7:39 PM Rodolfo Alonso Hernandez <ralonsoh@redhat.com> wrote:
> I think I'm not explaining myself correctly. This is what I'm proposing: to provide a "hypervisor_type" variable in Neutron and implement, for each supported hypervisor, a hostname retrieval method.
>
> If we don't support the hypervisor used, the user can always provide the hostname via "resource_provider_hypervisors".
Hi,
On Friday, 11 June 2021 13:14:03 CEST Takashi Kajinami wrote:
Hi Rodolfo,
Thank you for your clarification and sorry I misread what you wrote.
My concern with that approach is that adding the hypervisor_type parameter would mean neutron has to implement, for each of the other virt drivers, hostname logic that is currently maintained in nova or in the hypervisor itself (like libvirt), and that would expand the scope of neutron too much.
IIUC, neutron currently doesn't care which virt driver is used, and I agree with Slawek that it's better to keep that design here.
Thank you, Takashi
On Fri, Jun 11, 2021 at 7:39 PM Rodolfo Alonso Hernandez <
ralonsoh@redhat.com> wrote:
Hello:
I think I'm not explaining myself correctly. This is what I'm proposing: to provide a "hypervisor_type" variable in Neutron and implement, for each supported hypervisor, a hostname retrieval method.
If we don't support the hypervisor used, the user can always provide the hostname via "resource_provider_hypervisors".
Regards.
On Fri, Jun 11, 2021 at 12:20 PM Takashi Kajinami <tkajinam@redhat.com>
wrote:
Hi Slawek and Rodolfo,
Thank you for your feedback.
On Fri, Jun 11, 2021 at 5:47 PM Rodolfo Alonso Hernandez <
ralonsoh@redhat.com> wrote:
I agree with this idea but what https://review.opendev.org/c/openstack/neutron/+/763563 is proposing differs from what I'm saying: instead of providing the hostname (that is something we can do with "resource_provider_hypervisors"), we should provide the hypervisor name (default: libvirt).
I'm not sure if adding "hypervisor drivers" to neutron is a good idea. The solution proposed by Takashi is simpler IMHO. If a user just wants to override the hostname for all resources, this new option can be used. But in cases where it needs to be done "per bridge", that's also possible. I know it's maybe not perfect but IMO still better than nothing.
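The precedence described here (a per-bridge override first, then a single default override, then the detected hostname) could be sketched as follows. This is only an illustration: the names `resolve_hypervisor_names`, `per_bridge` and `default_hypervisor` are hypothetical and are not taken from either patch.

```python
import socket


def resolve_hypervisor_names(bridges, per_bridge=None, default_hypervisor=None):
    """Map each bridge to the hypervisor name used for the RP lookup.

    Precedence (assumed for illustration): per-bridge override, then the
    single default override, then socket.gethostname().
    """
    per_bridge = per_bridge or {}
    fallback = default_hypervisor or socket.gethostname()
    return {bridge: per_bridge.get(bridge, fallback) for bridge in bridges}


# One setting covers all four bridges; only br-data4 is overridden.
names = resolve_hypervisor_names(
    ["br-data1", "br-data2", "br-data3", "br-data4"],
    per_bridge={"br-data4": "compute0-alt.mydomain"},
    default_hypervisor="compute0.mydomain",
)
```

With a single default, the FQDN no longer has to be repeated once per bridge, while the per-bridge map stays available for the unusual case.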
The main problem is that the logic to determine the "hypervisor name" is different in each virt driver. For example, the libvirt driver uses the canonical name while the power driver uses [DEFAULT] host in nova.conf. So if we fix compatibility with one virt driver, it would break compatibility with another driver. Because neutron is not aware of the virt driver used, it's impossible to avoid that inconsistency completely.
Thank you, Takashi
On Fri, Jun 11, 2021 at 10:36 AM Slawek Kaplonski <skaplons@redhat.com>
wrote:
Hi,
On Friday, 11 June 2021 09:57:27 CEST Rodolfo Alonso Hernandez wrote:
Hello Takashi and Neutrinos:
First of all, thank you for working on this.
Currently users have the ability to override the host name using "resource_provider_hypervisors". That means this parameter is always configurable; IMO we are safe on this.
The problem we have is how we should retrieve this host name if "resource_provider_hypervisors" is not provided. I think the solution could be a combination of:
- A first patch providing the ability to select the hypervisor type. The default one could be "libvirt". Each driver can have a particular host name retrieval implementation. The default one will be the one implemented right now: "socket.gethostname()"
- [...] providing full compatibility for libvirt.
Those are my two cents.
We can move on with the patch https://review.opendev.org/c/openstack/neutron/+/763563 to provide the new config option as it is now, and additionally implement https://review.opendev.org/c/openstack/neutron/+/788893, so users who are using libvirt will not need to change anything, but anyone using another hypervisor can still make adjustments. Wdyt?
Regards.
--
Slawek Kaplonski
Principal Software Engineer
Red Hat
Hi Takashi,
On Thu, 10 Jun 2021 at 15:06, Takashi Kajinami <tkajinam@redhat.com> wrote:
The issue I reported in the bug is that there is an inconsistency between nova and neutron about the way to determine a hypervisor name. Currently neutron uses socket.gethostname() (which always returns shortname)
socket.gethostname() can return fqdn or shortname - https://docs.python.org/3/library/socket.html#socket.gethostname.
I've seen cases where it switched from short to fqdn but I'm not sure of the root cause - DHCP lease setting a hostname/domainname perhaps.
Thanks, Ollie
On Fri, Jun 11, 2021 at 8:48 PM Oliver Walsh <owalsh@redhat.com> wrote:
The issue I reported in the bug is that there is an inconsistency between nova and neutron about the way to determine a hypervisor name. Currently neutron uses socket.gethostname() (which always returns shortname)
socket.gethostname() can return fqdn or shortname - https://docs.python.org/3/library/socket.html#socket.gethostname.
You are correct and my statement was not accurate. socket.gethostname() returns whatever the gethostname system call returns, and because gethostname/sethostname accept both FQDN and short name, socket.gethostname() can return either one.
However, the root problem is that this logic is not exactly the same as the logic used in each virt driver. Of course we could require people to use the "correct" format for the canonical name as well as the "hostname", but fixing this problem in neutron would be much more helpful, considering the impact of forcing users to "fix" their hostname/canonical name formatting at this point.
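To see the two values side by side on a given host, something like the following can be used. It is a rough sketch: `canonical_hostname` is a hypothetical helper that asks the resolver for a canonical name via `AI_CANONNAME`, which approximates, but is not guaranteed to match, what libvirt's getHostname reports.

```python
import socket


def canonical_hostname(hostname=None, resolver=socket.getaddrinfo):
    """Return the resolver's canonical name for hostname, or hostname itself.

    `resolver` is injectable only so the function can be exercised without
    touching the real DNS/hosts configuration.
    """
    hostname = hostname or socket.gethostname()
    try:
        results = resolver(hostname, None, flags=socket.AI_CANONNAME)
    except socket.gaierror:
        # Name does not resolve: fall back to the name we started with.
        return hostname
    for _family, _type, _proto, canonname, _sockaddr in results:
        if canonname:
            return canonname
    return hostname
```

Comparing `socket.gethostname()` with `canonical_hostname()` on a compute node shows whether the short name and the canonical name differ there, which is the case where the resource provider lookup would miss.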
Hello:
I'll approve [1] although I see no need for it. Having "resource_provider_hypervisors", there is no need for a second configuration parameter to provide the same information, regardless of the comfort of providing one single string and not a list of tuples.
Regards.
[1] https://review.opendev.org/c/openstack/neutron/+/763563
On Sat, 2021-06-12 at 00:46 +0900, Takashi Kajinami wrote:
This is not really something that can be fixed in neutron. We can either create a common function in oslo.utils or placement-lib that we can use in nova, neutron and all other projects, or we can use the config option.
If we want to "fix" this in neutron, then neutron should either try looking up the RP using the host name and then fall back to using the FQDN, or we should look at using the hypervisor API as we discussed a few years ago when this last came up:
http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044....
I don't think neutron should know anything about hypervisors, so I would just proceed with the new config option that Takashi has proposed, but I would not implement Rodolfo's solution of adding a hypervisor_type.
Just as nova has no awareness of the neutron backend and tries to treat all of them the same, neutron should remain hypervisor independent, and we should look to provide common code that can be reused to identify the RP in a separate lib as a longer-term solution.
For many deployments that do not set the FQDN as the canonical host name in /etc/hosts, the current default behavior works out of the box. Whatever solution we take, we need to ensure that no existing deployment is affected by the change, which means we cannot default to only using the FQDN or similar, as that would be an upgrade breakage. So we have to maintain the current behavior by default and enhance neutron to either fall back to the FQDN if the hostname-based lookup fails, or use the new config introduced by Takashi's patch where the FQDN is used as the server canonical hostname.
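The "try the host name, then fall back to the FQDN" lookup could look roughly like this. `find_resource_provider` and its `lookup` argument are hypothetical: `lookup` stands in for whatever call queries placement for a resource provider by name and returns `None` when nothing matches.

```python
import socket


def find_resource_provider(lookup, hostname=None, fqdn=None):
    """Try the plain host name first, then the FQDN; return the first hit."""
    hostname = hostname or socket.gethostname()
    fqdn = fqdn or socket.getfqdn(hostname)
    # Keep the current behavior first so existing deployments are unaffected.
    candidates = [hostname]
    if fqdn != hostname:
        candidates.append(fqdn)
    for name in candidates:
        provider = lookup(name)
        if provider is not None:
            return provider
    return None


# Stand-in for placement: nova registered the RP under the FQDN.
providers = {"compute0.mydomain": "rp-uuid-1"}
found = find_resource_provider(
    providers.get, hostname="compute0", fqdn="compute0.mydomain")
```

Because the short name is tried first, a deployment that works today keeps resolving to the same provider; the FQDN is only consulted when that lookup comes back empty.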
Thank you all for your additional thoughts. Because I've not received very strong objections about existing two patches[1][2], I updated these patches to resolve conflicts between these patches. [1] https://review.opendev.org/c/openstack/neutron/+/763563 [2] https://review.opendev.org/c/openstack/neutron/+/788893 I made the patch to add default hypervisor name as base one because it doesn't change behavior and would be "safe" for backports. So far we have received positive feedback about fixing compatibility with libvirt (in master) but I'll create a backport of that change as well to ask some feedback about its profit and risk for backport. I think strategy is now clear with this feedback but please feel free to put your thoughts in this thread or the above patches.
if we want to "fix" this in neutron then neutron should either try looking up the RP using the host name and then fall back to using the fqdn or we should look at using the hypervior api as we discussed a few years ago when this last came up
for many deployment that do not set the fqdn as the canonical host name in /etc/host the current default behavior works out of the box whatever solution we take we need to ensure that no existing deployment is affected by the change which means we cannot default to only using the fqdn or similar as that would be an upgrade breakage so we have to maintain the current behavior by default and enhance neutron to either fall back to the fqdn if the hostname based lookup fails or use the new config intoduc ed by takashi's patch where the fqdn is used as the server canonical hostname. Thank you for pointing this out. To be clear, the behavior change I
http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044.... <http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044.html> I feel like this discussion would be a good chance to revisit the requirement of basic client implementation for placement. (or abstraction layer like castellan) Currently each components like nova, neutron, and cyborg(?) have their own placement client implementation (and logic to query resource providers) but IMO it is more efficient if we can maintain the common client implementation instead. proposed[2] doesn't break any deployment with libvirt but would break deployments with non-libvirt drivers. This point should be considered when reviewing that change. So far most of the feedback I received is that it is preferred to fix compatibility with libvirt as it's the "default" option but please share your thoughts on the patch. On Mon, Jun 14, 2021 at 7:30 PM Sean Mooney <smooney@redhat.com> wrote:
On Fri, Jun 11, 2021 at 8:48 PM Oliver Walsh <owalsh@redhat.com> wrote:
Hi Takashi,
On Thu, 10 Jun 2021 at 15:06, Takashi Kajinami <tkajinam@redhat.com> wrote:
Hi All,
I've been working on bug 1926693[1], and am lost about the reasonable solutions we expect. Ideally I'd need to bring this topic in the team meeting but because of the timezone gap and complicated background, I'd like to gather some feedback in ml first.
[1] https://bugs.launchpad.net/neutron/+bug/1926693
TL;DR Which one(or ones) would be reasonable solutions for this issue ? (1) https://review.opendev.org/c/openstack/neutron/+/763563 (2) https://review.opendev.org/c/openstack/neutron/+/788893 (3) Implement something different
The issue I reported in the bug is that there is an inconsistency between nova and neutron about the way to determine a hypervisor name. Currently neutron uses socket.gethostname() (which always returns shortname)
socket.gethostname() can return fqdn or shortname - https://docs.python.org/3/library/socket.html#socket.gethostname.
You are correct and my statement was not accurate. So socket.gethostname() returns what is returned by gethostname system call, and gethostname/sethostname accept both FQDN and short name, socket.gethostname() can return one of FQDN or short name.
However the root problem is that this logic is not completely same as the ones used in each virt driver. Of cause we can require people the "correct" format usage for canonical name as well as "hostname", but fixthing this problem in neutron would be much more helpful considering the effect caused by enforcing users to "fix" hostname/canonical name formatting at this point.
On Sat, 2021-06-12 at 00:46 +0900, Takashi Kajinami wrote: this is not really something that can be fixed in neutron we can either create a common funciton in oslo.utils or placement-lib that we can use in nova, neutron and all other project or we can use the config option.
if we want to "fix" this in neutron then neutron should either try looking up the RP using the host name and then fall back to using the fqdn or we shoudl look at using the hypervior api as we discussed a few years ago when this last came up
http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044....
I don't think neutron should know anything about hypervisors, so I would just proceed with the new config option that Takashi has proposed, but I would not implement Rodolfo's solution of adding a hypervisor_type.
Just as nova has no awareness of the neutron backend and tries to treat all of them the same, neutron should remain hypervisor independent, and we should look to provide common code that can be reused to identify the RP in a separate lib as a longer term solution.
For many deployments that do not set the FQDN as the canonical host name in /etc/hosts, the current default behavior works out of the box. Whatever solution we take, we need to ensure that no existing deployment is affected by the change, which means we cannot default to only using the FQDN or similar, as that would be an upgrade breakage. So we have to maintain the current behavior by default and enhance neutron to either fall back to the FQDN if the hostname-based lookup fails, or use the new config introduced by Takashi's patch where the FQDN is used as the server's canonical hostname.
I've seen cases where it switched from short to fqdn but I'm not sure of the root cause - DHCP lease setting a hostname/domainname perhaps.
Thanks, Ollie
to determine a hypervisor name to search the corresponding resource provider. On the other hand, nova uses libvirt's getHostname function (if libvirt driver is used) which returns a canonical name. Canonical name can be shortname or FQDN (*1) and if FQDN is used then neutron and nova never agree.
(*1) IMO this is likely to happen in real deployments. For example, TripleO uses FQDN for canonical names.
Neutron already provides the resource_provider_hypervisors option to override the hypervisor name used. However, because this option accepts a map between interface and hypervisor, setting it requires a very redundant description, especially when a compute node has multiple interfaces/bridges. The following example shows how redundant the current requirement is.
~~~
[OVS]
resource_provider_bandwidths=br-data1:1024:1024,br-data2:1024:1024,\
br-data3:1024:1024,br-data4:1024:1024
resource_provider_hypervisors=br-data1:compute0.mydomain,br-data2:\
compute0.mydomain,br-data3:compute0.mydomain,br-data4:compute0.mydomain
~~~
I've submitted a change that proposes a new single parameter to override the base hypervisor name, but it is currently -2ed, mainly because my proposal lacked an analysis of the root cause of the mismatch.
(1) https://review.opendev.org/c/openstack/neutron/+/763563
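The idea of a single base name with optional per-device overrides could be sketched roughly as below. This is an illustration only, not the code from the patch: the function names and the per_device/default parameters are placeholders, and the bandwidth parsing is simplified to the device:egress:ingress form shown in the example config.

```python
import socket


def parse_bandwidths(value):
    """Parse 'dev:egress:ingress,...' into {dev: (egress, ingress)} (simplified)."""
    result = {}
    for item in value.split(","):
        device, egress, ingress = item.strip().split(":")
        result[device] = (int(egress), int(ingress))
    return result


def hypervisor_per_device(bandwidths, per_device=None, default=None):
    """Map every configured device to a hypervisor name.

    per_device: explicit device -> hypervisor overrides (the existing style).
    default: hypothetical single base name applied to all other devices.
    Falls back to socket.gethostname() when neither is given.
    """
    per_device = per_device or {}
    base = default or socket.gethostname()
    return {device: per_device.get(device, base) for device in bandwidths}


if __name__ == "__main__":
    bw = parse_bandwidths("br-data1:1024:1024,br-data2:1024:1024")
    print(hypervisor_per_device(bw, default="compute0.mydomain"))
```

With one base name, the four repeated bridge:hypervisor pairs in the example collapse to a single setting, while a per-device entry can still override it where needed.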
On the other hand, I submitted a different change to neutron which implements logic to get a hypervisor name that is fully compatible with libvirt. While this would save users from having to override hypervisor names at all, I'm aware that it might break other virt drivers which depend on different logic to generate a hypervisor name. IMO the patch is still useful considering that the libvirt driver is likely the most popular option now, but I'm not fully aware of the impact on the other drivers, especially because I don't know which virt drivers support the minimum QoS feature now.
(2) https://review.opendev.org/c/openstack/neutron/+/788893/
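For reference, the libvirt-style behavior can be approximated in Python roughly as follows. This is a sketch based on my understanding of how libvirt canonicalizes the hostname (keep the name if it already has a dot or starts with "localhost", otherwise ask the resolver for the canonical name); it is not the code from the patch, and the function name and the injectable gethostname/getaddrinfo arguments are assumptions added to make it testable.

```python
import socket


def libvirt_like_hostname(gethostname=socket.gethostname,
                          getaddrinfo=socket.getaddrinfo):
    """Return a hostname canonicalized roughly the way libvirt is understood to."""
    name = gethostname()
    # Already an FQDN, or localhost: no further canonicalization is attempted.
    if name.startswith("localhost") or "." in name:
        return name
    try:
        infos = getaddrinfo(name, None, family=socket.AF_UNSPEC,
                            flags=socket.AI_CANONNAME)
    except OSError:
        # Resolution failed (socket.gaierror is an OSError); keep the short name.
        return name
    # Each entry is (family, type, proto, canonname, sockaddr).
    canonical = infos[0][3] if infos else ""
    if canonical and not canonical.startswith("localhost"):
        return canonical
    return name


if __name__ == "__main__":
    print(libvirt_like_hostname())
```

Other virt drivers may derive the hypervisor name differently, which is the compatibility concern raised above.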
In the review of (2), Sean mentioned implementing logic to determine an appropriate resource provider (3) even if there is a mismatch in host name format, but I'm not sure how I would implement that, tbh.
My current thought is to merge (1) as a quick solution first, and discuss whether we should merge (2), but I'd like to ask for some feedback about this plan (like we should NOT merge (2)).
I'd appreciate your thoughts about this $topic.
Thank you, Takashi
Thank you all for your additional thoughts.
Because I've not received very strong objections to the two existing patches[1][2], I updated them to resolve the conflicts between them.
[1] https://review.opendev.org/c/openstack/neutron/+/763563
[2] https://review.opendev.org/c/openstack/neutron/+/788893
I made the patch that adds the default hypervisor name the base one, because it doesn't change behavior and would be "safe" for backports. So far we have received positive feedback about fixing compatibility with libvirt (in master), but I'll create a backport of that change as well, to ask for feedback about its benefit and risk as a backport.
I think the strategy is now clear with this feedback, but please feel free to share your thoughts in this thread or on the above patches.
If we want to "fix" this in neutron, then neutron should either try looking up the RP using the host name and then fall back to using the FQDN, or we should look at using the hypervisor API as we discussed a few years ago when this last came up.
http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044.html
I feel like this discussion would be a good chance to revisit the need for a basic client implementation for placement (or an abstraction layer like castellan). Currently each component, such as nova, neutron, and cyborg(?), has its own placement client implementation (and its own logic to query resource providers), but IMO it would be more efficient to maintain a common client implementation instead.
It may be useful in the form of a placement-lib.
On Tue, 2021-06-15 at 09:17 +0900, Takashi Kajinami wrote:
This is not something that could have been addressed in a common client, however, as for example ironic or other clustered drivers have 1 compute service but multiple resource providers per compute service, so we can't always assume 1:1 mappings. It's why we can't use conf.HOST in the general case, although we could have used it for libvirt.
For many deployments that do not set the FQDN as the canonical host name in /etc/hosts, the current default behavior works out of the box. Whatever solution we take, we need to ensure that no existing deployment is affected by the change, which means we cannot default to only using the FQDN or similar, as that would be an upgrade breakage. So we have to maintain the current behavior by default and enhance neutron to either fall back to the FQDN if the hostname-based lookup fails, or use the new config introduced by Takashi's patch where the FQDN is used as the server's canonical hostname.
Thank you for pointing this out. To be clear, the behavior change I proposed[2] doesn't break any deployment with libvirt but would break deployments with non-libvirt drivers. This point should be considered when reviewing that change. So far most of the feedback I have received is that fixing compatibility with libvirt is preferred, as it's the "default" option, but please share your thoughts on the patch.
OK, there are 3 names that are likely to be used: the hostname, the FQDN, and the value of conf.HOST. conf.HOST defaults to the hostname. If we are to enhance the default behavior, I think we should just implement a fallback behavior which would check all 3 values if they are distinct, i.e. look up by hostname; if that fails, look up by FQDN; if that fails, look up by conf.HOST if and only if it is not the same as the hostname (its default value) or the FQDN. It would be unusual for conf.HOST to not match the hostname or FQDN, but it does happen, for example if you are running multiple virt drivers on the same host, when you deploy say libvirt and ironic on the same host, or you use the fake driver for scale testing.
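The fallback order described above could look roughly like this. It is a minimal sketch, not neutron code: lookup is a hypothetical callable standing in for the placement query by resource provider name (returning None when nothing matches), and the function names are made up for illustration.

```python
def candidate_names(hostname, fqdn, conf_host):
    """Return the distinct names to try, in order: hostname, FQDN, conf host."""
    names = []
    for name in (hostname, fqdn, conf_host):
        # Skip empty values and anything already tried, so conf_host is only
        # used when it differs from both the hostname and the FQDN.
        if name and name not in names:
            names.append(name)
    return names


def find_resource_provider(lookup, hostname, fqdn, conf_host):
    """Try each candidate name until the (hypothetical) lookup finds a provider."""
    for name in candidate_names(hostname, fqdn, conf_host):
        provider = lookup(name)
        if provider is not None:
            return provider
    return None


if __name__ == "__main__":
    # Fake placement data: the RP is registered under the FQDN.
    providers = {"compute0.mydomain": {"uuid": "rp-1"}}
    print(find_resource_provider(providers.get, "compute0",
                                 "compute0.mydomain", "compute0"))
```

Because the hostname is tried first, deployments where the current default already works keep their behavior; the extra lookups only happen when the first one fails.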
On Mon, Jun 14, 2021 at 7:30 PM Sean Mooney <smooney@redhat.com> wrote:
On Fri, Jun 11, 2021 at 8:48 PM Oliver Walsh <owalsh@redhat.com> wrote:
Hi Takashi,
On Thu, 10 Jun 2021 at 15:06, Takashi Kajinami <tkajinam@redhat.com> wrote:
participants (5)
- Oliver Walsh
- Rodolfo Alonso Hernandez
- Sean Mooney
- Slawek Kaplonski
- Takashi Kajinami