<html><head></head><body><div>On Tue, 2021-06-15 at 09:17 +0900, Takashi Kajinami wrote:</div><blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px #729fcf solid;padding-left:1ex"><div dir="ltr"><div>Thank you all for your additional thoughts.</div><div><br></div><div>Because I've not received any strong objections to the two existing patches[1][2],</div><div>I updated them to resolve the conflicts between them.<br></div><span class="gmail-im"> [1] <a href="https://review.opendev.org/c/openstack/neutron/+/763563" target="_blank">https://review.opendev.org/c/openstack/neutron/+/763563</a><br></span><div><span class="gmail-im"> [2] <a href="https://review.opendev.org/c/openstack/neutron/+/788893" target="_blank">https://review.opendev.org/c/openstack/neutron/+/788893</a><br></span></div><div> </div><div>I made the patch that adds a default hypervisor name the base one because it doesn't</div><div>change behavior and would be "safe" for backports. So far we have received positive</div><div>feedback about fixing compatibility with libvirt (in master), but I'll create a backport</div><div>of that change as well to ask for feedback about the benefits and risks of backporting it.</div><div><br></div><div>I think the strategy is now clear with this feedback, but please feel free to share your</div><div>thoughts in this thread or on the above patches.<br></div><div dir="ltr"><br>> if we want to "fix" this in neutron then neutron should either try<br>> looking up the RP using the host name and then fall back to using the<br>> fqdn or we should look at using the hypervisor api as we discussed a few<br>> years ago when this last came up<br><a href="http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044.html" rel="noreferrer" target="_blank">> http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044.html</a></div><div><br></div><div>I feel like this discussion would be a good chance to revisit the requirement of basic 
client</div><div>implementation for placement (or an abstraction layer like castellan).</div><div>Currently each component, like nova, neutron, and cyborg(?), has its own placement</div><div>client implementation (and logic to query resource providers), but IMO it would be more efficient<br></div><div>if we could maintain a common client implementation instead.<br></div></div></blockquote><div>it may be useful in the form of a placement-lib.</div><div>however, this is not something that could have been addressed in a common client, as for example ironic</div><div>or other clustered drivers have 1 compute service but multiple resource providers per compute service,</div><div>so we can't always assume 1:1 mappings. that's why we can't use conf.HOST in the general case, although we could</div><div>have used it for libvirt.</div><blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px #729fcf solid;padding-left:1ex"><div dir="ltr"><div><br></div><div>> for many deployments that do not set the fqdn as the canonical host name<br>> in /etc/hosts the current default behavior works out of the box<br>> whatever solution we take we need to ensure that no existing deployment<br>> is affected by the change which means we cannot default to only using<br>> the fqdn or similar as that would be an upgrade breakage so we have<br>> to maintain the current behavior by default and enhance neutron to<br>> either fall back to the fqdn if the hostname based lookup fails or use<br>> the new config introduced by takashi's patch where the fqdn is used as<br>> the server canonical hostname.</div><div>Thank you for pointing this out. To be clear, the behavior change I proposed[2] doesn't</div><div>break any deployment with libvirt but would break deployments with non-libvirt drivers.</div><div>This point should be considered when reviewing that change. 
So far most of the feedback</div><div>I received is that it is preferred to fix compatibility with libvirt as it's the "default" option</div><div>but please share your thoughts on the patch.</div></div></blockquote><div>ok, there are 3 names that are likely to be used:</div><div>the hostname, the fqdn, and the value of conf.HOST.</div><div>conf.HOST defaults to the hostname.</div><div>if we are to enhance the default behavior i think we should just implement a fallback behavior</div><div>which would check all 3 values if they are distinct,</div><div>i.e. look up by hostname; if that fails, look up by fqdn; if that fails, look up by conf.HOST if and only if it is not the same as the hostname (its default value) or the fqdn.</div><div>it would be unusual for conf.HOST to not match the hostname or fqdn, but it does happen, for example if you are running multiple</div><div>virt drivers on the same host, e.g. when you deploy libvirt and ironic on the same host, or when you use the fake driver for scale testing.</div><div><br></div><blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px #729fcf solid;padding-left:1ex"><div dir="ltr"><div><br></div><div><br></div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, Jun 14, 2021 at 7:30 PM Sean Mooney <<a href="mailto:smooney@redhat.com">smooney@redhat.com</a>> wrote:<br></div><blockquote type="cite" style="margin:0 0 0 .8ex; border-left:2px #729fcf solid;padding-left:1ex"><div>On Sat, 2021-06-12 at 00:46 +0900, Takashi Kajinami wrote:<br>> On Fri, Jun 11, 2021 at 8:48 PM Oliver Walsh <<a href="mailto:owalsh@redhat.com" target="_blank">owalsh@redhat.com</a>> wrote:<br>> > Hi Takashi,<br>> > <br>> > On Thu, 10 Jun 2021 at 15:06, Takashi Kajinami <<a href="mailto:tkajinam@redhat.com" target="_blank">tkajinam@redhat.com</a>><br>> > wrote:<br>> > > Hi All,<br>> > > <br>> > > <br>> > > I've been working on bug 1926693[1], and am lost about the<br>> > > reasonable<br>> > > solutions we expect. 
Ideally I'd need to bring this topic in the<br>> > > team meeting<br>> > > but because of the timezone gap and complicated background, I'd<br>> > > like to<br>> > > gather some feedback in ml first.<br>> > > <br>> > > [1] <a href="https://bugs.launchpad.net/neutron/+bug/1926693" rel="noreferrer" target="_blank">https://bugs.launchpad.net/neutron/+bug/1926693</a><br>> > > <br>> > > TL;DR<br>> > > Which one(or ones) would be reasonable solutions for this issue ?<br>> > > (1) <a href="https://review.opendev.org/c/openstack/neutron/+/763563" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/neutron/+/763563</a><br>> > > (2) <a href="https://review.opendev.org/c/openstack/neutron/+/788893" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/neutron/+/788893</a><br>> > > (3) Implement something different<br>> > > <br>> > > The issue I reported in the bug is that there is an inconsistency<br>> > > between<br>> > > nova and neutron about the way to determine a hypervisor name.<br>> > > Currently neutron uses socket.gethostname() (which always returns<br>> > > shortname)<br>> > > <br>> > <br>> > <br>> > socket.gethostname() can return fqdn or shortname - <br>> > <a href="https://docs.python.org/3/library/socket.html#socket.gethostname" rel="noreferrer" target="_blank">https://docs.python.org/3/library/socket.html#socket.gethostname</a>.<br>> > <br>> <br>> You are correct and my statement was not accurate.<br>> So socket.gethostname() returns what is returned by gethostname system<br>> call,<br>> and gethostname/sethostname accept both FQDN and short name,<br>> socket.gethostname()<br>> can return one of FQDN or short name.<br>> <br>> However the root problem is that this logic is not completely same as<br>> the ones used<br>> in each virt driver. 
Of course we can require people to use the "correct"<br>> format for the<br>> canonical name as well as "hostname", but fixing this problem in<br>> neutron would<br>> be much more helpful considering the effect caused by forcing users<br>> to "fix"<br>> hostname/canonical name formatting at this point.<br>this is not really something that can be fixed in neutron. <br>we can either create a common function in oslo.utils or placement-lib<br>that we can use in nova, neutron and all other projects, or we can use<br>the config option.<br></div><div><br>if we want to "fix" this in neutron then neutron should either try<br>looking up the RP using the host name and then fall back to using the<br>fqdn or we should look at using the hypervisor api as we discussed a few<br>years ago when this last came up<br></div><div><a href="http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044.html" rel="noreferrer" target="_blank">http://lists.openstack.org/pipermail/openstack-discuss/2019-November/011044.html</a><br></div><div><br>i don't think neutron should know anything about hypervisors so i would<br>just proceed with the new config option that takashi has proposed but i<br>would not implement Rodolfo's solution of adding a hypervisor_type.<br></div><div><br>just as nova has no awareness of the neutron backend and tries to treat<br>all of them the same, neutron should remain hypervisor independent and we<br>should look to provide common code that can be reused to identify the<br>RP in a separate lib as a longer term solution.<br></div><div><br>for many deployments that do not set the fqdn as the canonical host name<br>in /etc/hosts the current default behavior works out of the box<br>whatever solution we take we need to ensure that no existing deployment<br>is affected by the change which means we cannot default to only using<br>the fqdn or similar as that would be an upgrade breakage so we have<br>to maintain the current behavior by default and enhance neutron 
to<br>either fall back to the fqdn if the hostname based lookup fails or use<br>the new config introduced by takashi's patch where the fqdn is used as<br>the server canonical hostname.<br>> <br>> > I've seen cases where it switched from short to fqdn but I'm not sure<br>> > of the root cause - DHCP lease setting a hostname/domainname perhaps.<br>> > <br>> > Thanks,<br>> > Ollie<br>> > <br>> > > to determine a hypervisor name to search for the corresponding resource<br>> > > provider.<br>> > > On the other hand, nova uses libvirt's getHostname function (if<br>> > > libvirt driver is used)<br>> > > which returns a canonical name. Canonical name can be shortname or<br>> > > FQDN (*1)<br>> > > and if FQDN is used then neutron and nova never agree.<br>> > > <br>> > > (*1)<br>> > > IMO this is likely to happen in real deployments. For example,<br>> > > TripleO uses<br>> > > FQDN for canonical names. <br>> > > <br>> > > <br>> > > Neutron already provides the resource_provider_hypervisors<br>> > > option<br>> > > to override the hypervisor name used. However because this option<br>> > > accepts<br>> > > a map between interface and hypervisor, setting this parameter<br>> > > requires<br>> > > a very redundant description especially when a compute node has<br>> > > multiple<br>> > > interfaces/bridges. 
The following example shows how redundant the<br>> > > current<br>> > > requirement is.<br>> > > ~~~<br>> > > [OVS]<br>> > > resource_provider_bandwidths=br-data1:1024:1024,br-data2:1024:1024,\<br>> > > br-data3:1024:1024,br-data4:1024:1024<br>> > > resource_provider_hypervisors=br-data1:compute0.mydomain,\<br>> > > br-data2:compute0.mydomain,br-data3:compute0.mydomain,\<br>> > > br-data4:compute0.mydomain<br>> > > ~~~<br>> > > <br>> > > I've submitted a change to propose a new single parameter to<br>> > > override<br>> > > the base hypervisor name but this is currently -2ed, mainly because<br>> > > I lacked analysis of the root cause of the mismatch when I proposed<br>> > > this.<br>> > > (1) <a href="https://review.opendev.org/c/openstack/neutron/+/763563" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/neutron/+/763563</a><br>> > > <br>> > > <br>> > > On the other hand, I submitted a different change to neutron which<br>> > > implements<br>> > > the logic to get a hypervisor name which is fully compatible with<br>> > > libvirt.<br>> > > While this would save users from even overriding hypervisor names,<br>> > > I'm aware<br>> > > that this might break other virt drivers which depend on a<br>> > > different logic<br>> > > to generate a hypervisor name. 
IMO the patch is still useful<br>> > > considering<br>> > > the libvirt driver would be the most popular option now, but I'm<br>> > > not fully<br>> > > aware of the impact on the other drivers, especially because I<br>> > > don't know<br>> > > which virt driver would support the minimum QoS feature now.<br>> > > (2) <a href="https://review.opendev.org/c/openstack/neutron/+/788893/" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/neutron/+/788893/</a><br>> > > <br>> > > <br>> > > In the review of (2), Sean mentioned implementing a logic to<br>> > > determine<br>> > > an appropriate resource provider(3) even if there is a mismatch<br>> > > about<br>> > > host name format, but I'm not sure how I would implement that, tbh.<br>> > > <br>> > > <br>> > > My current thought is to merge (1) as a quick solution first, and<br>> > > discuss whether<br>> > > we should merge (2), but I'd like to ask for some feedback about<br>> > > this plan<br>> > > (like we should NOT merge (2)).<br>> > > <br>> > > I'd appreciate your thoughts about this $topic.<br>> > > <br>> > > Thank you,<br>> > > Takashi<br></div><div><br></div><div><br></div></blockquote></div></div></blockquote><div><br></div><div><span></span></div></body></html>