<div dir="ltr"><div>Hello all:</div><div><br></div><div>I understand that, as a rule, we don't backport changes to a config knob's default value. But I agree with Sean and his explanation. For "uwsgi" applications, if "heartbeat_in_pthread" is False, the only drawback is that the MQ socket has to reconnect. In the case described by Slawek, the problem is more serious: once the agent has been disconnected from the MQ for a long time, it cannot reconnect and has to be restarted manually. I would backport the patch that sets this config knob to False.</div><div><br></div><div>Regards.</div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, Aug 6, 2022 at 12:08 AM Sean Mooney <<a href="mailto:smooney@redhat.com">smooney@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Fri, Aug 5, 2022 at 7:40 PM Ghanshyam Mann <<a href="mailto:gmann@ghanshyammann.com" target="_blank">gmann@ghanshyammann.com</a>> wrote:<br>
><br>
> ---- On Fri, 05 Aug 2022 17:54:25 +0530 Slawek Kaplonski wrote ---<br>
> > Hi,<br>
> ><br>
> > Some time ago oslo.messaging changed the default value of the "heartbeat_in_pthread" config option to "True" [1].<br>
> > As was noticed some time ago, this doesn't work well with nova-compute - see bug [2] for details.<br>
> > Recently we noticed in our downstream Red Hat OpenStack that nova-compute is not the only service that doesn't work well with it and can hang. We saw the same issue in various neutron agent processes, and it seems that the same can happen to any non-wsgi service that uses rabbitmq and sends heartbeats.<br>
> > So, given all of that, I just proposed changing the default value of that config option back to "False" [3].<br>
> > My question is: would it be possible and acceptable to backport such a change down to stable/wallaby (if and when it is approved for master, of course)? IMO this could be useful for users, as having this option set to "True" by default doesn't make sense for non-wsgi applications and may do more harm than good. What are your opinions about it?<br>
><br>
> This is tricky. In general, a default value change should not be backported because it changes<br>
> the default behavior and therefore breaks compatibility. But along with the cases that do not<br>
> work with the current default value (the ones you mentioned in this email), we should consider whether it<br>
> works in any other case. If so, then I think we should not backport this and instead tell operators to override<br>
> it to False as a workaround on stable branches.<br>
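For reference, the operator-side override mentioned above would look roughly like this in a service's configuration file (nova.conf, the neutron agent config, and so on). This is only a sketch, and it assumes the option is read from the [oslo_messaging_rabbit] group, where oslo.messaging registers its rabbit driver options:<br>

```ini
[oslo_messaging_rabbit]
# Run the AMQP heartbeat in a greenthread instead of a native thread.
heartbeat_in_pthread = false
```

The service has to be restarted for the new value to take effect.<br>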
As far as I am aware, the only impact of setting the default to false<br>
for wsgi applications is that those<br>
running under mod_wsgi or uwsgi may have the heartbeat greenthread<br>
killed when the wsgi server suspends the application<br>
after a timeout following the processing of an API request.<br>
<br>
There is no known negative impact from this other than a log message<br>
that can safely be ignored, in both the rabbitmq and the API logs, relating<br>
to the AMQP messaging connection being closed and reopened.<br>
<br>
Keeping the value at true can cause the nova compute agent, neutron<br>
agent and, I suspect, nova conductor/scheduler to hang following a<br>
rabbitmq disconnect.<br>
That can leave the relevant service unresponsive until it is restarted.<br>
<br>
So having the default set to true is known to break several services,<br>
but there are no known issues caused by setting it to false<br>
that impact the operation of any service.<br>
<br>
So I have a strong preference for setting this to false by default on<br>
stable branches.<br>
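A default change on a stable branch only affects services that do not set the option explicitly, so it can help to know which config files rely on the default. Below is a minimal sketch of such a check using only the Python standard library. The helper name heartbeat_setting and the sample config text are made up for illustration, the [oslo_messaging_rabbit] group name is an assumption about where the option is read from, and the parsing is only an approximation of what oslo.config does:<br>

```python
import configparser

# Assumption: the option is read from the [oslo_messaging_rabbit] group.
SECTION = "oslo_messaging_rabbit"
OPTION = "heartbeat_in_pthread"


def heartbeat_setting(conf_text):
    """Return True/False if the option is set explicitly in conf_text,
    or None if the service would fall back to the library default."""
    parser = configparser.ConfigParser(
        # Tolerate duplicate keys instead of raising on them.
        strict=False,
        interpolation=None,
        # Treat [DEFAULT] as an ordinary section so that its keys are not
        # inherited by every other section.
        default_section="__unused__",
    )
    parser.read_string(conf_text)
    if not parser.has_option(SECTION, OPTION):
        return None
    return parser.getboolean(SECTION, OPTION)


# Hypothetical config contents, for illustration only.
explicit_conf = (
    "[DEFAULT]\n"
    "debug = false\n"
    "\n"
    "[oslo_messaging_rabbit]\n"
    "heartbeat_in_pthread = false\n"
)
implicit_conf = "[DEFAULT]\ndebug = false\n"

for name, text in (("explicit", explicit_conf), ("implicit", implicit_conf)):
    print(name, heartbeat_setting(text))
```

A result of None means the service follows whatever default the installed oslo.messaging ships, so it is the one whose behavior would change after a backport.<br>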
><br>
> -gmann<br>
><br>
> ><br>
> > [1] <a href="https://review.opendev.org/c/openstack/oslo.messaging/+/747395" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/oslo.messaging/+/747395</a><br>
> > [2] <a href="https://bugs.launchpad.net/oslo.messaging/+bug/1934937" rel="noreferrer" target="_blank">https://bugs.launchpad.net/oslo.messaging/+bug/1934937</a><br>
> > [3] <a href="https://review.opendev.org/c/openstack/oslo.messaging/+/852251/" rel="noreferrer" target="_blank">https://review.opendev.org/c/openstack/oslo.messaging/+/852251/</a><br>
> ><br>
> > --<br>
> > Slawek Kaplonski<br>
> > Principal Software Engineer<br>
> > Red Hat<br>
><br>
<br>
<br>
</blockquote></div>