<div dir="auto">Note the HTTPS in the traceback in the bug report, and the mention of adjusting the Apache MPM settings to fix it. Both seem to point to an issue with Apache in the middle rather than eventlet and API_WORKERS.</div><div class="gmail_extra"><br><div class="gmail_quote">On Feb 2, 2017 14:36, "Ihar Hrachyshka" <<a href="mailto:ihrachys@redhat.com">ihrachys@redhat.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">The BadStatusLine error is well known:<br>

<a href="https://bugs.launchpad.net/nova/+bug/1630664" rel="noreferrer" target="_blank">https://bugs.launchpad.net/<wbr>nova/+bug/1630664</a><br>

<br>

Now, it doesn't mean that the root cause of the error message is the<br>

same, and it may as well be that lowering the number of workers<br>

triggered it. All I am saying is we saw that error in the past.<br>

<br>

Ihar<br>

<br>

On Thu, Feb 2, 2017 at 1:07 PM, Kevin Benton &lt;kevin@benton.pub&gt; wrote:<br>

> This error seems to be new in the ocata cycle. It's either related to a<br>

> dependency change or the fact that we put Apache in between the services<br>

> now. Handling more concurrent requests than workers wasn't an issue before.<br>

><br>

> It seems that you are suggesting that eventlet can't handle concurrent<br>

> connections, which is the entire purpose of the library, no?<br>

><br>

> On Feb 2, 2017 13:53, "Sean Dague" <<a href="mailto:sean@dague.net">sean@dague.net</a>> wrote:<br>

>><br>

>> On 02/02/2017 03:32 PM, Armando M. wrote:<br>

>> ><br>

>> ><br>

>> > On 2 February 2017 at 12:19, Sean Dague <<a href="mailto:sean@dague.net">sean@dague.net</a>> wrote:<br>

>> ><br>

>> >     On 02/02/2017 02:28 PM, Armando M. wrote:<br>

>> >     ><br>

>> >     ><br>

>> >     > On 2 February 2017 at 10:08, Sean Dague <<a href="mailto:sean@dague.net">sean@dague.net</a>> wrote:<br>

>> >     ><br>

>> >     >     On 02/02/2017 12:49 PM, Armando M. wrote:<br>

>> >     >     ><br>

>> >     >     ><br>

>> >     >     > On 2 February 2017 at 08:40, Sean Dague <<a href="mailto:sean@dague.net">sean@dague.net</a>> wrote:<br>

>> >     >     ><br>

>> >     >     >     On 02/02/2017 11:16 AM, Matthew Treinish wrote:<br>

>> >     >     >     &lt;snip&gt;<br>

>> >     >     >     > &lt;oops, forgot to finish my thought&gt;<br>

>> >     >     >     ><br>

>> >     >     >     > We definitely aren't saying running a single worker is how we<br>

>> >     >     >     > recommend people run OpenStack by doing this. But it just adds on to<br>

>> >     >     >     > the differences between the gate and what we expect things actually<br>

>> >     >     >     > look like.<br>

>> >     >     ><br>

>> >     >     >     I'm all for actually getting to the bottom of this, but honestly real<br>

>> >     >     >     memory profiling is needed here. The growth across projects probably<br>

>> >     >     >     means that some common libraries are some part of this. The ever growing<br>

>> >     >     >     requirements list is demonstrative of that. Code reuse is good, but if<br>

>> >     >     >     we are importing much of a library to get access to a couple of<br>

>> >     >     >     functions, we're going to take a bunch of memory weight on that<br>

>> >     >     >     (especially if that library has friendly auto imports in top level<br>

>> >     >     >     __init__.py so we can't get only the parts we want).<br>

>> >     >     ><br>

>> >     >     >     Changing the worker count is just shuffling around deck chairs.<br>

>> >     >     ><br>

>> >     >     >     I'm not familiar enough with memory profiling tools in python to know<br>

>> >     >     >     the right approach we should take there to get this down to individual<br>

>> >     >     >     libraries / objects that are containing all our memory. Anyone more<br>

>> >     >     >     skilled here able to help lead the way?<br>

>> >     >     ><br>

>> >     >     ><br>

>> >     >     > From what I hear, the overall consensus on this matter is to determine<br>

>> >     >     > what actually caused the memory consumption bump and how to address it,<br>

>> >     >     > but that's more of a medium to long term action. In fact, to me this is<br>

>> >     >     > one of the top priority matters we should talk about at the imminent PTG.<br>

>> >     >     ><br>

>> >     >     > For the time being, and to provide relief to the gate, should we want to<br>

>> >     >     > lock the API_WORKERS to 1? I'll post something for review and see how<br>

>> >     >     > many people shoot it down :)<br>

>> >     ><br>

>> >     >     I don't think we want to do that. It's going to force down the eventlet<br>

>> >     >     API workers to being a single process, and it's not super clear that<br>

>> >     >     eventlet handles backups on the inbound socket well. I honestly would<br>

>> >     >     expect that creates different hard to debug issues, especially with high<br>

>> >     >     chatter rates between services.<br>

>> >     ><br>

>> >     ><br>

>> >     > I must admit I share your fear, but out of the tests that I have<br>

>> >     > executed so far in [1,2,3], the house didn't burn in a fire. I am<br>

>> >     > looking for other ways to have a substantial memory saving with a<br>

>> >     > relatively quick and dirty fix, but coming up empty handed thus far.<br>

>> >     ><br>

>> >     > [1] <a href="https://review.openstack.org/#/c/428303/" rel="noreferrer" target="_blank">https://review.openstack.org/#<wbr>/c/428303/</a><br>

>> >     > [2] <a href="https://review.openstack.org/#/c/427919/" rel="noreferrer" target="_blank">https://review.openstack.org/#<wbr>/c/427919/</a><br>

>> >     > [3] <a href="https://review.openstack.org/#/c/427921/" rel="noreferrer" target="_blank">https://review.openstack.org/#<wbr>/c/427921/</a><br>

>> ><br>

>> >     This failure in the first patch -<br>

>> >     <a href="http://logs.openstack.org/03/428303/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/71f42ea/logs/screen-n-api.txt.gz?level=TRACE#_2017-02-02_19_14_11_751" rel="noreferrer" target="_blank">http://logs.openstack.org/03/<wbr>428303/1/check/gate-tempest-<wbr>dsvm-neutron-full-ubuntu-<wbr>xenial/71f42ea/logs/screen-n-<wbr>api.txt.gz?level=TRACE#_2017-<wbr>02-02_19_14_11_751</a><br>

>> >     looks exactly like what I would expect from API worker starvation.<br>

>> ><br>

>> ><br>

>> > Not sure I agree on this one, this has been observed multiple times in<br>

>> > the gate already [1] (though I am not sure there's a bug for it), and I<br>

>> > don't believe it has anything to do with the number of API workers,<br>

>> > unless not even two workers are enough.<br>

>><br>

>> There is no guarantee that 2 workers are enough. I'm not surprised if we<br>

>> see that failure some today. This was all guess work on trimming worker<br>

>> counts to deal with the memory issue in the past. But we're running<br>

>> tests in parallel, and the services are making calls back to other<br>

>> services all the time.<br>

>><br>

>> This is one of the reasons to get the wsgi stack off of eventlet and<br>

>> into a real webserver, as they handle HTTP request backups much much<br>

>> better.<br>

>><br>

>> I do understand that people want a quick fix here, but I'm not convinced<br>

>> that it exists.<br>

>><br>

>>         -Sean<br>

>><br>

>> --<br>

>> Sean Dague<br>

>> <a href="http://dague.net" rel="noreferrer" target="_blank">http://dague.net</a><br>

>><br>

>> ______________________________<wbr>______________________________<wbr>______________<br>

>> OpenStack Development Mailing List (not for usage questions)<br>

>> Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.<wbr>openstack.org?subject:<wbr>unsubscribe</a><br>

>> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/<wbr>cgi-bin/mailman/listinfo/<wbr>openstack-dev</a><br>

><br>

><br>


><br>

<br>


</blockquote></div></div>