<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>Hi,</p>

    <p>The issue is finally solved. It was caused by the low connection
      limit configured on the memcached servers: 1024. I noticed "too many
      files open" and "too many sockets" errors in my logs. Raising the
      memcached maximum allowed connections to 2048 or higher solved my
      problem.</p>

    <p>You can check your memcached server with
      <code>echo stats | nc localhost 11211 | grep listen_disabled</code>.
      If the reported <code>listen_disabled_num</code> value is greater
      than 1, you need to increase the maximum number of connections for
      your memcached server.</p>
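    <p>The same check can be scripted. Below is a minimal Python 3 sketch,
      assuming memcached listens on localhost:11211 as in the command above
      and that the counter matched by the grep is reported as
      <code>listen_disabled_num</code>. The function names are made up for
      illustration and are not part of any OpenStack or memcached tooling.</p>
<pre>
```python
import socket


def parse_stats(raw):
    """Turn memcached 'stats' output ("STAT name value" lines) into a dict."""
    stats = {}
    for line in raw.splitlines():
        # Split into at most three fields so values containing spaces survive.
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host="localhost", port=11211, timeout=5.0):
    """Send 'stats' to memcached and read until the terminating END line."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection early
                break
            buf += chunk
    return buf.decode("ascii", "replace")


def refused_connections(raw):
    """Return listen_disabled_num as an int (0 if the counter is absent)."""
    return int(parse_stats(raw).get("listen_disabled_num", 0))
```
</pre>
    <p>Running <code>refused_connections(fetch_stats())</code> on a
      controller returns the counter as an integer. A value that keeps
      growing between runs suggests the connection limit is still being
      hit.</p>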

    <p>For OSA (openstack-ansible), set this variable:
      <code>memcached_connections: 4096</code></p>

    <pre class="moz-signature" cols="72">Regards,

Robert Varjasi

consultant@Component Soft Ltd.

Tel: +36/30-259-9221</pre>

    <div class="moz-cite-prefix">On 10/12/2018 04:35 PM, Robert Varjasi

      wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:915bf691-d741-dd9c-7143-050088b5700b@componentsoft.io">

      <pre wrap="">Hi,

I found that my controller nodes were somewhat overloaded with 16 uwsgi

nova-api-os-compute processes. I reduced the nova-api-os-compute uwsgi

processes to 10, and the timeouts and slowdowns were eliminated. My cloud

became stable and the response times dropped. I have 20 vCPUs on a

Xeon(R) CPU E5-2630 v4 @ 2.20GHz.

For openstack-ansible, I need to change this variable from 16 to 10:

nova_wsgi_processes_max: 10. It seems I need to set it equal to the number

of my CPU cores.

Regards,

Robert Varjasi

consultant@Component Soft Ltd.

Tel: +36/30-259-9221

On 10/08/2018 06:33 PM, Robert Varjasi wrote:

</pre>

      <blockquote type="cite">

        <pre wrap="">Hi,

After a few tempest runs I noticed slowdowns in the nova-api-os-compute

uwsgi processes. I checked the processes with py-spy and found that many

of them were blocked on read(). Here is the py-spy output from one of my

nova-api-os-compute uwsgi processes: <a class="moz-txt-link-freetext" href="http://paste.openstack.org/show/731677/">http://paste.openstack.org/show/731677/</a>

And the stack trace:

thread_id = Thread-2 filename = /usr/lib/python2.7/threading.py lineno = 774 function = __bootstrap line = self.__bootstrap_inner()

thread_id = Thread-2 filename = /usr/lib/python2.7/threading.py lineno = 801 function = __bootstrap_inner line = self.run()

thread_id = Thread-2 filename = /usr/lib/python2.7/threading.py lineno = 754 function = run line = self.__target(*self.__args, **self.__kwargs)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py lineno = 382 function = poll line = self.conn.consume(timeout=current_timeout)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py lineno = 1083 function = consume line = error_callback=_error_callback)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py lineno = 807 function = ensure line = ret, channel = autoretry_method()

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/kombu/connection.py lineno = 494 function = _ensured line = return fun(*args, **kwargs)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/kombu/connection.py lineno = 570 function = __call__ line = return fun(*args, channel=channels[0], **kwargs), channels[0]

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py lineno = 796 function = execute_method line = method()

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/oslo_messaging/_drivers/impl_rabbit.py lineno = 1068 function = _consume line = self.connection.drain_events(timeout=poll_timeout)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/kombu/connection.py lineno = 301 function = drain_events line = return self.transport.drain_events(self.connection, **kwargs)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/kombu/transport/pyamqp.py lineno = 103 function = drain_events line = return connection.drain_events(**kwargs)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/amqp/connection.py lineno = 471 function = drain_events line = while not self.blocking_read(timeout):

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/amqp/connection.py lineno = 476 function = blocking_read line = frame = self.transport.read_frame()

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/amqp/transport.py lineno = 226 function = read_frame line = frame_header = read(7, True)

thread_id = Thread-2 filename = /openstack/venvs/nova-17.0.4/lib/python2.7/site-packages/amqp/transport.py lineno = 346 function = _read line = s = recv(n - len(rbuf))  # see note above

thread_id = Thread-2 filename = /usr/lib/python2.7/ssl.py lineno = 643 function = read line = v = self._sslobj.read(len)

I am using nova 17.0.4.dev1, amqp (2.2.2), oslo.messaging (5.35.0),

kombu (4.1.0). I have 3 controller nodes. OpenStack was deployed with OSA

17.0.4.

I can reproduce the read() block if I click on "Log" in Horizon to see

the console output of one of my VMs, or if I run this tempest test:

tempest.api.compute.admin.test_hypervisor.HypervisorAdminTestJSON.test_get_hypervisor_uptime.

The nova-api response time increases as more and more nova-api

processes get blocked at this read. Is this normal behavior?

--- 

Regards,

Robert Varjasi

consultant@Component Soft Ltd.

</pre>

      </blockquote>

      <pre wrap="">

</pre>

    </blockquote>

    <br>

  </body>

</html>