Open Stack

Thu Sep 18 10:49:17 UTC 2014

On 09/17/2014 11:50 PM, Clark Boylan wrote:
> On Wed, Sep 17, 2014, at 06:48 PM, Clark Boylan wrote:
>> On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:
>>>
>>>
>>> On 9/17/2014 7:59 PM, Ian Wienand wrote:
>>>> On 09/18/2014 09:49 AM, Clark Boylan wrote:
>>>>> Recent sampling of test run times shows that our tempest jobs run
>>>>> against clouds using PostgreSQL are significantly slower than jobs run
>>>>> against clouds using MySQL.
>>>>
>>>> FYI There is a possibly relevant review out for max_connections limits
>>>> [1], although it seems to have some issues with shmem usage
>>>>
>>>> -i
>>>>
>>>> [1] https://review.openstack.org/#/c/121952/
>>>
>>> That's a backport of a fix from master where we were hitting fatal
>>> errors due to too many DB connections, which was brought on by the
>>> changes to cinder and glance to run as many workers as there were CPUs
>>> available.  So I don't think it plays a part here...
>>>
>>> The errors pointed out in another part of the thread have been around
>>> for a while. I think they come from negative tests hitting unique
>>> constraints, so they are expected.
>>>
>>> We should also note that the postgresql jobs run with the nova metadata
>>> API service; I'm not sure how much of a factor that is here.
>>>
>>> Is there anything else unique about those jobs from the MySQL ones?
>>>
>> Good question. There are apparently other differences. The postgres job
>> runs Keystone under eventlet instead of via apache mod_wsgi. It also
>> sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
>> I can find is the one you point out, nova api metadata service is run as
>> an independent thing.
>>
>> Could these things be related? It would be relatively simple to push a
>> change or two to devstack-gate to test this but there are enough options
>> here that I probably won't do that until we think at least one of these
>> options is at fault.
> I am starting to feel bad that I picked on PostgreSQL and completely
> forgot that there were other items in play here. I went ahead and
> uploaded [0] to run all devstack jobs without keystone wsgi services
> (eventlet) and [1] to run all devstack jobs with keystone wsgi services,
> and the initial results are pretty telling.
> 
> It appears that keystone eventlet is the source of the slowness in this
> job. With keystone eventlet all of the devstack jobs are slower and with
> keystone wsgi all of the jobs are quicker. Probably need to collect a
> bit more data but this doesn't look good for keystone eventlet.
> 
> Thank you Matt for pointing me in this direction.
> 
> [0] https://review.openstack.org/#/c/122299/
> [1] https://review.openstack.org/#/c/122300/

Don't feel bad. :)

The point that Clark highlights here is a good one. There is an
assumption that once someone creates a job in infra, the magic elves are
responsible for it.

But there are no magic elves. So jobs like this need sponsors.

Maybe the right thing to do is not to conflate these configurations, and
to put an eventlet version of the keystone job only on keystone (because
the keystone team was the one that proposed having a config like that,
but it's so far away from their project that they never notice when it's
regressing).

Same issue with the metadata server split. That's really only a thing
Nova cares about. It shouldn't impact anyone else.

	-Sean

-- 
Sean Dague
http://dague.net

[openstack-dev] PostgreSQL jobs slow in the gate