[openstack-dev] PostgreSQL jobs slow in the gate

Matt Riedemann mriedem at linux.vnet.ibm.com
Thu Sep 18 14:11:04 UTC 2014



On 9/18/2014 5:49 AM, Sean Dague wrote:
> On 09/17/2014 11:50 PM, Clark Boylan wrote:
>> On Wed, Sep 17, 2014, at 06:48 PM, Clark Boylan wrote:
>>> On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:
>>>>
>>>>
>>>> On 9/17/2014 7:59 PM, Ian Wienand wrote:
>>>>> On 09/18/2014 09:49 AM, Clark Boylan wrote:
>>>>>> Recent sampling of test run times shows that our tempest jobs run
>>>>>> against clouds using PostgreSQL are significantly slower than jobs run
>>>>>> against clouds using MySQL.
>>>>>
>>>>> FYI There is a possibly relevant review out for max_connections limits
>>>>> [1], although it seems to have some issues with shmem usage
>>>>>
>>>>> -i
>>>>>
>>>>> [1] https://review.openstack.org/#/c/121952/
>>>>>
>>>>
>>>> That's a backport of a fix from master where we were hitting fatal
>>>> errors due to too many DB connections which was brought on by the
>>>> changes to cinder and glance to run as many workers as there were CPUs
>>>> available.  So I don't think it plays a part here.
>>>>
>>>> The errors pointed out in another part of the thread have been around
>>>> for a while. I think they come from negative tests where we're hitting
>>>> unique constraints, so they are expected.
>>>>
>>>> We should also note that the postgresql jobs run with the nova metadata
>>>> API service; I'm not sure how much of a factor that is here.
>>>>
>>>> Is there anything else that sets those jobs apart from the MySQL ones?
>>>>
>>> Good question. There are apparently other differences. The postgres job
>>> runs Keystone under eventlet instead of via apache mod_wsgi. It also
>>> sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
>>> I can find is the one you point out, nova api metadata service is run as
>>> an independent thing.
>>>
>>> Could these things be related? It would be relatively simple to push a
>>> change or two to devstack-gate to test this but there are enough options
>>> here that I probably won't do that until we think at least one of these
>>> options is at fault.
>> I am starting to feel bad that I picked on PostgreSQL and completely
>> forgot that there were other items in play here. I went ahead and
>> uploaded [0] to run all devstack jobs without keystone wsgi services
>> (eventlet) and [1] to run all devstack jobs with keystone wsgi services,
>> and the initial results are pretty telling.
>>
>> It appears that keystone eventlet is the source of the slowness in this
>> job. With keystone eventlet all of the devstack jobs are slower and with
>> keystone wsgi all of the jobs are quicker. Probably need to collect a
>> bit more data but this doesn't look good for keystone eventlet.
>>
>> Thank you Matt for pointing me in this direction.
>>
>> [0] https://review.openstack.org/#/c/122299/
>> [1] https://review.openstack.org/#/c/122300/
>
> Don't feel bad. :)
>
> The point that Clark highlights here is a good one. There is an
> assumption that once someone creates a job in infra, the magic elves are
> responsible for it.
>
> But there are no magic elves. So jobs like this need sponsors.
>
> Maybe the right thing to do is not conflate this configuration and put
> an eventlet version of the keystone job only on keystone (because the
> keystone team was the one that proposed having a config like that, but
> it's so far away from their project they aren't ever noticing when it's
> regressing).
>
> Same issue with the metadata server split. That's really only a thing
> Nova cares about. It shouldn't impact anyone else.
>
> 	-Sean
>

Neutron cares about the nova metadata API service, right?

-- 

Thanks,

Matt Riedemann



