[openstack-dev] [gate] concurrent workers are overwhelming postgresql in the gate - bug 1338841
Matt Riedemann
mriedem at linux.vnet.ibm.com
Wed Jul 9 20:07:52 UTC 2014
On 7/9/2014 2:59 PM, Matt Riedemann wrote:
> Bug 1338841 [1] started showing up yesterday and I first noticed it on
> the change to set osapi_volume_workers equal to the number of CPUs
> available by default. Similar patches for trove (api/conductor workers)
> and glance (api/registry workers) have landed in the last week also, and
> nova has been running with multiple api/conductor workers by default
> since Icehouse.
>
> It looks like the cinder change tipped the default postgresql
> max_connections over and we started getting asynchronous connection
> failures in that job. [2]
>
> We can also note that the postgresql job is the only one that runs the
> nova api-metadata service, which has its own workers.
>
> The VMs the jobs are running on have 8 VCPUs, so that's at least 88
> workers between nova (3), cinder (1), glance (2), trove (2), neutron,
> heat and ceilometer.
>
> So osapi_volume_workers (8) + n-api-meta workers (8) seems to have
> tipped it over.
>
> The first attempt at a fix is to simply double the default
> max_connections value [3].
>
> While looking up the postgresql configuration docs, I also read a bit on
> synchronous_commit=off and fsync=off, which sound like we might want to
> also think about using one of those in devstack runs since they are
> supposed to be more performant if you don't care about disaster recovery
> (which we don't in gate runs on VMs).
>
> Anyway, bumping max connections might fix the gate; I'm just sending
> this out to see if there are any postgresql experts out there with
> additional tips or insights on things we can tweak or look for,
> including whether or not it might be worthwhile to set
> synchronous_commit=off or fsync=off for gate runs.
>
> [1] https://bugs.launchpad.net/nova/+bug/1338841
> [2] http://goo.gl/yRBDjQ
> [3] https://review.openstack.org/#/c/105854/
>
Typo in my math on the workers above; the total is 67, not 88:
nova (3*8), cinder (1*8), glance (2*8), trove (2*8), neutron (1), heat
(1) and ceilometer (1) = 67.
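
For anyone who wants to re-check the arithmetic, here is a minimal sketch of the count. The per-service multipliers and the 8 VCPUs come from the numbers above; the names (VCPUS, per_cpu_services, single_worker_services, total_workers) are made up for illustration and are not real config options.

```python
# Rough worker count for one gate VM, using the figures from this thread.
# All names here are hypothetical, for illustration only.
VCPUS = 8

# Number of worker pools per service that default to one worker per CPU.
per_cpu_services = {"nova": 3, "cinder": 1, "glance": 2, "trove": 2}

# Services counted as a single worker each in the corrected math.
single_worker_services = ["neutron", "heat", "ceilometer"]


def total_workers(vcpus, per_cpu, single):
    """Sum the CPU-scaled worker pools plus one worker per remaining service."""
    scaled = sum(pools * vcpus for pools in per_cpu.values())
    return scaled + len(single)


total = total_workers(VCPUS, per_cpu_services, single_worker_services)
print(total)  # 67
```

This only counts workers; how many database connections each worker actually opens is not covered here.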
--
Thanks,
Matt Riedemann