[dev] [infra] [devstack] [qa] [kuryr] DevStack's etcd performance on gate VMs

Michał Dulko mdulko at redhat.com
Thu Dec 13 16:20:02 UTC 2018


On Thu, 2018-12-13 at 07:45 -0600, Ben Nemec wrote:
> 
> On 12/13/18 6:39 AM, Michał Dulko wrote:
> > Hi,
> > 
> > In Kuryr-Kubernetes we're using the DevStack-installed etcd as a
> > backend store for the Kubernetes that we run on our gates. For some
> > time now we've been seeing degraded etcd performance, which manifests
> > in the logs like this [1]. Later on this leads to various K8s errors
> > [2], [3], up to missing notifications from the API, which causes
> > failures in the Kuryr-Kubernetes tempest tests. From what I've seen,
> > those etcd warnings normally mean that disk latency is high.
> > 
> > This seems to be mostly happening on OVH and RAX hosts. I've looked at
> > this with OVH folks and there isn't anything immediately alarming about
> > their hosts running gate VMs.
> > 
> > Upgrading the etcd version doesn't seem to help, and neither does
> > patch [4], which increases the IO priority of the etcd process.
> > 
> > Any ideas on what I can try next? I think we're the only project that
> > puts so much pressure on DevStack's etcd. Help would be very welcome;
> > getting rid of this issue would greatly improve the stability of our
> > gates.
> 
> Do you by any chance use grpcio to talk to etcd? If so, it's possible 
> you are hitting https://bugs.launchpad.net/python-tooz/+bug/1808046
> 
> In tooz that presents as random timeouts and everything taking a lot 
> longer than it should.

Seems like it's something else. We don't call etcd from Python through
any library. In our gates only Kubernetes talks to etcd.
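
One rough way to check the disk-latency theory on a gate VM is to time
small synchronous writes, which is broadly the kind of load a
write-ahead log puts on a disk. The following is only a minimal sketch
under stated assumptions: the function name, write size and sample count
are arbitrary illustrative choices, not values taken from etcd or from
patch [4], and the probe directory should sit on the same disk as etcd's
data directory for the numbers to mean anything.

```python
import os
import statistics
import sys
import tempfile
import time


def measure_fsync_latency(directory, samples=50, write_size=2048):
    """Write small chunks to a temp file in `directory`, fsync after
    each write, and return the per-write latencies in milliseconds.

    `samples` and `write_size` are illustrative defaults only.
    """
    payload = b"\0" * write_size
    latencies = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # force the data to stable storage
            latencies.append((time.perf_counter() - start) * 1000.0)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


if __name__ == "__main__":
    # Hypothetical usage: pass the directory to probe as the first
    # argument, otherwise fall back to the current directory.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    results = sorted(measure_fsync_latency(target))
    print("median: %.2f ms" % statistics.median(results))
    print("max:    %.2f ms" % results[-1])
```

Comparing the output between a host where the etcd warnings show up and
one where they don't would at least indicate whether synchronous write
latency differs between them.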

> > Thanks,
> > Michał
> > 
> > [1] http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-etcd.txt.gz#_Dec_12_17_19_33_618619
> > [2] http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-kubernetes-api.txt.gz#_Dec_12_17_20_19_772688
> > [3] http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-kubernetes-scheduler.txt.gz#_Dec_12_17_18_59_045347
> > [4] https://review.openstack.org/#/c/624730/
> > 
> > 




