[dev] [infra] [devstack] [qa] [kuryr] DevStack's etcd performance on gate VMs

Michał Dulko mdulko at redhat.com
Thu Dec 13 12:39:35 UTC 2018


Hi,

In Kuryr-Kubernetes we're using the DevStack-installed etcd as the
backend store for the Kubernetes that we run on our gates. For some time
now we've been seeing degraded etcd performance, which shows up in the
logs like this [1]. Later on this leads to various K8s errors [2], [3],
up to and including missed notifications from the API, which cause
failures in the Kuryr-Kubernetes tempest tests. From what I've seen,
those etcd warnings normally mean that disk latency is high.
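
One rough way to check the disk-latency theory on a given host is to time small write+fsync cycles, since etcd fsyncs its write-ahead log on every commit. The following is only a minimal sketch and not something taken from our gate jobs: the function names, the sample count and the payload size are all assumptions, and the probe writes to the system temp directory, which may not sit on the same disk as etcd's data directory.

```python
import os
import tempfile
import time


def measure_fsync_latency(directory, samples=50, payload_size=4096):
    """Return a sorted list of write+fsync durations in seconds.

    Hypothetical helper: it appends a small payload to a temporary
    file in `directory` and forces it to disk each time.
    """
    payload = b"\0" * payload_size
    durations = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # block until the data reaches the disk
            durations.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return sorted(durations)


def percentile(sorted_values, fraction):
    """Simple rank-based percentile over an already sorted list."""
    index = min(len(sorted_values) - 1, int(fraction * len(sorted_values)))
    return sorted_values[index]


if __name__ == "__main__":
    # Assumption: the temp dir stands in for the etcd data directory.
    results = measure_fsync_latency(tempfile.gettempdir())
    print("p50: %.2f ms" % (percentile(results, 0.50) * 1000))
    print("p99: %.2f ms" % (percentile(results, 0.99) * 1000))
```

Running this on hosts from different providers and comparing the p99 values would at least show whether the slow gates really have slower fsyncs, though a single short run on a shared VM is noisy and should not be read as a benchmark.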

This seems to happen mostly on OVH and RAX hosts. I've looked at
this with the OVH folks and there isn't anything immediately alarming
about their hosts running gate VMs.

Upgrading the etcd version doesn't seem to help, and neither does patch
[4], which increases the IO priority of the etcd process.

Any ideas about what I can try next? I think we're the only project that
puts so much pressure on DevStack's etcd. Help would be very welcome;
getting rid of this issue would greatly improve the stability of our
gates.

Thanks,
Michał

[1] http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-etcd.txt.gz#_Dec_12_17_19_33_618619
[2] http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-kubernetes-api.txt.gz#_Dec_12_17_20_19_772688
[3] http://logs.openstack.org/49/624749/1/check/kuryr-kubernetes-tempest-daemon-octavia/4a47162/controller/logs/screen-kubernetes-scheduler.txt.gz#_Dec_12_17_18_59_045347
[4] https://review.openstack.org/#/c/624730/



