[magnum][kolla] etcd wal sync duration issue

11 Jan 2020

      Hi,

We are using the following coe cluster template and cluster create
commands on an OpenStack Stein installation that installs Magnum 8.2.0
Kolla containers installed by Kolla-Ansible 8.0.1:

openstack coe cluster template create \

  --image Fedora-AtomicHost-29-20191126.0.x86_64_raw \

  --keypair userkey \

  --external-network ext-net \

  --dns-nameserver 1.1.1.1 \

  --master-flavor c5sd.4xlarge \

  --flavor m5sd.4xlarge \

  --coe kubernetes \

  --network-driver flannel \

  --volume-driver cinder \

  --docker-storage-driver overlay2 \

  --docker-volume-size 100 \

  --registry-enabled \

 --master-lb-enabled \

  --floating-ip-disabled \

  --fixed-network KubernetesProjectNetwork001 \

  --fixed-subnet KubernetesProjectSubnet001 \

  --labels
kube_tag=v1.15.7,cloud_provider_tag=v1.15.0,heat_container_agent_tag=ste
in-dev,master_lb_floating_ip_enabled=true \

  k8s-cluster-template-1.15.7-production-private

openstack coe cluster create \

  --cluster-template k8s-cluster-template-1.15.7-production-private \

  --keypair userkey \

  --master-count 3 \

  --node-count 3 \

  k8s-cluster001

The deploy process works perfectly, however, the cluster health status
flips between healthy and unhealthy.  The unhealthy status indicates
that etcd has an issue.

When logged into master-0 (out of 3, as configured above), "systemctl
status etcd" shows the stdout from etcd, which shows:

Jan 11 17:27:36 k8s-cluster001-4effrc2irvjq-master-0.novalocal
runc[2725]: 2020-01-11 17:27:36.548453 W | etcdserver: timed out waiting
for read index response

Jan 11 17:28:02 k8s-cluster001-4effrc2irvjq-master-0.novalocal
runc[2725]: 2020-01-11 17:28:02.960977 W | wal: sync duration of
1.696804699s, expected less than 1s

Jan 11 17:28:31 k8s-cluster001-4effrc2irvjq-master-0.novalocal
runc[2725]: 2020-01-11 17:28:31.292753 W | wal: sync duration of
2.249722223s, expected less than 1s

We also see:

Jan 11 17:40:39 k8s-cluster001-4effrc2irvjq-master-0.novalocal
runc[2725]: 2020-01-11 17:40:39.132459 I | etcdserver/api/v3rpc: grpc:
Server.processUnaryRPC failed to write status: stream error: code =
DeadlineExceeded desc = "context deadline exceeded"

We initially used relatively small flavors, but increased these to
something very large to be sure resources were not constrained in any
way.  "top" reported no CPU nor memory contention on any nodes in either
case.

Multiple clusters have been deployed, and they all have this issue,
including empty clusters that were just deployed.

I see a very large number of reports of similar issues with etcd, but
discussions lead to disk performance, which can't be the cause here, not
only because persistent storage for etcd isn't configured in Magnum, but
also the disks are "very" fast in this environment.  Looking at "vmstat
-D" from within master-0, the number of writes is minimal.  Ceilometer
logs about 15 to 20 write IOPS for this VM in Gnocchi.

Any ideas?

We are finalizing procedures to upgrade to Train, so we wanted to be
sure that we weren't running into some common issue with Stein that
would immediately be solved with Train.  If so, we will simply proceed
with the upgrade and avoid diagnosing this issue further.

Thanks!

Eric

Eric K. Miller

Feilong Wang

Eric K. Miller

Radosław Piliszek

feilong

Eric K. Miller

Feilong Wang

tags

participants (4)