creating instances, haproxy eats CPU, glance eats RAM
Hi everyone,

We have a Queens/Rocky environment with haproxy in front of most services. Recently we've found a problem when creating multiple instances (2 VCPUs, 6GB RAM) from large images. The behaviour is the same whether we use Horizon or Terraform, so I've continued on with Terraform since it's easier to repeat attempts.

WORKS: 340MB image, create 80x instances, 40 at a time
FAILS: 20GB image, create 40x instances, 20 at a time

...

Changed haproxy config, added "balance roundrobin" to glance, cinder, nova, neutron stanzas (there was no 'balance' config before, not sure what it would have been doing)

...

WORKS sometimes[1]: 20GB image, create 40x instances, 20 at a time
FAILS: 20GB image, create 80x instances, 40 at a time

The failure condition:
* The active haproxy server has a single core go to 100% usage (the rest are idle)
* One glance server's RAM usage grows rapidly and continuously
* Some instances that are building complete
* Creating new instances fails (BUILD state forever)
* Horizon becomes unresponsive
* Ceph and Cinder don't appear to be overloaded (ceph -s, logs, system state)
* These states do not recover until we take the following actions...

To recover:
* Kill any remaining (Terraform) attempts to launch instances
* Stop haproxy on the active server
* Wait a few seconds
* Start haproxy again

[1] When we create enough to not quite overload it, the haproxy server goes to 100% on one core but recovers once the instances are (slooowly) created.

The cause of the problem is not clear (e.g. from haproxy and glance logs, system state), and I'm looking for pointers on where to look or what to try next. Can you help?

Thank you,
Greg.
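For reference, a sketch of what a glance backend stanza looks like with that change (the backend name, server names and addresses below are examples only, not the actual config; note that roundrobin is also haproxy's default algorithm when no 'balance' line is present, so adding it mainly makes the choice explicit):

    backend glance_api
        balance roundrobin
        option httpchk
        server glance1 192.0.2.11:9292 check
        server glance2 192.0.2.12:9292 check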
Hi again everyone,

On 1/8/19 11:12 am, Gregory Orange wrote:
We have a Queens/Rocky environment with haproxy in front of most services. Recently we've found a problem when creating multiple instances (2 VCPUs, 6GB RAM) from large images. The behaviour is the same whether we use Horizon or Terraform, so I've continued on with Terraform since it's easier to repeat attempts.
As a followup, I found a neutron server stuck with one of its cores consumed to 100%, and RAM and swap exhausted. After rebooting that server, everything worked fine. Over the next hour, RAM and swap were exhausted again by lots of spawning processes (a few hundred neutron-rootwrap-daemon), and oom-killer cleaned it up, resulting in a loop where it fills and empties RAM every 20-60 minutes. We have some other neutron changes planned, so for now we have left that one turned off, and the other two (which have less RAM) are working fine without these symptoms.

Strange, but I'm glad to have found something, and that it's working for now.

Regards,
Greg.
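A quick sketch of how that symptom can be confirmed on the affected neutron server (plain Linux commands; nothing beyond what is described above is assumed):

    # Count the neutron-rootwrap-daemon processes piling up
    pgrep -fc neutron-rootwrap-daemon

    # Watch RAM and swap being consumed
    free -m

    # Look for oom-killer activity in the kernel log
    dmesg -T | grep -i 'killed process'
    # or, on systemd hosts:
    journalctl -k | grep -i oom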
When role separation was introduced in the Newton release, we divided the memory-hungry processes into 4 different VMs on 3 physical boxes:
1) Networker: all Neutron agent processes (network throughput)
2) Systemd: all services started by systemd (Neutron)
3) pcs: all services controlled by pcs (Galera + RabbitMQ)
4) Horizon

I'm not sure how to do it now; I think I will go for VMs again, and those VMs will include containers. It is easier to recover and rebuild the whole OpenStack that way.

Gregory > do you have local storage for the swift and cinder backend?
if local; then
    do you use RAID? if yes; then which RAID?; fi
    do you use ssd?
    do you use CEPH as the backend for cinder and swift?
fi

Also double check where the _base image is located - is it in /var/lib/nova/instances/_base/* ? And are the flavor disks stored in /var/lib/nova/instances ? (You can check on a compute node with: virsh domiflist instance-00000## ; these checks are sketched below, after the quoted message.)

On Thu, 1 Aug 2019 at 09:25, Gregory Orange <gregory.orange@pawsey.org.au> wrote:
Hi again everyone,
On 1/8/19 11:12 am, Gregory Orange wrote:
We have a Queens/Rocky environment with haproxy in front of most services. Recently we've found a problem when creating multiple instances (2 VCPUs, 6GB RAM) from large images. The behaviour is the same whether we use Horizon or Terraform, so I've continued on with Terraform since it's easier to repeat attempts.
As a followup, I found a neutron server stuck with one of its cores consumed to 100%, and RAM and swap exhausted. After rebooting that server, everything worked fine. Over the next hour, RAM and swap were exhausted again by lots of spawning processes (a few hundred neutron-rootwrap-daemon), and oom-killer cleaned it up, resulting in a loop where it fills and empties RAM every 20-60 minutes. We have some other neutron changes planned, so for now we have left that one turned off, and the other two (which have less RAM) are working fine without these symptoms.
Strange, but I'm glad to have found something, and that it's working for now.
Regards, Greg.
-- Ruslanas Gžibovskis +370 6030 7030
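A rough sketch of the compute-node checks asked about above (the paths are nova's defaults; adjust for your deployment):

    # Is there a locally cached _base copy of the image?
    ls -lh /var/lib/nova/instances/_base/

    # Are the flavor (root/ephemeral) disks stored locally?
    ls -lh /var/lib/nova/instances/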
Hello Ruslanas and thank you for the response. I didn't see it until now! I have given some responses inline...

On 1/8/19 3:57 pm, Ruslanas Gžibovskis wrote:
When role separation was introduced in the Newton release, we divided the memory-hungry processes into 4 different VMs on 3 physical boxes:
1) Networker: all Neutron agent processes (network throughput)
2) Systemd: all services started by systemd (Neutron)
3) pcs: all services controlled by pcs (Galera + RabbitMQ)
4) Horizon
We have separated each control plane service (Glance, Neutron, Cinder, etc) onto its own VM. We are considering containers instead of VMs in future.
Gregory > do you have local storage for the swift and cinder backend?
Our Cinder and Glance use Ceph as backend. No Swift installed.
Also double check where the _base image is located - is it in /var/lib/nova/instances/_base/* ? And are the flavor disks stored in /var/lib/nova/instances ? (You can check on a compute node with: virsh domiflist instance-00000## )
domiflist shows the VM's interface - how does that help? Greg.
I am not very experienced with containers, just starting to learn them, so I'm not sure how they are limited.

So you are using local hard drives. I guess that is one of the possible causes of the slowdown, somehow. I ask my developers to use heat to create more than 1 instance/resource.

Try checking CEPH speed. I think CEPH has an option to send the "created" callback after 1 copy is created/written to HDD, and then finish duplicating or tripling the data in the background, which makes the CEPH data not so reliable but much faster. I would need to google for that, I do not remember it.

Sorry, yes, my fault: not domiflist but domblklist: virsh domblklist instance-00000## (these checks are sketched below, after the quoted message).

Generally, I have the same issue as you have, but on an older version of OpenStack (Mitaka, Mirantis implementation). I have difficulties when an instance using a CEPH-based volume shares it over NFS from instance1 on compute1 to instance2 on another compute2. I get around 13KB/s; if I re-share it from the root drive, I get around 30KB/s - still too low.

On Thu, 15 Aug 2019 at 09:35, Gregory Orange <gregory.orange@pawsey.org.au> wrote:
Hello Ruslanas and thank you for the response. I didn't see it until now! I have given some responses inline...
On 1/8/19 3:57 pm, Ruslanas Gžibovskis wrote:
When role separation was introduced in the Newton release, we divided the memory-hungry processes into 4 different VMs on 3 physical boxes:
1) Networker: all Neutron agent processes (network throughput)
2) Systemd: all services started by systemd (Neutron)
3) pcs: all services controlled by pcs (Galera + RabbitMQ)
4) Horizon
We have separated each control plane service (Glance, Neutron, Cinder, etc) onto its own VM. We are considering containers instead of VMs in future.
Gregory > do you have local storage for the swift and cinder backend?
Our Cinder and Glance use Ceph as backend. No Swift installed.
Also double check where the _base image is located - is it in /var/lib/nova/instances/_base/* ? And are the flavor disks stored in /var/lib/nova/instances ? (You can check on a compute node with: virsh domiflist instance-00000## )
domiflist shows the VM's interface - how does that help?
Greg.
-- Ruslanas Gžibovskis +370 6030 7030
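A rough sketch of the checks mentioned in the message above (the instance name is a placeholder as in the original, and the pool name "volumes" is only an example - it will differ per deployment):

    # On a compute node: list the instance's block devices (RBD vs local file)
    virsh domblklist instance-00000##

    # Quick Ceph write-speed check against the relevant pool
    # (10-second write benchmark; objects are cleaned up automatically afterwards)
    rados bench -p volumes 10 write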