[Openstack-operators] Problem with Heavy Network IO and Dnsmasq

Narayan Desai narayan.desai at gmail.com
Wed Aug 15 12:00:20 UTC 2012

On Wed, Aug 15, 2012 at 6:19 AM, Thomas Vachon <vachon at sessionm.com> wrote:
> I reported this as a bug here: https://bugs.launchpad.net/nova/+bug/1037065
> However, I was looking to see if anyone else has seen this.
> <snip>
> I was running a load test against a 4 node Cassandra cluster in
> Openstack. I have separate tenancy for each node to ensure there was
> no funny contention. Running the test 3 times produced the same
> results each time.
> About 1/3 of the way through the test, the dnsmasq process crashes
> (with no warning or error in any log). The instance will continue
> "working" but only inside of the VNC console as all outside
> connectivity is now unroutable.
> Here is a log from the dnsmasq process. The first two rows show that
> dnsmasq was working, then it just fails to route correctly back to the
> instance.

I recently ran into some sort of virtio-net bug that manifested itself
in a similar fashion (Ubuntu Precise, fwiw). Basically, it showed up
when moving large quantities of network traffic into VMs on some node
types (but not others, oddly enough). In my case, it looked like dnsmasq
was failing, but the process was still running; it had just stopped
getting requests from the clients. Rebooting instances would bring
them back into service.
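
One way to tell the two cases apart (dnsmasq actually gone vs. dnsmasq
alive but no longer seeing requests) is to check whether the process
still exists on the host that runs it. A minimal sketch in Python,
assuming a Linux host with /proc mounted; the helper name find_pids is
just for illustration and is not part of any OpenStack tool:

```python
import os

def find_pids(name, proc_root="/proc"):
    """Return sorted PIDs whose /proc/<pid>/comm equals `name`."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "net" or "sys"
        try:
            with open(os.path.join(proc_root, entry, "comm")) as f:
                if f.read().strip() == name:
                    pids.append(int(entry))
        except OSError:
            # Process exited between listdir() and open(); skip it.
            continue
    return sorted(pids)

if __name__ == "__main__":
    pids = find_pids("dnsmasq")
    print("dnsmasq pids:", pids if pids else "none (process is gone)")
```

If this prints PIDs while the instances still have no connectivity,
that would be consistent with what I saw rather than with a crash.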

Can you bring the network back up via VNC? If so, this isn't the same
as the issue I saw. If you can't, then something is stuck in virtio. I
was able to work around the problem by loading the vhost_net kernel
module on the hypervisor.
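
To see whether vhost_net is already loaded on a hypervisor, something
like the following should do. It is only a sketch: it assumes vhost_net
is built as a loadable module (a built-in driver would not be listed in
/proc/modules), and module_loaded is a made-up helper name.

```python
def module_loaded(name, modules_text):
    """True if `name` is listed as a module in /proc/modules-format text."""
    for line in modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False

if __name__ == "__main__":
    with open("/proc/modules") as f:
        loaded = module_loaded("vhost_net", f.read())
    if loaded:
        print("vhost_net is loaded")
    else:
        print("vhost_net is not loaded; as root, try: modprobe vhost_net")
```

Matching on the first field only avoids a false positive from lines
where vhost_net merely shows up in another module's "used by" column.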
