[openstack-dev] [nova] Fixing the console.log grows forever bug.

Daniel P. Berrange berrange at redhat.com
Mon Dec 8 13:39:35 UTC 2014


On Mon, Dec 08, 2014 at 01:20:19PM +0000, Dave Walker wrote:
> On 8 December 2014 at 10:33, Daniel P. Berrange <berrange at redhat.com> wrote:
> > On Sat, Dec 06, 2014 at 04:38:52PM +1100, Tony Breeds wrote:
> >> Hi All,
> >>     In the most recent team meeting we briefly discussed: [1] where the
> >> console.log grows indefinitely, eventually causing guest stalls.  I mentioned
> >> that I was working on a spec to fix this issue.
> >>
> >> My original plan was fairly similar to [2]  In that we'd switch libvirt/qemu to
> >> using a unix domain socket and write a simple helper to read from that socket
> >> and write to disk.  That helper would close and reopen the on disk file upon
> >> receiving a HUP (so logrotate just works).  Life would be good, and we could
> >> all move on.
> >>
> >> However I was encouraged to investigate fixing this in qemu, such that qemu
> >> could process the HUP and make life better for all.  This is certainly doable
> >> and I'm happy[3] to do this work.  I've floated the idea past qemu-devel and
> >> they seem okay with the idea.  My main concern is in lag and supporting
> >> qemu/libvirt that can't handle this option.
> >
> > As mentioned in my reply on qemu-devel, I think the right long term solution
> > for this is to fix it in libvirt. We have a general security goal to remove
> > QEMU's ability to open any files whatsoever, instead having it receive all
> > host resources as pre-opened file descriptors from libvirt. So what we
> > anticipate is a new libvirt daemon for processing logs, virtlogd. Anywhere
> > where QEMU currently gets a file to log to (<serial> devices, and its
> > stdout/stderr), it would instead be given a FD that's connected to virtlogd.
> > virtlogd would simply write the data out to file & would be able to close
> > & re-open files to integrate with logrotate.
> >
> >> For the sake of discussion  I'll lay out my best guess right now on fixing this
> >> in qemu.
> >>
> >> qemu 2.2.0 /should/ release this year the ETA is 2014-12-09[4] so the fix I'm
> >> proposing would be available in qemu 2.3.0 which I think will be available in
> >> June/July 2015.  So we'd be into 'L' development before this fix is available
> >> and possibly 'M' before the community distros (Fedora and  Ubuntu)[5] include
> >> and almost certainly longer for Enterprise distros.  Along with the qemu
> >> development I expect there to be some libvirt development as well but right now
> >> I don't think that's critical to the feature or this discussion.
> >>
> >> So if that timeline is approximately correct:
> >>
> >> - Can we wait this long to fix the bug?  As opposed to having it squashed in Kilo.
> >> - What do we do in nova for the next ~12 months while we know there isn't a qemu to fix this?
> >> - Then once there is a qemu that fixes the issue, do we just say 'thou must use
> >>   qemu 2.3.0' or would nova still need to support old and new qemu's ?
> >
> > FWIW, by comparison libvirt is on a monthly release schedule, so a fix done in
> > libvirt has potential to be available sooner, though obviously there's bigger
> > dev work to be done in libvirt for this.
> >
> > Regards,
> > Daniel
> 
> Hey,
> 
> This thread started by suggesting having a scheduled task to read from
> a unix socket.  I don't think this can really be considered an
> acceptable fix, as the guest does indeed lock up when the buffer is
> full.
> 
> Initially, I proposed a quick fix for this back in 2011 which provided
> a config option to enable a kernel level ring buffer via a
> non-mainline module called emlog.  This was not merged for
> understandable reasons.  (pre gerrit) -
> http://bazaar.launchpad.net/~davewalker/nova/832507_with_emlog/revision/1509/nova/virt/libvirt/connection.py
> 
> Later that same year, Robie Basak presented a change which introduced
> similar logic ringbuffer support in the nova code itself making use of
> eventlet. This seems quite a reasonable fix, but there was concern it
> might lock-up guests.. https://review.openstack.org/#/c/706/
> 
> I think shortly after this, it was pretty widely agreed that fixing
> this in Nova is not the correct layer.  Personally, I struggle
> thinking qemu or libvirt is the right layer either.  I can't think that
> treating a console as a flat log file is the best default behavior.
> 
> I still quite like the emlog approach, as having a ringbuffer device
> type in the kernel provides exactly what we need and is pretty simple
> to implement.
> 
> Does anyone know if this generic ringbuffer kernel support was
> proposed to mainline kernel?

The emlog approach means the data would only ever be stored in RAM on the
host, so in the event of a host reboot/crash you lose all guest logs.
While that might be ok for some people, I think we need to support
persistent storage of the logs on disk for historical / auditing
purposes.

We don't need kernel support to provide a ring buffer. A more or less
identical solution can be done in userspace with just a pair of fixed
size files, e.g. write to one file; when it hits a size limit, switch to
the second file, then back to the original, and so on. Disk usage is then
bounded at twice the limit, and the data survives a host reboot. We can
easily do this in a libvirt based solution.
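
To illustrate the idea only (this is not libvirt code), here is a rough
Python sketch of the two-file scheme. The class name TwoFileLog, the
".0"/".1" file suffixes and the byte limit are all made up for the
example; a real daemon would also have to deal with errors, fsync policy
and picking up where it left off after a restart.

```python
class TwoFileLog:
    """Hypothetical userspace 'ring buffer' built from two size-capped files."""

    def __init__(self, base_path, limit):
        if limit <= 0:
            raise ValueError("limit must be positive")
        # Assumed naming scheme: <base>.0 and <base>.1
        self.paths = [base_path + ".0", base_path + ".1"]
        self.limit = limit
        self.active = 0
        self.size = 0
        # Opening in "wb" mode truncates any previous contents.
        self.fh = open(self.paths[self.active], "wb")

    def _switch(self):
        # Close the full file and start over in the other one,
        # discarding whatever that file held before.
        self.fh.close()
        self.active ^= 1
        self.fh = open(self.paths[self.active], "wb")
        self.size = 0

    def write(self, data):
        # Split the data so that neither file ever exceeds the limit.
        while data:
            room = self.limit - self.size
            if room <= 0:
                self._switch()
                room = self.limit
            chunk, data = data[:room], data[room:]
            self.fh.write(chunk)
            self.size += len(chunk)
        self.fh.flush()

    def close(self):
        self.fh.close()
```

With a limit of 10 bytes, writing 25 bytes fills the first file, then the
second, then switches back to the first, so at most 20 bytes are ever on
disk and the most recent data is always retained.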

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


