[openstack-dev] [nova] Fixing the console.log grows forever bug.

Daniel P. Berrange berrange at redhat.com
Mon Dec 8 10:33:46 UTC 2014


On Sat, Dec 06, 2014 at 04:38:52PM +1100, Tony Breeds wrote:
> Hi All,
>     In the most recent team meeting we briefly discussed [1], where the
> console.log grows indefinitely, eventually causing guest stalls.  I mentioned
> that I was working on a spec to fix this issue.
> 
> My original plan was fairly similar to [2], in that we'd switch libvirt/qemu to
> using a unix domain socket and write a simple helper to read from that socket
> and write to disk.  That helper would close and reopen the on-disk file upon
> receiving a HUP (so logrotate just works).  Life would be good, and we could
> all move on.
> 
> However I was encouraged to investigate fixing this in qemu, such that qemu
> could process the HUP and make life better for all.  This is certainly doable
> and I'm happy[3] to do this work.  I've floated the idea past qemu-devel and
> they seem okay with the idea.  My main concern is the lag, and supporting
> qemu/libvirt versions that can't handle this option.

As mentioned in my reply on qemu-devel, I think the right long-term solution
for this is to fix it in libvirt. We have a general security goal to remove
QEMU's ability to open any files whatsoever, instead having it receive all
host resources as pre-opened file descriptors from libvirt. So what we
anticipate is a new libvirt daemon for processing logs, virtlogd. Anywhere
where QEMU currently gets a file to log to (<serial> devices, and its
stdout/stderr), it would instead be given a FD that's connected to virtlogd.
virtlogd would simply write the data out to a file & would be able to close
& re-open files to integrate with logrotate.
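
To make the close & re-open behaviour concrete, here is a minimal Python
sketch of the general technique. It is not virtlogd (which does not exist
yet) and not the helper Tony has in mind; the names ReopeningLogWriter, pump
and install_hup_handler are made up for illustration. It assumes a Unix host
and that the rotation tool renames the old file before sending SIGHUP.

```python
import os
import signal
import socket


class ReopeningLogWriter:
    """Append bytes to a log file, reopening the path when asked to."""

    def __init__(self, path):
        self.path = path
        self._reopen_requested = False
        self._fd = self._open()

    def _open(self):
        # O_APPEND so every write lands at the end of the current file.
        return os.open(self.path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)

    def request_reopen(self, signum=None, frame=None):
        # Usable as a signal handler: only set a flag here and do the
        # actual close/open on the next write.
        self._reopen_requested = True

    def write(self, data):
        if self._reopen_requested:
            os.close(self._fd)
            self._fd = self._open()
            self._reopen_requested = False
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested.
            written = os.write(self._fd, view)
            view = view[written:]

    def close(self):
        os.close(self._fd)


def pump(sock, writer, bufsize=4096):
    """Copy data from a connected socket to the writer until EOF."""
    while True:
        chunk = sock.recv(bufsize)
        if not chunk:
            break
        writer.write(chunk)


def install_hup_handler(writer):
    """Reopen the log file on SIGHUP (call from the main thread)."""
    signal.signal(signal.SIGHUP, writer.request_reopen)
```

With something like this, a logrotate postrotate hook only needs to send HUP
to the helper: the old file keeps the data written so far and new data goes
to a freshly created file at the original path.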

> For the sake of discussion  I'll lay out my best guess right now on fixing this
> in qemu.
> 
> qemu 2.2.0 /should/ release this year (the ETA is 2014-12-09[4]), so the fix
> I'm proposing would be available in qemu 2.3.0, which I think will be
> available in June/July 2015.  So we'd be into 'L' development before this fix
> is available, and possibly 'M' before the community distros (Fedora and
> Ubuntu)[5] include it, and almost certainly longer for Enterprise distros.
> Along with the qemu development I expect there to be some libvirt development
> as well, but right now I don't think that's critical to the feature or this
> discussion.
> 
> So if that timeline is approximately correct:
> 
> - Can we wait this long to fix the bug?  As opposed to having it squashed in Kilo.
> - What do we do in nova for the next ~12 months while we know there isn't a qemu to fix this?
> - Then once there is a qemu that fixes the issue, do we just say 'thou must use
>   qemu 2.3.0' or would nova still need to support old and new qemus?

FWIW, by comparison libvirt is on a monthly release schedule, so a fix done in
libvirt has potential to be available sooner, though obviously there's bigger
dev work to be done in libvirt for this.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|


