<html><head><meta http-equiv="Content-Type" content="text/html; charset=us-ascii"></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div><blockquote type="cite"><blockquote type="cite"></blockquote>On Fri, Aug 16, 2013 at 11:05:19AM +0100, Daniel P. Berrange wrote:<br><blockquote type="cite"><blockquote type="cite"></blockquote>On Wed, Aug 14, 2013 at 04:53:01PM -0700, Vishvananda Ishaya wrote:<br><blockquote type="cite"><blockquote type="cite"></blockquote>Hi Everyone,<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>I have been trying for some time to get the code for the live-snapshot blueprint[1]<br><blockquote type="cite"></blockquote>in. Going through the review process for the rpc and interface code[2] was easy. I<br><blockquote type="cite"></blockquote>suspect the api-extension code[3] will also be relatively trivial to get in. The<br><blockquote type="cite"></blockquote>main concern is with the libvirt driver implementation[4]. I'd like to discuss the<br><blockquote type="cite"></blockquote>concerns and see if we can make some progress.<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>Short Summary (tl;dr)<br><blockquote type="cite"></blockquote>=====================<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>I propose we merge live-cloning as an experimental feature for Havana and have the<br><blockquote type="cite"></blockquote>api extension disabled by default.<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>Overview<br><blockquote type="cite"></blockquote>========<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>First of all, let me express the value of live snapshotting. 
The<br><blockquote type="cite"></blockquote>slowest part of the vm provisioning process is generally booting<br>of the OS.<br></blockquote></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>Like Dan I'm dubious about this whole plan. But this ^^ statement in<br>particular. I would like to see hard data to back this up.<br></blockquote><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>What we need to keep in mind is that "boot" is a small part of the picture, at least "boot" as commonly referred to in Linux.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>Consider a WebSphere-like Java bundle of code. These things take a while to load. JiT-ed methods provide a tremendous performance boost. Never mind if the server constructs secondary indices to perform fast lookups of data.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>That is just Linux. Windows is well known for pounding storage fabrics with thousands of small reads during boot storms. 
Certainly a Windows boot sequence has baked in a lot of service startup sequences that prime a lot of memory content for performance objectives.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>Boot here means "ready to rock-n-roll", not "Cirros is up."<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>We have live deployments that are based on bypassing the entire *application startup* sequence and have a server ready to provide high-performance responses to queries once spawned from a live saved image.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>You should be able to boot an OS pretty quickly, and furthermore it's<br><blockquote type="cite"></blockquote>(a) much safer for all the reasons Dan outlines, and (b) improvements<br><blockquote type="cite"></blockquote>that you make to boot times help everyone.<br><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>[...]<br><blockquote type="cite"><blockquote type="cite"><blockquote type="cite"></blockquote>2. Security Concerns<br><blockquote type="cite"></blockquote>====================<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>There are a number of security issues with loading state from another vm. 
Here is a<br><blockquote type="cite"></blockquote>short list of things that need to be done just to make a cloned vm usable:<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>a) mac address needs to be recreated<br><blockquote type="cite"></blockquote>b) entropy pool needs to be reset<br><blockquote type="cite"></blockquote>c) host name must be reset<br><blockquote type="cite"></blockquote>d) host keys must be regenerated<br><blockquote type="cite"></blockquote><br><blockquote type="cite"></blockquote>There are others, and trying to clone a running application as well may expose other<br>sensitive data, especially if users are snapshotting vms and making them public.<br></blockquote></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>Are we talking about cloning VMs that you already trust, or cloning<br><blockquote type="cite"></blockquote>random VMs and allowing random other users to use them? These would<br><blockquote type="cite"></blockquote>lead to very different solutions. In the first case, you only care<br><blockquote type="cite"></blockquote>about correctness, not security. 
In the second case, you care about<br>security as well as correctness.<br></blockquote><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>Case number one.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>The correctness issues are a hard problem, and a particularly hard one in Windows, but it is pragmatically solvable.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>For a common scenario in Linux, renewing dhcp leases and leveling your entropy pool are what you need.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>I highly doubt the second case is possible because scrubbing the disk<br>is going to take far too long for any supposed time-saving to matter.<br></blockquote><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>That would be very counter-productive, so yes, focusing on the first case.<br><blockquote type="cite"><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>As Dan says, even the first case is dubious because it won't be correct.<br><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"><blockquote type="cite"></blockquote>The libguestfs project provides tools to perform offline cloning of<br><blockquote type="cite"></blockquote>VM disk images. Its virt-sysprep knows how to delete a lot of (but by<br><blockquote type="cite"></blockquote>no means all possible) sensitive file data for common Linux &<br><blockquote type="cite"></blockquote>Windows OS. 
It still has to be combined with use of the<br><blockquote type="cite"></blockquote>virt-sparsify tool though, to ensure the deleted data is actually<br><blockquote type="cite"></blockquote>purged from the VM disk image as well as the filesystem, by<br><blockquote type="cite"></blockquote>releasing all unused VM disk sectors back to the host storage (and<br>not all storage supports that).<br></blockquote><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>Links to the tools that Dan mentions:<br><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote><a href="http://libguestfs.org/virt-sysprep.1.html">http://libguestfs.org/virt-sysprep.1.html</a><br>http://libguestfs.org/virt-sparsify.1.html<br></blockquote><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>Virt-sparsify is not strictly relevant here. The disk side of live images is carried out with qcow2.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>Virt-sysprep is great work and highly relevant.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>But virt-sysprep allows us to see the argument in a different light. Have you noticed nova does not run virt-sysprep before booting an ephemeral instance from an image? (AFAIK, could be wrong, not even regenerating host ssh keys is part of the assured workflow). Furthermore, one can create arbitrary (cold, non-live) images at any time, from live instances.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>This isn't necessarily wrong. It underpins massive deployments and pragmatically adds value. 
The fundamental semantics at play with live-instances are the same: know what you are doing, ephemeral instances, bound to your tenant.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>So as long as we take care of not weakening cryptographic foundations and correctly reconfiguring the identity, all the principles above still apply: know what you are doing, ephemeral instances, bound to your tenant.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>We are working on a proof-of-concept using the qemu guest agent to undertake all these tasks. It is a PoC to show it's doable. It's neither clean nor asking for mainline merge (at the moment). It's not the only solution for the guest side problem.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>And one final very important note for the libvirt list and those who may be tuning in just now: no changes to libvirt are being asked for here. 
This is a nova-side tech preview feature.<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote>Best,<br><blockquote type="cite"></blockquote>Andres<br><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"></blockquote><font color="#0f61c8"><br></font><blockquote type="cite"><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>Note these tools can only be used on offline machines.<br><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>Rich.<br><blockquote type="cite"></blockquote><font color="#007316"><br></font><blockquote type="cite"></blockquote>-- <br><blockquote type="cite"></blockquote>Richard Jones, Virtualization Group, Red Hat <a href="http://people.redhat.com/~rjones">http://people.redhat.com/~rjones</a><br><blockquote type="cite"></blockquote>virt-top is 'top' for virtual machines. Tiny program with many<br><blockquote type="cite"></blockquote>powerful monitoring features, net stats, disk stats, logging, etc.<br><a href="http://people.redhat.com/~rjones/virt-top">http://people.redhat.com/~rjones/virt-top</a><br></blockquote><br></div><br></body></html>