>IMO we'd better use a backend-storage-optimized approach to access
>remote images from compute nodes instead of using iSCSI only. And from
>my experience, I'm sure iSCSI lacks stability under heavy I/O
>workload in production environments; it can cause either the VM filesystem
>to be marked as read-only or a VM kernel panic.

Yes, in this situation the problem lies in the backend storage, so no other
protocol will perform better. However, P2P transferring will greatly reduce
the workload on the backend storage, thereby increasing responsiveness.
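
To put rough numbers on this (a back-of-envelope sketch in Python; the
300MB-per-boot and 500-VM figures are the ones quoted later in this thread,
and the "stream the image roughly once" P2P assumption is mine):

    # Back-of-envelope: 500 VMs, ~300 MB read per boot, 60-second window,
    # served by 2 storage servers (figures from this thread).
    vms, mb_per_boot, servers, window_s = 500, 300, 2, 60

    total_gb = vms * mb_per_boot / 1024.0            # ~146 GB in total
    per_server_gbps = total_gb / servers / window_s  # ~1.22 GB/s sustained
    print("without P2P: %.2f GB/s per storage server" % per_server_gbps)

    # With P2P, compute nodes re-share the blocks they have already
    # fetched, so ideally the backend streams each image only about once:
    print("with P2P: ~%.2f GB total from the backend" % (mb_per_boot / 1024.0))
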
>As I said, currently Nova already has an image caching mechanism, so in
>this case P2P is just an approach that could be used for downloading or
>preheating for image caching.

Nova's image caching is file-level, while VMThunder's is block-level. And
VMThunder is designed to work in conjunction with Cinder, not Glance.
VMThunder currently uses Facebook's flashcache to implement caching;
dm-cache and bcache are also options for the future.
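
To illustrate the granularity difference (a minimal sketch only, not
VMThunder's or flashcache's actual code; BLOCK_SIZE and the
fetch_from_origin callback are hypothetical):

    BLOCK_SIZE = 4096  # hypothetical cache granularity

    class BlockCache(object):
        """Block-level copy-on-read cache: only the blocks a VM actually
        touches are fetched and kept locally, whereas a file-level cache
        is only useful once the whole image file has been downloaded."""

        def __init__(self, cache_path, fetch_from_origin):
            self.path = cache_path           # local sparse cache file
            self.fetch = fetch_from_origin   # callable(offset, size) -> bytes
            self.present = set()             # indices of blocks cached so far
            open(cache_path, "ab").close()   # make sure the cache file exists

        def read_block(self, index):
            offset = index * BLOCK_SIZE
            with open(self.path, "r+b") as f:
                if index in self.present:    # local hit: no network I/O
                    f.seek(offset)
                    return f.read(BLOCK_SIZE)
                data = self.fetch(offset, BLOCK_SIZE)  # miss: read origin once
                f.seek(offset)
                f.write(data)                # persist it (copy-on-read)
                self.present.add(index)
                return data

Repeat boots of the same image on the same node then hit the local cache
instead of the backend.
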
>I think P2P transferring/pre-caching sounds like a good way to go, as I
>mentioned as well, but for this area I'd actually like to see something
>like zero-copy + CoR. On one hand we can leverage the capability of
>downloading image bits on demand via the zero-copy approach; on the
>other hand we can avoid reading data from the remote image every time
>via CoR.

Yes, on-demand transferring is what you mean by "zero-copy", and caching
is something close to CoR. In fact, we are working on a kernel module called
foolcache that realizes true CoR. See https://github.com/lihuiba/dm-foolcache.
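
For reference, the CoW half of Zhi Yan's option #b can be sketched with a
qcow2 overlay on top of an attached base volume (hypothetical paths; this
shows the underlying mechanism only, not what any particular Nova driver
actually does):

    import subprocess

    # Hypothetical paths: the base image is attached to the compute node
    # as a block device (e.g. an iSCSI LUN); the overlay lives on local disk.
    base = "/dev/sdb"
    overlay = "/var/lib/nova/instances/instance-0001/disk.qcow2"

    # Create a copy-on-write overlay backed by the shared base image.
    # Reads of untouched blocks fall through to the base device; only the
    # blocks the instance writes are stored locally, so no image bits are
    # copied up front.
    subprocess.check_call(
        ["qemu-img", "create", "-f", "qcow2",
         "-o", "backing_file=%s,backing_fmt=raw" % base, overlay])
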
National Key Laboratory for Parallel and Distributed
Processing, College of Computer Science, National University of Defense
Technology, Changsha, Hunan Province, P.R. China
410073

At 2014-04-17 17:11:48, "Zhi Yan Liu" <lzy.dev@gmail.com> wrote:
>On Thu, Apr 17, 2014 at 4:41 PM, lihuiba <magazine.lihuiba@163.com> wrote:
>>>IMHO, zero-copy approach is better
>> VMThunder's "on-demand transferring" is the same thing as your "zero-copy
>> approach".
>> VMThunder uses iSCSI as the transferring protocol, which is option #b of
>> yours.
>>
>
>IMO we'd better use a backend-storage-optimized approach to access
>remote images from compute nodes instead of using iSCSI only. And from
>my experience, I'm sure iSCSI lacks stability under heavy I/O
>workload in production environments; it can cause either the VM filesystem
>to be marked as read-only or a VM kernel panic.
>
>>
>>>Under the #b approach, my former experience from our previous similar
>>>Cloud deployment (not OpenStack) was that: under 2 PC-server storage
>>>nodes (general *local SAS disks*, without any storage backend) +
>>>2-way/multi-path iSCSI + 1G network bandwidth, we could provision 500
>>>VMs in a minute.
>> Suppose booting one instance requires reading 300MB of data; then 500 of
>> them require 150GB. Each of the storage servers needs to send data at a
>> rate of 150GB/2/60 = 1.25GB/s on average. This is absolutely a heavy
>> burden, even for high-end storage appliances. In production systems, this
>> request (booting 500 VMs in one shot) would significantly disturb other
>> running instances accessing the same storage nodes.
>>
>> VMThunder eliminates this problem with P2P transferring and on-compute-node
>> caching. Even a PC server with one 1Gb NIC (a genuine PC server!) can boot
>> 500 VMs in a minute with ease. For the first time, VMThunder makes bulk
>> provisioning of VMs practical for production cloud systems. This is the
>> essential value of VMThunder.
>>
>
>As I said, currently Nova already has an image caching mechanism, so in
>this case P2P is just an approach that could be used for downloading or
>preheating for image caching.
>
>I think P2P transferring/pre-caching sounds like a good way to go, as I
>mentioned as well, but for this area I'd actually like to see something
>like zero-copy + CoR. On one hand we can leverage the capability of
>downloading image bits on demand via the zero-copy approach; on the
>other hand we can avoid reading data from the remote image every time
>via CoR.
>
>zhiyan
>
>>
>>
>>
>> ===================================================
>> From: Zhi Yan Liu <lzy.dev@gmail.com>
>> Date: 2014-04-17 0:02 GMT+08:00
>> Subject: Re: [openstack-dev] [Nova][blueprint] Accelerate the booting
>> process of a number of vms via VMThunder
>> To: "OpenStack Development Mailing List (not for usage questions)"
>> <openstack-dev@lists.openstack.org>
>>
>>
>>
>> Hello Yongquan Fu,
>>
>> My thoughts:
>>
>> 1. Nova already supports an image caching mechanism. It can cache the
>> image on a compute host that a VM was previously provisioned from, and
>> the next provisioning (booting the same image) doesn't need to transfer
>> it again, unless the cache manager has cleared it.
>> 2. P2P transferring and prefetching are still based on a copy mechanism;
>> IMHO, a zero-copy approach is better, and even transferring/prefetching
>> could be optimized by such an approach. (I have not checked VMThunder's
>> "on-demand transferring", but it is a kind of transferring as well, at
>> least judging from its literal meaning.)
>> And btw, IMO, we have two ways to follow the zero-copy idea:
>> a. when Nova and Glance use the same backend storage, we could use the
>> storage's native CoW/snapshot capability to prepare the VM disk instead
>> of copying/transferring image bits (over HTTP/the network or via local copy).
>> b. without "unified" storage, we could attach a volume/LUN from the
>> backend storage to the compute node as a base image, then do such a
>> CoW/snapshot on it to prepare the root/ephemeral disk of the VM. This is
>> just like boot-from-volume, but the difference is that we do the
>> CoW/snapshot on the Nova side instead of the Cinder/storage side.
>>
>> For option #a, we have already got some progress:
>> https://blueprints.launchpad.net/nova/+spec/image-multiple-location
>> https://blueprints.launchpad.net/nova/+spec/rbd-clone-image-handler
>> https://blueprints.launchpad.net/nova/+spec/vmware-clone-image-handler
>>
>> Under the #b approach, my former experience from our previous similar
>> Cloud deployment (not OpenStack) was that: under 2 PC-server storage
>> nodes (general *local SAS disks*, without any storage backend) +
>> 2-way/multi-path iSCSI + 1G network bandwidth, we could provision 500
>> VMs in a minute.
>>
>> As for the VMThunder topic, I think it sounds like a good idea; IMO P2P
>> prefetching is a valuable optimization approach for image transferring.
>>
>> zhiyan
>>
>> On Wed, Apr 16, 2014 at 9:14 PM, yongquan Fu <quanyongf@gmail.com> wrote:
>>>
>>> Dear all,
>>>
>>>
>>>
>>> We would like to present an extension to the VM-booting functionality of
>>> Nova for when a number of homogeneous VMs need to be launched at the
>>> same time.
>>>
>>>
>>>
>>> The motivation for our work is to increase the speed of provisioning VMs
>>> for large-scale scientific computing and big data processing. In such
>>> cases, we often need to boot tens or hundreds of virtual machine
>>> instances at the same time.
>>>
>>>
>>> Currently, under OpenStack, we found that creating a large number of
>>> virtual machine instances is very time-consuming. The reason is that the
>>> booting procedure is a centralized operation that involves performance
>>> bottlenecks. Before a virtual machine can actually be started, OpenStack
>>> either copies the image file (Swift) or attaches the image volume
>>> (Cinder) from the storage server to the compute node via the network.
>>> Booting a single VM needs to read a large amount of image data from the
>>> image storage server, so creating a large number of virtual machine
>>> instances causes a significant workload on the servers. The servers
>>> become quite busy, even unavailable, during the deployment phase, and it
>>> takes a very long time before the whole virtual machine cluster is
>>> usable.
>>>
>>>
>>>
>>> Our extension is based on our work on VMThunder, a novel mechanism for
>>> accelerating the deployment of large numbers of virtual machine
>>> instances. It is written in Python and can be integrated with OpenStack
>>> easily. VMThunder addresses the problem described above with the
>>> following improvements: on-demand transferring (network-attached
>>> storage), compute-node caching, P2P transferring, and prefetching.
>>> VMThunder is a scalable and cost-effective accelerator for bulk
>>> provisioning of virtual machines.
>>>
>>>
>>>
>>> We hope to receive your feedback. Any comments are extremely welcome.
>>> Thanks in advance.
>>>
>>>
>>>
>>> PS:
>>>
>>>
>>>
>>> VMThunder enhanced nova blueprint:
>>> https://blueprints.launchpad.net/nova/+spec/thunderboost
>>>
>>> VMThunder standalone project: https://launchpad.net/vmthunder
>>>
>>> VMThunder prototype: https://github.com/lihuiba/VMThunder
>>>
>>> VMThunder etherpad: https://etherpad.openstack.org/p/vmThunder
>>>
>>> VMThunder portal: http://www.vmthunder.org/
>>>
>>> VMThunder paper:
>>> http://www.computer.org/csdl/trans/td/preprint/06719385.pdf
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>>
>>> vmThunder development group
>>>
>>> PDL
>>>
>>> National University of Defense Technology
>>>
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev@lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>>
>> --
>> Yongquan Fu
>> PhD, Assistant Professor,
>> National Key Laboratory for Parallel and Distributed
>> Processing, College of Computer Science, National University of Defense
>> Technology, Changsha, Hunan Province, P.R. China
>> 410073
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev@lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev