[openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

Kekane, Abhishek Abhishek.Kekane at nttdata.com
Tue Mar 17 08:37:01 UTC 2015


Hi John,

Thanks for your opinion.

Fundamentally we cannot assume infinite storage space.

To enhance shelve/unshelve performance, I have proposed a nova-spec [1], in which there are two challenges:

A. This design is libvirt-specific. Currently I am using the KVM hypervisor, but I am open to extending the changes to other hypervisors.
      I don't have the know-how for other hypervisors (how to configure them, etc.), so any help from the community on this is appreciated.

B. HostAggregateGroupFilter [2] - a scheduler filter to place the instance on a different node if the shared storage is full or the required resources are not available.
     Please let me know your opinion about this HostAggregateGroupFilter; a rough sketch of the idea follows below.
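
This is only to make the intent concrete, assuming the current (Kilo) filter API. The 'shared_storage_group' aggregate metadata key and the system_metadata bookkeeping are illustrative placeholders, not necessarily what the patches [3] implement:

# Sketch only: pass hosts that either sit on different shared storage
# than the shelved instance's original group, or still have free disk
# in that group. Metadata key names here are invented for illustration.
from nova.scheduler import filters
from nova.scheduler.filters import utils


class HostAggregateGroupFilter(filters.BaseHostFilter):

    def host_passes(self, host_state, filter_properties):
        spec = filter_properties.get('request_spec', {})
        props = spec.get('instance_properties', {})
        wanted = props.get('system_metadata', {}).get('shared_storage_group')
        if not wanted:
            return True  # nothing recorded at shelve time; any host will do

        metadata = utils.aggregate_metadata_get_by_host(
            host_state, key='shared_storage_group')
        if wanted not in metadata.get('shared_storage_group', set()):
            # Host is on different shared storage; acceptable as a fallback.
            return True
        # Same storage group: pass only while there is room left on it.
        root_gb = spec.get('instance_type', {}).get('root_gb', 0)
        return host_state.free_disk_mb >= root_gb * 1024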

I request community members to go through the nova-spec [1] and the patches submitted for it [3], and give us your feedback on the same.

[1] https://review.openstack.org/135387
[2] https://review.openstack.org/150330
[3] https://review.openstack.org/150315, https://review.openstack.org/150337, https://review.openstack.org/150344

Thank You,

Abhishek Kekane

-----Original Message-----
From: John Garbutt [mailto:john at johngarbutt.com] 
Sent: 12 March 2015 17:41
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

Hi,

On 11 March 2015 at 06:35, Kekane, Abhishek <Abhishek.Kekane at nttdata.com> wrote:
> In the case of the start/stop APIs, CPU/memory are not
> released/reassigned. We could modify these APIs to release the CPU and
> memory while stopping the instance and reassign them while starting it.
> In this case the rescheduling logic also needs to be modified, to
> reschedule the instance on a different host if the required resources
> are not available while starting it. This is similar to what I have
> implemented in [2], improving the performance of the unshelve API.

I am against having stop release the resources, as you can't then guarantee that start will work quickly. Similar to suspend, I suppose.

The idea of shelve/unshelve is to avoid that problem, by ensuring you can resume the VM anywhere, should someone else use the resources you have freed up. But the idea was to optimize for a quick unshelve, where possible. The feature is not really complete; we need a scheduling weigher to deal with avoiding that capacity until you need it, etc. When you have shared storage, it may make sense to add the option of skipping the snapshot (boot from volume clearly doesn't need a snapshot), if you are happy to assume there will always be space on some host that can see that shared storage.
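
To be concrete about the weigher, something like the sketch below is what I mean. It does not exist today, and the shelved-instance counter is invented; nova would have to maintain it from shelve/unshelve events:

# Sketch only: steer fresh builds away from hosts holding capacity
# that shelved instances may want back, so quick unshelves stay
# possible for longer.
from nova.scheduler import weights


class ShelvedCapacityWeigher(weights.BaseHostWeigher):

    def _weigh_object(self, host_state, weight_properties):
        # 'num_shelved_instances' is hypothetical bookkeeping, not a
        # real HostState attribute; more shelved -> lower weight.
        shelved = getattr(host_state, 'num_shelved_instances', 0)
        return -float(shelved)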

> Please let me know your opinion on whether we can modify the
> start/stop APIs as an alternative to the shelve/unshelve APIs.

I would rather we enhance shelve/unshelve than fundamentally change the semantics of start/stop.

Thanks,
John


> From: Kekane, Abhishek [mailto:Abhishek.Kekane at nttdata.com]
> Sent: 24 February 2015 12:47
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions
>
> Hi Duncan,
>
> Thank you for the inputs.
>
> @Community-Members
>
> I want to know if there are any other alternatives to improve the
> performance of the unshelve API (booted from image only).
> Please give me your opinion on the same.
>
> Thank You,
>
> Abhishek Kekane
>
> From: Duncan Thomas [mailto:duncan.thomas at gmail.com]
> Sent: 16 February 2015 16:46
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions
>
> There has been some talk in cinder meetings about making
> cinder<->glance interactions more efficient. They are already
> optimised in some deployments, e.g. ceph glance and ceph cinder, and
> some backends cache glance images, so that creating many volumes from
> the same image becomes very efficient. (Search the meeting logs or
> channel logs for 'public snapshot' to get some entry points into the
> discussions.)
>
> I'd like to see more work done on this, and perhaps re-examine using
> cinder as a backend for glance. This would give some of what you're
> suggesting (particularly fast, low-traffic unshelve), and there is
> more that can be done to improve that performance, particularly if we
> can find a better-performing generic CoW technology than QCOW2.
>
> As suggested in the review, in the short term you might be better off
> experimenting with moving to boot-from-volume instances if you have a
> suitable cinder deployed, since that gives you some of the performance
> improvements already.
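>
> Something along these lines with the nova CLI gets you a
> boot-from-volume instance (flavor and names are placeholders, and the
> exact block-device syntax is worth checking against your client
> version):
>
>     # Build a 10GB boot volume from the image in cinder; on CoW-capable
>     # backends the volume is a cheap clone of the cached image.
>     nova boot --flavor m1.small \
>       --block-device source=image,id=<image-uuid>,dest=volume,size=10,shutdown=preserve,bootindex=0 \
>       my-bfv-instance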
>
> On 16 February 2015 at 12:10, Kekane, Abhishek
> <Abhishek.Kekane at nttdata.com> wrote:
>
> Hi Devs,
>
> Problem Statement: the performance and storage efficiency of
> shelving/unshelving an instance booted from image is far worse than
> for an instance booted from volume.
>
> When you unshelve hundreds of instances at the same time, instance
> spawning time varies, and it mainly depends on the size of the
> instance snapshot and the network speed between the glance and nova
> servers.
>
> If you have configured the file store (shared storage) as a backend
> in glance for storing images/snapshots, then it's possible to improve
> the performance of unshelve dramatically by configuring
> nova.image.download.FileTransfer in nova. In this case, nova simply
> copies the instance snapshot as if it were stored on the local
> filesystem of the compute node. But then again, in this case it is
> observed that the network traffic between the shared storage servers
> and nova increases enormously, resulting in slow spawning of the
> instances. (A rough sketch of the configuration involved follows
> below.)
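>
> For reference, the nova.conf wiring on the compute nodes looks roughly
> like this (option names as in the Kilo-era file download handler;
> treat it as a sketch, since the id must match the one glance
> advertises for its filesystem store, and glance needs
> show_image_direct_url enabled):
>
>     [DEFAULT]
>     allowed_direct_url_schemes = file
>
>     [image_file_url]
>     filesystems = sharedfs
>
>     [image_file_url:sharedfs]
>     id = <filesystem store id from glance>
>     mountpoint = /var/lib/glance/images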
>
> I would like to gather some thoughts about how we can improve the
> performance of the unshelve API (booted from image only) in terms of
> downloading large instance snapshots from glance.
>
> I have proposed a nova-spec [1] to address this performance issue.
> Please take a look at it.
>
> During the last nova mid-cycle, Michael Still suggested alternative
> solutions to tackle this issue.
>
> Storage solutions like ceph (software-based) and NetApp
> (hardware-based) support exposing images from glance to nova-compute
> and cinder-volume with a copy-on-write feature. This way there is no
> need to download the instance snapshot, and unshelve will be much
> faster than getting it from glance.
>
> Do you think the above performance issue should be handled in the
> OpenStack software, as described in the nova-spec [1], or should
> storage solutions like ceph/NetApp be used in production
> environments? Apart from ceph/NetApp, are there any other options
> available on the market?
>
> [1] https://review.openstack.org/#/c/135387/
>
> Thank You,
>
> Abhishek Kekane
>
> --
> Duncan Thomas
>

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
