[openstack-dev] [cinder][nova] Are disk-intensive operations managed ... or not?
Preston L. Bannister
preston at bannister.us
Sun Oct 19 17:45:50 UTC 2014
Jay,
Thanks very much for the insight and links. In fact, I have visited
*almost* all the places mentioned, prior. Added clarity is good. :)
Also, to your earlier comment (to an earlier thread) about backup not
really belonging in Nova - in the main I agree. The "backup" API belongs in
Nova (as this maps cleanly to the equivalent in AWS), but the bulk of the
implementation can and should be distinct (in my opinion).
My current work is at:
https://github.com/dreadedhill-work/stack-backup
I also have matching changes to Nova and the Nova client under the same
Github account.
Please note this is very much a work in progress (as you might guess from
my prior comments). This needs a longer proper write up, and a cleaner Git
history. The code is a pretty fair ways along, but should be considered
more a rough draft, rather than a final version.
For the next few weeks, I am enormously crunched for time, as I have
promised a PoC at a site with a very large OpenStack deployment.
Noted your suggestion about the Rally team. Might be a bit before I can
pursue. :)
Again, Thanks.
On Sun, Oct 19, 2014 at 10:13 AM, Jay Pipes <jaypipes at gmail.com> wrote:
> Hi Preston, some great questions in here. Some comments inline, but tl;dr
> my answer is "yes, we need to be doing a much better job thinking about how
> I/O intensive operations affect other things running on providers of
> compute and block storage resources"
>
> On 10/19/2014 06:41 AM, Preston L. Bannister wrote:
>
>> OK, I am fairly new here (to OpenStack). Maybe I am missing something.
>> Or not.
>>
>> Have a DevStack, running in a VM (VirtualBox), backed by a single flash
>> drive (on my current generation MacBook). Could be I have something off
>> in my setup.
>>
>> Testing nova backup - first the existing implementation, then my (much
>> changed) replacement.
>>
>> Simple scripts for testing. Create images. Create instances (five). Run
>> backup on all instances.
>>
>> Currently found in:
>> https://github.com/dreadedhill-work/stack-backup/tree/master/backup-scripts
>>
>> First time I started backups of all (five) instances, load on the
>> DevStack VM went insane, and all but one backup failed. Seems that all
>> of the backups were performed immediately (or attempted), without any
>> sort of queuing or load management. Huh. Well, maybe just the backup
>> implementation is naive...
>>
>
> Yes, you are exactly correct. There is no queuing behaviour for any of the
> "backup" operations (I put "backup" operations in quotes because IMO it is
> silly to refer to them as backup operations, since all they are doing
> really is a snapshot action against the instance/volume -- and then
> attempting to be a poor man's cloud cron).
>
> The backup is initiated from the admin_actions API extension here:
>
> https://github.com/openstack/nova/blob/master/nova/api/openstack/compute/contrib/admin_actions.py#L297
>
> which calls the nova.compute.api.API.backup() method here:
>
> https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2031
>
> which, after creating some image metadata in Glance for the snapshot,
> calls the compute RPC API here:
>
> https://github.com/openstack/nova/blob/master/nova/compute/rpcapi.py#L759
>
> which sends an asynchronous RPC message to the compute node to execute the
> instance snapshot and "rotate backups":
>
> https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2969
>
> That method eventually calls the blocking snapshot() operation on the virt
> driver:
>
> https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L3041
>
> And it is the nova.virt.libvirt.Driver.snapshot() method that is quite
> "icky", with lots of logic to determine the type of snapshot to do and how
> to do it:
>
> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L1607
>
> In essence, the driver's snapshot() method calls ImageBackend.snapshot(),
> which is responsible for doing the actual snapshot of the instance:
>
> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L1685
>
> and then once the snapshot is done, the method calls to the Glance API to
> upload the snapshotted disk image to Glance:
>
> https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L1730-L1734
>
> All of which is I/O intensive and AFAICT, mostly done in a blocking
> manner, with no queuing or traffic control measures, so as you correctly
> point out, if the compute node daemon receives 5 backup requests, it will
> go ahead and do 5 snapshot operations and 5 uploads to Glance all as fast
> as it can. It will do it in 5 different eventlet greenthreads, but there
> are no designs in place to prioritize the snapshotting I/O lower than
> active VM I/O.
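
To illustrate what "prioritize the snapshotting I/O lower than active VM I/O"
could mean in practice, here is a minimal sketch that wraps a heavy copy
command with the standard Linux ionice and nice utilities. This is not how
Nova does it (the point above is that nothing of the kind is in place). The
helper name low_priority_command and the dd paths are hypothetical, and the
idle I/O class is only honoured by I/O schedulers that support priorities
(CFQ or BFQ, for example).

```python
import shlex


def low_priority_command(cmd, io_class=3, niceness=19):
    """Return cmd prefixed so it runs with low I/O and CPU priority.

    io_class=3 is the ionice "idle" class: the process only gets disk time
    when nothing else is asking for it. niceness=19 is the lowest CPU
    priority. Both defaults are illustrative choices, not Nova settings.
    """
    prefix = ["ionice", "-c", str(io_class), "nice", "-n", str(niceness)]
    return prefix + list(cmd)


# Hypothetical snapshot copy; the paths are placeholders.
copy_cmd = low_priority_command(
    ["dd", "if=/dev/vg/snap", "of=/tmp/snap.img", "bs=1M"])

# Print the command instead of running it, so the sketch has no side effects.
print(" ".join(shlex.quote(part) for part in copy_cmd))
```

The resulting list can be handed to subprocess.run() as-is. Lowering priority
only reduces interference from each operation; it does not limit how many of
them run at once.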
>
>> I will write on this at greater length, but backup should interfere as
>> little as possible with foreground processing. Overloading a host is
>> entirely unacceptable.
>>
>
> Agree with you completely.
>
>> Replaced the backup implementation so it does proper queuing (among
>> other things). Iterating forward - implementing and testing.
>>
>
> Is this code up somewhere we can take a look at?
>
>> Fired off snapshots on five Cinder volumes (attached to five instances).
>> Again the load shot very high. Huh. Well, in a full-scale OpenStack
>> setup, maybe storage can handle that much I/O more gracefully ... or
>> not. Again, should taking snapshots interfere with foreground activity?
>> I would say, most often not. Queuing and serializing snapshots would
>> strictly limit the interference with foreground. Also, very high end
>> storage can perform snapshots *very* quickly, so serialized snapshots
>> will not be slow. My take is that the default behavior should be to
>> queue and serialize all heavy I/O operations, with non-default
>> allowances for limited concurrency.
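
The default proposed above (queue and serialize heavy I/O, with an opt-in for
limited concurrency) can be sketched in a few lines. This is an illustration
under stated assumptions, not the stack-backup implementation: the class name
SerializedIOQueue and the fake_snapshot stand-in are hypothetical, and plain
standard-library threads are used here in place of the eventlet greenthreads
the compute service runs on.

```python
import queue
import threading
import time


class SerializedIOQueue:
    """Run heavy I/O jobs through a fixed number of workers (default: one)."""

    def __init__(self, max_concurrency=1):
        self._jobs = queue.Queue()
        # One worker means strictly serialized, first-in first-out execution.
        for _ in range(max_concurrency):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, func, *args):
        """Queue a job; it starts only when a worker becomes free."""
        self._jobs.put((func, args))

    def wait(self):
        """Block until every queued job has finished."""
        self._jobs.join()

    def _run(self):
        while True:
            func, args = self._jobs.get()
            try:
                func(*args)
            except Exception as exc:
                # A failed job must not stop the worker from draining the queue.
                print("job failed: %s" % exc)
            finally:
                self._jobs.task_done()


def fake_snapshot(name):
    """Stand-in for a real snapshot; it only sleeps briefly."""
    time.sleep(0.01)
    print("snapshot of %s finished" % name)


io_queue = SerializedIOQueue()          # default: one operation at a time
for i in range(5):
    io_queue.submit(fake_snapshot, "volume-%d" % i)
io_queue.wait()
```

Passing max_concurrency=2 (or more) gives the non-default allowance for
limited concurrency, which suits storage that handles parallel snapshots well.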
>>
>> Cleaned up (which required reboot/unstack/stack and more). Tried again.
>>
>> Ran two test backups (which in the current iteration create Cinder
>> volume snapshots). Asked Cinder to delete the snapshots. Again, very
>> high load factors, and in "top" I can see two long-running "dd"
>> processes. (Given I have a single disk, more than one "dd" is not good.)
>>
>> Running too many heavyweight operations against storage can lead to
>> thrashing. Queuing can strictly limit that load, and ensure better and
>> more reliable performance. I am not seeing evidence of this thought in my
>> OpenStack testing.
>>
>> So far it looks like there is no thought to managing the impact of disk
>> intensive management operations. Am I missing something?
>>
>
> Nope, I think you pretty much hit the nail on the head. Very much
> appreciate your thoughts above on the matter and I look forward to working
> with you to address the problems. If you haven't already, get to know the
> Rally contributor team, who are working to write benchmarks and stress
> tests that show these kinds of issues.
>
> The Rally team hangs out on freenode IRC #openstack-rally channel.
>
> All the best,
> -jay
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>