[openstack-dev] [nova] [glance] How to deal with aborted image read?
Ian Cordasco
ian.cordasco at RACKSPACE.COM
Sat Jun 6 01:08:21 UTC 2015
On 6/5/15, 02:55, "Flavio Percoco" <flavio at redhat.com> wrote:
>On 04/06/15 11:46 -0600, Chris Friesen wrote:
>>On 06/04/2015 03:01 AM, Flavio Percoco wrote:
>>>On 03/06/15 16:46 -0600, Chris Friesen wrote:
>>>>We recently ran into an issue where nova couldn't write an image file
>>>>due to lack of space and so just quit reading from glance.
>>>>
>>>>This caused glance to be stuck with an open file descriptor, which
>>>>meant that the image consumed space even after it was deleted.
>>>>
>>>>I have a crude fix for nova at
>>>>"https://review.openstack.org/#/c/188179/"
>>>>which basically continues to read the image even though it can't
>>>>write it. That seems less than ideal for large images though.
>>>>
>>>>Is there a better way to do this? Is there a way for nova to indicate
>>>>to glance that it's no longer interested in that image, so that glance
>>>>can close the file?
>>>>
>>>>If I've followed this correctly, on the glance side I think the code in
>>>>question is ultimately
>>>>glance_store._drivers.filesystem.ChunkedFile.__iter__().
>>>
>>>Actually, to be honest, I was quite confused by the email :P
>>>
>>>Correct me if I still didn't understand what you're asking.
>>>
>>>You ran out of space on the Nova side while downloading the image and
>>>there's a file descriptor leak somewhere either in that lovely (sarcasm)
>>>glance wrapper or in glanceclient.
>>
>>The first part is correct, but the file descriptor is actually held by
>>glance-api.
>>
>>>Just by reading your email and glancing at your patch, I believe the bug
>>>might be in glanceclient but I'd need to dive into this. The piece of
>>>code you'll need to look into is [0].
>>>
>>>glance_store is just used server side. If that's what you meant -
>>>glance is keeping the request and the ChunkedFile around - then yes,
>>>glance_store is the place to look into.
>>>
>>>[0]
>>>https://github.com/openstack/python-glanceclient/blob/master/glanceclient/v1/images.py#L152
>>
>>I believe what's happening is that the ChunkedFile code opens the file
>>and creates the iterator. Nova then starts iterating through the
>>file.
>>
>>If nova (or any other user of glance) iterates all the way through the
>>file then the ChunkedFile code will hit the "finally" clause in
>>__iter__() and close the file descriptor.
>>
>>If nova starts iterating through the file and then stops (due to
>>running out of room, for example), the ChunkedFile.__iter__() routine
>>is left with an open file descriptor. At this point deleting the
>>image will not actually free up any space.
>>
>>I'm not a glance guy so I could be wrong about the code. The
>>externally-visible data are:
>>1) glance-api is holding an open file descriptor to a deleted image file
>>2) If I kill glance-api the disk space is freed up.
>>3) If I modify nova to always finish iterating through the file the
>>problem doesn't occur in the first place.
>
>Gotcha, thanks for explaining. I think the problem is that there might
>be a reference leak and therefore the FD is kept open. Probably the
>request interruption is not getting to the driver. I've filed this
>bug [0] so we can look into it.
>
>[0] https://bugs.launchpad.net/glance-store/+bug/1462235
>
>Flavio
>
>--
>@flaper87
>Flavio Percoco

So the problem is with how we use ResponseSerializer and the ChunkedFile
(https://git.openstack.org/cgit/openstack/glance/tree/glance/api/v2/image_data.py#n222).
I think the problem we'll have is that webob provides nothing on a Response
(https://webob.readthedocs.org/en/latest/modules/webob.html#response) to
hook into so we can close the ChunkedFile.

I wonder whether, if we used the body_file attribute, webob would close the
file when the response is closed (I'm assuming that nova/glanceclient close
the response they are using to download the data).
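
To make the failure mode concrete, here is a minimal sketch of the behaviour
Chris describes. It is not the glance_store code: the ChunkedFile class below
is a hypothetical stand-in, and serve() is a toy loop rather than any real
server. It assumes the only cleanup is a "finally" clause in __iter__(), then
adds an explicit, idempotent close() of the kind PEP 3333 says a WSGI server
must call on the response iterable whether the request finished normally or
was terminated early.

```python
import os
import tempfile


class ChunkedFile:
    """Hypothetical stand-in for the glance_store class of the same name."""

    def __init__(self, path, chunk_size=4):
        self.fp = open(path, "rb")
        self.chunk_size = chunk_size

    def __iter__(self):
        # The finally clause only runs once the generator is exhausted,
        # explicitly closed, or garbage collected.
        try:
            while True:
                chunk = self.fp.read(self.chunk_size)
                if not chunk:
                    break
                yield chunk
        finally:
            self.close()

    def close(self):
        # Idempotent, so it is safe to call after an aborted read.
        if self.fp is not None:
            self.fp.close()
            self.fp = None


def serve(app_iter, write):
    """Toy server loop (an assumption, not webob or eventlet code)."""
    try:
        for chunk in app_iter:
            write(chunk)
    finally:
        # PEP 3333: call close() on the iterable if it has one.
        close = getattr(app_iter, "close", None)
        if close is not None:
            close()


def demo():
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"0123456789")
    try:
        # Case 1: the reader stops after one chunk but the iterator is
        # still referenced, so the finally clause has not run.
        partial = ChunkedFile(path)
        it = iter(partial)
        first = next(it)
        still_open = partial.fp is not None
        partial.close()

        # Case 2: the write side fails mid-transfer; serve() still closes.
        aborted = ChunkedFile(path)

        def failing_write(chunk):
            raise IOError("no space left on device")

        try:
            serve(aborted, failing_write)
        except IOError:
            pass
        closed_by_server = aborted.fp is None
    finally:
        os.unlink(path)
    return first, still_open, closed_by_server
```

In case 1 the descriptor stays open for as long as something holds a
reference to the suspended iterator, which matches the "reference leak"
theory above. Whether webob or the server in front of glance-api actually
reaches such a close() on an aborted download is exactly the open question
in this thread.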