Thanks, folks, for your responses.

I have a couple of follow-up questions/clarifications to quickly fix/hack this issue in my local environment.

1. Can I manually copy any images larger than 5GB to the S3 bucket before running the image-import command? Would image-import see the file in S3 and use it instead of trying to copy it all over again?
2. Following the discussion, I got the sense that the code changes may not be large. In that case, if it's possible to provide a patch, I can patch my OpenStack and see how it goes.

Thanks,
Shrishail

On Wed, 31 Jan 2024 at 05:28, Abhishek Kekane <akekane@redhat.com> wrote:
On Wed, 31 Jan 2024 at 3:49 PM, Christian Rohmann <christian.rohmann@inovex.de> wrote:
Hey Abhishek!
On 31.01.24 10:35, Abhishek Kekane wrote:
On 31.01.24 09:13, Abhishek Kekane wrote:
By design, the copy-image import workflow uses a common uploading mechanism for all stores, so yes, it is a known limitation if it is not using multipart upload for the s3 backend. Feel free to propose an enhancement for this, or participate in the upcoming PTG (April 8-12, 2024) to discuss improvements to this behavior.

Abhishek, I suppose copy-image is done via this helper, which is what you were referring to?
https://github.com/openstack/glance/blob/master/glance/async_/flows/_internal_plugins/copy_image.py
Hi Christian,
The helper you mention above is responsible for downloading the existing data to a common storage location known as the staging area (configured using os_glance_staging_store in glance-api.conf); from there it is imported to the destination/target store. However, debugging further, I found that it internally calls the store.add method, which means it is in fact using a driver-specific call.
I suspect [1] is where it uses single_part as the upload method for s3 while copying the image, because we are not passing the size of the existing image to the import call.
I think this is a driver-specific improvement, and it requires additional effort to make it work.
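To illustrate the suspicion above, here is a minimal sketch of a size-based strategy switch. It is not the actual glance_store code: the function names and the 100 MiB threshold are assumptions made for illustration. It only shows why an unknown size (passed as 0) would always select the single-part path, which S3 rejects for objects above 5 GiB.

```python
# Illustrative sketch only; names and threshold are hypothetical,
# not taken from glance or glance_store.

S3_SINGLE_PUT_LIMIT = 5 * 1024 ** 3      # S3 refuses a single PUT above 5 GiB
LARGE_OBJECT_THRESHOLD = 100 * 1024 ** 2  # assumed switch-over point (100 MiB)


def choose_upload_strategy(reported_size, threshold=LARGE_OBJECT_THRESHOLD):
    """Pick an upload path from the size the caller reports.

    A reported size of 0 means "unknown" and therefore falls below any
    positive threshold, so it always selects the single-part path.
    """
    if reported_size < threshold:
        return "singlepart"
    return "multipart"


def upload_would_fail(actual_size, reported_size):
    """True when the single-part path is chosen for an object too big for it."""
    strategy = choose_upload_strategy(reported_size)
    return strategy == "singlepart" and actual_size > S3_SINGLE_PUT_LIMIT


six_gib = 6 * 1024 ** 3
# Size not passed to the import call (reported as 0): single-part is chosen.
print(choose_upload_strategy(0), upload_would_fail(six_gib, 0))
# Size passed correctly: multipart is chosen and the upload can proceed.
print(choose_upload_strategy(six_gib), upload_would_fail(six_gib, six_gib))
```

If the real driver behaves along these lines, passing the known image size through to the add call would be enough to reach the multipart path.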
I cannot (quickly) follow your debugging / the calls you mentioned.
Could you please raise a bug with your findings to "fix" this? Seems like this is not intended behavior?
Here the image size is actually provided when the image is fetched to the staging store: https://github.com/openstack/glance/blob/b6b9f043ffe664c643456912148648ecc0d6c9b4/glance/async_/flows/_internal_plugins/copy_image.py#L122

Hey Christian,

The store you mentioned here is the staging store, which is a filesystem store and not the intended (s3) store. From here, the image import flow is called, which uploads the data from the file (staging) store to the actual store. You will find this in the set_image_data method in the glance/async_/flows/api_image_import.py file.

Abhishek

But what is the next step then to upload the "staged" image into the new target store?
In any case, I also tend to disagree that, if the missing image_size is the issue, providing it to the add call is an S3-driver-specific thing.
Other object storages (GCS, Azure Blob, ...) might "like" to know the size as well to adjust their upload strategy.
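As a rough sketch of what I mean, a store-agnostic import step could read the size from the staged file and hand it to whichever driver is the target. Everything here is hypothetical (RecordingStore, import_from_staging and the add signature are made up for illustration and are not Glance APIs); it only assumes that the staged image is a regular file on disk.

```python
import os
import tempfile


class RecordingStore:
    """Hypothetical stand-in for any store driver (S3, GCS, Azure Blob, ...)."""

    def __init__(self):
        self.seen_size = None

    def add(self, image_id, data, size):
        # A real driver could choose its upload strategy from `size` here.
        self.seen_size = size
        written = 0
        for chunk in iter(lambda: data.read(65536), b""):
            written += len(chunk)
        return written


def import_from_staging(store, image_id, staging_path):
    """Pass the staged file's size to the target store, whatever its type."""
    size = os.path.getsize(staging_path)
    with open(staging_path, "rb") as data:
        written = store.add(image_id, data, size)
    return size, written


def demo():
    # Create a small file standing in for an image in the staging area.
    with tempfile.NamedTemporaryFile(delete=False) as staged:
        staged.write(b"x" * 1000)
        path = staged.name
    try:
        store = RecordingStore()
        result = import_from_staging(store, "hypothetical-image-id", path)
        return result, store.seen_size
    finally:
        os.remove(path)
```

The point of the sketch is that the size lookup sits in the common code path, so no driver has to guess.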
Regards
Christian