[glance] Slow image download when using glanceclient

Lucio Seki lucioseki at gmail.com
Thu Oct 20 14:21:25 UTC 2022


Thanks Artem,

Indeed, using the `output` parameter increased the download speed from
120KB/s to >120MB/s (the max network performance I have). That's great!

I'll look into the method definition and see what's the secret.
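
One likely factor, assuming the object returned by `download_image` is a `requests.Response`: `iter_content()` defaults to `chunk_size=1`, so my earlier loop handled one byte per iteration. Below is a minimal sketch of that effect, using an in-memory buffer as a stand-in for the HTTP stream. The names `copy_in_chunks` and `payload` are hypothetical and not part of openstacksdk.

```python
import io


def copy_in_chunks(src, dst, chunk_size):
    """Copy src to dst, reading up to chunk_size bytes per iteration.

    Returns (bytes_copied, iterations) so that throughput can be computed
    from bytes rather than from the number of loop iterations.
    """
    bytes_copied = 0
    iterations = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        bytes_copied += len(chunk)
        iterations += 1
    return bytes_copied, iterations


# Stand-in for image data (assumption: 64 KiB of dummy bytes, not a real image).
payload = b"x" * (64 * 1024)

# One byte per iteration, like the default of requests' iter_content().
small = copy_in_chunks(io.BytesIO(payload), io.BytesIO(), chunk_size=1)

# 8192 bytes per iteration, the chunk_size Artem mentioned.
large = copy_in_chunks(io.BytesIO(payload), io.BytesIO(), chunk_size=8192)

print(small)  # (65536, 65536)
print(large)  # (65536, 8)
```

If that assumption holds, passing a larger `chunk_size` to `iter_content()` in my earlier loop would probably have helped as well. Note also that tqdm wrapped around an iterable counts iterations, so its rate only equals bytes per second when each chunk is exactly one byte.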

Regards,
Lucio

On Fri, Oct 14, 2022, 12:07 Artem Goncharov <artem.goncharov at gmail.com>
wrote:

> ```
> import openstack
>
> conn = openstack.connect()
>
> conn.image.download_image(image_name, stream=True, output="data.iso")
> ```
>
> This gives me the maximum performance of the network. Using stream=True
> may actually be slower (by around 40%), but it may be crucially necessary
> when dealing with huge images. Additionally, you can pass chunk_size as a
> parameter to the download_image function, which brings the performance of
> stream and non-stream downloads in line (for me, stream=True and
> chunk_size=8192 resulted in a 2.3G image being downloaded in 14 sec).
>
>
> On 13. Oct 2022, at 23:24, Lucio Seki <lucioseki at gmail.com> wrote:
>
> Yes, I'm using tqdm to monitor the progress and speed.
> I removed it, and it improved slightly (120kB/s -> 131kB/s) but not
> significantly :-/
>
> On Thu, Oct 13, 2022, 16:54 Sean Mooney <smooney at redhat.com> wrote:
>
>> On Thu, 2022-10-13 at 16:21 -0300, Lucio Seki wrote:
>> > Thanks Sean, that makes it much easier to code!
>> >
>> > ```
>> > ...
>> > conn = openstack.connect(cloud_name)
>> >
>> > with open(path, 'wb') as image_file:
>> >     response = conn.image.download_image(image_name)
>> >     for chunk in tqdm(response.iter_content(), **tqdm_params):
>> >         image_file.write(chunk)
>> > ```
>> >
>> > And it gave me some performance improvement (3kB/s -> 120kB/s).
>> > ... though it would still take several days to download an image.
>> >
>> > Is there some tuning that I could apply?
>> this is what nova does
>> https://github.com/openstack/nova/blob/master/nova/image/glance.py#L344
>>
>> we get the image chunks by calling the data method on the glance client
>>
>> https://github.com/openstack/nova/blob/03d2715ed492350fa11908aea0fdd0265993e284/nova/image/glance.py#L373-L377
>> then we basically just loop over the chunks and write them to a file like
>> you are
>>
>> https://github.com/openstack/nova/blob/03d2715ed492350fa11908aea0fdd0265993e284/nova/image/glance.py#L413-L437
>> we have some extra code for doing image verification but it's basically
>> the same as what you are doing
>> we use eventlet to monkeypatch python io, which can improve performance, but
>> i would not expect it to be that dramatic
>> and i don't think the glance client or openstack client use eventlet, so
>> it sounds like something else is limiting the transfer speed.
>>
>> this is the glance client method we are invoking
>>
>> https://github.com/openstack/python-glanceclient/blob/56186d6d5aa1a0c8fde99eeb535a650b0495925d/glanceclient/v2/images.py#L201-L271
>>
>>
>> i'm not sure what tqdm is by the way, is it measuring the transfer speed or
>> something like that?
>> does the speed increase if you remove that?
>> i.e. can you test this via a simple timing script and see how much it
>> downloads in, say, up to 60 seconds by looking at the file size?
>>
>> assuming it's https://github.com/tqdm/tqdm perhaps the additional io that
>> it would be doing to standard out is slowing it down?
>>
>>
>>
>>
>> >
>> > On Thu, Oct 13, 2022, 14:18 Sean Mooney <smooney at redhat.com> wrote:
>> >
>> > > On Thu, 2022-10-13 at 13:30 -0300, Lucio Seki wrote:
>> > > > Hi glance experts,
>> > > >
>> > > > I'm using the following code to download a glance image:
>> > > >
>> > > > ```
>> > > > from glanceclient import client
>> > > > ...
>> > > > glance = client.Client(GLANCE_API_VERSION, session=sess)
>> > > > ...
>> > > > with open(path, 'wb') as image_file:
>> > > >     data = glance.images.data(image_id)
>> > > >     for chunk in tqdm(data, unit='B', unit_scale=True,
>> > > >                       unit_divisor=1024):
>> > > >         image_file.write(chunk)
>> > > > ```
>> > > >
>> > > > And I get a speed around 3kB/s. It would take months to download an
>> > > > image.
>> > > > I'm using python3-glanceclient==3.6.0.
>> > > > I even tried:
>> > > > ```
>> > > >     for chunk in tqdm(data, unit='B', unit_scale=True,
>> > > >                       unit_divisor=1024):
>> > > >         pass
>> > > > ```
>> > > > to see if the bottleneck was the disk I/O, but didn't get any
>> > > > faster.
>> > > >
>> > > > In the same environment, when I use the glance CLI instead:
>> > > >
>> > > > ```
>> > > > glance image-download --file $path $image_id
>> > > > ```
>> > > > I get hundreds of MB/s download speed, and it finishes in a few
>> > > > minutes.
>> > > >
>> > > > Is there anything I can do to improve the glanceclient performance?
>> > > > I'm considering using subprocess.Popen(['glance', 'image-download',
>> > > > ...])
>> > > > if nothing helps...
>> > > have you considered using the openstacksdk instead?
>> > >
>> > > the glanceclient is really only intended for other openstack services
>> > > to use, like nova or ironic.
>> > > it's not really meant to be used to write your own code anymore.
>> > > in the past it provided a programmatic interface for interacting with
>> > > glance, but now you should prefer the openstack sdk instead.
>> > > https://github.com/openstack/openstacksdk
>> > >
>> > > >
>> > > > Regards,
>> > > > Lucio
>> > >
>> > >
>>
>>
>