[openstack-dev] [API] [Glance] New v2 Image Import Specification Discussion

Flavio Percoco flavio at redhat.com
Mon Jan 11 15:09:23 UTC 2016

On 08/01/16 17:14 +0000, Ian Cordasco wrote:
>Hey all,


Thanks for taking the time to read the spec and writing this email.

>I'm well aware that this isn't exactly timely because we've partially agreed on the import spec in large strokes. [1]_ I haven't had a great deal of time before now to really dig into the spec, though, so I'm bringing my feedback. I've spoken to some API-WG members and other Glance team members and I thought it would be best to discuss some of the concerns here.
>An outline (or tl;dr) is roughly
>- We're exposing way too much complexity to the user

I agree with this to some extent and I think we could work out a way
to reduce some of these calls or make them optional as in they should
always be exposed but the user doesn't need to call them to complete
an upload.

>- The user has to be very aware of discovering all of the data from the cloud
>  - The user has to store a lot of that data to make subsequent requests
>  - Exposing information to the user is not bad, making the user resubmit it to the cloud is bad

This is the part I'd like to simplify a bit in the current spec.

>- We're introducing two new image states. Do we need two?

In some of the patch sets we discused ways to avoid adding new status
but the agreement seemed to have been to add these two to communicate
things properly to users and be ease the way we reason about the
process within Glance. I'll quote Stuart "When you need a new status,
you need a new status"

>- Quotas - Are these even still a thing?

Yes they are and they were discussed in the spec as well.

>So the problem description explains why uploading an image can be very complex and why some clouds need to not allow uploads directly to Glance. I understand that very well. I also recognize that I am the person who advocated so strongly for having information endpoints that help the user discover information about the cloud. I think this allows for us to make smarter clients that can help the user do the right thing (DTRT) instead of shooting themselves in the foot accidentally.
>All of that said, I have some very severe reservations about the extra surface area we're adding to the images API and in particular how a user's view of the image upload flow would work. We've shipped v2 of Glance as the default version now in python-glanceclient (as of Liberty) so users are already going to be familiar with a two-step dance of creating an image:
>1. Create the image record (image-create)
>2. Upload the image data (image-upload)
>In this new workflow, the dance is extended
>1. Get information about how the cloud is configured and prefers images

Technically, this call is optional. It is a requirement for scripts
that want to work accross multiple clouds as you need this info for

>2. Create the image record
>3. Upload the image data somewhere (directly to Glance, or to some other service in that cloud perhaps [e.g., swift]) that is specified in the data retrieved from step 1
>4. Tell Glance to import/process the image

This one I think we could simplify a bit by not making all these
fields required and allow the cloud to have their own defaults.

>Here are my issues with this new upload workflow:
>1. The user must pre-fetch the information to have any chance of their upload being successful

Not really (read above).

>2. They have to use that information to find out where to upload the image bytes (data) to

Not really. The upload info is returned in the `image-create` call
which is a requirement in any of the evaluated workflow options. The
new header tells the user where the image data should go.

>3. In step 4, they have to tell Glance if they're using the direct/indirect (currently named glance-direct and swift-local respectively) and then give it information (retrieved from step 1) about what the target formats are for that cloud

Again, I believe we could make target_* fields optional in the
*import* call as I (the user) most of the time don't care about the
final format of my image as long as it boots. Perhaps, the most
critical information in call #1 is the one related to the srouce_*
fields as that's something the user must respect. Not passing the
target_* fields in the import call will let the cloud use whatever it
has as default target format, which can be communicated through call

[snip just want to focus on the proposal]

>So what if we modeled this a little
>1. The user creates an image record (image-create)
>2. The user uses the link in the response to upload the image data (we don't care whether it is direct or not because the user can follow it without thinking about it)
>3. The user sends a request to process the image without having to specify what to translate the image to because the operator has chosen a preferred image type (and maybe has other acceptable types that mean the image will not need to be converted if it's already of that type).

If you read my comments above, you'll notice that the current proposal
is really close to what you're proposing here. It needs to be tuned up
a bit but it does pretty much this.

The difficult part of "the user just uses the returned link" is that
in the swift case, for example, the user also needs to know what
container to upload the image to. Is it a service container? or is it
a *user's* container?

>This adds an extra step and it could remove the need for an "uploading" status.

The real reason for the uploading status is probably not the one
you're thinking of. There are a couple of reasons for this uploading

1) It tells us what path is being used (old one or new one)
2) It communicates what status the image is in (saving is not really
useful here)
3) It'll allow, in the future, for enhancements like multi-part
uploads (again, I'd rather add uploading than repurposing saving).

There are ways to do the above w/o adding this status, sure. These are
listed in one of the PSs. However, for the sake of simplicity and
explicitness, the status was chosen.

>This also means that the user does not need to be cognizant of the method that they're using to create an image (glance-direct, swift-local, foo-bar-bogus-whatever, etc.) and that they do not need to think too hard about the target format for that cloud or pre-process any extra information or worry whether that cloud needs some location URL for an indirect upload. Glance can manage that complexity for the user.
>Now let's talk about step 2 briefly. If the operator wants all image bytes to go to Glance directly, this can be the default behaviour. In that case, the URL would be "https://<glance>/v2/images/<image-id>/files" and the user would do it like the default PUT behaviour that we have now. If it's an indirect upload, then the user would make another PUT request with that URL and their authentication credentials. If the user is out of storage quota than that service will tell the user.

The problem with this is that the current behavior is that the image
will be active as soon as the upload ends, which is something we're
chaning in the new import workflow. This change would break the
current behavior and therefore it'd break backwards compatibility.

One of the goals behind the staging are is to help migrating from the
current synchronous upload process to something asynchronous that
would allow for the task engine to be executed as well.

Hope the above makes sense. Thanks for taking the time to send this
feedback. Btw, if you think there are things that could be changed
(like making `target_*` optional) feel free to send a new patch
amending the current spec.

As I mentioned in one of my latest comments on the spec, I believe the
major calls have been made but there's room for improvement and we
should accept amends to the spec. A good example is the `target_*`
fields in the import call point. I think we can improve that.

I'm all for reducing the complexity (and I agree this spec is adding
some) so, let's work on simplifying things. Above, I tried to break
your concerns down into "discussable items". Let's follow-up on that
and see how we can simplify the proposal. Also, feel free to do it in
the form of patches to the spec.

One final note. All these should not require pushing logic into client
libraries as that would break the discoverability we're trying to
achieve here.


Flavio Percoco
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20160111/8445f11a/attachment.pgp>

More information about the OpenStack-dev mailing list