[openstack-dev] [glance] proposed priorities for Mitaka

Monty Taylor mordred at inaugust.com
Tue Sep 15 00:46:38 UTC 2015


On 09/15/2015 02:06 AM, Clint Byrum wrote:
> Excerpts from Doug Hellmann's message of 2015-09-14 13:46:16 -0700:
>> Excerpts from Clint Byrum's message of 2015-09-14 13:25:43 -0700:
>>> Excerpts from Doug Hellmann's message of 2015-09-14 12:51:24 -0700:
>>>> Excerpts from Flavio Percoco's message of 2015-09-14 14:41:00 +0200:
>>>>> On 14/09/15 08:10 -0400, Doug Hellmann wrote:
>>>>>>
>>>>>> After having some conversations with folks at the Ops Midcycle a
>>>>>> few weeks ago, and observing some of the more recent email threads
>>>>>> related to glance, glance-store, the client, and the API, I spent
>>>>>> last week contacting a few of you individually to learn more about
>>>>>> some of the issues confronting the Glance team. I had some very
>>>>>> frank, but I think constructive, conversations with all of you about
>>>>>> the issues as you see them. As promised, this is the public email
>>>>>> thread to discuss what I found, and to see if we can agree on what
>>>>>> the Glance team should be focusing on going into the Mitaka summit
>>>>>> and development cycle and how the rest of the community can support
>>>>>> you in those efforts.
>>>>>>
>>>>>> I apologize for the length of this email, but there's a lot to go
>>>>>> over. I've identified 2 high priority items that I think are critical
>>>>>> for the team to be focusing on starting right away in order to use
>>>>>> the upcoming summit time effectively. I will also describe several
>>>>>> other issues that need to be addressed but that are less immediately
>>>>>> critical. First the high priority items:
>>>>>>
>>>>>> 1. Resolve the situation preventing the DefCore committee from
>>>>>>    including image upload capabilities in the tests used for trademark
>>>>>>    and interoperability validation.
>>>>>>
>>>>>> 2. Follow through on the original commitment of the project to
>>>>>>    provide an image API by completing the integration work with
>>>>>>    nova and cinder to ensure V2 API adoption.
>>>>>
>>>>> Hi Doug,
>>>>>
>>>>> First and foremost, I'd like to thank you for taking the time to dig
>>>>> into these issues, and for reaching out to the community seeking
>>>>> information and a better understanding of what the real issues are. I
>>>>> can imagine how much time you had to dedicate to this and I'm glad you
>>>>> did.
>>>>>
>>>>> Now, to your email, I very much agree with the priorities you
>>>>> mentioned above and I'd like whoever wins Glance's PTL
>>>>> election to bring focus back on that.
>>>>>
>>>>> Please, find some comments in-line for each point:
>>>>>
>>>>>>
>>>>>> I. DefCore
>>>>>>
>>>>>> The primary issue that attracted my attention was the fact that
>>>>>> DefCore cannot currently include an image upload API in its
>>>>>> interoperability test suite, and therefore we do not have a way to
>>>>>> ensure interoperability between clouds for users or for trademark
>>>>>> use. The DefCore process has been long, and at times confusing,
>>>>>> even to those of us following it sort of closely. It's not entirely
>>>>>> surprising that some projects haven't been following the whole time,
>>>>>> or aren't aware of exactly what the whole thing means. I have
>>>>>> proposed a cross-project summit session for the Mitaka summit to
>>>>>> address this need for communication more broadly, but I'll try to
>>>>>> summarize a bit here.
>>>>>
>>>>> +1
>>>>>
>>>>> I think it's quite sad that some projects, especially those considered
>>>>> to be part of the `starter-kit:compute`[0], don't follow closely
>>>>> what's going on in DefCore. I personally consider this a task PTLs
>>>>> should incorporate in their role duties. I'm glad you proposed such
>>>>> a session, I hope it'll help raise awareness of this effort and it'll
>>>>> help move things forward on that front.
>>>>
>>>> Until fairly recently a lot of the discussion was around process
>>>> and priorities for the DefCore committee. Now that those things are
>>>> settled, and we have some approved policies, it's time to engage
>>>> more fully.  I'll be working during Mitaka to improve the two-way
>>>> communication.
>>>>
>>>>>
>>>>>>
>>>>>> DefCore is using automated tests, combined with business policies,
>>>>>> to build a set of criteria for allowing trademark use. One of the
>>>>>> goals of that process is to ensure that all OpenStack deployments
>>>>>> are interoperable, so that users who write programs that talk to
>>>>>> one cloud can use the same program with another cloud easily. This
>>>>>> is a *REST API* level of compatibility. We cannot insert cloud-specific
>>>>>> behavior into our client libraries, because not all cloud consumers
>>>>>> will use those libraries to talk to the services. Similarly, we
>>>>>> can't put the logic in the test suite, because that defeats the
>>>>>> entire purpose of making the APIs interoperable. For this level of
>>>>>> compatibility to work, we need well-defined APIs, with a long support
>>>>>> period, that work the same no matter how the cloud is deployed. We
>>>>>> need the entire community to support this effort. From what I can
>>>>>> tell, that is going to require some changes to the current Glance
>>>>>> API to meet the requirements. I'll list those requirements, and I
>>>>>> hope we can discuss them to a degree that ensures everyone understands
>>>>>> them. I don't want this email thread to get bogged down in
>>>>>> implementation details or API designs, though, so let's try to keep
>>>>>> the discussion at a somewhat high level, and leave the details for
>>>>>> specs and summit discussions. I do hope you will correct any
>>>>>> misunderstandings or misconceptions, because unwinding this as an
>>>>>> outside observer has been quite a challenge and it's likely I have
>>>>>> some details wrong.
>>>>>>
>>>>>> As I understand it, there are basically two ways to upload an image
>>>>>> to glance using the V2 API today. The "POST" API pushes the image's
>>>>>> bits through the Glance API server, and the "task" API instructs
>>>>>> Glance to download the image separately in the background. At one
>>>>>> point apparently there was a bug that caused the results of the two
>>>>>> different paths to be incompatible, but I believe that is now fixed.
>>>>>> However, the two separate APIs each have different issues that make
>>>>>> them unsuitable for DefCore.
>>>>>>
>>>>>> The DefCore process relies on several factors when designating APIs
>>>>>> for compliance. One factor is the technical direction, as communicated
>>>>>> by the contributor community -- that's where we tell them things
>>>>>> like "we plan to deprecate the Glance V1 API". In addition to the
>>>>>> technical direction, DefCore looks at the deployment history of an
>>>>>> API. They do not want to require deploying an API if it is not seen
>>>>>> as widely usable, and they look for some level of existing adoption
>>>>>> by cloud providers and distributors as an indication that the
>>>>>> API is desired and can be successfully used. Because we have multiple
>>>>>> upload APIs, the message we're sending on technical direction is
>>>>>> weak right now, and so they have focused on deployment considerations
>>>>>> to resolve the question.
>>>>>
>>>>> The task upload process you're referring to is the one that uses the
>>>>> `import` task, which allows you to download an image from an external
>>>>> source, asynchronously, and import it in Glance. This is the old
>>>>> `copy-from` behavior that was moved into a task.
>>>>>
>>>>> The "fun" thing about this - and I'm sure other folks in the Glance
>>>>> community will disagree - is that I don't consider tasks to be a
>>>>> public API. That is to say, I would expect tasks to be an internal API
>>>>> used by cloud admins to perform some actions (based on its current
>>>>> implementation). Eventually, some of these tasks could be triggered
>>>>> from the external API but as background operations that are triggered
>>>>> by the well-known public ones and not through the task API.
>>>>
>>>> Does that mean it's more of an "admin" API?
>>>>
>>>
>>> I think it is basically just a half-way done implementation that is
>>> exposed directly to users of Rackspace Cloud and, AFAIK, nobody else.
>>> When last I tried to make integration tests in shade that exercised the
>>> upstream glance task import code, I was met with an implementation that
>>> simply did not work, because the pieces behind it had never been fully
>>> implemented upstream. That may have been resolved, but in the process
>>> of trying to write tests and make this work, I discovered a system that
>>> made very little sense from a user standpoint. I want to upload an
>>> image, why do I want a task?!
>>>
>>>>>
>>>>> Ultimately, I believe end-users of the cloud simply shouldn't care
>>>>> about what tasks are or aren't and more importantly, as you mentioned
>>>>> later in the email, tasks make clouds not interoperable. I'd be pissed
>>>>> if my public image service would ask me to learn about tasks to be
>>>>> able to use the service.
>>>>
>>>> It would be OK if a public API set up to do a specific task returned a
>>>> task ID that could be used with a generic task API to check status, etc.
>>>> So the idea of tasks isn't completely bad, it's just too vague as it's
>>>> exposed right now.
>>>>
>>>
>>> I think it is a concern, because it is assuming users will want to do
>>> generic things with a specific API. This turns into a black-box game where
>>> the user shoves a task in and then waits to see what comes out the other
>>> side. Not something I want to encourage users to do or burden them with.
>>>
>>> We have an API whose sole purpose is to accept image uploads. That
>>> Rackspace identified a scaling pain point there is _good_. But why not
>>> *solve* it for the user, instead of introducing more complexity?
>>
>> That's fair. I don't actually care which API we have, as long as it
>> meets the other requirements.
>>
>>>
>>> What I'd like to see is the upload image API given the ability to
>>> respond with a URL that can be uploaded to using the object storage API
>>> we already have in OpenStack. Exposing users to all of these operator
>>> choices is just wasting their time. Just simply say "Oh, you want to
>>> upload an image? That's fine, please upload it as an object over there
>>> and POST here again when it is ready to be imported." This will make
>>> perfect sense to a user reading docs, and doesn't require them to grasp
>>> an abstract concept like "tasks" when all they want to do is upload
>>> their image.
>>>
>>
>> And what would it do if the backing store for the image service
>> isn't Swift or another object storage system that supports direct
>> uploads? Return a URL that pointed back to itself, maybe?
>
> For those operators who don't have concerns about scaling the glance
> API service to their users' demands, glance's image upload API works
> perfectly well today.  The indirect approach is only meant to deal with
> the situation where the operator expects a lot of really large images to
> be uploaded simultaneously, and would like to take advantage of the Swift
> API's rather rich set of features for making that a positive experience.
> There is also a user benefit to using the Swift API, which is that a
> segmented upload can more easily be resumed.

Yes, BUT ...

If there are going to be two legitimate ways to upload an image, that 
needs to be discoverable so that scripts (or things like ansible or 
razor or juju or terraform or *insert system tool here*) can accomplish 
"please upload this here image file into this here cloud"

It's really not about the REST API itself. Literally zero percent of the 
people are doing that. People use tools. Tools write to APIs. And nobody 
who is running an OpenStack cloud should have to write their own branded 
tools - that's a cost that's completely silly to bear. An operator 
running an openstack cloud should be able to say to their users "go use 
the ansible openstack modules" or "go use the juju openstack provider"

Which brings us back to your excellent point - both of these are totally 
legitimate ways to upload to the cloud, except small clouds often don't 
run swift, and large clouds may want to handle the situation you mention 
and leverage swift. So how about:

glance image-create my-great-image
returns: 200 OK {
   upload-url: 'https://example.com/some/url/location',
   is_swift: True
}

OR

glance image-create my-great-image
returns: 200 OK {
   upload-url: 'https://example.com/some/url/location',
   is_swift: False
}

and if is_swift is true, then the user (or script) knows it can use the 
threaded swiftuploader. If it's false, the user (or script) just 
uploads content to the URL. The process is completely sane, is pretty 
much the same for both types of cloud, and has one known and 
understandable either-or deployer difference, where each fork is open 
source and each fork has defined semantics.
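
To make the either-or concrete, here is a rough sketch of how a tool might
branch on that response. It is only an illustration: the response shape is
the proposal above and not an existing Glance API, it assumes the body comes
back as JSON, and the function name and the two strategy labels are made up
for the example.

```python
import json
from typing import Tuple


def choose_upload_strategy(response_body: str) -> Tuple[str, str]:
    """Pick an upload path from a hypothetical image-create response.

    Assumes a JSON body with an "upload-url" key and an optional boolean
    "is_swift" key, as proposed in this thread (not a real Glance API).
    """
    doc = json.loads(response_body)
    url = doc["upload-url"]
    # A missing flag is treated as "not swift", so the plain upload path
    # is the default for clouds that do not advertise swift.
    if doc.get("is_swift", False):
        # Hypothetical label: hand off to a segmented, threaded swift upload.
        return ("swift-segmented", url)
    # Hypothetical label: send the image bytes straight to the URL.
    return ("plain-upload", url)
```

A tool would call this once after image-create and then dispatch to
whichever uploader it already has for that path, so the deployer difference
stays in one place.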

Details, of course - and I know there are at least 5 more to work out - 
but hopefully that makes sense and doesn't disenfranchise anyone?


> Now, IMO HTTP has facilities for that too, it's just that glanceclient
> (and lo, many HTTP clients) aren't well versed in those deeper, optional
> pieces of HTTP. That is why Swift works the way it does, and I like
> the idea of glance simply piggy backing on the experience of many years
> of production refinement that are available and codified in Swift and
> any other OpenStack Object Storage API implementations (like the CEPH
> RADOS gateway).



