[openstack-dev] [glance] proposed priorities for Mitaka
Doug Hellmann
doug at doughellmann.com
Tue Sep 15 12:34:17 UTC 2015
Excerpts from Clint Byrum's message of 2015-09-14 17:06:44 -0700:
> Excerpts from Doug Hellmann's message of 2015-09-14 13:46:16 -0700:
> > Excerpts from Clint Byrum's message of 2015-09-14 13:25:43 -0700:
> > > Excerpts from Doug Hellmann's message of 2015-09-14 12:51:24 -0700:
> > > > Excerpts from Flavio Percoco's message of 2015-09-14 14:41:00 +0200:
> > > > > On 14/09/15 08:10 -0400, Doug Hellmann wrote:
> > > > > >
> > > > > >After having some conversations with folks at the Ops Midcycle a
> > > > > >few weeks ago, and observing some of the more recent email threads
> > > > > >related to glance, glance-store, the client, and the API, I spent
> > > > > >last week contacting a few of you individually to learn more about
> > > > > >some of the issues confronting the Glance team. I had some very
> > > > > >frank, but I think constructive, conversations with all of you about
> > > > > >the issues as you see them. As promised, this is the public email
> > > > > >thread to discuss what I found, and to see if we can agree on what
> > > > > >the Glance team should be focusing on going into the Mitaka summit
> > > > > >and development cycle and how the rest of the community can support
> > > > > >you in those efforts.
> > > > > >
> > > > > >I apologize for the length of this email, but there's a lot to go
> > > > > >over. I've identified 2 high priority items that I think are critical
> > > > > >for the team to be focusing on starting right away in order to use
> > > > > >the upcoming summit time effectively. I will also describe several
> > > > > >other issues that need to be addressed but that are less immediately
> > > > > >critical. First the high priority items:
> > > > > >
> > > > > >1. Resolve the situation preventing the DefCore committee from
> > > > > > including image upload capabilities in the tests used for trademark
> > > > > > and interoperability validation.
> > > > > >
> > > > > >2. Follow through on the original commitment of the project to
> > > > > > provide an image API by completing the integration work with
> > > > > > nova and cinder to ensure V2 API adoption.
> > > > >
> > > > > Hi Doug,
> > > > >
> > > > > First and foremost, I'd like to thank you for taking the time to dig
> > > > > > into these issues, and for reaching out to the community seeking
> > > > > > information and a better understanding of what the real issues are. I
> > > > > > can imagine how much time you had to dedicate to this and I'm glad you
> > > > > did.
> > > > >
> > > > > Now, to your email, I very much agree with the priorities you
> > > > > > mentioned above and I'd like whoever wins Glance's PTL
> > > > > > election to bring the focus back to them.
> > > > >
> > > > > Please, find some comments in-line for each point:
> > > > >
> > > > > >
> > > > > >I. DefCore
> > > > > >
> > > > > >The primary issue that attracted my attention was the fact that
> > > > > >DefCore cannot currently include an image upload API in its
> > > > > >interoperability test suite, and therefore we do not have a way to
> > > > > >ensure interoperability between clouds for users or for trademark
> > > > > >use. The DefCore process has been long, and at times confusing,
> > > > > >even to those of us following it sort of closely. It's not entirely
> > > > > >surprising that some projects haven't been following the whole time,
> > > > > >or aren't aware of exactly what the whole thing means. I have
> > > > > >proposed a cross-project summit session for the Mitaka summit to
> > > > > >address this need for communication more broadly, but I'll try to
> > > > > >summarize a bit here.
> > > > >
> > > > > +1
> > > > >
> > > > > I think it's quite sad that some projects, especially those considered
> > > > > to be part of the `starter-kit:compute`[0], don't follow closely
> > > > > what's going on in DefCore. I personally consider this a task PTLs
> > > > > > should incorporate in their role duties. I'm glad you proposed such
> > > > > > a session. I hope it'll help raise awareness of this effort and
> > > > > > help move things forward on that front.
> > > >
> > > > Until fairly recently a lot of the discussion was around process
> > > > and priorities for the DefCore committee. Now that those things are
> > > > settled, and we have some approved policies, it's time to engage
> > > > more fully. I'll be working during Mitaka to improve the two-way
> > > > communication.
> > > >
> > > > >
> > > > > >
> > > > > >DefCore is using automated tests, combined with business policies,
> > > > > >to build a set of criteria for allowing trademark use. One of the
> > > > > >goals of that process is to ensure that all OpenStack deployments
> > > > > >are interoperable, so that users who write programs that talk to
> > > > > >one cloud can use the same program with another cloud easily. This
> > > > > >is a *REST API* level of compatibility. We cannot insert cloud-specific
> > > > > >behavior into our client libraries, because not all cloud consumers
> > > > > >will use those libraries to talk to the services. Similarly, we
> > > > > >can't put the logic in the test suite, because that defeats the
> > > > > >entire purpose of making the APIs interoperable. For this level of
> > > > > >compatibility to work, we need well-defined APIs, with a long support
> > > > > >period, that work the same no matter how the cloud is deployed. We
> > > > > >need the entire community to support this effort. From what I can
> > > > > >tell, that is going to require some changes to the current Glance
> > > > > >API to meet the requirements. I'll list those requirements, and I
> > > > > >hope we can discuss them to a degree that ensures everyone understands
> > > > > >them. I don't want this email thread to get bogged down in
> > > > > >implementation details or API designs, though, so let's try to keep
> > > > > >the discussion at a somewhat high level, and leave the details for
> > > > > >specs and summit discussions. I do hope you will correct any
> > > > > >misunderstandings or misconceptions, because unwinding this as an
> > > > > >outside observer has been quite a challenge and it's likely I have
> > > > > >some details wrong.
> > > > > >
> > > > > >As I understand it, there are basically two ways to upload an image
> > > > > >to glance using the V2 API today. The "POST" API pushes the image's
> > > > > >bits through the Glance API server, and the "task" API instructs
> > > > > >Glance to download the image separately in the background. At one
> > > > > >point apparently there was a bug that caused the results of the two
> > > > > >different paths to be incompatible, but I believe that is now fixed.
> > > > > >However, the two separate APIs each have different issues that make
> > > > > >them unsuitable for DefCore.
> > > > > >
> > > > > >The DefCore process relies on several factors when designating APIs
> > > > > >for compliance. One factor is the technical direction, as communicated
> > > > > >by the contributor community -- that's where we tell them things
> > > > > >like "we plan to deprecate the Glance V1 API". In addition to the
> > > > > >technical direction, DefCore looks at the deployment history of an
> > > > > >API. They do not want to require deploying an API if it is not seen
> > > > > >as widely usable, and they look for some level of existing adoption
> > > > > > >by cloud providers and distributors as an indication that the
> > > > > >API is desired and can be successfully used. Because we have multiple
> > > > > >upload APIs, the message we're sending on technical direction is
> > > > > >weak right now, and so they have focused on deployment considerations
> > > > > >to resolve the question.
> > > > >
> > > > > The task upload process you're referring to is the one that uses the
> > > > > `import` task, which allows you to download an image from an external
> > > > > source, asynchronously, and import it in Glance. This is the old
> > > > > `copy-from` behavior that was moved into a task.
> > > > >
> > > > > The "fun" thing about this - and I'm sure other folks in the Glance
> > > > > community will disagree - is that I don't consider tasks to be a
> > > > > public API. That is to say, I would expect tasks to be an internal API
> > > > > > used by cloud admins to perform some actions (based on its current
> > > > > implementation). Eventually, some of these tasks could be triggered
> > > > > from the external API but as background operations that are triggered
> > > > > by the well-known public ones and not through the task API.
> > > >
> > > > Does that mean it's more of an "admin" API?
> > > >
> > >
> > > I think it is basically just a half-way done implementation that is
> > > exposed directly to users of Rackspace Cloud and, AFAIK, nobody else.
> > > When last I tried to make integration tests in shade that exercised the
> > > upstream glance task import code, I was met with an implementation that
> > > simply did not work, because the pieces behind it had never been fully
> > > implemented upstream. That may have been resolved, but in the process
> > > of trying to write tests and make this work, I discovered a system that
> > > made very little sense from a user standpoint. I want to upload an
> > > image, why do I want a task?!
> > >
> > > > >
> > > > > Ultimately, I believe end-users of the cloud simply shouldn't care
> > > > > about what tasks are or aren't and more importantly, as you mentioned
> > > > > later in the email, tasks make clouds not interoperable. I'd be pissed
> > > > > if my public image service would ask me to learn about tasks to be
> > > > > able to use the service.
> > > >
> > > > It would be OK if a public API set up to do a specific task returned a
> > > > task ID that could be used with a generic task API to check status, etc.
> > > > So the idea of tasks isn't completely bad, it's just too vague as it's
> > > > exposed right now.
> > > >
> > >
> > > I think it is a concern, because it is assuming users will want to do
> > > generic things with a specific API. This turns into a black-box game where
> > > the user shoves a task in and then waits to see what comes out the other
> > > side. Not something I want to encourage users to do or burden them with.
> > >
> > > We have an API whose sole purpose is to accept image uploads. That
> > > Rackspace identified a scaling pain point there is _good_. But why not
> > > *solve* it for the user, instead of introducing more complexity?
> >
> > That's fair. I don't actually care which API we have, as long as it
> > meets the other requirements.
> >
> > >
> > > What I'd like to see is the upload image API given the ability to
> > > respond with a URL that can be uploaded to using the object storage API
> > > we already have in OpenStack. Exposing users to all of these operator
> > > choices is just wasting their time. Just simply say "Oh, you want to
> > > upload an image? That's fine, please upload it as an object over there
> > > and POST here again when it is ready to be imported." This will make
> > > perfect sense to a user reading docs, and doesn't require them to grasp
> > > an abstract concept like "tasks" when all they want to do is upload
> > > their image.
> > >
> >
> > And what would it do if the backing store for the image service
> > isn't Swift or another object storage system that supports direct
> > uploads? Return a URL that pointed back to itself, maybe?
>
> For those operators who don't have concerns about scaling the glance
> API service to their users' demands, glance's image upload API works
> perfectly well today. The indirect approach is only meant to deal with
> the situation where the operator expects a lot of really large images to
> be uploaded simultaneously, and would like to take advantage of the Swift
> API's rather rich set of features for making that a positive experience.
> There is also a user benefit to using the Swift API, which is that a
> segmented upload can more easily be resumed.
>
> Now, IMO HTTP has facilities for that too, it's just that glanceclient
> (and lo, many HTTP clients) aren't well versed in those deeper, optional
> pieces of HTTP. That is why Swift works the way it does, and I like
> the idea of glance simply piggybacking on the experience of many years
> of production refinement that are available and codified in Swift and
> any other OpenStack Object Storage API implementations (like the Ceph
> RADOS gateway).
>
That's fine, as an option. But we have existing business requirements
(as differentiated from technical requirements) that constrain us
and prevent us from inserting a hard dependency from glance to
swift. We could even make it a required option, so that using the
"platform" trademark includes that behavior. But we must support a
version of glance that does not depend on swift for the "compute"
trademark program as it is defined today.
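
To make the "option" idea a bit more concrete, here is a rough sketch
of how a client might choose between the two paths. Everything in it
is hypothetical: UploadInstructions, object_store_url and plan_upload
are names made up for illustration, not part of the Glance API or of
glanceclient, and the steps are plain strings rather than real HTTP
calls. It only shows the shape of the decision, assuming the image
service tells the client where to send the bits.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UploadInstructions:
    """Hypothetical reply from the image service when an image record is created."""
    image_id: str
    # None means the deployment has no object store to offload to, so the
    # client sends the image data straight to the image service.
    object_store_url: Optional[str] = None


def plan_upload(instructions: UploadInstructions) -> List[str]:
    """Return the ordered steps a client would take for this deployment."""
    if instructions.object_store_url is None:
        # Direct path: works without any dependency on an object store.
        return [
            f"send image data to the image service for {instructions.image_id}",
        ]
    # Indirect path: upload to the object store, then ask for an import.
    return [
        f"send image data to {instructions.object_store_url}",
        f"ask the image service to import {instructions.image_id}",
    ]


if __name__ == "__main__":
    direct = UploadInstructions(image_id="img-1")
    indirect = UploadInstructions(
        image_id="img-2",
        object_store_url="https://objects.example.com/v1/uploads/img-2",
    )
    for case in (direct, indirect):
        print(case.image_id, plan_upload(case))
```

The point of the sketch is that the direct path has to keep working
on its own. The object store path is an extra branch the deployment
can opt into, not something the client can assume is there.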
Doug