[openstack-dev] [trove][all][tc] A proposal to rearchitect Trove
Hongbin Lu
hongbin.lu at huawei.com
Thu Jun 22 19:27:56 UTC 2017
> On 20/06/17 11:45, Jay Pipes wrote:
> > Good discussion, Zane. Comments inline.
> ++
> > On 06/20/2017 11:01 AM, Zane Bitter wrote:
> >> On 20/06/17 10:08, Jay Pipes wrote:
> >>> On 06/20/2017 09:42 AM, Doug Hellmann wrote:
> >>>> Does "service VM" need to be a first-class thing? Akanda creates
> >>>> them, using a service user. The VMs are tied to a "router" which is
> >>>> the billable resource that the user understands and interacts with
> >>>> through the API.
> >>>
> >>> Frankly, I believe all of these types of services should be built as
> >>> applications that run on OpenStack (or other) infrastructure. In
> >>> other words, they should not be part of the infrastructure itself.
> >>>
> >>> There's really no need for a user of a DBaaS to have access to the
> >>> host or hosts the DB is running on. If the user really wanted that,
> >>> they would just spin up a VM/baremetal server and install the thing
> >>> themselves.
> >>
> >> Hey Jay,
> >> I'd be interested in exploring this idea with you, because I think
> >> everyone agrees that this would be a good goal, but at least in my
> >> mind it's not obvious what the technical solution should be.
> >> (Actually, I've read your email a bunch of times now, and I go back
> >> and forth on which one you're actually advocating for.) The two
> >> options, as I see it, are as follows:
> >>
> >> 1) The database VMs are created in the user's tena^W project. They
> >> connect directly to the tenant's networks, are governed by the user's
> >> quota, and are billed to the project as Nova VMs (on top of whatever
> >> additional billing might come along with the management services). A
> >> [future] feature in Nova (https://review.openstack.org/#/c/438134/)
> >> allows the Trove service to lock down access so that the user cannot
> >> actually interact with the server using Nova, but must go through the
> >> Trove API. On a cloud that doesn't include Trove, a user could run
> >> Trove as an application themselves and all it would have to do
> >> differently is not pass the service token to lock down the VM.
> >>
> >> alternatively:
> >>
> >> 2) The database VMs are created in a project belonging to the
> >> operator of the service. They're connected to the user's network
> >> through <magic>, and isolated from other users' databases running in
> >> the same project through <security groups? hierarchical projects? magic?>.
> >> Trove has its own quota management and billing. The user cannot
> >> interact with the server using Nova since it is owned by a different
> >> project. On a cloud that doesn't include Trove, a user could run
> >> Trove as an application themselves, by giving it credentials for
> >> their own project and disabling all of the cross-tenant networking stuff.
> >
> > None of the above :)
> >
> > Don't think about VMs at all. Or networking plumbing. Or volume
> > storage or any of that.
> OK, but somebody has to ;)
> > Think only in terms of what a user of a DBaaS really wants. At the end
> > of the day, all they want is an address in the cloud where they can
> > point their application to write and read data from.
> >
> > Do they want that data connection to be fast and reliable? Of course,
> > but how that happens is irrelevant to them.
> >
> > Do they want that data to be safe and backed up? Of course, but how
> > that happens is irrelevant to them.
> Fair enough. The world has changed a lot since RDS (which was the model
> for Trove) was designed, so it's certainly worth reviewing the base
> assumptions before embarking on a new design.
> > The problem with many of these high-level *aaS projects is that they
> > consider their user to be a typical tenant of general cloud
> > infrastructure -- focused on launching VMs and creating volumes and
> > networks etc. And the discussions around the implementation of these
> > projects always come back to minutiae about how to set up secure
> > communication channels between a control plane message bus and the
> > service VMs.
> Incidentally, the reason that discussions always come back to that is
> because OpenStack isn't very good at it, which is a huge problem not
> only for the *aaS projects but for user applications in general running
> on OpenStack.
> If we had fine-grained authorisation and ubiquitous multi-tenant
> asynchronous messaging in OpenStack then I firmly believe that we, and
> application developers, would be in much better shape.
> > If you create these projects as applications that run on cloud
> > infrastructure (OpenStack, k8s or otherwise),
> I'm convinced there's an interesting idea here, but the terminology
> you're using doesn't really capture it. When you say 'as applications
> that run on cloud infrastructure', it sounds like you mean they should
> run in a Nova VM, or in a Kubernetes cluster somewhere, rather than on
> the OpenStack control plane. I don't think that's what you mean though,
> because you can (and IIUC Rackspace does) deploy OpenStack services
> that way already, and it has no real effect on the architecture of
> those services.
> > then the discussions focus
> > instead on how the real end-users -- the ones that actually call the
> > APIs and utilize the service -- would interact with the APIs and not
> > the underlying infrastructure itself.
> >
> > Here's an example to think about...
> >
> > What if a provider of this DBaaS service wanted to jam 100 database
> > instances on a single VM and provide connectivity to those database
> > instances to 100 different tenants?
> >
> > Would those tenants know if those databases were all serviced from a
> > single database server process running on the VM?
> You bet they would when one (or all) of the other 99 decided to run a
> really expensive query at an inopportune moment :)
> > Or 100 containers each
> > running a separate database server process? Or 10 containers running
> > 10 database server processes each?
> >
> > No, of course not. And the tenant wouldn't care at all, because the
> Well, if they had any kind of regulatory (or even performance)
> requirements then the tenant might care really quite a lot. But I take
> your point that many might not and it would be good to be able to offer
> them lower cost options.
> > point of the DBaaS service is to get a database. It isn't to get one
> > or more VMs/containers/baremetal servers.
> I'm not sure I entirely agree here. There are two kinds of DBaaS. One
> is a data API: a multitenant database a la DynamoDB. Those are very
> cool, and I'm excited about the potential to reduce the granularity of
> billing to a minimum, in much the same way Swift does for storage, and
> I'm sad that OpenStack's attempt in this space (MagnetoDB) didn't work
> out. But Trove is not that.
> People use Trove because they want to use a *particular* database, but
> still have all the upgrades, backups, &c. handled for them. Given that
> the choice of database is explicitly *not* abstracted away from them,
> things like how many different VMs/containers/baremetal servers the
> database is running on are very much relevant IMHO, because what you
> want depends on both the database and how you're trying to use it. And
> because (afaik) none of them have native multitenancy, it's necessary
> that no tenant should have to share with any other.
> Essentially Trove operates at a moderate level of abstraction -
> somewhere between managing the database + the infrastructure it runs on
> yourself and just an API endpoint you poke data into. It also operates
> at the coarse end of a granularity spectrum running from
> VMs->Containers->pay as you go.
> It's reasonable to want to move closer to the middle of the granularity
> spectrum. But you can't go all the way to the high abstraction/fine
> grained ends of the spectra (which turn out to be equivalent) without
> becoming something qualitatively different.
> > At the end of the day, I think Trove is best implemented as a hosted
> > application that exposes an API to its users that is entirely separate
> > from the underlying infrastructure APIs like Cinder/Nova/Neutron.
> >
> > This is similar to Kevin's k8s Operator idea, which I support but in a
> > generic fashion that isn't specific to k8s.
> >
> > In the same way that k8s abstracts the underlying infrastructure (via
> > its "cloud provider" concept), I think that Trove and similar projects
> > need to use a similar abstraction and focus on providing a different
> > API to their users that doesn't leak the underlying infrastructure
> > concepts out.
> OK, so trying to summarise (stop me if I'm getting it wrong):
> essentially you support option (2) because it is a closed abstraction.
> Trove has its own quota management, billing, &c. and the user can't see
> the VM, so the operator is free to substitute a different backend that
> allocates compute capacity in finer-grained increments than Nova does.
> Interestingly, that's only an issue because there is no finer-grained
> compute resource than a VM available through the OpenStack API. If
> there were an OpenStack API (or even just a Keystone-authenticated API)
> to a shared, multitenant container orchestration cluster, this wouldn't
> be an issue. But apart from OpenShift, I can't think of any cloud
[Hongbin Lu] I just wanted to clarify that there is such an OpenStack API: Zun. Zun's API is container-centric, so it gives you a finer-grained compute resource than a VM, namely a container. Zun is Keystone-authenticated and multitenant, and it can be combined with Heat [1] (or Senlin in the future) to provide functionality equivalent to container orchestration.
[1] https://review.openstack.org/#/c/437810/
> service that's doing that - AWS, Google, OpenStack are all using the
> model where the COE cluster is deployed on VMs that are owned by a
> particular tenant. Of all the things you could run in containers on
> shared servers, databases have arguably the most to lose (performance,
> security) and the least to gain (since they're by definition stateful).
> So my question is:
> if this is such a good idea for databases, why isn't anybody doing it
> for everything container-based? i.e. instead of Magnum/Zun should we
> just be working on a Keystone auth gateway for OpenShift (a.k.a. the
> _one_ thing that _everyone_ had hitherto agreed was definitely out of
> scope :D )?
> Until then it seems to me that the tradeoff is between decoupling it
> from the particular cloud it's running on so that users can optionally
> deploy it standalone (essentially Vish's proposed solution for the *aaS
> services from many moons ago) vs. decoupling it from OpenStack in
> general so that the operator has more flexibility in how to deploy.
> I'd love to be able to cover both - from a user using it standalone to
> spin up and manage a DB in containers on a shared PaaS, through to a
> user accessing it as a service to provide a DB running on a dedicated
> VM or bare metal server, and everything in between. I don't know if
> such a thing is feasible. I suspect we're going to have to talk a lot
> about VMs and network plumbing and volume storage :)
> cheers,
> Zane.
> > Best,
> > -jay
> >
> >> Of course the current situation, as Amrith alluded to, where the
> >> default is option (1) except without the lock-down feature in Nova,
> >> though some operators are deploying option (2) but it's not tested
> >> upstream... clearly that's the worst of all possible worlds, and
> >> nobody disagrees with that.
> >>
> >> To my mind, (1) sounds more like "applications that run on OpenStack
> >> (or other) infrastructure", since it doesn't require stuff like the
> >> admin-only cross-project networking that makes it effectively "part
> >> of the infrastructure itself" - as evidenced by the fact that
> >> unprivileged users can run it standalone with little more than a
> >> simple auth middleware change. But I suspect you are going to use
> >> similar logic to argue for (2)? I'd be interested to hear your thoughts.
> >>
> >> cheers,
> >> Zane.
> >>
> >>
