[openstack-dev] More on the topic of DELIMITER, the Quota Management Library proposal
Jay Pipes
jaypipes at gmail.com
Sat Apr 23 20:10:25 UTC 2016
Looking forward to arriving in Austin so that I can buy you a beer,
Amrith, and have a high-bandwidth conversation about how you're wrong. :P
Comments inline.
On 04/23/2016 11:25 AM, Amrith Kumar wrote:
> On Sat, 2016-04-23 at 10:26 -0400, Andrew Laski wrote:
>> On Fri, Apr 22, 2016, at 09:57 PM, Tim Bell wrote:
>>> I have reservations on f and g.
>>> On f., We have had a number of discussions in the past about
>>> centralising quota (e.g. Boson) and the project teams of the other
>>> components wanted to keep the quota contents ‘close’. This can
>>> always be reviewed further with them but I would hope for at least a
>>> standard schema structure of tables in each project for the handling
>>> of quota.
>>> On g., aren’t all projects now nested projects ? If we have the
>>> complexity of handling nested projects sorted out in the common
>>> library, is there a reason why a project would not want to support
>>> nested projects ?
>>> One other issue is how to do reconciliation: each project needs to
>>> have a mechanism to re-calculate the current allocations and
>>> reconcile that with the quota usage. While in an ideal world, this
>>> should not be necessary, it would be for the foreseeable future,
>>> especially with a new implementation.
>> One of the big reasons that Jay and I have been pushing to remove
>> reservations and tracking of quota in a separate place than the
>> resources are actually used, e.g., an instance record in the Nova db,
>> is so that reconciliation is not necessary. For example, if RAM quota
>> usage is simply tracked as sum(instances.memory_mb) then you can be
>> sure that usage is always up to date.
> Uh oh, there be gremlins here ...
> I am positive that this will NOT work, see earlier conversations about
> isolation levels, and Jay's alternate solution.
> The way (I understand the issue, and Jay's solution) you get around the
> isolation levels trap is to NOT do your quota determinations based on a
> SUM(column) but rather based on the rowcount on a well crafted UPDATE of
> a single table that stored total quota.
No, we would do our quota calculations by doing a SUM(used) against the
allocations table. There is no separate table that stored the total
quota (or quota usage records). That's the source of the problem with
the existing quota handling code in Nova. The generation field value is
used to provide the consistent view of the actual resource usage records
so that the INSERT operations for all claimed resources can be done in a
transactional manner and will be rolled back if any other writer changes
the amount of consumed resources on a provider (which of course would
affect the quota check calculations).
> You could also store a detail
> claim record for each claim in an independent table that is maintained
> in the same database transaction if you so desire, that is optional.
The allocations table is the "detail claim record" table that you refer
to above.
> My view of how this would work (which I described earlier as building on
> Jay's solution) is that the claim flow would look like this:
> select total_used, generation
> from quota_claimed
> where tenant = <tenant> and resource = 'memory'
There is no need to keep a total_used value for anything. That is
denormalized calculated data that merely adds a point of race
contention. The quota check is against the *detail* table (allocations),
which stores the *actual resource usage records*.
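To make the "check against the detail table" idea concrete, here is a minimal sketch using SQLite from Python. The simplified allocations schema (no resource_providers join), the sample rows, and the helper names usage_for and within_limits are assumptions made for illustration, not the actual Nova schema or Delimiter API.

```python
import sqlite3

# Assumed, simplified detail table: one row per actual resource usage record.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE allocations ("
    " consumer_uuid TEXT, resource_class TEXT, used INTEGER)"
)
conn.executemany(
    "INSERT INTO allocations VALUES (?, ?, ?)",
    [
        ("user-1", "MEMORY_MB", 2048),
        ("user-1", "MEMORY_MB", 4096),
        ("user-1", "VCPU", 2),
        ("user-2", "MEMORY_MB", 8192),
    ],
)


def usage_for(conn, consumer_uuid):
    """Return {resource_class: total_used}, computed from the detail rows."""
    rows = conn.execute(
        "SELECT resource_class, SUM(used) FROM allocations"
        " WHERE consumer_uuid = ? GROUP BY resource_class",
        (consumer_uuid,),
    )
    return dict(rows.fetchall())


def within_limits(requested, used, limits):
    """True if used + requested stays at or below the limit for every class."""
    return all(
        used.get(rc, 0) + amount <= limits[rc]
        for rc, amount in requested.items()
    )
```

No total_used column is stored anywhere in this sketch; the totals exist only as the result of the SUM query, so there is no second copy to drift out of sync.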
> begin transaction
>     update quota_claimed
>     set total_used = total_used + claim, generation = generation + 1
>     where tenant = <tenant> and resource = 'memory'
>       and generation = <generation>
>       and total_used + claim < limit
This part of the transaction must always occur **after** the insertion
of the actual resource records, not before.
> if @@rowcount = 1
> -- optional claim_detail table
> insert into claim_detail values ( <tenant>, 'memory',
> claim, ...)
> commit
> else
> rollback
So, in pseudo-Python-SQLish code, my solution works like this:
limits = get_limits_from_delimiter()
requested = get_requested_from_request_spec()
while True:
    used := SELECT
              SUM(used) as total_used
            FROM allocations
            JOIN resource_providers ON (...)
            WHERE consumer_uuid = $USER_UUID
    # Check that our requested resource amounts don't exceed quotas
    if not check_requested_within_limits(requested, used, limits):
        raise QuotaExceeded
    # Claim all requested resources. Note that the generation retrieved
    # from the above query is our consistent view marker. If the UPDATE
    # below succeeds and returns != 0 rows affected, that means there
    # was no other writer that changed our resource usage in between
    # this thread's claiming of resources, and therefore we prevent
    # any oversubscription of resources.
    provider := SELECT id, generation, ... FROM resource_providers
                JOIN (...)
                WHERE (<resource_usage_filters>)
    for resource in requested:
        INSERT INTO allocations (
    rows_affected := UPDATE resource_providers
                     SET generation = generation + 1
                     WHERE id = $provider.id
                     AND generation = $used[$provider.id].generation;
    if $rows_affected == 0:
The only reason we would need a post-claim quota check is if some of the
requested resources are owned and tracked by an external-to-Nova system.
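As a rough, runnable illustration of the claim loop in the pseudocode above, here is a sketch against SQLite. The two-table schema, the function names make_db and claim, the max_retries parameter, and the separate generation read are all assumptions for the example; the real schema and queries differ.

```python
import sqlite3


class QuotaExceeded(Exception):
    """Raised when a request would push usage past a limit."""


def make_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so the
    # BEGIN/COMMIT/ROLLBACK statements below are issued explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.executescript("""
        CREATE TABLE resource_providers (
            id INTEGER PRIMARY KEY, generation INTEGER NOT NULL);
        CREATE TABLE allocations (
            resource_provider_id INTEGER, resource_class TEXT,
            consumer_uuid TEXT, used INTEGER);
        INSERT INTO resource_providers VALUES (1, 0);
    """)
    return conn


def claim(conn, provider_id, consumer_uuid, requested, limits, max_retries=5):
    for _ in range(max_retries):
        # Read the generation first: it is the consistent-view marker.
        (generation,) = conn.execute(
            "SELECT generation FROM resource_providers WHERE id = ?",
            (provider_id,),
        ).fetchone()
        # Usage is computed from the actual usage records, never cached.
        used = dict(conn.execute(
            "SELECT resource_class, SUM(used) FROM allocations"
            " WHERE consumer_uuid = ? GROUP BY resource_class",
            (consumer_uuid,),
        ).fetchall())
        for rc, amount in requested.items():
            if used.get(rc, 0) + amount > limits[rc]:
                raise QuotaExceeded(rc)
        conn.execute("BEGIN")
        conn.executemany(
            "INSERT INTO allocations VALUES (?, ?, ?, ?)",
            [(provider_id, rc, consumer_uuid, amount)
             for rc, amount in requested.items()],
        )
        cur = conn.execute(
            "UPDATE resource_providers SET generation = generation + 1"
            " WHERE id = ? AND generation = ?",
            (provider_id, generation),
        )
        if cur.rowcount == 1:
            conn.execute("COMMIT")
            return
        # Another writer bumped the generation: discard the inserts, retry.
        conn.execute("ROLLBACK")
    raise RuntimeError("gave up after repeated concurrent updates")
```

The inserts and the generation bump share one transaction, so a failed compare-and-update discards the usage records along with it.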
BTW, note to Ed Leafe... unless your distributed data store supports
transactional semantics, you can't use a distributed data store for
these types of solutions. Instead, you will need to write a whole bunch
of code that does post-auditing of claims and quotas and a system that
accepts that oversubscription and out-of-sync quota limits and usages are
facts of life. Not to mention needing to implement JOINs in Python.
> But, it is my understanding that
> (a) if you wish to do the SUM(column) approach that you propose,
> you must have a reservation that is committed and then you must
> re-read the SUM(column) to make sure you did not over-subscribe;
> and
Erm, kind of? Oversubscription is not possible in the solution I
describe because the compare-and-update on the
resource_providers.generation field allows for a consistent view of the
resources used -- and if that view changes during the insertion of
resource usage records -- the transaction containing those insertions is
rolled back.
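A small sketch of that rollback behaviour, assuming a stripped-down schema and a hypothetical helper write_allocation: two writers both read generation 0, the first one commits, and the second one's insert is rolled back because its compare-and-update matches no rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
conn.execute(
    "CREATE TABLE resource_providers (id INTEGER PRIMARY KEY, generation INTEGER)"
)
conn.execute("CREATE TABLE allocations (resource_provider_id INTEGER, used INTEGER)")
conn.execute("INSERT INTO resource_providers VALUES (1, 0)")


def write_allocation(conn, provider_id, seen_generation, used):
    """Insert one usage record; keep it only if the generation is unchanged."""
    conn.execute("BEGIN")
    conn.execute("INSERT INTO allocations VALUES (?, ?)", (provider_id, used))
    cur = conn.execute(
        "UPDATE resource_providers SET generation = generation + 1"
        " WHERE id = ? AND generation = ?",
        (provider_id, seen_generation),
    )
    if cur.rowcount == 0:
        conn.execute("ROLLBACK")  # stale view: the insert above is discarded
        return False
    conn.execute("COMMIT")
    return True


# Both writers observed generation 0 before either of them wrote.
seen_by_a = seen_by_b = 0
a_ok = write_allocation(conn, 1, seen_by_a, 10)
b_ok = write_allocation(conn, 1, seen_by_b, 10)
row_count = conn.execute("SELECT COUNT(*) FROM allocations").fetchone()[0]
```

Writer B would then re-read usage and the new generation, and try again.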
> (b) to get away from reservations you must stop using the
> SUM(column) approach and instead use a single quota_claimed
> table to determine the current quota claimed.
No. This has nothing to do with reservations.
> At least that's what I understand of Jay's example from earlier in this
> thread.
> Let's definitely discuss this in Austin. While I don't love Jay's
> solution for other reasons to do with making the quota table a hotspot
> and things like that, it is a perfectly workable solution, I think.
There is no quota table in my solution.
If you refer to the resource_providers table (the table that has the
generation field), then yes, it's a hot spot. But hot spots in the DB
aren't necessarily a bad thing if you design the underlying schema properly.
More in Austin.
>>> Tim
>>> From: Amrith Kumar <amrith at tesora.com>
>>> Reply-To: "OpenStack Development Mailing List (not for usage
>>> questions)" <openstack-dev at lists.openstack.org>
>>> Date: Friday 22 April 2016 at 06:51
>>> To: "OpenStack Development Mailing List (not for usage questions)"
>>> <openstack-dev at lists.openstack.org>
>>> Subject: Re: [openstack-dev] More on the topic of DELIMITER, the
>>> Quota Management Library proposal
>>> I’ve thought more about Jay’s approach to enforcing quotas
>>> and I think we can build on and around it. With that
>>> implementation as the basic quota primitive, I think we can
>>> build a quota management API that isn’t dependent on
>>> reservations. It does place some burdens on the consuming
>>> projects that I had hoped to avoid and these will cause
>>> heartburn for some (make sure that you always request
>>> resources in a consistent order and free them in a
>>> consistent order being the most obvious).
>>> If it doesn’t make it harder, I would like to see if we can
>>> make the quota API take care of the ordering of requests.
>>> i.e. if the quota API is an extension of Jay’s example and
>>> accepts some data structure (dict?) with all the claims that
>>> a project wants to make for some operation, and then
>>> proceeds to make those claims for the project in the
>>> consistent order, I think it would be of some value.
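One way such an ordering wrapper could look, purely as a sketch: the caller passes a dict of claims, and the library claims them sorted by resource name and releases them in reverse on failure. The names claim_all, claim_one and release_one are hypothetical, not part of any proposed Delimiter interface.

```python
def ordered_claims(claims):
    """Return (resource, amount) pairs in a deterministic order (by name)."""
    return sorted(claims.items())


def claim_all(claims, claim_one, release_one):
    """Claim every resource in sorted order.

    If any claim fails, release the ones already made in reverse order
    and re-raise, so callers never have to think about ordering.
    """
    done = []
    try:
        for name, amount in ordered_claims(claims):
            claim_one(name, amount)
            done.append((name, amount))
    except Exception:
        for name, amount in reversed(done):
            release_one(name, amount)
        raise
    return done
```

Sorting by name is an arbitrary choice here; any total order works as long as every caller uses the same one.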
>>> Beyond that, I’m on board with a-g below,
>>> -amrith
>>> From: Vilobh Meshram
>>> [mailto:vilobhmeshram.openstack at gmail.com]
>>> Sent: Friday, April 22, 2016 4:08 AM
>>> To: OpenStack Development Mailing List (not for usage
>>> questions) <openstack-dev at lists.openstack.org>
>>> Subject: Re: [openstack-dev] More on the topic of DELIMITER,
>>> the Quota Management Library proposal
>>> I strongly agree with Jay on the points related to "no
>>> reservation" , keeping the interface simple and the role for
>>> Delimiter (impose limits on resource consumption and enforce
>>> quotas).
>>> The point about keeping user quotas and tenant quotas in Keystone
>>> sounds interesting and would need support from the Keystone
>>> team. We have a cross-project session planned [1] and will
>>> definitely bring that up in that session.
>>> The main thought with which Delimiter was formed was to
>>> enforce resource quotas in a transaction-safe manner and do it
>>> in a cross-project conducive manner, and it still holds
>>> true. Delimiter's mission is to impose limits on
>>> resource consumption and enforce quotas in a transaction-safe
>>> manner. A few key aspects of Delimiter are:
>>> a. Delimiter will be a new Library and not a Service.
>>> Details covered in spec.
>>> b. Delimiter's role will be to impose limits on resource
>>> consumption.
>>> c. Delimiter will not be responsible for rate limiting.
>>> d. Delimiter will not maintain data for the resources.
>>> Respective projects will take care of keeping, maintaining
>>> data for the resources and resource consumption.
>>> e. Delimiter will not have the concept of "reservations".
>>> Delimiter will read or update the "actual" resource tables
>>> and will not rely on the "cached" tables. At present, the
>>> quota infrastructure in Nova, Cinder and other projects has
>>> tables such as reservations, quota_usage, etc which are used
>>> as "cached tables" to track re
>>> f. Delimiter will fetch the information for project quota,
>>> user quota from a centralized place, say Keystone, or if
>>> that doesn't materialize will fetch default quota values
>>> from respective service. This information will be cached
>>> since it gets updated rarely but read many times.
>>> g. Delimiter will take into consideration whether the
>>> project namespace is Flat or Nested when calculating
>>> allocated and available resources. Nested means the project
>>> namespace is hierarchical and Flat means the project namespace
>>> is not hierarchical.
>>> -Vilobh
>>> [1] https://www.openstack.org/summit/austin-2016/summit-schedule/events/9492
>>> On Thu, Apr 21, 2016 at 11:08 PM, Joshua Harlow
>>> <harlowja at fastmail.com> wrote:
>>> Since people will be on a plane soon,
>>> I threw this together as an example of a quota engine
>>> (the zookeeper code does even work, and yes it
>>> provides transactional semantics due to the nice
>>> abilities of zookeeper znode versions[1] and its
>>> inherent consistency model, yippee).
>>> https://gist.github.com/harlowja/e7175c2d76e020a82ae94467a1441d85
>>> Someone else can fill in the db quota engine with a
>>> similar/equivalent api if they so dare, ha. Or even
>>> feel free to say the gist/api above is crap, cause that's
>>> ok too, lol.
>>> [1] https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#Data+Access
>>> Amrith Kumar wrote:
>>> Inline below ... thread is too long, will
>>> catch you in Austin.
>>> -----Original Message-----
>>> From: Jay Pipes [mailto:jaypipes at gmail.com]
>>> Sent: Thursday, April 21, 2016 8:08 PM
>>> To: openstack-dev at lists.openstack.org
>>> Subject: Re: [openstack-dev] More on the topic of DELIMITER, the
>>> Quota Management Library proposal
>>> Hmm, where do I start... I think I will just cut to the two primary
>>> disagreements I have. And I will top-post because this email is way
>>> too big.
>>> 1) On serializable isolation level.
>>> No, you don't need it at all to prevent races in claiming. Just use
>>> a compare-and-update with retries strategy. Proof is here:
>>> https://github.com/jaypipes/placement-bench/blob/master/placement.py#L97-L142
>>> Works great and prevents multiple writers from oversubscribing any
>>> resource without relying on any particular isolation level at all.
>>> The `generation` field in the inventories table is what allows
>>> multiple writers to ensure a consistent view of the data without
>>> needing to rely on heavy lock-based semantics and/or RDBMS-specific
>>> isolation levels.
>>> [amrith] this works for what it is doing, we
>>> can definitely do this. This will work at
>>> any isolation level, yes. I didn't want to
>>> go this route because it is going to still
>>> require an insert into another table
>>> recording what the actual 'thing' is that is
>>> claiming the resource and that insert is
>>> going to be in a different transaction and
>>> managing those two transactions was what I
>>> wanted to avoid. I was hoping to avoid
>>> having two tables tracking claims, one
>>> showing the currently claimed quota and
>>> another holding the things that claimed that
>>> quota. Have to think again whether that is
>>> possible.
>>> 2) On reservations.
>>> The reason I don't believe reservations are necessary to be in a
>>> quota library is because reservations add a concept of a time to a
>>> claim of some resource. You reserve some resource to be claimed at
>>> some point in the future and release those resources at a point
>>> further in time.
>>> Quota checking doesn't look at what the state of some system will be
>>> at some point in the future. It simply returns whether the system
>>> *right now* can handle a request *right now* to claim a set of
>>> resources.
>>> If you want reservation semantics for some resource, that's totally
>>> cool, but IMHO, a reservation service should live outside of the
>>> service that is actually responsible for providing resources to a
>>> consumer.
>>> Merging right-now quota checks and future-based reservations into
>>> the same library just complicates things unnecessarily IMHO.
>>> [amrith] extension of the above ...
>>> 3) On resizes.
>>> Look, I recognize some users see some value in resizing their
>>> resources. That's fine. I personally think expand operations are
>>> fine, and that shrink operations are really the operations that
>>> should be prohibited in the API. But, whatever, I'm fine with
>>> resizing of requested resource amounts. My big point is if you don't
>>> have a separate table that stores quota_usages and instead only have
>>> a single table that stores the actual resource usage records, you
>>> don't have to do *any* quota check operations at all upon deletion
>>> of a resource. For modifying resource amounts (i.e. a resize) you
>>> merely need to change the calculation of requested resource amounts
>>> to account for the already-consumed usage amount.
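A tiny sketch of that resize calculation, with hypothetical helper names: only the growth beyond what the resource already consumes counts as newly requested, and a shrink requests nothing.

```python
def resize_request(current, new):
    """Amount that must pass the quota check when resizing current -> new.

    Shrinking (or keeping the same size) requests nothing extra.
    """
    return max(new - current, 0)


def resize_allowed(total_used, current, new, limit):
    """total_used already includes the resource's current size."""
    return total_used + resize_request(current, new) <= limit
```

With a 100GB limit and a single 10GB volume, growing that volume to 200GB asks for 190GB more and fails the check, while shrinking it never needs one.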
>>> Bottom line for me: I really won't support any proposal for a
>>> complex library that takes the resource claim process out of the
>>> hands of the services that own those resources. The simpler the
>>> interface of this library, the better.
>>> [amrith] my proposal would not but this
>>> email thread has got too long. Yes, simpler
>>> interface, will catch you in Austin.
>>> Best,
>>> -jay
>>> On 04/19/2016 09:59 PM, Amrith Kumar wrote:
>>> -----Original Message-----
>>> From: Jay Pipes [mailto:jaypipes at gmail.com]
>>> Sent: Monday, April 18, 2016 2:54 PM
>>> To: openstack-dev at lists.openstack.org
>>> Subject: Re: [openstack-dev] More on the topic of DELIMITER, the
>>> Quota Management Library proposal
>>> On 04/16/2016 05:51 PM, Amrith Kumar wrote:
>>> If we therefore assume that this will be a Quota Management Library,
>>> it is safe to assume that quotas are going to be managed on a
>>> per-project basis, where participating projects will use this
>>> library. I believe that it stands to reason that any data
>>> persistence will have to be in a location decided by the individual
>>> project.
>>> Depends on what you mean by "any data persistence". If you are
>>> referring to the storage of quota values (per user, per tenant,
>>> global, etc) I think that should be done by the Keystone service.
>>> This data is essentially an attribute of the user or the tenant or
>>> the service endpoint itself (i.e. global defaults). This data also
>>> rarely changes and logically belongs to the service that manages
>>> users, tenants, and service endpoints: Keystone.
>>> If you are referring to the storage of resource usage records, yes,
>>> each service project should own that data (and frankly, I don't see
>>> a need to persist any quota usage data at all, as I mentioned in a
>>> previous reply to Attila).
>>> [amrith] You make a
>>> distinction that I had made
>>> implicitly, and it is
>>> important to highlight it.
>>> Thanks for pointing it out.
>>> Yes, I meant
>>> both of the above, and as
>>> stipulated. Global defaults
>>> in keystone
>>> (somehow, TBD) and usage
>>> records, on a per-service
>>> basis.
>>> That may not be a very interesting statement but the corollary is, I
>>> think, a very significant statement; it cannot be assumed that the
>>> quota management information for all participating projects is in
>>> the same database.
>>> It cannot be assumed that this information is even in a database at
>>> all...
>>> [amrith] I don't follow. If
>>> the service in question is
>>> to be scalable,
>>> I think it stands to reason
>>> that there must be some
>>> mechanism by which
>>> instances of the service can
>>> share usage records (as you
>>> refer to
>>> them, and I like that term).
>>> I think it stands to reason
>>> that there
>>> must be some database, no?
>>> A hypothetical service consuming the Delimiter library provides
>>> requesters with some widgets, and wishes to track the widgets that
>>> it has provisioned both on a per-user basis, and on the whole. It
>>> should therefore be multi-tenant and able to track the widgets on a
>>> per tenant basis and if required impose limits on the number of
>>> widgets that a tenant may consume at a time, during a course of a
>>> period of time, and so on.
>>> No, this last part is absolutely not what I think quota management
>>> should be about. Rate limiting -- i.e. how many requests a
>>> particular user can make of an API in a given period of time --
>>> should *not* be handled by OpenStack API services, IMHO. It is the
>>> responsibility of the deployer to handle this using off-the-shelf
>>> rate-limiting solutions (open source or proprietary).
>>> Quotas should only be about the hard limit of different types of
>>> resources that a user or group of users can consume at a given time.
>>> [amrith] OK, good point.
>>> Agreed as stipulated.
>>> Such a hypothetical service may also consume resources from other
>>> services that it wishes to track, and impose limits on.
>>> Yes, absolutely agreed.
>>> It is also understood as Jay Pipes points out in [4] that the actual
>>> process of provisioning widgets could be time consuming and it is
>>> ill-advised to hold a database transaction of any kind open for that
>>> duration of time. Ensuring that a user does not exceed some limit on
>>> the number of concurrent widgets that he or she may create therefore
>>> requires some mechanism to track in-flight requests for widgets. I
>>> view these as "intent" but not yet materialized.
>>> It has nothing to do with the amount of concurrent widgets that a
>>> user can create. It's just about the total number of some resource
>>> that may be consumed by that user.
>>> As for an "intent", I don't believe tracking intent is the right way
>>> to go at all. As I've mentioned before, the major problem in Nova's
>>> quota system is that there are two tables storing resource usage
>>> records: the *actual* resource usage tables (the allocations table
>>> in the new resource-providers modeling and the instance_extra,
>>> pci_devices and instances table in the legacy modeling) and the
>>> *quota usage* tables (quota_usages and reservations tables). The
>>> quota_usages table does not need to exist at all, and neither does
>>> the reservations table. Don't do intent-based consumption. Instead,
>>> just consume (claim) by writing a record for the resource class
>>> consumed on a provider into the actual resource usages table and
>>> then "check quotas" by querying the *actual* resource usages and
>>> comparing the SUM(used) values, grouped by resource class, against
>>> the appropriate quota limits for the user. The introduction of the
>>> quota_usages and reservations tables to cache usage records is the
>>> primary reason for the race problems in the Nova (and other) quota
>>> system because every time you introduce a caching system for
>>> highly-volatile data (like usage records) you introduce complexity
>>> into the write path and the need to track the same thing across
>>> multiple writes to different tables needlessly.
>>> [amrith] I don't agree, I'll
>>> respond to this and the next
>>> comment group
>>> together. See below.
>>> Looking up at this whole infrastructure from the perspective of the
>>> database, I think we should require that the database must not be
>>> required to operate in any isolation mode higher than
>>> READ-COMMITTED; more about that later (i.e. requiring a database run
>>> either serializable or repeatable read is a show stopper).
>>> This is an implementation detail and is not relevant to the
>>> discussion about what the interface of a quota library would look
>>> like.
>>> [amrith] I disagree, let me
>>> give you an example of why.
>>> Earlier, I wrote:
>>> Such a hypothetical service may also consume resources from other
>>> services that it wishes to track, and impose limits on.
>>> And you responded:
>>> Yes, absolutely agreed.
>>> So let's take this
>>> hypothetical service that in
>>> response to a user
>>> request, will provision a Cinder
>>> volume and a Nova instance. Let's
>>> assume
>>> that the service also imposes limits
>>> on the number of cinder volumes and
>>> nova instances the user may
>>> provision; independent of limits
>>> that Nova and
>>> Cinder may themselves maintain.
>>> One way that the
>>> hypothetical service can
>>> function is this:
>>> (a) check Cinder quota, if
>>> successful, create cinder
>>> volume
>>> (b) check Nova quota, if
>>> successful, create nova
>>> instance with cinder
>>> volume attachment
>>> Now, this is sub-optimal as
>>> there are going to be some
>>> number of cases
>>> where the nova quota check fails.
>>> Now you have needlessly created and
>>> will
>>> have to release a cinder volume. It
>>> also takes longer to fail.
>>> Another way to do this is
>>> this:
>>> (1) check Cinder quota, if
>>> successful, check Nova
>>> quota, if successful
>>> proceed to (2) else error
>>> out
>>> (2) create cinder volume
>>> (3) create nova instance
>>> with cinder attachment.
>>> I'm trying to get to this
>>> latter form of doing things.
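The second flow (steps 1 to 3) might be sketched as follows. The names check_all and provision, the plain-dict usage and limits, and the callable create steps are assumptions for illustration only. The sketch shows the ordering of checks before creation, and does not by itself close the race between checking and creating, which is the isolation-level question discussed in this thread.

```python
class QuotaExceeded(Exception):
    """Raised when a requested amount would push usage past its limit."""


def check_all(requested, usage, limits):
    """Step (1): verify every quota before anything is created."""
    for resource, amount in requested.items():
        if usage.get(resource, 0) + amount > limits[resource]:
            raise QuotaExceeded(resource)


def provision(requested, usage, limits, create_steps):
    """Run all quota checks, then steps (2) and (3) in the given order."""
    check_all(requested, usage, limits)
    return [create() for create in create_steps]
```

If the Nova check fails here, no Cinder volume has been created yet, so there is nothing to release and the failure is reported sooner.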
>>> Easy, you might say ...
>>> theoretically this should
>>> simply be:
>>> BEGIN;
>>> -- Get data to do the Cinder
>>> check
>>> SELECT ......
>>> -- Do the cinder check
>>> INSERT INTO ....
>>> -- Get data to do the Nova
>>> check
>>> SELECT ....
>>> -- Do the Nova check
>>> You can only make this work
>>> if you ran at isolation
>>> level serializable.
>>> Why?
>>> To make this run at
>>> isolation level
>>> REPEATABLE-READ, you must
>>> enforce
>>> constraints at the database level
>>> that will fail the commit. But wait,
>>> you
>>> can't do that because the data about
>>> the global limits may not be in the
>>> same database as the usage records.
>>> Later you talk about caching and
>>> stuff; all that doesn't help a
>>> database constraint.
>>> For this reason, I think
>>> there is going to have to be
>>> some cognizance to
>>> the database isolation level in the
>>> design of the library, and I think
>>> it
>>> will also impact the API that can be
>>> constructed.
>>> In general therefore, I believe that the hypothetical service
>>> processing requests for widgets would have to handle three kinds of
>>> operations, provision, modify, and destroy. The names are, I
>>> believe, self-explanatory.
>>> Generally, modification of a resource doesn't come into play. The
>>> primary exception to this is for transferring of ownership of some
>>> resource.
>>> [amrith] Trove RESIZE is a
>>> huge benefit for users and
>>> while it may be a
>>> pain as you say, this is still a
>>> very real benefit. Trove allows you
>>> to
>>> resize both your storage (resize the
>>> cinder volume) and resize your
>>> instance (change the flavor).
>>> Without loss of generality, one can say that all three of them must
>>> validate that the operation does not violate some limit (no more
>>> than X widgets, no fewer than X widgets, rates, and so on).
>>> No, only the creation (and very rarely the modification) needs any
>>> validation that a limit could be violated. Destroying a resource
>>> never needs to be checked for limit violations.
>>> [amrith] Well, if you are
>>> going to create a volume of
>>> 10GB and your
>>> limit is 100GB, resizing it to 200GB
>>> should fail, I think.
>>> Assuming that the service provisions resources from other services,
>>> it is also conceivable that limits be imposed on the quantum of
>>> those services consumed. In practice, I can imagine a service like
>>> Trove using the Delimiter project to perform all of these kinds of
>>> limit checks; I'm not suggesting that it does this today, nor that
>>> there is an immediate plan to implement all of them, just that these
>>> all seem like good uses of a Quota Management capability.
>>> - User may not have more than 25 database instances at a time
>>> - User may not have more than 4 clusters at a time
>>> - User may not consume more than 3TB of SSD storage at a time
>>> Only if SSD storage is a distinct resource class from DISK_GB. Right
>>> now, Nova makes no differentiation w.r.t. SSD or HDD or shared vs.
>>> local block storage.
>>> [amrith] It matters not to Trove whether Nova does or not. Cinder
>>> supports volume-types and users DO want to limit based on
>>> volume-type (for example).
>>> - User may not launch more than 10 huge instances at a time
>>> What is the point of such a limit?
>>> [amrith] Metering usage,
>>> placing limitations on the
>>> quantum of resources
>>> that a user may provision. Same as
>>> with Nova. A flavor is merely a
>>> simple
>>> way to tie together a bag of
>>> resources. It is a way to restrict
>>> access,
>>> for example, to specific resources
>>> that are available in the cloud.
>>> HUGE
>>> is just an example I gave, pick any
>>> flavor you want, and here's how a
>>> service like Trove uses it.
>>> Users can ask to launch an
>>> instance of a specific
>>> database+version;
>>> MySQL 5.6-48 for example. Now, an
>>> operator can restrict the instance
>>> flavors, or volume types that can be
>>> associated with the specific
>>> datastore. And the flavor could be
>>> used to map to, for example whether
>>> the
>>> instance is running on bare metal or
>>> in a VM and if so with what kind of
>>> hardware. That's a useful construct
>>> for a service like Trove.
>>> - User may not launch more than 3 clusters an hour
>>> -1. This is rate limiting and should be handled by rate-limiting
>>> services.
>>> - No more than 500 copies of Oracle may be run at a time
>>> Is "Oracle" a resource class?
>>> [amrith] As I view it, every
>>> project should be free to
>>> define its own
>>> set of resource classes and meter
>>> them as it feels fit. So, while
>>> Oracle
>>> licenses may not, conceivably a lot
>>> of things that Nova, Cinder, and the
>>> other core projects don't care
>>> about, are in fact relevant for a
>>> consumer
>>> of this library.
>>> While Nova would be the service that limits the number of instances
>>> a user can have at a time, the ability for a service to limit this
>>> further should not be underestimated.
>>> In turn, should Nova and Cinder also use the same Quota Management
>>> Library, they may each impose limitations like:
>>> - User may not launch more than 20 huge instances at a time
>>> Not a useful limitation IMHO.
>>> [amrith] I beg to differ.
>>> Again a huge instance is
>>> just an example of
>>> some flavor; and the idea is to
>>> allow a project to place its own
>>> metrics
>>> and meter based on those.
>>> - User may not launch more than 3 instances in a minute
>>> -1. This is rate limiting.
>>> - User may not consume more than 15TB of SSD at a time
>>> - User may not have more than 30 volumes at a time
>>> Again, I'm not implying that either Nova or Cinder should provide
>>> these capabilities.
>>> With this in mind, I believe that the minimal set of operations that
>>> Delimiter should provide are:
>>> - define_resource(name, max, min, user_max, user_min, ...)
>>> What would the above do? What service would it be speaking to?
>>> [amrith] I assume that this
>>> would speak with some
>>> backend (either
>>> keystone or the project itself) and
>>> record these designated limits. This
>>> is the way to register a project
>>> specific metric like "Oracle
>>> licenses".
>>> - update_resource_limits(name, user, user_max, user_min, ...)
>>> This doesn't belong in a quota library. It belongs as a REST API in
>>> Keystone.
>>> [amrith] Fine, same place
>>> where the previous thing
>>> stores the global
>>> defaults is the target of this call.
>>> - reserve_resource(name, user, size, parent_resource, ...)
>>> This doesn't belong in a quota library at all. I think reservations
>>> are not germane to resource consumption and should be handled by an
>>> external service at the orchestration layer.
>>> [amrith] Again not true; as illustrated above, this library is the
>>> thing that projects could use to determine whether or not to honor a
>>> request. This reserve/provision process is, I believe, required
>>> because of the vagaries of how we want to implement this in the
>>> database.
>>> - provision_resource(resource, id)
>>> A quota library should not be provisioning anything. A quota library
>>> should simply provide a consistent interface for *checking* that a
>>> structured request for some set of resources *can* be provided by
>>> the service.
>>> [amrith] This does not actually call Nova or anything; merely that
>>> AFTER the hypothetical service has called Nova, this converts the
>>> reservation (which can expire) into an actual allocation.
>>> - update_resource(id or resource, newsize)
>>> Resizing resources is a bad idea, IMHO. Resources are easier to deal
>>> with when they are considered of immutable size and simple (i.e. not
>>> complex or nested). I think the problem here is in the improper
>>> definition of resource classes.
>>> [amrith] Let's leave the quota library aside. This assertion strikes
>>> at the very heart of things like Nova resize, or for that matter
>>> Cinder volume resize. Are those all bad ideas? I made a 500GB Cinder
>>> volume and it is getting close to full. I'd like to resize it to
>>> 2TB. Are you saying that's not a valid use case?
>>> For example, a "cluster" is not a resource. It is a collection of
>>> resources of type node. "Resizing" a cluster is a misnomer, because
>>> you aren't resizing a resource at all. Instead, you are creating or
>>> destroying resources inside the cluster (i.e. joining or leaving
>>> cluster nodes).
>>> BTW, this is also why the "resize instance" API in Nova is such a
>>> giant pain in the ass. It's attempting to "modify" the instance
>>> "resource" when the instance isn't really the resource at all. The
>>> DISK_GB, and PCI devices are the actual resources. The instance is a
>>> convenient way to tie those resources together, and doing a "resize"
>>> of the instance behind the scenes actually performs a *move*
>>> operation, which isn't a *change* of the original resources. Rather,
>>> it is a creation of a new set of resources (of the new amounts) and
>>> a deletion of the old set of resources.
>>> [amrith] that's fine, if all we want is to handle the resize
>>> operation as a new instance followed by a deletion, that's great.
>>> But that semantic isn't necessarily the case for something like
>>> (say) cinder.
>>> The "resize" API call adds some nasty confirmation and cancel
>>> semantics to the calling interface that hint that the underlying
>>> implementation of the "resize" operation is in actuality not a
>>> resize at all, but rather a create-new-and-delete-old-resources
>>> operation.
>>> [amrith] And that isn't germane to a quota library, I don't think.
>>> What is, is this. Do we want to treat the transient state when there
>>> are (for example, in Nova) two instances, one of the new flavor and
>>> one of the old flavor, or not. But, from the perspective of a quota
>>> library, a resize operation is merely a reset of the quota by the
>>> delta in the resource consumed.
>>> - release_resource(id or resource)
>>> - expire_reservations()
>>> I see no need to have reservations in the quota library at all, as
>>> mentioned above.
>>> [amrith] Then I think the quota library must require that either (a)
>>> the underlying database runs serializable or (b) database
>>> constraints can be used to enforce that at commit the global limits
>>> are adhered to.
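To make option (b) a little more concrete, here is a minimal sketch using Python's standard sqlite3 module. The table and column names (usage, used, max_allowed) and the consume() helper are made up for illustration and are not part of any Delimiter API. Note also that SQLite checks the constraint when the statement runs rather than at commit, so this only approximates the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the database itself refuses to store usage
# above the limit, whatever the isolation level of the writers.
conn.execute(
    "CREATE TABLE usage ("
    " resource TEXT PRIMARY KEY,"
    " used INTEGER NOT NULL,"
    " max_allowed INTEGER NOT NULL,"
    " CHECK (used >= 0 AND used <= max_allowed))"
)
conn.execute("INSERT INTO usage VALUES ('database.count', 0, 2)")
conn.commit()


def consume(conn, resource, amount):
    """Return True if usage could be increased without breaking the limit."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE usage SET used = used + ? WHERE resource = ?",
                (amount, resource),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected the write.
        return False


results = [consume(conn, "database.count", 1) for _ in range(3)]
used = conn.execute("SELECT used FROM usage").fetchone()[0]
```

With a limit of 2, the first two calls succeed and the third is rejected by the database, so the caller never has to read and compare the count itself.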
>>> As for your proposed interface and calling structure below, I think
>>> a much simpler proposal would work better. I'll work on a
>>> cross-project spec that describes this simpler proposal, but the
>>> basics would be:
>>> 1) Have Keystone store quota information for defaults (per service
>>> endpoint), for tenants and for users.
>>> Keystone would have the set of canonical resource class names, and
>>> each project, upon handling a new resource class, would be
>>> responsible for a change submitted to Keystone to add the new
>>> resource class code.
>>> Straw man REST API:
>>> GET /quotas/resource-classes
>>> 200 OK
>>> {
>>>   "resource_classes": {
>>>     "compute.vcpu": {
>>>       "service": "compute",
>>>       "code": "compute.vcpu",
>>>       "description": "A virtual CPU unit"
>>>     },
>>>     "compute.ram_mb": {
>>>       "service": "compute",
>>>       "code": "compute.ram_mb",
>>>       "description": "Memory in megabytes"
>>>     },
>>>     ...
>>>     "volume.disk_gb": {
>>>       "service": "volume",
>>>       "code": "volume.disk_gb",
>>>       "description": "Amount of disk space in gigabytes"
>>>     },
>>>     ...
>>>     "database.count": {
>>>       "service": "database",
>>>       "code": "database.count",
>>>       "description": "Number of database instances"
>>>     }
>>>   }
>>> }
>>> [amrith] Well, a user is allowed to have a certain compute quota
>>> (which is shared by Nova and Trove) but also a Trove quota. How
>>> would your representation represent that?
>>> # Get the default limits for new users...
>>> GET /quotas/defaults
>>> 200 OK
>>> {
>>>   "quotas": {
>>>     "compute.vcpu": 100,
>>>     "compute.ram_mb": 32768,
>>>     "volume.disk_gb": 1000,
>>>     "database.count": 25
>>>   }
>>> }
>>> # Get a specific user's limits...
>>> GET /quotas/users/{UUID}
>>> 200 OK
>>> {
>>>   "quotas": {
>>>     "compute.vcpu": 100,
>>>     "compute.ram_mb": 32768,
>>>     "volume.disk_gb": 1000,
>>>     "database.count": 25
>>>   }
>>> }
>>> # Get a tenant's limits...
>>> GET /quotas/tenants/{UUID}
>>> 200 OK
>>> {
>>>   "quotas": {
>>>     "compute.vcpu": 1000,
>>>     "compute.ram_mb": 327680,
>>>     "volume.disk_gb": 10000,
>>>     "database.count": 250
>>>   }
>>> }
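As a rough illustration of how a client might combine these responses, the sketch below merges a user's payload over the defaults. The precedence rule (a per-user value wins, anything missing falls back to the default) and the effective_limits() helper are assumptions made for the example; the straw man above does not spell either out.

```python
# Response bodies shaped like the straw-man payloads above.
DEFAULTS = {
    "quotas": {
        "compute.vcpu": 100,
        "compute.ram_mb": 32768,
        "volume.disk_gb": 1000,
        "database.count": 25,
    }
}


def effective_limits(defaults, user_response):
    """Overlay a user's limits on the defaults (assumed precedence)."""
    limits = dict(defaults["quotas"])
    limits.update(user_response.get("quotas", {}))
    return limits


# A hypothetical user whose only override is the database count.
amrith = effective_limits(DEFAULTS, {"quotas": {"database.count": 2}})
```

Here amrith["database.count"] is 2, while the other resource classes keep their default values.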
>>> 2) Have Delimiter communicate with the above proposed new Keystone
>>> REST API and package up data into an oslo.versioned_objects
>>> interface.
>>> Clearly all of the above can be heavily cached both on the server
>>> and client side since they rarely change but are read often.
>>> [amrith] Caching on the client won't save you from oversubscription
>>> if you don't run serializable.
>>> The Delimiter library could be used to provide a calling interface
>>> for service projects to get a user's limits for a set of resource
>>> classes:
>>> (please excuse wrongness, typos, and other stuff below, it's just a
>>> straw-man not production working code...)
>>> # file: delimiter/objects/limits.py
>>> import oslo.versioned_objects.base as ovo
>>> import oslo.versioned_objects.fields as ovo_fields
>>> class ResourceLimit(ovo.VersionedObjectBase):
>>>     # 1.0: Initial version
>>>     VERSION = '1.0'
>>>     fields = {
>>>         'resource_class': ovo_fields.StringField(),
>>>         'amount': ovo_fields.IntegerField(),
>>>     }
>>> class ResourceLimitList(ovo.VersionedObjectBase):
>>>     # 1.0: Initial version
>>>     VERSION = '1.0'
>>>     fields = {
>>>         'resources': ovo_fields.ListOfObjectsField(ResourceLimit),
>>>     }
>>>     @cache_this_heavily
>>>     @remotable_classmethod
>>>     def get_all_by_user(cls, user_uuid):
>>>         """Returns a Limits object that tells the caller what a
>>>         user's absolute limits are for the set of resource classes
>>>         in the system.
>>>         """
>>>         # Grab a keystone client session object and connect to
>>>         # Keystone
>>>         ks = ksclient.Session(...)
>>>         raw_limits = ks.get_limits_by_user(user_uuid)
>>>         return cls(resources=[ResourceLimit(**d) for d in raw_limits])
>>> 3) Each service project would be responsible for handling the
>>> consumption of a set of requested resource amounts in an atomic and
>>> consistent way.
>>> [amrith] This is where the rubber meets the road. What is that
>>> atomic and consistent way? And what computing infrastructure do you
>>> need to deliver this?
>>> The Delimiter library would return the limits that the service would
>>> pre-check before claiming the resources and either post-check after
>>> claim or utilize a compare-and-update technique with a
>>> generation/timestamp during claiming to prevent race conditions.
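For readers unfamiliar with the compare-and-update idea, here is a small sketch of it using Python's standard sqlite3 module. The user_usage table, its generation column, and the read_usage() and claim() helpers are all hypothetical names chosen for the example; this is not how Nova or Delimiter implement it, only one way the technique can look.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_usage ("
    " user_id TEXT PRIMARY KEY,"
    " used INTEGER NOT NULL,"
    " generation INTEGER NOT NULL)"
)
conn.execute("INSERT INTO user_usage VALUES ('john', 0, 0)")
conn.commit()


def read_usage(conn, user_id):
    """Return the (used, generation) pair currently stored for the user."""
    return conn.execute(
        "SELECT used, generation FROM user_usage WHERE user_id = ?",
        (user_id,),
    ).fetchone()


def claim(conn, user_id, amount, limit, seen_used, seen_generation):
    """Claim only if the row has not changed since it was read."""
    if seen_used + amount > limit:
        return False  # pre-check against the limit failed
    with conn:
        cur = conn.execute(
            "UPDATE user_usage"
            " SET used = ?, generation = generation + 1"
            " WHERE user_id = ? AND generation = ?",
            (seen_used + amount, user_id, seen_generation),
        )
    # Zero rows updated means someone else bumped the generation first.
    return cur.rowcount == 1


# Two callers read the same snapshot, then both try to claim 3 of 4.
first = read_usage(conn, "john")
second = read_usage(conn, "john")
ok_first = claim(conn, "john", 3, 4, *first)
ok_second = claim(conn, "john", 3, 4, *second)  # stale generation
final_used = read_usage(conn, "john")[0]
```

The second caller's update matches no row because the generation moved on, so it would have to re-read and retry (and would then fail the pre-check) instead of silently pushing usage to 6.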
>>> For instance, in Nova with the new resource providers database
>>> schema and doing claims in the scheduler (a proposed change), we
>>> might do something to the effect of:
>>> from delimiter import objects as delim_obj
>>> from delimiter import exceptions as delim_exc
>>> from nova import objects as nova_obj
>>> request = nova_obj.RequestSpec.get_by_uuid(request_uuid)
>>> requested = request.resources
>>> limits = delim_obj.ResourceLimitList.get_all_by_user(user_uuid)
>>> allocations = nova_obj.AllocationList.get_all_by_user(user_uuid)
>>> # Pre-check for violations
>>> for resource_class, requested_amount in requested.items():
>>>     limit_idx = limits.resources.index(resource_class)
>>>     resource_limit = limits.resources[limit_idx].amount
>>>     alloc_idx = allocations.resources.index(resource_class)
>>>     resource_used = allocations.resources[alloc_idx]
>>>     if (resource_used + requested_amount) > resource_limit:
>>>         raise delim_exc.QuotaExceeded
>>> [amrith] Is the above code run with some global mutex to prevent two
>>> people from believing that they are good on quota at the same time?
>>> # Do claims in scheduler in an atomic, consistent fashion...
>>> claims = scheduler_client.claim_resources(request)
>>> [amrith] Yes, each 'atomic' claim on a repeatable-read database
>>> could result in oversubscription.
>>> # Post-check for violations
>>> allocations = nova_obj.AllocationList.get_all_by_user(user_uuid)
>>> # allocations now include the claimed resources from the scheduler
>>> for resource_class, requested_amount in requested.items():
>>>     limit_idx = limits.resources.index(resource_class)
>>>     resource_limit = limits.resources[limit_idx].amount
>>>     alloc_idx = allocations.resources.index(resource_class)
>>>     resource_used = allocations.resources[alloc_idx]
>>>     if resource_used > resource_limit:
>>>         # Delete the allocation records for the resources just claimed
>>>         delete_resources(claims)
>>>         raise delim_exc.QuotaExceeded
>>> [amrith] Again, two people could drive through this code and both of
>>> them could fail :(
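The "both could fail" case can be shown with a deterministic toy model of one particular interleaving. Everything here (the in-memory allocations list standing in for the database, and the claim(), post_check() and undo() helpers) is invented for illustration and is not taken from Nova.

```python
LIMIT = 4
allocations = []  # one entry per claim; stands in for allocation rows


def claim(amount):
    """Record a claim and return a handle (here simply its list index)."""
    allocations.append(amount)
    return len(allocations) - 1


def current_usage():
    return sum(a for a in allocations if a is not None)


def post_check():
    """True if total usage, including our own claim, is within the limit."""
    return current_usage() <= LIMIT


def undo(handle):
    allocations[handle] = None


# Interleaving: both requests claim 3 before either one post-checks.
claim_a = claim(3)
claim_b = claim(3)
a_ok = post_check()  # sees 6 > 4
b_ok = post_check()  # also sees 6 > 4
if not a_ok:
    undo(claim_a)
if not b_ok:
    undo(claim_b)
used_after = current_usage()
```

Both requests are rejected and usage ends at 0, even though either one alone would have fit under the limit of 4. The limit is never exceeded for long, but the outcome is needlessly pessimistic.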
>>> 4) The only other thing that would need to be done for a first go of
>>> the Delimiter library is some event listener that can listen for
>>> changes to the quota limits for a user/tenant/default in Keystone.
>>> We'd want the services to be able to notify someone if a reduction
>>> in quota results in an overquota situation.
>>> Anyway, that's my idea. Keep the Delimiter library small and focused
>>> on describing the limits only, not on the resource allocations. Have
>>> the Delimiter library present a versioned object interface so the
>>> interaction between the data exposed by the Keystone REST API for
>>> quotas can evolve naturally and smoothly over time.
>>> Best,
>>> -jay
>>> Let me illustrate the way I see these things fitting together. A
>>> hypothetical Trove system may be set up as follows:
>>> - No more than 2000 database instances in total, 300 clusters in
>>>   total
>>> - Users may not launch more than 25 database instances, or 4
>>>   clusters
>>> - The particular user 'amrith' is limited to 2 databases and 1
>>>   cluster
>>> - No user may consume more than 20TB of storage at a time
>>> - No user may consume more than 10GB of memory at a time
>>> At startup, I believe that the system would make the following
>>> sequence of calls:
>>> - define_resource(databaseInstance, 2000, 0, 25, 0, ...)
>>> - update_resource_limits(databaseInstance, amrith, 2, 0, ...)
>>> - define_resource(databaseCluster, 300, 0, 4, 0, ...)
>>> - update_resource_limits(databaseCluster, amrith, 1, 0, ...)
>>> - define_resource(storage, -1, 0, 20TB, 0, ...)
>>> - define_resource(memory, -1, 0, 10GB, 0, ...)
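The startup sequence above could be modelled, very roughly, with an in-memory registry like the one below. The function names follow the proposal, but the storage (a plain dict), the ResourceDef class, the user_limit() helper, the choice of GB as the unit, and the reading of -1 as "no global cap" are assumptions made for this sketch.

```python
from dataclasses import dataclass, field

UNLIMITED = -1  # assumption: -1 means there is no global maximum


@dataclass
class ResourceDef:
    max_total: int
    min_total: int
    user_max: int
    user_min: int
    # Per-user overrides: user name -> (user_max, user_min)
    overrides: dict = field(default_factory=dict)


registry = {}


def define_resource(name, max_total, min_total, user_max, user_min):
    registry[name] = ResourceDef(max_total, min_total, user_max, user_min)


def update_resource_limits(name, user, user_max, user_min):
    registry[name].overrides[user] = (user_max, user_min)


def user_limit(name, user):
    """Return the maximum for this user, falling back to the default."""
    definition = registry[name]
    default = (definition.user_max, definition.user_min)
    return definition.overrides.get(user, default)[0]


define_resource("databaseInstance", 2000, 0, 25, 0)
update_resource_limits("databaseInstance", "amrith", 2, 0)
define_resource("databaseCluster", 300, 0, 4, 0)
update_resource_limits("databaseCluster", "amrith", 1, 0)
define_resource("storage", UNLIMITED, 0, 20 * 1024, 0)  # sizes in GB
define_resource("memory", UNLIMITED, 0, 10, 0)          # sizes in GB
```

With this in place, user_limit("databaseCluster", "amrith") gives 1, while any other user such as john gets the default of 4.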
>>> Assume that the user john comes along and asks for a cluster with 4
>>> nodes, 1TB storage per node and each node having 1GB of memory; the
>>> system would go through the following sequence:
>>> - reserve_resource(databaseCluster, john, 1, None)
>>>   o this returns a resourceID (say cluster-resource-id)
>>>   o the cluster instance that it reserves counts against the limit
>>>     of 300 cluster instances in total, as well as the 4 clusters
>>>     that john can provision. If 'amrith' had requested it, that
>>>     would have been counted against the limit of 1 cluster for the
>>>     user.
>>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
>>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
>>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
>>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
>>>   o this returns four resource id's, let's say instance-1-id,
>>>     instance-2-id, instance-3-id, instance-4-id
>>>   o note that each instance is that, an instance by itself. it is
>>>     therefore not right to consider this as equivalent to a call to
>>>     reserve_resource() with a size of 4, especially because each
>>>     instance could later be tracked as an individual Nova instance.
>>> - reserve_resource(storage, john, 1TB, instance-1-id)
>>> - reserve_resource(storage, john, 1TB, instance-2-id)
>>> - reserve_resource(storage, john, 1TB, instance-3-id)
>>> - reserve_resource(storage, john, 1TB, instance-4-id)
>>>   o each of them returns some resourceID, let's say they returned
>>>     cinder-1-id, cinder-2-id, cinder-3-id, cinder-4-id
>>>   o since the storage of 1TB is a unit, it is treated as such. In
>>>     other words, you don't need to invoke reserve_resource 10^12
>>>     times, once per byte allocated :)
>>> - reserve_resource(memory, john, 1GB, instance-1-id)
>>> - reserve_resource(memory, john, 1GB, instance-2-id)
>>> - reserve_resource(memory, john, 1GB, instance-3-id)
>>> - reserve_resource(memory, john, 1GB, instance-4-id)
>>>   o each of these returns something, say arbitrary made-up strings,
>>>     just to highlight that we really don't track these anywhere so
>>>     we don't care about them.
>>> If all this works, then the system knows that John's request does
>>> not violate any quotas that it can enforce; it can then go ahead and
>>> launch the instances (calling Nova), provision storage, and so on.
>>> The system then goes and creates four Cinder volumes; these are
>>> cinder-1-uuid, cinder-2-uuid, cinder-3-uuid, cinder-4-uuid.
>>> It can then go and confirm those reservations.
>>> - provision_resource(cinder-1-id, cinder-1-uuid)
>>> - provision_resource(cinder-2-id, cinder-2-uuid)
>>> - provision_resource(cinder-3-id, cinder-3-uuid)
>>> - provision_resource(cinder-4-id, cinder-4-uuid)
>>> It could then go and launch 4 nova instances and similarly provision
>>> those resources, and so on.
>>> This process could take some minutes and holding a database
>>> transaction open for this is the issue that Jay brings up in [4]. We
>>> don't have to in this proposed scheme.
>>> Since the resources are all hierarchically linked through the
>>> overall cluster id, when the cluster is set up, it can finally go
>>> and provision that:
>>> - provision_resource(cluster-resource-id, cluster-uuid)
>>> When Trove is done with some individual resource, it can go and
>>> release it. Note that I'm thinking this will invoke release_resource
>>> with the ID of the underlying object OR the resource.
>>> - release_resource(cinder-4-id), and
>>> - release_resource(cinder-4-uuid)
>>> are therefore identical and indicate that the 4th 1TB volume is now
>>> released. How this will be implemented in Python, kwargs or some
>>> other mechanism is, I believe, an implementation detail.
>>> Finally, it releases the cluster resource by doing this:
>>> - release_resource(cluster-resource-id)
>>> This would release the cluster and all dependent resources in a
>>> single operation.
>>> A user may wish to manage a resource that was provisioned from the
>>> service. If this results in a resizing of the instances, then it is
>>> a matter of updating that resource.
>>> If the third 1TB volume is being resized to 2TB, then it is merely a
>>> matter of invoking:
>>> - update_resource(cinder-3-uuid, 2TB)
>>> Delimiter can go figure out that cinder-3-uuid is a 1TB device and
>>> therefore this is an increase of 1TB and verify that this is within
>>> the quotas allowed for the user.
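The delta calculation described here is simple enough to sketch. The dictionaries below stand in for whatever store Delimiter would use, and the update_resource() signature, the QuotaExceeded class and the choice of TB as the unit are assumptions for the example only.

```python
class QuotaExceeded(Exception):
    """Raised when a resize would push a user past their limit."""


def update_resource(sizes, used, limits, user, resource_id, new_size):
    """Apply a resize if the change in consumption fits the user's quota."""
    delta = new_size - sizes[resource_id]
    if used[user] + delta > limits[user]:
        raise QuotaExceeded(
            "%s would use %d of %d" % (user, used[user] + delta, limits[user])
        )
    sizes[resource_id] = new_size
    used[user] += delta
    return delta


# john holds four 1TB volumes against a 20TB storage limit (sizes in TB).
sizes = {"cinder-%d-uuid" % n: 1 for n in range(1, 5)}
used = {"john": 4}
limits = {"john": 20}

delta = update_resource(sizes, used, limits, "john", "cinder-3-uuid", 2)
```

The resize from 1TB to 2TB is charged as a delta of 1TB, leaving john at 5TB used. A shrink works the same way with a negative delta.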
>>> The thing that I find attractive about this model of maintaining a
>>> hierarchy of reservations is that in the event of an error, the
>>> service need merely call release_resource() on the highest level
>>> reservation and the Delimiter project can walk down the chain and
>>> release all the resources or reservations as appropriate.
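A toy version of "walk down the chain" might look like the following. It keeps everything in memory; the reservations and usage dictionaries, the generated IDs, and the simplified argument lists are all assumptions for the sketch and leave out limits, expiry and persistence entirely.

```python
import itertools

_next_id = itertools.count(1)
reservations = {}  # reservation id -> record
usage = {}         # (user, resource name) -> amount currently counted


def reserve_resource(name, user, size, parent=None):
    """Record a reservation, optionally as a child of another one."""
    rid = "res-%d" % next(_next_id)
    reservations[rid] = {
        "name": name, "user": user, "size": size, "children": [],
    }
    if parent is not None:
        reservations[parent]["children"].append(rid)
    usage[(user, name)] = usage.get((user, name), 0) + size
    return rid


def release_resource(rid):
    """Release a reservation and, recursively, everything under it."""
    record = reservations.pop(rid, None)
    if record is None:
        return  # already released individually
    for child in record["children"]:
        release_resource(child)
    usage[(record["user"], record["name"])] -= record["size"]


cluster = reserve_resource("databaseCluster", "john", 1)
for _ in range(4):
    instance = reserve_resource("databaseInstance", "john", 1, cluster)
    reserve_resource("storage", "john", 1, instance)  # 1TB per node

usage_before = dict(usage)
release_resource(cluster)  # one call on the top-level reservation
```

After the single release_resource(cluster) call, every counter for john is back to zero and no reservation records remain, which is the cleanup property described above.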
>>> Under the covers I believe that each of these operations should be
>>> atomic and may update multiple database tables but these will all be
>>> short lived operations. For example, reserving an instance resource
>>> would increment the number of instances for the user as well as the
>>> number of instances on the whole, and this would be an atomic
>>> operation.
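One way such a short, atomic, multi-table update could look is sketched below with Python's standard sqlite3 module. The schema (total_usage and user_usage), the OverLimit exception and the reserve() helper are hypothetical. The limit test lives in the WHERE clause of each UPDATE, so the check and the increment happen in a single statement rather than as a separate read followed by a write.

```python
import sqlite3


class OverLimit(Exception):
    """Raised inside the transaction to force a rollback."""


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE total_usage (
    resource TEXT PRIMARY KEY,
    used INTEGER NOT NULL,
    max_total INTEGER NOT NULL);
CREATE TABLE user_usage (
    resource TEXT NOT NULL,
    user_id TEXT NOT NULL,
    used INTEGER NOT NULL,
    user_max INTEGER NOT NULL,
    PRIMARY KEY (resource, user_id));
INSERT INTO total_usage VALUES ('databaseInstance', 0, 2000);
INSERT INTO user_usage VALUES ('databaseInstance', 'amrith', 0, 2);
""")


def reserve(conn, resource, user_id, size):
    """Bump both counters in one transaction, or neither of them."""
    statements = (
        ("UPDATE total_usage SET used = used + ?"
         " WHERE resource = ? AND used + ? <= max_total",
         (size, resource, size)),
        ("UPDATE user_usage SET used = used + ?"
         " WHERE resource = ? AND user_id = ? AND used + ? <= user_max",
         (size, resource, user_id, size)),
    )
    try:
        with conn:  # rolls back both updates if OverLimit is raised
            for sql, params in statements:
                if conn.execute(sql, params).rowcount != 1:
                    raise OverLimit(resource)
        return True
    except OverLimit:
        return False


results = [reserve(conn, "databaseInstance", "amrith", 1) for _ in range(3)]
total_used = conn.execute("SELECT used FROM total_usage").fetchone()[0]
```

The third attempt passes the global check but fails the per-user one, and the rollback undoes the global increment, so the total stays at 2.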
>>> I have two primary areas of concern about the proposal [3].
>>> The first is that it makes the implicit assumption that the "flat
>>> mode" is implemented. That provides value to a consumer but I think
>>> it leaves a lot for the consumer to do. For example, I find it hard
>>> to see how the model proposed would handle the release of quotas,
>>> let alone the case of a nested release of a hierarchy of resources.
>>> The other is the notion that the implementation will begin a
>>> transaction, perform a query(), make some manipulations, and then do
>>> a save(). This makes for an interesting transaction management
>>> challenge as it would require the underlying database to run in an
>>> isolation mode of at least repeatable reads and maybe even
>>> serializable, which would be a performance bear on a heavily loaded
>>> system. If run in the traditional read-committed mode, this would
>>> silently lead to oversubscriptions, and the violation of quota
>>> limits.
>>> I believe that it should be a requirement that the Delimiter library
>>> should be able to run against a database that supports, and is
>>> configured for READ-COMMITTED, and should not require anything
>>> higher. The model proposed above can certainly be implemented with a
>>> database running READ-COMMITTED, and I believe that this is also
>>> true with the caveat that the operations will be performed through
>>> SQLAlchemy.
>>> Thanks,
>>> -amrith
>>> [1] http://openstack.markmail.org/thread/tkl2jcyvzgifniux
>>> [2] http://openstack.markmail.org/thread/3cr7hoeqjmgyle2j
>>> [3] https://review.openstack.org/#/c/284454/
>>> [4] http://markmail.org/message/7ixvezcsj3uyiro6
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev