[openstack-dev] More on the topic of DELIMITER, the Quota Management Library proposal
Amrith Kumar
amrith at tesora.com
Mon Apr 25 16:49:55 UTC 2016
On Sat, 2016-04-23 at 21:41 +0000, Amrith Kumar wrote:
> Ok to beer and high bandwidth. FYI Jay the distributed high perf db we
> did a couple of years ago is now open source. Just saying. Mysql plug
> compatible ....
> -amrith
>
>
> --
> Amrith Kumar
> amrith at tesora.com
>
>
> -------- Original message --------
> From: Jay Pipes <jaypipes at gmail.com>
> Date: 04/23/2016 4:10 PM (GMT-05:00)
> To: Amrith Kumar <amrith at tesora.com>,
> openstack-dev at lists.openstack.org
> Cc: vilobhmm at yahoo-inc.com, nik.komawar at gmail.com, Ed Leafe
> <ed at leafe.com>
> Subject: Re: [openstack-dev] More on the topic of DELIMITER, the Quota
> Management Library proposal
>
>
> Looking forward to arriving in Austin so that I can buy you a beer,
> Amrith, and have a high-bandwidth conversation about how you're
> wrong. :P
Jay and I chatted and it took a long time to come to an agreement
because we weren't able to find any beer.
Here's what I think we've agreed about. The library will store data in
two tables:
1. a detail table, which stores the individual claims and the resource
class;
2. a generation table, which stores the resource class and a generation
counter.
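As a concrete illustration (the table, column, and type names here are
just placeholders to make the pseudo-code below easier to follow, not a
proposed schema), the two tables might look something like this:

    import sqlite3

    # autocommit mode; the claim flow below manages transactions explicitly
    conn = sqlite3.connect("delimiter.db", isolation_level=None)
    conn.executescript("""
        -- one row per individual claim against a resource class
        CREATE TABLE IF NOT EXISTS detail (
            id       INTEGER PRIMARY KEY,
            resource TEXT NOT NULL,     -- e.g. 'memory', 'cpu'
            claims   INTEGER NOT NULL   -- amount claimed
        );

        -- one row per resource class; bumped on every successful claim
        CREATE TABLE IF NOT EXISTS generation (
            resource   TEXT PRIMARY KEY,
            generation INTEGER NOT NULL DEFAULT 0
        );
    """)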
When a claim is received, the requestor performs the following
operations.
begin

    select sum(detail.claims) as total_claims,
           generation.resource as resource,
           generation.generation as last_generation
      from detail, generation
     where detail.resource = generation.resource
       and generation.resource = <chosen resource; memory, cpu, ...>
     group by generation.generation, generation.resource

    if total_claims + this_claim < limit
        insert into detail values (this_claim, resource)

        -- bump the generation only if nobody else claimed since we read it
        update generation
           set generation = generation + 1
         where resource = <chosen resource>
           and generation = last_generation

        if @@rowcount = 1
            -- all good
            commit
        else
            rollback
            -- try again
    else
        rollback
        -- quota exceeded
Some bootstrapping will be required for the situation where there are no
detail records (or no generation row) yet for a given resource, but I
think we can figure that out easily. The easiest way I can think of
doing that is to lose the join and run the two queries separately (one
against the detail table and one against the generation table) within
the same transaction.
The update of the generation table acts as the locking mechanism that
prevents multiple requestors from making concurrent claims.
So long as people don't read or change these tables outside of the
methods that the library provides, we can guarantee that this is all
safe and will not oversubscribe.
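To make that concrete, here is a minimal sketch of how a consuming
service might drive this flow, using the illustrative sqlite3 schema and
connection from the sketch above purely for demonstration (a real
implementation would presumably sit on oslo.db/SQLAlchemy and key the
tables by tenant and user as well); it also covers the bootstrap case
where no generation row exists yet:

    class QuotaExceeded(Exception):
        pass

    def try_claim(conn, resource, this_claim, limit):
        """One attempt at a claim; returns False if a concurrent writer won."""
        cur = conn.cursor()
        cur.execute("BEGIN")   # explicit transaction; connection is autocommit
        try:
            # Bootstrap: make sure a generation row exists for this resource.
            cur.execute("INSERT OR IGNORE INTO generation (resource, generation) "
                        "VALUES (?, 0)", (resource,))
            # The two reads, done without a join, inside the same transaction.
            cur.execute("SELECT COALESCE(SUM(claims), 0) FROM detail "
                        "WHERE resource = ?", (resource,))
            total_claims = cur.fetchone()[0]
            cur.execute("SELECT generation FROM generation WHERE resource = ?",
                        (resource,))
            last_generation = cur.fetchone()[0]

            if not (total_claims + this_claim < limit):
                raise QuotaExceeded(resource)

            # Record the claim and bump the generation. The WHERE clause on the
            # generation we read above is the compare-and-update that detects
            # any concurrent claimant.
            cur.execute("INSERT INTO detail (resource, claims) VALUES (?, ?)",
                        (resource, this_claim))
            cur.execute("UPDATE generation SET generation = generation + 1 "
                        "WHERE resource = ? AND generation = ?",
                        (resource, last_generation))
            if cur.rowcount == 1:
                conn.commit()      # all good
                return True
            conn.rollback()        # somebody else claimed concurrently
            return False
        except Exception:
            conn.rollback()
            raise

    def claim(conn, resource, this_claim, limit, max_retries=10):
        for _ in range(max_retries):
            if try_claim(conn, resource, this_claim, limit):
                return
        raise RuntimeError("gave up claiming %s after %d retries"
                           % (resource, max_retries))

A caller would then just do something like claim(conn, 'memory', 512,
limit=4096) and either succeed, get QuotaExceeded, or give up after the
retries if contention is very high.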
-amrith
> Comments inline.
>
> On 04/23/2016 11:25 AM, Amrith Kumar wrote:
> > On Sat, 2016-04-23 at 10:26 -0400, Andrew Laski wrote:
> >> On Fri, Apr 22, 2016, at 09:57 PM, Tim Bell wrote:
> >>
> >>> I have reservations on f and g.
> >>>
> >>>
> >>> On f., We have had a number of discussions in the past about
> >>> centralising quota (e.g. Boson) and the project teams of the other
> >>> components wanted to keep the quota contents ‘close’. This can
> >>> always be reviewed further with them but I would hope for at least a
> >>> standard schema structure of tables in each project for the handling
> >>> of quota.
> >>>
> >>>
> >>> On g., aren’t all projects now nested projects ? If we have the
> >>> complexity of handling nested projects sorted out in the common
> >>> library, is there a reason why a project would not want to support
> >>> nested projects ?
> >>>
> >>>
> >>> One other issue is how to do reconciliation, each project needs to
> >>> have a mechanism to re-calculate the current allocations and
> >>> reconcile that with the quota usage. While in an ideal world, this
> >>> should not be necessary, it would be for the foreseeable future,
> >>> especially with a new implementation.
> >>>
> >>
> >> One of the big reasons that Jay and I have been pushing to remove
> >> reservations and tracking of quota in a separate place than the
> >> resources are actually used, e.g., an instance record in the Nova db,
> >> is so that reconciliation is not necessary. For example, if RAM quota
> >> usage is simply tracked as sum(instances.memory_mb) then you can be
> >> sure that usage is always up to date.
> >
> > Uh oh, there be gremlins here ...
> >
> > I am positive that this will NOT work, see earlier conversations about
> > isolation levels, and Jay's alternate solution.
> >
> > The way (I understand the issue, and Jay's solution) you get around the
> > isolation levels trap is to NOT do your quota determinations based on a
> > SUM(column) but rather based on the rowcount on a well crafted UPDATE of
> > a single table that stored total quota.
>
> No, we would do our quota calculations by doing a SUM(used) against the
> allocations table. There is no separate table that stored the total
> quota (or quota usage records). That's the source of the problem with
> the existing quota handling code in Nova. The generation field value is
> used to provide the consistent view of the actual resource usage records
> so that the INSERT operations for all claimed resources can be done in a
> transactional manner and will be rolled back if any other writer changes
> the amount of consumed resources on a provider (which of course would
> affect the quota check calculations).
>
> > You could also store a detail
> > claim record for each claim in an independent table that is maintained
> > in the same database transaction if you so desire, that is optional.
>
> The allocations table is the "detail claim record" table that you refer
> to above.
>
> > My view of how this would work (which I described earlier as building
> > on Jay's solution) is that the claim flow would look like this:
> >
> > select total_used, generation
> > from quota_claimed
> > where tenant = <tenant> and resource = 'memory'
>
> There is no need to keep a total_used value for anything. That is
> denormalized calculated data that merely adds a point of race
> contention. The quota check is against the *detail* table (allocations),
> which stores the *actual resource usage records*.
>
> > begin transaction
> >
> > update quota_claimed
> > set total_used = total_used + claim, generation = generation + 1
> > where tenant = <tenant> and resource = 'memory'
> > and generation = generation
> > and total_used + claim < limit
>
> This part of the transaction must always occur **after** the insertion
> of the actual resource records, not before.
>
> > if @@rowcount = 1
> > -- optional claim_detail table
> > insert into claim_detail values ( <tenant>, 'memory', claim, ...)
> > commit
> > else
> > rollback
>
> So, in pseudo-Python-SQLish code, my solution works like this:
>
> limits = get_limits_from_delimiter()
> requested = get_requested_from_request_spec()
>
> while True:
>
> used := SELECT
> resource_class,
> resource_provider,
> generation,
> SUM(used) as total_used
> FROM allocations
> JOIN resource_providers ON (...)
> WHERE consumer_uuid = $USER_UUID
> GROUP BY
> resource_class,
> resource_provider,
> generation;
>
> # Check that our requested resource amounts don't exceed quotas
> if not check_requested_within_limits(requested, used, limits):
> raise QuotaExceeded
>
> # Claim all requested resources. Note that the generation retrieved
> # from the above query is our consistent view marker. If the UPDATE
> # below succeeds and returns != 0 rows affected, that means there
> # was no other writer that changed our resource usage in between
> # this thread's claiming of resources, and therefore we prevent
> # any oversubscription of resources.
> begin_transaction:
>
> provider := SELECT id, generation, ... FROM resource_providers
> JOIN (...)
> WHERE (<resource_usage_filters>)
>
> for resource in requested:
> INSERT INTO allocations (
> resource_provider_id,
> resource_class_id,
> consumer_uuid,
> used
> ) VALUES (
> $provider.id,
> $resource.id,
> $USER_UUID,
> $resource.amount
> );
>
> rows_affected := UPDATE resource_providers
> SET generation = generation + 1
> WHERE id = $provider.id
> AND generation = $used[$provider.id].generation;
>
> if $rows_affected == 0:
> ROLLBACK;
>
> The only reason we would need a post-claim quota check is if some of the
> requested resources are owned and tracked by an external-to-Nova system.
>
> BTW, note to Ed Leafe... unless your distributed data store supports
> transactional semantics, you can't use a distributed data store for
> these types of solutions. Instead, you will need to write a whole bunch
> of code that does post-auditing of claims and quotas and a system that
> accepts that oversubscription and out-of-sync quota limits and usages is
> a fact of life. Not to mention needing to implement JOINs in Python.
>
> > But, it is my understanding that
> >
> > (a) if you wish to do the SUM(column) approach that you propose,
> > you must have a reservation that is committed and then you must
> > re-read the SUM(column) to make sure you did not over-subscribe;
> > and
>
> Erm, kind of? Oversubscription is not possible in the solution I
> describe because the compare-and-update on the
> resource_providers.generation field allows for a consistent view of the
> resources used -- and if that view changes during the insertion of
> resource usage records -- the transaction containing those insertions is
> rolled back.
>
> > (b) to get away from reservations you must stop using the
> > SUM(column) approach and instead use a single quota_claimed
> > table to determine the current quota claimed.
>
> No. This has nothing to do with reservations.
>
> > At least that's what I understand of Jay's example from earlier in
> > this thread.
> >
> > Let's definitely discuss this in Austin. While I don't love Jay's
> > solution for other reasons to do with making the quota table a hotspot
> > and things like that, it is a perfectly workable solution, I think.
>
> There is no quota table in my solution.
>
> If you refer to the resource_providers table (the table that has the
> generation field), then yes, it's a hot spot. But hot spots in the DB
> aren't necessarily a bad thing if you design the underlying schema
> properly.
>
> More in Austin.
>
> Best,
> -jay
>
> >>
> >>
> >>>
> >>> Tim
> >>>
> >>>
> >>>
> >>>
> >>> From: Amrith Kumar <amrith at tesora.com>
> >>> Reply-To: "OpenStack Development Mailing List (not for usage
> >>> questions)" <openstack-dev at lists.openstack.org>
> >>> Date: Friday 22 April 2016 at 06:51
> >>> To: "OpenStack Development Mailing List (not for usage questions)"
> >>> <openstack-dev at lists.openstack.org>
> >>> Subject: Re: [openstack-dev] More on the topic of DELIMITER, the
> >>> Quota Management Library proposal
> >>>
> >>>
> >>>
> >>> I’ve thought more about Jay’s approach to enforcing quotas
> >>> and I think we can build on and around it. With that
> >>> implementation as the basic quota primitive, I think we can
> >>> build a quota management API that isn’t dependent on
> >>> reservations. It does place some burdens on the consuming
> >>> projects that I had hoped to avoid and these will cause
> >>> heartburn for some (make sure that you always request
> >>> resources in a consistent order and free them in a
> >>> consistent order being the most obvious).
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> If it doesn’t make it harder, I would like to see if we can
> >>> make the quota API take care of the ordering of requests.
> >>> i.e. if the quota API is an extension of Jay’s example and
> >>> accepts some data structure (dict?) with all the claims that
> >>> a project wants to make for some operation, and then
> >>> proceeds to make those claims for the project in the
> >>> consistent order, I think it would be of some value.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> Beyond that, I’m on board with a-g below,
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> -amrith
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> From: Vilobh Meshram
> >>> [mailto:vilobhmeshram.openstack at gmail.com]
> >>> Sent: Friday, April 22, 2016 4:08 AM
> >>> To: OpenStack Development Mailing List (not for usage
> >>> questions) <openstack-dev at lists.openstack.org>
> >>> Subject: Re: [openstack-dev] More on the topic of DELIMITER,
> >>> the Quota Management Library proposal
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> I strongly agree with Jay on the points related to "no
> >>> reservation", keeping the interface simple and the role for
> >>> Delimiter (impose limits on resource consumption and enforce
> >>> quotas).
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> The point to keep user quota, tenant quotas in Keystone
> >>> sounds interesting and would need support from the Keystone
> >>> team. We have a cross-project session planned [1] and will
> >>> definitely bring that up in that session.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> The main thought with which Delimiter was formed was to
> >>> enforce resource quota in a transaction-safe manner and do it
> >>> in a cross-project conducive manner and it still holds
> >>> true. Delimiter's mission is to impose limits on resource
> >>> consumption and enforce quotas in a transaction-safe manner.
> >>> A few key aspects of Delimiter are:
> >>>
> >>> a. Delimiter will be a new Library and not a Service.
> >>> Details covered in spec.
> >>>
> >>> b. Delimiter's role will be to impose limits on resource
> >>> consumption.
> >>>
> >>> c. Delimiter will not be responsible for rate limiting.
> >>>
> >>> d. Delimiter will not maintain data for the resources.
> >>> Respective projects will take care of keeping, maintaining
> >>> data for the resources and resource consumption.
> >>>
> >>> e. Delimiter will not have the concept of "reservations".
> >>> Delimiter will read or update the "actual" resource tables
> >>> and will not rely on the "cached" tables. At present, the
> >>> quota infrastructure in Nova, Cinder and other projects have
> >>> tables such as reservations, quota_usage, etc which are used
> >>> as "cached tables" to track re
> >>>
> >>> f. Delimiter will fetch the information for project quota,
> >>> user quota from a centralized place, say Keystone, or if
> >>> that doesn't materialize will fetch default quota values
> >>> from respective service. This information will be cached
> >>> since it gets updated rarely but read many times.
> >>>
> >>> g. Delimiter will take into consideration whether the
> >>> project is Flat or Nested and will make the calculations
> >>> of allocated, available resources. Nested means project
> >>> namespace is hierarchical and Flat means project namespace
> >>> is not hierarchical.
> >>>
> >>>
> >>> -Vilobh
> >>>
> >>>
> >>> [1]
> https://www.openstack.org/summit/austin-2016/summit-schedule/events/9492
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>> On Thu, Apr 21, 2016 at 11:08 PM, Joshua Harlow
> >>> <harlowja at fastmail.com> wrote:
> >>>
> >>>
> >>> Since people will be on a plane soon,
> >>>
> >>> I threw this together as a example of a quota engine
> >>> (the zookeeper code does even work, and yes it
> >>> provides transactional semantics due to the nice
> >>> abilities of zookeeper znode versions[1] and its
> >>> inherent consistency model, yippe).
> >>>
> >>> https://gist.github.com/harlowja/e7175c2d76e020a82ae94467a1441d85
> >>>
> >>> Someone else can fill in the db quota engine with a
> >>> similar/equivalent api if they so dare, ha. Or even
> >>> feel to say the gist/api above is crap, cause that's
> >>> ok to, lol.
> >>>
> >>> [1]
> >>> https://zookeeper.apache.org/doc/r3.1.2/zookeeperProgrammers.html#Data+Access
> >>>
> >>>
> >>>
> >>> Amrith Kumar wrote:
> >>>
> >>> Inline below ... thread is too long, will catch you in Austin.
> >>>
> >>> -----Original Message-----
> >>> From: Jay Pipes [mailto:jaypipes at gmail.com]
> >>> Sent: Thursday, April 21, 2016 8:08 PM
> >>> To: openstack-dev at lists.openstack.org
> >>> Subject: Re: [openstack-dev] More on the topic of DELIMITER, the
> >>> Quota Management Library proposal
> >>>
> >>> Hmm, where do I start... I think I will just cut to the two primary
> >>> disagreements I have. And I will top-post because this email is way
> >>> too big.
> >>>
> >>> 1) On serializable isolation level.
> >>>
> >>> No, you don't need it at all to prevent races in claiming. Just use a
> >>> compare-and-update with retries strategy. Proof is here:
> >>>
> >>> https://github.com/jaypipes/placement-bench/blob/master/placement.py#L97-L142
> >>>
> >>> Works great and prevents multiple writers from oversubscribing any
> >>> resource without relying on any particular isolation level at all.
> >>>
> >>> The `generation` field in the inventories table is what allows
> >>> multiple writers to ensure a consistent view of the data without
> >>> needing to rely on heavy lock-based semantics and/or RDBMS-specific
> >>> isolation levels.
> >>>
> >>> [amrith] this works for what it is doing, we can definitely do this.
> >>> This will work at any isolation level, yes. I didn't want to go this
> >>> route because it is going to still require an insert into another
> >>> table recording what the actual 'thing' is that is claiming the
> >>> resource and that insert is going to be in a different transaction
> >>> and managing those two transactions was what I wanted to avoid. I was
> >>> hoping to avoid having two tables tracking claims, one showing the
> >>> currently claimed quota and another holding the things that claimed
> >>> that quota. Have to think again whether that is possible.
> >>>
> >>> 2) On reservations.
> >>>
> >>> The reason I don't believe reservations are necessary to be in a
> >>> quota library is because reservations add a concept of a time to a
> >>> claim of some resource. You reserve some resource to be claimed at
> >>> some point in the future and release those resources at a point
> >>> further in time.
> >>>
> >>> Quota checking doesn't look at what the state of some system will be
> >>> at some point in the future. It simply returns whether the system
> >>> *right now* can handle a request *right now* to claim a set of
> >>> resources.
> >>>
> >>> If you want reservation semantics for some resource, that's totally
> >>> cool, but IMHO, a reservation service should live outside of the
> >>> service that is actually responsible for providing resources to a
> >>> consumer. Merging right-now quota checks and future-based
> >>> reservations into the same library just complicates things
> >>> unnecessarily IMHO.
> >>>
> >>> [amrith] extension of the above ...
> >>>
> >>> 3) On resizes.
> >>>
> >>> Look, I recognize some users see some value in resizing their
> >>> resources. That's fine. I personally think expand operations are
> >>> fine, and that shrink operations are really the operations that
> >>> should be prohibited in the API. But, whatever, I'm fine with
> >>> resizing of requested resource amounts. My big point is if you don't
> >>> have a separate table that stores quota_usages and instead only have
> >>> a single table that stores the actual resource usage records, you
> >>> don't have to do *any* quota check operations at all upon deletion of
> >>> a resource. For modifying resource amounts (i.e. a resize) you merely
> >>> need to change the calculation of requested resource amounts to
> >>> account for the already-consumed usage amount.
> >>>
> >>> Bottom line for me: I really won't support any proposal for a complex
> >>> library that takes the resource claim process out of the hands of the
> >>> services that own those resources. The simpler the interface of this
> >>> library, the better.
> >>>
> >>> [amrith] my proposal would not but this email thread has got too
> >>> long. Yes, simpler interface, will catch you in Austin.
> >>>
> >>> Best,
> >>> -jay
> >>>
> >>> On 04/19/2016 09:59 PM, Amrith
> Kumar
> >>> wrote:
> >>>
> >>> -----Original
> >>> Message-----
> >>> From: Jay Pipes
> >>>
> [mailto:jaypipes at gmail.com]
> >>> Sent: Monday,
> April
> >>> 18, 2016 2:54 PM
> >>> To:
> >>>
> openstack-dev at lists.openstack.org
> >>> Subject: Re:
> >>> [openstack-dev]
> More
> >>> on the topic of
> >>> DELIMITER, the
> >>> Quota Management
> >>> Library proposal
> >>>
> >>> On 04/16/2016
> 05:51
> >>> PM, Amrith Kumar
> >>> wrote:
> >>>
> >>> If we
> >>> therefore
> >>> assume
> that
> >>> this will
> be
> >>> a Quota
> >>>
> Management
> >>> Library,
> >>> it is
> safe
> >>> to assume
> >>> that
> quotas
> >>> are going
> to
> >>> be
> managed
> >>> on a
> >>>
> per-project
> >>> basis,
> where
> >>>
> participating projects will use this library.
> >>> I believe
> >>> that it
> >>> stands to
> >>> reason
> that
> >>> any data
> >>>
> persistence
> >>> will
> >>> have to
> be
> >>> in a
> >>> location
> >>> decided
> by
> >>> the
> >>>
> individual
> >>> project.
> >>>
> >>>
> >>> Depends on what
> you
> >>> mean by "any data
> >>> persistence". If
> you
> >>> are
> >>> referring to the
> >>> storage of quota
> >>> values (per user,
> >>> per tenant,
> >>> global, etc) I
> think
> >>> that should be
> done
> >>> by the Keystone
> >>> service.
> >>> This data is
> >>> essentially an
> >>> attribute of the
> >>> user or the
> tenant
> >>> or the
> >>>
> >>>
> >>> service endpoint itself (i.e.
> >>>
> >>>
> >>> global defaults).
> >>> This data also
> >>> rarely changes
> and
> >>> logically belongs
> >>> to the service
> that
> >>> manages users,
> >>> tenants, and
> service
> >>> endpoints:
> >>>
> >>>
> >>> Keystone.
> >>>
> >>>
> >>> If you are
> referring
> >>> to the storage of
> >>> resource usage
> >>> records, yes,
> >>> each service
> project
> >>> should own that
> data
> >>> (and frankly, I
> >>> don't see a
> >>> need to persist
> any
> >>> quota usage data
> at
> >>> all, as I
> mentioned
> >>> in a
> >>> previous reply to
> >>> Attila).
> >>>
> >>>
> >>> [amrith] You make a
> >>> distinction that I had
> made
> >>> implicitly, and it is
> >>> important to highlight
> it.
> >>> Thanks for pointing it
> out.
> >>> Yes, I meant
> >>> both of the above, and as
> >>> stipulated. Global
> defaults
> >>> in keystone
> >>> (somehow, TBD) and usage
> >>> records, on a per-service
> >>> basis.
> >>>
> >>> That may
> not
> >>> be a very
> >>>
> interesting
> >>> statement
> >>> but the
> >>> corollary
> >>> is, I
> >>> think, a
> >>> very
> >>>
> significant
> >>>
> statement;
> >>> it cannot
> be
> >>> assumed
> that
> >>> the
> >>> quota
> >>>
> management
> >>>
> information
> >>> for all
> >>>
> participating projects is in
> >>> the same
> >>> database.
> >>>
> >>>
> >>> It cannot be
> assumed
> >>> that this
> >>> information is
> even
> >>> in a database at
> >>>
> >>>
> >>>
> >>> all...
> >>>
> >>>
> >>> [amrith] I don't follow.
> If
> >>> the service in question
> is
> >>> to be scalable,
> >>> I think it stands to
> reason
> >>> that there must be some
> >>> mechanism by which
> >>> instances of the service
> can
> >>> share usage records (as
> you
> >>> refer to
> >>> them, and I like that
> term).
> >>> I think it stands to
> reason
> >>> that there
> >>> must be some database,
> no?
> >>>
> >>> A
> >>>
> hypothetical
> >>> service
> >>> consuming
> >>> the
> >>> Delimiter
> >>> library
> >>> provides
> >>>
> requesters
> >>> with some
> >>> widgets,
> and
> >>> wishes to
> >>> track the
> >>> widgets
> that
> >>> it has
> >>>
> provisioned
> >>> both on a
> >>> per-user
> >>> basis,
> and
> >>> on the
> >>> whole. It
> >>> should
> >>> therefore
> >>>
> multi-tenant
> >>> and able
> to
> >>> track the
> >>> widgets
> on a
> >>> per
> >>> tenant
> basis
> >>> and if
> >>> required
> >>> impose
> >>> limits on
> >>> the
> number
> >>> of
> widgets
> >>> that a
> >>> tenant
> may
> >>> consume
> at a
> >>> time,
> during
> >>> a course
> of
> >>> a period
> of
> >>> time, and
> so
> >>> on.
> >>>
> >>>
> >>> No, this last
> part
> >>> is absolutely not
> >>> what I think
> quota
> >>> management
> >>> should be about.
> >>>
> >>> Rate limiting --
> >>> i.e. how many
> >>> requests a
> >>> particular user
> can
> >>> make of
> >>> an API in a given
> >>> period of time --
> >>> should *not* be
> >>> handled by
> >>> OpenStack API
> >>> services, IMHO.
> It
> >>> is the
> >>> responsibility of
> >>> the
> >>> deployer to
> handle
> >>> this using
> >>> off-the-shelf
> >>> rate-limiting
> >>> solutions
> >>>
> >>>
> >>> (open source or proprietary).
> >>>
> >>>
> >>> Quotas should
> only
> >>> be about the hard
> >>> limit of
> different
> >>> types of
> >>> resources that a
> >>> user or group of
> >>> users can consume
> at
> >>> a given time.
> >>>
> >>>
> >>> [amrith] OK, good point.
> >>> Agreed as stipulated.
> >>>
> >>>
> >>> Such a
> >>>
> hypothetical
> >>> service
> may
> >>> also
> consume
> >>> resources
> >>> from
> other
> >>> services
> >>> that it
> >>> wishes to
> >>> track,
> and
> >>> impose
> >>> limits
> on.
> >>>
> >>>
> >>> Yes, absolutely
> >>> agreed.
> >>>
> >>>
> >>> It is
> also
> >>>
> understood
> >>> as Jay
> Pipes
> >>> points
> out
> >>> in [4]
> that
> >>> the
> actual
> >>> process
> of
> >>>
> provisioning
> >>> widgets
> >>> could be
> >>> time
> >>> consuming
> >>> and it is
> >>>
> ill-advised
> >>> to hold a
> >>> database
> >>>
> transaction
> >>> of any
> kind
> >>> open for
> >>> that
> >>> duration
> of
> >>> time.
> >>> Ensuring
> >>> that a
> user
> >>> does not
> >>> exceed
> some
> >>> limit on
> >>> the
> number
> >>> of
> >>>
> concurrent
> >>> widgets
> that
> >>> he or she
> >>> may
> create
> >>> therefore
> >>> requires
> >>> some
> >>> mechanism
> to
> >>> track
> >>> in-flight
> >>> requests
> for
> >>> widgets.
> I
> >>> view
> these
> >>> as
> "intent"
> >>> but not
> yet
> >>>
> materialized.
> >>>
> >>>
> >>> It has nothing to
> do
> >>> with the amount
> of
> >>> concurrent
> widgets
> >>> that a
> >>> user can create.
> >>> It's just about
> the
> >>> total number of
> some
> >>> resource
> >>> that may be
> consumed
> >>> by that user.
> >>>
> >>> As for an
> "intent",
> >>> I don't believe
> >>> tracking intent
> is
> >>> the right way
> >>> to go at all. As
> >>> I've mentioned
> >>> before, the major
> >>> problem in Nova's
> >>> quota system is
> that
> >>> there are two
> tables
> >>> storing resource
> >>> usage
> >>> records: the
> >>> *actual* resource
> >>> usage tables (the
> >>> allocations table
> in
> >>> the new
> >>> resource-
> providers
> >>> modeling and the
> >>> instance_extra,
> >>> pci_devices and
> >>> instances table
> in
> >>> the legacy
> modeling)
> >>> and the *quota
> >>> usage* tables
> >>> (quota_usages and
> >>> reservations
> >>> tables). The
> >>> quota_usages
> table
> >>> does
> >>> not need to exist
> at
> >>> all, and neither
> >>> does the
> >>> reservations
> table.
> >>> Don't do
> >>> intent-based
> >>> consumption.
> >>> Instead, just
> >>> consume (claim)
> by
> >>> writing a record
> for
> >>> the resource
> class
> >>> consumed on a
> >>> provider into
> >>> the actual
> resource
> >>> usages table and
> >>> then "check
> quotas"
> >>> by querying
> >>> the *actual*
> >>> resource usages
> and
> >>> comparing the
> >>> SUM(used) values,
> >>> grouped by
> resource
> >>> class, against
> the
> >>> appropriate quota
> >>> limits for
> >>> the user. The
> >>> introduction of
> the
> >>> quota_usages and
> >>> reservations
> >>> tables to cache
> >>> usage records is
> the
> >>> primary reason
> for
> >>> the race
> >>> problems in the
> Nova
> >>> (and
> >>> other) quota
> system
> >>> because every
> time
> >>> you introduce a
> >>> caching system
> >>> for
> highly-volatile
> >>> data (like usage
> >>> records) you
> >>> introduce
> >>> complexity into
> the
> >>> write path and
> the
> >>> need to track the
> >>> same thing
> >>> across multiple
> >>> writes to
> different
> >>> tables
> needlessly.
> >>>
> >>>
> >>> [amrith] I don't agree,
> I'll
> >>> respond to this and the
> next
> >>> comment group
> >>>
> >>>
> >>>
> >>> together. See below.
> >>>
> >>>
> >>> Looking
> up
> >>> at this
> >>> whole
> >>>
> infrastructure from the perspective of the
> >>> database,
> I
> >>> think we
> >>> should
> >>> require
> that
> >>> the
> database
> >>> must not
> be
> >>> required
> to
> >>> operate
> in
> >>> any
> >>> isolation
> >>> mode
> higher
> >>> than
> >>>
> READ-COMMITTED; more about that later (i.e. requiring a database run
> >>> either
> >>>
> serializable
> >>> or
> >>>
> repeatable
> >>> read is a
> >>> show
> >>> stopper).
> >>>
> >>>
> >>> This is an
> >>> implementation
> >>> detail is not
> >>> relevant to the
> >>> discussion
> >>> about what the
> >>> interface of a
> quota
> >>> library would
> look
> >>> like.
> >>>
> >>>
> >>> [amrith] I disagree, let
> me
> >>> give you an example of
> why.
> >>>
> >>> Earlier, I wrote:
> >>>
> >>> Such a
> >>>
> hypothetical
> >>> service
> may
> >>> also
> consume
> >>> resources
> >>> from
> other
> >>> services
> >>> that it
> >>> wishes to
> >>> track,
> and
> >>> impose
> >>> limits
> on.
> >>>
> >>>
> >>> And you responded:
> >>>
> >>>
> >>> Yes, absolutely
> >>> agreed.
> >>>
> >>>
> >>>
> >>> So let's take this
> >>> hypothetical service that
> in
> >>> response to a user
> >>>
> >>>
> >>>
> >>> request, will provision a Cinder
> >>> volume and a Nova instance. Let's
> >>> assume
> >>> that the service also imposes
> limits
> >>> on the number of cinder volumes
> and
> >>> nova instances the user may
> >>> provision; independent of limits
> >>> that Nova and
> >>> Cinder may themselves maintain.
> >>>
> >>> One way that the
> >>> hypothetical service can
> >>> function is this:
> >>>
> >>> (a) check Cinder quota,
> if
> >>> successful, create cinder
> >>> volume
> >>> (b) check Nova quota, if
> >>> successful, create nova
> >>> instance with cinder
> >>> volume attachment
> >>>
> >>> Now, this is sub-optimal
> as
> >>> there are going to be
> some
> >>> number of cases
> >>>
> >>>
> >>> where the nova quota check fails.
> >>> Now you have needlessly created
> and
> >>> will
> >>> have to release a cinder volume.
> It
> >>> also takes longer to fail.
> >>>
> >>> Another way to do this is
> >>> this:
> >>>
> >>> (1) check Cinder quota,
> if
> >>> successful, check Nova
> >>> quota, if successful
> >>> proceed to (2) else error
> >>> out
> >>> (2) create cinder volume
> >>> (3) create nova instance
> >>> with cinder attachment.
> >>>
> >>> I'm trying to get to this
> >>> latter form of doing
> things.
> >>>
> >>> Easy, you might say ...
> >>> theoretically this should
> >>> simply be:
> >>>
> >>> BEGIN;
> >>> -- Get data to do the
> Cinder
> >>> check
> >>>
> >>> SELECT ......
> >>>
> >>> -- Do the cinder check
> >>>
> >>> INSERT INTO ....
> >>>
> >>> -- Get data to do the
> Nova
> >>> check
> >>>
> >>> SELECT ....
> >>>
> >>> -- Do the Nova check
> >>>
> >>> INSERT INTO ...
> >>>
> >>> COMMIT
> >>>
> >>> You can only make this
> work
> >>> if you ran at isolation
> >>> level serializable.
> >>>
> >>>
> >>> Why?
> >>>
> >>>
> >>> To make this run at
> >>> isolation level
> >>> REPEATABLE-READ, you must
> >>> enforce
> >>>
> >>>
> >>>
> >>> constraints at the database level
> >>> that will fail the commit. But
> wait,
> >>> you
> >>> can't do that because the data
> about
> >>> the global limits may not be in
> the
> >>> same database as the usage
> records.
> >>> Later you talk about caching and
> >>> stuff; all that doesn't help a
> >>> database constraint.
> >>>
> >>> For this reason, I think
> >>> there is going to have to
> be
> >>> some cognizance to
> >>>
> >>>
> >>>
> >>> the database isolation level in
> the
> >>> design of the library, and I
> think
> >>> it
> >>> will also impact the API that can
> be
> >>> constructed.
> >>>
> >>> In
> general
> >>>
> therefore, I
> >>> believe
> that
> >>> the
> >>>
> hypothetical
> >>> service
> >>>
> processing
> >>> requests
> for
> >>> widgets
> >>> would
> have
> >>> to handle
> >>> three
> kinds
> >>> of
> >>>
> operations,
> >>>
> provision,
> >>> modify,
> and
> >>> destroy.
> The
> >>> names
> are, I
> >>> believe,
> >>>
> self-explanatory.
> >>>
> >>>
> >>> Generally,
> >>> modification of a
> >>> resource doesn't
> >>> come into play.
> The
> >>> primary exception
> to
> >>> this is for
> >>> transferring of
> >>> ownership of some
> >>>
> >>>
> >>> resource.
> >>>
> >>>
> >>> [amrith] Trove RESIZE is
> a
> >>> huge benefit for users
> and
> >>> while it may be a
> >>>
> >>>
> >>>
> >>> pain as you say, this is still a
> >>> very real benefit. Trove allows
> you
> >>> to
> >>> resize both your storage (resize
> the
> >>> cinder volume) and resize your
> >>> instance (change the flavor).
> >>>
> >>>
> >>>
> >>>
> >>> Without
> loss
> >>> of
> >>>
> generality,
> >>> one can
> say
> >>> that all
> >>> three of
> >>> them must
> >>> validate
> >>> that the
> >>> operation
> >>> does not
> >>> violate
> some
> >>> limit (no
> >>> more
> >>> than X
> >>> widgets,
> no
> >>> fewer
> than X
> >>> widgets,
> >>> rates,
> and
> >>> so on).
> >>>
> >>>
> >>> No, only the
> >>> creation (and
> very
> >>> rarely the
> >>> modification)
> needs
> >>> any
> >>> validation that a
> >>> limit could been
> >>> violated.
> Destroying
> >>> a resource
> >>> never needs to be
> >>> checked for limit
> >>> violations.
> >>>
> >>>
> >>> [amrith] Well, if you are
> >>> going to create a volume
> of
> >>> 10GB and your
> >>>
> >>>
> >>>
> >>> limit is 100GB, resizing it to
> 200GB
> >>> should fail, I think.
> >>>
> >>>
> >>> Assuming
> >>> that the
> >>> service
> >>>
> provisions
> >>> resources
> >>> from
> other
> >>> services,
> >>> it is
> also
> >>>
> conceivable
> >>> that
> limits
> >>> be
> imposed
> >>> on the
> >>> quantum
> of
> >>> those
> >>> services
> >>> consumed.
> In
> >>> practice,
> I
> >>> can
> imagine
> >>> a service
> >>> like
> >>> Trove
> using
> >>> the
> >>> Delimiter
> >>> project
> to
> >>> perform
> all
> >>> of these
> >>> kinds of
> >>> limit
> >>> checks;
> I'm
> >>> not
> >>>
> suggesting
> >>> that it
> does
> >>> this
> today,
> >>> nor that
> >>> there is
> an
> >>> immediate
> >>> plan to
> >>> implement
> >>> all of
> them,
> >>> just that
> >>> these
> >>> all seem
> >>> like good
> >>> uses a
> Quota
> >>>
> Management
> >>>
> capability.
> >>>
> >>> - User
> may
> >>> not have
> >>> more than
> 25
> >>> database
> >>> instances
> at
> >>> a
> >>>
> >>>
> >>> time
> >>>
> >>>
> >>>
> -
> >>> User may
> not
> >>> have more
> >>> than 4
> >>> clusters
> at
> >>> a time
> >>> - User
> may
> >>> not
> consume
> >>> more than
> >>> 3TB of
> SSD
> >>> storage
> at a
> >>> time
> >>>
> >>>
> >>> Only if SSD
> storage
> >>> is a distinct
> >>> resource class
> from
> >>> DISK_GB. Right
> >>> now, Nova makes
> no
> >>> differentiation
> >>> w.r.t. SSD or HDD
> or
> >>> shared vs.
> >>> local block
> storage.
> >>>
> >>>
> >>> [amrith] It matters not
> to
> >>> Trove whether Nova does
> nor
> >>> not. Cinder
> >>>
> >>>
> >>>
> >>> supports volume-types and users
> DO
> >>> want to limit based on
> volume-type
> >>> (for
> >>> example).
> >>>
> >>>
> -
> >>> User may
> not
> >>> launch
> more
> >>> than 10
> huge
> >>> instances
> at
> >>> a
> >>> time
> >>>
> >>>
> >>> What is the point
> of
> >>> such a limit?
> >>>
> >>>
> >>>
> >>> [amrith] Metering usage,
> >>> placing limitations on
> the
> >>> quantum of resources
> >>>
> >>>
> >>>
> >>> that a user may provision. Same
> as
> >>> with Nova. A flavor is merely a
> >>> simple
> >>> way to tie together a bag of
> >>> resources. It is a way to
> restrict
> >>> access,
> >>> for example, to specific
> resources
> >>> that are available in the cloud.
> >>> HUGE
> >>> is just an example I gave, pick
> any
> >>> flavor you want, and here's how a
> >>> service like Trove uses it.
> >>>
> >>> Users can ask to launch
> an
> >>> instance of a specific
> >>> database+version;
> >>>
> >>>
> >>>
> >>> MySQL 5.6-48 for example. Now, an
> >>> operator can restrict the
> instance
> >>> flavors, or volume types that can
> be
> >>> associated with the specific
> >>> datastore. And the flavor could
> be
> >>> used to map to, for example
> whether
> >>> the
> >>> instance is running on bare metal
> or
> >>> in a VM and if so with what kind
> of
> >>> hardware. That's a useful
> construct
> >>> for a service like Trove.
> >>>
> >>>
> -
> >>> User may
> not
> >>> launch
> more
> >>> than 3
> >>> clusters
> an
> >>> hour
> >>>
> >>>
> >>>
> >>> -1. This is rate
> >>> limiting and
> should
> >>> be handled by
> >>> rate-limiting
> >>>
> >>>
> >>>
> >>> services.
> >>>
> >>>
> >>>
> -
> >>> No more
> than
> >>> 500
> copies
> >>> of Oracle
> >>> may be
> run
> >>> at a time
> >>>
> >>>
> >>>
> >>> Is "Oracle" a
> >>> resource class?
> >>>
> >>>
> >>>
> >>> [amrith] As I view it,
> every
> >>> project should be free to
> >>> define its own
> >>>
> >>>
> >>>
> >>> set of resource classes and meter
> >>> them as it feels fit. So, while
> >>> Oracle
> >>> licenses may not, conceivably a
> lot
> >>> of things that Nova, Cinder, and
> the
> >>> other core projects don't care
> >>> about, are in fact relevant for a
> >>> consumer
> >>> of this library.
> >>>
> >>> While
> Nova
> >>> would be
> the
> >>> service
> that
> >>> limits
> the
> >>> number of
> >>> instances
> >>> a user
> can
> >>> have at a
> >>> time, the
> >>> ability
> for
> >>> a service
> to
> >>> limit
> this
> >>> further
> >>> should
> not
> >>> be
> >>>
> underestimated.
> >>>
> >>> In turn,
> >>> should
> Nova
> >>> and
> Cinder
> >>> also use
> the
> >>> same
> Quota
> >>>
> Management
> >>> Library,
> >>> they may
> >>> each
> impose
> >>>
> limitations
> >>> like:
> >>>
> >>> - User
> may
> >>> not
> launch
> >>> more than
> 20
> >>> huge
> >>> instances
> at
> >>> a
> >>> time
> >>>
> >>>
> >>> Not a useful
> >>> limitation IMHO.
> >>>
> >>>
> >>>
> >>> [amrith] I beg to differ.
> >>> Again a huge instance is
> >>> just an example of
> >>>
> >>>
> >>>
> >>> some flavor; and the idea is to
> >>> allow a project to place its own
> >>> metrics
> >>> and meter based on those.
> >>>
> >>>
> -
> >>> User may
> not
> >>> launch
> more
> >>> than 3
> >>> instances
> in
> >>> a minute
> >>>
> >>>
> >>>
> >>> -1. This is rate
> >>> limiting.
> >>>
> >>>
> >>>
> -
> >>> User may
> not
> >>> consume
> more
> >>> than 15TB
> of
> >>> SSD at a
> >>> time
> >>> - User
> may
> >>> not have
> >>> more than
> 30
> >>> volumes
> at a
> >>> time
> >>>
> >>> Again,
> I'm
> >>> not
> implying
> >>> that
> either
> >>> Nova or
> >>> Cinder
> >>> should
> >>> provide
> >>> these
> >>>
> capabilities.
> >>>
> >>> With this
> in
> >>> mind, I
> >>> believe
> that
> >>> the
> minimal
> >>> set of
> >>>
> operations
> >>> that
> >>> Delimiter
> >>> should
> >>> provide
> are:
> >>>
> >>> -
> >>>
> define_resource(name, max, min, user_max, user_min, ...)
> >>>
> >>>
> >>> What would the
> above
> >>> do? What service
> >>> would it be
> speaking
> >>> to?
> >>>
> >>>
> >>>
> >>> [amrith] I assume that
> this
> >>> would speak with some
> >>> backend (either
> >>>
> >>>
> >>>
> >>> keystone or the project itself)
> and
> >>> record these designated limits.
> This
> >>> is the way to register a project
> >>> specific metric like "Oracle
> >>> licenses".
> >>>
> >>>
> -
> >>>
> update_resource_limits(name, user, user_max, user_min,
> >>> ...)
> >>>
> >>>
> >>> This doesn't
> belong
> >>> in a quota
> library.
> >>> It belongs as a
> REST
> >>> API in
> >>> Keystone.
> >>>
> >>>
> >>> [amrith] Fine, same place
> >>> where the previous thing
> >>> stores the global
> >>>
> >>>
> >>>
> >>> defaults is the target of this
> call.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> -
> >>>
> reserve_resource(name, user, size, parent_resource, ...)
> >>>
> >>>
> >>>
> >>> This doesn't
> belong
> >>> in a quota
> library
> >>> at all. I think
> >>> reservations
> >>> are not germane
> to
> >>> resource
> consumption
> >>> and should be
> >>> handled by an
> >>> external service
> at
> >>> the orchestration
> >>> layer.
> >>>
> >>>
> >>> [amrith] Again not true,
> as
> >>> illustrated above this
> >>> library is the thing
> >>>
> >>>
> >>>
> >>> that projects could use to
> determine
> >>> whether or not to honor a
> request.
> >>> This reserve/provision process
> is, I
> >>> believe required because of the
> >>> vagaries of how we want to
> implement
> >>> this in the database.
> >>>
> >>>
> -
> >>>
> provision_resource(resource, id)
> >>>
> >>>
> >>>
> >>> A quota library
> >>> should not be
> >>> provisioning
> >>> anything. A quota
> >>> library
> >>> should simply
> >>> provide a
> consistent
> >>> interface for
> >>> *checking* that a
> >>> structured
> request
> >>> for some set of
> >>> resources *can*
> be
> >>> provided by the
> >>> service.
> >>>
> >>>
> >>> [amrith] This does not
> >>> actually call Nova or
> >>> anything; merely that
> AFTER
> >>>
> >>>
> >>>
> >>> the hypothetical service has
> called
> >>> NOVA, this converts the
> reservation
> >>> (which can expire) into an actual
> >>> allocation.
> >>>
> >>>
> -
> >>>
> update_resource(id or resource, newsize)
> >>>
> >>>
> >>>
> >>> Resizing
> resources
> >>> is a bad idea,
> IMHO.
> >>> Resources are
> easier
> >>> to deal
> >>> with when they
> are
> >>> considered of
> >>> immutable size
> and
> >>> simple (i.e. not
> >>> complex or
> nested).
> >>> I think the
> problem
> >>> here is in the
> >>> definition of
> >>> resource classes
> >>> improperly.
> >>>
> >>>
> >>> [amrith] Let's leave the
> >>> quota library aside. This
> >>> assertion strikes at
> >>>
> >>>
> >>>
> >>> the very heart of things like
> Nova
> >>> resize, or for that matter Cinder
> >>> volume resize. Are those all bad
> >>> ideas? I made a 500GB Cinder
> volume
> >>> and
> >>> it is getting close to full. I'd
> >>> like to resize it to 2TB. Are you
> >>> saying
> >>> that's not a valid use case?
> >>>
> >>> For example, a
> >>> "cluster" is not
> a
> >>> resource. It is a
> >>> collection of
> >>> resources of type
> >>> node. "Resizing"
> a
> >>> cluster is a
> >>> misnomer, because
> >>> you aren't
> resizing
> >>> a resource at
> all.
> >>> Instead, you are
> >>> creating or
> >>> destroying
> resources
> >>> inside the
> cluster
> >>> (i.e. joining or
> >>> leaving
> >>>
> >>>
> >>> cluster nodes).
> >>>
> >>>
> >>> BTW, this is also
> >>> why the "resize
> >>> instance" API in
> >>> Nova is such a
> >>> giant pain in the
> >>> ass. It's
> attempting
> >>> to "modify" the
> >>> instance
> >>>
> >>>
> >>> "resource"
> >>>
> >>>
> >>> when the instance
> >>> isn't really the
> >>> resource at all.
> The
> >>> VCPU, RAM_MB,
> >>> DISK_GB, and PCI
> >>> devices are the
> >>> actual resources.
> >>> The instance is a
> >>> convenient way to
> >>> tie those
> resources
> >>> together, and
> doing
> >>> a "resize"
> >>> of the instance
> >>> behind the scenes
> >>> actually performs
> a
> >>> *move*
> >>> operation, which
> >>> isn't a *change*
> of
> >>> the original
> >>> resources.
> Rather,
> >>> it is a creation
> of
> >>> a new set of
> >>> resources (of the
> >>> new amounts) and
> a
> >>> deletion of the
> old
> >>> set of resources.
> >>>
> >>>
> >>> [amrith] that's fine, if
> all
> >>> we want is to handle the
> >>> resize operation
> >>>
> >>>
> >>>
> >>> as a new instance followed by a
> >>> deletion, that's great. But that
> >>> semantic
> >>> isn't necessarily the case for
> >>> something like (say) cinder.
> >>>
> >>> The "resize" API
> >>> call adds some
> nasty
> >>> confirmation and
> >>> cancel
> >>> semantics to the
> >>> calling interface
> >>> that hint that
> the
> >>> underlying
> >>> implementation of
> >>> the "resize"
> >>> operation is in
> >>> actuality not a
> >>> resize
> >>> at all, but
> rather a
> >>>
> create-new-and-delete-old-resources operation.
> >>>
> >>>
> >>> [amrith] And that isn't
> >>> germane to a quota
> library,
> >>> I don't think. What
> >>>
> >>>
> >>>
> >>> is, is this. Do we want to treat
> the
> >>> transient state when there are
> (for
> >>> example of Nova) two instances,
> one
> >>> of the new flavor and one of the
> old
> >>> flavor, or not. But, from the
> >>> perspective of a quota library, a
> >>> resize
> >>> operation is merely a reset of
> the
> >>> quota by the delta in the
> resource
> >>> consumed.
> >>>
> >>>
> >>>
> >>>
> >>>
> -
> >>>
> release_resource(id or resource)
> >>> -
> >>>
> expire_reservations()
> >>>
> >>>
> >>> I see no need to
> >>> have reservations
> in
> >>> the quota library
> at
> >>> all, as
> >>> mentioned above.
> >>>
> >>>
> >>> [amrith] Then I think the
> >>> quota library must
> require
> >>> that either (a) the
> >>>
> >>>
> >>>
> >>> underlying database runs
> >>> serializable or (b) database
> >>> constraints can be
> >>> used to enforce that at commit
> the
> >>> global limits are adhered to.
> >>>
> >>> As for your
> proposed
> >>> interface and
> >>> calling structure
> >>> below, I think a
> >>> much simpler
> >>> proposal would
> work
> >>> better. I'll work
> on
> >>> a cross-project
> >>> spec that
> describes
> >>> this simpler
> >>> proposal, but the
> >>> basics would be:
> >>>
> >>> 1) Have Keystone
> >>> store quota
> >>> information for
> >>> defaults (per
> >>> service
> >>> endpoint), for
> >>> tenants and for
> >>> users.
> >>>
> >>> Keystone would
> have
> >>> the set of
> canonical
> >>> resource class
> >>> names, and
> >>> each project,
> upon
> >>> handling a new
> >>> resource class,
> >>> would be
> >>> responsible for a
> >>> change submitted
> to
> >>> Keystone to add
> the
> >>> new resource
> >>>
> >>>
> >>> class code.
> >>>
> >>>
> >>> Straw man REST
> API:
> >>>
> >>>
> GET /quotas/resource-classes
> >>> 200 OK
> >>> {
> >>>
> "resource_classes":
> >>> {
> >>> "compute.vcpu": {
> >>> "service":
> >>> "compute",
> >>> "code":
> >>> "compute.vcpu",
> >>> "description": "A
> >>> virtual CPU unit"
> >>> },
> >>> "compute.ram_mb":
> {
> >>> "service":
> >>> "compute",
> >>> "code":
> >>> "compute.ram_mb",
> >>> "description":
> >>> "Memory in
> >>> megabytes"
> >>> },
> >>> ...
> >>> "volume.disk_gb":
> {
> >>> "service":
> "volume",
> >>> "code":
> >>> "volume.disk_gb",
> >>> "description":
> >>> "Amount of disk
> >>> space in
> gigabytes"
> >>> },
> >>> ...
> >>> "database.count":
> {
> >>> "service":
> >>> "database",
> >>> "code":
> >>> "database.count",
> >>> "description":
> >>> "Number of
> database
> >>> instances"
> >>> }
> >>> }
> >>> }
> >>>
> >>>
> >>> [amrith] Well, a user is
> >>> allowed to have a certain
> >>> compute quota (which
> >>>
> >>>
> >>>
> >>> is shared by Nova and Trove) but
> >>> also a Trove quota. How would
> your
> >>> representation represent that?
> >>>
> >>> # Get the default
> >>> limits for new
> >>> users...
> >>>
> GET /quotas/defaults
> >>> 200 OK
> >>> {
> >>> "quotas": {
> >>> "compute.vcpu":
> 100,
> >>> "compute.ram_mb":
> >>> 32768,
> >>> "volume.disk_gb":
> >>> 1000,
> >>> "database.count":
> 25
> >>> }
> >>> }
> >>>
> >>> # Get a specific
> >>> user's limits...
> >>>
> GET /quotas/users/{UUID}
> >>> 200 OK
> >>> {
> >>> "quotas": {
> >>> "compute.vcpu":
> 100,
> >>> "compute.ram_mb":
> >>> 32768,
> >>> "volume.disk_gb":
> >>> 1000,
> >>> "database.count":
> 25
> >>> }
> >>> }
> >>>
> >>> # Get a tenant's
> >>> limits...
> >>>
> GET /quotas/tenants/{UUID}
> >>> 200 OK
> >>> {
> >>> "quotas": {
> >>> "compute.vcpu":
> >>> 1000,
> >>> "compute.ram_mb":
> >>> 327680,
> >>> "volume.disk_gb":
> >>> 10000,
> >>> "database.count":
> >>> 250
> >>> }
> >>> }
> >>>
> >>> 2) Have Delimiter
> >>> communicate with
> the
> >>> above proposed
> new
> >>> Keystone
> >>> REST API and
> package
> >>> up data into an
> >>>
> oslo.versioned_objects interface.
> >>>
> >>> Clearly all of
> the
> >>> above can be
> heavily
> >>> cached both on
> the
> >>> server and
> >>> client side since
> >>> they rarely
> change
> >>> but are read
> often.
> >>>
> >>>
> >>> [amrith] Caching on the
> >>> client won't save you
> from
> >>> oversubscription if
> >>>
> >>>
> >>>
> >>> you don't run serializable.
> >>>
> >>>
> >>> The Delimiter
> >>> library could be
> >>> used to provide a
> >>> calling interface
> >>> for service
> projects
> >>> to get a user's
> >>> limits for a set
> of
> >>> resource
> >>>
> >>>
> >>> classes:
> >>>
> >>>
> >>> (please excuse
> >>> wrongness, typos,
> >>> and other stuff
> >>> below, it's just
> a
> >>> straw- man not
> >>> production
> working
> >>> code...)
> >>>
> >>> # file:
> >>>
> delimiter/objects/limits.py
> >>> import
> >>>
> oslo.versioned_objects.base as ovo import
> >>>
> oslo.versioned_objects.fields as ovo_fields
> >>>
> >>>
> >>> class
> >>>
> ResourceLimit(ovo.VersionedObjectBase):
> >>> # 1.0: Initial
> >>> version
> >>> VERSION = '1.0'
> >>>
> >>> fields = {
> >>> 'resource_class':
> >>>
> ovo_fields.StringField(),
> >>> 'amount':
> >>>
> ovo_fields.IntegerField(),
> >>> }
> >>>
> >>>
> >>> class
> >>>
> ResourceLimitList(ovo.VersionedObjectBase):
> >>> # 1.0: Initial
> >>> version
> >>> VERSION = '1.0'
> >>>
> >>> fields = {
> >>> 'resources':
> >>>
> ListOfObjectsField(ResourceLimit),
> >>> }
> >>>
> >>>
> @cache_this_heavily
> >>>
> @remotable_classmethod
> >>> def
> >>>
> get_all_by_user(cls,
> >>> user_uuid):
> >>> """Returns a
> Limits
> >>> object that tells
> >>> the caller what a
> >>> user's
> >>> absolute limits
> for
> >>> the set of
> resource
> >>> classes in the
> >>> system.
> >>> """
> >>> # Grab a keystone
> >>> client session
> >>> object and
> connect
> >>> to Keystone
> >>> ks =
> >>>
> ksclient.Session(...)
> >>> raw_limits =
> >>>
> ksclient.get_limits_by_user()
> >>> return
> >>>
> cls(resources=[ResourceLimit(**d) for d in raw_limits])
> >>>
> >>> 3) Each service
> >>> project would be
> >>> responsible for
> >>> handling the
> >>> consumption of a
> set
> >>> of requested
> >>> resource amounts
> in
> >>> an atomic and
> >>>
> >>>
> >>> consistent way.
> >>>
> >>>
> >>> [amrith] This is where
> the
> >>> rubber meets the road.
> What
> >>> is that atomic
> >>>
> >>>
> >>>
> >>> and consistent way? And what
> >>> computing infrastructure do you
> need
> >>> to
> >>> deliver this?
> >>>
> >>> The Delimiter
> >>> library would
> return
> >>> the limits that
> the
> >>> service would
> >>> pre- check before
> >>> claiming the
> >>> resources and
> either
> >>> post-check after
> >>> claim or utilize
> a
> >>>
> compare-and-update
> >>> technique with a
> >>>
> generation/timestamp
> >>> during claiming
> to
> >>> prevent race
> >>> conditions.
> >>>
> >>> For instance, in
> >>> Nova with the new
> >>> resource
> providers
> >>> database schema
> >>> and doing claims
> in
> >>> the scheduler (a
> >>> proposed change),
> we
> >>> might do
> >>> something to the
> >>> effect of:
> >>>
> >>> from delimiter
> >>> import objects as
> >>> delim_obj from
> >>> delimier import
> >>> exceptions as
> >>> delim_exc from
> nova
> >>> import objects as
> >>> nova_obj
> >>>
> >>> request =
> >>>
> nova_obj.RequestSpec.get_by_uuid(request_uuid)
> >>> requested =
> >>> request.resources
> >>> limits =
> >>>
> delim_obj.ResourceLimitList.get_all_by_user(user_uuid)
> >>> allocations =
> >>>
> nova_obj.AllocationList.get_all_by_user(user_uuid)
> >>>
> >>> # Pre-check for
> >>> violations
> >>> for
> resource_class,
> >>> requested_amount
> in
> >>>
> requested.items():
> >>> limit_idx =
> >>>
> limits.resources.index(resource_class)
> >>> resource_limit =
> >>>
> limits.resources[limit_idx].amount
> >>> alloc_idx =
> >>>
> allocations.resources.index(resource_class)
> >>> resource_used =
> >>>
> allocations.resources[alloc_idx]
> >>> if (resource_used
> +
> >>>
> requested_amount)>
> >>> resource_limit:
> >>> raise
> >>>
> delim_exc.QuotaExceeded
> >>>
> >>>
> >>> [amrith] Is the above
> code
> >>> run with some global
> mutex
> >>> to prevent that
> >>>
> >>>
> >>>
> >>> two people don't believe that
> they
> >>> are good on quota at the same
> time?
> >>>
> >>>
> >>> # Do claims in
> >>> scheduler in an
> >>> atomic,
> consistent
> >>> fashion...
> >>> claims =
> >>>
> scheduler_client.claim_resources(request)
> >>>
> >>>
> >>> [amrith] Yes, each
> 'atomic'
> >>> claim on a
> repeatable-read
> >>> database could
> >>>
> >>>
> >>>
> >>> result in oversubscription.
> >>>
> >>>
> >>> # Post-check for
> >>> violations
> >>> allocations =
> >>>
> nova_obj.AllocationList.get_all_by_user(user_uuid)
> >>> # allocations now
> >>> include the
> claimed
> >>> resources from
> the
> >>> scheduler
> >>>
> >>> for
> resource_class,
> >>> requested_amount
> in
> >>>
> requested.items():
> >>> limit_idx =
> >>>
> limits.resources.index(resource_class)
> >>> resource_limit =
> >>>
> limits.resources[limit_idx].amount
> >>> alloc_idx =
> >>>
> allocations.resources.index(resource_class)
> >>> resource_used =
> >>>
> allocations.resources[alloc_idx]
> >>> if resource_used>
> >>> resource_limit:
> >>> # Delete the
> >>> allocation
> records
> >>> for the resources
> >>> just claimed
> >>>
> delete_resources(claims)
> >>> raise
> >>>
> delim_exc.QuotaExceeded
> >>>
> >>>
> >>> [amrith] Again, two
> people
> >>> could drive through this
> >>> code and both of
> >>> them could fail :(
> >>>
> >>> 4) The only other
> >>> thing that would
> >>> need to be done
> for
> >>> a first go of
> >>> the Delimiter
> >>> library is some
> >>> event listener
> that
> >>> can listen for
> >>> changes to the
> quota
> >>> limits for a
> >>>
> user/tenant/default
> >>> in Keystone.
> >>> We'd want the
> >>> services to be
> able
> >>> notify someone if
> a
> >>> reduction in
> >>> quota results in
> an
> >>> overquota
> situation.
> >>>
> >>> Anyway, that's my
> >>> idea. Keep the
> >>> Delimiter library
> >>> small and focused
> >>> on describing the
> >>> limits only, not
> on
> >>> the resource
> >>> allocations. Have
> >>> the Delimiter
> >>> library present a
> >>> versioned object
> >>> interface so the
> >>> interaction
> between
> >>> the data exposed
> by
> >>> the Keystone REST
> >>> API for
> >>> quotas can evolve
> >>> naturally and
> >>> smoothly over
> time.
> >>>
> >>> Best,
> >>> -jay
> >>>
> >>> Let me
> >>>
> illustrate
> >>> the way I
> >>> see these
> >>> things
> >>> fitting
> >>> together.
> A
> >>>
> hypothetical
> >>> Trove
> system
> >>> may be
> setup
> >>> as
> follows:
> >>>
> >>> - No more
> >>> than 2000
> >>> database
> >>> instances
> in
> >>> total,
> 300
> >>> clusters
> >>>
> >>>
> >>> in
> >>>
> >>>
> >>>
> >>> total
> >>> - Users
> may
> >>> not
> launch
> >>> more than
> 25
> >>> database
> >>>
> instances,
> >>> or 4
> >>> clusters
> >>> - The
> >>>
> particular
> >>> user
> >>> 'amrith'
> is
> >>> limited
> to 2
> >>> databases
> >>> and
> >>>
> >>>
> >>> 1
> >>>
> >>>
> >>>
> >>> cluster
> >>> - No user
> >>> may
> consume
> >>> more than
> >>> 20TB of
> >>> storage
> at a
> >>> time
> >>> - No user
> >>> may
> consume
> >>> more than
> >>> 10GB of
> >>> memory at
> a
> >>> time
> >>>
> >>> At
> startup,
> >>> I believe
> >>> that the
> >>> system
> would
> >>> make the
> >>> following
> >>> sequence
> of
> >>> calls:
> >>>
> >>> -
> >>>
> define_resource(databaseInstance, 2000, 0, 25, 0, ...)
> >>> -
> >>>
> update_resource_limits(databaseInstance, amrith, 2, 0,
> >>>
> >>>
> >>> ...)
> >>>
> >>>
> >>>
> -
> >>>
> define_resource(databaseCluster, 300, 0, 4, 0, ...)
> >>> -
> >>>
> update_resource_limits(databaseCluster, amrith, 1, 0, ...)
> >>> -
> >>>
> define_resource(storage, -1, 0, 20TB, 0, ...)
> >>> -
> >>>
> define_resource(memory, -1, 0, 10GB, 0, ...)
> >>>
> >>> Assume that the user john comes along and asks for a cluster with
> >>> 4 nodes, 1TB storage per node, and each node having 1GB of memory.
> >>> The system would go through the following sequence:
> >>>
> >>> - reserve_resource(databaseCluster, john, 1, None)
> >>>   o this returns a resourceID (say cluster-resource-id)
> >>>   o the cluster instance that it reserves counts against the limit
> >>>     of 300 cluster instances in total, as well as the 4 clusters
> >>>     that john can provision. If 'amrith' had requested it, that
> >>>     would have been counted against the limit of 1 cluster for the
> >>>     user.
> >>>
> >>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
> >>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
> >>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
> >>> - reserve_resource(databaseInstance, john, 1, cluster-resource-id)
> >>>   o this returns four resource IDs, let's say instance-1-id,
> >>>     instance-2-id, instance-3-id, instance-4-id
> >>>   o note that each instance is just that, an instance by itself.
> >>>     It is therefore not right to consider this as equivalent to a
> >>>     call to reserve_resource() with a size of 4, especially
> >>>     because each instance could later be tracked as an individual
> >>>     Nova instance.
> >>>
> >>> - reserve_resource(storage, john, 1TB, instance-1-id)
> >>> - reserve_resource(storage, john, 1TB, instance-2-id)
> >>> - reserve_resource(storage, john, 1TB, instance-3-id)
> >>> - reserve_resource(storage, john, 1TB, instance-4-id)
> >>>   o each of them returns some resourceID, let's say they returned
> >>>     cinder-1-id, cinder-2-id, cinder-3-id, cinder-4-id
> >>>   o since the storage of 1TB is a unit, it is treated as such. In
> >>>     other words, you don't need to invoke reserve_resource 10^12
> >>>     times, once per byte allocated :)
> >>>
> >>> - reserve_resource(memory, john, 1GB, instance-1-id)
> >>> - reserve_resource(memory, john, 1GB, instance-2-id)
> >>> - reserve_resource(memory, john, 1GB, instance-3-id)
> >>> - reserve_resource(memory, john, 1GB, instance-4-id)
> >>>   o each of these returns something, say Dg4KBQcODAENBQEGBAcEDA,
> >>>     CgMJAg8FBQ8GDwgLBA8FAg, BAQJBwYMDwAIAA0DBAkNAg,
> >>>     AQMLDA4OAgEBCQ0MBAMGCA. I have made up arbitrary strings just
> >>>     to highlight that we really don't track these anywhere, so we
> >>>     don't care about them.
> >>>
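> >>> Pulling that reservation sequence together, a Python sketch might
> >>> look like the following. The keyword names, the parent argument,
> >>> and the delimiter module are assumptions carried over from the
> >>> earlier sketch, not a settled interface:
> >>>
> >>>     import delimiter  # hypothetical
> >>>
> >>>     TB = 1024 ** 4
> >>>     GB = 1024 ** 3
> >>>
> >>>     # Reserve the cluster first; everything else hangs off its ID.
> >>>     cluster_rid = delimiter.reserve_resource('databaseCluster',
> >>>                                              user='john', amount=1,
> >>>                                              parent=None)
> >>>
> >>>     instance_rids = []
> >>>     volume_rids = []
> >>>     for _ in range(4):
> >>>         # Four single-instance reservations rather than one of
> >>>         # size 4, so each can later be tied to its own Nova
> >>>         # instance.
> >>>         inst = delimiter.reserve_resource('databaseInstance',
> >>>                                           user='john', amount=1,
> >>>                                           parent=cluster_rid)
> >>>         instance_rids.append(inst)
> >>>         volume_rids.append(
> >>>             delimiter.reserve_resource('storage', user='john',
> >>>                                        amount=1 * TB, parent=inst))
> >>>         delimiter.reserve_resource('memory', user='john',
> >>>                                    amount=1 * GB, parent=inst)
> >>>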
> >>> If all this works, then the system knows that John's request does
> >>> not violate any quotas that it can enforce, and it can then go
> >>> ahead and launch the instances (calling Nova), provision storage,
> >>> and so on.
> >>>
> >>> The system then goes and creates four Cinder volumes; these are
> >>> cinder-1-uuid, cinder-2-uuid, cinder-3-uuid, cinder-4-uuid.
> >>>
> >>> It can then go and confirm those reservations:
> >>>
> >>> - provision_resource(cinder-1-id, cinder-1-uuid)
> >>> - provision_resource(cinder-2-id, cinder-2-uuid)
> >>> - provision_resource(cinder-3-id, cinder-3-uuid)
> >>> - provision_resource(cinder-4-id, cinder-4-uuid)
> >>>
> >>> It could then go and launch 4 Nova instances and similarly
> >>> provision those resources, and so on. This process could take some
> >>> minutes, and holding a database transaction open for this is the
> >>> issue that Jay brings up in [4]. We don't have to hold one open in
> >>> this proposed scheme.
> >>>
> >>> Since the resources are all hierarchically linked through the
> >>> overall cluster id, when the cluster is set up, it can finally go
> >>> and provision that:
> >>>
> >>> - provision_resource(cluster-resource-id, cluster-uuid)
> >>>
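> >>> Continuing the earlier sketch, the confirmation step could read as
> >>> follows; cinder_uuids, nova_uuids and cluster_uuid stand in for
> >>> the identifiers returned by the actual Cinder, Nova and cluster
> >>> creation calls, and provision_resource's signature is assumed:
> >>>
> >>>     # Point each reservation at the real provider object.
> >>>     for rid, uuid in zip(volume_rids, cinder_uuids):
> >>>         delimiter.provision_resource(rid, uuid)
> >>>     for rid, uuid in zip(instance_rids, nova_uuids):
> >>>         delimiter.provision_resource(rid, uuid)
> >>>
> >>>     # The cluster itself is confirmed only once its children are.
> >>>     delimiter.provision_resource(cluster_rid, cluster_uuid)
> >>>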
> >>> When Trove is done with some individual resource, it can go and
> >>> release it. Note that I'm thinking this will invoke
> >>> release_resource with the ID of either the underlying object OR
> >>> the resource.
> >>>
> >>> - release_resource(cinder-4-id), and
> >>> - release_resource(cinder-4-uuid)
> >>>
> >>> are therefore identical and indicate that the 4th 1TB volume is
> >>> now released. How this will be implemented in Python, kwargs or
> >>> some other mechanism, is, I believe, an implementation detail.
> >>>
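> >>> One way such an either-identifier interface could be sketched,
> >>> purely as an illustration of that implementation detail (the
> >>> lookup helpers, mark_released and the exception are made up):
> >>>
> >>>     class UnknownResource(Exception):
> >>>         pass
> >>>
> >>>     def release_resource(ref):
> >>>         # Accept either the internal reservation ID or the
> >>>         # provider UUID; the lookup is the only difference.
> >>>         reservation = (lookup_by_reservation_id(ref)
> >>>                        or lookup_by_provider_uuid(ref))
> >>>         if reservation is None:
> >>>             raise UnknownResource(ref)
> >>>         mark_released(reservation)
> >>>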
> >>> Finally, it releases the cluster resource by doing this:
> >>>
> >>> - release_resource(cluster-resource-id)
> >>>
> >>> This would release the cluster and all dependent resources in a
> >>> single operation.
> >>>
> >>> A user may wish to manage a resource that was provisioned from the
> >>> service. Assume that this results in a resizing of the instances;
> >>> then it is a matter of updating that resource.
> >>>
> >>> Assume that the third 1TB volume is being resized to 2TB; then it
> >>> is merely a matter of invoking:
> >>>
> >>> - update_resource(cinder-3-uuid, 2TB)
> >>>
> >>> Delimiter can go figure out that cinder-3-uuid is a 1TB device,
> >>> and that this is therefore an increase of 1TB, and verify that
> >>> this is within the quotas allowed for the user.
> >>>
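> >>> A sketch of that delta check, assuming the library keeps the
> >>> currently provisioned size per reservation and reusing the
> >>> hypothetical claim_with_retry and lookup helpers from the earlier
> >>> sketches:
> >>>
> >>>     def update_resource(provider_uuid, new_size, quota):
> >>>         reservation = lookup_by_provider_uuid(provider_uuid)
> >>>         delta = new_size - reservation.size   # e.g. 2TB - 1TB
> >>>         if delta > 0:
> >>>             # Only the increase is checked against the remaining
> >>>             # quota; a shrink simply frees the difference.
> >>>             claim_with_retry(quota, reservation.resource, delta)
> >>>         reservation.size = new_size
> >>>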
> >>> The thing that I find attractive about this model of maintaining a
> >>> hierarchy of reservations is that in the event of an error, the
> >>> service need merely call release_resource() on the highest level
> >>> reservation and the Delimiter project can walk down the chain and
> >>> release all the resources or reservations as appropriate.
> >>>
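> >>> That walk could be as small as a depth-first traversal of the
> >>> parent/child links; sketch only, with an assumed children_of()
> >>> lookup and the release_resource() sketched above:
> >>>
> >>>     def release_tree(reservation_id):
> >>>         # Release the leaves (volumes, memory, instances) before
> >>>         # the cluster-level reservation that owns them.
> >>>         for child_id in children_of(reservation_id):
> >>>             release_tree(child_id)
> >>>         release_resource(reservation_id)
> >>>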
> >>> Under the covers I believe that each of these operations should be
> >>> atomic and may update multiple database tables, but these will all
> >>> be short-lived operations.
> >>>
> >>> For example, reserving an instance resource would increment the
> >>> number of instances for the user as well as the number of
> >>> instances overall, and this would be an atomic operation.
> >>>
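> >>> For instance, the per-user and global counters could be bumped in
> >>> one short transaction. A SQLAlchemy-flavoured sketch, with made-up
> >>> table and column names and a placeholder connection URL:
> >>>
> >>>     from sqlalchemy import create_engine, text
> >>>
> >>>     engine = create_engine(
> >>>         'mysql+pymysql://delimiter:secret@localhost/delimiter')
> >>>
> >>>     def record_instance_reservation(user_id):
> >>>         # Both single-row updates commit or roll back together;
> >>>         # nothing long-lived is held open.
> >>>         with engine.begin() as conn:
> >>>             conn.execute(
> >>>                 text("UPDATE user_usage "
> >>>                      "SET instances = instances + 1 "
> >>>                      "WHERE user_id = :uid"),
> >>>                 {"uid": user_id})
> >>>             conn.execute(
> >>>                 text("UPDATE global_usage "
> >>>                      "SET instances = instances + 1 "
> >>>                      "WHERE resource = 'databaseInstance'"))
> >>>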
> >>> I have two primary areas of concern about the proposal [3].
> >>>
> >>> The first is that it makes the implicit assumption that the "flat
> >>> mode" is implemented. That provides value to a consumer but I
> >>> think it leaves a lot for the consumer to do. For example, I find
> >>> it hard to see how the model proposed would handle the release of
> >>> quotas, let alone the case of a nested release of a hierarchy of
> >>> resources.
> >>>
> >>> The other is the notion that the implementation will begin a
> >>> transaction, perform a query(), make some manipulations, and then
> >>> do a save(). This makes for an interesting transaction management
> >>> challenge, as it would require the underlying database to run in
> >>> an isolation mode of at least REPEATABLE READ, and maybe even
> >>> SERIALIZABLE, which would be a performance bear on a heavily
> >>> loaded system. If run in the traditional READ COMMITTED mode, this
> >>> would silently lead to oversubscription and the violation of quota
> >>> limits.
> >>>
> >>> I believe that it should be a requirement that the Delimiter
> >>> library be able to run against a database that supports, and is
> >>> configured for, READ COMMITTED, and should not require anything
> >>> higher. The model proposed above can certainly be implemented with
> >>> a database running READ COMMITTED, and I believe that this remains
> >>> true with the caveat that the operations will be performed through
> >>> SQLAlchemy.
> >>>
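> >>> For what it's worth, pinning the engine to READ COMMITTED is a
> >>> one-line option in SQLAlchemy; sketch only, with a placeholder
> >>> connection URL:
> >>>
> >>>     from sqlalchemy import create_engine
> >>>
> >>>     # isolation_level is a standard create_engine() option.
> >>>     engine = create_engine(
> >>>         'mysql+pymysql://delimiter:secret@localhost/delimiter',
> >>>         isolation_level='READ COMMITTED')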
> >>>
> >>> Thanks,
> >>>
> >>> -amrith
> >>>
> >>> [1] http://openstack.markmail.org/thread/tkl2jcyvzgifniux
> >>> [2] http://openstack.markmail.org/thread/3cr7hoeqjmgyle2j
> >>> [3] https://review.openstack.org/#/c/284454/
> >>> [4] http://markmail.org/message/7ixvezcsj3uyiro6
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 966 bytes
Desc: This is a digitally signed message part
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20160425/16c064a6/attachment.pgp>