[openstack-dev] [nova] [placement] [api] cache headers in placement service
Jay Pipes
jaypipes at gmail.com
Mon Aug 21 09:35:13 UTC 2017
On 08/21/2017 04:59 AM, Chris Dent wrote:
> On Sun, 20 Aug 2017, Jay Pipes wrote:
>> On 08/18/2017 01:23 PM, Chris Dent wrote:
>>> So my change above adds 'last-modified' and 'cache-control:
>>> no-cache' to GET of /resource_providers and
>>> /resource_providers/{uuid} and proposes we do it for everything
>>> else.
>>>
>>> Should we?
>>
>> No. :) Not everything. In particular, I think both the GET
>> /resource_classes and GET /traits URIs are very cacheable and we
>> shouldn't disallow proxies from caching that content if they want to.
>
> Except that unless we have cache validation handling on the server
> side, which we don't, then the "very cacheable" is dependent on us
> setting a max-age, and coming to agreement over what the right
> max-age is seems unlikely. The simpler solution is to not cache.
We do have cache validation on the server side for resource classes. Any
time a resource class is added or deleted, we call _RC_CACHE.clear().
Couldn't we add a single attribute to the ResourceClassCache that
returns the last time the cache was reset?
But meh, you're right that the simpler solution is just to not do HTTP
caching.
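
For the record, the attribute I have in mind would look roughly like the
sketch below. This is a simplified stand-in, not the real nova class: the
last_reset name is hypothetical and the cache bodies are reduced to two
plain dicts.

```python
import datetime


class ResourceClassCache(object):
    """Simplified stand-in for illustration, not the real nova class."""

    def __init__(self):
        self.id_cache = {}
        self.str_cache = {}
        # Hypothetical attribute: the time the cache was last reset.
        self.last_reset = datetime.datetime.now(datetime.timezone.utc)

    def clear(self):
        # Drop the cached mappings and record when that happened.
        self.id_cache = {}
        self.str_cache = {}
        self.last_reset = datetime.datetime.now(datetime.timezone.utc)
```

A handler for GET /resource_classes could then, in principle, use
last_reset as its last-modified value.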
>>> If we do, some things to think about:
>>>
>>> * The related OVO will need the updated_at and created_at
>>> fields exposed. This is pretty easy to do with the
>>> NovaTimestampObject mixin. This doesn't require an object
>>> version bump because we don't do RPC with them.
>>
>> Technically, only the updated_at field would need to be exposed via
>> the OVO objects. But OK, sure. I'd even advocate a patch to OVO that
>> would bring in the NovaTimestampObject mixin. Just call it Timestamped
>> or something...
>
> The way the database tables are currently set up, when an entity is
> first created, created_at is set, and updated_at is null. Therefore,
> on new entities, failing over to created_at when updated_at is null
> is necessary.
>
> The work I've done thus far has tried to have the smallest impact on
> the database tables and the queries used to get at them. They're
> already complex enough.
>
> The entity tables already have created_at and updated_at columns.
> Exposing those columns on the objects is a matter of adding the
> mixin.
Right.
> I agree that making a change on OVO to have a Timestamped would be
> useful.
>
>>> * The current implementation of getting the last modified time for a
>>> collection of resources is intentionally naive and decoupled from
>>> other stuff. For very large result sets[3] this might be annoying,
>>> but since we are already doing plenty of traversing of long lists,
>>> it may not be a big deal. If it is we can incorporate getting the
>>> last modified time in the loop that serializes objects to JSON
>>> output.
>>
>> I'm not sure what you're referring to above as "intentionally naive
>> and decoupled from other stuff"? Adding the updated_at field of the
>> underlying DB tables would be trivial -- maybe 10-15 lines total for
>> DB/object layer and REST API as well. Am I missing something?
>
> By "other stuff" I mean two things:
>
> * the code in nova/objects/resource_provider.py
> * the serialization (to JSON) code in placement/handlers/*.py
>
> For those requests that return collections, we _could_ adapt the
> queries used to retrieve those resources to find us the max
> updated_at time during the query.
No, I don't recommend that... just return the updated_at and created_at
fields.
> Or we could also do the same while traversing the list of objects to
> create the JSON output.
Yeah, that's fine.
> I've chosen not to do the DB/object side changes because that is a
> maze of many twisting passages, composed in fun ways. For those
> situations where a list of native (e.g. /resource_providers) objects
> is returned it is simply easier to extract the info later in the
> process. For those situations where the returned data is composed on
> the fly (e.g. /allocation_candidates, /usages) we want the
> last-modified to be now() anyway, so it doesn't matter.
>
> So the concern/question is around whether people deem it a problem
> to traverse the list of objects a second time after already
> traversing them a first time to create the JSON output. If so, we
> can make the serialization loop have two purposes.
I have no problem calculating the last modified time in the
serialization loop.
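
Something along these lines is what I'd picture: a single pass that builds
the JSON-able output and tracks the newest timestamp, falling back to
created_at when updated_at is null. This is only a sketch. The Provider
tuple, the function name and the "use now() for an empty list" choice are
all assumptions for illustration, not the actual placement handler code.

```python
import collections
import datetime

# Hypothetical stand-in for an object exposing the timestamp fields.
Provider = collections.namedtuple(
    'Provider', ['uuid', 'name', 'created_at', 'updated_at'])


def serialize_with_last_modified(objs):
    """Return (list of dicts, last_modified) from one pass over objs."""
    last_modified = None
    output = []
    for obj in objs:
        output.append({'uuid': obj.uuid, 'name': obj.name})
        # updated_at is null until the first update, so fall back
        # to created_at for never-updated entities.
        stamp = obj.updated_at or obj.created_at
        if last_modified is None or stamp > last_modified:
            last_modified = stamp
    if last_modified is None:
        # Assumption: an empty collection is reported as modified "now".
        last_modified = datetime.datetime.now(datetime.timezone.utc)
    return output, last_modified
```

The caller would then format last_modified into the last-modified header
for the response.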
But then again, if the default behaviour of HTTP is to never cache
anything unless some cache-related headers are present [1] and you
*don't* want proxies to cache any placement API information, why are we
changing anything at all anyway? If we left it alone (and continued not
sending Cache-Control headers for anything), the same exact result would
be achieved, no?
Best,
-jay
[1] https://tools.ietf.org/html/rfc7234#page-5