[openstack-dev] [nova] [placement] [api] cache headers in placement service

Jay Pipes jaypipes at gmail.com
Mon Aug 21 09:35:13 UTC 2017


On 08/21/2017 04:59 AM, Chris Dent wrote:
> On Sun, 20 Aug 2017, Jay Pipes wrote:
>> On 08/18/2017 01:23 PM, Chris Dent wrote:
>>> So my change above adds 'last-modified' and 'cache-control:
>>> no-cache' to GET of /resource_providers and
>>> /resource_providers/{uuid} and proposes we do it for everything
>>> else.
>>>
>>> Should we?
>>
>> No. :) Not everything. In particular, I think both the GET 
>> /resource_classes and GET /traits URIs are very cacheable and we 
>> shouldn't disallow proxies from caching that content if they want to.
> 
> Except that unless we have cache validation handling on the server
> side, which we don't, "very cacheable" depends on us setting a
> max-age, and coming to agreement over what the right max-age is
> seems unlikely. The simpler solution is to not cache.

We do have cache validation on the server side for resource classes. Any 
time a resource class is added or deleted, we call _RC_CACHE.clear(). 
Couldn't we add a single attribute to the ResourceClassCache that 
returns the last time the cache was reset?
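
A minimal sketch of that single-attribute idea, not the actual nova code: 
the class below is a toy stand-in for ResourceClassCache, and the 
last_reset attribute name is hypothetical.

```python
import datetime


class ResourceClassCache:
    """Toy stand-in for the real cache; only the timestamp idea is shown."""

    def __init__(self):
        self._entries = {}
        # Hypothetical attribute: the last time the cache was reset.
        self.last_reset = datetime.datetime.now(datetime.timezone.utc)

    def clear(self):
        # Called whenever a resource class is added or deleted.
        self._entries.clear()
        self.last_reset = datetime.datetime.now(datetime.timezone.utc)
```

A handler for GET /resource_classes could then use last_reset as its 
last-modified value, assuming every add or delete really does go through 
clear().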

But meh, you're right that the simpler solution is just to not do HTTP 
caching.

>>> If we do, some things to think about:
>>>
>>> * The related OVO will need the updated_at and created_at
>>>    fields exposed. This is pretty easy to do with the
>>>    NovaTimestampObject mixin. This doesn't require an object
>>>    version bump because we don't do RPC with them.
>>
>> Technically, only the updated_at field would need to be exposed via 
>> the OVO objects. But OK, sure. I'd even advocate a patch to OVO that 
>> would bring in the NovaTimestampObject mixin. Just call it Timestamped 
>> or something...
> 
> The way the database tables are currently set up, when an entity is
> first created, created_at is set, and updated_at is null. Therefore,
> on new entities, failing over to created_at when updated_at is null
> is necessary.
> 
> The work I've done thus far has tried to have the smallest impact on
> the database tables and the queries used to get at them. They're
> already complex enough.
> 
> The entity tables already have created_at and updated_at columns.
> Exposing those columns on the objects is a matter of adding the
> mixin.

Right.

> I agree that making a change on OVO to have a Timestamped would be
> useful.
> 
>>> * The current implementation of getting the last modified time for a
>>>    collection of resources is intentionally naive and decoupled from
>>>    other stuff. For very large result sets[3] this might be annoying,
>>>    but since we are already doing plenty of traversing of long lists,
>>>    it may not be a big deal. If it is we can incorporate getting the
>>>    last modified time in the loop that serializes objects to JSON
>>>    output.
>>
>> I'm not sure what you're referring to above as "intentionally naive 
>> and decoupled from other stuff"? Adding the updated_at field of the 
>> underlying DB tables would be trivial -- maybe 10-15 lines total for 
>> DB/object layer and REST API as well. Am I missing something?
> 
> By "other stuff" I mean two things:
> 
> * the code in nova/objects/resource_provider.py
> * the serialization (to JSON) code in placement/handlers/*.py
> 
> For those requests that return collections, we _could_ adapt the
> queries used to retrieve those resources to find us the max
> updated_at time during the query.

No, I don't recommend that... just return the updated_at and created_at 
fields.

> Or we could also do the same while traversing the list of objects to
> create the JSON output.

Yeah, that's fine.

> I've chosen not to do the DB/object side changes because that is a
> maze of many twisting passages, composed in fun ways. For those
> situations where a list of native (e.g. /resource_providers) objects
> is returned, it is simply easier to extract the info later in the
> process. For those situations where the returned data is composed on
> the fly (e.g. /allocation_candidates, /usages) we want the
> last-modified to be now() anyway, so it doesn't matter.
> 
> So the concern/question is around whether people deem it a problem
> to traverse the list of objects a second time after already
> traversing them a first time to create the JSON output. If so, we
> can make the serialization loop have two purposes.

I have no problem calculating the last modified time in the 
serialization loop.
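
Roughly, as a sketch only: the function name and the uuid/name fields are 
made up for illustration, and the timestamps are assumed to be 
timezone-aware UTC datetimes. It falls back to created_at when updated_at 
is null, as discussed above.

```python
import datetime
from email.utils import format_datetime


def serialize_with_last_modified(objs):
    """Build the JSON-ready list and find the newest timestamp in one pass."""
    last_modified = None
    output = []
    for obj in objs:
        output.append({'uuid': obj.uuid, 'name': obj.name})
        # New entities have updated_at == None, so fall back to created_at.
        stamp = obj.updated_at or obj.created_at
        if last_modified is None or stamp > last_modified:
            last_modified = stamp
    if last_modified is None:
        # Empty collection: nothing to compare, so use the current time.
        last_modified = datetime.datetime.now(datetime.timezone.utc)
    # Format as an HTTP date, e.g. for a last-modified header.
    return output, format_datetime(last_modified, usegmt=True)
```

That keeps the DB queries untouched and avoids a second traversal of the 
list.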

But then again, if the default behaviour of HTTP is to never cache 
anything unless some cache-related headers are present [1] and you 
*don't* want proxies to cache any placement API information, why are we 
changing anything at all anyway? If we left it alone (and continued not 
sending Cache-Control headers for anything), the same exact result would 
be achieved, no?

Best,
-jay

[1] https://tools.ietf.org/html/rfc7234#page-5


