[openstack-dev] [oslo] memoizer aka cache

Henry Gessau gessau at cisco.com
Thu Jan 23 16:42:23 UTC 2014


Top posting to point out that:

In Python 3 (3.2 and later) there is a generic memoizer in functools called lru_cache.
And here is a backport to Python 2.7:
  https://pypi.python.org/pypi/functools32

That leaves Python 2.6. Maybe some clever wrapping in Oslo can make it
available to all versions?
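
For anyone who has not used it, here is a minimal sketch of lru_cache on
Python 3. The function body and the `calls` list are made up purely for
illustration; only functools.lru_cache itself is the real API.

```python
import functools

calls = []  # hypothetical: records each time the expensive body really runs


@functools.lru_cache(maxsize=128)
def my_thing_method(some_arg):
    calls.append(some_arg)  # stand-in for "do expensive work"
    return some_arg * 2


my_thing_method(21)  # computed
my_thing_method(21)  # served from the cache; the body is not re-run
```

Two caveats: arguments must be hashable, and lru_cache evicts by size
(least recently used), not by age, so on its own it does not give the
time-limited behaviour asked for below.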

On Thu, Jan 23, at 11:07 am, Shawn Hartsock <hartsock at acm.org> wrote:

> I would like to have us adopt a memoizing caching library of some kind
> for use with OpenStack projects. I have no strong preference at this
> time and I would like suggestions on what to use.
> 
> I have seen a number of patches where people have begun to implement
> their own caches in dictionaries. This typically confuses the code and
> mixes issues of correctness and performance in code.
> 
> Here's an example:
> 
> We start with:
> 
> def my_thing_method(some_args):
>     # do expensive work
>     return value
> 
> ... but a performance problem is detected... maybe the method is
> called 15 times in 10 seconds but then not again for 5 minutes and the
> return value can only logically change every minute or two... so we
> end up with ...
> 
> _GLOBAL_THING_CACHE = {}
> 
> def my_thing_method(some_args):
>     key = key_from(some_args)
>     if key in _GLOBAL_THING_CACHE:
>         return _GLOBAL_THING_CACHE[key]
>     else:
>         # do expensive work
>         _GLOBAL_THING_CACHE[key] = value
>         return value
> 
> ... which is all well and good... but now as a maintenance programmer
> I need to comprehend the cache mechanism, when cached values are
> invalidated, and if I need to debug the "do expensive work" part I
> need to tease out some test that prevents the cache from being hit.
> Plus I've introduced a new global variable. We love globals, right?
> 
> I would like us to be able to say:
> 
> @memoize(seconds=10)
> def my_thing_method(some_args):
>     # do expensive work
>     return value
> 
> ... where we're clearly addressing the performance issue by
> introducing a cache and limiting its possible impact to 10 seconds
> which allows for the idea that "do expensive work" has network calls
> to systems that may change state outside of this Python process.
> 
> I'd like to see this done because I would like to have a place to
> point developers to during reviews... to say: use "common/memoizer" or
> use "Bob's awesome memoizer" because Bob has worked out all the cache
> problems already and you can just use it instead of worrying about
> introducing new bugs by building your own cache.
> 
> Does this make sense? I'd love to contribute something... but I wanted
> to understand why this state of affairs has persisted for a number of
> years... is there something I'm missing?
> 
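
To make the @memoize(seconds=10) idea above concrete, here is a rough
sketch of what such a decorator could look like. This is not an existing
Oslo API: the name `memoize`, the `clock` parameter and the `cache_clear`
attribute are all assumptions of mine, and it assumes Python 3.3+ for
time.monotonic.

```python
import functools
import time


def memoize(seconds, clock=time.monotonic):
    """Hypothetical decorator: cache results per argument set for `seconds`."""
    def decorator(func):
        cache = {}  # key -> (expiry_time, value)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Arguments must be hashable to serve as a dict key.
            key = (args, tuple(sorted(kwargs.items())))
            now = clock()
            hit = cache.get(key)
            if hit is not None and hit[0] > now:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args, **kwargs)
            cache[key] = (now + seconds, value)
            return value

        # Lets tests and debugging sessions bypass stale entries.
        wrapper.cache_clear = cache.clear
        return wrapper
    return decorator


@memoize(seconds=10)
def my_thing_method(some_args):
    # Stand-in for "do expensive work".
    return len(some_args)
```

The sketch is not thread-safe and never prunes expired entries, so the
dict can grow without bound. Those are exactly the kind of details a
shared library should work out once. Passing a fake `clock` and calling
`cache_clear()` is one way to exercise the "do expensive work" part
without fighting the cache.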


