[openstack-dev] [oslo] memoizer aka cache

Renat Akhmerov rakhmerov at mirantis.com
Fri Jan 24 18:14:14 UTC 2014


Joining in on sharing backgrounds: I'd be happy to help here too, since I have a pretty solid background in using and developing caching solutions, although mostly in the Java world (expertise in GemFire and Coherence, and developing the GridGain distributed cache).

Renat Akhmerov
@ Mirantis Inc.



On 23 Jan 2014, at 18:38, Joshua Harlow <harlowja at yahoo-inc.com> wrote:

> Same here; I've done caching and invalidation at pretty big scale with memcache (and similar technologies) at Y! before, so I'm here to help…
> 
> From: Morgan Fainberg <m at metacloud.com>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Date: Thursday, January 23, 2014 at 4:17 PM
> To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [oslo] memoizer aka cache
> 
> Yes! There is a reason Keystone has a very small footprint of caching/invalidation done so far.  It really needs to be correct when it comes to proper invalidation logic.  I am happy to offer some help in determining logic for caching/invalidation with Dogpile.cache in mind as we get it into oslo and available for all to use.
> 
> --Morgan
> 
> 
> 
> On Thu, Jan 23, 2014 at 2:54 PM, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
>> Sure, no cancelling cases of conscious usage, but we need to be careful
>> here and make sure it's really appropriate. Caching and invalidation
>> techniques are right up there among the problems that appear easy and
>> simple to do/use at first, but doing them correctly is really, really hard
>> (especially at any type of scale).
>> 
>> -Josh
>> 
>> On 1/23/14, 1:35 PM, "Renat Akhmerov" <rakhmerov at mirantis.com> wrote:
>> 
>> >
>> >On 23 Jan 2014, at 08:41, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
>> >
>> >> So to me memoizing is typically a premature optimization in a lot of
>> >>cases. And doing it incorrectly leads to overfilling the Python
>> >>process's memory (your global dict will hold objects that can't be
>> >>garbage collected and, with enough keys and values stored, will act
>> >>just like a memory leak; basically it acts as a new GC root object in a
>> >>way) or to more cache invalidation races/inconsistencies than just
>> >>recomputing the initial value...
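
One way to keep a memoization dict from behaving like a leak is to bound its size. The following is only a minimal sketch, assuming Python 3's standard `functools.lru_cache` (which is not one of the libraries discussed in this thread); the function name `expensive_lookup` and the `maxsize` value are hypothetical and chosen purely for illustration.

```python
import functools


@functools.lru_cache(maxsize=2)  # keep at most 2 results; illustrative value
def expensive_lookup(key):
    # Stands in for the "expensive work" being memoized.
    return key.upper()


for key in ("a", "b", "c"):  # storing "c" evicts "a", the least recently used
    expensive_lookup(key)

info = expensive_lookup.cache_info()
print(info.currsize, info.misses)
```

Bounding the size caps memory use, but it does nothing about stale values; invalidation still has to be handled separately.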
>> >
>> >I agree with your concerns here. At the same time, I think this thinking
>> >shouldn't cancel cases of conscious usage of caching techniques. A decent
>> >cache implementation would help to solve lots of performance problems
>> >(which eventually become a concern for any project).
>> >
>> >> Overall though there are a few caching libraries I've seen being used,
>> >>any of which could be used for memoization.
>> >>
>> >> - https://github.com/openstack/oslo-incubator/tree/master/openstack/common/cache
>> >> - https://github.com/openstack/oslo-incubator/blob/master/openstack/common/memorycache.py
>> >
>> >I looked at the code. I have lots of questions about the implementation
>> >(such as cache eviction policies and whether or not it works well with
>> >green threads), but I think that's a subject for a separate discussion.
>> >Could you please share your experience of using it? Were there specific
>> >problems that you could point to? Maybe they are already described somewhere?
>> >
>> >> - dogpile cache @ https://pypi.python.org/pypi/dogpile.cache
>> >
>> >This one looks really interesting in terms of its claimed feature set. It
>> >seems to be compatible with Python 2.7; I'm not sure about 2.6. As above, it
>> >would be cool if you told us about your experience with it.
>> >
>> >
>> >> I am personally wary of using them for memoization. For which expensive
>> >>method calls do you see the complexity of this being useful? I didn't
>> >>think that many method calls being done in openstack warranted the
>> >>complexity added by doing this (premature optimization is the root of
>> >>all evil...). Do you have data showing where it would be
>> >>applicable/beneficial?
>> >
>> >I believe there are a great many use cases, like caching DB objects or,
>> >more generally, caching any heavy objects involving interprocess
>> >communication. For instance, API clients may cache objects that are
>> >known to be immutable on the server side.
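
The immutable-object case is the simplest one, since nothing ever needs invalidating. A rough sketch, under the assumption that the client can fetch an object by ID; `CachingClient`, `fetch`, and `fake_fetch` are hypothetical names, not part of any OpenStack client.

```python
class CachingClient:
    """Remembers objects that are known never to change on the server side."""

    def __init__(self, fetch):
        self._fetch = fetch   # hypothetical callable: object id -> object
        self._objects = {}    # object id -> previously fetched object

    def get(self, object_id):
        try:
            return self._objects[object_id]
        except KeyError:
            # First request for this id: go to the server and remember the result.
            obj = self._objects[object_id] = self._fetch(object_id)
            return obj


requests_made = []


def fake_fetch(object_id):
    # Stands in for a real network call.
    requests_made.append(object_id)
    return {"id": object_id}


client = CachingClient(fake_fetch)
first = client.get("abc")
second = client.get("abc")  # served from the cache, no second request
```

The dict here is still unbounded, so the memory concern raised earlier in the thread applies unless the set of IDs is known to be small.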
>> >
>> >
>> >>
>> >> Sent from my really tiny device...
>> >>
>> >>> On Jan 23, 2014, at 8:19 AM, "Shawn Hartsock" <hartsock at acm.org> wrote:
>> >>>
>> >>> I would like to have us adopt a memoizing caching library of some kind
>> >>> for use with OpenStack projects. I have no strong preference at this
>> >>> time and I would like suggestions on what to use.
>> >>>
>> >>> I have seen a number of patches where people have begun to implement
>> >>> their own caches in dictionaries. This typically confuses the code and
>> >>> mixes issues of correctness and performance in code.
>> >>>
>> >>> Here's an example:
>> >>>
>> >>> We start with:
>> >>>
>> >>> def my_thing_method(some_args):
>> >>>   # do expensive work
>> >>>   return value
>> >>>
>> >>> ... but a performance problem is detected... maybe the method is
>> >>> called 15 times in 10 seconds but then not again for 5 minutes and the
>> >>> return value can only logically change every minute or two... so we
>> >>> end up with ...
>> >>>
>> >>> _GLOBAL_THING_CACHE = {}
>> >>>
>> >>> def my_thing_method(some_args):
>> >>>   key = key_from(some_args)
>> >>>   if key in _GLOBAL_THING_CACHE:
>> >>>     return _GLOBAL_THING_CACHE[key]
>> >>>   else:
>> >>>     # do expensive work
>> >>>     _GLOBAL_THING_CACHE[key] = value
>> >>>     return value
>> >>>
>> >>> ... which is all well and good... but now as a maintenance programmer
>> >>> I need to comprehend the cache mechanism, when cached values are
>> >>> invalidated, and if I need to debug the "do expensive work" part I
>> >>> need to tease out some test that prevents the cache from being hit.
>> >>> Plus I've introduced a new global variable. We love globals right?
>> >>>
>> >>> I would like us to be able to say:
>> >>>
>> >>> @memoize(seconds=10)
>> >>> def my_thing_method(some_args):
>> >>>   # do expensive work
>> >>>   return value
>> >>>
>> >>> ... where we're clearly addressing the performance issue by
>> >>> introducing a cache and limiting its possible impact to 10 seconds
>> >>> which allows for the idea that "do expensive work" has network calls
>> >>> to systems that may change state outside of this Python process.
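
A minimal sketch of what a `@memoize(seconds=10)` decorator could look like, using only the Python 3 standard library. This is an illustration, not an existing oslo API: the `invalidate` and `uncached` attributes are hypothetical additions meant to address the debugging concern raised above, and the sketch assumes all arguments are hashable.

```python
import functools
import threading
import time


def memoize(seconds):
    """Cache results per argument set for at most `seconds` seconds."""
    def decorator(func):
        cache = {}  # key -> (expiry time, value)
        lock = threading.Lock()

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            key = (args, tuple(sorted(kwargs.items())))
            with lock:
                hit = cache.get(key)
                if hit is not None and hit[0] > time.monotonic():
                    return hit[1]
            # Compute outside the lock so a slow call does not block other keys.
            value = func(*args, **kwargs)
            with lock:
                cache[key] = (time.monotonic() + seconds, value)
            return value

        def invalidate():
            with lock:
                cache.clear()

        wrapper.invalidate = invalidate  # drop all cached values
        wrapper.uncached = func          # bypass the cache, e.g. in tests
        return wrapper
    return decorator


calls = []


@memoize(seconds=10)
def my_thing_method(some_args):
    calls.append(some_args)  # stands in for "do expensive work"
    return some_args * 2
```

Two limits of this sketch are worth noting: expired entries are only overwritten on the next call with the same arguments, never evicted, and concurrent callers that miss at the same time will each do the expensive work.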
>> >>>
>> >>> I'd like to see this done because I would like to have a place to
>> >>> point developers to during reviews... to say: use "common/memoizer" or
>> >>> use "Bob's awesome memoizer" because Bob has worked out all the cache
>> >>> problems already and you can just use it instead of worrying about
>> >>> introducing new bugs by building your own cache.
>> >>>
>> >>> Does this make sense? I'd love to contribute something... but I wanted
>> >>> to understand why this state of affairs has persisted for a number of
>> >>> years... is there something I'm missing?
>> >>>
>> >>> --
>> >>> # Shawn.Hartsock - twitter: @hartsock - plus.google.com/+ShawnHartsock
>> >>>
>> >>> _______________________________________________
>> >>> OpenStack-dev mailing list
>> >>> OpenStack-dev at lists.openstack.org
>> >>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
