[OpenStack-Infra] Zuul memory leak

James E. Blair corvus at inaugust.com
Wed Feb 10 16:57:19 UTC 2016


Michael Still <mikal at stillhq.com> writes:

> On Tue, Feb 9, 2016 at 4:59 AM, Joshua Hesketh <joshua.hesketh at gmail.com>
> wrote:
>
>> On Thu, Feb 4, 2016 at 2:44 AM, James E. Blair <corvus at inaugust.com>
>> wrote:
>>>
>>> On the subject of clearing the cache more often, I think we may not want
>>> to wipe out the cache more often than we do now -- in fact, I think we
>>> may want to look into ways to keep from doing even that, because
>>> whenever we reload now, Zuul slows down considerably as it has to query
>>> Gerrit again for all of the data previously in its cache.
>>>
>>
>> I can see a lot of 3rd parties or simpler CIs not needing to reload Zuul
>> very often so this cache would never get cleared. Perhaps cached objects
>> should have an expiry time (of a day or so) and can be cleaned up
>> periodically? Additionally if clearing the cache on a reload is causing
>> pain maybe we should move the cache into the scheduler and keep it between
>> reloads?
>>
>
> Do you guys use oslo at all? I ask because the oslo memcache stuff does
> exactly this, so it should be trivial to implement if you don't mind
> depending on oslo.

One of the main things we use the cache for is to ensure that every
change is represented by a single Change object in Zuul's memory.  The
enqueued Items form a graph: each Item links to its Change, and Changes
may link to each other due to dependencies.  When something changes in
Gerrit, we
want that reflected immediately and consistently in all of the objects
in that graph.  Using the cache means that every time we add a new
Change object to that graph, we use the same object for a given change.

This is why we can't use time-based expiry -- we must not drop objects
from the cache if they are still in the graph.  Otherwise we will create
new duplicate objects, and the ones still in the graph will not be
updated.
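
To make the failure mode concrete, here is a minimal sketch of an
identity-preserving cache and what happens when an entry is dropped while
something still references it.  This is not Zuul's actual code: the
Change and ChangeCache classes, the expire() method, and the change
number 1234 are all hypothetical names chosen for illustration.

```python
class Change:
    """Hypothetical stand-in for a Zuul Change object."""

    def __init__(self, number):
        self.number = number
        self.status = "NEW"


class ChangeCache:
    """Maps a change number to the single object that represents it."""

    def __init__(self):
        self._changes = {}

    def get(self, number):
        # Hand back the existing object when there is one, so that every
        # caller ends up sharing the same instance.
        change = self._changes.get(number)
        if change is None:
            change = Change(number)
            self._changes[number] = change
        return change

    def expire(self, number):
        # Simulates time-based expiry: the entry is dropped no matter
        # who still holds a reference to the object.
        self._changes.pop(number, None)


cache = ChangeCache()
in_graph = cache.get(1234)           # held by an enqueued item
also_in_graph = cache.get(1234)      # a dependent item asks for it too
same_before = in_graph is also_in_graph

cache.expire(1234)                   # expiry fires while still in use
fresh = cache.get(1234)              # a duplicate object is created
fresh.status = "MERGED"              # the Gerrit update lands on the duplicate

same_after = in_graph is fresh
stale_status = in_graph.status       # the object in the graph never sees it
```

Before the expiry both lookups return the same object; afterwards the
object still held by the graph and the freshly cached one diverge, and
only the new one receives the update.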

Perhaps we should change these objects to something more ephemeral that
can proxy for some other mechanism that can operate more like a
traditional cache (with time-based expiry).  But I think changes to this
system should happen in Zuulv3 -- it works well enough for Zuulv2 for
now.
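
One possible reading of the "ephemeral proxy" idea, sketched under loud
assumptions: the objects in the graph hold only a change number and read
their data through an ordinary expiring store on each access.  Nothing
here is a Zuul design; TTLStore, ChangeRef, fake_gerrit_query, and the
60-second TTL are hypothetical.

```python
import time


class TTLStore:
    """Hypothetical traditional cache: entries expire after ttl seconds."""

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch      # called on a miss to reload the data
        self._ttl = ttl
        self._clock = clock
        self._entries = {}       # key -> (expiry_time, data)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is None or entry[0] <= now:
            # Missing or expired: reload and store with a new expiry time.
            entry = (now + self._ttl, self._fetch(key))
            self._entries[key] = entry
        return entry[1]

    def put(self, key, data):
        # An update replaces the stored data for every reader at once.
        self._entries[key] = (self._clock() + self._ttl, data)


class ChangeRef:
    """Ephemeral proxy: holds only the key and reads through the store."""

    def __init__(self, store, number):
        self._store = store
        self.number = number

    @property
    def status(self):
        return self._store.get(self.number)["status"]


def fake_gerrit_query(number):
    # Placeholder for a real query; always reports a new change.
    return {"status": "NEW"}


store = TTLStore(fake_gerrit_query, ttl=60.0)
ref_a = ChangeRef(store, 1234)
ref_b = ChangeRef(store, 1234)       # a distinct proxy for the same change
status_before = ref_a.status

store.put(1234, {"status": "MERGED"})
status_after = (ref_a.status, ref_b.status)
```

Because the proxies carry no state of their own, it no longer matters
whether two of them are the same object, and the store behind them is
free to expire entries on a timer.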

-Jim


