[cinder][tooz]Lock-files are remained

Ben Nemec openstack at nemebean.com
Tue Oct 15 17:00:38 UTC 2019

On 10/15/19 10:24 AM, Cédric Jeanneret wrote:
> On 10/15/19 4:47 PM, Ben Nemec wrote:
>> On 10/15/19 7:48 AM, Herve Beraud wrote:
>>> I proposed some patches through heat templates and puppet-cinder to 
>>> remove lock files older than 1 week and avoid file system growing.
>>> This is a solution based on a cron job, to fix that on stable 
>>> branches, in a second time I'll help the fasteners project to fix the 
>>> root cause by reviewing and testing the proposed patch (lock based on 
>>> file offset). In next versions I hope we will use a patched fasteners 
>>> and so we could drop the cron based solution.
>>> Please can you give /reviews/feedbacks:
>>> - https://review.opendev.org/688413
>>> - https://review.opendev.org/688414
>>> - https://review.opendev.org/688415
>> I'm rather hesitant to recommend this. It looks like the change is 
>> only removing the -delete lock files, which are a fraction of the 
>> total lock files created by Cinder, and I don't particularly want to 
>> encourage people to start monkeying with the lock files while a 
>> service is running. Even with this limited set of deletions, I think 
>> we need a Cinder person to look and verify that we aren't making bad 
>> assumptions about how the locks are used.
> I'm also not that happy with the cron way - but apparently it might 
> create some issues in some setup with the current way things are done:
> - inodes aren't infinit on ext* FS (ext3, ext4, blah) - see bellow for 
> context
> - perfs might be bad (see bellow for context)
> So one way or another, cleanup is needed.
>> In essence, I don't think this is going to meaningfully reduce the 
>> amount of leftover lock files and it sets a bad precedent for how to 
>> handle them.
> The filter is strict - on purpose, to address this specific issue. Of 
> course, we might want to loosen it, but... do we really want that? IF 
> we're to go with the cron thingy of course. Some more thinking is needed 
> I guess.
>> Personally, I'd rather see a boot-time service added for each 
>> OpenStack service that goes out and wipes the lock file directory 
>> before starting the service.
> Well.... former sysops here: don't count on regular reboot - once a year 
> is a fair average - and it's usually due to some power cut... Sad world, 
> I know ;).
> So a "boot-time cleanup" will help. A little. And wouldn't hurt anyway. 
> So +1 for that idea, but I wouldn't rely only on it. And there might be 
> some issues (see bellow)

I understand that it doesn't happen regularly, but it's the easiest to 
automate safe time to clean locks. It can also be done when the service 
is down for maintenance, but even then you need to be careful because if 
you wipe the cinder locks when cinder-volume gets restarted, but 
cinder-scheduler is still running you might wipe an in-use lock. I don't 
know if that specific scenario is possible, but the point is that any 
process that could hold a lock needs to be down before you clear the 
lock directory. Since most OpenStack services have multiple separate OS 
services running that complicates the process.

>> On a more general note, I'm going to challenge the assertion that 
>> "Customer file system growing slowly and so customer risk to facing some
>> issues to file system usage after a long period." I have yet to hear 
>> an actual bug report from the leftover lock files. Every time this 
>> comes up it's because someone noticed a lot of lock files and thought 
>> we were leaking them. I've never heard anyone report an actual 
>> functional or performance problem as a result of the lock files. I 
>> don't think we should "fix" this until someone reports that it's 
>> actually broken. Especially because previous attempts have all 
>> resulted in very real bugs that did break people.
> I'm not on your side here. Waiting to get a fire before thinking about 
> correction and counter-measures isn't good. Since we know there's an 
> issue, and that, eventually, it might be a really big one, it would be 
> good to address it before it explodes in our face.

If you have a solution that fixes the problem without introducing 
concurrency problems then I'm all ears. :-)

Until then, I'm not comfortable fixing a hypothetical problem by 
creating very real new problems.

> The disk *space* is probably not the issue. File with 1k, on a couple of 
> gigas, it's good.
> But there are other concerns:
> - inodes. Yes, we're supposed to get things on XFS, and that dude 
> doesn't have inodes. But some ops might want to rely on the good(?) old 
> ext3, or ext4. There, we might get some issues, and pretty quickly 
> depending on the speed of lock creation (so, linked to cinder actions in 
> this case). Or it might be some NFS with an ext4 FS.
> - rm limits: you probably never ever hit "rm argument list limit". But 
> it does exist, and I did hit it (like 15 years ago - maybe it's sorted 
> out, but I have some doubts). This means that rm will NOT be able to 
> cope with the directory content after a certain amount (which is huge, 
> right, but still... we're talking about some long-lasting process 
> filling a directory).
> Of course, "find" might be the right tool in such case, but it will take 
> a long, long time to delete (thinking about the "boot-time cleanup 
> proposal" for instance). >
> - performances: it might impact the system, for instance if one has some 
> backup process running and wanting to eat the /var/lib/cinder directory: 
> if the op doesn't know about this issue, they might get some surprises 
> with long, long, loooong running backups.
> With a target on some ext4 - hello Inodes!

Sure, but these are all still theoretical problems. I've _never_ heard 
of anyone running into them, and we have some pretty big OpenStack 
deployments in the wild.

I feel like at one point I did the math to figure out what it would take 
to hit an inode limit because of lock files, and it was fairly absurd. 
Like you would have to leave a deployment running for a decade under 
heavy use to actually get there. I don't still have those numbers handy, 
but it might be a useful exercise to look at that again.

And this reminds me of another thing that has been suggested in the past 
to address the lock file cleanup issue (we should really write all of 
this down if we haven't already...), which is to put them on tmpfs. That 
way they get cleared automatically on reboot and you don't have to 
manage anything. Locks don't persist over reboots anyway so it doesn't 
matter that it's on volatile storage. The whole file thing is actually a 
consequence of Linux IPC being complete garbage, not because we want 
persistent storage of locks.

> Sooo... yeah. I think this issue should be addressed. Really. But I +1 
> the fact that it should be done "the right way", whatever it is. The 
> "cron" might be the wrong one. Or not. We need some more feedbacks on 
> that :).

Patches welcome. But fair warning: This problem is a lot harder than it 
looks on the surface. Many solutions have been proposed over the years, 
all of them were worse than what we have now.

>> Maybe we should have oslo.concurrency drop a file named _README (or 
>> something else likely to sort first in the file listing) into the 
>> configured lock_path that explains why the files are there and the 
>> proper way to deal with them.
> Hmmm... who reads README anyway? Like, really. Better getting some 
> cleanup un place to avoid questions ;).

If it heads off even one of these threads that happen every few months 
then it will have been worth it. :-D

> Cheers,
> C.
>>> Thanks
>>> Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo 
>>> <honjo.rikimaru at ntt-tx.co.jp <mailto:honjo.rikimaru at ntt-tx.co.jp>> a 
>>> écrit :
>>>     On 2019/09/28 1:44, Ben Nemec wrote:
>>>      >
>>>      >
>>>      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote:
>>>      >> Hi Eric,
>>>      >>
>>>      >> On 2019/09/20 23:10, Eric Harney wrote:
>>>      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote:
>>>      >>>> Hi,
>>>      >>>>
>>>      >>>> I'm using Queens cinder with the following setting.
>>>      >>>>
>>>      >>>> ---------------------------------
>>>      >>>> [coordination]
>>>      >>>> backend_url = file://$state_path
>>>      >>>> ---------------------------------
>>>      >>>>
>>>      >>>> As a result, the files like the following were remained under
>>>     the state path after some operations.[1]
>>>      >>>>
>>>      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume
>>>      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot
>>>      >>>>
>>>      >>>> In my understanding, these are lock-files created for
>>>     synchronization by tooz.
>>>      >>>> But, these lock-files were not deleted after finishing 
>>> operations.
>>>      >>>> Is this behaviour correct?
>>>      >>>>
>>>      >>>> [1]
>>>      >>>> e.g. Delete volume, Delete snapshot
>>>      >>>
>>>      >>> This is a known bug that's described here:
>>>      >>>
>>>      >>> https://github.com/harlowja/fasteners/issues/26
>>>      >>>
>>>      >>> (The fasteners library is used by tooz, which is used by Cinder
>>>     for managing these lock files.)
>>>      >>>
>>>      >>> There's an old Cinder bug for it here:
>>>      >>> https://bugs.launchpad.net/cinder/+bug/1432387
>>>      >>>
>>>      >>> but that's marked as "Won't Fix" because Cinder needs it to be
>>>     fixed in the underlying libraries.
>>>      >> Thank you for your explanation.
>>>      >> I understood the state.
>>>      >>
>>>      >> But, I have one more question.
>>>      >> Can I think this bug doesn't affect synchronization?
>>>      >
>>>      > It does not. In fact, it's important to not remove lock files
>>>     while a service is running or you can end up with synchronization
>>>     issues.
>>>      >
>>>      > To clean up the leftover lock files, we generally recommend
>>>     clearing the lock_path for each service on reboot before the
>>>     services have started.
>>>     Thank you for your information.
>>>     I think that I understood this issue completely.
>>>     Best Regards,
>>>      >>
>>>      >> Best regards,
>>>      >>
>>>      >>> Thanks,
>>>      >>> Eric
>>>      >>>
>>>      >>
>>>      >
>>>     --     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
>>>     Rikimaru Honjo
>>>     E-mail:honjo.rikimaru at ntt-tx.co.jp
>>>     <mailto:E-mail%3Ahonjo.rikimaru at ntt-tx.co.jp>
>>> -- 
>>> Hervé Beraud
>>> Senior Software Engineer
>>> Red Hat - Openstack Oslo
>>> irc: hberaud
>>> Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+
>>> RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP
>>> F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G
>>> 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g
>>> glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw
>>> m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ
>>> hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0
>>> qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y
>>> F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3
>>> B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O
>>> v6rDpkeNksZ9fFSyoY2o
>>> =ECSj
>>> -----END PGP SIGNATURE-----

More information about the openstack-discuss mailing list