On Tue, Oct 15, 2019 at 16:48, Ben Nemec <openstack@nemebean.com> wrote:
On 10/15/19 7:48 AM, Herve Beraud wrote:
I proposed some patches through heat templates and puppet-cinder to remove lock files older than one week and keep the file system from growing.
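Roughly, the cleanup the cron entries perform is equivalent to this sketch (the directory and pattern below are just examples on my side; the actual values are in the patches):

import glob
import os
import time

# Example values only; the real patches target the configured Cinder state path.
LOCK_DIR = "/var/lib/cinder"
MAX_AGE = 7 * 24 * 3600  # one week, as in the proposed changes

now = time.time()
for path in glob.glob(os.path.join(LOCK_DIR, "cinder-*-delete_*")):
    try:
        if now - os.path.getmtime(path) > MAX_AGE:
            os.unlink(path)
    except OSError:
        # The file may have been removed or reacquired concurrently; skip it.
        pass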
This is a cron-based solution so that we can also fix this on stable branches. As a second step, I'll help the fasteners project fix the root cause by reviewing and testing the proposed patch (locking based on a file offset). In future versions I hope we will use a patched fasteners and can then drop the cron-based solution.
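For context, my understanding of the fasteners proposal (I may be wrong, see the upstream issue for details) is that all locks would share a single file and each lock would only take a byte range at a fixed offset, so nothing is left behind per operation. A rough sketch of the idea:

import fcntl
import os

# Hypothetical single shared lock file; the real fasteners patch may differ.
SHARED_LOCK_FILE = "/var/lib/cinder/locks"

def acquire_at_offset(offset):
    fd = os.open(SHARED_LOCK_FILE, os.O_RDWR | os.O_CREAT, 0o644)
    # Lock a single byte at the given offset instead of creating one file per lock.
    fcntl.lockf(fd, fcntl.LOCK_EX, 1, offset)
    return fd

def release_at_offset(fd, offset):
    fcntl.lockf(fd, fcntl.LOCK_UN, 1, offset)
    os.close(fd)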
Could you please give reviews/feedback:
- https://review.opendev.org/688413
- https://review.opendev.org/688414
- https://review.opendev.org/688415
I'm rather hesitant to recommend this. It looks like the change is only removing the -delete lock files, which are a fraction of the total lock files created by Cinder, and I don't particularly want to encourage people to start monkeying with the lock files while a service is running. Even with this limited set of deletions, I think we need a Cinder person to look and verify that we aren't making bad assumptions about how the locks are used.
Yes, these changes should be validated by the Cinder team. I chose this approach so that we can fix this on stable branches too, and to avoid introducing a non-backportable new feature.
In essence, I don't think this is going to meaningfully reduce the amount of leftover lock files and it sets a bad precedent for how to handle them.
Personally, I'd rather see a boot-time service added for each OpenStack service that goes out and wipes the lock file directory before starting the service.
I agree it could be an alternative to the proposed changes. I guess the boot-time service would also involve some sort of puppet code, am I right?
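If that's the direction, I imagine something like this running before each service starts (e.g. from a systemd ExecStartPre or a small oneshot unit); the lock_path below is just an example, each service would use its own configured [oslo_concurrency] lock_path. Unlike the cron approach, it could safely remove everything, since nothing is running yet:

import glob
import os

# Example only; use the service's configured lock_path.
LOCK_PATH = "/var/lib/cinder/tmp"

# Safe to wipe everything here because the service has not started yet.
for path in glob.glob(os.path.join(LOCK_PATH, "*")):
    if os.path.isfile(path):
        os.unlink(path)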
On a more general note, I'm going to challenge the assertion that "Customer file system growing slowly and so customer risk to facing some issues to file system usage after a long period." I have yet to hear an actual bug report from the leftover lock files. Every time this comes up it's because someone noticed a lot of lock files and thought we were leaking them. I've never heard anyone report an actual functional or performance problem as a result of the lock files. I don't think we should "fix" this until someone reports that it's actually broken. Especially because previous attempts have all resulted in very real bugs that did break people.
Yes, I agree it's more an assumption than a reality; I have never seen anybody report a disk usage issue or anything like that due to leftover lock files.
Maybe we should have oslo.concurrency drop a file named _README (or something else likely to sort first in the file listing) into the configured lock_path that explains why the files are there and the proper way to deal with them.
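Something along these lines, just a sketch; the file name and wording are placeholders:

import os

README_NAME = "_README"  # placeholder name, chosen to sort first in listings
README_TEXT = (
    "The files in this directory are lock files created by OpenStack services\n"
    "via oslo.concurrency. Leftover files are harmless and are reused on the\n"
    "next run. Do not delete them while the owning service is running; clear\n"
    "this directory only on reboot, before the services start.\n"
)

def ensure_readme(lock_path):
    # Drop an explanatory file into lock_path if it is not already there.
    readme = os.path.join(lock_path, README_NAME)
    if not os.path.exists(readme):
        with open(readme, "w") as f:
            f.write(README_TEXT)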
Good idea. Anyway, even if nobody is facing a file system issue related to leftover files, I don't think it's good to let a file system grow, and we should try to address it to prevent potential disk usage issues caused by lock files, but in a safe way that avoids introducing race conditions with Cinder. Cinder people need to confirm that my proposed changes fit well with Cinder's mechanisms, or choose a better approach.
Thanks
On Mon, Sep 30, 2019 at 03:35, Rikimaru Honjo <honjo.rikimaru@ntt-tx.co.jp> wrote:
On 2019/09/28 1:44, Ben Nemec wrote:
>
> On 9/23/19 11:42 PM, Rikimaru Honjo wrote:
>> Hi Eric,
>>
>> On 2019/09/20 23:10, Eric Harney wrote:
>>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote:
>>>> Hi,
>>>>
>>>> I'm using Queens cinder with the following setting.
>>>>
>>>> ---------------------------------
>>>> [coordination]
>>>> backend_url = file://$state_path
>>>> ---------------------------------
>>>>
>>>> As a result, files like the following remained under the state path after some operations. [1]
>>>>
>>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume
>>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot
>>>>
>>>> In my understanding, these are lock files created for synchronization by tooz.
>>>> But these lock files were not deleted after the operations finished.
>>>> Is this behaviour correct?
>>>>
>>>> [1]
>>>> e.g. Delete volume, Delete snapshot
>>>
>>> This is a known bug that's described here:
>>>
>>> https://github.com/harlowja/fasteners/issues/26
>>>
>>> (The fasteners library is used by tooz, which is used by Cinder for managing these lock files.)
>>>
>>> There's an old Cinder bug for it here:
>>> https://bugs.launchpad.net/cinder/+bug/1432387
>>>
>>> but that's marked as "Won't Fix" because Cinder needs it to be fixed in the underlying libraries.
>>
>> Thank you for your explanation.
>> I understood the state.
>>
>> But, I have one more question.
>> Can I assume this bug doesn't affect synchronization?
>
> It does not. In fact, it's important not to remove lock files while a service is running, or you can end up with synchronization issues.
>
> To clean up the leftover lock files, we generally recommend clearing the lock_path for each service on reboot, before the services have started.
Thank you for the information. I think I now understand this issue completely.
Best Regards,
--
Rikimaru Honjo
E-mail: honjo.rikimaru@ntt-tx.co.jp
--
Hervé Beraud
Senior Software Engineer
Red Hat - Openstack Oslo
irc: hberaud