[cinder][tooz]Lock-files are remained

Herve Beraud hberaud at redhat.com
Tue Oct 15 15:41:19 UTC 2019


Le mar. 15 oct. 2019 à 16:48, Ben Nemec <openstack at nemebean.com> a écrit :

>
>
> On 10/15/19 7:48 AM, Herve Beraud wrote:
> > I proposed some patches through heat templates and puppet-cinder to
> > remove lock files older than 1 week and avoid file system growing.
> >
> > This is a solution based on a cron job, to fix that on stable branches,
> > in a second time I'll help the fasteners project to fix the root cause
> > by reviewing and testing the proposed patch (lock based on file offset).
> > In next versions I hope we will use a patched fasteners and so we could
> > drop the cron based solution.
> >
> > Please can you give /reviews/feedbacks:
> > - https://review.opendev.org/688413
> > - https://review.opendev.org/688414
> > - https://review.opendev.org/688415
>
> I'm rather hesitant to recommend this. It looks like the change is only
> removing the -delete lock files, which are a fraction of the total lock
> files created by Cinder, and I don't particularly want to encourage
> people to start monkeying with the lock files while a service is
> running. Even with this limited set of deletions, I think we need a
> Cinder person to look and verify that we aren't making bad assumptions
> about how the locks are used.
>

Yes these changes should be validated by the cinder team.
I chosen this approach to allow use to fix that on stable branches too, and
to avoid to introduce a non backportable new feature.


>
> In essence, I don't think this is going to meaningfully reduce the
> amount of leftover lock files and it sets a bad precedent for how to
> handle them.
>
> Personally, I'd rather see a boot-time service added for each OpenStack
> service that goes out and wipes the lock file directory before starting
> the service.
>

I agree it can be an alternative to the proposed changes.
I guess it's related to some sort of puppet code too, I'm right? (the
boot-time service)

>
> On a more general note, I'm going to challenge the assertion that
> "Customer file system growing slowly and so customer risk to facing some
> issues to file system usage after a long period." I have yet to hear an
> actual bug report from the leftover lock files. Every time this comes up
> it's because someone noticed a lot of lock files and thought we were
> leaking them. I've never heard anyone report an actual functional or
> performance problem as a result of the lock files. I don't think we
> should "fix" this until someone reports that it's actually broken.
> Especially because previous attempts have all resulted in very real bugs
> that did break people.
>

Yes I agreee it's more an assumption than a reality, I never seen anybody
report a disk usage issue or things like this due to leftover lock files.


> Maybe we should have oslo.concurrency drop a file named _README (or
> something else likely to sort first in the file listing) into the
> configured lock_path that explains why the files are there and the
> proper way to deal with them.
>

Good idea.

Anyway, even if nobody facing a file system issue related to files
leftover, I think it's not a good thing to lets grow a FS, and we need to
try to address it to prevent potential file system issues related to disk
usage and lock files, but in a secure way to avoid to introduce race
conditions with cinder.

Cinder people need to confirm that my proposed changes can fit well with
cinder's mechanismes or choose a better approach.


>
> >
> > Thanks
> >
> >
> > Le lun. 30 sept. 2019 à 03:35, Rikimaru Honjo
> > <honjo.rikimaru at ntt-tx.co.jp <mailto:honjo.rikimaru at ntt-tx.co.jp>> a
> écrit :
> >
> >     On 2019/09/28 1:44, Ben Nemec wrote:
> >      >
> >      >
> >      > On 9/23/19 11:42 PM, Rikimaru Honjo wrote:
> >      >> Hi Eric,
> >      >>
> >      >> On 2019/09/20 23:10, Eric Harney wrote:
> >      >>> On 9/20/19 1:52 AM, Rikimaru Honjo wrote:
> >      >>>> Hi,
> >      >>>>
> >      >>>> I'm using Queens cinder with the following setting.
> >      >>>>
> >      >>>> ---------------------------------
> >      >>>> [coordination]
> >      >>>> backend_url = file://$state_path
> >      >>>> ---------------------------------
> >      >>>>
> >      >>>> As a result, the files like the following were remained under
> >     the state path after some operations.[1]
> >      >>>>
> >      >>>> cinder-63dacb3d-bd4d-42bb-88fe-6e4180164765-delete_volume
> >      >>>> cinder-32c426af-82b4-41de-b637-7d76fed69e83-delete_snapshot
> >      >>>>
> >      >>>> In my understanding, these are lock-files created for
> >     synchronization by tooz.
> >      >>>> But, these lock-files were not deleted after finishing
> operations.
> >      >>>> Is this behaviour correct?
> >      >>>>
> >      >>>> [1]
> >      >>>> e.g. Delete volume, Delete snapshot
> >      >>>
> >      >>> This is a known bug that's described here:
> >      >>>
> >      >>> https://github.com/harlowja/fasteners/issues/26
> >      >>>
> >      >>> (The fasteners library is used by tooz, which is used by Cinder
> >     for managing these lock files.)
> >      >>>
> >      >>> There's an old Cinder bug for it here:
> >      >>> https://bugs.launchpad.net/cinder/+bug/1432387
> >      >>>
> >      >>> but that's marked as "Won't Fix" because Cinder needs it to be
> >     fixed in the underlying libraries.
> >      >> Thank you for your explanation.
> >      >> I understood the state.
> >      >>
> >      >> But, I have one more question.
> >      >> Can I think this bug doesn't affect synchronization?
> >      >
> >      > It does not. In fact, it's important to not remove lock files
> >     while a service is running or you can end up with synchronization
> >     issues.
> >      >
> >      > To clean up the leftover lock files, we generally recommend
> >     clearing the lock_path for each service on reboot before the
> >     services have started.
> >
> >     Thank you for your information.
> >     I think that I understood this issue completely.
> >
> >     Best Regards,
> >
> >
> >      >>
> >      >> Best regards,
> >      >>
> >      >>> Thanks,
> >      >>> Eric
> >      >>>
> >      >>
> >      >
> >
> >     --
> >     _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
> >     Rikimaru Honjo
> >     E-mail:honjo.rikimaru at ntt-tx.co.jp
> >     <mailto:E-mail%3Ahonjo.rikimaru at ntt-tx.co.jp>
> >
> >
> >
> >
> > --
> > Hervé Beraud
> > Senior Software Engineer
> > Red Hat - Openstack Oslo
> > irc: hberaud
> > -----BEGIN PGP SIGNATURE-----
> >
> > wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+
> > Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+
> > RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP
> > F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G
> > 5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g
> > glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw
> > m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ
> > hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0
> > qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y
> > F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3
> > B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O
> > v6rDpkeNksZ9fFSyoY2o
> > =ECSj
> > -----END PGP SIGNATURE-----
> >
>


-- 
Hervé Beraud
Senior Software Engineer
Red Hat - Openstack Oslo
irc: hberaud
-----BEGIN PGP SIGNATURE-----

wsFcBAABCAAQBQJb4AwCCRAHwXRBNkGNegAALSkQAHrotwCiL3VMwDR0vcja10Q+
Kf31yCutl5bAlS7tOKpPQ9XN4oC0ZSThyNNFVrg8ail0SczHXsC4rOrsPblgGRN+
RQLoCm2eO1AkB0ubCYLaq0XqSaO+Uk81QxAPkyPCEGT6SRxXr2lhADK0T86kBnMP
F8RvGolu3EFjlqCVgeOZaR51PqwUlEhZXZuuNKrWZXg/oRiY4811GmnvzmUhgK5G
5+f8mUg74hfjDbR2VhjTeaLKp0PhskjOIKY3vqHXofLuaqFDD+WrAy/NgDGvN22g
glGfj472T3xyHnUzM8ILgAGSghfzZF5Skj2qEeci9cB6K3Hm3osj+PbvfsXE/7Kw
m/xtm+FjnaywZEv54uCmVIzQsRIm1qJscu20Qw6Q0UiPpDFqD7O6tWSRKdX11UTZ
hwVQTMh9AKQDBEh2W9nnFi9kzSSNu4OQ1dRMcYHWfd9BEkccezxHwUM4Xyov5Fe0
qnbfzTB1tYkjU78loMWFaLa00ftSxP/DtQ//iYVyfVNfcCwfDszXLOqlkvGmY1/Y
F1ON0ONekDZkGJsDoS6QdiUSn8RZ2mHArGEWMV00EV5DCIbCXRvywXV43ckx8Z+3
B8qUJhBqJ8RS2F+vTs3DTaXqcktgJ4UkhYC2c1gImcPRyGrK9VY0sCT+1iA+wp/O
v6rDpkeNksZ9fFSyoY2o
=ECSj
-----END PGP SIGNATURE-----
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20191015/91264483/attachment-0001.html>


More information about the openstack-discuss mailing list