[openstack-dev] creating a default for oslo config variables within a project?

Sean Dague sean at dague.net
Wed Dec 4 18:51:16 UTC 2013

On 12/04/2013 11:56 AM, Ben Nemec wrote:
> On 2013-12-04 06:07, Sean Dague wrote:
>> On 12/03/2013 11:21 PM, Clint Byrum wrote:
>>> Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
>>>> On 12/03/2013 06:13 PM, Ben Nemec wrote:
>>>>> On 2013-12-03 17:09, Sean Dague wrote:
>>>>>> On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
>>>>>>> On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
>>>>>>>> On 2013-12-03 15:56, Sean Dague wrote:
>>>>>>>>> This cinder patch - https://review.openstack.org/#/c/48935/
>>>>>>>>> Is blocked on failing upgrade because the updated oslo
>>>>>>>>> lockutils won't
>>>>>>>>> function until there is a specific configuration variable added
>>>>>>>>> to the
>>>>>>>>> cinder.conf.
>>>>>>>>> That work around is proposed here -
>>>>>>>>> https://review.openstack.org/#/c/52070/3
>>>>>>>>> However I think this is exactly the kind of forward breaks that we
>>>>>>>>> want
>>>>>>>>> to prevent with grenade, as cinder failing to function after a
>>>>>>>>> rolling
>>>>>>>>> upgrade because a config item wasn't added is exactly the kind
>>>>>>>>> of pain
>>>>>>>>> we are trying to prevent happening to ops.
>>>>>>>>> So the question is, how is this done correctly so that a default
>>>>>>>>> can be
>>>>>>>>> set in the cinder code for this value, and it not require a config
>>>>>>>>> change to work?
>>>>>>> You're absolutely correct, in principle - if the default value for
>>>>>>> lock_path worked for users before, we should absolutely continue to
>>>>>>> support it.
>>>>>>>> I don't know that I have a good answer on how to handle this,
>>>>>>>> but for
>>>>>>>> context this change is the result of a nasty bug in lockutils that
>>>>>>>> meant
>>>>>>>> external locks were doing nothing if lock_path wasn't set. 
>>>>>>>> Basically
>>>>>>>> it's something we should never have allowed in the first place.
>>>>>>>> As far as setting this in code, it's important that all of the
>>>>>>>> processes
>>>>>>>> for a service are using the same value to avoid the same bad
>>>>>>>> situation
>>>>>>>> we were in before.  For tests, we have a lockutils wrapper
>>>>>>>> (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
>>>>>>>> that sets an environment variable to address this, but that only
>>>>>>>> works if all of the processes are going to be spawned from within
>>>>>>>> the same wrapper, and I'm not sure how secure that is for
>>>>>>>> production
>>>>>>>> deployments since it puts all of the lock files in a temporary
>>>>>>>> directory.
>>>>>>> Right, I don't think the previous default really "worked" - if
>>>>>>> you used
>>>>>>> the default, then external locking was broken.
>>>>>>> I suspect most distros do set a default - I see RDO has this in its
>>>>>>> default nova.conf:
>>>>>>>   lock_path = /var/lib/nova/tmp
>>>>>>> So, yes - this is all terrible.
>>>>>>> IMHO, rather than raise an exception we should log a big fat warning
>>>>>>> about relying on the default and perhaps just treat the lock as an
>>>>>>> in-process lock in that case ... since that's essentially what it
>>>>>>> was
>>>>>>> before, right?
>>>>>> So a default of lock_path = /tmp will work (FHS says that path has
>>>>>> to be
>>>>>> there), even if not optimal. Could we make it a default value like
>>>>>> that
>>>>>> instead of the current default which is null (and hence the problem).
>>>>> IIRC, my initial fix was something similar to that, but it got shot
>>>>> down
>>>>> because putting the lock files in a known world writeable location
>>>>> was a
>>>>> security issue.
>>>>> Although maybe if we put them in a subdirectory of /tmp and ensured
>>>>> that
>>>>> the permissions were such that only the user running the service could
>>>>> use that directory, it might be acceptable?  We could still log a
>>>>> warning if we wanted.
>>>>> This seems like it would have implications for people running services
>>>>> on Windows too, but we can probably find a way to make that work if we
>>>>> decide on a solution.
>>>> How is that a security issue? Are the lock files being written with
>>>> some
>>>> sensitive data in them and have g or o permissions on? The sticky bit
>>>> (+t) on /tmp will prevent other users from deleting the file.
>>> Right, but it won't prevent users from creating a symlink with the same
>>> name.
>>> ln -s /var/lib/nova/instances/xxxxx/image.raw /tmp/well.known.location
>>> Now when you do
>>> with open('/tmp/well.known.location', 'w') as lockfile:
>>>   lockfile.write('Stuff')
>>> Nova has just truncated the image file and written 'Stuff' to it.
>> So that's the generic case (and the way people often write this). But
>> the oslo lockutils code doesn't work that way. While it does open the
>> file for write, it does not actually write, it's using fcntl to hold
>> locks. That's taking a data structure on the fd in kernel memory (IIRC),
>> so it correctly gives it up if the process crashes.
>> I'm not saying there isn't some other possible security vulnerability
>> here as well, but it's not jumping out at me. So I'd love to understand
>> that, because if we can close that exposure we can provide a working
>> default, plus a strong recommendation for how to do that *right*. I'd be
>> totally happy with printing WARNING level at startup if lock_path = /tmp
>> that this should be adjusted.
> Full disclosure: I don't claim to be a security expert, so take my
> thoughts on this with a grain of salt.
> tldr: I still don't see a way to do this without breaking _something_.
> Unfortunately, while we don't actually write to the file, just opening
> it for write access truncates it.  So there remains the issue Clint
> raised if someone can make that file a link.  I suppose we could check
> for the bad things people might do, but I'm pretty sure that way lies
> madness.

Fair point, more coffee before I make statements like that I guess :).

Follow up question: could we change it to open in "a" mode instead.
Still requires write access, but won't truncate.

> As a followup to my suggestion to have lockutils create something like
> /tmp/oslo-lock-path and make it accessible only to the owner, I think
> there's still a problem because a malicious user could create that
> directory ahead of time and when OpenStack is run for the first time it
> would happily use the existing directory and potentially overwrite any
> linked files in it.  There might be ways to avoid that, but I suspect
> they would be racy (which I believe is why you're not supposed to use
> tmp paths that are known ahead of time).  Granted, I'm not sure you
> should be installing to a system that already has malicious users on it,
> but I suspect security people would be less than thrilled with that answer.
> So the workable options that I see right now are to have it break on
> upgrade if lock_path is not set, or continue to break external locking
> by having it fall back to process locks when lock_path is not set, as
> Mark suggested.  I'm not wild about either so I'd love to hear if anyone
> has a third option.

Honestly, I'd love us to be clever and figure out a not dangerous way
through this, even if unwise (where we can yell at the user in the LOGs
loudly, and fail them in J if lock_dir=/tmp) that lets us progress
through this while gracefully bringing configs into line.


Sean Dague

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 482 bytes
Desc: OpenPGP digital signature
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131204/ee5effdb/attachment.pgp>

More information about the OpenStack-dev mailing list