[openstack-dev] creating a default for oslo config variables within a project?
Ben Nemec
openstack at nemebean.com
Wed Dec 4 19:00:12 UTC 2013
On 2013-12-04 12:51, Sean Dague wrote:
> On 12/04/2013 11:56 AM, Ben Nemec wrote:
>> On 2013-12-04 06:07, Sean Dague wrote:
>>> On 12/03/2013 11:21 PM, Clint Byrum wrote:
>>>> Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
>>>>> On 12/03/2013 06:13 PM, Ben Nemec wrote:
>>>>>> On 2013-12-03 17:09, Sean Dague wrote:
>>>>>>> On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
>>>>>>>> On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
>>>>>>>>> On 2013-12-03 15:56, Sean Dague wrote:
>>>>>>>>>> This cinder patch - https://review.openstack.org/#/c/48935/
>>>>>>>>>>
>>>>>>>>>> Is blocked on failing upgrade because the updated oslo
>>>>>>>>>> lockutils won't
>>>>>>>>>> function until there is a specific configuration variable
>>>>>>>>>> added
>>>>>>>>>> to the
>>>>>>>>>> cinder.conf.
>>>>>>>>>>
>>>>>>>>>> That work around is proposed here -
>>>>>>>>>> https://review.openstack.org/#/c/52070/3
>>>>>>>>>>
>>>>>>>>>> However I think this is exactly the kind of forward breaks
>>>>>>>>>> that we
>>>>>>>>>> want
>>>>>>>>>> to prevent with grenade, as cinder failing to function after a
>>>>>>>>>> rolling
>>>>>>>>>> upgrade because a config item wasn't added is exactly the kind
>>>>>>>>>> of pain
>>>>>>>>>> we are trying to prevent happening to ops.
>>>>>>>>>>
>>>>>>>>>> So the question is, how is this done correctly so that a
>>>>>>>>>> default
>>>>>>>>>> can be
>>>>>>>>>> set in the cinder code for this value, and it not require a
>>>>>>>>>> config
>>>>>>>>>> change to work?
>>>>>>>>
>>>>>>>> You're absolutely correct, in principle - if the default value
>>>>>>>> for
>>>>>>>> lock_path worked for users before, we should absolutely continue
>>>>>>>> to
>>>>>>>> support it.
>>>>>>>>
>>>>>>>>> I don't know that I have a good answer on how to handle this,
>>>>>>>>> but for
>>>>>>>>> context this change is the result of a nasty bug in lockutils
>>>>>>>>> that
>>>>>>>>> meant
>>>>>>>>> external locks were doing nothing if lock_path wasn't set.
>>>>>>>>> Basically
>>>>>>>>> it's something we should never have allowed in the first place.
>>>>>>>>>
>>>>>>>>> As far as setting this in code, it's important that all of the
>>>>>>>>> processes
>>>>>>>>> for a service are using the same value to avoid the same bad
>>>>>>>>> situation
>>>>>>>>> we were in before. For tests, we have a lockutils wrapper
>>>>>>>>> (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
>>>>>>>>>
>>>>>>>>> that sets an environment variable to address this, but that
>>>>>>>>> only
>>>>>>>>> works if all of the processes are going to be spawned from
>>>>>>>>> within
>>>>>>>>> the same wrapper, and I'm not sure how secure that is for
>>>>>>>>> production
>>>>>>>>> deployments since it puts all of the lock files in a temporary
>>>>>>>>> directory.
>>>>>>>>
>>>>>>>> Right, I don't think the previous default really "worked" - if
>>>>>>>> you used
>>>>>>>> the default, then external locking was broken.
>>>>>>>>
>>>>>>>> I suspect most distros do set a default - I see RDO has this in
>>>>>>>> its
>>>>>>>> default nova.conf:
>>>>>>>>
>>>>>>>> lock_path = /var/lib/nova/tmp
>>>>>>>>
>>>>>>>> So, yes - this is all terrible.
>>>>>>>>
>>>>>>>> IMHO, rather than raise an exception we should log a big fat
>>>>>>>> warning
>>>>>>>> about relying on the default and perhaps just treat the lock as
>>>>>>>> an
>>>>>>>> in-process lock in that case ... since that's essentially what
>>>>>>>> it
>>>>>>>> was
>>>>>>>> before, right?
>>>>>>>
>>>>>>> So a default of lock_path = /tmp will work (FHS says that path
>>>>>>> has
>>>>>>> to be
>>>>>>> there), even if not optimal. Could we make it a default value
>>>>>>> like
>>>>>>> that
>>>>>>> instead of the current default which is null (and hence the
>>>>>>> problem).
>>>>>>
>>>>>> IIRC, my initial fix was something similar to that, but it got
>>>>>> shot
>>>>>> down
>>>>>> because putting the lock files in a known world writeable location
>>>>>> was a
>>>>>> security issue.
>>>>>>
>>>>>> Although maybe if we put them in a subdirectory of /tmp and
>>>>>> ensured
>>>>>> that
>>>>>> the permissions were such that only the user running the service
>>>>>> could
>>>>>> use that directory, it might be acceptable? We could still log a
>>>>>> warning if we wanted.
>>>>>>
>>>>>> This seems like it would have implications for people running
>>>>>> services
>>>>>> on Windows too, but we can probably find a way to make that work
>>>>>> if we
>>>>>> decide on a solution.
>>>>>
>>>>> How is that a security issue? Are the lock files being written with
>>>>> some
>>>>> sensitive data in them and have g or o permissions on? The sticky
>>>>> bit
>>>>> (+t) on /tmp will prevent other users from deleting the file.
>>>>>
>>>>
>>>> Right, but it won't prevent users from creating a symlink with the
>>>> same
>>>> name.
>>>>
>>>> ln -s /var/lib/nova/instances/xxxxx/image.raw
>>>> /tmp/well.known.location
>>>>
>>>> Now when you do
>>>>
>>>> with open('/tmp/well.known.location', 'w') as lockfile:
>>>> lockfile.write('Stuff')
>>>>
>>>> Nova has just truncated the image file and written 'Stuff' to it.
>>>
>>> So that's the generic case (and the way people often write this). But
>>> the oslo lockutils code doesn't work that way. While it does open the
>>> file for write, it does not actually write, it's using fcntl to hold
>>> locks. That's taking a data structure on the fd in kernel memory
>>> (IIRC),
>>> so it correctly gives it up if the process crashes.
>>>
>>> I'm not saying there isn't some other possible security vulnerability
>>> here as well, but it's not jumping out at me. So I'd love to
>>> understand
>>> that, because if we can close that exposure we can provide a working
>>> default, plus a strong recommendation for how to do that *right*. I'd
>>> be
>>> totally happy with printing WARNING level at startup if lock_path =
>>> /tmp
>>> that this should be adjusted.
>>
>> Full disclosure: I don't claim to be a security expert, so take my
>> thoughts on this with a grain of salt.
>>
>> tldr: I still don't see a way to do this without breaking _something_.
>>
>> Unfortunately, while we don't actually write to the file, just opening
>> it for write access truncates it. So there remains the issue Clint
>> raised if someone can make that file a link. I suppose we could check
>> for the bad things people might do, but I'm pretty sure that way lies
>> madness.
>
> Fair point, more coffee before I make statements like that I guess :).
>
> Follow up question: could we change it to open in "a" mode instead.
> Still requires write access, but won't truncate.
>
>> As a followup to my suggestion to have lockutils create something like
>> /tmp/oslo-lock-path and make it accessible only to the owner, I think
>> there's still a problem because a malicious user could create that
>> directory ahead of time and when OpenStack is run for the first time
>> it
>> would happily use the existing directory and potentially overwrite any
>> linked files in it. There might be ways to avoid that, but I suspect
>> they would be racy (which I believe is why you're not supposed to use
>> tmp paths that are known ahead of time). Granted, I'm not sure you
>> should be installing to a system that already has malicious users on
>> it,
>> but I suspect security people would be less than thrilled with that
>> answer.
>>
>> So the workable options that I see right now are to have it break on
>> upgrade if lock_path is not set, or continue to break external locking
>> by having it fall back to process locks when lock_path is not set, as
>> Mark suggested. I'm not wild about either so I'd love to hear if
>> anyone
>> has a third option.
>
> Honestly, I'd love us to be clever and figure out a not dangerous way
> through this, even if unwise (where we can yell at the user in the LOGs
> loudly, and fail them in J if lock_dir=/tmp) that lets us progress
> through this while gracefully bringing configs into line.
Copying openstack-security since they should probably sign off on this.
-Ben
More information about the OpenStack-dev
mailing list