[Openstack-security] [openstack-dev] creating a default for oslo config variables within a project?

Ben Nemec openstack at nemebean.com
Wed Dec 4 19:00:12 UTC 2013


On 2013-12-04 12:51, Sean Dague wrote:
> On 12/04/2013 11:56 AM, Ben Nemec wrote:
>> On 2013-12-04 06:07, Sean Dague wrote:
>>> On 12/03/2013 11:21 PM, Clint Byrum wrote:
>>>> Excerpts from Sean Dague's message of 2013-12-03 16:05:47 -0800:
>>>>> On 12/03/2013 06:13 PM, Ben Nemec wrote:
>>>>>> On 2013-12-03 17:09, Sean Dague wrote:
>>>>>>> On 12/03/2013 05:50 PM, Mark McLoughlin wrote:
>>>>>>>> On Tue, 2013-12-03 at 16:23 -0600, Ben Nemec wrote:
>>>>>>>>> On 2013-12-03 15:56, Sean Dague wrote:
>>>>>>>>>> This cinder patch - https://review.openstack.org/#/c/48935/
>>>>>>>>>> 
>>>>>>>>>> Is blocked on failing upgrade because the updated oslo
>>>>>>>>>> lockutils won't
>>>>>>>>>> function until there is a specific configuration variable 
>>>>>>>>>> added
>>>>>>>>>> to the
>>>>>>>>>> cinder.conf.
>>>>>>>>>> 
>>>>>>>>>> That work around is proposed here -
>>>>>>>>>> https://review.openstack.org/#/c/52070/3
>>>>>>>>>> 
>>>>>>>>>> However I think this is exactly the kind of forward breaks 
>>>>>>>>>> that we
>>>>>>>>>> want
>>>>>>>>>> to prevent with grenade, as cinder failing to function after a
>>>>>>>>>> rolling
>>>>>>>>>> upgrade because a config item wasn't added is exactly the kind
>>>>>>>>>> of pain
>>>>>>>>>> we are trying to prevent happening to ops.
>>>>>>>>>> 
>>>>>>>>>> So the question is, how is this done correctly so that a 
>>>>>>>>>> default
>>>>>>>>>> can be
>>>>>>>>>> set in the cinder code for this value, and it not require a 
>>>>>>>>>> config
>>>>>>>>>> change to work?
>>>>>>>> 
>>>>>>>> You're absolutely correct, in principle - if the default value 
>>>>>>>> for
>>>>>>>> lock_path worked for users before, we should absolutely continue 
>>>>>>>> to
>>>>>>>> support it.
>>>>>>>> 
>>>>>>>>> I don't know that I have a good answer on how to handle this,
>>>>>>>>> but for
>>>>>>>>> context this change is the result of a nasty bug in lockutils 
>>>>>>>>> that
>>>>>>>>> meant
>>>>>>>>> external locks were doing nothing if lock_path wasn't set.
>>>>>>>>> Basically
>>>>>>>>> it's something we should never have allowed in the first place.
>>>>>>>>> 
>>>>>>>>> As far as setting this in code, it's important that all of the
>>>>>>>>> processes
>>>>>>>>> for a service are using the same value to avoid the same bad
>>>>>>>>> situation
>>>>>>>>> we were in before.  For tests, we have a lockutils wrapper
>>>>>>>>> (https://github.com/openstack/oslo-incubator/blob/master/openstack/common/lockutils.py#L282)
>>>>>>>>> 
>>>>>>>>> that sets an environment variable to address this, but that 
>>>>>>>>> only
>>>>>>>>> works if all of the processes are going to be spawned from 
>>>>>>>>> within
>>>>>>>>> the same wrapper, and I'm not sure how secure that is for
>>>>>>>>> production
>>>>>>>>> deployments since it puts all of the lock files in a temporary
>>>>>>>>> directory.
>>>>>>>> 
>>>>>>>> Right, I don't think the previous default really "worked" - if
>>>>>>>> you used
>>>>>>>> the default, then external locking was broken.
>>>>>>>> 
>>>>>>>> I suspect most distros do set a default - I see RDO has this in 
>>>>>>>> its
>>>>>>>> default nova.conf:
>>>>>>>> 
>>>>>>>>   lock_path = /var/lib/nova/tmp
>>>>>>>> 
>>>>>>>> So, yes - this is all terrible.
>>>>>>>> 
>>>>>>>> IMHO, rather than raise an exception we should log a big fat 
>>>>>>>> warning
>>>>>>>> about relying on the default and perhaps just treat the lock as 
>>>>>>>> an
>>>>>>>> in-process lock in that case ... since that's essentially what 
>>>>>>>> it
>>>>>>>> was
>>>>>>>> before, right?
>>>>>>> 
>>>>>>> So a default of lock_path = /tmp will work (FHS says that path 
>>>>>>> has
>>>>>>> to be
>>>>>>> there), even if not optimal. Could we make it a default value 
>>>>>>> like
>>>>>>> that
>>>>>>> instead of the current default which is null (and hence the 
>>>>>>> problem).
>>>>>> 
>>>>>> IIRC, my initial fix was something similar to that, but it got 
>>>>>> shot
>>>>>> down
>>>>>> because putting the lock files in a known world writeable location
>>>>>> was a
>>>>>> security issue.
>>>>>> 
>>>>>> Although maybe if we put them in a subdirectory of /tmp and 
>>>>>> ensured
>>>>>> that
>>>>>> the permissions were such that only the user running the service 
>>>>>> could
>>>>>> use that directory, it might be acceptable?  We could still log a
>>>>>> warning if we wanted.
>>>>>> 
>>>>>> This seems like it would have implications for people running 
>>>>>> services
>>>>>> on Windows too, but we can probably find a way to make that work 
>>>>>> if we
>>>>>> decide on a solution.
>>>>> 
>>>>> How is that a security issue? Are the lock files being written with
>>>>> some
>>>>> sensitive data in them and have g or o permissions on? The sticky 
>>>>> bit
>>>>> (+t) on /tmp will prevent other users from deleting the file.
>>>>> 
>>>> 
>>>> Right, but it won't prevent users from creating a symlink with the 
>>>> same
>>>> name.
>>>> 
>>>> ln -s /var/lib/nova/instances/xxxxx/image.raw 
>>>> /tmp/well.known.location
>>>> 
>>>> Now when you do
>>>> 
>>>> with open('/tmp/well.known.location', 'w') as lockfile:
>>>>   lockfile.write('Stuff')
>>>> 
>>>> Nova has just truncated the image file and written 'Stuff' to it.
>>> 
>>> So that's the generic case (and the way people often write this). But
>>> the oslo lockutils code doesn't work that way. While it does open the
>>> file for write, it does not actually write, it's using fcntl to hold
>>> locks. That's taking a data structure on the fd in kernel memory 
>>> (IIRC),
>>> so it correctly gives it up if the process crashes.
>>> 
>>> I'm not saying there isn't some other possible security vulnerability
>>> here as well, but it's not jumping out at me. So I'd love to 
>>> understand
>>> that, because if we can close that exposure we can provide a working
>>> default, plus a strong recommendation for how to do that *right*. I'd 
>>> be
>>> totally happy with printing WARNING level at startup if lock_path = 
>>> /tmp
>>> that this should be adjusted.
>> 
>> Full disclosure: I don't claim to be a security expert, so take my
>> thoughts on this with a grain of salt.
>> 
>> tldr: I still don't see a way to do this without breaking _something_.
>> 
>> Unfortunately, while we don't actually write to the file, just opening
>> it for write access truncates it.  So there remains the issue Clint
>> raised if someone can make that file a link.  I suppose we could check
>> for the bad things people might do, but I'm pretty sure that way lies
>> madness.
> 
> Fair point, more coffee before I make statements like that I guess :).
> 
> Follow up question: could we change it to open in "a" mode instead.
> Still requires write access, but won't truncate.
> 
>> As a followup to my suggestion to have lockutils create something like
>> /tmp/oslo-lock-path and make it accessible only to the owner, I think
>> there's still a problem because a malicious user could create that
>> directory ahead of time and when OpenStack is run for the first time 
>> it
>> would happily use the existing directory and potentially overwrite any
>> linked files in it.  There might be ways to avoid that, but I suspect
>> they would be racy (which I believe is why you're not supposed to use
>> tmp paths that are known ahead of time).  Granted, I'm not sure you
>> should be installing to a system that already has malicious users on 
>> it,
>> but I suspect security people would be less than thrilled with that 
>> answer.
>> 
>> So the workable options that I see right now are to have it break on
>> upgrade if lock_path is not set, or continue to break external locking
>> by having it fall back to process locks when lock_path is not set, as
>> Mark suggested.  I'm not wild about either so I'd love to hear if 
>> anyone
>> has a third option.
> 
> Honestly, I'd love us to be clever and figure out a not dangerous way
> through this, even if unwise (where we can yell at the user in the LOGs
> loudly, and fail them in J if lock_dir=/tmp) that lets us progress
> through this while gracefully bringing configs into line.

Copying openstack-security since they should probably sign off on this.

-Ben




More information about the Openstack-security mailing list