[openstack-dev] [olso] [cinder] upgrade issues in lock_path in cinder after oslo utils sync

Sean Dague sean at dague.net
Tue Dec 10 13:35:17 UTC 2013


On 12/09/2013 05:18 PM, Mark McLoughlin wrote:
> On Mon, 2013-12-09 at 11:11 -0600, Ben Nemec wrote:
>> On 2013-12-09 10:55, Sean Dague wrote:
>>> On 12/09/2013 11:38 AM, Clint Byrum wrote:
>>>> Excerpts from Sean Dague's message of 2013-12-09 08:17:45 -0800:
>>>>> On 12/06/2013 05:40 PM, Ben Nemec wrote:
>>>>>> On 2013-12-06 16:30, Clint Byrum wrote:
>>>>>>> Excerpts from Ben Nemec's message of 2013-12-06 13:38:16 -0800:
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2013-12-06 15:14, Yuriy Taraday wrote:
>>>>>>>>
>>>>>>>>> Hello, Sean.
>>>>>>>>>
>>>>>>>>> I get the issue with upgrade path. User doesn't want to update
>>>>>>>> config unless one is forced to do so.
>>>>>>>>> But introducing code that weakens security and let it stay is an
>>>>>>>> unconditionally bad idea.
>>>>>>>>> It looks like we have to weigh two evils: having troubles 
>>>>>>>>> upgrading
>>>>>>>> and lessening security. That's obvious.
>>>>>>>>>
>>>>>>>>> Here are my thoughts on what we can do with it:
>>>>>>>>> 1. I think we should definitely force user to do appropriate
>>>>>>>> configuration to let us use secure ways to do locking.
>>>>>>>>> 2. We can wait one release to do so, e.g. issue a deprecation
>>>>>>>> warning now and force user to do it the right way later.
>>>>>>>>> 3. If we are going to do 2. we should do it in the service that 
>>>>>>>>> is
>>>>>>>> affected not in the library because library shouldn't track 
>>>>>>>> releases
>>>>>>>> of an application that uses it. It should do its thing and do it
>>>>>>>> right (secure).
>>>>>>>>>
>>>>>>>>> So I would suggest to deal with it in Cinder by importing
>>>>>>>> 'lock_path' option after parsing configs and issuing a deprecation
>>>>>>>> warning and setting it to tempfile.gettempdir() if it is still 
>>>>>>>> None.
>>>>>>>>
>>>>>>>> This is what Sean's change is doing, but setting lock_path to
>>>>>>>> tempfile.gettempdir() is the security concern.
>>>>>>>
>>>>>>> Yuriy's suggestion is that we should let Cinder override the config
>>>>>>> variable's default with something insecure. Basically only 
>>>>>>> deprecate
>>>>>>> it in Cinder's world, not oslo's. That makes more sense from a 
>>>>>>> library
>>>>>>> standpoint as it keeps the library's expected interface stable.
>>>>>>
>>>>>> Ah, I see the distinction now.  If we get this split off into
>>>>>> oslo.lockutils (which I believe is the plan), that's probably what 
>>>>>> we'd
>>>>>> have to do.
>>>>>>
>>>>>>>>
>>>>>>>> Since there seems to be plenty of resistance to using /tmp by 
>>>>>>>> default,
>>>>>>>> here is my proposal:
>>>>>>>>
>>>>>>>> 1) We make Sean's change to open files in append mode. I think we 
>>>>>>>> can
>>>>>>>> all agree this is a good thing regardless of any config changes.
>>>>>>>>
>>>>>>>> 2) Leave lockutils broken in Icehouse if lock_path is not set, as 
>>>>>>>> I
>>>>>>>> believe Mark suggested earlier. Log an error if we find that
>>>>>>>> configuration. Users will be no worse off than they are today, and 
>>>>>>>> if
>>>>>>>> they're paying attention they can get the fixed lockutils behavior
>>>>>>>> immediately.
>>>>>>>
>>>>>>> Broken how? Broken in that it raises an exception, or broken in 
>>>>>>> that it
>>>>>>> carries a security risk?
>>>>>>
>>>>>> Broken as in external locks don't actually lock.  If we fall back to
>>>>>> using a local semaphore it might actually be a little better because
>>>>>> then at least the locks work within a single process, whereas before
>>>>>> there was no locking whatsoever.
>>>>>
>>>>> Right, so broken as in "doesn't actually locks, potentially 
>>>>> completely
>>>>> scrambles the user's data, breaking them forever."
>>>>>
>>>>
>>>> Things I'd like to see OpenStack do in the short term, ranked in 
>>>> ascending
>>>> order of importance:
>>>>
>>>> 4) Upgrade smoothly.
>>>> 3) Scale.
>>>> 2) Help users manage external risks.
>>>> 1) Not do what Sean described above.
>>>>
>>>> I mean, how can we even suggest silently destroying integrity?
>>>>
>>>> I suggest merging Sean's patch and putting a warning in the release
>>>> notes that running without setting this will be deprecated in the next
>>>> release. If that is what this is preventing this debate should not 
>>>> have
>>>> happened, and I sincerely apologize for having delayed it. I believe 
>>>> my
>>>> mistake was assuming this was something far more trivial than "without
>>>> this patch we destroy users' data".
>>>>
>>>> I thought we were just talking about making upgrades work. :-P
>>>
>>> Honestly, I haven't looked exactly how bad the corruption would be. But
>>> we are locking to handle something around simultaneous db access in
>>> cinder, so I'm going to assume the worst here.
>>
>> Yeah, my understanding is that this doesn't come up much in actual use 
>> because lock_path is set in most production environments.  Still, 
>> obviously not cool when your locks don't lock, which is why we made the 
>> unpleasant change to require lock_path.  It wasn't something we did 
>> lightly (I even sent something to the list before it merged, although I 
>> got no responses at the time).
> 
> What would happen if we required each service to set a sane default
> here?
> 
> e.g. for Nova, would a dir under $state_path work? It just needs to be a
> directory that isn't world-writeable but is writeable by whatever user
> Nova is running as.
> 
> Practically speaking, this just means that Cinder needs to do:
> 
>   lockutils.set_defaults(lock_path=os.path.join(CONF.state_path, 'tmp'))
> 
> and the current behaviour of lockutils.py is fine.
> 
> Hmm, that feels like I'm missing something?

So the first thread actually started with asking how to do that. But I
guess it went off the rails and never actually popped up with an answer.
So that path was never persued. I'd be totally happy coming back to
that. I was actually hoping we'd get a cinder person in on this thread
to understand the implications of the patch on cinder deeper.

I think a bit part of the issue is understanding how locks are used, in
cinder, or elsewhere. Because as Clint said we've got the issue that
external locks from oslo:

 * support multiple users working with the same lock
 * require a "well known location" to work, as they are file based
 * uses write calls on the lock files, so thus are vulnerable to a
symlink attack vector if a malicious user has write access to the lock_path

If we don't care about the multiple user case lots of things could be
further tightened up. For instance, on not-Windows we could actually use
semaphores. Or if we wanted to keep the current implementation we could
check directory permissions and fail if they were wrong.

So I think there are basically 3 things here:

1) sorting what really needs to happen in that cinder patch

2) what sort of additional changes should we make to oslo locking to
make it safer in it's normal operation.

3) is anyone expecting oslo locking external=True to work between 2
processes running under different user ids? If not, we can actually make
this a ton safer.

	-Sean

-- 
Sean Dague
http://dague.net

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 482 bytes
Desc: OpenPGP digital signature
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20131210/ded19c21/attachment.pgp>


More information about the OpenStack-dev mailing list