[oslo][neutron] Neutron Functional Test Failures with oslo.privsep 1.31.0

Ben Nemec openstack at nemebean.com
Tue Jan 15 22:56:20 UTC 2019


TLDR: We now need to look at the thread namespace instead of the process 
namespace. Many, many details below.

On 1/15/19 11:51 AM, Ben Nemec wrote:
> 
> 
> On 1/15/19 11:16 AM, Ben Nemec wrote:
>>
>>
>> On 1/15/19 6:49 AM, Doug Hellmann wrote:
>>> Ben Nemec <openstack at nemebean.com> writes:
>>>
>>>> I tried to set up a test environment for this, but I'm having some
>>>> issues. My local environment is defaulting to python 3, while the gate
>>>> job appears to have been running under python 2. I'm not sure why it's
>>>> doing that since the tox env definition doesn't specify python 3 (maybe
>>>> something to do with https://review.openstack.org/#/c/622415/ ?), but
>>>> either way I keep running into import issues.
>>>>
>>>> I'll take another look tomorrow, but in the meantime I'm afraid I
>>>> haven't made any meaningful progress. :-(
>>>
>>> If no version is specified in the tox.ini then tox defaults to the
>>> version of python used to install it.
>>>
>>
>> Ah, good to know. I think I installed tox as just "tox" instead of 
>> "python-tox", which means I got the py3 version.
>>
>> Unfortunately I'm still having trouble running the failing test (and 
>> not for the expected reason ;-). The daemon is failing to start with:
>>
>> ImportError: No module named tests.functional.utils

No idea why, but updating the fwaas capabilities to match core neutron 
by adding c.CAP_DAC_OVERRIDE and c.CAP_DAC_READ_SEARCH made this go 
away. Those are related to file permission checks, but the permissions 
on my source tree are, well, permissive, so I'm not sure why that would 
be a problem.

>>
>> I'm not seeing any log output from the daemon either for some reason 
>> so it's hard to debug. There must be some difference between this and 
>> the neutron test environment because in neutron I was getting daemon 
>> log output in /opt/stack/logs.
> 
> Figured this part out. tox.ini wasn't inheriting some values in the same 
> way as neutron. Fix proposed in https://review.openstack.org/#/c/631035/

Actually, I discovered that these logs were being written; they were 
just in /tmp. So that change is probably not necessary, especially since 
it's breaking CI.

> 
> Now hopefully I can make progress on the rest of it.

And sure enough, I did. :-)

In short, we need to look at the thread-specific network namespace in 
this test instead of the process-specific one. Changing the namespace 
only affects the calling thread, so /proc/self/ns/net only changes when 
the call is made from the process's main thread. Here's a simple(?) 
example:

#!/usr/bin/env python

import ctypes
import os
import threading

from pyroute2 import netns

# The python threading identifier is useless here,
# we need to make a syscall
libc = ctypes.CDLL('libc.so.6')

def do_the_thing(ns):
     tid = libc.syscall(186)  # SYS_gettid on x86_64; the number varies by platform :-/
     # Check the starting netns
     print('process %s' % os.readlink('/proc/self/ns/net'))
     print('thread %s' % os.readlink('/proc/self/task/%s/ns/net' % tid))
     # Change the netns
     print('changing to %s' % ns)
     netns.setns(ns)
     # Check again. It should be different
     print('process %s' % os.readlink('/proc/self/ns/net'))
     print('thread %s\n' % os.readlink('/proc/self/task/%s/ns/net' % tid))

# Run in main thread
do_the_thing('foo')
# Run in new thread
t = threading.Thread(target=do_the_thing, args=('bar',))
t.start()
t.join()
# Run in main thread again to show difference
do_the_thing('bar')

# Clean up after ourselves
netns.remove('foo')
netns.remove('bar')

And here's the output:

process net:[4026531992]
thread net:[4026531992]
changing to foo
process net:[4026532196] <- Running in the main thread changes both
thread net:[4026532196]

process net:[4026532196]
thread net:[4026532196]
changing to bar
process net:[4026532196] <- Child thread only changes the thread
thread net:[4026532254]

process net:[4026532196]
thread net:[4026532196]
changing to bar
process net:[4026532254] <- Main thread gets them back in sync
thread net:[4026532254]

So, to get this test passing I think we need to change [1] so it looks 
for the thread id and uses a replacement for [2] that allows the thread 
id to be injected as above.
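
For what it's worth, here is a minimal sketch of what such a replacement 
might look like. The name get_thread_netns_inode is hypothetical and not 
something that exists in neutron-fwaas today, and it assumes Python 3.8+ 
so that threading.get_native_id() can stand in for the raw gettid 
syscall used above. Treat it as an illustration of the idea rather than 
the actual fix:

```python
import os
import re
import threading


def get_thread_netns_inode(tid=None):
    """Return the network namespace inode for one thread of this process.

    Hypothetical helper: ``tid`` is a kernel thread id and defaults to the
    calling thread, so a test can inject the id of the thread it cares about.
    """
    if tid is None:
        # Kernel-level thread id (assumes Python 3.8+), not threading.get_ident()
        tid = threading.get_native_id()
    # Per-thread view of the namespace, e.g. 'net:[4026531992]'
    link = os.readlink('/proc/self/task/%d/ns/net' % tid)
    match = re.match(r'net:\[(\d+)\]$', link)
    if match is None:
        raise ValueError('unexpected netns link format: %r' % link)
    return int(match.group(1))
```

Called with no arguments it reports the namespace of whichever thread 
runs it, which is what the test would need to compare against.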

And it's the end of my day so I'm going to leave it there. :-)

[1] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/tests/functional/utils.py#L23
[2] https://github.com/openstack/neutron-fwaas/blob/master/neutron_fwaas/privileged/utils.py#L25

-Ben


