[openstack-dev] Bug #1194026

Nachi Ueno nachi at ntti3.com
Fri Jul 12 19:55:08 UTC 2013


Hi folks

I pushed https://review.openstack.org/#/c/36890/ .
Since this is a fairly critical problem, I'd very much appreciate it if
this were reviewed quickly.

Thanks
Nachi


2013/7/11 Nachi Ueno <nachi at ntti3.com>:
> Hi folks
>
> I think I found a possible cause of this problem.
>
> We expected all RPC calls to be executed serially on the l3-agent.
> However, they are executed in random order.
>
> http://paste.openstack.org/show/40156/
> Lines starting with "**** Get" are the RPC message log.
> Lines starting with "[[[[[ Process" show when the l3-agent starts
> processing an RPC message.
> (I added an RPC message number for debugging.)
>
> https://bugs.launchpad.net/neutron/+bug/1194026
>
> Here is my proposal for fixing the code.
>
> - The server side simply notifies the client when something is updated.
> - The client sets an "updated" flag when it receives the notification.
> - A looping call checks the flag;
>   if the flag is true, it does a full sync with the server.
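
The proposal above can be sketched roughly as follows. This is only an
illustration under assumptions, not the code in the review: FullSyncAgent,
on_notification, sync_once and fetch_all_routers are hypothetical names,
and fetch_all_routers stands in for whatever RPC returns the complete
server-side state.

```python
import threading


class FullSyncAgent:
    """Hypothetical agent: notifications only set a flag; a loop does the sync."""

    def __init__(self, fetch_all_routers):
        # fetch_all_routers: assumed callable returning the full server state
        self._fetch_all_routers = fetch_all_routers
        self._updated = threading.Event()
        self._updated.set()  # force one sync at startup
        self.routers = {}
        self.sync_count = 0

    def on_notification(self, _payload=None):
        # The payload is deliberately ignored, so arrival order cannot matter.
        self._updated.set()

    def sync_once(self):
        """Run one loop iteration; return True if a full sync was performed."""
        if not self._updated.is_set():
            return False
        # Clear before fetching: a notification that arrives during the
        # fetch sets the flag again and triggers another sync.
        self._updated.clear()
        try:
            self.routers = dict(self._fetch_all_routers())
        except Exception:
            self._updated.set()  # keep the flag so the next iteration retries
            raise
        self.sync_count += 1
        return True

    def run(self, interval, stop_event):
        # The "looping call": poll the flag until asked to stop.
        while not stop_event.is_set():
            try:
                self.sync_once()
            except Exception:
                pass  # flag is still set; retry after the interval
            stop_event.wait(interval)
```

Because every sync fetches the complete state, several notifications
arriving in any order collapse into a single sync that ends in the same
result. The cost is a full fetch for each burst of changes.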
>
> If this is OK, I'll start writing it.
>
> Thanks
> Nachi
>
>
> 2013/7/11 Salvatore Orlando <sorlando at nicira.com>:
>> Adding openstack-dev to this discussion thread.
>> Looks like the test is going to be skipped at the moment, but we probably
>> need to consider raising the priority of this issue and assigning our cores
>> with more experience with tempest/gating to it.
>>
>> salvatore
>>
>>
>> On 9 July 2013 22:48, Maru Newby <marun at redhat.com> wrote:
>>>
>>> My suggestion is that the quantum exercise script be disabled for now if
>>> that will allow the tempest test to run, since the tempest test is more
>>> useful (it does an ssh check to ensure that the metadata service has
>>> configured the VM).  Doing so would allow useful gating while we look at
>>> resolving the timing bug.
>>>
>>>
>>> m.
>>>
>>> On Jul 9, 2013, at 5:42 PM, Nachi Ueno <nachi at ntti3.com> wrote:
>>>
>>> > Hi Maru
>>> >
>>> > The gating test does not fail every time. Sometimes both tests
>>> > work, sometimes not.
>>> > In this test, exercise.sh works and tempest doesn't.
>>> > I'm still not sure whether there is any dependency in this failure.
>>> >
>>> > So I'm assuming this is some kind of timing issue.
>>> >
>>> > Hmm, this kind of bug is hard to fix.
>>> >
>>> >
>>> > 2013/7/9 Maru Newby <marun at redhat.com>:
>>> >> If there is a conflict between the exercise test and the tempest test,
>>> >> does the tempest test pass if the exercise script isn't run beforehand?
>>> >>
>>> >>
>>> >> m.
>>> >>
>>> >> On Jul 9, 2013, at 5:20 PM, Nachi Ueno <nachi at ntti3.com> wrote:
>>> >>
>>> >>> Hi
>>> >>>
>>> >>> I checked briefly, and it looks like a timing bug in the l3-agent.
>>> >>> I added a note to the bug report.
>>> >>> https://bugs.launchpad.net/neutron/+bug/1194026
>>> >>>
>>> >>> 2013/7/9 Salvatore Orlando <sorlando at nicira.com>:
>>> >>>> Sean Dague singled it out as the biggest cause for gate failures, and
>>> >>>> invited us to have a look at it.
>>> >>>> I've raised its importance to high, but I don't have the cycles to
>>> >>>> look at it in the short term.
>>> >>>> It would be really great if somebody from the core team could find
>>> >>>> some time to triage it.
>>> >>>>
>>> >>>> Salvatore
>>> >>
>>>
>>


