[openstack-dev] Bug #1194026

Nachi Ueno nachi at ntti3.com
Thu Jul 11 23:19:15 UTC 2013


Hi folks

I think I found a possible cause of this problem.

We expected all RPC calls to be executed serially on the l3-agent.
However, they are executed in random order.

http://paste.openstack.org/show/40156/
Lines starting with "**** Get" are the RPC message log.
Lines starting with "[[[[[ Process" show when the l3-agent starts processing each RPC message.
(I added an RPC message number for debugging.)

https://bugs.launchpad.net/neutron/+bug/1194026

Here is my proposal for fixing the code.

- The server side simply notifies the agent when something is updated.
- The client sets an "updated" flag when it receives the notification.
- A looping call checks the flag;
  if the flag is true, it does a full sync with the server.
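
A minimal sketch of the idea above, only to make the proposal concrete. It is
not the actual l3-agent code: the class name, the method names, and the
fetch_all_routers callable (a stand-in for the RPC call that returns the full
router list) are all hypothetical. The point is that the notification handler
never applies a payload, so the order in which notifications are processed
stops mattering.

```python
import threading


class L3AgentSketch:
    """Hypothetical agent: notifications only mark state as stale;
    a periodic task does the full resync."""

    def __init__(self, fetch_all_routers):
        # fetch_all_routers is an assumed stand-in for the RPC call that
        # returns the complete router list from the server.
        self._fetch_all_routers = fetch_all_routers
        self._lock = threading.Lock()
        self._updated = True  # force one full sync at startup
        self.routers = {}

    def routers_updated(self, context=None, payload=None):
        """Notification handler: ignore the payload, just set the flag."""
        with self._lock:
            self._updated = True

    def periodic_sync(self):
        """Body of the looping call. Returns True if a full sync ran."""
        with self._lock:
            if not self._updated:
                return False
            # Clear the flag before fetching, so a notification arriving
            # during the fetch triggers another sync on the next pass.
            self._updated = False
        try:
            routers = self._fetch_all_routers()
        except Exception:
            with self._lock:
                self._updated = True  # retry on the next pass
            raise
        self.routers = {r["id"]: r for r in routers}
        return True
```

In a real agent, periodic_sync would be driven by whatever looping-call
mechanism the agent already uses. Several notifications arriving between two
passes collapse into a single full sync.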

If this is OK, I'll start writing it.

Thanks
Nachi

2013/7/11 Salvatore Orlando <sorlando at nicira.com>:
> Adding openstack-dev to this discussion thread.
> Looks like the test is going to be skipped at the moment, but we probably
> need to consider raising the priority of this issue and assigning cores
> with more tempest/gating experience to it.
>
> salvatore
>
>
> On 9 July 2013 22:48, Maru Newby <marun at redhat.com> wrote:
>>
>> My suggestion is that the quantum exercise script be disabled for now if
>> that will allow the tempest test to run, since the tempest test is more
>> useful (it does an ssh check to ensure that the metadata service has
>> configured the VM).  Doing so would allow useful gating while we look at
>> resolving the timing bug.
>>
>>
>> m.
>>
>> On Jul 9, 2013, at 5:42 PM, Nachi Ueno <nachi at ntti3.com> wrote:
>>
>> > Hi Maru
>> >
>> > The gating test does not fail every time. Sometimes both tests
>> > pass, sometimes not.
>> > In this run, exercise.sh passes and tempest fails.
>> > I'm still not sure whether one depends on the other in this failure.
>> >
>> > So I'm assuming this is some kind of timing issue.
>> >
>> > Hmm, this kind of bug is hard to fix.
>> >
>> >
>> > 2013/7/9 Maru Newby <marun at redhat.com>:
>> >> If there is a conflict between the exercise test and the tempest test,
>> >> does the tempest test pass if the exercise script isn't run beforehand?
>> >>
>> >>
>> >> m.
>> >>
>> >> On Jul 9, 2013, at 5:20 PM, Nachi Ueno <nachi at ntti3.com> wrote:
>> >>
>> >>> Hi
>> >>>
>> >>> I checked briefly, and it looks like a timing bug in the l3-agent.
>> >>> I added a note to the bug report.
>> >>> https://bugs.launchpad.net/neutron/+bug/1194026
>> >>>
>> >>> 2013/7/9 Salvatore Orlando <sorlando at nicira.com>:
>> >>>> Sean Dague singled it out as the biggest cause for gate failures, and
>> >>>> invited us to have a look at it.
>> >>>> I've raised its importance to high, but I don't have the cycles to
>> >>>> look at it in the short term.
>> >>>> It would be really helpful if somebody from the core team found some
>> >>>> time to triage it.
>> >>>>
>> >>>> Salvatore
>> >>
>>
>


