[openstack-dev] [neutron][L3][QA] DVR job failure rate and maintainability

shihanzhang ayshihanzhang at 126.com
Wed Sep 16 02:40:37 UTC 2015


Sean, 
Thank you very much for writing this. DVR does indeed need more attention: it is a very cool and useful feature, especially at large scale. It first landed in Neutron in Juno, and it has kept improving through the Kilo and Liberty cycles. We use it in production,
and in the process we found the following bugs, which have not been fixed yet. We have filed them on Launchpad:
1. Every time we create a VM, router scheduling is triggered. At large scale, if a large number of L3 agents are bound to a DVR router, scheduling the router takes a long time, yet the scheduling is not necessary. [1]
2. Every time we associate a floating IP with a VM, router scheduling is also triggered, and the floating IP is sent to all bound L3 agents. [2]
3. When we bulk delete VMs from a compute node, leaving it with no VM on the router, the router namespace remains in most cases. [3]
4. Updating router_gateway triggers reschedule_router, and during reschedule_router the traffic related to this router is interrupted. For a DVR router, why does the router need to be rescheduled, and to which L3 agents is it rescheduled? [4]
5. Stale fip namespaces are not cleaned up on compute nodes. [5]
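
To make item 1 concrete, here is a minimal sketch of the kind of guard that could avoid the redundant scheduling. It is only an illustration: RouterBindings, on_vm_port_created, and the in-memory binding table are hypothetical names, not Neutron's actual code, and it assumes the scheduling is redundant because the router is already bound on the VM's host.

```python
class RouterBindings:
    """Hypothetical in-memory stand-in for router-to-L3-agent bindings."""

    def __init__(self):
        self._hosts_by_router = {}
        self.schedule_calls = 0  # counts how often the expensive path runs

    def is_bound(self, router_id, host):
        """Return True if the router is already bound on the given host."""
        return host in self._hosts_by_router.get(router_id, set())

    def schedule(self, router_id, host):
        """Stand-in for the expensive scheduling step (assumption)."""
        self.schedule_calls += 1
        self._hosts_by_router.setdefault(router_id, set()).add(host)


def on_vm_port_created(bindings, router_id, host):
    """Schedule the router only if it is not already bound on this host.

    Returns True when scheduling ran, False when it was skipped.
    """
    if bindings.is_bound(router_id, host):
        return False
    bindings.schedule(router_id, host)
    return True


if __name__ == "__main__":
    bindings = RouterBindings()
    # The first VM on a host triggers scheduling; later VMs on that host do not.
    print(on_vm_port_created(bindings, "router-1", "compute-1"))
    print(on_vm_port_created(bindings, "router-1", "compute-1"))
    print(bindings.schedule_calls)
```

With a check like this, the cost of creating a VM would no longer grow with the number of L3 agents already bound to the router, which is the behaviour described in [1].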


I strongly agree that we need a group of contributors that
can help with the DVR feature in the immediate term to fix the current bugs.
I would be very glad to join this group.


Neutroners, let's start doing great things!


Thanks,
Hanzhang,Shi


[1] https://bugs.launchpad.net/neutron/+bug/1486795
[2] https://bugs.launchpad.net/neutron/+bug/1486828
[3] https://bugs.launchpad.net/neutron/+bug/1496201
[4] https://bugs.launchpad.net/neutron/+bug/1496204
[5] https://bugs.launchpad.net/neutron/+bug/1470909







At 2015-09-15 06:01:03, "Sean M. Collins" <sean at coreitpro.com> wrote:
>[adding neutron tag to subject and resending]
>
>Hi,
>
>Carl Baldwin, Doug Wiegley, Matt Kassawara, Ryan Moats, and I are
>at the QA sprint in Fort Collins. Earlier today there was a discussion
>about the failure rate of the DVR job, and the possible impact that
>it is having on the gate.
>
>Ryan has a good patch up that shows the failure rates over time:
>
>https://review.openstack.org/223201
>
>To view the graphs, you go over into your neutron git repo, and open the
>.html files that are present in doc/dashboards - which should open up
>your browser and display the Graphite query.
>
>Doug put up a patch to change the DVR job to be non-voting while we
>determine the cause of the recent spikes:
>
>https://review.openstack.org/223173
>
>There was a good discussion after pushing the patch, revolving around
>the need for Neutron to have DVR, to fit operational and reliability
>requirements, and help transition away from Nova-Network by providing
>one of many solutions similar to Nova's multihost feature.  I'm skipping
>over a huge amount of context about the Nova-Network and Neutron work,
>since that is a big and ongoing effort. 
>
>DVR is an important feature to have, and we need to ensure that the job
>that tests DVR has a high pass rate.
>
>One thing that I think we need, is to form a group of contributors that
>can help with the DVR feature in the immediate term to fix the current
>bugs, and longer term maintain the feature. It's a big task and I don't
>believe that a single person or company can or should do it by themselves.
>
>The L3 group is a good place to start, but I think that even within the
>L3 team we need a dedicated and diverse group of people who are interested
>in maintaining the DVR feature.
>
>Without this, I think the DVR feature will start to bit-rot and that
>will have a significant impact on our ability to recommend Neutron as a
>replacement for Nova-Network in the future.
>
>-- 
>Sean M. Collins
>
>__________________________________________________________________________
>OpenStack Development Mailing List (not for usage questions)
>Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev