[openstack-dev] [Openstack-dev][Neutron] Handling of ovs command errors

Irena Berezovsky irenab at mellanox.com
Tue Nov 26 06:22:53 UTC 2013


Salvatore, 
Very good questions.
You raised your concerns for OVS agent, but I think it will be applicable for any other neutron agent that requires additional service to perform actions . At least, I was dealing with similar issues for Mellanox L2 agent. It makes sense for me if you fail to bind the port, it should be indicated  by neutron port status.
Another issue I had and try to solve  by the following patch: https://review.openstack.org/#/c/48842/ is the situation when agent fails to communicate with external daemon that responsible for actual programming. After number of retries with increasing back-off interval between retries, the agent will be terminated if fails to communicate. Does it make sense?

Regards,
Irena 

-----Original Message-----
From: Kyle Mestery (kmestery) [mailto:kmestery at cisco.com] 
Sent: Monday, November 25, 2013 11:16 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [Openstack-dev][Neutron] Handling of ovs command errors

On Nov 25, 2013, at 12:36 PM, Salvatore Orlando <sorlando at nicira.com> wrote:
> 
> Thanks Kyle,
> 
> More comments inline.
> 
> Salvatore
> 
> 
> On 25 November 2013 16:03, Kyle Mestery (kmestery) <kmestery at cisco.com> wrote:
> On Nov 25, 2013, at 8:28 AM, Salvatore Orlando <sorlando at nicira.com> wrote:
> >
> > Hi,
> >
> > I've been recently debugging some issues I've had with the OVS agent, and I found out that in many  cases (possibly every case) the code just logs errors from ovs-vsctl and ovs-ofctl without taking any action in the control flow.
> >
> > For instance, the routine which should do the wiring for a port, port_bound [1], does not react in any way if it fails to configure the local vlan, which I guess means the port would not be able to send/receive any data.
> >
> > I'm pretty sure there's a good reason for this which I'm missing at the moment. I am asking because I see a pretty large number of ALARM_CLOCK errors returned by OVS commands in gate logs (see bug [2]), and I'm not sure whether it's ok to handle them as the OVS agent is doing nowadays.
> >
> Thanks for bringing this up Salvatore. It looks like the underlying run_vstcl [1] provides an ability to raise exceptions on errors, but this is not used by most of the callers of run_vsctl. Do you think we should be returning the exceptions back up the stack to callers to handle? I think that may be a good first step.
> 
> I think it makes sense to start to handle errors; as they often happen in the agent's rpc loop simply raising will probably just cause the agent to crash.
> I looked again at the code and it really seems it's silently ignoring errors from ovs command.
> This actually makes sense in some cases. For instance the l3 agent might remove a qr-xxx or qg-xxx port while the l2 agent is in the middle of its iteration.
> 
> There are however cases in which the exception must be handled.
> In cases like the ALARM_CLOCK error, either a retry mechanism or marking the port for re-syncing at the next iteration might make sense.
> Other error cases might be unrecoverable; for instance when a port disappears. In that case it seems reasonable to put the relevant neutron port in ERROR state, so that the user is aware that the port anymore.
> 
I think it makes sense to address these things. Want me to file a bug?

> Thanks,
> Kyle
> 
> [1] https://github.com/openstack/neutron/blob/master/neutron/agent/linux/ovs_lib.py#L52
> 
> > Regards,
> > Salvatore
> >
> > [1] https://github.com/openstack/neutron/blob/master/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py#L599
> > [2] https://bugs.launchpad.net/neutron/+bug/1254520
> > _______________________________________________
> > OpenStack-dev mailing list
> > OpenStack-dev at lists.openstack.org
> > http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



More information about the OpenStack-dev mailing list