Re: Floating IP's for routed networks
Hi Ryan,
If you don't mind, I'm adding the openstack-discuss list to the loop, as this topic may be of interest to others.
For mailing list readers, I'm trying to implement this: https://review.opendev.org/#/c/669395/ but I'm having some difficulties.
I did a bit of investigation with some added LOG.info() in the code.
When doing:
    openstack subnet create vm-fip \
        --subnet-range 10.66.20.0/24 \
        --service-type 'network:routed' \
        --service-type 'network:floatingip' \
        --network multisegment1
Here's where neutron-api crashes, in db/ipam_backend_mixin.py:
    def _validate_segment(self, context, network_id, segment_id, action=None,
                          old_segment_id=None):
        # TODO(tidwellr) Create and use a constant for the service type
        segments = subnet_obj.Subnet.get_subnet_segment_ids(
            context, network_id, filtered_service_type='network:routed')
        associated_segments = set(segments)
        if None in associated_segments and len(associated_segments) > 1:
            raise segment_exc.SubnetsNotAllAssociatedWithSegments(
                network_id=network_id)
SubnetsNotAllAssociatedWithSegments() is raised, as you may have already guessed. Here are the values...
associated_segments is a set containing 3 values: 2 being the IDs of the segments I added previously, the 3rd one being None. The test therefore matches. Where is that None value coming from? Is this the new subnet I'm trying to add? Maybe the filtered_service_type='network:routed' in the call to subnet_obj.Subnet.get_subnet_segment_ids() isn't working as expected?
Printing the SQL query that is checked shows:
    SELECT subnets.segment_id AS subnets_segment_id
    FROM subnets
    WHERE subnets.network_id = %(network_id_1)s
      AND subnets.id NOT IN (
        SELECT subnet_service_types.subnet_id AS subnet_service_types_subnet_id
        FROM subnet_service_types
        WHERE subnets.network_id = %(network_id_2)s
          AND subnet_service_types.subnet_id = subnets.id
          AND subnet_service_types.service_type = %(service_type_1)s)
though when doing by hand:
    SELECT subnets.segment_id AS subnets_segment_id FROM subnets
the db has only 2 subnets, so it looks like the floating-ip subnet got added before the check, and is then removed when the above test fails.
So I just removed the raise, and could add the subnet I wanted, but that's obviously not a long-term solution.
Your thoughts?
Another problem that I'm having is that neutron-bgp-dragent is not receiving (or processing) the messages from neutron-rpc-server. I've enabled DEBUG mode for oslo_messaging, and found out that when dr-agent starts and prints "Agent has just been revived. Scheduling full sync", it does send a message to neutron-rpc-server, which is replied to, but it doesn't look like dr-agent processes the return message in its reply queue. It then prints in the logs: "Timeout in RPC method get_bgp_speakers. Waiting for 17 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID c1b401c9e10d481bb5e071f2c048e480". What is weird is that a few times (rarely), it worked, and the agent got the reply.
What should I do to investigate further?
Cheers,
Thomas Goirand (zigo)
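To make the failing check in the first part of the message concrete, here is a minimal standalone sketch of the set test from _validate_segment. The exception class and the validate_segment_ids helper are hypothetical stand-ins written for this illustration (they are not neutron code), and the segment IDs are made up.

```python
class SubnetsNotAllAssociatedWithSegments(Exception):
    """Hypothetical local stand-in for neutron's exception of the same name."""


def validate_segment_ids(segment_ids):
    """Apply the same set test as the quoted check to a plain list of IDs."""
    associated_segments = set(segment_ids)
    # None stands for "a subnet without a segment". Mixing it with real
    # segment IDs is the situation the check rejects.
    if None in associated_segments and len(associated_segments) > 1:
        raise SubnetsNotAllAssociatedWithSegments()
    return associated_segments


# Two subnets, each bound to a segment: the check passes.
ok = validate_segment_ids(["seg-a", "seg-b"])

# A third entry with no segment (None) alongside them: the check raises.
try:
    validate_segment_ids(["seg-a", "seg-b", None])
    rejected = False
except SubnetsNotAllAssociatedWithSegments:
    rejected = True
```

With three values in the set, one of them None, the condition is true, which matches the behaviour reported above. Whether that None should have been filtered out earlier is the open question in the thread.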
Sending the message again with the correct From, as I'm not subscribed to the list with the other mailbox.
On 7/15/20 2:13 PM, Thomas Goirand wrote:
[...]
Hi Thomas:
If I'm not wrong, the goal of this filtering is to remove all those subnets with service_type='network:routed'. Maybe you can try implementing an easier query:
    SELECT subnets.segment_id AS subnets_segment_id
    FROM subnets
    WHERE subnets.network_id = %(network_id_1)s
      AND NOT (EXISTS (
        SELECT * FROM subnet_service_types
        WHERE subnets.id = subnet_service_types.subnet_id
          AND subnet_service_types.service_type = %(service_type_1)s))
That will be translated to Python as:
    # Requires: from sqlalchemy import and_, exists
    query = test_db.context.session.query(subnet_obj.Subnet.db_model.segment_id)
    query = query.filter(subnet_obj.Subnet.db_model.network_id == network_id)
    if filtered_service_type:
        query = query.filter(~exists().where(and_(
            subnet_obj.Subnet.db_model.id == service_type_model.subnet_id,
            service_type_model.service_type == filtered_service_type)))
Can you provide a unit test or another way to check the problem you are experiencing?
Regards.
On Wed, Jul 15, 2020 at 1:27 PM Thomas Goirand <zigo@debian.org> wrote:
[...]
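The NOT IN query printed in the original report and the NOT EXISTS form suggested in the reply can be compared side by side. The following is only a sketch on a toy SQLite schema: the table layout is reduced to the columns that appear in the two queries, and the subnet, network and segment IDs are invented for the example. It says nothing about what the real neutron database contains at the time of the check.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subnets (id TEXT, network_id TEXT, segment_id TEXT);
    CREATE TABLE subnet_service_types (subnet_id TEXT, service_type TEXT);
    -- Two segment-bound subnets plus one subnet without a segment.
    INSERT INTO subnets VALUES ('sub-1', 'net-1', 'seg-a'),
                               ('sub-2', 'net-1', 'seg-b'),
                               ('vm-fip', 'net-1', NULL);
    INSERT INTO subnet_service_types VALUES
        ('vm-fip', 'network:routed'),
        ('vm-fip', 'network:floatingip');
""")

# Shape of the query printed in the original report.
NOT_IN = """
    SELECT subnets.segment_id FROM subnets
    WHERE subnets.network_id = :net
      AND subnets.id NOT IN (
        SELECT subnet_service_types.subnet_id FROM subnet_service_types
        WHERE subnets.network_id = :net
          AND subnet_service_types.subnet_id = subnets.id
          AND subnet_service_types.service_type = :stype)
"""

# Shape of the simpler query suggested in the reply.
NOT_EXISTS = """
    SELECT subnets.segment_id FROM subnets
    WHERE subnets.network_id = :net
      AND NOT EXISTS (
        SELECT * FROM subnet_service_types
        WHERE subnets.id = subnet_service_types.subnet_id
          AND subnet_service_types.service_type = :stype)
"""


def segment_ids(sql):
    """Run one of the two queries and return the segment IDs as a set."""
    params = {"net": "net-1", "stype": "network:routed"}
    return {row[0] for row in conn.execute(sql, params)}


not_in_result = segment_ids(NOT_IN)
not_exists_result = segment_ids(NOT_EXISTS)
```

On this toy data both forms return the same two segment IDs and exclude the NULL row, as long as the 'network:routed' row is present in subnet_service_types. If that holds for the real schema too, rewriting the query alone would not be expected to change the result.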
On 7/15/20 4:09 PM, Rodolfo Alonso Hernandez wrote:
[...]
Hi Rodolfo,
Thanks for your help.
I tried translating what you wrote above into working code (i.e. fixing a few variables here and there), which I sent as a new PR here: https://review.opendev.org/#/c/741429/
However, printing the result from SQLAlchemy shows that get_subnet_segment_ids() still returns None together with my other 2 subnets, so something must still be wrong.
I'm not yet at the point where I can write unit tests; I'm just trying the code locally for the moment.
Cheers,
Thomas Goirand (zigo)
On 7/16/20 2:56 PM, Thomas Goirand wrote:
[...]
Rodolfo,
You are right that the purpose is to filter subnets with service_type='network:routed'.
However, if I add "if segment_id" in the:
    return [segment_id for (segment_id,) in query.all() if segment_id]
then this doesn't work, because _validate_segment will never return 400 when there is an invalid request, which defeats the purpose of this function. I removed the "if segment_id" and now the patch passes unit tests. See: https://review.opendev.org/669395
However, it's still not possible to provision a subnet with --service-type='network:routed', and at this point, I don't understand what's going wrong, or why get_subnet_segment_ids is returning None for the new subnet I'm trying to create, when this is supposed to be filtered. Is it possible that the service_type table isn't written yet at the time get_subnet_segment_ids() is called?
I'd like to add a test. To me it looks like I should do it here: neutron/tests/unit/extensions/test_segment.py using test_only_some_subnets_associated_not_allowed() as a model, by just adding service_type='network:routed', and expecting it to succeed. However, how do I add a service-type when creating the subnet? It doesn't look like this exists in this test framework. Any suggestions?
Cheers,
Thomas Goirand (zigo)
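Both points in this last message (the "if segment_id" filter hiding invalid requests, and the hypothesis that the service-type rows are not yet written when the check runs) can be illustrated on a toy SQLite database. This is a sketch only: get_segment_ids and is_rejected are hypothetical helpers modelled on the snippets quoted in the thread, the schema and IDs are invented, and nothing here confirms the order in which neutron actually writes its rows.

```python
import sqlite3

QUERY = """
    SELECT subnets.segment_id FROM subnets
    WHERE subnets.network_id = :net
      AND NOT EXISTS (
        SELECT 1 FROM subnet_service_types
        WHERE subnet_service_types.subnet_id = subnets.id
          AND subnet_service_types.service_type = :stype)
"""


def get_segment_ids(conn, drop_none=False):
    """Toy lookup; drop_none=True mimics the 'if segment_id' filter."""
    params = {"net": "net-1", "stype": "network:routed"}
    rows = conn.execute(QUERY, params).fetchall()
    return [seg for (seg,) in rows if seg or not drop_none]


def is_rejected(segment_ids):
    """Condition under which the quoted check raises (400 per the thread)."""
    associated = set(segment_ids)
    return None in associated and len(associated) > 1


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subnets (id TEXT, network_id TEXT, segment_id TEXT);
    CREATE TABLE subnet_service_types (subnet_id TEXT, service_type TEXT);
    INSERT INTO subnets VALUES ('sub-1', 'net-1', 'seg-a'),
                               ('sub-2', 'net-1', 'seg-b'),
                               ('vm-fip', 'net-1', NULL);
""")

# State 1: the subnet row exists but its service-type rows do not (yet).
before = is_rejected(get_segment_ids(conn))
before_dropping_none = is_rejected(get_segment_ids(conn, drop_none=True))

# State 2: the 'network:routed' row has been written.
conn.execute(
    "INSERT INTO subnet_service_types VALUES ('vm-fip', 'network:routed')")
after = is_rejected(get_segment_ids(conn))
```

In state 1 the unfiltered lookup returns a None and the check rejects the request, while the variant that drops None can never reject anything, which is the reason given above for removing "if segment_id". In state 2 the subnet is filtered out and the check passes. If the real code runs the lookup in something like state 1, that would produce the symptom described, but that remains a hypothesis.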