On Mon, Apr 15, 2019 at 7:48 PM Bence Romsics <bence.romsics@gmail.com> wrote:
Hi Heat and Neutron folks,
I'm wondering how could I improve support for extra routes for one of our users. In this email I'd like to get your input (from both Heat and Neutron folks) in which direction should I start before I sit down to write the relevant blueprints and RFEs. This will be a bit long, please bear with me.
As you know a neutron router can have extra static routing table entries as the 'routes' attribute like this (please see the api-ref in [1]):
$ openstack router show router1 -f value -c routes destination='10.1.0.0/24', gateway='10.0.0.11' destination='10.2.0.0/24', gateway='10.0.0.12' ...
To my best knowledge the whole set of extra routes must be updated at once which easily leads to lost updates if multiple clients modify this attribute concurrently (imagine client A and B working concurrently, for example both reads routes [r1], client A writes [r1, r2], client B writes [r1, r3]; depending on timing either r2 or r3 will be lost).
Heat has an unsupported resource OS::Neutron::ExtraRoute [2] to front Neutron extra routes. For example:
heat_template_version: '2015-04-30' resources: Extra_route_1: type: OS::Neutron::ExtraRoute properties: destination: '10.1.0.0/24' nexthop: '10.0.0.11' router_id: 'd544b5c5-d4d2-4b52-aaa8-ef4b9fa3e154'
The Heat resource represents a single extra route out of the many in the router's routes attribute. If you have multiple OS::Neutron::ExtraRoute resources Heat by default handles them concurrently leading to the lost update problem. You can easily reproduce the problem just by having 10 extra routes in a template. The stack creation will succeed, but the router in fact will have only 6-7 (the exact number changes based on timing) extra routes. AFAIU this is exactly why this was marked unsupported. (Please let me know if you know other reasons.)
I think this had been discussed on several occasions[1] before and one of the suggestions from the neutron team was to use compare-and-swap update api [2] to avoid race conditions. [1] http://lists.openstack.org/pipermail/openstack-dev/2018-June/131020.html [2] https://bugs.launchpad.net/neutron/+bug/1703234 As a workaround I suggested our user a hack to serialize the
OS::Neutron::ExtraRoute resources by 'depends_on'.
But which way should I go if I want to fix this properly?
way #1 - changes in Heat only: Introduce a new Heat resource OS::Neutron::ExtraRoutes (please notice the plural) which can represent the whole routes attribute of a router (i.e. multiple extra routes in a single resource). This will not solve the design shortcomings of the neutron extraroute API, but it can at least express what can be done in the Neutron API as is. Lost updates of multiple stacks editing the same router's routes are still possible.
way #2 - changes in Heat and Neutron: Introduce a new Neutron
extension with a better API (either extraroutes as a top level resource, each object with a router_id or an action map a'la add/remove_router_interface to edit extra routes on the server side). This new API would work on the same DB tables and backends as the old. The old API would be kept for backwards compatibility. But if a new user exclusively uses the new API it can work concurrently. Then introduce a new Heat resource (or change the existing unsupported one) to use this new Neutron API. This is obviously more work than the first option.
Which one do you prefer and why? Or let me know please if I missed a better alternative.
In this thread I hope to get a cross-project agreement of this between Heat and Neutron so I can go ahead and open the relevant Heat and Neutron blueprints where we can focus only on each projects' internal questions.
Thanks in advance, Bence Romsics irc: rubasov
[1] https://developer.openstack.org/api-ref/network/v2/?expanded=update-router-d... [2] https://docs.openstack.org/heat/latest/template_guide/unsupported.html#OS::N...
-- Regards, Rabi Mishra