[openstack-dev] [neutron][tricircle]DVR issue in cross Neutron networking
Brian Haley
brian.haley at hpe.com
Tue Dec 6 20:58:39 UTC 2016
On 12/05/2016 10:38 PM, joehuang wrote:
> Hello, Brian,
>
> Thank you for your comment, see inline comments marked with [joehuang].
>
> The ASCII figure is not good in the plain text mail, you can check it at the browser: http://lists.openstack.org/pipermail/openstack-dev/2016-December/108447.html
>
> Best Regards
> Chaoyi Huang (joehuang)
>
> ________________________________________
> From: Brian Haley [brian.haley at hpe.com]
> Sent: 06 December 2016 10:29
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev] [neutron][tricircle]DVR issue in cross Neutron networking
>
> On 12/5/16 3:03 AM, joehuang wrote:
>> Hello,
>
> Hi Chaoyi,
>
> Comments inline below.
>
>> Tricircle plans to provide an L2 network that spans Neutron servers, to
>> make it easier to support high availability for applications:
>>
>> For example, in the following figure, the application consists of
>> instance1 and instance2, and these two instances will be deployed into two
>> OpenStack clouds. Instance1 will provide service through "ext net1" (i.e.,
>> the external network in OpenStack1), and Instance2 will provide service
>> through "ext net2". Instance1 and Instance2 will be plugged into the same L2
>> network, net3, for data replication (for example, database replication).
>>
>> +-----------------+       +-----------------+
>> |OpenStack1       |       |OpenStack2       |
>> |                 |       |                 |
>> |     ext net1    |       |     ext net2    |
>> |   +-----+-----+ |       |   +-----+-----+ |
>> |         |       |       |         |       |
>> |         |       |       |         |       |
>> |      +--+--+    |       |      +--+--+    |
>> |      |     |    |       |      |     |    |
>> |      | R1  |    |       |      | R2  |    |
>> |      |     |    |       |      |     |    |
>> |      +--+--+    |       |      +--+--+    |
>> |         |       |       |         |       |
>> |         |       |       |         |       |
>> |     +---+-+-+   |       |     +---+-+-+   |
>> |     net1  |     |       |     net2  |     |
>> |           |     |       |           |     |
>> |  +--------+--+  |       |  +--------+--+  |
>> |  | Instance1 |  |       |  | Instance2 |  |
>> |  +-----------+  |       |  +-----------+  |
>> |         |       |       |         |       |
>> |         |       | net3  |         |       |
>> |  +------+-------------------------+----+  |
>> |                 |       |                 |
>> +-----------------+       +-----------------+
>
> Are Openstack1 and 2 simply different compute nodes?
>
> [joehuang] No, OpenStack 1 and 2 are two OpenStack clouds, each OpenStack cloud
> includes its own services like Nova, Cinder, Neutron. That means
> R1 and net1 are network objects in OpenStack cloud1 (Neutron 1),
> R2 and net2 are network objects in OpenStack cloud2 (Neutron 2),
> but net3 will be a shared L2 network existing in both OpenStack cloud1 and
> OpenStack cloud2, i.e., in Neutron 1 and Neutron 2.
So net3 is a shared private network, perhaps just a shared VLAN. And something
makes sure IP address allocations don't overlap. Ok.
>> When we deploy the application this way, no matter which part of the
>> application stops providing service, the other part can still provide
>> service and take over the workload from the failed one. This brings
>> fault tolerance whether the failure is due to an OpenStack crash or
>> upgrade, or to a crash or upgrade of part of the application.
>>
>> This mode works well and is helpful, and routers R1 and R2 can run in DVR
>> or legacy mode.
>>
>> However, during the discussion and review of the spec
>> https://review.openstack.org/#/c/396564/, it came up that in this
>> deployment the end user has to add two NICs to each instance, one of them
>> for net3 (an L2 network across OpenStack). Also, net3 is not allowed to be
>> attached to routers R1 and R2 with add_router_interface, which is not good
>> from a networking point of view.
>>
>> If the end user wants to do so, there are DVR MAC issues when more than one
>> L2 network across OpenStack is attached to routers R1 and R2 with
>> add_router_interface.
>>
>> Let's look at the following deployment scenario:
>> +-----------------+       +-------------------+
>> |OpenStack1       |       |OpenStack2         |
>> |                 |       |                   |
>> |     ext net1    |       |     ext net2      |
>> |   +-----+-----+ |       |   +-----+-----+   |
>> |         |       |       |         |         |
>> |         |       |       |         |         |
>> | +-------+--+    |       |      +--+-------+ |
>> | |          |    |       |      |          | |
>> | |    R1    |    |       |      |    R2    | |
>> | |          |    |       |      |          | |
>> | ++------+--+    |       |      +--+-----+-+ |
>> |  |      |       |       |         |     |   |
>> |  |      |       | net3  |         |     |   |
>> |  |   -+-+-------------------+-----+--+  |   |
>> |  |    |         |       |   |           |   |
>> |  | +--+-------+ |       | +-+---------+ |   |
>> |  | | Instance1| |       | | Instance2 | |   |
>> |  | +----------+ |       | +-----------+ |   |
>> |  |              | net4  |               |   |
>> | ++-------+--------------------------+---+-+ |
>> |          |      |       |           |       |
>> |  +-------+---+  |       |  +--------+---+   |
>> |  | Instance3 |  |       |  | Instance4  |   |
>> |  +-----------+  |       |  +------------+   |
>> |                 |       |                   |
>> +-----------------+       +-------------------+
>>
>> net3 and net4 are two L2 networks across OpenStack clouds. These two
>> networks will be added as router interfaces to R1 and R2. Tricircle can
>> help with this, and has addressed the DHCP and gateway challenges by using
>> a different gateway port for the same network in each OpenStack. So there
>> is no problem for north-south traffic: it goes to the local external
>> network directly, for example,
>> Instance1->R1->ext net1, Instance2->R2->ext net2.
>
> Can you describe the subnet configuration here? Is there just one per
> network and what is the IP range?
>
> [joehuang] Subnet in net 3 could be 192.168.1.0, in OpenStack 1, net 3 gateway IP
> 192.168.1.254; in OpenStack 2, net 3 gateway IP 192.168.1.253.
> Instance 1's IP 192.168.1.3, Instance 2's IP 192.168.1.7,
> (Tricircle will help and coordinate the instance IP and gateway IP allocation)
>
> Subnet in net 4 could be 192.168.2.0, in OpenStack 1, net 4 gateway IP
> 192.168.2.254; in OpenStack 2, net 4 gateway IP 192.168.2.253.
> Instance 3's IP 192.168.2.3, Instance 4's IP 192.168.2.7,
> (Tricircle will help and coordinate the instance IP and gateway IP allocation)
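
As a rough sanity check of the net 3 addressing above, here is a small Python sketch that verifies the gateway and instance addresses on a shared network stay inside the subnet and do not collide across the two clouds. The /24 prefix length and the check_allocations helper are assumptions made for illustration (the thread only gives 192.168.1.0 as the subnet), and this is not how Tricircle itself coordinates allocation.

```python
import ipaddress


def check_allocations(cidr, allocations):
    """Return a list of problems found in a {name: ip} allocation map."""
    net = ipaddress.ip_network(cidr)
    problems = []
    seen = {}  # address -> first name that claimed it
    for name, ip in allocations.items():
        addr = ipaddress.ip_address(ip)
        if addr not in net:
            problems.append(f"{name} ({ip}) is outside {cidr}")
        if addr in seen:
            problems.append(f"{name} and {seen[addr]} both use {ip}")
        else:
            seen[addr] = name
    return problems


# net 3 as described above; the /24 mask is an assumption.
net3 = {
    "gateway@OpenStack1": "192.168.1.254",
    "gateway@OpenStack2": "192.168.1.253",
    "Instance1": "192.168.1.3",
    "Instance2": "192.168.1.7",
}
print(check_allocations("192.168.1.0/24", net3))
```

An empty list means the allocation map is consistent under these assumptions.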
>
>> The issue is in east-west traffic if R1 and R2 are running in DVR mode:
>> when instance1 tries to ping instance4, DVR MAC replacement happens before
>> the packet leaves the host where instance1 is running. When the packet
>> arrives at the host where instance4 is running, the source MAC of the
>> packet (a DVR MAC from OpenStack1) is not recognized in OpenStack2 because
>> of the DVR MAC replacement, so the packet is dropped and the ping fails.
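
To make the failure described above concrete, here is a toy Python model of it. This is only a sketch of the reasoning in this thread: real Neutron programs OVS flows rather than Python sets, and the Cloud and Frame classes and all MAC addresses are made up for illustration.

```python
# Toy model only; not Neutron code. All MAC addresses are invented.
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst: str


class Cloud:
    def __init__(self, name, dvr_macs):
        self.name = name
        # DVR MACs that this cloud's Neutron knows about.
        self.known_dvr_macs = set(dvr_macs)

    def egress(self, frame, host_dvr_mac):
        """Source host replaces the routed frame's source MAC with its DVR MAC."""
        return Frame(src_mac=host_dvr_mac, dst=frame.dst)

    def ingress(self, frame):
        """Destination host accepts a routed frame only from a known DVR MAC."""
        return frame.src_mac in self.known_dvr_macs


cloud1 = Cloud("OpenStack1", ["fa:16:3f:00:00:01"])
cloud2 = Cloud("OpenStack2", ["fa:16:3f:00:00:02"])

ping = Frame(src_mac="fa:16:3e:aa:aa:aa", dst="Instance4")
routed = cloud1.egress(ping, "fa:16:3f:00:00:01")

print(cloud2.ingress(routed))  # False: OpenStack2 does not know this DVR MAC
```

Under this model the frame is rejected simply because each Neutron only knows its own DVR MACs.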
>
> So in this case the packet must be routed (L3) since these are different
> L2 networks. Typically this would go from Instance1 -> R1 -> Instance4.
> But then the return path would be different? Instance4 -> R2 ->
> Instance1? It's hard to tell from the diagram. I'm curious if having
> both R1 and R2 plugged-into both net3 and net4 is causing problems?
>
> I'll try and configure something similar here, but it might help if you
> give a list of the commands you ran to configure neutron here.
>
> [joehuang] The data path is correct: Instance1 -> R1 -> Instance4, return
> path: Instance4 -> R2 -> Instance1.
> But R1 and R2 are located in two Neutrons. There will be some issues if you
> try this with VxLAN, because the VTEPs need to be populated into the remote
> Neutron (this is another topic). But you can try it using DVR+VLAN, and the
> VLAN range should be accessible in both Neutrons, i.e., in Neutron 1 and
> Neutron 2, enable DVR+VLAN and manually create net3 and net4 in both
> Neutrons with the same VLAN segment.
So is the asymmetric setup causing this? For example, if Instance4 used R1 as
the gateway IP to net3 do things work? I guess I would have to recommend doing
manual changes to the running config to prove it can work, then maybe we can
figure out how to tweak the code.
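
To spell out the asymmetry in question, here is a minimal Python sketch of the forward and return paths, assuming routed traffic always goes through the sender's configured gateway router. The placement table comes from the second figure in this thread; the path helper and the gateway_override argument are hypothetical names used only for illustration.

```python
# Which cloud each instance runs in and which network it is attached to,
# taken from the second figure in this thread.
INSTANCES = {
    "Instance1": ("OpenStack1", "net3"),
    "Instance2": ("OpenStack2", "net3"),
    "Instance3": ("OpenStack1", "net4"),
    "Instance4": ("OpenStack2", "net4"),
}
LOCAL_ROUTER = {"OpenStack1": "R1", "OpenStack2": "R2"}


def path(src, dst, gateway_override=None):
    """Hops for src -> dst; routed traffic uses the sender's gateway router."""
    src_cloud, src_net = INSTANCES[src]
    _, dst_net = INSTANCES[dst]
    if src_net == dst_net:
        return [src, dst]  # same L2 network, no router involved
    overrides = gateway_override or {}
    router = overrides.get(src, LOCAL_ROUTER[src_cloud])
    return [src, router, dst]


print(path("Instance1", "Instance4"))  # ['Instance1', 'R1', 'Instance4']
print(path("Instance4", "Instance1"))  # ['Instance4', 'R2', 'Instance1']
# Hypothetical manual change: point Instance4 at R1's gateway instead.
print(path("Instance4", "Instance1", {"Instance4": "R1"}))
```

With the override, both directions go through R1, which is the symmetric case worth testing by hand.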
>> The latter deployment brings more flexibility in networking capability,
>> and means we don't have to prevent an L2 network across OpenStack from
>> being attached to DVR mode routers with add_router_interface; otherwise,
>> only legacy routers can be supported for an L2 network across OpenStack.
>>
>> Any thoughts on how to address this issue so that DVR and an L2 network
>> across OpenStack can work together?
>
> DVR is really an L3 paradigm although there is some L2 programming going
> on so the same MAC can be used across compute nodes. So I'm not sure if
> this is a bug you've found or an unsupported configuration.
>
> [joehuang] To be honest, this is not a bug inside one Neutron; it only happens
> in networking across Neutrons. The scenario is to support application geo-redundancy:
> deploying the application into different OpenStack clouds.
It's interesting to say the least.
-Brian