[openstack-dev] [Quantum] continuing todays discussion about the l3 agents

Soheil Eizadi seizadi at infoblox.com
Mon Dec 3 18:12:02 UTC 2012


I am commenting on this thread since it is an area of interest to me. I am
new to this list and currently only receive the digest version. If there is
a link to a wiki page that documents the use cases for this feature, that
would be great; I am new to OpenStack development and might not have the
full picture of what you are trying to implement.

There was discussion of failover (FO), but I did not see the 2003 IETF
draft referenced:
http://tools.ietf.org/html/draft-ietf-dhc-failover-12
I don't think there is any ongoing effort to standardize it as a full RFC.
We don't see market demand for interoperable solutions, which is probably
why there is no RFC, but the draft is a good guideline if you are going to
design failover. Vendors with commercial failover offerings have
implemented extensions to the draft. Microsoft documented their extensions
here:
http://msdn.microsoft.com/en-us/library/hh880579(v=prot.20).aspx


From this thread I see a requirement for HA, and in the thread there was a
discussion of VRRP and failover; I read the write-up as presenting them as
mutually exclusive options. Depending on the requirements for HA, you
might need both: VRRP-style active/passive HA provides local
survivability, while an active/active failover solution could be used by
customers that want disaster recovery. In some use cases we have customers
use both.


I think the proposal on this thread to have a flexible agent registration
approach that can accommodate different mappings of dhcp agents to
networks (one-to-one, many-to-one, and one-to-many) is great. It provides
flexibility for the different Service Provider use cases; a rough sketch
of such a mapping follows.
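
As a rough illustration (hypothetical data and names, not an actual
Quantum API), a many-to-many agent/network mapping subsumes all three
cardinalities:

    # Hypothetical illustration: one structure covers one-to-one,
    # many-to-one, and one-to-many agent/network assignments.
    agent_networks = {
        'dhcp-agent-1': ['net-a', 'net-b'],  # one agent, many networks
        'dhcp-agent-2': ['net-b'],           # net-b served by two agents
    }

    def agents_for(network_id):
        """Return the agents currently hosting the given network."""
        return [agent for agent, nets in agent_networks.items()
                if network_id in nets]

    print(agents_for('net-b'))  # ['dhcp-agent-1', 'dhcp-agent-2']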

I did not see a discussion of the following topics for dhcp agents; I am
not sure whether they are in the use cases, but they are of interest to
the work I am researching:
- The ability to provide DHCP as a plugin, rather than built in as a core
service.
- The ability to support hybrid cloud; in this environment the DHCP
service would need to coordinate IP addresses between the private and
public cloud across a stretched network. (A DHCP plugin is a component of
this solution.)

-Soheil


On 12/3/12 4:00 AM, "openstack-dev-request at lists.openstack.org"
<openstack-dev-request at lists.openstack.org> wrote:

>Message: 9
>Date: Mon, 03 Dec 2012 19:42:54 +0800
>From: gong yong sheng <gongysh at linux.vnet.ibm.com>
>To: gkotton at redhat.com
>Cc: OpenStack Development Mailing List
>	<openstack-dev at lists.openstack.org>
>Subject: Re: [openstack-dev] [Quantum] continuing todays discussion
>	about the l3 agents
>Message-ID: <50BC903E.5080604 at linux.vnet.ibm.com>
>Content-Type: text/plain; charset="iso-8859-1"; Format="flowed"
>
>On 12/03/2012 06:19 PM, Gary Kotton wrote:
>>On 12/03/2012 11:53 AM, gong yong sheng wrote:
>>>On 12/03/2012 05:32 PM, Gary Kotton wrote:
>>>>On 12/03/2012 04:16 AM, gong yong sheng wrote:
>>>>>On 12/03/2012 02:29 AM, Vinay Bannai wrote:
>>>>>>My understanding of the "scheduler" approach based on what I read
>>>>>>on the ML's is to have a mechanism where the DHCP agents can land
>>>>>>on different nodes. For example, just like we have compute hosts
>>>>>>in nova, we have a bunch of DHCP capable hosts (and L3 capable
>>>>>>hosts etc) that can be selected to host the network service for a
>>>>>>tenant when the network/subnet is created. The process of
>>>>>>selecting the host to run the service is based on a "scheduler".
>>>>>>This allows a graceful horizontal scaling. This approach is
>>>>>>similar to what nova does. You have a bunch of hosts capable of
>>>>>>providing a network service and the "scheduler" picks them based
>>>>>>on filters and other tunable knobs. I think you already know
>>>>>>this:-). I  was spelling it out so that you can see where I am
>>>>>>coming from.
>>>>>If we don't want all dhcp agents to host the data of all the
>>>>>networks, my idea is:
>>>>>1. let the quantum server have the ability to know all about the dhcp
>>>>>agents. For example, we can have quantum agents-list to show all the
>>>>>agents running in the quantum deployment,
>>>>>and the networks they are hosting.
>>>>
>>>>ok, this may be useful for debugging. will this display and include
>>>>the status of the dhcp agent, say, for example, if i deploy an agent
>>>>and it hits an exception due to a bug?
>>>I think it is good for management, not just for debugging. If the
>>>agent cannot start, quantum agents-list will not show it with a :)
>>>status. If it hits an exception but is still running, we can have it
>>>report its status along with the exception. But regarding the log, I
>>>think we should leave that to higher-level management tools. For
>>>example, an admin user can start the agent with an integrated log
>>>facility.
>>
>>Visibility is always good.
>>>>>2. let the admin user have the ability to configure which networks
>>>>>the dhcp agents should host. For example: quantum dhcpagent-update
>>>>>dhcpagent1 --networks network1 network2 network3, or quantum
>>>>>net-create network1 --dhcpagents agent1 agent2. And if the admin user
>>>>>does not specify which agent hosts which network, we can let the
>>>>>scheduler decide automatically.
>>>>
>>>>this is exactly what i am suggesting, except that we do not need to
>>>>change the quantum api to provide this; the agent can receive it
>>>>as an input parameter. in principle we agree on what needs to be
>>>>done, but the question is how.
>>>I am suggesting controlling the agents from the quantum server: we can
>>>use the quantum cli or API to dictate which agents host which networks.
>>>Of course, the admin user can also configure the dhcp agents to accept
>>>only some networks' data; we can support both.
>>>We can add a quantum api (maybe via an extension), store the
>>>network-to-dhcp-agent mapping in the db, and then notify the related
>>>dhcp agents with the related data.
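
A minimal sketch of what such a mapping table could look like
(hypothetical schema; a real extension would live in Quantum's models):

    import sqlalchemy as sa
    from sqlalchemy.orm import declarative_base

    Base = declarative_base()

    class NetworkDhcpAgentBinding(Base):
        """Hypothetical many-to-many binding of networks to dhcp agents."""
        __tablename__ = 'network_dhcp_agent_bindings'
        network_id = sa.Column(sa.String(36), primary_key=True)
        dhcp_agent_host = sa.Column(sa.String(255), primary_key=True)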
>>
>>I personally do not like this approach and I think that the agents
>>should be independent of the service (as much as possible). At the
>>moment the DHCP agent gets its configuration via the notifications. I
>>have a number of questions:
>The agent's configuration is all in the quantum.conf and dhcp.ini files;
>the dhcp server's network data comes from the quantum server via message
>notification.
>>1. do you plan to continue to use the same mechanism to implement
>>direct communication with the specific agents?
>I am not quite following you here, but I think we can use a similar
>AMQP message mechanism to do this: dhcp agents listen on their own
>specific queues, such as dhcp.hosta, dhcp.hostb, etc., and we send
>messages to these queues according to which networks are hosted by which
>agents. Maybe some people will dislike keying on the host (which has a
>physical connotation), but I think it is just a name.
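
As an illustration of that idea (hypothetical exchange and queue names,
not the actual Oslo RPC code), per-host delivery might look like this
with kombu:

    from kombu import Connection, Exchange, Queue

    # One direct exchange; one queue per agent host.
    exchange = Exchange('quantum_dhcp', type='direct')
    host_queue = Queue('dhcp.hosta', exchange=exchange,
                       routing_key='dhcp.hosta')

    with Connection('amqp://guest:guest@localhost//') as conn:
        # The server routes a network update only to the hosting agent.
        producer = conn.Producer()
        producer.publish({'network_id': 'net-1', 'event': 'updated'},
                         exchange=exchange,
                         routing_key='dhcp.hosta',
                         declare=[host_queue])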
>>2. just because a compute node/network node loses connectivity with
>>the quantum service does not mean that it loses network connectivity
>>(and vice versa). How will we deal with issues like this?
>I don't understand what you mean here. Can you give an example?
>>3. Can you please clarify further regarding the extension? will this be
>>on the subnet (that is where there is a flag for dhcp status)?
>The extension is just like the l3, quota and provider network extensions
>we have. I think it will extend the network model, since our dhcp agent
>starts a dnsmasq server per network.
>Of course, we can try the smaller granularity of the subnet.
>>
>>>>>So to scale vertically:
>>>>>we can have many agents host the same networks.
>>>>>So to scale horizontally:
>>>>>we can add as many dhcp agents as needed. The quantum scheduler will
>>>>>distribute new networks automatically, or the admin user can specify.
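
A minimal sketch of what such automatic distribution could look like
(hypothetical least-loaded policy and names):

    def schedule_network(agent_loads, network_id):
        """Bind a new network to the agent hosting the fewest networks."""
        agent = min(agent_loads, key=agent_loads.get)
        agent_loads[agent] += 1
        return agent

    loads = {'dhcp-agent-1': 3, 'dhcp-agent-2': 1}
    print(schedule_network(loads, 'net-new'))  # dhcp-agent-2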
>>>>
>>>>i have a number of problems with a quantum "scheduler": the first
>>>>is that it is a single point of failure; the second is that it
>>>>needs to be aware of the state and load of a dhcp agent. how will
>>>>the scheduler provide for HA?
>>>What do you mean by single point of failure? In fact, we don't need to
>>>run it as a separate binary at all; it is just a module of
>>>quantum-server.
>>>The state and load of a dhcp agent are in the db. If you mean the
>>>quantum-server's HA, I don't think we have a good solution by now.
>>>I think we should break the current quantum-server into two parts:
>>>1. one part just for the (REST) API, for which we can use multiple
>>>processes just like the other nova api servers; the operator will also
>>>be able to use a load balancer to provide HA for it.
>>>2. another part for incoming queue message processing. We can run
>>>multiple nodes of this part; this is the AMQP feature that scales out.
>>>>
>>>>
>>>>>For us to run multiple dhcp agents, we need to make sure our dhcp
>>>>>anti-spoofing works.
>>>>>>
>>>>>>Either way we look at it, I think it will be helpful if we
>>>>>>decoupled the horizontal scaling (scaling to multiple nodes) and
>>>>>>vertical scaling (redundancy and failover). One should not imply the
>>>>>>other. In your last paragraph, you mention an "orchestration tool"
>>>>>>and dhcp agents configured to handle specific networks. I have not
>>>>>>been able to wrap my head around this completely, but it appears to
>>>>>>be a different variant of the "scheduler" approach where it is
>>>>>>configured manually. Is my understanding correct? Or, if you don't
>>>>>>mind, can you elaborate further on that idea?
>>>>>>
>>>>>>Thanks
>>>>>>Vinay
>>>>>>
>>>>>>On Sun, Dec 2, 2012 at 7:16 AM, Gary Kotton <gkotton at redhat.com
>>>>>><mailto:gkotton at redhat.com>> wrote:
>>>>>>
>>>>>>     On 12/01/2012 03:31 AM, gong yong sheng wrote:
>>>>>>>     On 12/01/2012 07:49 AM, Vinay Bannai wrote:
>>>>>>>>     Gary and Mark,
>>>>>>>>
>>>>>>>>     You brought up the issue of scaling horizontally
>>>>>>>>     and vertically in your earlier email. In the case of
>>>>>>>>     horizontal scaling, I would agree that it would have to be
>>>>>>>>     based on the "scheduler" approach proposed by Gong and Nachi.
>>>>>>
>>>>>>     I am not sure that I understand the need for a scheduler when
>>>>>>     it comes to the DHCP agent.  In my opinion it is unnecessary
>>>>>>     overhead.
>>>>>>
>>>>>>     Last week Mark addressed the problem of all of the DHCP
>>>>>>     agents listening on the same message queue. In theory we
>>>>>>     are able to run more than one DHCP agent in parallel. This
>>>>>>     offers HA at the expense of an IP per DHCP agent per subnet.
>>>>>>
>>>>>>     I think that for the DHCP agents we need to look into enabling
>>>>>>     them to treat specific networks. This can be done in a very
>>>>>>     rudimentary way - have a configuration variable for the DHCP
>>>>>>     agent indicating a list of networks to be treated by that
>>>>>>     agent. An orchestration tool can just configure the network
>>>>>>     IDs and launch the service - then we will have a scalable and
>>>>>>     highly available DHCP service. I would prefer not to have to
>>>>>>     add this into the Quantum API as it just complicates things.
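
A minimal sketch of such agent-side filtering (hypothetical option name
and payload shape; the real agent would read the list from its .ini file):

    # Hypothetical: populated from a config option such as
    # networks = net-1,net-2 in the dhcp agent's configuration file.
    TREATED_NETWORKS = {'net-1', 'net-2'}

    def handle_network_update(payload):
        """Ignore notifications for networks this agent does not treat."""
        network_id = payload.get('network', {}).get('id')
        if TREATED_NETWORKS and network_id not in TREATED_NETWORKS:
            return  # another agent is responsible for this network
        reconfigure_dnsmasq(network_id)

    def reconfigure_dnsmasq(network_id):
        # Placeholder for the real agent action.
        print('would reconfigure dnsmasq for %s' % network_id)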
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>>>     On the issue of vertical scaling (I am using the DHCP
>>>>>>>>     redundancy as an example), I think it would be good to base
>>>>>>>>     our discussions on the various methods that have been
>>>>>>>>     discussed and do pro/con analysis in terms of scale,
>>>>>>>>     performance and other such metrics.
>>>>>>>>
>>>>>>>>     - Split scope DHCP (two or more servers split the IP address
>>>>>>>>     and there is no overlap)
>>>>>>>>       pros: simple
>>>>>>>>       cons: wastes IP addresses,
>>>>>>>>
>>>>>>>>     - Active/Standby model (might run VRRP or heartbeats to
>>>>>>>>     dictate who is active)
>>>>>>>>       pros: load evenly shared
>>>>>>>>       cons: needs shared knowledge of address assignments,
>>>>>>>>                 needs heartbeats or VRRP to keep track of failovers
>>>>>>>     Another con is the IP address waste: we need one VIP, plus two
>>>>>>>     or more addresses for the VRRP servers (we can use the dhcp
>>>>>>>     servers' IPs if we don't want to do load balancing behind the
>>>>>>>     VRRP servers). Another is that it makes the system complicated.
>>>>>>>>
>>>>>>>>     - LB method (use load balancer to fan out to multiple dhcp
>>>>>>>>     servers)
>>>>>>>>       pros: scales very well
>>>>>>>>       cons: the lb becomes the single point of failure,
>>>>>>>>                the lease assignments needs to be shared between
>>>>>>>>     the dhcp servers
>>>>>>>>
>>>>>>>     The LB method will also waste IP addresses. First, we need at
>>>>>>>     least a VIP address; then we will need more dhcp servers
>>>>>>>     running for one network.
>>>>>>>     If we need to VRRP the VIP, we will need two or more additional
>>>>>>>     addresses. Another con is that it makes the system complicated.
>>>>>>>>     I see that the DHCP agent and the quantum server communicate
>>>>>>>>     using RPC. Is the plan to leave it alone or migrate it
>>>>>>>>     towards something like an AMQP-based server in the future
>>>>>>>>     when the "scheduler" stuff is implemented?
>>>>>>>     I am not very clear on your point, but the current RPC is
>>>>>>>     already on AMQP.
>>>>>>>>
>>>>>>>>     Vinay
>>>>>>>>
>>>>>>>>
>>>>>>>>     On Wed, Nov 28, 2012 at 8:03 AM, Mark McClain
>>>>>>>>     <mark.mcclain at dreamhost.com
>>>>>>>>     <mailto:mark.mcclain at dreamhost.com>> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>         On Nov 28, 2012, at 8:03 AM, gong yong sheng
>>>>>>>>         <gongysh at linux.vnet.ibm.com
>>>>>>>>         <mailto:gongysh at linux.vnet.ibm.com>> wrote:
>>>>>>>>
>>>>>>>>         > On 11/28/2012 08:11 AM, Mark McClain wrote:
>>>>>>>>         >> On Nov 27, 2012, at 6:33 PM, gong yong sheng
>>>>>>>>         <gongysh at linux.vnet.ibm.com
>>>>>>>>         <mailto:gongysh at linux.vnet.ibm.com>> wrote:
>>>>>>>>         >>
>>>>>>>>         >> Just wanted to clarify two items:
>>>>>>>>         >>
>>>>>>>>         >>>> At the moment all of the dhcp agents receive all of
>>>>>>>>         the updates. I do not see why we need the quantum
>>>>>>>>         service to indicate which agent runs where. This will
>>>>>>>>         change the manner in which the dhcp agents work.
>>>>>>>>         >>> No. Currently, we can run only one dhcp agent, since
>>>>>>>>         we are using a topic queue for notification.
>>>>>>>>         >> You are correct.  There is a bug in the underlying
>>>>>>>>         Oslo RPC implementation that sets the topic and queue
>>>>>>>>         names to the same value.  I didn't get a clear
>>>>>>>>         explanation of this problem until today and will have to
>>>>>>>>         figure out a fix for oslo.
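
For illustration (generic AMQP behavior, not the Oslo code itself): when
several consumers share one queue, the broker round-robins each message to
a single consumer, so only one agent sees each notification; fanout
delivery requires a separate queue per consumer:

    import socket
    from kombu import Exchange, Queue

    notifications = Exchange('dhcp_notifications', type='fanout')

    # Shared queue: each message is delivered to only ONE of the agents.
    shared = Queue('dhcp_agent', exchange=notifications)

    # Per-agent queues: every agent receives its own copy of each message.
    mine = Queue('dhcp_agent.%s' % socket.gethostname(),
                 exchange=notifications)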
>>>>>>>>         >>
>>>>>>>>         >>> And one problem with multiple agents serving the
>>>>>>>>         same ip is:
>>>>>>>>         >>> we will have more than one agent wanting to update
>>>>>>>>         the ip's lease time now and then.
>>>>>>>>         >> This is not a problem.  The DHCP protocol was
>>>>>>>>         designed for multiple servers on a network.  When a
>>>>>>>>         client accepts a lease, the server that offered the
>>>>>>>>         accepted lease will be the only process attempting to
>>>>>>>>         update the lease for that port.  The other DHCP
>>>>>>>>         instances will not do anything, so there won't be any
>>>>>>>>         chance for a conflict.  Also, when a client renews it
>>>>>>>>         sends a unicast message to that previous DHCP server and
>>>>>>>>         so there will only be one writer in this scenario too.
>>>>>>>>          Additionally, we don't have to worry about conflicting
>>>>>>>>         assignments because the dhcp agents use the same static
>>>>>>>>         allocations from the Quantum database.
>>>>>>>>         > I mean the dhcp agent is trying to update the lease
>>>>>>>>         time on the quantum server. If we have more than one
>>>>>>>>         dhcp agent, this will cause confusion.
>>>>>>>>         >     def update_lease(self, network_id, ip_address,
>>>>>>>>         >                      time_remaining):
>>>>>>>>         >         try:
>>>>>>>>         >             self.plugin_rpc.update_lease_expiration(
>>>>>>>>         >                 network_id, ip_address, time_remaining)
>>>>>>>>         >         except:
>>>>>>>>         >             self.needs_resync = True
>>>>>>>>         >             LOG.exception(_('Unable to update lease'))
>>>>>>>>         > I think it is a defect in our dhcp agent. Why does our
>>>>>>>>         dhcp agent need the lease time? All the IPs are managed
>>>>>>>>         in our quantum server; there is no need for dynamic ip
>>>>>>>>         management in the dhcp server managed by the dhcp agent.
>>>>>>>>
>>>>>>>>         There cannot be confusion.  The dhcp client selects only
>>>>>>>>         one server to accept a lease, so only one agent will
>>>>>>>>         update this field at a time. (See RFC2131 section 4.3.2
>>>>>>>>         for protocol specifics).  The dnsmasq allocation
>>>>>>>>         database is static in Quantum's setup, so the lease
>>>>>>>>         renewal needs to propagate to the Quantum Server.  The
>>>>>>>>         Quantum server then uses the lease time to avoid
>>>>>>>>         allocating IP addresses before the lease has expired.
>>>>>>>>          In Quantum, we add an additional restriction that
>>>>>>>>         expired allocations are not reclaimed until the
>>>>>>>>         associated port has been deleted as well.
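
A sketch of that reclamation rule (hypothetical names, not the actual
Quantum code):

    from datetime import datetime, timedelta

    def can_recycle_ip(lease_expiration, port_deleted, now=None):
        """Reclaimable only when the lease has expired AND the
        associated port has been deleted."""
        now = now or datetime.utcnow()
        return port_deleted and lease_expiration <= now

    # Lease expired an hour ago but the port still exists -> keep the IP.
    expired = datetime.utcnow() - timedelta(hours=1)
    print(can_recycle_ip(expired, port_deleted=False))  # False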
>>>>>>>>
>>>>>>>>         mark
>>>>>>>>



