My understanding of the "scheduler" approach, based on what I have read on the mailing lists, is that it provides a mechanism by which the DHCP agents can land on different nodes. Just as we have compute hosts in nova, we have a pool of DHCP-capable hosts (and L3-capable hosts, etc.) that can be selected to host the network service for a tenant when the network/subnet is created. The host that runs the service is selected by a "scheduler", which picks among the capable hosts based on filters and other tunable knobs, much as nova does. This allows graceful horizontal scaling. I think you already know this :-). I was spelling it out so that you can see where I am coming from. <div>

<br></div><div>Either way we look at it, I think it would be helpful to decouple horizontal scaling (scaling out to multiple nodes) from vertical scaling (redundancy and failover). One should not imply the other. In your last paragraph, you mention an "orchestration tool" and DHCP agents configured to handle specific networks. I have not been able to wrap my head around this completely, but it appears to be a variant of the "scheduler" approach in which the assignment is configured manually. Is my understanding correct? Or, if you don't mind, could you elaborate further on that idea? </div>
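<div>To make the "scheduler" idea above concrete, here is a minimal sketch of filter-based host selection. It is only an illustration under my own assumptions: <code>DhcpHost</code>, <code>alive_filter</code> and <code>schedule</code> are hypothetical names, not Quantum or nova code, and the "fewest networks wins" rule is just one possible weighing knob.</div>
<pre>
```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DhcpHost:
    """A host able to run a DHCP agent (hypothetical record, not a Quantum model)."""
    name: str
    alive: bool = True
    networks: set = field(default_factory=set)  # network IDs already hosted


def alive_filter(host: DhcpHost) -> bool:
    # Only consider hosts whose agent is reported as up.
    return host.alive


def schedule(network_id: str, hosts: list, filters: list) -> Optional[DhcpHost]:
    """Pick the least-loaded host that passes every filter, or None."""
    candidates = [h for h in hosts if all(f(h) for f in filters)]
    if not candidates:
        return None
    # Weighing step: fewest hosted networks wins; the name breaks ties.
    chosen = min(candidates, key=lambda h: (len(h.networks), h.name))
    chosen.networks.add(network_id)
    return chosen


hosts = [DhcpHost("dhcp-1"), DhcpHost("dhcp-2", alive=False), DhcpHost("dhcp-3")]
for net in ("net-a", "net-b", "net-c"):
    picked = schedule(net, hosts, [alive_filter])
    print(net, "->", picked.name)
```
</pre>
<div>Note that this only covers placement (horizontal scaling); redundancy would need the scheduler to place the same network on more than one host.</div>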

<div><br></div><div>Thanks</div><div>Vinay<br><br><div class="gmail_quote">On Sun, Dec 2, 2012 at 7:16 AM, Gary Kotton <span dir="ltr"><<a href="mailto:gkotton@redhat.com" target="_blank">gkotton@redhat.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

  <div bgcolor="#FFFFFF" text="#000000"><div class="im">

    On 12/01/2012 03:31 AM, gong yong sheng wrote:

    <blockquote type="cite">

      <div>On 12/01/2012 07:49 AM, Vinay Bannai

        wrote:<br>

      </div>

      <blockquote type="cite">Gary and Mark,

        <div><br>

        </div>

        <div>You brought up the issue of scaling horizontally

          and vertically in your earlier email. In the case of

          horizontal scaling, I would agree that it would have to be

          based on the "scheduler" approach proposed by Gong and Nachi.

          <br>

        </div>

      </blockquote>

    </blockquote>

    <br></div>

    I am not sure that I understand the need for a scheduler when it

    comes to the DHCP agent.  In my opinion this is unnecessary overhead

    and it is not necessarily required. <br>

    <br>

    Last week Mark addressed the problem of all the DHCP agents

    listening on the same message queue. In theory we are able to run

    more than one DHCP agent in parallel. This offers HA at the expense

    of an IP per DHCP agent per subnet. <br>

    <br>

    I think that for the DHCP agents we need to look into enabling each

    agent to handle specific networks. This can be done in a very

    rudimentary way - have a configuration variable for the DHCP agent

    indicating a list of networks to be handled by the agent. An

    orchestration tool can just configure the network IDs and launch

    the service - then we will have a scalable and highly available DHCP

    service. I would prefer not to have to add this into the Quantum API

    as it just complicates things.<div><div class="h5"><br>

    <br>
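<div>A rough sketch of the configuration-based approach described above. The option format (a comma-separated list of network IDs), the helper names, and the rule that an empty list means "all networks" are assumptions made for illustration; none of this is existing Quantum agent code.</div>
<pre>
```python
def parse_handled_networks(raw: str) -> frozenset:
    """Parse a comma-separated list of network IDs from a config value."""
    return frozenset(part.strip() for part in raw.split(",") if part.strip())


def should_handle(network_id: str, handled: frozenset) -> bool:
    # Assumption: an empty set keeps the current behaviour (serve everything).
    return not handled or network_id in handled


# Two agents configured by an orchestration tool; net-2 is on both for HA.
agent_a = parse_handled_networks("net-1, net-2")
agent_b = parse_handled_networks("net-2,net-3")

for net in ("net-1", "net-2", "net-3"):
    owners = [name for name, cfg in (("a", agent_a), ("b", agent_b))
              if should_handle(net, cfg)]
    print(net, owners)
```
</pre>
<div>An agent would apply <code>should_handle</code> to each notification it receives and ignore networks outside its list.</div>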

    <br>

    <br>

    <br>

    <blockquote type="cite">

      <blockquote type="cite">

        <div>On the issue of vertical scaling (I am using the DHCP

          redundancy as an example), I think it would be good to base

          our discussions on the various methods that have been

          discussed and do pro/con analysis in terms of scale,

          performance and other such metrics. </div>

        <div><br>

        </div>

        <div>- Split scope DHCP (two or more servers split the IP

          address and there is no overlap)</div>

        <div>  pros: simple</div>

        <div>  cons: wastes IP addresses,</div>

        <div><br>

        </div>

        <div>- Active/Standby model (might have to run VRRP or heartbeats to

          dictate who is active)</div>

        <div>  pros: load evenly shared</div>

        <div>  cons: needs shared knowledge of address assignments, </div>

        <div>            needs heartbeats or VRRP to keep track of

          failovers</div>

      </blockquote>

      Another drawback is wasted IP addresses: we need one VIP, plus two

      or more addresses for the VRRP servers (we can use the DHCP servers'

      own IPs if we don't want to do load balancing behind the VRRP servers).<br>

      It will also make the system more complicated.<br>

      <blockquote type="cite">

        <div><br>

        </div>

        <div>- LB method (use load balancer to fan out to multiple dhcp

          servers)</div>

        <div>  pros: scales very well </div>

        <div>  cons: the lb becomes the single point of failure,</div>

        <div>           the lease assignments needs to be shared between

          the dhcp servers</div>

        <div><br>

        </div>

      </blockquote>

      The LB method will also waste IP addresses. First, we need at least a

      VIP address; then we need more DHCP servers running for one

      network.<br>

      If we need to run VRRP for the VIP, we will need two or more additional addresses.<br>

      It will also make the system more complicated.<br>
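<div>For comparison, the split-scope option listed earlier can be sketched with the standard <code>ipaddress</code> module. This is only an illustration: <code>split_scope</code> is a hypothetical helper, and the even split (with any remainder left unassigned) is an assumption. Each server can hand out only its own range, so a server that is down takes its share of the pool with it.</div>
<pre>
```python
import ipaddress


def split_scope(cidr: str, servers: int) -> list:
    """Divide the usable hosts of a subnet into non-overlapping ranges."""
    hosts = list(ipaddress.ip_network(cidr).hosts())
    size = len(hosts) // servers  # addresses each server may hand out
    if size == 0:
        raise ValueError("more servers than usable addresses")
    # Any remainder after the integer division stays unassigned in this sketch.
    return [(hosts[i * size], hosts[(i + 1) * size - 1]) for i in range(servers)]


for first, last in split_scope("10.0.0.0/28", 2):
    print(first, "-", last)
```
</pre>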

      <blockquote type="cite">

        <div>I see that the DHCP agent and the quantum server

          communicate using RPC. Is the plan to leave it alone or to

          migrate it towards something like an AMQP-based server in the

          future, when the "scheduler" stuff is implemented? <br>

        </div>

      </blockquote>

      I am not very clear on your point, but the current RPC already runs over AMQP.<br>

      <blockquote type="cite">

        <div><br>

        </div>

        <div>Vinay</div>

        <div><br>

        </div>

        <div><br>

        </div>

        <div class="gmail_quote">On Wed, Nov 28, 2012 at 8:03 AM, Mark

          McClain <span dir="ltr"><<a href="mailto:mark.mcclain@dreamhost.com" target="_blank">mark.mcclain@dreamhost.com</a>></span>

          wrote:<br>

          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

            <div><br>

              On Nov 28, 2012, at 8:03 AM, gong yong sheng <<a href="mailto:gongysh@linux.vnet.ibm.com" target="_blank">gongysh@linux.vnet.ibm.com</a>>

              wrote:<br>

              <br>

              > On 11/28/2012 08:11 AM, Mark McClain wrote:<br>

              >> On Nov 27, 2012, at 6:33 PM, gong yong sheng <<a href="mailto:gongysh@linux.vnet.ibm.com" target="_blank">gongysh@linux.vnet.ibm.com</a>>

              wrote:<br>

              >><br>

              >> Just wanted to clarify two items:<br>

              >><br>

              >>>> At the moment all of the dhcp agents

              receive all of the updates. I do not see why we need the

              quantum service to indicate which agent runs where. This

              will change the manner in which the dhcp agents work.<br>

              >>> No. currently, we can run only one dhcp agent

               since we are using a topic queue for notification.<br>

              >> You are correct.  There is a bug in the

              underlying Oslo RPC implementation that sets the topic and

              queue names to be same value.  I didn't get a clear

              explanation of this problem until today and will have to

              figure out a fix to oslo.<br>

              >><br>

              >>> And one problem with multiple agents serving

              the same ip is:<br>

              >>> more than one agent will want to

              update the IP's lease time now and then.<br>

              >> This is not a problem.  The DHCP protocol was

              designed for multiple servers on a network.  When a client

              accepts a lease, the server that offered the accepted

              lease will be the only process attempting to update the

              lease for that port.  The other DHCP instances will not do

              anything, so there won't be any chance for a conflict.

               Also, when a client renews it sends a unicast message to

              that previous DHCP server and so there will only be one

              writer in this scenario too.  Additionally, we don't have

              to worry about conflicting assignments because the dhcp

              agents use the same static allocations from the Quantum

              database.<br>

              > I mean that the dhcp agent is trying to report the lease time to

              the quantum server. If we have more than one dhcp agent, this

              will cause confusion.<br>

              >    def update_lease(self, network_id, ip_address,

              time_remaining):<br>

              >        try:<br>

              >          

               self.plugin_rpc.update_lease_expiration(network_id,

              ip_address,<br>

              >                                                  

               time_remaining)<br>

              >        except:<br>

              >            self.needs_resync = True<br>

              >            LOG.exception(_('Unable to update lease'))<br>

              > I think this is a defect in our dhcp agent. Why does our

              dhcp agent need the lease time? All the IPs are managed in

              our quantum server, so there is no need for dynamic IP

              management in the dhcp server managed by the dhcp agent.<br>

              <br>

            </div>

            There cannot be confusion.  The dhcp client selects only one

            server to accept a lease, so only one agent will update this

            field at a time. (See RFC2131 section 4.3.2 for protocol

            specifics).  The dnsmasq allocation database is static in

            Quantum's setup, so the lease renewal needs to propagate to

            the Quantum Server.  The Quantum server then uses the lease

            time to avoid allocating IP addresses before the lease has

            expired.  In Quantum, we add an additional restriction that

            expired allocations are not reclaimed until the associated

            port has been deleted as well.<br>
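<div>The reclaim rule described above can be sketched as a single predicate. The <code>Allocation</code> record and <code>can_reclaim</code> are hypothetical stand-ins for illustration, not the Quantum schema or server code.</div>
<pre>
```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Allocation:
    """Hypothetical stand-in for an IP allocation row; not the Quantum schema."""
    ip_address: str
    lease_expires: datetime
    port_deleted: bool


def can_reclaim(alloc: Allocation, now: datetime) -> bool:
    # Both conditions must hold: the lease has expired AND the port is gone.
    return alloc.port_deleted and now >= alloc.lease_expires


now = datetime(2012, 11, 28, 12, 0)
hour = timedelta(hours=1)
print(can_reclaim(Allocation("10.0.0.5", now - hour, port_deleted=True), now))   # True
print(can_reclaim(Allocation("10.0.0.6", now - hour, port_deleted=False), now))  # False
print(can_reclaim(Allocation("10.0.0.7", now + hour, port_deleted=True), now))   # False
```
</pre>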

            <div>

              <div><br>

                mark<br>

                <br>

                <br>

                _______________________________________________<br>

                OpenStack-dev mailing list<br>

                <a href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a><br>

                <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

              </div>

            </div>

          </blockquote>

        </div>

        <br>

        <br clear="all">

        <div><br>

        </div>

        -- <br>

        Vinay Bannai<br>

        Email: <a href="mailto:vbannai@gmail.com" target="_blank">vbannai@gmail.com</a><br>

        Google Voice: <a href="tel:415%20938%207576" value="+14159387576" target="_blank">415 938 7576</a><br>

        <br>

        <br>


      </blockquote>

      <br>

      <br>


    </blockquote>

    <br>

  </div></div></div>


<br></blockquote></div><br><br clear="all"><div><br></div>-- <br>Vinay Bannai<br>Email: <a href="mailto:vbannai@gmail.com">vbannai@gmail.com</a><br>Google Voice: 415 938 7576<br><br>

</div>