<html>
  <head>
    <meta content="text/html; charset=windows-1252"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    Hi Yatin,<br>
    <br>
    Thanks for sharing your presentation. It looks great. You are
    welcome to contribute to the ZeroMQ driver.<br>
    <br>
    Cheers,<br>
    Li Ma<br>
    <br>
    <div class="moz-cite-prefix">On 2014/11/19 12:50, yatin kumbhare
      wrote:<br>
    </div>
    <blockquote
cite="mid:CAKXfucwSv_bS1uusZZRMyQF_21h1nksxHaXo4FRu1vTw8qmk4g@mail.gmail.com"
      type="cite">
      <div dir="ltr">Hello Folks,
        <div><br class="">
          Here are a couple of slides/diagrams that I put together for
          my own understanding back in the Havana release, particularly
          slide no. 10 onward.<br>
        </div>
        <div><br>
        </div>
        <div><a moz-do-not-send="true"
href="https://docs.google.com/presentation/d/1ZPWKXN7dzXs9bX3Ref9fPDiia912zsHCHNMh_VSMhJs/edit#slide=id.p">https://docs.google.com/presentation/d/1ZPWKXN7dzXs9bX3Ref9fPDiia912zsHCHNMh_VSMhJs/edit#slide=id.p</a><br>
        </div>
        <div><br>
        </div>
        <div>I am also committed to using zeromq as it's
          light-weight/fast/scalable.<br>
        </div>
        <div><br>
        </div>
        <div>I would like to chip in for further development regarding
          zeromq.</div>
        <div><br>
        </div>
        <div>Regards,</div>
        <div>Yatin<br>
          <div class="gmail_extra"><br>
            <div class="gmail_quote">On Wed, Nov 19, 2014 at 8:05 AM, Li
              Ma <span dir="ltr"><<a moz-do-not-send="true"
                  href="mailto:skywalker.nick@gmail.com" target="_blank">skywalker.nick@gmail.com</a>></span>
              wrote:<br>
              <blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
                <div bgcolor="#FFFFFF" text="#000000"><span class=""> <br>
                    <div>On 2014/11/19 1:49, Eric Windisch wrote:<br>
                    </div>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div class="gmail_extra">
                          <div class="gmail_quote">
                            <div>
                              <blockquote class="gmail_quote"
                                style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I
                                think for this cycle we really do need
                                to focus on consolidating and<br>
                                testing the existing driver design and
                                fixing up the biggest<br>
                                deficiency (1) before we consider moving
                                forward with lots of new</blockquote>
                            </div>
                            <div><br>
                            </div>
                            <div>+1</div>
                            <div> </div>
                            <blockquote class="gmail_quote"
                              style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">1)

                              Outbound messaging connection re-use -
                              right now every outbound<br>
                              messaging creates and consumes a tcp
                              connection - this approach scales<br>
                              badly when neutron does large fanout
                              casts.<br>
                            </blockquote>
                            <div><br>
                            </div>
                            <div><br>
                            </div>
                            <div>I'm glad you are looking at this and by
                              doing so, will understand the system
                              better. I hope the following will give
                              some insight into, at least, why I made
                              the decisions I made:</div>
                            <div> </div>
                            <div>This was an intentional design
                              trade-off. I saw three choices here: build
                              a fully decentralized solution, build a
                              fully-connected network, or use
                              centralized brokerage. I wrote off
                              centralized brokerage immediately. The
                              problem with a fully connected system is
                              that active TCP connections are required
                              between all of the nodes. I didn't think
                              that would scale, and it would be brittle
                              against floods (intentional or otherwise).</div>
                            <div><br>
                            </div>
                            <div>IMHO, I always felt the right solution
                              for large fanout casts was to use
                              multicast. When the driver was written,
                              Neutron didn't exist and there was no
                              use-case for large fanout casts, so I
                              didn't implement multicast, but knew it as
                              an option if it became necessary. It isn't
                              the right solution for everyone, of
                              course.</div>
                            <div><br>
                            </div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </span> Using multicast adds complexity to the switch
                  forwarding plane, which has to enable and maintain
                  multicast group communication. For large deployments,
                  I prefer to keep forwarding simple and easy to
                  maintain. IMO, running a set of fanout-router
                  processes in the cluster can achieve the same
                  goal.<br>
                  The data path is: openstack-daemon --------send the
                  message (with fanout=true) ---------> fanout-router
                  -----read the matchmaker------> send to the
                  destinations<br>
                  Actually it just uses unicast to simulate multicast.<span
                    class=""><br>
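                    <div>A minimal sketch of that data path, assuming an
                      in-memory dict in place of the matchmaker and a
                      caller-supplied send function in place of real
                      ZeroMQ sockets. The names route_fanout and
                      "fanout-topic" are hypothetical and are not part of
                      the driver:</div>
<pre>
```python
from typing import Callable, Dict, List

# Hypothetical matchmaker: maps a topic to the hosts listening on it.
Matchmaker = Dict[str, List[str]]


def route_fanout(topic: str, payload: bytes, matchmaker: Matchmaker,
                 send: Callable[[str, bytes], None]) -> int:
    """Deliver one fanout message as N unicast sends and return N."""
    hosts = matchmaker.get(topic, [])
    for host in hosts:
        send(host, payload)  # one unicast send per destination
    return len(hosts)


# Usage with an in-memory list standing in for the real transport.
delivered = []
matchmaker = {"fanout-topic": ["compute-1", "compute-2", "compute-3"]}
count = route_fanout("fanout-topic", b"update", matchmaker,
                     lambda host, msg: delivered.append((host, msg)))
```
</pre>
                    <div>The daemon sends a single message to the router,
                      and only the router needs to know the destination
                      list.</div>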
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div class="gmail_extra">
                          <div class="gmail_quote">
                            <div>For connection reuse, you could manage
                              a pool of connections and keep those
                              connections around for a configurable
                              amount of time, after which they'd expire
                              and be re-opened. This would keep the most
                              actively used connections alive. One
                              problem is that it would make the service
                              more brittle by making it far more
                              susceptible to running out of file
                              descriptors by keeping connections around
                              significantly longer. However, this
                              wouldn't be as brittle as fully-connecting
                              the nodes nor as poorly scalable.</div>
                            <div><br>
                            </div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </span> +1. Setting a large fd limit is not a
                  problem. Because we use a socket pool, we can control
                  the number of fds and keep it fixed.<span class=""><br>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div class="gmail_extra">
                          <div class="gmail_quote">
                            <div>If OpenStack and oslo.messaging were
                              designed specifically around this message
                              pattern, I might suggest that the library
                              and its applications be aware of
                              high-traffic topics and persist the
                              connections for those topics, while
                              keeping others ephemeral. For example, in
                              Nova, api->scheduler
                              traffic would be persistent, whereas
                              scheduler->compute_node would be
                              ephemeral.  Perhaps this is something that
                              could still be added to the library.</div>
                            <div><br>
                            </div>
                            <blockquote class="gmail_quote"
                              style="margin:0px 0px 0px
0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">2)

                              PUSH/PULL tcp sockets - Pieter suggested
                              we look at ROUTER/DEALER<br>
                              as an option once 1) is resolved - this
                              socket type pairing has some<br>
                              interesting features which would help with
                              resilience and availability<br>
                              including heartbeating. </blockquote>
                            <div><br>
                            </div>
                            <div>Using PUSH/PULL does not eliminate the
                              possibility of being fully connected, nor
                              is it incompatible with persistent
                              connections. If you're not going to be
                              fully-connected, there isn't much
                              advantage to long-lived persistent
                              connections and without those persistent
                              connections, you're not benefitting from
                              features such as heartbeating.</div>
                            <div><br>
                            </div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </span> How about REQ/REP? I think it is appropriate
                  for long-lived persistent connections and also provides
                  reliability, because every request gets a reply.<span class=""><br>
                    <blockquote type="cite">
                      <div dir="ltr">
                        <div class="gmail_extra">
                          <div class="gmail_quote">
                            <div>I'm not saying ROUTER/DEALER cannot be
                              used, but use them with care. They're
                              designed for long-lived channels between
                              hosts and not for the ephemeral-type
                              connections used in a peer-to-peer system.
                              Dealing with how to manage timeouts on the
                              client and the server and the swelling
                              number of active file descriptors that
                              you'll get by using ROUTER/DEALER is not
                              trivial, assuming you can get past the
                              management of all of those synchronous
                              sockets (hidden away by tons of eventlet
                              greenthreads)...</div>
                            <div><br>
                            </div>
                            <div>Extra anecdote: During a conversation
                              at the OpenStack summit, someone told me
                              about their experiences using ZeroMQ and
                              the pain of using REQ/REP sockets and how
                              they felt it was a mistake they used them.
                              We discussed a bit about some other
                              problems such as the fact it's impossible
                              to avoid TCP fragmentation unless you
                              force all frames to 552 bytes or have a
                              well-managed network where you know the
                              MTUs of all the devices you'll pass
                              through. Suggestions were made to make
                              ZeroMQ better, until we realized we had
                              just described TCP-over-ZeroMQ-over-TCP,
                              finished our beers, and quickly changed
                              topics.<br>
                            </div>
                          </div>
                        </div>
                      </div>
                    </blockquote>
                  </span> Well, it seems I need to take my last question
                  back. In our deployment, I always use jumbo frames to
                  increase throughput. You said that REQ/REP would
                  introduce TCP fragmentation unless ZeroMQ frames are
                  forced to 552 bytes? Could you please
                  elaborate?<span class=""><br>
                    <blockquote type="cite"> <br>
                      <fieldset></fieldset>
                      <br>
                      <pre>_______________________________________________
OpenStack-dev mailing list
<a moz-do-not-send="true" href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a>
<a moz-do-not-send="true" href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a>
</pre>
                    </blockquote>
                    <br>
                  </span></div>
                <br>
              </blockquote>
            </div>
            <br>
          </div>
        </div>
      </div>
      <br>
    </blockquote>
    <br>
  </body>
</html>