<html>

  <head>

    <meta content="text/html; charset=windows-1252"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi Yatin,<br>

    <br>

    Thanks for sharing your presentation; it looks great. You are welcome
    to contribute to the ZeroMQ driver.<br>

    <br>

    Cheers,<br>

    Li Ma<br>

    <br>

    <div class="moz-cite-prefix">On 2014/11/19 12:50, yatin kumbhare

      wrote:<br>

    </div>

    <blockquote

cite="mid:CAKXfucwSv_bS1uusZZRMyQF_21h1nksxHaXo4FRu1vTw8qmk4g@mail.gmail.com"

      type="cite">

      <div dir="ltr">Hello Folks,

        <div><br class="">

          Here are a couple of slides/diagrams that I put together for my
          own understanding back in the Havana release; see slide
          no. 10 onward in particular.<br>

        </div>

        <div><br>

        </div>

        <div><a moz-do-not-send="true"

href="https://docs.google.com/presentation/d/1ZPWKXN7dzXs9bX3Ref9fPDiia912zsHCHNMh_VSMhJs/edit#slide=id.p">https://docs.google.com/presentation/d/1ZPWKXN7dzXs9bX3Ref9fPDiia912zsHCHNMh_VSMhJs/edit#slide=id.p</a><br>

        </div>

        <div><br>

        </div>

        <div>I am also committed to using ZeroMQ, as it is
          lightweight, fast, and scalable.<br>

        </div>

        <div><br>

        </div>

        <div>I would like to chip in on further ZeroMQ
          development.</div>

        <div><br>

        </div>

        <div>Regards,</div>

        <div>Yatin<br>

          <div class="gmail_extra"><br>

            <div class="gmail_quote">On Wed, Nov 19, 2014 at 8:05 AM, Li

              Ma <span dir="ltr">&lt;<a moz-do-not-send="true"
                  href="mailto:skywalker.nick@gmail.com" target="_blank">skywalker.nick@gmail.com</a>&gt;</span>

              wrote:<br>

              <blockquote class="gmail_quote" style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

                <div bgcolor="#FFFFFF" text="#000000"><span class=""> <br>

                    <div>On 2014/11/19 1:49, Eric Windisch wrote:<br>

                    </div>

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div class="gmail_extra">

                          <div class="gmail_quote">

                            <div>

                              <blockquote class="gmail_quote"

                                style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I

                                think for this cycle we really do need

                                to focus on consolidating and<br>

                                testing the existing driver design and

                                fixing up the biggest<br>

                                deficiency (1) before we consider moving

                                forward with lots of new</blockquote>

                            </div>

                            <div><br>

                            </div>

                            <div>+1</div>

                            <div> </div>

                            <blockquote class="gmail_quote"

                              style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">1)

                              Outbound messaging connection re-use -
                              right now every outbound<br>
                              message creates and consumes a TCP
                              connection - this approach scales<br>
                              badly when Neutron does large fanout
                              casts.<br>

                            </blockquote>

                            <div><br>

                            </div>

                            <div><br>

                            </div>

                            <div>I'm glad you are looking at this and by

                              doing so, will understand the system

                              better. I hope the following will give

                              some insight into, at least, why I made

                              the decisions I made:</div>

                            <div> </div>

                            <div>This was an intentional design

                              trade-off. I saw three choices here: build

                              a fully decentralized solution, build a

                              fully-connected network, or use

                              centralized brokerage. I wrote off

                              centralized brokerage immediately. The

                              problem with a fully connected system is

                              that active TCP connections are required

                              between all of the nodes. I didn't think
                              that would scale, and I expected it to be
                              brittle against floods (intentional or otherwise).</div>

                            <div><br>

                            </div>

                            <div>IMHO, I always felt the right solution

                              for large fanout casts was to use

                              multicast. When the driver was written,

                              Neutron didn't exist and there was no

                              use-case for large fanout casts, so I

                              didn't implement multicast, but I knew it was
                              an option if it became necessary. It isn't

                              the right solution for everyone, of

                              course.</div>

                            <div><br>

                            </div>

                          </div>

                        </div>

                      </div>

                    </blockquote>

                  </span> Using multicast adds complexity to the
                  switch forwarding plane, which then has to enable and
                  maintain multicast group communication. For large
                  deployments, I prefer to keep forwarding
                  simple and easy to maintain. IMO, running a set of
                  fanout-router processes in the cluster can also
                  achieve the goal.<br>

                  The data path is: openstack-daemon --(send the
                  message with fanout=true)--&gt; fanout-router
                  --(read the matchmaker)--&gt; send to the
                  destinations<br>

                  Actually it just uses unicast to simulate multicast.<span

                    class=""><br>
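                    <div>As a rough illustration of that data path, here is a
                      minimal Python sketch. It is not the driver's code: the
                      MatchMaker class below is a hypothetical in-memory
                      stand-in for the real matchmaker, FanoutRouter is an
                      invented name, and send_unicast is an assumed callable
                      that stands in for the actual transport.</div>
<pre>
```python
class MatchMaker:
    """Hypothetical in-memory registry: topic -> list of subscriber hosts."""

    def __init__(self):
        self._hosts = {}

    def register(self, topic, host):
        hosts = self._hosts.setdefault(topic, [])
        if host not in hosts:
            hosts.append(host)

    def lookup(self, topic):
        # Return a copy so callers cannot mutate the registry.
        return list(self._hosts.get(topic, []))


class FanoutRouter:
    """Takes one fanout message and re-sends it as one unicast per host."""

    def __init__(self, matchmaker, send_unicast):
        self._matchmaker = matchmaker
        # send_unicast(host, topic, message) is an assumed transport hook.
        self._send_unicast = send_unicast

    def route(self, topic, message):
        hosts = self._matchmaker.lookup(topic)
        for host in hosts:
            self._send_unicast(host, topic, message)
        # Number of unicast sends performed for this fanout message.
        return len(hosts)


if __name__ == "__main__":
    mm = MatchMaker()
    for name in ("compute-1", "compute-2", "compute-3"):
        mm.register("example_topic", name)
    router = FanoutRouter(mm, lambda host, topic, msg: print(host, topic, msg))
    router.route("example_topic", {"method": "sync"})
```
</pre>
                    <div>The sending daemon performs a single send to the
                      router; only the router pays the per-destination cost,
                      which is the sense in which unicast simulates
                      multicast.</div>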

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div class="gmail_extra">

                          <div class="gmail_quote">

                            <div>For connection reuse, you could manage

                              a pool of connections and keep those

                              connections around for a configurable

                              amount of time, after which they'd expire

                              and be re-opened. This would keep the most

                              actively used connections alive. One

                              problem is that it would make the service

                              more brittle by making it far more

                              susceptible to running out of file

                              descriptors by keeping connections around

                              significantly longer. However, this

                              wouldn't be as brittle as fully-connecting

                              the nodes nor as poorly scalable.</div>

                            <div><br>

                            </div>

                          </div>

                        </div>

                      </div>

                    </blockquote>

                  </span> +1. Allowing a large number of fds is not a
                  problem. Because we use a socket pool, we can control
                  the number of fds and keep it fixed.<span class=""><br>
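                    <div>To make the pooling idea concrete, here is a minimal
                      Python sketch that combines idle expiry after a
                      configurable time with a hard cap on open connections.
                      It is only an illustration under assumptions: the
                      ExpiringPool name, the connect callable, and the ttl,
                      max_size, and clock parameters are all hypothetical and
                      not part of the existing driver.</div>
<pre>
```python
import time


class ExpiringPool:
    """Keeps at most max_size connections; idle ones expire after ttl seconds.

    Assumes max_size >= 1 and that connect(address) returns an object
    with a close() method (both are assumptions of this sketch).
    """

    def __init__(self, connect, ttl=30.0, max_size=64, clock=time.monotonic):
        self._connect = connect
        self._ttl = ttl
        self._max_size = max_size
        self._clock = clock
        self._entries = {}  # address -> (connection, last_used)

    def get(self, address):
        now = self._clock()
        self._expire(now)
        entry = self._entries.pop(address, None)
        if entry is None:
            # Make room first so the number of open fds stays bounded.
            if len(self._entries) >= self._max_size:
                self._evict_oldest()
            conn = self._connect(address)
        else:
            conn = entry[0]
        # Refresh the timestamp so actively used connections stay alive.
        self._entries[address] = (conn, now)
        return conn

    def _expire(self, now):
        stale = [addr for addr, (_, used) in self._entries.items()
                 if now - used >= self._ttl]
        for addr in stale:
            self._entries.pop(addr)[0].close()

    def _evict_oldest(self):
        oldest = min(self._entries, key=lambda addr: self._entries[addr][1])
        self._entries.pop(oldest)[0].close()

    def __len__(self):
        return len(self._entries)
```
</pre>
                    <div>Frequently used destinations keep refreshing their
                      timestamp and so stay open, while the cap bounds the
                      number of fds regardless of how many destinations are
                      contacted.</div>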

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div class="gmail_extra">

                          <div class="gmail_quote">

                            <div>If OpenStack and oslo.messaging were

                              designed specifically around this message

                              pattern, I might suggest that the library

                              and its applications be aware of

                              high-traffic topics and persist the

                              connections for those topics, while

                              keeping others ephemeral. A good example
                              for Nova: api-&gt;scheduler
                              traffic would be persistent, whereas
                              scheduler-&gt;compute_node traffic would be
                              ephemeral.  Perhaps this is something that

                              could still be added to the library.</div>

                            <div><br>

                            </div>

                            <blockquote class="gmail_quote"

                              style="margin:0px 0px 0px

0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">2)

                              PUSH/PULL TCP sockets - Pieter suggested

                              we look at ROUTER/DEALER<br>

                              as an option once 1) is resolved - this

                              socket type pairing has some<br>

                              interesting features which would help with

                              resilience and availability<br>

                              including heartbeating. </blockquote>

                            <div><br>

                            </div>

                            <div>Using PUSH/PULL does not eliminate the

                              possibility of being fully connected, nor

                              is it incompatible with persistent

                              connections. If you're not going to be

                              fully-connected, there isn't much

                              advantage to long-lived persistent

                              connections and without those persistent

                              connections, you're not benefitting from

                              features such as heartbeating.</div>

                            <div><br>

                            </div>

                          </div>

                        </div>

                      </div>

                    </blockquote>

                  </span> How about REQ/REP? I think it is appropriate
                  for long-lived persistent connections and also provides
                  reliability, since every request gets a reply.<span class=""><br>

                    <blockquote type="cite">

                      <div dir="ltr">

                        <div class="gmail_extra">

                          <div class="gmail_quote">

                            <div>I'm not saying ROUTER/DEALER cannot be

                              used, but use them with care. They're

                              designed for long-lived channels between

                              hosts and not for the ephemeral-type

                              connections used in a peer-to-peer system.

                              Dealing with how to manage timeouts on the

                              client and the server and the swelling

                              number of active file descriptors that

                              you'll get by using ROUTER/DEALER is not

                              trivial, assuming you can get past the

                              management of all of those synchronous

                              sockets (hidden away by tons of eventlet

                              greenthreads)...</div>

                            <div><br>

                            </div>

                            <div>Extra anecdote: During a conversation

                              at the OpenStack summit, someone told me

                              about their experiences using ZeroMQ and

                              the pain of using REQ/REP sockets and how
                              they felt it had been a mistake to use them.
                              We talked a bit about some other
                              problems, such as the fact that it's impossible

                              to avoid TCP fragmentation unless you

                              force all frames to 552 bytes or have a

                              well-managed network where you know the

                              MTUs of all the devices you'll pass

                              through. Suggestions were made to make

                              ZeroMQ better, until we realized we had

                              just described TCP-over-ZeroMQ-over-TCP,

                              finished our beers, and quickly changed

                              topics.<br>

                            </div>

                          </div>

                        </div>

                      </div>

                    </blockquote>

                  </span> Well, it seems I need to take my last question
                  back. In our deployment, I always take advantage of
                  jumbo frames to increase throughput. You said that
                  REQ/REP would introduce TCP fragmentation unless
                  ZeroMQ frames are 552 bytes? Could you please
                  elaborate?<span class=""><br>

                    <blockquote type="cite"> <br>

                      <fieldset></fieldset>

                      <br>

                      <pre>_______________________________________________

OpenStack-dev mailing list

<a moz-do-not-send="true" href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a>

<a moz-do-not-send="true" href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a>

</pre>

                    </blockquote>

                    <br>

                  </span></div>

                <br>


                <br>

              </blockquote>

            </div>

            <br>

          </div>

        </div>

      </div>

      <br>


    </blockquote>

    <br>

  </body>

</html>