<div dir="ltr">Tim,<div style>thanks for your vision. Reducing the load average on Rabbit is a good reason to use cells. Technically, RPC is a kind of bottleneck with a large number of hypervisors, but I think dividing a cluster into small pieces just increases the number of bottlenecks. Maybe it's better to improve the RPC mechanism (in terms of scalability and performance) instead?</div>
</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Oct 3, 2013 at 11:49 PM, Mike Wilson <span dir="ltr"><<a href="mailto:geekinutah@gmail.com" target="_blank">geekinutah@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Tim,<div><br></div><div>We currently run a bit more than 20k hypervisors on a single cell. We had three major problems with getting this large: RPC, DB and scheduler. RPC is solvable by getting away from the hub-and-spoke topology that brokered messaging forces you into, i.e., use 0mq. DB was overcome by a combination of tuning MySQL for OpenStack workloads and shoveling off appropriate reads to MySQL replicas, see <a href="https://blueprints.launchpad.net/nova/+spec/db-slave-handle" target="_blank">https://blueprints.launchpad.net/nova/+spec/db-slave-handle</a> and <a href="https://blueprints.launchpad.net/nova/+spec/periodic-tasks-to-db-slave" target="_blank">https://blueprints.launchpad.net/nova/+spec/periodic-tasks-to-db-slave</a> for some examples of how this works. Scheduler is a problem that we didn't really solve within OpenStack. Fortunately, the way we use OpenStack internally makes our scheduling decisions very simple, so we basically skipped that problem by implementing a simplified scheduler that runs in O(1). The filter scheduler is the most limiting factor in my opinion. This is what really keeps folks from going larger than around 1k nodes. Sans filter_scheduler I think the realistic upper limit is somewhere around 3k.</div>

<div><br></div><div>Now that I've said all this, cells does handle these three problems very nicely by partitioning them all off and coordinating the API. However, there are some missing features that I think are not trivial to implement. I'm also not a fan of how cells decentralizes data and messaging, but I digress. I feel like much more development needs to be done on it, and I'm not sure I really like the structure and requirements. I guess my view of cells is that it's a good way to partition clouds into a hierarchy and divide failure domains. I just don't think it's the end-all in matters of scale in its current state. I'm hopeful that we can flesh this out a bunch more in Icehouse.</div>
<span class="HOEnZb"><font color="#888888">
<div><br></div><div>-Mike</div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Oct 3, 2013 at 12:41 PM, Tim Bell <span dir="ltr"><<a href="mailto:Tim.Bell@cern.ch" target="_blank">Tim.Bell@cern.ch</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">





<div lang="EN-GB" link="#0563C1" vlink="#954F72">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">We’ve got several OpenStack clouds at CERN (details in
<a href="http://openstack-in-production.blogspot.fr/2013/09/a-tale-of-3-openstack-clouds-50000.html" target="_blank">
http://openstack-in-production.blogspot.fr/2013/09/a-tale-of-3-openstack-clouds-50000.html</a>).<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">The CMS farm was the furthest ahead and encountered problems with the number of database connections at around 1300 hypervisors. Nova
 conductor helps with some of these.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Given that we’re heading towards 15K hypervisors for the central instance at CERN, I am not sure a single cell would handle it.<u></u><u></u></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I’d be happy to hear experiences from others in this area.<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Belmiro will be giving a summit talk on the deep dive including our experiences for those who are able to make it.<u></u><u></u></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Tim<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif""> Joshua Harlow [mailto:<a href="mailto:harlowja@yahoo-inc.com" target="_blank">harlowja@yahoo-inc.com</a>]
<br>
<b>Sent:</b> 03 October 2013 20:32<br>
<b>To:</b> Subbu Allamaraju; Tim Bell<br>
<b>Cc:</b> <a href="mailto:openstack@lists.openstack.org" target="_blank">openstack@lists.openstack.org</a></span></p><div><div><br>
<b>Subject:</b> Re: [Openstack] Cells use cases<u></u><u></u></div></div><p></p>
</div>
</div><div><div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">Hi Tim,<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">I'd also like to know what happens above 1000 hypervisors that u think needs cells?<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">From experience at y! we actually start to see the nova-scheduler (and the filter scheduler mainly) be the problem (at around ~400 hypervisors) and that seems
 addressable without cells (yes it requires some smart/fast coding that the current scheduler is not designed for, but that seems manageable and achievable) via reviews like
<a href="https://review.openstack.org/#/c/46588" target="_blank">https://review.openstack.org/#/c/46588</a>,
<a href="https://review.openstack.org/#/c/45867" target="_blank">https://review.openstack.org/#/c/45867</a> (and others that are popping up). The filter scheduler appears to scale linearly with the number of hypervisors, and this is problematic since the filter-scheduler is
 also single-CPU bound (due to eventlet) so that overall, makes for some nice suckage. We haven't seen the RPC layer be a problem at our current scale, but maybe u guys have hit this. The other issue that starts to happen around ~400 is the nova service group
 code, which is not exactly performant when using the DB backend (we haven't tried the ZK backend yet, WIP!) due to frequent and repeated DB calls. <u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">It'd be interesting to hear the kind of limitations u guys hit that cells resolved, instead of just fixing the underlying code itself to scale better.<u></u><u></u></span></p>


</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">-Josh<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
<div style="border:none;border-top:solid #b5c4df 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">From:
</span></b><span style="font-size:11.0pt;font-family:"Calibri","sans-serif"">Subbu Allamaraju <<a href="mailto:subbu@subbu.org" target="_blank">subbu@subbu.org</a>><br>
<b>Date: </b>Thursday, October 3, 2013 10:23 AM<br>
<b>To: </b>Tim Bell <<a href="mailto:Tim.Bell@cern.ch" target="_blank">Tim.Bell@cern.ch</a>><br>
<b>Cc: </b>"<a href="mailto:openstack@lists.openstack.org" target="_blank">openstack@lists.openstack.org</a>" <<a href="mailto:openstack@lists.openstack.org" target="_blank">openstack@lists.openstack.org</a>><br>


<b>Subject: </b>Re: [Openstack] Cells use cases<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
<div>
<div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">Hi Tim,<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">Can you comment on scalability more? Are you referring to just the RPC layer in the control plane?<u></u><u></u></span></p>


<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">Subbu<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><u></u> <u></u></span></p>
</div>
</div>
<div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif""><br>
On Oct 3, 2013, at 8:53 AM, Tim Bell <<a href="mailto:Tim.Bell@cern.ch" target="_blank">Tim.Bell@cern.ch</a>> wrote:<u></u><u></u></span></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><span><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">At CERN, we’re running cells for scalability. When you go over 1000 hypervisors or so, the general recommendation is to be in a cells configuration.</span><span><u></u><u></u></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><span><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Cells are quite complex and the full functionality is not there yet so some parts will need to wait for Havana.</span><span><u></u><u></u></span></p>


<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><span><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Tim</span><span><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><span><u></u><u></u></span></p>
<div style="border:none;border-left:solid blue 1.5pt;padding:0cm 0cm 0cm 4.0pt">
<div>
<div style="border:none;border-top:solid #e1e1e1 1.0pt;padding:3.0pt 0cm 0cm 0cm">
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif""> Dmitry Ukov [<a href="mailto:dukov@mirantis.com" target="_blank">mailto:dukov@mirantis.com</a>]
<br>
<b>Sent:</b> 03 October 2013 16:38<br>
<b>To:</b> <a href="mailto:openstack@lists.openstack.org" target="_blank">openstack@lists.openstack.org</a><br>
<b>Subject:</b> [Openstack] Cells use cases</span><span><u></u><u></u></span></p>
</div>
</div>
<p class="MsoNormal"><span> <u></u><u></u></span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Hello all,</span><span><u></u><u></u></span></p>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">I'm really interested in cells, but unfortunately I can't find any useful use cases for them.</span><span><u></u><u></u></span></p>


</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">For instance, I have 4 DCs and I need a single entry point for them. In this case cells are a rather complicated solution. It's better to use multiple regions in Keystone
 instead.</span><span><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><span><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">The only good reason for cells that I've found is to organize so-called failure domains, i.e. scheduling onto other DCs in case of failures.</span><span><u></u><u></u></span></p>


</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif""> </span><span><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Does anyone have different use cases for cells, or a different view on how to use them?</span><span><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Thanks in advance.</span><span><u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span> <u></u><u></u></span></p>
</div>
<p class="MsoNormal"><span>-- <br>
Kind regards<u></u><u></u></span></p>
<div>
<p class="MsoNormal"><span>Dmitry Ukov<u></u><u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span>IT Engineer<u></u><u></u></span></p>
</div>
<div>
<div>
<p class="MsoNormal"><span>Mirantis, Inc.<u></u><u></u></span></p>
</div>
</div>
<div>
<p class="MsoNormal"><span> <u></u><u></u></span></p>
</div>
</div>
</div>
</div>
</blockquote>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"><span style="font-size:10.5pt;font-family:"Calibri","sans-serif"">_______________________________________________<br>
Mailing list: <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" target="_blank">
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><br>
Post to     : <a href="mailto:openstack@lists.openstack.org" target="_blank">openstack@lists.openstack.org</a><br>
Unsubscribe : <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" target="_blank">
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><u></u><u></u></span></p>
</div>
</blockquote>
</div>
</div>
</div></div></div>
</div>
</div>

<br></blockquote></div><br></div>
</div></div><br>
<br></blockquote></div><br><br clear="all"><div><br></div>-- <br>Kind regards<div>Dmitry Ukov</div><div>IT Engineer</div><div><div>Mirantis, Inc.</div></div><div><br></div>
</div>