<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40"><head><meta http-equiv=Content-Type content="text/html; charset=utf-8"><meta name=Generator content="Microsoft Word 14 (filtered medium)"><style><!--
/* Font Definitions */
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:Wingdings;
panose-1:5 0 0 0 0 0 0 0 0 0;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
span.StylE-mailovZprvy17
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 70.85pt 70.85pt 70.85pt;}
div.WordSection1
{page:WordSection1;}
/* List Definitions */
@list l0
{mso-list-id:2046175712;
mso-list-template-ids:-2141792966;}
@list l0:level1
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:36.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Symbol;}
@list l0:level2
{mso-level-number-format:bullet;
mso-level-text:o;
mso-level-tab-stop:72.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:"Courier New";
mso-bidi-font-family:"Times New Roman";}
@list l0:level3
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:108.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level4
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:144.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level5
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:180.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level6
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:216.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level7
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:252.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level8
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:288.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
@list l0:level9
{mso-level-number-format:bullet;
mso-level-text:;
mso-level-tab-stop:324.0pt;
mso-level-number-position:left;
text-indent:-18.0pt;
mso-ansi-font-size:10.0pt;
font-family:Wingdings;}
ol
{margin-bottom:0cm;}
ul
{margin-bottom:0cm;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]--></head><body lang=CS link=blue vlink=purple><div class=WordSection1><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>We have upgraded to RabbitMQ 3.6, and it resulted in one node crashing about every week on out of memory errors. To avoid this, we had to turn off the message rate collection. So no throughput graphs until it gets fixed. Avoid this version if you can.<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'>Tomas<o:p></o:p></span></p><p class=MsoNormal><span style='font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D'><o:p> </o:p></span></p><div><div style='border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0cm 0cm 0cm'><p class=MsoNormal><b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'>From:</span></b><span style='font-size:10.0pt;font-family:"Tahoma","sans-serif"'> Sam Morrison [mailto:sorrison@gmail.com] <br><b>Sent:</b> Monday, January 09, 2017 3:55 AM<br><b>To:</b> Matt Fischer<br><b>Cc:</b> OpenStack Operators<br><b>Subject:</b> Re: [Openstack-operators] RabbitMQ 3.6.x experience?<o:p></o:p></span></p></div></div><p class=MsoNormal><o:p> </o:p></p><p class=MsoNormal>We’ve been running 3.6.5 for sometime now and it’s working well.<o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>3.6.1 - 3.6.3 are unusable, we had lots of issues with stats DB and other weirdness. <o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Our setup is a 3 physical node cluster with around 9k connections, average around the 300 messages/sec delivery. We have the stats sample rate set to default and it is working fine.<o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Yes we did have to restart the cluster to upgrade.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Cheers,<o:p></o:p></p></div><div><p class=MsoNormal>Sam<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p><div><blockquote style='margin-top:5.0pt;margin-bottom:5.0pt'><div><p class=MsoNormal>On 6 Jan 2017, at 5:26 am, Matt Fischer <<a href="mailto:matt@mattfischer.com">matt@mattfischer.com</a>> wrote:<o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p><div><div><p class=MsoNormal>MIke,<o:p></o:p></p><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>I did a bunch of research and experiments on this last fall. We are running Rabbit 3.5.6 on our main cluster and 3.6.5 on our Trove cluster which has significantly less load (and criticality). We were going to upgrade to 3.6.5 everywhere but in the end decided not to, mainly because there was little perceived benefit at the time. Our main issue is unchecked memory growth at random times. I ended up making several config changes to the stats collector and then we also restart it after every deploy and that solved it (so far). <o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>I'd say these were my main reasons for not going to 3.6 for our control nodes:<o:p></o:p></p></div><div><ul type=disc><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'>In 3.6.x they re-wrote the stats processor to make it parallel. In every 3.6 release since then, Pivotal has fixed bugs in this code. Then finally they threw up their hands and said "we're going to make a complete rewrite in 3.7/4.x" (you need to look through issues on Github to find this discussion)<o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'>Out of the box with the same configs 3.6.5 used more memory than 3.5.6, since this was our main issue, I consider this a negative.<o:p></o:p></li><li class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto;mso-list:l0 level1 lfo1'>Another issue is the ancient version of erlang we have with Ubuntu Trusty (which we are working on) which made upgrades more complex/impossible depending on the version.<o:p></o:p></li></ul><div><p class=MsoNormal>Given those negatives, the main one being that I didn't think there would be too many more fixes to the parallel statsdb collector in 3.6, we decided to stick with 3.5.6. In the end the devil we know is better than the devil we don't and I had no evidence that 3.6.5 would be an improvement.<o:p></o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>I did decide to leave Trove on 3.6.5 because this would give us some bake-in time if 3.5.x became untenable we'd at least have had it up and running in production and some data on it.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>If statsdb is not a concern for you, I think this changes the math and maybe you should use 3.6.x. I would however recommend at least going to 3.5.6, it's been better than 3.3/3.4 was.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>No matter what you do definitely read all the release notes. There are some upgrades which require an entire cluster shutdown. The upgrade to 3.5.6 did not require this IIRC.<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Here's the hiera for our rabbit settings which I assume you can translate:<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>rabbitmq::cluster_partition_handling: 'autoheal'<o:p></o:p></p></div><div><div><p class=MsoNormal>rabbitmq::config_variables:<o:p></o:p></p></div><div><p class=MsoNormal> 'vm_memory_high_watermark': '0.6'<o:p></o:p></p></div><div><p class=MsoNormal> 'collect_statistics_interval': 30000<o:p></o:p></p></div><div><p class=MsoNormal>rabbitmq::config_management_variables:<o:p></o:p></p></div><div><p class=MsoNormal> 'rates_mode': 'none'<o:p></o:p></p></div><div><p class=MsoNormal>rabbitmq::file_limit: '65535'<o:p></o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p></div><div><p class=MsoNormal>Finally, if you do upgrade to 3.6.x please report back here with your results at scale!<o:p></o:p></p></div><div><p class=MsoNormal><o:p> </o:p></p></div></div><div><p class=MsoNormal><o:p> </o:p></p><div><p class=MsoNormal>On Thu, Jan 5, 2017 at 8:49 AM, Mike Dorman <<a href="mailto:mdorman@godaddy.com" target="_blank">mdorman@godaddy.com</a>> wrote:<o:p></o:p></p><div><div><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt'>We are looking at upgrading to the latest RabbitMQ in an effort to ease some cluster failover issues we’ve been seeing. (Currently on 3.4.0)</span><span lang=EN-US><o:p></o:p></span></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt'> </span><span lang=EN-US><o:p></o:p></span></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt'>Anyone been running 3.6.x? And what has been your experience? Any gottchas to watch out for?</span><span lang=EN-US><o:p></o:p></span></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt'> </span><span lang=EN-US><o:p></o:p></span></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt'>Thanks,</span><span lang=EN-US><o:p></o:p></span></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt'>Mike</span><span lang=EN-US><o:p></o:p></span></p><p class=MsoNormal style='mso-margin-top-alt:auto;mso-margin-bottom-alt:auto'><span lang=EN-US style='font-size:11.0pt'> </span><span lang=EN-US><o:p></o:p></span></p></div></div><p class=MsoNormal style='margin-bottom:12.0pt'><br>_______________________________________________<br>OpenStack-operators mailing list<br><a href="mailto:OpenStack-operators@lists.openstack.org">OpenStack-operators@lists.openstack.org</a><br><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><o:p></o:p></p></div><p class=MsoNormal><o:p> </o:p></p></div><p class=MsoNormal>_______________________________________________<br>OpenStack-operators mailing list<br><a href="mailto:OpenStack-operators@lists.openstack.org">OpenStack-operators@lists.openstack.org</a><br><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><o:p></o:p></p></div></blockquote></div><p class=MsoNormal><o:p> </o:p></p></div></div></div></body></html>