<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif;">
<div>
<div>
<div>BTW the tracker link is <a href="http://tracker.ceph.com/issues/6142" class="url" style="border-bottom-style: dotted; border-bottom-width: 1px; border-bottom-color: rgb(140, 179, 255); color: rgb(140, 179, 255); text-decoration: none; font-family: LucidaGrande; font-size: 12px; line-height: 16px;">http://tracker.ceph.com/issues/6142</a></div>
<div><br>
</div>
<div>This is an interesting issue, and I'm definitely curious.</div>
<div><br>
</div>
<div>May I ask whether this also happened to you during recovery, as described in the tracker issue?</div>
<div>Also, if you divide the number of placement groups by the number of OSDs, what number do you get?</div>
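<div><br>
</div>
<div>For concreteness, here is a minimal sketch of that division in Python. The counts are made-up placeholders, not figures from this thread, and the replica-weighted variant assumes a single pool size applies to every pool:</div>
<pre>
```python
def pgs_per_osd(total_pgs, num_osds, replicas=1):
    """Average number of placement groups (or PG copies) per OSD.

    replicas=1 gives the plain division asked about above; a larger
    value counts every replica (assumption: one pool size for all pools).
    """
    if num_osds == 0:
        raise ValueError("num_osds must be non-zero")
    return total_pgs * replicas / num_osds


if __name__ == "__main__":
    # Hypothetical cluster: 4096 PGs spread over 64 OSDs.
    print(pgs_per_osd(4096, 64))
    print(pgs_per_osd(4096, 64, replicas=3))
```
</pre>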
<div><br>
</div>
</div>
<div>If this happens mostly during recovery, I'm curious whether the number of placement groups (apart from the thread config) plays a role in the number of threads required for healing and replication.</div>
<div><br>
</div>
<div>Thanks.</div>
<div>
<div>-- </div>
<div>David Moreau Simard</div>
</div>
</div>
<div><br>
</div>
<span id="OLK_SRC_BODY_SECTION">
<div style="font-family:Calibri; font-size:11pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt">
<span style="font-weight:bold">From: </span>"Fischer, Matt" &lt;<a href="mailto:matthew.fischer@twcable.com">matthew.fischer@twcable.com</a>&gt;<br>
<span style="font-weight:bold">Date: </span>Thu, 28 Aug 2014 16:51:18 -0400<br>
<span style="font-weight:bold">To: </span>Warren Wang &lt;<a href="mailto:warren@wangspeed.com">warren@wangspeed.com</a>&gt;, "<a href="mailto:openstack-operators@lists.openstack.org">openstack-operators@lists.openstack.org</a>" &lt;<a href="mailto:openstack-operators@lists.openstack.org">openstack-operators@lists.openstack.org</a>&gt;<br>
<span style="font-weight:bold">Subject: </span>Re: [Openstack-operators] Ceph crashes with larger clusters and denser hardware<br>
</div>
<div><br>
</div>
<div>
<div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif; ">
<div>What version of ceph was this seen on?</div>
<div><br>
</div>
<span id="OLK_SRC_BODY_SECTION">
<div style="font-family:Calibri; font-size:11pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt">
<span style="font-weight:bold">From: </span>Warren Wang &lt;<a href="mailto:warren@wangspeed.com">warren@wangspeed.com</a>&gt;<br>
<span style="font-weight:bold">Date: </span>Thursday, August 28, 2014 10:38 AM<br>
<span style="font-weight:bold">To: </span>"<a href="mailto:openstack-operators@lists.openstack.org">openstack-operators@lists.openstack.org</a>" &lt;<a href="mailto:openstack-operators@lists.openstack.org">openstack-operators@lists.openstack.org</a>&gt;<br>
<span style="font-weight:bold">Subject: </span>[Openstack-operators] Ceph crashes with larger clusters and denser hardware<br>
</div>
<div><br>
</div>
<div dir="ltr">One of my colleagues here at Comcast just returned from the Operators Summit and mentioned that multiple folks experienced Ceph instability with larger clusters. I wanted to send out a note and save some folks a headache.
<br clear="all">
<div>
<div><br>
</div>
<div><span style="font-family: arial, helvetica, sans-serif;">If you up the number of threads per OSD, there are situations where many threads can be spawned quickly. You must also up the max number of PIDs available to the OS, otherwise you essentially get fork
bombed. Every single Ceph process will crash, and you might see a message in your shell about "Cannot allocate memory".<br>
</span></div>
<div><span style="font-family: arial, helvetica, sans-serif;"><code><br>
<font>In your sysctl.conf:</font><br>
<br>
# For Ceph<br>
kernel.pid_max=4194303<br>
<br>
</code></span></div>
<span style="font-family: arial, helvetica, sans-serif;">Then run "sysctl -p". In 5 days on a lab Ceph box, we have mowed through nearly 2 million PIDs. There's a tracker issue open to add this to the
<a href="http://ceph.com">ceph.com</a> docs.<br>
</span></div>
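<div><span style="font-family: arial, helvetica, sans-serif;">One way to double-check the setting is to compare what the config file asks for with what the kernel currently reports. The Python sketch below is an illustration only, not something from the tracker: parse_pid_max and read_live_pid_max are hypothetical helper names, and the /proc/sys/kernel/pid_max path assumes a Linux host.<br>
</span></div>
<pre>
```python
import os

PROC_PID_MAX = "/proc/sys/kernel/pid_max"  # Linux-only path (assumption)


def parse_pid_max(conf_text):
    """Return the last kernel.pid_max value in sysctl.conf-style text, or None."""
    value = None
    for raw_line in conf_text.splitlines():
        line = raw_line.strip()
        # Skip blanks and comments ("#" or ";" start a comment in sysctl.conf).
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "kernel.pid_max":
            value = int(val.strip())
    return value


def read_live_pid_max(path=PROC_PID_MAX):
    """Return the kernel's current pid_max, or None if the file is absent."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    # Same two lines as the sysctl.conf snippet above.
    wanted = parse_pid_max("# For Ceph\nkernel.pid_max=4194303\n")
    live = read_live_pid_max()
    print("configured:", wanted, "live:", live)
    if live is not None and live != wanted:
        print("live value differs; has 'sysctl -p' been run?")
```
</pre>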
<div>
<div><span style="font-family: arial, helvetica, sans-serif;"><br>
Warren<br>
</span></div>
<div><span style="font-family: arial, helvetica, sans-serif;">@comcastwarren<br>
</span></div>
</div>
</div>
</span><br>
</div>
</div>
_______________________________________________ OpenStack-operators mailing list <a href="mailto:OpenStack-operators@lists.openstack.org">
OpenStack-operators@lists.openstack.org</a> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators">
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a> </span>
</body>
</html>