[Openstack-operators] Ceph crashes with larger clusters and denser hardware
warren at wangspeed.com
Thu Aug 28 21:20:29 UTC 2014
For me, it happened on 0.80.5, but it is relevant to any version.
On Thu, Aug 28, 2014 at 4:51 PM, Fischer, Matt <matthew.fischer at twcable.com>
> What version of ceph was this seen on?
> From: Warren Wang <warren at wangspeed.com>
> Date: Thursday, August 28, 2014 10:38 AM
> To: "openstack-operators at lists.openstack.org" <
> openstack-operators at lists.openstack.org>
> Subject: [Openstack-operators] Ceph crashes with larger clusters and
> denser hardware
> One of my colleagues here at Comcast just returned from the Operators
> Summit and mentioned that multiple folks experienced Ceph instability with
> larger clusters. I wanted to send out a note and save some of you the headache.
> If you raise the number of threads per OSD, there are situations where many
> threads can be spawned very quickly. You must also raise the maximum number
> of PIDs available to the OS, otherwise you essentially fork-bomb yourself.
> Every single Ceph process will crash, and you may see a message in your
> shell about "Cannot allocate memory".
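To see how close a box is to that ceiling, you can compare the kernel's PID limit against the live thread count. This is a sketch, not from the original post; the `ceph-osd` process name and the `/proc` path are standard Linux conventions:

```shell
# Kernel-wide ceiling on PIDs. Threads count against this limit
# too, which is why OSD thread storms can exhaust it.
cat /proc/sys/kernel/pid_max

# Threads currently in use across all ceph-osd processes.
# ps -eL prints one line per thread; the [c] bracket trick keeps
# grep from matching its own command line.
ps -eLf | grep -c '[c]eph-osd'
```

If the second number is anywhere near the first, new threads (and new processes of any kind) will start failing with ENOMEM, which is the "Cannot allocate memory" symptom described above.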
> In your sysctl.conf:
> # For Ceph
> Then run "sysctl -p". In 5 days on a lab Ceph box, we mowed through
> nearly 2 million PIDs. There's a tracker open to add this to the
> ceph.com docs.
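The setting itself was lost when the archive scrubbed the message, leaving only the "# For Ceph" comment. A minimal sketch of what that stanza presumably contained, assuming the standard `kernel.pid_max` tunable; the value shown is an assumption, not from the original post:

```shell
# /etc/sysctl.conf
# For Ceph: raise the PID ceiling so OSD thread storms cannot
# exhaust the PID space. 4194304 (2^22) is the maximum the
# 64-bit kernel accepts; the exact value here is an assumption.
kernel.pid_max = 4194304
```

As the post notes, "sysctl -p" applies the change without a reboot.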