[Openstack-operators] Shared storage for live-migration with NFS, Ceph and Lustre

Miguel A Diaz Corchero miguelangel.diaz at externos.ciemat.es
Thu Jun 25 07:16:49 UTC 2015


Great, Andrew, thanks for the helpful clarification. The copy-on-write 
cloning is the crucial point for the "instant" boot of instances. +1 for 
raw with RBD.

Miguel.

On 24/06/15 21:19, Andrew Woodward wrote:
> Miguel,
>
> For RBD in OpenStack, you will want to use raw images (even convert 
> them while loading them into Glance). The reason is that raw enables 
> the copy-on-write functionality in RBD, which allows the image to be 
> 'cloned' quickly. Not using raw only affects the storage size in 
> Glance, since the compute node will just download the image and 
> rewrite it as raw into RBD before starting the instance anyway. With a 
> raw Glance image (and the compute node having proper access to the 
> Glance image pool in Ceph), Nova will simply instruct Ceph to clone 
> the Glance image, and the instance will complete disk provisioning 
> almost instantly.
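> 
> In practice that means uploading the image to Glance as raw and letting 
> Nova see the image's RBD location. A rough sketch of the pieces involved 
> (pool names, user names and the image file below are just typical 
> examples, adjust them to your deployment):
> 
>     # convert and upload the image as raw
>     qemu-img convert -f qcow2 -O raw trusty.qcow2 trusty.raw
>     glance image-create --name trusty-raw --disk-format raw \
>         --container-format bare --file trusty.raw
> 
>     # glance-api.conf: expose the RBD location so Nova can clone it
>     [DEFAULT]
>     show_image_direct_url = True
> 
>     # nova.conf on the computes: ephemeral disks go straight into RBD
>     [libvirt]
>     images_type = rbd
>     images_rbd_pool = vms
>     images_rbd_ceph_conf = /etc/ceph/ceph.conf
>     rbd_user = cinder
>     rbd_secret_uuid = <your libvirt secret uuid>
> 
>     # afterwards the instance disk should show up as a COW clone:
>     rbd children images/<glance-image-id>@snap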
>
>
> On Tue, Jun 23, 2015 at 12:29 AM Miguel A Diaz Corchero 
> <miguelangel.diaz at externos.ciemat.es> wrote:
>
>     Hi Dmitry
>
>     After reading about Ceph RBD, my impressions were extremely good, 
>     even better than CephFS for ephemeral storage. Are you using qcow2 
>     or raw images? I prefer qcow2, but in that case we cannot enable 
>     the write cache in the cluster, which reduces performance a bit. I 
>     should test the Ceph RBD performance of both (qcow2 and raw) 
>     before migrating to production.
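>
>     For reference, the write cache I am referring to is the client-side 
>     RBD cache; as far as I understand it is enabled per client, roughly 
>     like this (only a sketch, not a tested configuration):
>
>         # ceph.conf on the compute nodes
>         [client]
>         rbd cache = true
>         rbd cache writethrough until flush = true
>
>         # nova.conf on the compute nodes, [libvirt] section
>         disk_cachemodes = "network=writeback"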
>
>     Thanks for sharing your experience.
>     Miguel.
>
>
>     On 20/06/15 22:49, Dmitry Borodaenko wrote:
>>     With Ceph, you'll want to use RBD instead of CephFS. We have had 
>>     OpenStack live migration working with Ceph RBD for about a year 
>>     and a half now; here's a PDF slide deck with some details: 
>>     https://drive.google.com/open?id=0BxYswyvIiAEZUEp4aWJPYVNjeU0
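>>
>>     The Nova side of it is small: with the ephemeral disks already in 
>>     RBD there is no disk data to copy between hosts, so live migration 
>>     is mostly a matter of the usual libvirt settings. A rough sketch, 
>>     the exact flag set varies by release:
>>
>>         # nova.conf on the compute nodes (values are indicative)
>>         [DEFAULT]
>>         vncserver_listen = 0.0.0.0
>>
>>         [libvirt]
>>         live_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE,VIR_MIGRATE_PERSIST_DEST
>>
>>         # migrating an instance is then simply:
>>         nova live-migration <instance-uuid> <target-compute-host>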
>>
>>     If you take CephFS, and the bottlenecks associated with its POSIX 
>>     metadata, out of the way (you don't need POSIX metadata to manage 
>>     your boot volumes, which are just block devices), the need to 
>>     partition your storage cluster disappears: a single Ceph cluster 
>>     can serve all 40 nodes.
>>
>>     It may be tempting to combine compute and storage on the same 
>>     nodes, but there's a gotcha associated with that. Ceph OSD 
>>     processes can be fairly CPU-heavy at high IOPS loads or when 
>>     rebalancing data after a disk dies or a node goes offline, so 
>>     you'd have to figure out a way to isolate their CPU usage from 
>>     that of your workloads. Which is why, for example, Fuel allows 
>>     you to combine the ceph-osd and compute roles on the same node, 
>>     but the Fuel documentation discourages you from doing so.
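>>
>>     If you do converge them anyway, one crude way to keep them out of 
>>     each other's way is to reserve a few cores for the OSDs, roughly 
>>     like this (core numbers are made up, size them for your hardware):
>>
>>         # nova.conf: keep guest vCPUs off cores 0-3
>>         vcpu_pin_set = 4-23
>>
>>         # pin the running OSD daemons to the reserved cores 0-3
>>         for pid in $(pgrep ceph-osd); do taskset -acp 0-3 "$pid"; done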
>>
>>
>>     On Wed, Jun 17, 2015 at 2:11 AM Miguel A Diaz Corchero 
>>     <miguelangel.diaz at externos.ciemat.es> wrote:
>>
>>         Hi friends,
>>
>>         I'm evaluating different distributed file systems to grow our 
>>         infrastructure from 10 nodes to approximately 40 nodes. One 
>>         of the bottlenecks is the shared storage installed to enable 
>>         live migration. The selected candidates are NFS, Ceph and 
>>         Lustre (the latter is already installed for HPC purposes).
>>
>>         Sketching a brief plan, and leaving network connectivity 
>>         aside (rough command sketches for both options follow below):
>>
>>         *a)* with NFS and Ceph, I think it is possible, but only by 
>>         dividing the whole infrastructure (40 nodes) into smaller 
>>         clusters, for instance 10 nodes with one storage server each. 
>>         Obviously, live migration is then only possible between nodes 
>>         in the same cluster (or zone).
>>
>>         *b)* with Lustre, my idea is to connect all the nodes (40 
>>         nodes) to the same Lustre filesystem (a single MDS) and take 
>>         advantage of all the concurrency the storage offers. In this 
>>         case, live migration would be possible among all the nodes.
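>>
>>         To be concrete, on a compute node the two setups would look 
>>         roughly like this (hostnames, paths and the Lustre fsname are 
>>         invented; the nova uid/gid must match across all nodes):
>>
>>             # a) NFS: each small cluster mounts its own storage node
>>             # on the storage node, /etc/exports:
>>             /srv/nova/instances  10.0.1.0/24(rw,sync,no_root_squash)
>>             # on every compute node of that cluster:
>>             mount -t nfs storage01:/srv/nova/instances /var/lib/nova/instances
>>
>>             # b) Lustre: all 40 compute nodes mount the same filesystem
>>             mount -t lustre mds01@tcp:/nova /var/lib/nova/instances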
>>
>>         I would like to ask you for any idea, comment or experience. 
>>         I think the most untested case is b), but has anyone tried to 
>>         use Lustre in a similar scenario? Any comments on either case 
>>         a) or b) are appreciated.
>>
>>         Thanks
>>         Miguel.
>>
>>
>>         -- 
>>         Miguel Angel Díaz Corchero
>>         System Administrator / Researcher
>>         c/ Sola nº 1; 10200 TRUJILLO, SPAIN
>>         Tel: +34 927 65 93 17  Fax: +34 927 32 32 37
>>
>>         CETA-Ciemat <http://www.ceta-ciemat.es/>
>>
>>
>>
>
>
> --
> Andrew Woodward
> Mirantis
> Fuel Community Ambassador
> Ceph Community
>
