[openstack-dev] [Fuel] Restore OSD devices with Puppet Ceph module

Oleg Gelbukh ogelbukh at mirantis.com
Wed Jul 22 12:52:26 UTC 2015


Greetings,

While working on the upgrade of OpenStack with the Fuel installer, I ran
into a requirement to re-add OSD devices with an existing data set to a
Ceph cluster using the Puppet module. The node is reinstalled during the
upgrade, so the disks used for OSDs are not mounted at Puppet runtime.

The current version of the Ceph module in fuel-library only supports the
addition of new OSD devices. Mounted devices are skipped. Unmounted devices
carrying the Ceph UUID in their GPT label are passed to the 'ceph-deploy
osd prepare' command, which formats the device and recreates the file
system, so all existing data is lost.

I proposed a patch to allow support for OSD devices with existing data set:
https://review.openstack.org/#/c/203639/2

However, this fix is very straightforward and doesn't account for different
corner cases, as was pointed out by Mykola Golub in review. As this problem
seems rather significant to me, I'd like to bring this discussion to the
broader audience.

So, here's the comment with my replies inline:

I am not sure just reactivating disks that have a filesystem is a safe
approach:

1) If you are deploying a mix of new and restored disks, you may end up
with conflicting OSDs joining the cluster with the same ID.
2) It makes sense to restore OSDs only if a monitor (cluster) is restored;
otherwise activation of old OSDs will fail.
3) It might happen that the partition contains a valid filesystem by
accident (e.g. the user reused disks/hosts from another cluster) -- it will
not join the cluster because of a wrong fsid and credentials, but the
deployment will unexpectedly fail.

1) As far as I can tell, OSD IDs are assigned by the Ceph cluster based on
the devices that already exist. So, if some ID is stored on the device,
either a device with that ID already exists in the cluster and no other new
device will get the same ID, or the cluster doesn't know about a device
with that ID, which means we had already lost that data placement before.
2) This can be fixed by adding a check that the fsid parameter in ceph.conf
on the node and the cluster fsid stored on the device are equal (see the
sketch after these replies). Otherwise the device is treated as a new
device, i.e. passed to 'ceph-deploy osd prepare'.
3) This situation would be covered by the previous check, in my
understanding.
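A minimal sketch of that fsid consistency check, assuming a filestore OSD
whose data partition can be mounted read-only and contains the usual
ceph_fsid marker file (the partition path and temporary mount point are
illustrative, not taken from the module):

    # Sketch: compare the node's ceph.conf fsid with the fsid recorded on
    # the OSD data partition.
    PART=/dev/sdb1
    MNT=$(mktemp -d)

    CONF_FSID=$(awk -F' *= *' '/^fsid/ {print $2}' /etc/ceph/ceph.conf)

    mount -o ro "${PART}" "${MNT}"
    DISK_FSID=$(cat "${MNT}/ceph_fsid")
    umount "${MNT}"

    if [ "${CONF_FSID}" = "${DISK_FSID}" ]; then
        echo "${PART} belongs to this cluster -- safe to activate"
    else
        echo "${PART} belongs to another cluster -- treat as a new device"
    fi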

Is it possible to pass the information that the cluster is restored using
partition preservation? Because I think a much safer approach is:

1) Pass some flag from the user that we are restoring the cluster.
2) Restore the controller (monitor) and abort the deployment if it fails.
3) When deploying the OSD host, if the 'restore' flag is present, skip the
prepare step and try only to activate all disks if possible (we might want
to ignore activation errors and continue with the other disks, so we
restore as many OSDs as possible).

The case I want to support with this change is not restoration of the whole
cluster, but rather reinstallation of an OSD node's operating system. For
this case, the approach you propose actually seems more correct than my
implementation. For a node being reinstalled we do not expect new devices,
only ones with an existing data set, so we don't need to check for that
specifically, but can simply skip the prepare step for all devices.

We still need to check that the value of fsid on the disk is consistent
with the cluster's fsid.
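Putting the two pieces together, a rough sketch of the activation-only flow
for a reinstalled node could look like this; 'ceph-disk activate' is the
standard tool for bringing up an already-prepared OSD partition, while the
device list and error handling are my assumptions, not code from the
module:

    # Sketch: on a reinstalled OSD node, skip 'prepare' entirely and try to
    # activate every Ceph-labelled partition, continuing past failures so
    # we bring back as many OSDs as possible.
    for PART in /dev/sdb1 /dev/sdc1; do     # device list is illustrative
        # the fsid consistency check from the previous sketch goes here
        if ! ceph-disk activate "${PART}"; then
            echo "warning: failed to activate ${PART}, continuing" >&2
        fi
    done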

Which issues should we anticipate with this kind of approach?

Another question that is still unclear to me is whether anyone really needs
support for the hybrid use case where new and existing unmounted OSD
devices are mixed in one OSD node.

--
Best regards,
Oleg Gelbukh
