[openstack-dev] [Fuel] Wiping node's disks on delete

Alexander Gordeev agordeev at mirantis.com
Wed Mar 23 15:22:56 UTC 2016

Hello Dmitry,

First of all, thanks for recovering the thread.

Please read my comments inline.

On Tue, Mar 22, 2016 at 1:07 PM, Dmitry Guryanov <dguryanov at mirantis.com> wrote:

> The first problem could be solved with zeroing first 512 bytes of each
> disk (not partition). Even 446 to be precise, because last 66 bytes are
> partition scheme, see
> https://wiki.archlinux.org/index.php/Master_Boot_Record .
Apparently, fuel has been using GPT since the very beginning [1].
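
For reference, the MBR layout from the quoted proposal (446 bytes of boot code, then the 64-byte partition table and the 2-byte signature) can be sketched in Python. This is only an illustration, not code from astute or fuel-agent: the function name wipe_mbr_boot_code and the path argument are hypothetical, and the sketch assumes the caller passes a whole-disk device or image file.

```python
import os

# Bytes 0..445 of sector 0 hold the bootstrap code; the remaining 66 bytes
# (64-byte partition table + 2-byte 0x55AA signature) are left untouched.
BOOTSTRAP_CODE_SIZE = 446


def wipe_mbr_boot_code(path):
    """Zero only the boot-code area of the first sector.

    `path` is a hypothetical argument: a whole-disk device or an image file.
    """
    with open(path, "r+b") as disk:
        disk.seek(0)
        disk.write(b"\x00" * BOOTSTRAP_CODE_SIZE)
        disk.flush()
        os.fsync(disk.fileno())  # make sure the zeros actually reach the disk
```

Opening the file in "r+b" mode overwrites in place without truncating, so everything past byte 445 is preserved.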

fuel-agent creates only GPT [2]. It does have some sort of rudimentary
MBR support inside [3], but I really doubt that the corresponding code
path has ever been executed in a real use case. It looks like only the
unit tests actually use it.

Currently, because Fuel lacks UEFI support, fuel-agent has to use a
special dedicated partition to allow booting in CSM (BIOS/GPT) mode
[4] [5].

And it turns out, that you're right about the fact that first stage of grub
resides in MBR [6] no




[4] https://help.ubuntu.com/community/Grub2/Installing#BIOS.2FGPT_Notes


[6] https://en.wikipedia.org/wiki/BIOS_boot_partition

> The second problem should be solved only after reboot into bootstrap.
> Because if we bring a new node to the cluster from some other place and
> boot it with bootstrap image it will possibly have disks with some
> partitions, md devices and lvm volumes. So all these entities should be
> correctly cleared before provisioning, not before reboot. And fuel-agent
> does it in [1].
However, the code from

does not allow us to mix LVM and MD. So fuel-agent can only use, wipe,
or create them separately. The cases where MD is built on top of LVM
volumes, or vice versa, are not supported, and fuel-agent will fail on
them. I suspect this issue belongs in a separate thread; I just want you
to be aware that the way fuel-agent does it is not perfectly correct. At
least it works for Fuel's case.

> I propose to remove erasing first 1M of each partition, because it can lead
> to errors in FS kernel drivers and kernel panic. An existing workaround,
> that in case of kernel panic we do reboot is bad because it may occur just
> after clearing first partition of the first disk and after reboot bios will
> read MBR of the second disk and boot from it instead of network. Let's just
> clear first 446 bytes of each disk.
> [0]
> https://github.com/openstack/fuel-astute/blob/master/mcagents/erase_node.rb#L162-L174

Yep, astute needs to be fixed: the way it wipes the disks is too fragile,
dangerous, and not always reliable, for the reasons you mentioned above.

Nope, I think that zeroing 446 bytes is not enough. Why don't we wipe the
bios_boot partition too? Let's wipe all grub leftovers, including
bios_boot partitions. They don't contain any FS, so it's unlikely that
the kernel or any other process will prevent us from wiping them. No
errors and no kernel panics are expected.
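
To make the bios_boot idea concrete, here is a minimal Python sketch that reads the primary GPT, finds entries whose type GUID is the BIOS boot partition type (21686148-6449-6E6F-744E-656564454649), and zeroes them. It is not how astute or fuel-agent does it. The function names are hypothetical, and the sketch assumes 512-byte logical sectors, a valid primary GPT header at LBA 1, and partitions small enough to zero with one in-memory buffer.

```python
import struct
import uuid

SECTOR = 512  # assumption: 512-byte logical sectors
BIOS_BOOT_GUID = uuid.UUID("21686148-6449-6E6F-744E-656564454649")


def find_bios_boot_partitions(path):
    """Return a list of (first_lba, last_lba) for BIOS boot partitions."""
    found = []
    with open(path, "rb") as disk:
        disk.seek(1 * SECTOR)  # the primary GPT header lives at LBA 1
        header = disk.read(92)
        if header[:8] != b"EFI PART":
            return found  # no GPT signature, nothing to report
        # Offset 72: entries start LBA, entry count, size of one entry.
        entries_lba, num_entries, entry_size = struct.unpack_from(
            "<QII", header, 72)
        for i in range(num_entries):
            disk.seek(entries_lba * SECTOR + i * entry_size)
            entry = disk.read(entry_size)
            if len(entry) < 48:
                break  # truncated table
            # GPT stores GUIDs in mixed-endian form, hence bytes_le.
            if uuid.UUID(bytes_le=entry[:16]) == BIOS_BOOT_GUID:
                found.append(struct.unpack_from("<QQ", entry, 32))
    return found


def wipe_bios_boot_partitions(path):
    """Zero every BIOS boot partition; return the wiped LBA ranges."""
    ranges = find_bios_boot_partitions(path)
    with open(path, "r+b") as disk:
        for first_lba, last_lba in ranges:
            disk.seek(first_lba * SECTOR)
            disk.write(b"\x00" * ((last_lba - first_lba + 1) * SECTOR))
    return ranges
```

The sketch leaves the GPT itself intact and touches only the sectors inside the bios_boot partition, which holds grub's core image rather than a filesystem.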

On Tue, Mar 22, 2016 at 5:06 PM, Dmitry Guryanov <dguryanov at mirantis.com> wrote:

> For GPT disks and non-UEFI boot this method will work, since MBR will
> still contain first stage of a bootloader code.

Agreed, it will work. But what about the bios_boot partition? What do you think?

Thanks,  Alex.