[Ironic] No suitable device was found for deployment

韩光宇 hanguangyu2 at gmail.com
Fri Mar 4 09:25:11 UTC 2022


Hi Julia,

Sorry that my last email didn't reply to some of the questions in your
email. When I got into the RAID config menu, the disk state was shown as
"unconfig bad". In addition, Virtual Drive Management displayed "No
Configuration Present!". But I could not modify the disk state in the
RAID config menu. Even when I moved the disk to another server and used
software in the operating system to modify it, the state still could not
be changed. For example:
```shell
# MegaCli -PDList -a0

Adapter #0

Enclosure Device ID: 252
Slot Number: 1
Enclosure position: 0
Device Id: 9
WWN: 50014EE2BE72FEBF
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 0 KB [0x0 Sectors]
Non Coerced Size: 0 KB [0x0 Sectors]
Coerced Size: 0 KB [0x0 Sectors]
Firmware state: Unconfigured(bad)
Device Firmware Level: WA09
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x4433221101000000
Connected Port Number: 0(path0)
Inquiry Data: ATA     HGST HUS722T2TALWA09WCC6N4HZV9SX
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :0C (32.00 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Enabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 2
Enclosure position: 0
Device Id: 10
WWN: 50014EE2691D724C
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 0 KB [0x0 Sectors]
Non Coerced Size: 0 KB [0x0 Sectors]
Coerced Size: 0 KB [0x0 Sectors]
Firmware state: Unconfigured(bad)
Device Firmware Level: WA09
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x4433221102000000
Connected Port Number: 1(path0)
Inquiry Data: ATA     HGST HUS722T2TALWA09WCC6N4HZVTV5
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: Unknown
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :0C (32.00 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Disabled
Drive's NCQ setting : Enabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No



Enclosure Device ID: 252
Slot Number: 3
Enclosure position: 0
Device Id: 11
WWN: 50014EE2BE733A5E
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SATA
Raw Size: 1.819 TB [0xe8e088b0 Sectors]
Non Coerced Size: 1.818 TB [0xe8d088b0 Sectors]
Coerced Size: 1.818 TB [0xe8d00000 Sectors]
Firmware state: JBOD
Device Firmware Level: WA09
Shield Counter: 0
Successful diagnostics completion on :  N/A
SAS Address(0): 0x4433221103000000
Connected Port Number: 2(path0)
Inquiry Data: WCC6N0KX0HJD        HGST HUS722T2TALA604
   RAGNWA09
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive Temperature :34C (93.20 F)
PI Eligibility:  No
Drive is formatted for PI information:  No
PI: No PI
Drive's write cache : Enabled
Drive's NCQ setting : Enabled
Port-0 :
Port status: Active
Port's Linkspeed: 6.0Gb/s
Drive has flagged a S.M.A.R.T alert : No




Exit Code: 0x00
```
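
The listing above is long, so a short summary may be easier to read. A
minimal sketch, assuming the field names are exactly those printed above
("Slot Number", "Raw Size", "Firmware state"); `pd_summary` is a
hypothetical helper name, not part of MegaCli:

```shell
# pd_summary: hypothetical helper, not part of MegaCli.
# Reads `MegaCli -PDList` output on stdin and prints one line per drive.
# Relies on "Slot Number" and "Raw Size" appearing before "Firmware state"
# within each drive record, as in the listing above.
pd_summary() {
    awk -F': *' '
        /^Slot Number:/    { slot = $2 }
        /^Raw Size:/       { size = $2 }
        /^Firmware state:/ { printf "slot=%s state=%s raw_size=%s\n", slot, $2, size }
    '
}

# Usage on the affected host:
#   MegaCli -PDList -a0 | pd_summary
```

For the output above this would report slots 1 and 2 as Unconfigured(bad)
with a raw size of 0 KB, and slot 3 as JBOD with 1.819 TB.
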
```shell
# MegaCli -PDList -a0 | grep state
Firmware state: Unconfigured(bad)
Firmware state: Unconfigured(bad)
Firmware state: JBOD
# MegaCli -PDMakeGood -PhysDrv[252:2] -a0

Adapter: 0: Failed to change PD state at EnclId-252 SlotId-2.

Exit Code: 0x01
```
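
For reference, the selection rule Arne described earlier in this thread
(without root device hints, take the smallest block device of at least
4294967296 bytes) can be sketched as follows. This is only an illustration
of that rule, not the actual ironic-python-agent code; `pick_root_device`
is a hypothetical name, and the input is assumed to be "NAME SIZE" pairs
with the size in bytes:

```shell
# pick_root_device: hypothetical illustration, NOT the ironic-python-agent code.
# Reads "NAME SIZE_IN_BYTES" lines on stdin and prints the smallest device
# that is at least 4294967296 bytes (4 GiB); ties are broken by name.
# Prints nothing when no device qualifies.
pick_root_device() {
    awk -v min=4294967296 '$2 + 0 >= min + 0 { print $2, $1 }' |
        sort -k1,1n -k2,2 |
        head -n 1 |
        awk '{ print $2 }'
}

# Usage (lsblk: -b sizes in bytes, -d top-level devices only, -n no header):
#   lsblk -b -d -n -o NAME,SIZE | pick_root_device
```

With no visible disks the function prints nothing, which corresponds to
the "No suitable device was found for deployment" error.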

The "security locked state" is an idea I will keep following. I will try
to determine whether the disks are in this state and find a way to unlock them.
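
A minimal sketch of such a check, assuming the disk is attached to a plain
SATA port on another machine and shows up as /dev/sdX (a placeholder), and
that `hdparm -I` prints its usual "Security:" section containing either a
"locked" or a "not locked" line; `ata_lock_state` is a hypothetical helper
name:

```shell
# ata_lock_state: hypothetical helper name.
# Reads `hdparm -I` output on stdin and prints "locked", "not locked",
# or "unknown" when no Security section / locked line is found.
ata_lock_state() {
    awk '
        /^Security:/                                   { in_sec = 1; next }
        in_sec && /^[[:space:]]*not[[:space:]]+locked/ { print "not locked"; found = 1; exit }
        in_sec && /^[[:space:]]*locked/                { print "locked"; found = 1; exit }
        END { if (!found) print "unknown" }
    '
}

# Usage (as root; /dev/sdX is a placeholder for the real device):
#   hdparm -I /dev/sdX | ata_lock_state
```

If this reports "locked", hdparm has `--security-unlock` and
`--security-disable` options, but both need the password that was set on
the drive.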

Thank you very much Julia,
cheers,
Han Guangyu

Julia Kreger <juliaashleykreger at gmail.com> wrote on Tue, Mar 1, 2022 at 22:06:
>
> On Mon, Feb 28, 2022 at 1:12 AM Arne Wiebalck <arne.wiebalck at cern.ch> wrote:
> >
> > Hi Guangyu,
> >
> > I am not aware of anything in the Ironic Python Agent that
> > would remove disks from the system in a way that they would
> > not be visible after a reboot (apart from, as mentioned before,
> > the clean up of a hardware RAID in a way the IPA is not able
> > to see any devices after).
> >
> > How about trying to access and configure the hardware RAID with
> > the corresponding tool from the RAM disk you booted into from the
> > USB stick? Install the tool and see if it detects the controller.
> >
> > The very first step before doing anything with Ironic is to
> > get the disks back or understand why they are not visible.
> >
>
> Did cleaning fail at any given point with these machines?
>
> If you have physical access, try disconnecting all of the drives, and
> then powering up the machine and see if you can get into the firmware
> configuration screen with control-h. If you can, remove all of the
> prior configuration or disk volumes. They will look like they are in
> error states most likely. If you're unable to get into this screen, I
> would be worried about your disk controller card. If you're able to
> clear everything out of the controller, power off, try re-inserting
> drives, and see what happens. See if the controller can view/interact
> with the drives. If it sees no drives, then my next paragraph is
> likely the case.
>
> The disks sound like they might be in security locked state which will
> likely require a desktop SATA disk controller to remedy by attaching
> and manually removing from a security locked state. Megaraid
> controllers can't recognize security locked devices (most controllers
> and especially ones labeled "raid controllers" can't handle it) when
> in pass-through mode, but I've never heard of security lock commands
> actually getting through to the device with those controllers in
> pass-through mode. If the card was in raid mode to begin with, then it
> likely never did anything involving secure erase as the controller
> should not be offering that as a feature of provided disks to the OS.
>
> > Cheers,
> >   Arne
> >
> > On 28.02.22 09:28, 韩光宇 wrote:
> > > Hi Arne,
> > >
> > > I didn't find a hardware RAID config option during the initial boot
> > > sequence. Ctrl+H is unresponsive on this computer. I just saw "Press
> > > Del to enter firmware configuration, press F3 to enter boot menu, and
> > > press F12 to enter network boot". So I pressed 'Del' to enter the BIOS,
> > > but I didn't find a RAID config menu there. Sorry that I have poor
> > > knowledge about this.
> > >
> > > And now, even though I installed the operating system manually using a
> > > USB stick, I still couldn't find the hard drive. Is there anything
> > > that ironic-agent did in the clean phase that would have caused this
> > > problem?
> > >
> > > I wonder if this is a starting point for solving the problem. My idea
> > > now is to first find a way to configure RAID manually. Do you agree with
> > > this? And then, if the clean phase does clear RAID configurations, will
> > > my manual configuration be cleared again in the clean phase?
> > >
> > > Sorry that I am so confused.
> > >
> > > love you,
> > > Guangyu
> > >
> > > Arne Wiebalck <arne.wiebalck at cern.ch> wrote on Mon, Feb 14, 2022 at 15:59:
> > >>
> > >> Hi Guangyu,
> > >>
> > >> It seems like Julia had the right idea and the disks
> > >> are not visible since the RAID controller does not
> > >> expose anything to the operating system. This seems
> > >> to be confirmed by you booting into the CentOS7 image.
> > >>
> > >> What I would suggest to try next is to look for the
> > >> hardware RAID config option during the initial boot
> > >> sequence to enter the RAID config menu (there should be
> > >> a message quite early on, and maybe Ctrl-H is needed
> > >> to enter the menu).
> > >>
> > >> Once there, manually configure the disks as JBODs or
> > >> create a RAID device. Upon reboot this should be visible
> > >> and accessible as a device. Maybe check from your CentOS7
> > >> image again. If the devices are there, Ironic should
> > >> also be able to deploy on them (for this you can remove
> > >> the RAID config you added).
> > >>
> > >> It depends a little on what your goal is, but I would
> > >> try this first to see if you can make a device visible
> > >> and if the Ironic deploy bit works, before trying to
> > >> configure the hardware RAID via Ironic.
> > >>
> > >> Cheers,
> > >>    Arne
> > >>
> > >> On 14.02.22 03:20, 韩光宇 wrote:
> > >>> Hi Arne and Julia,
> > >>>
> > >>> You make me feel so warm. Best wishes to you.
> > >>>
> > > >>> I have tried to boot the node into CentOS7, but it still could not
> > > >>> find the disks. And sorry that I didn't notice the RAID card.
> > >>>
> > >>> # lspci -v
> > >>> ...
> > >>> 23:00.0 RAID bus controller: Broadcom / LSI MegaRAID SAS-3 3108
> > >>> [Invader] (rev 02)
> > >>>           Subsystem: Broadcom / LSI MegaRAID SAS 9361-8i
> > >>>           Flags: bus master, fast devsel, latency 0, IRQ -2147483648, NUMA node 1
> > >>>           I/O ports at 3000 [size=256]
> > >>>           Memory at e9900000 (64-bit, non-prefetchable) [size=64K]
> > >>>           Memory at e9700000 (64-bit, non-prefetchable) [size=1M]
> > >>>           Expansion ROM at e9800000 [disabled] [size=1M]
> > >>>           Capabilities: [50] Power Management version 3
> > >>>           Capabilities: [68] Express Endpoint, MSI 00
> > >>>           Capabilities: [d0] Vital Product Data
> > >>>           Capabilities: [a8] MSI: Enable- Count=1/1 Maskable+ 64bit+
> > >>>           Capabilities: [c0] MSI-X: Enable+ Count=97 Masked-
> > >>>           Capabilities: [100] Advanced Error Reporting
> > >>>           Capabilities: [1e0] #19
> > >>>           Capabilities: [1c0] Power Budgeting <?>
> > >>>           Capabilities: [148] Alternative Routing-ID Interpretation (ARI)
> > >>>           Kernel driver in use: megaraid_sas
> > >>>           Kernel modules: megaraid_sas
> > >>> ...
> > >>>
> > > >>> I tried to configure RAID following
> > > >>> https://docs.openstack.org/ironic/latest/admin/raid.html
> > > >>> with `baremetal node set $NODE_UUID --target-raid-config raid.json`. The
> > > >>> server has three identical disks (Western Digital DC HA210 2TB SATA 6GB/s).
> > >>> # cat raid.json
> > >>> {
> > >>>     "logical_disks": [
> > >>>       {
> > >>>         "size_gb": "MAX",
> > >>>         "raid_level": "0",
> > >>>         "is_root_volume": true
> > >>>       }
> > >>>     ]
> > >>> }
> > >>>
> > > >>> But Ironic still couldn't see the disks. I still got:
> > >>> ```
> > >>> ## In deploy images
> > >>> # journalctl -fxeu ironic-python-agent
> > >>> Feb 14 02:17:22 host-10-12-22-74 ironic-python-agent[2329]: 2022-02-14
> > >>> 02:17:22.863 2329 WARNING root [-] Path /dev/disk/by-path is
> > >>> inaccessible, /dev/disk/by-path/* version of block device name is
> > >>> unavailable Cause: [Errno 2] No such file or directory:
> > >>> '/dev/disk/by-path': FileNotFoundError: [Errno 2] No such file or
> > >>> directory: '/dev/disk/by-path'
> > >>> Feb 14 02:17:44 host-10-12-22-74 ironic-python-agent[2329]: 2022-02-14
> > >>> 02:17:44.391 2329 ERROR root [-] Unexpected error dispatching
> > >>> get_os_install_device to manager
> > >>> <ironic_python_agent.hardware.GenericHardwareManager object at
> > >>> 0x7efbf4da2208>: Error finding the disk or partition device to deploy
> > >>> the image onto: No suitable device was found for deployment - root
> > >>> device hints were not provided and all found block devices are smaller
> > >>> than 4294967296B.: ironic_python_agent.errors.DeviceNotFound: Error
> > >>> finding the disk or partition device to deploy the image onto: No
> > >>> suitable device was found for deployment - root device hints were not
> > >>> provided and all found block devices are smaller than 4294967296B.
> > >>> ```
> > >>>
> > > >>> I don't know if it's a missing RAID card driver, a missing disk
> > > >>> driver, or a missing RAID configuration. Do you have any idea about
> > > >>> this?
> > >>>
> > >>> love you,
> > >>> Han Guangyu
> > >>>
> > >>>
> > > >>> Julia Kreger <juliaashleykreger at gmail.com> wrote on Thu, Feb 10, 2022 at 23:11:
> > >>>
> > >>>>
> > >>>> If the disk controllers *are* enumerated in the kernel log, which is
> > >>>> something to also look for, then the disks themselves may be in some
> > >>>> weird state like security locked. Generally this shows up as the
> > >>>> operating system kind of sees the disk and the SATA port connected but
> > >>>> can't really access it. This is also an exceptionally rare state to
> > >>>> find one's self in.
> > >>>>
> > >>>> More common, especially in enterprise grade hardware: If the disk
> > >>>> controller is actually a raid controller, and there are no raid
> > >>>> volumes configured, then the operating system likely cannot see the
> > >>>> underlying disks and turn that into a usable block device. I've seen a
> > >>>> couple drivers over the years which expose hints of disks in the
> > >>>> kernel log and without raid configuration in the cards, the drivers
> > > >>>> can't present usable block devices to the operating system.
> > >>>>
> > >>>> -Julia
> > >>>>
> > >>>> On Thu, Feb 10, 2022 at 3:17 AM Arne Wiebalck <arne.wiebalck at cern.ch> wrote:
> > >>>>>
> > >>>>> Hi Guangyu,
> > >>>>>
> > >>>>> No worries about asking questions, this is what the mailing
> > >>>>> list is for :)
> > >>>>>
> > >>>>> Just to clarify, you do not have to set root device hints,
> > >>>>> it also works without (with the algorithm I mentioned).
> > >>>>> However, hints help to define the exact device and/or make
> > >>>>> deployment more predictable/repeatable.
> > >>>>>
> > >>>>> If it is really a driver problem, it is an issue with the
> > >>>>> operating system of the image you use, i.e. CentOS8. Some
> > >>>>> drivers were removed from 7 to 8, and we have seen issues
> > >>>>> with specific drive models as well.
> > >>>>>
> > >>>>> You can try to build your own IPA images as described in
> > >>>>> [1], e.g. to add your ssh key to be able to log into the
> > >>>>> IPA to debug further, and to eventually include drivers
> > >>>>> (if you can identify them and they are available for CentOS8).
> > >>>>>
> > >>>>> Another option may be to add another (newer) disk model to
> > >>>>> the server, just to confirm it is the disk model/driver which
> > >>>>> is the cause.
> > >>>>>
> > >>>>> You could also try to boot the node into a CentOS7 (and then
> > >>>>> a CentOS8) live image to confirm it can see the disks at all.
> > >>>>>
> > >>>>> Hope this helps!
> > >>>>>     Arne
> > >>>>>
> > >>>>> [1]
> > >>>>> https://docs.openstack.org/ironic-python-agent-builder/latest/admin/dib.html
> > >>>>>
> > >>>>>
> > >>>>> On 10.02.22 11:15, 韩光宇 wrote:
> > >>>>>> Hi Arne,
> > >>>>>>
> > > >>>>>> Thank you very much for your response. Love you. You cleared up a lot
> > > >>>>>> of my confusion.
> > > >>>>>>
> > > >>>>>> You are right, I didn't set 'root device'. Ironic also cannot see any
> > > >>>>>> disk: the content of the 'lsblk' file in the deploy logs is empty.
> > > >>>>>> I tried to set 'root device', but because Ironic can't find any disk,
> > > >>>>>> the deploy still failed.
> > >>>>>>
> > >>>>>> Feb 10 09:51:55 host-10-12-22-59 ironic-python-agent[2324]: 2022-02-10
> > >>>>>> 09:51:55.045 2324 WARNING root [-] Path /dev/disk/by-path is
> > >>>>>> inaccessible, /dev/disk/by-path/* version of block device name is
> > >>>>>> unavailable Cause: [Errno 2] No such file or directory:
> > >>>>>> '/dev/disk/by-path': FileNotFoundError: [Errno 2] No such file or
> > >>>>>> directory: '/dev/disk/by-path'
> > >>>>>> Feb 10 09:51:55 host-10-12-22-59 ironic-python-agent[2324]: 2022-02-10
> > >>>>>> 09:51:55.056 2324 WARNING ironic_lib.utils [-] No device found that
> > >>>>>> matches the root device hints {'wwn': '0x50014EE2691D724C'}:
> > >>>>>> StopIteration
> > >>>>>>
> > > >>>>>> Sorry to bother you, I'm a newcomer to Ironic and I didn't find
> > > >>>>>> information about it on Google.
> > > >>>>>>
> > > >>>>>> The bare metal node has three identical disks (Western Digital DC HA210
> > > >>>>>> 2TB SATA 6GB/s). Where can I confirm whether ironic-python-agent
> > > >>>>>> supports this disk?
> > >>>>>>
> > > >>>>>> And if Ironic cannot find the disks because the corresponding drivers
> > > >>>>>> in the IPA image are missing, do you know how to resolve it? I have
> > > >>>>>> used the latest deploy images from
> > > >>>>>> https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/
> > > >>>>>> .  Do I need to find and manually add the driver in the source code or
> > > >>>>>> the ramdisk (that would be difficult for me)?
> > >>>>>>
> > >>>>>> Love you.
> > >>>>>>
> > >>>>>> Cheers,
> > >>>>>> Guangyu
> > >>>>>>
> > > >>>>>> Arne Wiebalck <arne.wiebalck at cern.ch> wrote on Thu, Feb 10, 2022 at 15:51:
> > >>>>>>>
> > >>>>>>> Hi Guangyu,
> > >>>>>>>
> > >>>>>>> The error indicates that Ironic was not able to find
> > >>>>>>> a device where it could deploy the image to.
> > >>>>>>>
> > >>>>>>> To find a device, Ironic will use 'root device'
> > >>>>>>> hints [1], usually set by the admin on a node. If that
> > >>>>>>> does not yield anything, Ironic will loop over all
> > >>>>>>> block devices and pick the smallest which is larger
> > >>>>>>> than 4GB (and order them alphabetically).
> > >>>>>>>
> > >>>>>>> If you have disks in your server which are larger than
> > >>>>>>> 4GB, one potential explanation is that Ironic cannot see them,
> > >>>>>>> e.g. since the corresponding drivers in the IPA image are missing.
> > >>>>>>> The logs you posted seem to confirm something along those
> > >>>>>>> lines.
> > >>>>>>>
> > >>>>>>> Check the content of the 'lsblk' file in the deploy logs which
> > >>>>>>> you can find in the tar archive in /var/log/ironic/deploy/
> > >>>>>>> on the controller for your deployment attempt to see what
> > >>>>>>> devices Ironic has access to.
> > >>>>>>>
> > >>>>>>> Cheers,
> > >>>>>>>      Arne
> > >>>>>>>
> > >>>>>>>
> > >>>>>>> [1] https://docs.openstack.org/ironic/latest/install/advanced.html#root-device-hints
> > >>>>>>>
> > >>>>>>> On 10.02.22 02:50, 韩光宇 wrote:
> > >>>>>>>> Dear all,
> > >>>>>>>>
> > > >>>>>>>> I have an OpenStack Victoria environment and tried to use Ironic to
> > > >>>>>>>> manage bare metal. But I got "- root device hints were not provided
> > > >>>>>>>> and all found block devices are smaller than 4294967296B." in the
> > > >>>>>>>> deploy stage.
> > >>>>>>>>
> > >>>>>>>> 2022-02-09 17:57:56.492 3908982 ERROR
> > >>>>>>>> ironic.drivers.modules.agent_base [-] Agent returned error for deploy
> > >>>>>>>> step {'step': 'write_image', 'priority': 80, 'argsinfo': None,
> > >>>>>>>> 'interface': 'deploy'} on node cc68c450-ce54-4e1c-be04-8b0a6169ef92 :
> > >>>>>>>> No suitable device was found for deployment - root device hints were
> > >>>>>>>> not provided and all found block devices are smaller than
> > >>>>>>>> 4294967296B..
> > >>>>>>>>
> > > >>>>>>>> I used "openstack server create --flavor my-baremetal-flavor --nic
> > > >>>>>>>> net-id=$net_id --image $image testing" to deploy the bare metal node.
> > > >>>>>>>> I downloaded the deploy images (ipa-centos8-master.kernel and
> > > >>>>>>>> ipa-centos8-master.initramfs) from
> > > >>>>>>>> https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/.
> > >>>>>>>>
> > > >>>>>>>> The bare metal node info and flavor info are as follows:
> > > >>>>>>>> https://paste.opendev.org/show/bV7lgO6RkNQY6ZGPbT2e/
> > > >>>>>>>> The Ironic configuration file is as follows:
> > >>>>>>>> https://paste.opendev.org/show/bTgY9Kpn7KWqwQl73aEa/
> > >>>>>>>> Ironic-conductor log:    https://paste.opendev.org/show/bFKZYlXmccxNxU8lEogk/
> > >>>>>>>> The log of ironic-python-agent in bare metal node:
> > >>>>>>>> https://paste.opendev.org/show/btAuaMuV2IutV2Pa7YIa/
> > >>>>>>>>
> > > >>>>>>>> I have seen some old discussions about this, such as:
> > > >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1312187. But those
> > > >>>>>>>> discussions took place a long time ago, were not about the Victoria
> > > >>>>>>>> release, and did not show a solution.
> > >>>>>>>>
> > >>>>>>>> Does anyone know how to solve this problem? I would appreciate any
> > >>>>>>>> kind of guidance or help.
> > >>>>>>>>
> > >>>>>>>> Thank you,
> > >>>>>>>> Han Guangyu
> > >>>>>>>>
> > >>>>>


