Hey Junya,

I am unsure specifically, but https://bugs.launchpad.net/ironic/+bug/2045191 looks like it may be impacting you. I'd suggest using IPMI in the meantime as it works. If you can add your details to that bug, it may help us reach a resolution faster.

Thanks,
Jay Faulkner

On Fri, Mar 22, 2024 at 6:44 AM Junya Noguchi (Fujitsu) <noguchi.junya@fujitsu.com> wrote:
To whom it may concern.

My name is Junya Noguchi.

We have the following problems and questions.
Please respond.

1. Problems and Questions
When we use OpenStack Ironic Redfish driver to manage Redfish capable server, the server may not accept power on operations.
Server may or may not power on.
We want to clarify whether this is a server-specific issue or an OpenStack Ironic bug.

Occurrence event
・Auto-cleaning or instance startup failed.
・Ironic does not power on via Redfish driver (see log)
  ・We tried increasing the power_state_change in ironic.conf to 30 minutes, but it didn't resolved.
・It does not happen when changing Redfish to IPMI

2. Server information
R272-P31 (GIGABYTE Arm Server)
CPU                     : Ampere Altra Max      2.8GHz-3.0GHz   1/128
Memory                  : 256GB
Disk                    : 1TB/SSDx3/RAID1
Server firmware         : MegaRac SP-X:https://www.ami.com/megarac/ ※It states that it is compatible with IPMI/Redfish
Openstaack ver  : Zed
Hardware interface      : redfish
Power interface         : redfish
Boot interface          : pxe
Deploy interface                : direct
Network interface       : flat

3. Log
File : /var/log/ironic/ironic-conductor.log
2023-12-19 10:24:47.677 384940 DEBUG sushy.connector [-] HTTP request: GET "Server FQDN"/redfish/v1/Systems/Self; headers: {'OData-Version': '4.0'}; body: None; blocking: False; timeout: 60; session arguments: {}; _op /usr/lib/python3.9/site-packages/sushy/connector.py:118
2023-12-19 10:24:49.975 384940 DEBUG sushy.connector [-] HTTP response for GET "Server FQDN"/redfish/v1/Systems/Self: status code: 200 _op /usr/lib/python3.9/site-packages/sushy/connector.py:226
2023-12-19 10:24:49.976 384940 DEBUG sushy.resources.base [-] Received representation of System /redfish/v1/Systems/Self: {'_actions': {'reset': {'allowed_values': None, 'operation_apply_time_support': {'_maintenance_window_resource': {'resource_uri': '/redfish/v1/Systems/Self'}, 'maintenance_window_duration_in_seconds': 600, 'maintenance_window_start_time': None, 'mapped_supported_values': [<ApplyTime.IMMEDIATE: 'Immediate'>, <ApplyTime.AT_MAINTENANCE_WINDOW_START: 'AtMaintenanceWindowStart'>], 'supported_values': ['Immediate', 'AtMaintenanceWindowStart']}, 'target_uri': '/redfish/v1/Systems/Self/Actions/ComputerSystem.Reset'}}, '_oem_vendors': ['Gbt'], '_settings': None, 'asset_tag': 'Free form asset tag', 'bios_version': 'F31h (SCP: 2.10.20220531)', 'boot': {'allowed_values': ['None', 'Pxe', 'Floppy', 'Cd', 'Usb', 'Hdd', 'BiosSetup', 'Utilities', 'UefiShell', 'UefiTarget', 'SDCard', 'UefiHttp', 'RemoteDrive', 'UefiBootNext'], 'enabled': <BootSourceOverrideEnabled.ONCE: 'Once'>, 'mode': <BootSourceOverrideMode.UEFI: 'UEFI'>, 'target': <BootSource.PXE: 'Pxe'>}, 'description': 'System Self', 'hostname': None, 'identity': 'Self', 'indicator_led': <IndicatorLED.OFF: 'Off'>, 'links': {'oem_vendors': None}, 'maintenance_window': None, 'manufacturer': 'GIGABYTE', 'memory_summary': {'health': 'OK', 'size_gib': 256}, 'name': 'System', 'part_number': '000000000001', 'power_state': <PowerState.OFF: 'Off'>, 'serial_number': 'XXXXXXXXXXXXXX', 'sku': '01234567890123456789AB', 'status': {'health': <Health.OK: 'OK'>, 'health_rollup': <Health.CRITICAL: 'Critical'>, 'state': <State.ENABLED: 'Enabled'>}, 'system_type': <SystemType.PHYSICAL: 'Physical'>, 'uuid': '00000000-0000-4000-8000-D85ED387A8B3'} refresh /usr/lib/python3.9/site-packages/sushy/resources/base.py:689
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall [-] Dynamic backoff interval looping call 'ironic.conductor.utils.node_wait_for_power_state.<locals>._wait' failed: oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 1490.91 seconds
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall Traceback (most recent call last):
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall   File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 154, in _run_loop
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall     idle = idle_for_func(result, self._elapsed(watch))
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall   File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 349, in _idle_for
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall     raise LoopingCallTimeOut(
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 1490.91 seconds
2023-12-19 10:24:49.976 384940 ERROR oslo.service.loopingcall
2023-12-19 10:24:49.977 384940 ERROR ironic.conductor.utils [None req-60791dc1-85c6-44c6-a3b7-a60e3242607d 4cb102fe974543b19c3e39916c7e61c1 00cfc520ffcb4533861e005a9a1e694f - - default default] Timed out after 1200 secs waiting for power on on node a357fdab-d8d5-4260-8e72-b444fa7c5cb9.: oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 1490.91 seconds
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils [None req-60791dc1-85c6-44c6-a3b7-a60e3242607d 4cb102fe974543b19c3e39916c7e61c1 00cfc520ffcb4533861e005a9a1e694f - - default default] Failed to prepare node a357fdab-d8d5-4260-8e72-b444fa7c5cb9 for cleaning: Failed to set node power state to power on.: ironic.common.exception.PowerStateFailure: Failed to set node power state to power on.
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils Traceback (most recent call last):
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 183, in node_wait_for_power_state
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     return timer.start(initial_delay=1, timeout=retry_timeout).wait()
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/eventlet/event.py", line 125, in wait
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     result = hub.switch()
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/eventlet/hubs/hub.py", line 313, in switch
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     return self.greenlet.switch()
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 154, in _run_loop
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     idle = idle_for_func(result, self._elapsed(watch))
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/oslo_service/loopingcall.py", line 349, in _idle_for
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     raise LoopingCallTimeOut(
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 1490.91 seconds
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils oslo_service.loopingcall.LoopingCallTimeOut: Looping call timed out after 1490.91 seconds
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils During handling of the above exception, another exception occurred:
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils Traceback (most recent call last):
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/conductor/cleaning.py", line 91, in do_node_clean
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     prepare_result = task.driver.deploy.prepare_cleaning(task)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic_lib/metrics.py", line 59, in wrapped
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     result = f(*args, **kwargs)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/agent_base.py", line 726, in prepare_cleaning
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     result = deploy_utils.prepare_inband_cleaning(
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/deploy_utils.py", line 732, in prepare_inband_cleaning
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     manager_utils.node_power_action(task, states.REBOOT)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/conductor/task_manager.py", line 153, in wrapper
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     return f(*args, **kwargs)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 341, in node_power_action
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     notify_utils.emit_power_set_notification(
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 227, in __exit__
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     self.force_reraise()
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/oslo_utils/excutils.py", line 200, in force_reraise
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     raise self.value
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 329, in node_power_action
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     task.driver.power.reboot(task, timeout=timeout)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/conductor/task_manager.py", line 153, in wrapper
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     return f(*args, **kwargs)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/redfish/power.py", line 166, in reboot
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     _set_power_state(task, system, next_state, timeout=timeout)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/drivers/modules/redfish/power.py", line 68, in _set_power_state
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     cond_utils.node_wait_for_power_state(task, target_state,
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils   File "/usr/lib/python3.9/site-packages/ironic/conductor/utils.py", line 189, in node_wait_for_power_state
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils     raise exception.PowerStateFailure(pstate=new_state)
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils ironic.common.exception.PowerStateFailure: Failed to set node power state to power on.
2023-12-19 10:24:49.989 384940 ERROR ironic.conductor.utils
2023-12-19 10:24:49.991 384940 DEBUG ironic.conductor.utils [None req-60791dc1-85c6-44c6-a3b7-a60e3242607d 4cb102fe974543b19c3e39916c7e61c1 00cfc520ffcb4533861e005a9a1e694f - - default default] Node a357fdab-d8d5-4260-8e72-b444fa7c5cb9 is not fast-track-able because it has an error: Failed to change power state to 'power on' by 'rebooting': Failed to set node power state to power on. is_fast_track /usr/lib/python3.9/site-packages/ironic/conductor/utils.py:1146


----------------------------------------------------------------
富士通株式会社
先端技術開発本部 ソフトウェア開発統括部 基盤ソフト開発部

野口 惇矢(Noguchi Junya)

外線:070-3052-0998 内線:72-6042-3844
mail:noguchi.junya@jp.fujitsu.com