Re: [glance][openstack-ansible] Snapshots disappear during saving
Hi Dmitriy,

thanks for your answer! Yes, we do use swift and its use as glance backend is intentional.

I got the following from the swift-proxy-server logs in the swift container on the infra host after taking a snapshot:

Mar 22 08:43:43 infra1-swift-proxy-container-27169fa7 proxy-server[87]: Client disconnected without sending last chunk (txn: txa7c64547baf0450eb0034-006058588b) (client_ip: 192.168.110.106)
Mar 22 08:43:43 infra1-swift-proxy-container-27169fa7 proxy-server[87]: 192.168.110.106 192.168.110.211 22/Mar/2021/08/43/43 PUT /v1/AUTH_024cc551782f41e395d3c9f13582ef7d/glance_images/3ec63ec2-aa3b-4c3b-a904-b55d1a6ec878-00001 HTTP/1.0 499 - python-swiftclient-3.10.1 gAAAAABgWFiKMz9R... 204800000 89 - txa7c64547baf0450eb0034-006058588b - 52.2991 - - 1616402571.623875856 1616402623.922997952 0

On the swift host, the logs of some services contain errors, e.g. the swift-container-updater service:

Mar 22 08:42:19 bc1bl12 systemd[1]: swift-container-updater.service: Main process exited, code=exited, status=1/FAILURE
Mar 22 08:42:19 bc1bl12 systemd[1]: swift-container-updater.service: Failed with result 'exit-code'.
Mar 22 08:42:21 bc1bl12 systemd[1]: swift-container-updater.service: Scheduled restart job, restart counter is at 162982.
Mar 22 08:42:21 bc1bl12 systemd[1]: Stopped swift-container-updater service.
Mar 22 08:42:21 bc1bl12 systemd[1]: Started swift-container-updater service.
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: Traceback (most recent call last):
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: File "/openstack/venvs/swift-22.1.0/lib/python3.8/site-packages/swift/common/utils.py", line 803, in config_fallocate_value
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: reserve_value = float(reserve_value[:-1])
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: ValueError: could not convert string to float: '1%'
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: During handling of the above exception, another exception occurred:
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: Traceback (most recent call last):
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: File "/openstack/venvs/swift-22.1.0/bin/swift-container-updater", line 23, in <module>
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: run_daemon(ContainerUpdater, conf_file, **options)
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: File "/openstack/venvs/swift-22.1.0/lib/python3.8/site-packages/swift/common/daemon.py", line 304, in run_daemon
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: utils.config_fallocate_value(conf.get('fallocate_reserve', '1%'))
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: File "/openstack/venvs/swift-22.1.0/lib/python3.8/site-packages/swift/common/utils.py", line 809, in config_fallocate_value
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: raise ValueError('Error: %s is an invalid value for fallocate'
Mar 22 08:42:22 bc1bl12 swift-container-updater[50699]: ValueError: Error: 1%% is an invalid value for fallocate_reserve.

The same exit code and traceback show up in the logs of the swift-container-auditor, swift-account-auditor and swift-account-reaper services. Does this tell you anything useful?

We didn't experience any problems when uploading files to containers, only when taking snapshots of instances.

Kind regards,
Oliver
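For readers following the traceback, here is a minimal Python sketch of the parsing step it points at. It is modelled only on the two lines visible in the log (`float(reserve_value[:-1])` and the `ValueError` raised afterwards) and is not Swift's actual implementation; the function name `parse_reserve` and the error wording are hypothetical.

```python
def parse_reserve(value):
    """Parse a reserve setting given as a percentage ('1%') or a plain integer.

    Returns a (number, is_percent) tuple. Hypothetical helper, not Swift code.
    """
    try:
        if value.endswith('%'):
            # Strip exactly one trailing '%' and parse the rest as a number.
            return float(value[:-1]), True
        return int(value), False
    except ValueError:
        # Raising inside the handler chains the two exceptions, which is why
        # the log shows "During handling of the above exception ...".
        raise ValueError('%s is an invalid value for the reserve setting' % value)


print(parse_reserve('1%'))
try:
    parse_reserve('1%%')
except ValueError as exc:
    print(exc)
```

With the value `1%%`, stripping a single `%` leaves `1%`, which `float()` rejects. That matches the first `ValueError` in the log, so the daemon apparently received the literal string `1%%`.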
------------------------------
Message: 4
Date: Thu, 18 Mar 2021 12:44:47 +0200
From: Dmitriy Rabotyagov <noonedeadpunk@ya.ru>
To: "openstack-discuss@lists.openstack.org" <openstack-discuss@lists.openstack.org>
Subject: Re: [glance][openstack-ansible] Snapshots disappear during saving
Message-ID: <374941616064157@mail.yandex.ru>
Content-Type: text/plain; charset=utf-8
Hi Oliver,
Am I right that you're also using OpenStack Swift and that storing images there is intentional? The issue is related to the upload process into Swift, so checking the Swift logs would be useful as well.
Well, looking into the fix I suggested, I'm not sure it's a valid one. The patches have really been a mess, and according to the log provided, the config needs just `1%` instead of the `1%%` you currently have. And it feels like that's what the default behaviour should already do with [1].

But I'm pretty sure that this error was making swift fail and thus causing weird issues while operating. So I'm not sure what specifically is wrong with the value of swift_fallocate_reserve - maybe we've missed some config to define it or it has been overridden somewhere, but it feels like the current default should cover the issue you see in swift...

[1] https://opendev.org/openstack/openstack-ansible-os_swift/src/branch/stable/v...

22.03.2021, 11:56, "Dmitriy Rabotyagov" <noonedeadpunk@ya.ru>:
Yes, 1%% is something we've been fighting with for years, as this setting changes on the swift side from time to time, and I've really lost track of which one is valid at the moment.
Here's the related Swift bug:
https://bugs.launchpad.net/swift/+bug/1844368
I've just pushed https://review.opendev.org/c/openstack/openstack-ansible-os_swift/+/782117 to cover this issue. Can you try applying this change manually to see if this works?
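One possible source of the `1%` versus `1%%` confusion, offered as an assumption rather than a confirmed diagnosis: Python's standard `configparser` treats `%%` as an escaped `%` only when interpolation is enabled. The sketch below uses a made-up `[demo]` section and a hypothetical helper `read_value`; it says nothing about which parser a given Swift release actually uses.

```python
import configparser

# Hypothetical config snippet with the doubled percent sign.
CONF = "[demo]\nfallocate_reserve = 1%%\n"


def read_value(parser_cls):
    """Read the option with the given parser class and return it as a string."""
    parser = parser_cls()
    parser.read_string(CONF)
    return parser.get('demo', 'fallocate_reserve')


# ConfigParser applies basic interpolation, so '%%' collapses to '%'.
interpolated = read_value(configparser.ConfigParser)
# RawConfigParser does no interpolation and returns the text unchanged.
raw = read_value(configparser.RawConfigParser)

print(repr(interpolated), repr(raw))
```

A file written for one behaviour breaks under the other: an interpolating parser raises `InterpolationSyntaxError` when asked for a bare `1%`, while a raw parser hands `1%%` through unchanged, which is the string the traceback complains about.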
22.03.2021, 11:25, "Oliver Wenz" <oliver.wenz@dhbw-mannheim.de>:
-- Kind Regards, Dmitriy Rabotyagov