[cinder] Help with Fedora 29 devstack volume/iscsi issues
Hello,
I'm trying to diagnose what has gone wrong with Fedora 29 in our gate devstack test; it seems there is a problem with the iscsi setup and consequently the volume-based tempest tests all fail. AFAICS we end up with nova hitting parsing errors inside os_brick's iscsi querying routines; so it seems whatever error path we've hit is outside the usual, as it's made it pretty far down the stack.
I have a rather haphazard bug report going on at
https://bugs.launchpad.net/os-brick/+bug/1814849
as I've tried to trace it down. At this point, it's exceeding the abilities of my cinder/nova/lvm/iscsi/how-this-all-hangs-together knowledge.
The final comment there has a link to the devstack logs and a few bits and pieces gleaned off the host (which I have on hold and can examine), which are hopefully useful to someone skilled in the art.
I'm hoping ultimately it's a rather simple case of a missing package or config option; I would greatly appreciate any input so we can get this test stable.
Thanks,
-i
On 07/02, Ian Wienand wrote:
Hello,
I'm trying to diagnose what has gone wrong with Fedora 29 in our gate devstack test; it seems there is a problem with the iscsi setup and consequently the volume based tempest tests all fail. AFAICS we end up with nova hitting parsing errors inside os_brick's iscsi querying routines; so it seems whatever error path we've hit is outside the usual as it's made it pretty far down the stack.
I have a rather haphazard bug report going on at
https://bugs.launchpad.net/os-brick/+bug/1814849
as I've tried to trace it down. At this point, it's exceeding the abilities of my cinder/nova/lvm/iscsi/how-this-all-hangs-together knowledge.
The final comment there has a link to the devstack logs and a few bits and pieces gleaned off the host (which I have on hold and can examine), which are hopefully useful to someone skilled in the art.
I'm hoping ultimately it's a rather simple case of a missing package or config option; I would greatly appreciate any input so we can get this test stable.
Thanks,
-i
Hi Ian,
Well, the system from the pastebin [1] doesn't look too good. The DB and LIO are out of sync: the database says that there must be 3 exports and maps available, yet you only see 1 in LIO.
It is weird that there are things missing from the logs:
In method _get_connection_devices we have:
    LOG.debug('Getting connected devices for (ips,iqns,luns)=%s',
              ips_iqns_luns)
    nodes = self._get_iscsi_nodes()
And we can see the message in the logs [2], but then we don't see the call to iscsiadm that happens as the first instruction in _get_iscsi_nodes:
    out, err = self._execute('iscsiadm', '-m', 'node', run_as_root=True,
                             root_helper=self._root_helper,
                             check_exit_code=False)
And we only see the error that comes from parsing the output of that command, which is itself not logged.
I believe Matthew is right in his assessment: the problem is the output from "iscsiadm -m node", where there is a missing space between the first 2 columns in the output [4].
This looks like an issue in Open iSCSI, not in OS-Brick, Cinder, or Nova.
And checking their code, it looks like this is the patch that fixes it [5], so it needs to be added to the F29 iscsi-initiator-utils package.
Cheers,
Gorka.
[1]: http://paste.openstack.org/show/744723/
[2]: http://logs.openstack.org/59/619259/2/check/devstack-platform-fedora-latest/...
[3]: https://bugs.launchpad.net/os-brick/+bug/1814849/comments/9
[4]: http://paste.openstack.org/show/744724/
[5]: https://github.com/open-iscsi/open-iscsi/commit/baa0cb45cfcf10a81283c191b0b2...
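To make the failure mode concrete, here is a minimal sketch of how a whitespace-splitting parser reacts when the space between the first two columns goes missing. It is not the os-brick implementation: the function name parse_iscsi_nodes and both sample lines are made up for illustration (the real output is in the paste [4]), and the exact shape of the malformed line is an assumption.

```python
def parse_iscsi_nodes(output):
    """Parse `iscsiadm -m node`-style text into (portal, iqn) tuples.

    Hypothetical helper for illustration only; it assumes each line is
    "<ip>:<port>,<tpgt> <iqn>" with whitespace between the two columns.
    """
    nodes = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split()
        if len(fields) < 2:
            # With the separating space missing, the whole line is one field.
            raise ValueError("unexpected node line: %r" % line)
        portal = fields[0].split(',')[0]  # drop the ",<tpgt>" suffix
        nodes.append((portal, fields[1]))
    return nodes


# Illustrative sample lines (assumptions, not copied from the logs).
GOOD_OUTPUT = "192.0.2.10:3260,1 iqn.2010-10.org.openstack:volume-example\n"
BAD_OUTPUT = "192.0.2.10:3260,1iqn.2010-10.org.openstack:volume-example\n"

if __name__ == "__main__":
    print(parse_iscsi_nodes(GOOD_OUTPUT))
    try:
        parse_iscsi_nodes(BAD_OUTPUT)
    except ValueError as exc:
        print("parse failed:", exc)
```

A parser that indexes the second field without the length check would raise an IndexError on the malformed line instead, which is consistent with a parsing error surfacing far down the stack rather than at the iscsiadm call.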
On Mon, Feb 11, 2019 at 11:12:29AM +0100, Gorka Eguileor wrote:
It is weird that there are things missing from the logs:
In method _get_connection_devices we have:
    LOG.debug('Getting connected devices for (ips,iqns,luns)=%s',
              ips_iqns_luns)
    nodes = self._get_iscsi_nodes()
And we can see the message in the logs [2], but then we don't see the call to iscsiadm that happens as the first instruction in _get_iscsi_nodes:
out, err = self._execute('iscsiadm', '-m', 'node', run_as_root=True, root_helper=self._root_helper, check_exit_code=False)
And we only see the error coming from parsing the output of that command that is not logged.
Yes, I wonder if this is related to a rootwrap stdout/stderr capturing or something?
I believe Matthew is right in his assessment, the problem is the output from "iscsiadm -m node", there is a missing space between the first 2 columns in the output [4].
This looks like an issue in Open iSCSI, not in OS-Brick, Cinder, or Nova.
And checking their code, it looks like this is the patch that fixes it [5], so it needs to be added to F29 iscsi-initiator-utils package.
Thank you! This excellent detective work has solved the problem. I did a copr build with that patch [1] and got a good tempest run [2]. Amazing how much trouble a " " can cause.
I have filed an upstream bug on the package
https://bugzilla.redhat.com/show_bug.cgi?id=1676365
Anyway, it has led to a series of patches you may be interested in, which I think would help future debugging efforts:
https://review.openstack.org/636078 : fix for quoting of devstack args (important for follow-ons)
https://review.openstack.org/636079 : export all journal logs. Things like iscsid were logging to the journal, but we weren't capturing them. Includes instructions on how to use the exported journal [3]
https://review.openstack.org/636080 : add a tcpdump service. With this you can easily packet capture during a devstack run. e.g. https://review.openstack.org/636082 captures all iscsi traffic and stores it [4]
https://review.openstack.org/636081 : iscsid debug option, which uses a systemd override to turn up debug logging.
Reviews welcome :)
Thanks,
-i
[1] https://github.com/open-iscsi/open-iscsi/commit/baa0cb45cfcf10a81283c191b0b2...
[2] http://logs.openstack.org/82/636082/9/check/devstack-platform-fedora-latest/...
[3] http://logs.openstack.org/82/636082/9/check/devstack-platform-fedora-latest/...
[4] http://logs.openstack.org/82/636082/9/check/devstack-platform-fedora-latest/...