I'm having a sense of deja vu!

Because of the way the mechanics work, failures in the iscsi deploy driver are unfortunately harder to troubleshoot and diagnose. This basically means we've not really been able to identify common failures and add logic to handle them appropriately, as we can with a TCP socket and a file download. Based on this alone, I think there is a solid case for us to seriously consider deprecation.

Overall, I'm +1 for the proposal, and I believe doing it over two cycles is the right way to go.

I suspect we're going to get a lot of pushback from the TripleO community, because there has been resistance to changing their default usage in the past. As such, I'm adding them to the subject so that they will hopefully at least be aware.

I guess my other worry is operators who already have a substantial operational infrastructure investment built around the iscsi deploy interface. I wonder why they didn't use direct, but maybe they have all migrated in the past ?5? years. This could just be a non-concern in reality, I'm just not sure.

Of course, if someone is willing to step up and make the iscsi deployment interface their primary focus, that also shifts the discussion to making direct the default interface?

-Julia

On Thu, Aug 20, 2020 at 1:57 AM Dmitry Tantsur <dtantsur@redhat.com> wrote:
Hi all,
Side note for those lacking context: this proposal concerns deprecating one of the ironic deploy interfaces detailed in https://docs.openstack.org/ironic/latest/admin/interfaces/deploy.html. It does not affect the boot-from-iSCSI feature.
I would like to propose deprecating and removing the 'iscsi' deploy interface over the course of the next 2 cycles. The reasons are:
1) The iSCSI deploy is a source of occasional cryptic bugs when a target cannot be discovered or mounted properly.
2) Its security is questionable: I don't think we even use authentication.
3) Operator confusion: right now we default to the iSCSI deploy but pretty much direct everyone who cares about scalability or security to the 'direct' deploy.
4) Cost of maintenance: our feature set is growing, our team - not so much. iscsi_deploy.py is 800 lines of code that can be removed, and some dependencies can be dropped as well.
As far as I can remember, we've kept the iSCSI deploy for two reasons:
1) The direct deploy used to require Glance with a Swift backend. The recently added [agent]image_download_source option allows caching and serving images via ironic's HTTP server, eliminating this problem. I guess we'll have to switch this option to 'http' by default to keep the out-of-box experience.
2) Memory footprint of the direct deploy. With raw image streaming we no longer have to cache the downloaded images in the agent's memory, removing this problem as well (I'm not even sure how much of a problem it is in 2020, even my phone has 4GiB of RAM).
If this proposal is accepted, I suggest executing it as follows:

Victoria release:
1) Put an early deprecation warning in the release notes.
2) Announce the future change of the default value for [agent]image_download_source.

W release:
3) Change [agent]image_download_source to 'http' by default.
4) Remove iscsi from the default enabled_deploy_interfaces and move it to the back of the supported list (effectively making direct deploy the default).

X release:
5) Remove the iscsi deploy code from both ironic and IPA.
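
To illustrate why step 4 effectively changes the default, here is a minimal sketch. It assumes the default deploy interface is the first entry in a driver's supported list that is also enabled, which is how the step reads. The function name and the two-item lists are hypothetical and are not ironic's actual code.

```python
def pick_default_deploy_interface(supported, enabled):
    """Return the first supported interface that is also enabled.

    Simplified, hypothetical model of default selection; returns None
    when no supported interface is enabled.
    """
    for name in supported:
        if name in enabled:
            return name
    return None


# Before step 4 (illustrative ordering): iscsi is first and enabled.
before = pick_default_deploy_interface(
    supported=["iscsi", "direct"],
    enabled={"iscsi", "direct"},
)

# After step 4: iscsi is moved to the back and not enabled by default.
after = pick_default_deploy_interface(
    supported=["direct", "iscsi"],
    enabled={"direct"},
)

# An operator who re-enables iscsi still gets direct as the default,
# because direct now comes first in the supported list.
after_reenabled = pick_default_deploy_interface(
    supported=["direct", "iscsi"],
    enabled={"direct", "iscsi"},
)

print(before, after, after_reenabled)
```

Under this model, operators who want to keep using iscsi during the deprecation period would need both to enable it and to select it explicitly, rather than rely on the default.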
Thoughts, opinions, suggestions?
Dmitry