[Openstack-operators] cinder/nova issues
Sean McGinnis
sean.mcginnis at gmx.com
Wed Aug 23 18:04:46 UTC 2017
Hey Adam,
There have been some updates since Liberty to improve handling in the os-brick
library, which manages the local device management on the compute host. But
with this output showing the paths down, I wonder if there's something else
going on between the NetApp box and the Nova compute host.
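
In the meantime, it might be worth checking whether the compute host can even
see the resized LUN when you rescan the paths by hand; that is roughly what
os-brick does for you on newer releases. A rough sketch, reusing the device
names and WWID from your output below (adjust for your environment):

    echo 1 > /sys/block/sdy/device/rescan     # repeat for sdz, sdx, sdaa
    multipathd -k"resize map 360a98000417643556a2b496d58665473"
    multipath -ll 360a98000417643556a2b496d58665473

If the paths stay failed even after a manual rescan, that points more at the
array side than at nova/os-brick.
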
Could you file a bug to track this? I think you could just copy and paste the
content of your original email since it captures a lot of great info.
https://bugs.launchpad.net/cinder/+filebug
We can tag it with netapp so maybe it will get some attention there.
Thanks,
Sean
On Wed, Aug 23, 2017 at 01:01:24PM -0400, Adam Dibiase wrote:
> Greetings,
>
> I am having an issue with nova starting an instance that uses a root
> volume that cinder has extended. More specifically, a volume that has been
> extended past the max resize limit of our NetApp filer. I am running
> Liberty and upgraded the cinder packages from 7.0.0 to 7.0.3 to take
> advantage of this functionality. From what I can gather, the driver uses
> sub-LUN cloning to get past the hard limit NetApp sets when extending past
> 64G (starting from a 4G volume).
>
> *Environment*:
>
> - Release: Liberty
> - Filer: NetApp
> - Protocol: Fibre Channel
> - Multipath: yes
>
>
>
> *Steps to reproduce:*
>
> - Create a new instance
> - Stop the instance
> - Extend the volume by running the following commands (full sequence shown
> below):
>    - cinder reset-state --state available (volume-ID or name)
>    - cinder extend (volume-ID or name) 100
>    - cinder reset-state --state in-use (volume-ID or name)
> - Start the instance with either nova start or nova reboot --hard (same
> result either way)
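>
> For reference, the full sequence I run looks like this (the volume and
> instance names here are just placeholders):
>
>     cinder reset-state --state available my-root-volume
>     cinder extend my-root-volume 100
>     cinder reset-state --state in-use my-root-volume
>     nova start my-instance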
>
>
> I can see that the instance's multipath status is good before the resize...
>
> *360a98000417643556a2b496d58665473 dm-17 NETAPP ,LUN *
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=-1 status=active
> | |- 6:0:1:5 sdy  65:128 active undef running
> | `- 7:0:0:5 sdz  65:144 active undef running
> `-+- policy='round-robin 0' prio=-1 status=enabled
>   |- 6:0:0:5 sdx  65:112 active undef running
>   `- 7:0:1:5 sdaa 65:160 active undef running
>
>
> Once the volume is resized, the LUN goes to a failed state and does not
> show the new size:
>
>
> *360a98000417643556a2b496d58665473 dm-17 NETAPP ,LUN *
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=-1 status=enabled
> | |- 6:0:1:5 sdy  65:128 failed undef running
> | `- 7:0:0:5 sdz  65:144 failed undef running
> `-+- policy='round-robin 0' prio=-1 status=enabled
>   |- 6:0:0:5 sdx  65:112 failed undef running
>   `- 7:0:1:5 sdaa 65:160 failed undef running
>
>
> Like I said, this only happens on volumes that have been extended past 64G.
> Smaller sizes do not have this issue. I can only assume that the original
> LUN is getting destroyed after the clone process and that is the cause of
> the failed state. Why is the compute node not picking up the new LUN and
> attaching it? Is there something I am missing?
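>
> One thing I have not verified yet (just a guess on my part) is whether the
> WWID of the LUN changes after the sub-LUN clone. If it does, the old dm-17
> map would still be pointing at a device that no longer exists, which would
> explain the failed paths. Something like this on the compute node should
> show it, reusing the sdX names from the output above (the scsi_id path may
> be /usr/lib/udev/scsi_id depending on the distro):
>
>     /lib/udev/scsi_id -g -u /dev/sdy
>     multipath -ll | grep NETAPP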
>
> Thanks in advance,
>
> Adam
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators