[Openstack-operators] cinder/nova issues

Sean McGinnis sean.mcginnis at gmx.com
Wed Aug 23 18:04:46 UTC 2017


Hey Adam,

There have been some updates since Liberty to improve handling in the os-brick
library, which manages the local device side of volume attachments. But with
the output showing the paths down, I wonder if there's something else going on
between the NetApp box and the Nova compute host.
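
For reference, the generic Linux steps for getting a compute host to pick up a
resized FC LUN look roughly like this (device names and WWID are taken from your
multipath output below; this is just a sketch of the manual procedure, not the
exact calls os-brick makes):

    # rescan each SCSI path device so the kernel re-reads the LUN capacity
    echo 1 > /sys/block/sdy/device/rescan
    echo 1 > /sys/block/sdz/device/rescan
    echo 1 > /sys/block/sdx/device/rescan
    echo 1 > /sys/block/sdaa/device/rescan
    # then ask multipathd to resize the multipath map
    multipathd -k"resize map 360a98000417643556a2b496d58665473"

If the paths really are down, though, rescanning the existing devices won't bring
them back, which is part of why I suspect something on the array or fabric side.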

Could you file a bug to track this? I think you could just copy and paste the
content of your original email since it captures a lot of great info.

https://bugs.launchpad.net/cinder/+filebug

We can tag it with netapp so maybe it will get some attention there.

Thanks,
Sean

On Wed, Aug 23, 2017 at 01:01:24PM -0400, Adam Dibiase wrote:
> Greetings,
> 
> I am having an issue with nova starting an instance that is using a root
> volume that cinder has extended. More specifically, a volume that has been
> extended past the max resize limit of our NetApp filer. I am running
> Liberty and upgraded the cinder packages from 7.0.0 to 7.0.3 to take advantage
> of this functionality. From what I can gather, it uses sub-LUN cloning to
> get past the hard limit NetApp imposes when resizing past 64G (starting from
> a 4G volume).
> 
> *Environment*:
> 
>    - Release: Liberty
>    - Filer:       NetApp
>    - Protocol:  Fibre Channel
>    - Multipath: yes
> 
> 
> 
> *Steps to reproduce: *
> 
>    - Create new instance
>    - stop instance
>    - extend the volume by running the following commands:
>       - cinder reset-state --state available (volume-ID or name)
>       - cinder extend (volume-ID or name) 100
>       - cinder reset-state --state in-use (volume-ID or name)
>    - start the instance with either nova start or nova reboot --hard (same
>    result either way; full sequence spelled out below)
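> 
> (For reference, the full sequence with a placeholder volume ID -- the 100 is
> the new size in GB -- is:
> 
>     cinder reset-state --state available <volume-id>
>     cinder extend <volume-id> 100
>     cinder reset-state --state in-use <volume-id>
>     nova start <instance-id>   # or: nova reboot --hard <instance-id>
> )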
> 
> 
> I can see that the instance's multipath status is good before the resize...
> 
> 360a98000417643556a2b496d58665473 dm-17 NETAPP  ,LUN
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=-1 status=active
> | |- 6:0:1:5 sdy   65:128 active undef  running
> | `- 7:0:0:5 sdz   65:144 active undef  running
> `-+- policy='round-robin 0' prio=-1 status=enabled
>   |- 6:0:0:5 sdx   65:112 active undef  running
>   `- 7:0:1:5 sdaa  65:160 active undef  running
> 
> 
> Once the volume is resized, the LUN goes to a failed state and it does not
> show the new size:
> 
> 
> 360a98000417643556a2b496d58665473 dm-17 NETAPP  ,LUN
> size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
> |-+- policy='round-robin 0' prio=-1 status=enabled
> | |- 6:0:1:5 sdy   65:128 failed undef  running
> | `- 7:0:0:5 sdz   65:144 failed undef  running
> `-+- policy='round-robin 0' prio=-1 status=enabled
>   |- 6:0:0:5 sdx   65:112 failed undef  running
>   `- 7:0:1:5 sdaa  65:160 failed undef  running
> 
> 
> Like I said, this only happens on volumes that have been extended past 64G.
> Smaller sizes do not have this issue. I can only assume that the original
> LUN is getting destroyed after the clone process and that is the cause of the
> failed state. Why is it not picking up the new one and attaching it to the
> compute node? Is there something I am missing?
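> 
> (To be explicit about what I'd check from the compute node -- path names and
> WWID are taken from the output above; I haven't included this output here:
> 
>     # do the existing paths still report the old 20G geometry?
>     lsblk -o NAME,SIZE /dev/sdx /dev/sdy /dev/sdz /dev/sdaa
>     # force a LUN rescan on both FC hosts to look for a new device
>     echo "- - -" > /sys/class/scsi_host/host6/scan
>     echo "- - -" > /sys/class/scsi_host/host7/scan
>     multipath -ll 360a98000417643556a2b496d58665473
> )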
> 
> Thanks in advance,
> 
> Adam
