[Openstack-operators] cinder/nova issues
Adam Dibiase
adibiase at digiumcloud.com
Wed Aug 23 17:01:24 UTC 2017
Greetings,
I am having an issue with nova starting an instance whose root volume
cinder has extended. More specifically, a volume that has been extended
past the max resize limit of our NetApp filer. I am running Liberty and
upgraded the cinder packages from 7.0.0 to 7.0.3 to take advantage of
this functionality. From what I can gather, the driver uses sub-LUN
cloning to get past the hard limit NetApp imposes when extending past
64G (starting from a 4G volume).
*Environment*:
- Release: Liberty
- Filer: NetApp
- Protocol: Fibre Channel
- Multipath: yes
*Steps to reproduce:*
- Create a new instance
- Stop the instance
- Extend the volume by running the following commands:
  - cinder reset-state --state available (volume-ID or name)
  - cinder extend (volume-ID or name) 100
  - cinder reset-state --state in-use (volume-ID or name)
- Start the instance with either nova start or nova reboot --hard; the
result is the same either way (full sequence below)
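For reference, here is the whole sequence in one place (the instance and
volume IDs are just placeholders):

  nova stop <instance-ID>
  cinder reset-state --state available <volume-ID>
  cinder extend <volume-ID> 100
  cinder reset-state --state in-use <volume-ID>
  nova start <instance-ID>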
I can see that the instance's multipath status is good before the resize
(multipath -ll on the compute node):
*360a98000417643556a2b496d58665473 dm-17 NETAPP ,LUN *
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=-1 status=active
| |- 6:0:1:5 sdy 65:128 active undef running
| `- 7:0:0:5 sdz 65:144 active undef running
`-+- policy='round-robin 0' prio=-1 status=enabled
|- 6:0:0:5 sdx 65:112 active undef running
`- 7:0:1:5 sdaa 65:160 active undef running
Once the volume is resized, the LUN's paths go to a failed state and the
new size is not reflected:
*360a98000417643556a2b496d58665473 dm-17 NETAPP ,LUN *
size=20G features='1 queue_if_no_path' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=-1 status=enabled
| |- 6:0:1:5 sdy 65:128 failed undef running
| `- 7:0:0:5 sdz 65:144 failed undef running
`-+- policy='round-robin 0' prio=-1 status=enabled
|- 6:0:0:5 sdx 65:112 failed undef running
`- 7:0:1:5 sdaa 65:160 failed undef running
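If it helps with debugging, my assumption is that the stale paths can be
confirmed by checking which WWID each path device still reports (a sketch
only; device names are taken from the output above):

  # print the WWID the path device currently maps to
  /lib/udev/scsi_id --whitelisted --device=/dev/sdy
  # compare against the dm map name above (360a98000417643556a2b496d58665473)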
Like I said, this only happens on volumes that have been extended past
64G; smaller sizes do not have this issue. I can only assume that the
original LUN is destroyed after the clone process and that this is the
cause of the failed state. Why is the new LUN not being picked up and
attached to the compute node? Is there something I am missing?
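Related question: should something along these lines be happening
automatically on the compute node during the extend, or is a manual
rescan expected? (A sketch only; the host numbers, device name, and map
name come from the multipath output above.)

  # rescan the SCSI hosts so a newly mapped LUN shows up
  echo "- - -" > /sys/class/scsi_host/host6/scan
  echo "- - -" > /sys/class/scsi_host/host7/scan
  # pick up a size change on an existing path device
  echo 1 > /sys/block/sdy/device/rescan
  # then ask multipathd to resize the map to match
  multipathd -k"resize map 360a98000417643556a2b496d58665473"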
Thanks in advance,
Adam