[openstack-dev] [nova][cinder] Austin summit nova/cinder cross-project session recap

Ildikó Váncsa ildiko.vancsa at ericsson.com
Thu May 5 19:11:27 UTC 2016

Hi All,

First of all, thanks to Matt for the detailed summary of last week.

As a continuation, I would like to add the meeting minutes I captured today on the Hangout referred to below.

As an overall summary, the most important step today was to step back a little from multi-attach and examine the Cinder-Nova interaction before moving forward with any additions. Our philosophy here was to simplify what we can and move responsibilities back to where they belong.

We started with a quick recap by Scott D'Angelo on what we are trying to solve here. The main items were:
* Multi-attach
  * detach issue: when to disconnect volume in Nova/remove target in Cinder
  * too many state checks on Nova side
* Live migration issue
  * calling initialize_connection for the destination host and then for the source host again
  * some Cinder drivers export new target, which is an existing issue

John Garbutt asked about the detach case, specifically whether 'os-brick' can decide when to disconnect a volume. That module would need more information to be able to make the decision. One of Walter Boring's main concerns was that 'os-brick' is currently a simple, stateless module and the intention is to keep it that way.
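
To illustrate why the disconnect decision needs state, here is a minimal Python sketch. It assumes a hypothetical per-host bookkeeping table of which instances use a volume. The class and method names are made up for illustration and are not part of os-brick, Nova, or Cinder.

```python
class HostAttachmentTracker:
    """Hypothetical bookkeeping of which instances use a volume on a host."""

    def __init__(self):
        # Maps (host, volume_id) -> set of instance ids using that volume.
        self._users = {}

    def attach(self, host, volume_id, instance_id):
        self._users.setdefault((host, volume_id), set()).add(instance_id)

    def detach(self, host, volume_id, instance_id):
        """Return True only when the host connection should be torn down."""
        key = (host, volume_id)
        users = self._users.get(key)
        if users is None:
            # Nothing is tracked for this volume on this host.
            return False
        users.discard(instance_id)
        if users:
            # Another instance on the same host still uses the volume.
            return False
        del self._users[key]
        return True
```

With two instances attached to the same volume on one host, the first detach() returns False and only the second returns True. A stateless module cannot make this decision without being handed the remaining attachments by its caller.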

After this discussion John Griffith described what he is already working on. The solution concentrates on 'initialize_connection'. The main changes in the plan are to create the attachment at initialize time and to save information such as the 'host' and 'connector_info' on the Cinder side. 'initialize_connection' will also be extended with logic to check whether the attachment already exists. If it does, it will not create a new one and will also prevent the drivers from doing the aforementioned extra export. This solution can possibly solve the live migration issue as well.

John Griffith will work on the solution described above; the target is to have patches up by next week.
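
The idea can be sketched roughly as follows in Python. This is not Cinder code: the class, the in-memory dictionary, and the 'export_volume' callable standing in for a driver export are all assumptions for illustration. Keying an attachment by volume and host is also a simplification; the real patches may key it differently.

```python
import uuid


class AttachmentStore:
    """Hypothetical stand-in for attachment records kept on the Cinder side."""

    def __init__(self, export_volume):
        # export_volume is an assumed callable representing the driver export.
        self._export_volume = export_volume
        self._attachments = {}  # (volume_id, host) -> attachment record

    def initialize_connection(self, volume_id, connector):
        host = connector["host"]
        key = (volume_id, host)
        existing = self._attachments.get(key)
        if existing is not None:
            # Attachment already exists: reuse it, no second export.
            return existing
        attachment = {
            "id": str(uuid.uuid4()),
            "volume_id": volume_id,
            "host": host,
            "connector_info": dict(connector),
            "connection_info": self._export_volume(volume_id, connector),
        }
        self._attachments[key] = attachment
        return attachment
```

Calling initialize_connection() twice with the same volume and connector returns the same record and triggers only one export. That is the property that would help when the call is repeated for the source host during live migration.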

The next topic, raised by Walter Boring, was removing 'check_attach' from Nova. Ildiko Vancsa raised the point that 'check_attach' looks like a legacy check in Nova that was not removed when the checks were added to 'reserve_volume' in Cinder. As Nova calls 'reserve_volume' after 'check_attach', removing this extra check should be safe. Walter is already working on the changes; the patch is targeted for this week. This should simplify the interaction between the two modules and keep Cinder as the ultimate source of truth.

Walter has also started working on a fix for the live migration issue by removing the second 'initialize_connection' call from the flow. The WIP patch [1] is already up and needs review attention to see whether it's the best way of solving this issue.

We briefly discussed the hypervisor layer during the meeting as an already known step for multi-attach. For libvirt we need to be able to set the 'shareable' flag for the guest instance in its XML file (to disable caching), therefore Nova still needs to know whether a volume is multi-attach enabled or not. The volume info already contains this information, which Nova can use for this check. Because of this hypervisor-level requirement, the other virt drivers will not support multi-attach, at least for now, which means an additional check in Nova before attaching a multi-attach volume.
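
As a rough illustration of the libvirt part, the Python sketch below builds a disk element and adds the 'shareable' element only for multi-attach volumes. The function name, its parameters, and the chosen driver attributes are assumptions for illustration, and this is not how the Nova libvirt driver builds its configuration.

```python
import xml.etree.ElementTree as ET


def build_disk_xml(source_dev, target_dev, multiattach):
    """Return a libvirt-style <disk> element as a string (illustrative only)."""
    disk = ET.Element("disk", {"type": "block", "device": "disk"})
    # Host-side caching is turned off for the shared device.
    ET.SubElement(disk, "driver",
                  {"name": "qemu", "type": "raw", "cache": "none"})
    ET.SubElement(disk, "source", {"dev": source_dev})
    ET.SubElement(disk, "target", {"dev": target_dev, "bus": "virtio"})
    if multiattach:
        # Marks the device as shared between guests.
        ET.SubElement(disk, "shareable")
    return ET.tostring(disk, encoding="unicode")
```

The 'multiattach' argument would come from the volume info mentioned above, which is why Nova cannot be fully unaware of multi-attach even if most of the logic moves to Cinder.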

We also talked about missing tests. We identified Cinder migrate as a major item here. Scott D'Angelo will work on this as Cinder already has a test patch up for review for a specific case of this workflow. This test will trigger Nova swap volume as well, which is also important in order to be able to safely introduce multi-attach. Ildiko Vancsa will work on further test cases in Tempest for multi-attach and look into fixing the already existing one that Matt Riedemann uploaded during the Mitaka cycle.

Ildiko will also work on the multi-attach spec [2] for Nova and upload a new version, adapted to the outcome of today's meeting, for review this week.

Walter brought up the extend volume use case during the meeting. Matt stated that this item should be presented and discussed at one of the Nova meetings, so we did not spend time on it today.

We also agreed that we will continue with the weekly IRC meetings, which will serve only as the forum for tracking what we agreed on today and for discussing multi-attach.

Please feel free to add more items or correct me if I'm mistaken on any of the above points.

Thanks and Best Regards,

[1] https://review.openstack.org/#/c/312773/ 
[2] https://review.openstack.org/#/c/304681/ 

> -----Original Message-----
> From: Matt Riedemann [mailto:mriedem at linux.vnet.ibm.com]
> Sent: May 05, 2016 02:55
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: [openstack-dev] [nova][cinder] Austin summit nova/cinder cross-project session recap
> On Thursday morning the Nova and Cinder teams got together for a
> cross-project design summit session. The full etherpad is here [1].
> This was all about volume multi-attach.
> A subset of people from both teams were actually meeting weekly for four
> weeks leading up to the summit to hash out some details which we hoped
> to resolve at the summit. Unfortunately that didn't happen.
> The first thing we wanted to settle was why do we actually want/need to
> support volume multi-attach? Because "Cinder added it to their API in
> Juno so Nova needs to support it" isn't a good reason. There are a few
> drivers for this feature:
> * The need for active/active and active/hot standby scenarios which
> can't accept downtime due to attaching a new volume.
> * Some database clusters, like Oracle RAC, require shared volumes. So
> Trove is a stakeholder for this feature also.
> * Other legacy application use cases were brought up that essentially
> mean this is something they need to bridge a gap to adopting OpenStack.
> So we agreed that while this is not really something we necessarily like
> (because of the non-cloud legacy application nature of it), it is
> necessary so we're going to continue trying to make it happen.
> We then quickly went over what was completed in Mitaka and explained the
> detach issue we ran into. The problem is when you have more than one
> volume attached to the same instance on the same host, when you detach
> one of them, both of the volume connections actually get terminated on
> the host.
> This problem is also complicated by the fact that some Cinder backends
> will create one attachment per export/volume, whereas others will
> multiplex all volumes onto one attachment.
> Coming into the session we really had two competing solutions from the
> Cinder team, one from Walter Boring and one from John Griffith. However,
> during the session another idea was brought up from Dan Smith. The full
> details are in the etherpad, but it's really an idea to abstract the
> multiple volume attachments on the Cinder side so that Nova only sees a
> single volume, so Nova wouldn't have to change any of its API handling
> for Cinder volumes to check whether they support multiattach or not,
> and thus wouldn't need conditional logic spread all over Nova (API,
> compute, and virt drivers).
> With Dan's idea we'd still have the disconnect/detach problem where Nova
> would need to check if it can disconnect from the host if there is only
> a single attachment left, but Nova has to do that regardless for all of
> the proposed solutions.
> It sounded like John Griffith had a similar idea before when looking at
> this problem, and we spent a fair amount of time discussing it on both
> sides during the session.
> At the end of the session, we (Nova) came away with the following next
> steps:
> 1. John Griffith (Cinder team) would work on a proof of concept for the
> abstracted volume idea.
> 2. Cinder would work on adding a volume migration test to Tempest to be
> run in the multi-node gate job (this needs to happen regardless). Scott
> D'Angelo is going to work on this.
> 3. Ildikó Váncsa was going to work on the Nova volume disconnect ref
> counting logic.
> 4. We'd meet on Friday during the Nova meetup session to discuss more
> details.
> --
> So what happened then was we met on Friday and found out that we were
> all speaking different languages on Thursday, because the Cinder team
> didn't think that they were going to be going with this new abstracted
> volume idea. After much wailing and gnashing of teeth we agreed to do a
> hangout shortly after the summit to try and get back on the same page.
> So we're doing that tomorrow (5/5) at 1600 UTC. This is going to be at
> least myself, Ildikó, Scott and Walter on the call. Walter has been
> creating diagrams of the flows through Nova and Cinder for various
> interactions like attach/detach of volumes and volume-backed live
> migration so that we can try to step back and see where the proposed
> changes fit in.
> It's sounding like regardless of solution there will be changes to the
> Cinder API (at least for os-initialize_connection). There might be some
> new APIs too, for example, an API to get connection_info during live
> migration without Nova having to call os-initialize_connection to get it.
> So we'll see what happens. We'll eventually figure it out. After all, it's
> only code, right? :)
> [1] https://etherpad.openstack.org/p/newton-nova-cinder
> --
> Thanks,
> Matt Riedemann
