On Tue, Oct 22, 2024 at 6:18 PM David Pineau <david.pineau@shadow.tech> wrote:

Hello,

Having no further answer for the past week, I'm raising awareness of
the thread just in case.
Does anyone have any input for us on the matter ?

Thanks,

On Wed, Oct 9, 2024 at 10:18 AM David Pineau <david.pineau@shadow.tech> wrote:
>
> Hello Rajat,
>
> On Tue, Oct 8, 2024 at 9:50 PM Rajat Dhasmana <rdhasman@redhat.com> wrote:
> >
> > Hi David,
> >
> >
> >
> > On Tue, Oct 8, 2024 at 9:28 PM David Pineau <david.pineau@shadow.tech> wrote:
> >>
> >> Hello Openstack and Cinder team,
> >>
> >> I'm currently working on the storage management layer of our company's
> >> Openstack platform. As we manage the storage ourselves, we may run
> >> background operations, which are akin to a maintenance, on individual
> >> volumes, or part of the infrastructure.
> >>
> >> As a bit of context, this infrastructure will be both exposed to
> >> external customers, as well as internal departments, and as such, we
> >> are looking for a refined user experience where an error is not a
> >> simple "an error occurred in service X", but something potentially
> >> actionable by whoever encounters it.
> >>
> >> At the moment in the Cinder documentation (and what I could find on
> >> the discuss archive), there seems to be no way for a Cinder Driver and
> >> its backing services to:
> >> - Tell Cinder that a specific Volume is undergoing a backend
> >> maintenance and should be considered unavailable
> >> - raise a relevant error to Cinder about ongoing Operations affecting
> >> a Volume within its backend, that cinder could properly react to
> >>
> >
> > Thanks for mentioning the use cases however the problem statement
> > is still not very clear to me with the limited information.
> > 1. Which backend storage are you using for Cinder and with which transport protocol?
>
> For a bit more context, we have an existing infrastructure that we'll
> switch over to openstack, as soon as we can. It is made up of a few
> hundred storage servers over a few datacenters.
>
> Given the limitations of cinder, which expects backends to be in a
> configuration file, and to avoid reloading the configuration for each
> addition/removal of a server, we chose to go towards the vendor-driver
> design approach, writing our own custom volume driver (not contributed
> upstream, as it's 100% custom). This will help us in the previous
> endeavor.
>
> As for the transport protocol, we currently use iSCSI, but are working
> on supporting NVMEoF; and both will use the available components of
> Cinder providing the relevant protocol's support.
>
> > 2. Why would we want to set a particular volume in maintenance state and
> > what would be the maintenance being performed on the volume/LUN?
> > I'm asking because we have not encountered a case (at least to my knowledge) where
> > a particular volume goes under maintenance instead of the whole backend.
>
> Our Hardware maintenance implies moving data out of the storage server
> before we hand it over to our Datacenter operator, for historical and
> practical matters (risk management, essentially). This means that for
> every maintenance, we might want to "migrate" a volume within the same
> cinder backend (but between actual hardware hosts), thus "behind the
> scenes" from cinder's POV. As these operations may make the volume
> unusable for a few moments, we wanted to check if there was a way to
> properly bubble-up this information to cinder and hoped there was, as
> many vendor drivers exist, that are probably able to do exactly so.
>
> >
> > We provide APIs to disable/freeze[1] a backend and also enable/thaw a backend[2] to
> > avoid resources being created on it but not on a per volume basis.
>
> I indeed saw this approach on various threads/issues, but then we'd
> make the whole backend unusable, while only one or two HW servers
> behind it are undergoing maintenance. This is not ideal in our book,
> as we strive to provide the best user experience we can, with the
> least interruptions possible.
>
> >
> > It would also be good to bring this topic to the upcoming Epoxy virtual PTG where we
> > can properly discuss this idea. You can add your topic in the Cinder PTG etherpad[3].
>
> We'd gladly bring our use-case and needs to the discussion if need be.
>
> >
> > [1] https://docs.openstack.org/api-ref/block-storage/v3/#freeze-a-cinder-backend-host
> > [2] https://docs.openstack.org/api-ref/block-storage/v3/#thaw-a-cinder-backend-host
> > [3] https://etherpad.opendev.org/p/epoxy-ptg-cinder
> >
> > Thanks
> > Rajat Dhasmana
> >
> >>
> >> As a first step, I wanted to check if you, the community, had ever
> >> considered this issue (or at least we consider it to be one). We'd be
> >> very happy if you had recipes or pieces of advice to share with us on
> >> how to handle this.
> >>
> >> If nothing is available or known, how could we help in bringing such
> >> an improvement to Cinder ?
> >>
> >> Kind regards,
> >>
> >> --
> >> David Pineau (joa)
> >>
>
> Thanks