Hi Tobias,
I saw your message; that's an interesting way to get around the transient mdev issue.
Have you looked into using Cyborg to alleviate this? We are currently assessing it for a different project using NVIDIA A40s.
I'd be keen to swap war stories and see if we can come up with a better solution than the current vGPU mdev support.
Kind Regards,
Karl.
--
Karl Kloppenborg, Systems Engineering (BCompSc, CNCF-[KCNA, CKA, CKAD], LFCE,
CompTIA Linux+ XK0-004)
Real World Technology Solutions - IT People you can trust
Voice | Data | IT Procurement | Managed IT
rwts.com.au | 1300 798 718
Real World is a DellEMC Gold Partner
From:
openstack-discuss-request@lists.openstack.org <openstack-discuss-request@lists.openstack.org>
Date: Tuesday, 17 January 2023 at 9:06 pm
To: openstack-discuss@lists.openstack.org <openstack-discuss@lists.openstack.org>
Subject: openstack-discuss Digest, Vol 51, Issue 51
Today's Topics:
1. Re: Enable fstrim automatically on cinder thin lvm
provisioning (Rajat Dhasmana)
2. Re: Experience with VGPUs (Tobias Urdin)
3. Re: [designate] Proposal to deprecate the agent framework and
agent based backends (Thomas Goirand)
4. Re: Experience with VGPUs (Sylvain Bauza)
----------------------------------------------------------------------
Message: 1
Date: Tue, 17 Jan 2023 10:11:27 +0530
From: Rajat Dhasmana <rdhasman@redhat.com>
To: A Monster <amonster369@gmail.com>
Cc: openstack-discuss <openstack-discuss@lists.openstack.org>
Subject: Re: Enable fstrim automatically on cinder thin lvm
provisioning
Message-ID:
<CAARK8KQ3rM9KSQ9vFP+CAVPL_xAksxfXvFuvZj8gd7FeFT6LTw@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Hi,
We have a config option, 'report_discard_supported' [1], which can be added to
cinder.conf to enable trim/unmap support.
Also, I would suggest not creating new openstack-discuss threads for
the same issue and reusing the first one instead.
As far as I can see, there are three threads for the same issue [2][3][4].
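
As a rough illustration, the option can be set on each enabled backend section. The sketch below uses Python's configparser on a sample file; the backend name "lvm-1" and the sample contents are assumptions and do not come from the original deployment. With kolla-ansible, a change like this would typically be applied through a config override rather than by editing the rendered file.

```python
import configparser
import io

# Hypothetical minimal cinder.conf; the backend name "lvm-1" is an assumption.
SAMPLE_CINDER_CONF = """\
[DEFAULT]
enabled_backends = lvm-1

[lvm-1]
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_group = cinder-volumes
lvm_type = thin
"""


def enable_discard(conf_text: str) -> str:
    """Return conf_text with report_discard_supported set on each enabled backend."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    backends = parser.get("DEFAULT", "enabled_backends", fallback="")
    for name in (b.strip() for b in backends.split(",")):
        # Only touch sections that actually exist in the file.
        if name and parser.has_section(name):
            parser.set(name, "report_discard_supported", "true")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(enable_discard(SAMPLE_CINDER_CONF))
```

The option only reports discard support to clients; the guest still has to issue the trim itself (for example via fstrim or a discard mount option).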
[1]
https://docs.openstack.org/cinder/latest/configuration/block-storage/config-options.html
[2]
https://lists.openstack.org/pipermail/openstack-discuss/2023-January/031789.html
[3]
https://lists.openstack.org/pipermail/openstack-discuss/2023-January/031797.html
[4]
https://lists.openstack.org/pipermail/openstack-discuss/2023-January/031805.html
Thanks
Rajat Dhasmana
On Tue, Jan 17, 2023 at 8:57 AM A Monster <amonster369@gmail.com> wrote:
> I deployed OpenStack using kolla-ansible with LVM as the storage backend
> for my Cinder service. However, I noticed that the LVM thin pool usage keeps
> increasing even though the space used by the instances' volumes stays the same.
> After a bit of investigation I found that data deleted inside the logical
> volumes was still allocated from the thin pool's perspective, and that I had
> to run fstrim on those volumes.
>
> How can I enable this automatically in OpenStack?
>
------------------------------
Message: 2
Date: Tue, 17 Jan 2023 08:54:03 +0000
From: Tobias Urdin <tobias.urdin@binero.com>
To: openstack-discuss <openstack-discuss@lists.openstack.org>
Subject: Re: Experience with VGPUs
Message-ID: <220CE3FB-C139-492E-ADD1-BC1ECBEAE65E@binero.com>
Content-Type: text/plain; charset="utf-8"
Hello,
We are using vGPUs with Nova on the OpenStack Xena release and we've had a fairly good experience integrating
NVIDIA A10 GPUs into our cloud.
As we see it, there are some pain points that just come with maintaining the GPU feature.
- There is a very tight coupling between the NVIDIA driver in the guest (instance) and on the compute node that needs to
be managed.
- Doing maintenance needs more planning, i.e. powering off instances; the NVIDIA driver on the compute node needs to be
rebuilt on the hypervisor if the kernel is upgraded, unless you've implemented DKMS for that.
- Because we have different flavors of GPU (we split the A10 cards into different flavors for maximum utilization of
other compute resources), we added custom traits in the Placement service to handle that. We manage them with
a script, since you quickly get confused doing anything GPU-related manually. [1]
- Since Nova does not handle recreation of mdevs (or use the new libvirt autostart feature for mdevs), we have
a systemd unit that runs before the nova-compute service, walks all the libvirt domains and does lookups
in Placement to recreate the mdevs before nova-compute starts. [2] [3] [4]
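
To give a feel for the last point, here is a minimal sketch of the "walk the domains" step. It is not the script from the pastes below. It only extracts the mdev UUIDs that a libvirt domain definition references and builds the sysfs path where a UUID would be written to recreate the device. The sample XML, the PCI address and the mdev type name are hypothetical; in the setup described above the parent device and type come from Placement lookups, which are left out here.

```python
import xml.etree.ElementTree as ET
from pathlib import PurePosixPath
from typing import List

# Example domain XML in the shape libvirt uses for mediated devices (made-up values).
SAMPLE_DOMAIN_XML = """\
<domain type='kvm'>
  <name>instance-00000001</name>
  <devices>
    <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
      <source>
        <address uuid='c2177883-f1bb-47f0-914d-32a22e3a8804'/>
      </source>
    </hostdev>
  </devices>
</domain>
"""


def mdev_uuids(domain_xml: str) -> List[str]:
    """Return the mdev UUIDs referenced by one libvirt domain definition."""
    root = ET.fromstring(domain_xml)
    uuids = []
    for hostdev in root.findall("./devices/hostdev[@type='mdev']"):
        address = hostdev.find("./source/address")
        if address is not None and address.get("uuid"):
            uuids.append(address.get("uuid"))
    return uuids


def mdev_create_path(parent_pci: str, mdev_type: str) -> PurePosixPath:
    """Build the sysfs file that creates an mdev when a UUID is written to it."""
    return (PurePosixPath("/sys/class/mdev_bus") / parent_pci
            / "mdev_supported_types" / mdev_type / "create")


if __name__ == "__main__":
    # Hypothetical parent device and type; nothing is written to sysfs here.
    for uuid in mdev_uuids(SAMPLE_DOMAIN_XML):
        print(uuid, "->", mdev_create_path("0000:41:00.4", "nvidia-592"))
```

A real unit would fetch the XML for every defined domain from libvirt and skip UUIDs that already exist under /sys/bus/mdev/devices before writing anything.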
Best regards
Tobias
DISCLAIMER: Below is provided without any warranty of actually working for you or your setup and does
very specific things that we need and is only provided to give you some insight and help. Use at your own risk.
[1] https://paste.opendev.org/show/b6FdfwDHnyJXR0G3XarE/
[2] https://paste.opendev.org/show/bGtO6aIE519uysvytWv0/
[3] https://paste.opendev.org/show/bftOEIPxlpLptkosxlL6/
[4] https://paste.opendev.org/show/bOYBV6lhRON4ntQKYPkb/
------------------------------
Message: 3
Date: Tue, 17 Jan 2023 10:11:44 +0100
From: Thomas Goirand <zigo@debian.org>
To: openstack-discuss <OpenStack-discuss@lists.openstack.org>
Subject: Re: [designate] Proposal to deprecate the agent framework and
agent based backends
Message-ID: <46a43b97-063d-ed46-6dc1-94f7e0d12e5e@debian.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
On 1/17/23 01:52, Michael Johnson wrote:
> TLDR: The Designate team would like to deprecate the backend agent
> framework and the agent based backends due to lack of development and
> design issues with the current implementation. The following backends
> would be deprecated: Bind9 (Agent), Denominator, Microsoft DNS
> (Agent), Djbdns (Agent), Gdnsd (Agent), and Knot2 (Agent).
Hi Michael,
Thanks for this.
Now, if we're going to get rid of the code soonish, can we just get rid
of the unit tests, rather than attempting to monkey-patch dnspython?
That feels safer, no? With Eventlet, I have the experience that monkey
patching is dangerous and often leads to disaster.
Cheers,
Thomas Goirand (zigo)
------------------------------
Message: 4
Date: Tue, 17 Jan 2023 11:04:59 +0100
From: Sylvain Bauza <sbauza@redhat.com>
To: Tobias Urdin <tobias.urdin@binero.com>
Cc: openstack-discuss <openstack-discuss@lists.openstack.org>
Subject: Re: Experience with VGPUs
Message-ID:
<CALOCmukWP2qwfh7D8sUotSbhrpqok739s8KcvHmiXEZWT3JSfQ@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
On Tue, 17 Jan 2023 at 10:00, Tobias Urdin <tobias.urdin@binero.com>
wrote:
> Hello,
>
> We are using vGPUs with Nova on the OpenStack Xena release and we've had a
> fairly good experience integrating
> NVIDIA A10 GPUs into our cloud.
>
>
Great to hear, thanks for your feedback, much appreciated Tobias.
> As we see it, there are some pain points that just come with maintaining the
> GPU feature.
>
> - There is a very tight coupling between the NVIDIA driver in the guest
> (instance) and on the compute node that needs to
> be managed.
>
>
As NVIDIA provides proprietary drivers, there isn't much we can do
upstream, even for CI testing.
Many participants in this thread described this as a common concern and I
understand their pain, but yes, you need third-party tooling to manage
both the driver installation and the licensing servers.
> - Doing maintenance needs more planning, i.e. powering off instances; the NVIDIA
> driver on the compute node needs to be
> rebuilt on the hypervisor if the kernel is upgraded, unless you've implemented
> DKMS for that.
>
>
Ditto. I wish the driver were less kernel-dependent, but I
don't see that happening in the foreseeable future.
> - Because we have different flavors of GPU (we split the A10 cards into
> different flavors for maximum utilization of
> other compute resources), we added custom traits in the Placement service
> to handle that. We manage them with
> a script, since you quickly get confused doing anything GPU-related
> manually. [1]
>
True, that's why you can also use generic mdevs which will create different
resource classes (but ssssht) or use the placement.yaml file to manage your
inventories.
https://specs.openstack.org/openstack/nova-specs/specs/xena/implemented/generic-mdevs.html
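
For readers who have not used generic mdevs, the sketch below renders the kind of nova.conf fragment involved. The option names (enabled_mdev_types, device_addresses, mdev_class) are given as I understand them from the spec above, so please check them against your release. The mdev type names, PCI addresses and resource class names are purely hypothetical.

```python
import configparser
import io
from typing import Dict, List, Tuple


def render_mdev_config(types: Dict[str, Tuple[List[str], str]]) -> str:
    """Render [devices] and per-type [mdev_<type>] groups as nova.conf text.

    `types` maps an mdev type name to (PCI addresses, resource class).
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.add_section("devices")
    parser.set("devices", "enabled_mdev_types", ", ".join(types))
    for mdev_type, (addresses, resource_class) in types.items():
        # Custom resource classes in Placement carry the CUSTOM_ prefix.
        if resource_class != "VGPU" and not resource_class.startswith("CUSTOM_"):
            raise ValueError(f"unexpected resource class: {resource_class}")
        section = f"mdev_{mdev_type}"
        parser.add_section(section)
        parser.set(section, "device_addresses", ",".join(addresses))
        parser.set(section, "mdev_class", resource_class)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical split of two cards into a small and a large vGPU type.
    example = {
        "nvidia-35": (["0000:84:00.0"], "CUSTOM_GPU_SMALL"),
        "nvidia-36": (["0000:85:00.0"], "CUSTOM_GPU_LARGE"),
    }
    print(render_mdev_config(example))
```

A flavor would then ask for the matching resource class through an extra spec such as resources:CUSTOM_GPU_SMALL=1 instead of relying on a custom trait.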
> - Since Nova does not handle recreation of mdevs (or use the new libvirt
> autostart feature for mdevs), we have
> a systemd unit that runs before the nova-compute service, walks
> all the libvirt domains and does lookups
> in Placement to recreate the mdevs before nova-compute starts. [2] [3] [4]
>
>
This is a known issue and we agreed on a direction at the last PTG.
Patches are up for review.
https://review.opendev.org/c/openstack/nova/+/864418
Thanks,
-Sylvain
> Best regards
> Tobias
>
> DISCLAIMER: Below is provided without any warranty of actually working for
> you or your setup and does
> very specific things that we need and is only provided to give you some
> insight and help. Use at your own risk.
>
> [1] https://paste.opendev.org/show/b6FdfwDHnyJXR0G3XarE/
> [2] https://paste.opendev.org/show/bGtO6aIE519uysvytWv0/
> [3] https://paste.opendev.org/show/bftOEIPxlpLptkosxlL6/
> [4] https://paste.opendev.org/show/bOYBV6lhRON4ntQKYPkb/
>
------------------------------