[nova] Information Sharing for Arm CCA Feature (Following 4/8 PTG)
Thank you for the productive discussion about the Arm CCA feature [1] at the PTG on April 8th.
The use case for Arm CCA looks ok. But it is a bit early. It is wise to wait at least libvirt / qemu release to have a "stable" code. We may want to have a OS distribution version (such as Ubuntu XX.YY) which supports Arm CCA so that users can actually use/test the feature
The above information is quoted from the PTG Etherpad [2] (L316) summary. I'd like to follow up on these points by sharing relevant information about the components surrounding Arm CCA (libvirt, QEMU, OS distributions such as Ubuntu xx.xx, and hardware) as it becomes available, and to proceed with the discussion step by step. I'm starting with this email, but if there's a more appropriate method for sharing this type of information, please let me know.

Here's the Arm CCA software component information I can share at this time:

* Linux Kernel: The roadmap is outlined on the fifth page of the third attachment in this link [3]. The initial commits I've proposed to OpenStack are focused on RMM 1.0. KVM host support for RMM 1.0 will be available in H2 2025.
* QEMU upstream status: Development is in progress, awaiting the Linux Kernel merge.
* libvirt upstream status: Development is in progress, awaiting the QEMU merge.
* OS distribution support: The roadmap is not clear at this time.
We may want to have a OS distribution version (such as Ubuntu XX.YY) which supports Arm CCA so that users can actually use/test the feature
I have a question regarding the above. For design discussions leading to spec approval, I understand that a stable upstream QEMU/libvirt release is necessary. Is a specific OS distribution release (e.g., Ubuntu xx.xx) also required for spec approval?

[1] https://review.opendev.org/c/openstack/nova-specs/+/938276
[2] https://etherpad.opendev.org/p/nova-2025.2-ptg
[3] https://lists.trustedfirmware.org/archives/list/tsc@lists.trustedfirmware.or...

Regards,
Ryo Taketani
Hi, all.

Here are the expected release schedules for key Arm CCA software components:

* Linux Kernel: 2025/12
  * This is mentioned in the Arm CCA roadmap [1].
* QEMU: 2026/02
* libvirt: 2026/03
  * These are awaiting the Linux Kernel release. The anticipated schedule is based on the Linux Kernel roadmap and the release cycles of QEMU and libvirt.
* OS Distribution (Ubuntu 26.04): 2026/04
  * Ubuntu xx.xx is expected to be released in alignment with the latest Linux Kernel release.

Our goal is to merge the Arm CCA feature shortly after all necessary conditions are met. Therefore, I'd like to propose a discussion about how to best provide the required test environment. Specifically, should we provide real hardware and a CI/CD environment? If so, we should start preparing as soon as possible.

Is it possible to begin this discussion now? If not, what do you think we should do to keep this discussion moving forward?

Regards,
Ryo Taketani

[1] https://lists.trustedfirmware.org/archives/list/tsc@lists.trustedfirmware.or...
On 27/05/2025 09:44, taketani.ryo@fujitsu.com wrote:
Hi, all.
Here are the expected release schedules for key Arm CCA software components:
* Linux Kernel: 2025/12
  * This is mentioned in the Arm CCA roadmap [1].
* QEMU: 2026/02
* libvirt: 2026/03
  * These are awaiting the Linux Kernel release. The anticipated schedule is based on the Linux Kernel roadmap and the release cycles of QEMU and libvirt.
* OS Distribution (Ubuntu 26.04): 2026/04
  * Ubuntu xx.xx is expected to be released in alignment with the latest Linux Kernel release.
Based on that timeline, the earliest release of OpenStack that would have support for this would be 2026.2, releasing in October 2026.

You can, however, start setting up third-party testing before all of the components are available in Ubuntu. For example, in your CI you could use a custom VM image based on Ubuntu/Debian that has a mainline kernel with the support, and QEMU/libvirt compiled from source or installed from a PPA. You are free to use other distros for third-party CI too, although going too far from the common distros will likely create more overhead.

Currently OpenDev CI has limited Arm hardware, so it's unlikely that we will be able to do first-party integration testing now or in the future.
Our goal is to merge Arm CCA feature shortly after all necessary conditions are met.
Therefore, I'd like to propose a discussion about how to best provide the required test environment. Specifically, should we provide real hardware and a CI/CD environment? If so, we should start preparing as soon as possible.
Is it possible to begin this discussion now? If not, what do you think we should do to keep this discussion moving forward?
You can start the discussion. The one thing I will say is that it's unlikely that many will have time to implement this for you, so you and your team will have to take the lead on doing the work to run the third-party CI on your own hardware and make the results available publicly.

In general, as a community we use Zuul as our CI test runner. Zuul uses Nodepool to provide the test hosts that run the jobs. Nodepool supports several types of providers, such as an OpenStack cloud or a Kubernetes cluster. In your case, since you want to test a hardware feature that likely cannot be emulated, the simplest way to do that would be with static nodes added to your local Nodepool instance, which can then be used by Zuul to execute jobs.

If you already have an internal Jenkins system with your own provisioning system, you can use that instead to set up the CI. As a third-party CI provider, you have the freedom to define how that CI runs.

If running a third-party CI is too much overhead but you are willing to donate test hardware to the first-party CI, that may also be an option, but that would need discussion with the OpenStack/OpenDev infra team.
Regards,
Ryo Taketani
[1] https://lists.trustedfirmware.org/archives/list/tsc@lists.trustedfirmware.or...
Thank you for replying, Sean.
You can start the discussion. The one thing I will say is that it's unlikely that many will have time to implement this for you, so you and your team will have to take the lead on doing the work to run the third-party CI on your own hardware and make the results available publicly.
I understand that while meeting all the conditions is ultimately necessary, we should focus on preparing for, and aligning our efforts with the community towards, submitting a third-party CI [1] environment and its results. Is my understanding correct? If so, I'm committed to setting up the third-party CI and making the results publicly available. In addition, regarding the phrase "on your own hardware", can I use our internally available hardware for CI testing?
In your case, since you want to test a hardware feature that likely cannot be emulated?
if you already have an internal jenkins system with your own provisioning system you can use that instead to set up
An Arm CCA environment can be emulated by QEMU [2]. Can this be leveraged to proceed with CI testing? For example, we could set up and run CI tests in an emulated environment before executing them on the actual hardware. Internally, we are planning to do this. Would it be valuable to share this environment and the results with the community? By doing this, we'd like to discuss how to provide CI in advance.

[1] https://docs.openstack.org/infra/openstackci/third_party_ci.html
[2] https://linaro.atlassian.net/wiki/spaces/QEMU/pages/29051027459/Building+an+...

Regards,
Ryo Taketani
On 04/06/2025 10:49, taketani.ryo@fujitsu.com wrote:
Thank you for replying, Sean.
You can start the discussion. The one thing I will say is that it's unlikely that many will have time to implement this for you, so you and your team will have to take the lead on doing the work to run the third-party CI on your own hardware and make the results available publicly.
I understand that while meeting all the conditions is ultimately necessary, we should focus on preparing for, and aligning our efforts with the community towards, submitting a third-party CI [1] environment and its results. Is my understanding correct?
If so, I'm committed to setting up the third-party CI and making the results publicly available.
I used to recommend the OVH Performance Hosting plan for this: https://www.ovhcloud.com/en-ie/web-hosting/compare/ For about €150 a year you get 500 GB of storage, multiple 1 GB databases that you can use as part of your CI if you want to, and plenty of bandwidth, which allows you to host the CI logs publicly outside your company firewall.

When I was involved in the Intel third-party CI, that's what we used. Since then they have added a Professional Hosting option with 250 GB of storage, but you may not have enough space to retain logs for up to 30 days with that plan. It is half the cost, however.

If you're using Zuul, or just the Zuul playbooks, you can pretty trivially scp the logs from the folder on the executor to the root of the web share, since you have SSH access in both plans.

I know that when we spoke to our internal lab team, being able to fully outsource the log storage outside our corporate network made them very happy, so if you can, and your company does not have a facility to do that already, I would consider OVH or similar.
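As a rough illustration of the storage concern above, the following minimal sketch estimates how much space a 30-day log retention window needs and compares it with the 500 GB and 250 GB plan sizes mentioned. The job count, average log size, and the 80% headroom factor are purely hypothetical assumptions, not figures from this thread; substitute your own CI's numbers.

```python
def retained_log_size_gb(jobs_per_day: int, avg_log_mb: float, retention_days: int) -> float:
    """Steady-state storage needed to keep `retention_days` of CI logs, in GB."""
    return jobs_per_day * avg_log_mb * retention_days / 1000.0


def fits(plan_gb: float, needed_gb: float, headroom: float = 0.8) -> bool:
    """True if the logs fit while using at most `headroom` of the plan's storage."""
    return needed_gb <= plan_gb * headroom


# Hypothetical workload: 100 jobs per day, 100 MB of logs per job, 30-day retention.
needed = retained_log_size_gb(jobs_per_day=100, avg_log_mb=100, retention_days=30)

# Plan sizes as quoted in the message above.
for plan, size_gb in {"Performance": 500, "Professional": 250}.items():
    print(f"{plan}: need {needed:.0f} GB of {size_gb} GB, fits: {fits(size_gb, needed)}")
```

With these assumed numbers the larger plan has room and the smaller one does not, which matches the caution above about the 250 GB option.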
In addition, regarding the phrase "on your own hardware", can I use our internally available hardware for CI testing?
Yes, if your lab security policy allows that, then you can use any hardware you have available when running a third-party CI. I don't think we will have a way to use that hardware easily in the first-party CI, so that is really only an option if you run the CI yourself.
In your case, since you want to test a hardware feature that likely cannot be emulated?
If you already have an internal Jenkins system with your own provisioning system you can use that instead to set up
An Arm CCA environment can be emulated by QEMU [2]. Can this be leveraged to proceed with CI testing?
Maybe. For third-party CI, yes: you could create VMs to emulate the compute nodes and then install OpenStack on those emulated compute nodes. The constraints there would be performance and the ability to use nested CCA. If it is supported by libvirt, then nova could be enhanced to support enabling the emulation, and eventually, once one of our first-party OpenStack providers supports this feature, we could use that in the first-party CI. That will likely take 1-2+ years to get in place, so I would not recommend that as the initial approach.
For example, we could set up and run CI tests in an emulated environment before executing them on the actual hardware. Internally, we are planning to do this. Would it be valuable to share this environment and the results with the community? By doing this, we'd like to discuss how to provide CI in advance.
Yes, I can see a number of options. If you run a third-party CI, you can automate creating and deleting the VMs that emulate CCA, and then allocate those VMs to CI jobs as worker/compute nodes. I don't think providing access to the community would really be useful, but if you provided documentation for how to deploy a VM that is capable of emulating CCA and installing devstack in it, that would be useful.
[1] https://docs.openstack.org/infra/openstackci/third_party_ci.html
[2] https://linaro.atlassian.net/wiki/spaces/QEMU/pages/29051027459/Building+an+...
Regards,
Ryo Taketani
Thank you for replying, Sean.
I used to recommend the OVH Performance Hosting plan for this:
https://www.ovhcloud.com/en-ie/web-hosting/compare/ For about €150 a year you get 500 GB of storage, multiple 1 GB databases that you can use as part of your CI if you want to, and plenty of bandwidth, which allows you to host the CI logs publicly outside your company firewall.
When I was involved in the Intel third-party CI, that's what we used, but since then they have added a Professional Hosting
option with 250 GB of storage, but you may not have enough space to retain logs for up to 30 days with that plan.
It is half the cost, however.
If you're using Zuul, or just the Zuul playbooks, you can pretty trivially scp the logs from the folder on the executor to the root of the web share, since you have SSH access in both plans.
I know that when we spoke to our internal lab team, being able to fully outsource the log storage outside our corporate network made them very happy, so if you can, and your company does not have a facility to do that already, I would consider OVH or similar.
I understand that I need to scp CI test logs to external storage and retain them. I will use the suggested methods for selecting the appropriate approach.
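The scp step discussed above could be scripted roughly as follows. This is only a sketch under stated assumptions: the log directory, the remote account `ci@logs.example.org`, and the `www` web root are hypothetical placeholders, and a Zuul-based setup would more likely do this from a post-run playbook than from a standalone script.

```python
import subprocess


def build_scp_command(log_dir: str, remote: str, remote_root: str) -> list[str]:
    """Return the scp argv that recursively copies `log_dir` into the web share root."""
    # -r copies the directory tree, -q suppresses the progress meter in CI output.
    return ["scp", "-r", "-q", log_dir, f"{remote}:{remote_root}/"]


def upload_logs(log_dir: str, remote: str, remote_root: str) -> None:
    """Copy one job's log directory to the remote host; raises if scp exits non-zero."""
    subprocess.run(build_scp_command(log_dir, remote, remote_root), check=True)


# Hypothetical paths and host, shown only to illustrate the resulting command line.
print(" ".join(build_scp_command("/tmp/job-logs/abc123", "ci@logs.example.org", "www")))
```

Keeping the command construction separate from the call that runs it makes the upload step easy to check without network access.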
I don't think providing access to the community would really be useful, but if you provided documentation for how to deploy a VM that is capable of emulating CCA and installing devstack in it, that would be useful.
I understand that providing publicly accessible test result logs from an emulated environment would not be useful. Instead, I should contribute documentation on how to emulate a CCA environment and build OpenStack within it. We cannot run CI tests on actual hardware in the immediate future, so we will proceed with CI testing in the following steps:

1. Run CI tests in an emulated environment and contribute documentation on how to emulate a CCA environment and build DevStack.
2. Run CI tests on actual hardware and commit the CI environment configurations and logs.
You can, however, start setting up third-party testing before all of the components are available in Ubuntu
I'd like to seek clarification on the comment above, which I received two messages ago. Is it correct that "you can however start setting up third party" refers to the possibility of doing internal preparation for CI testing? Is the ultimate goal to store CI test results from real hardware running upstream versions of OS/libvirt/QEMU/Linux Kernel, and will approval of those results be required for merging?

Regards,
Ryo Taketani
Hi all and Sean,
Is the ultimate goal to store CI test results from real hardware running upstream versions of OS/libvirt/QEMU/LinuxKernel, and will approval of those results be required for merging?
I apologize for the lack of clarity in my previous question. I'd like to rephrase and clarify my question.

For merging and releasing the nova CCA code, is it mandatory to have Third Party CI running on bare metal (non-emulated) environments? Or is Third Party CI on emulated environments also acceptable?

Furthermore, how long should the Third-Party environment be maintained? Are there any policies (e.g., "X number of years after the CCA feature is released")?

Regards,
Ryo Taketani
I wonder if we can apply the same policies which we agreed on for AMD SEV or SEV-ES.

- All required features have been implemented in kernel/qemu/libvirt (or any other dependent software) and we know exactly what should be checked/configured by nova (from documentation).
- Do not require 3rd party CI due to limited availability of hardware
- Require functional tests with a fake libvirt driver (and a few more mocks) which simulates the behavior of kernel/qemu/libvirt with ARM-CCA features enabled.
- Actual functionality with actual hardware(*1) needs to be tested locally by the developer

IIUC this was one of the suggestions bauzas mentioned when this topic was first discussed in a past nova meeting (if I understand what he means by "fake hardware"), and I didn't find any specific concern discussed that would make third party CI a strict blocker.

I know that having CI which can actually test the functionality is preferred, but as was mentioned earlier it may require a lot of overhead.

[1] https://meetings.opendev.org/meetings/nova/2025/nova.2025-01-28-16.00.log.ht...

(*1) Note that an actual CPU with the ARM CCA feature is not currently available. All development of ARM CCA support features in kernel/qemu/libvirt uses a simulated CPU (which is actually an emulated CPU provided by a forked QEMU), and the initial development in OpenStack may follow the same approach.

On 6/18/25 7:03 PM, taketani.ryo@fujitsu.com wrote:
Hi all and Sean,
Is the ultimate goal to store CI test results from real hardware running upstream versions of OS/libvirt/QEMU/LinuxKernel, and will approval of those results be required for merging?
I apologize for the lack of clarity in my previous question. I'd like to rephrase and clarify my question.
For merging and releasing the nova CCA code, is it mandatory to have Third Party CI running on bare metal (non-emulated) environments? Or is Third Party CI on emulated environments also acceptable?
Furthermore, how long should the Third-Party environment be maintained? Are there any policies (e.g., "X number of years after the CCA feature is released")?
Regards, Ryo Taketani
On 19/06/2025 17:05, Takashi Kajinami wrote:
I wonder if we can apply the same policies which we agreed with for AMD SEV or SEV-ES.
- All required features have been implemented in kernel/qemu/libvirt (or any other dependent software) and we know exactly what should be checked/configured by nova (from documentations).
- Do not require 3rd party CI due to limited availability of hardware
- Require functional tests with fake libvirt driver (and a few more mocks) which simulates the behavior of kernel/qemu/libvirt with ARM-CCA features enabled.
- Actual functionality with actual hardware(*1) needs to be tested locally by the developer
IIUC this was one of the suggestions bauzas mentioned when this topic was first discussed in a past nova meeting (if I understand what he means by "fake hardware"), and I didn't find any specific concern discussed that would make third party CI a strict blocker.
For SEV we found actual hardware in Red Hat to allow core developers to test it too before it was merged. We have required testing on real hardware in the past, with actual results published for review. And we did not accept enabling the Intel PMEM feature until there was actual hardware available in the market that supported it.

So we may be able to proceed without a third-party CI, but I don't think we should proceed until the hardware exists and the testing is done at least manually once on the real hardware. Without third-party CI, I also think we would want to consider it somewhat experimental. That does not mean jumping through any hoops to be able to use the feature, just a warning in the docs that this feature has very limited testing and that upstream has a limited ability to fix any bugs that are found.

This would make the feature similar to the ability to emulate different architectures with QEMU, i.e. running Arm VMs on x86 hosts. It's available to use but not recommended for production workloads. It's intended mainly for CI workloads or developer environments.
I know that having CI which can actually test the functionality is preferred, but as was mentioned earlier it may require a lot of overhead.
[1] https://meetings.opendev.org/meetings/nova/2025/nova.2025-01-28-16.00.log.ht...
(*1) Note that an actual CPU with the ARM CCA feature is not currently available. All development of ARM CCA support features in kernel/qemu/libvirt uses a simulated CPU (which is actually an emulated CPU provided by a forked QEMU), and the initial development in OpenStack may follow the same approach.
Yeah, so the actual hardware needs to be available to merge the support in nova. We can use QEMU to do the development before that, but to actually support it the hardware must exist and be available to buy. It can just exist in an internal lab somewhere.

Also, with regards to using QEMU: if we are merging this without third-party CI, I would expect a full writeup in the contributor docs of how to replicate a QEMU VM that you can install devstack in using Ubuntu/Debian and run tempest to validate the feature, similar to https://docs.openstack.org/nova/latest/contributor/testing/pci-passthrough-s... and the other testing guides https://docs.openstack.org/nova/latest/contributor/index.html#testing like this https://docs.openstack.org/nova/latest/contributor/testing/libvirt-numa.html

If the core team does not have a way to replicate and debug a bug, I don't think we could really support it beyond an experimental capacity.
On 6/18/25 7:03 PM, taketani.ryo@fujitsu.com wrote:
Hi all and Sean,
Is the ultimate goal to store CI test results from real hardware running upstream versions of OS/libvirt/QEMU/LinuxKernel, and will approval of those results be required for merging?
I apologize for the lack of clarity in my previous question. I'd like to rephrase and clarify my question.
For merging and releasing the nova CCA code, is it mandatory to have Third Party CI running on bare metal (non-emulated) environments? Or is Third Party CI on emulated environments also acceptable?
Furthermore, how long should the Third-Party environment be maintained? Are there any policies (e.g., "X number of years after the CCA feature is released")?
Regards, Ryo Taketani
Hi Sean,

Thanks for your detailed explanation (as always!). I wasn't aware of the full context of the past development of SEV support, and it helps me understand the situation more clearly.

So based on your feedback, I think what we can agree on at this point would be:

- The feature can be merged once real hardware is available and the code is tested with it
- Setting up 3rd party CI is not a strict blocker if the following items are provided:
  - A developer guide to reproduce the setup, so that cores can set up a virtual environment with devstack + tempest to try/test the feature
  - Documentation which clearly states that the feature is "experimental" due to its limited testing

  3rd party CI is, however, required to mark the feature as fully supported.

But let me know if I misunderstood your intention.

Thank you,
Takashi

On 6/20/25 4:13 AM, Sean Mooney wrote:
On 19/06/2025 17:05, Takashi Kajinami wrote:
I wonder if we can apply the same policies which we agreed with for AMD SEV or SEV-ES.
- All required features have been implemented in kernel/qemu/libvirt (or any other dependent software) and we know exactly what should be checked/configured by nova (from documentations).
- Do not require 3rd party CI due to limited availability of hardware
- Require functional tests with fake libvirt driver (and a few more mocks) which simulates the behavior of kernel/qemu/libvirt with ARM-CCA features enabled.
- Actual functionality with actual hardware(*1) needs to be tested locally by the developer
IIUC this was one of the suggestions bauzas mentioned when this topic was first discussed in a past nova meeting (if I understand what he means by "fake hardware"), and I didn't find any specific concern discussed that would make third party CI a strict blocker.
For SEV we found actual hardware in Red Hat to allow core developers to test it too before it was merged.
We have required testing on real hardware in the past, with actual results published for review.
And we did not accept enabling the Intel PMEM feature until there was actual hardware available in the market
that supported it.
So we may be able to proceed without a third-party CI, but I don't think we should proceed until the hardware exists and the testing is done at least manually once on the real hardware. Without third-party CI, I also think we would want to consider it somewhat experimental.
That does not mean jumping through any hoops to be able to use the feature, just a warning in the docs that this feature has very limited testing and that upstream has a limited ability to fix any bugs that are found.
This would make the feature similar to the ability to emulate different architectures with QEMU, i.e. running Arm VMs on x86 hosts. It's available to use but not recommended for production workloads. It's intended mainly for CI workloads or developer environments.
I know that having CI which can actually test the functionality is preferred, but as was mentioned earlier it may require a lot of overhead.
[1] https://meetings.opendev.org/meetings/nova/2025/nova.2025-01-28-16.00.log.ht...
(*1) Note that an actual CPU with the ARM CCA feature is not currently available. All development of ARM CCA support features in kernel/qemu/libvirt uses a simulated CPU (which is actually an emulated CPU provided by a forked QEMU), and the initial development in OpenStack may follow the same approach.
Yeah, so the actual hardware needs to be available to merge the support in nova.
We can use QEMU to do the development before that, but to actually support it the hardware must exist and
be available to buy. It can just exist in an internal lab somewhere.
Also, with regards to using QEMU: if we are merging this without third-party CI, I would expect a full
writeup in the contributor docs of how to replicate a QEMU VM that you can install devstack in using Ubuntu/Debian
and run tempest to validate the feature,
similar to https://docs.openstack.org/nova/latest/contributor/testing/pci-passthrough-s... and the other testing guides
https://docs.openstack.org/nova/latest/contributor/index.html#testing
like this https://docs.openstack.org/nova/latest/contributor/testing/libvirt-numa.html
If the core team does not have a way to replicate and debug a bug, I don't think we could really support it beyond an experimental capacity.
On 6/18/25 7:03 PM, taketani.ryo@fujitsu.com wrote:
Hi all and Sean,
Is the ultimate goal to store CI test results from real hardware running upstream versions of OS/libvirt/QEMU/LinuxKernel, and will approval of those results be required for merging?
I apologize for the lack of clarity in my previous question. I'd like to rephrase and clarify my question.
For merging and releasing the nova CCA code, is it mandatory to have Third Party CI running on bare metal (non-emulated) environments? Or is Third Party CI on emulated environments also acceptable?
Furthermore, how long should the Third-Party environment be maintained? Are there any policies (e.g., "X number of years after the CCA feature is released")?
Regards, Ryo Taketani
Thank you for the discussion and your information, Sean and Takashi. These discussions have been very helpful in deepening our understanding.

I would like to confirm my understanding of the requirements for merging the feature (and I have one question):

- Required
  - The actual hardware must be available for purchase on the market.
  - libvirt/QEMU/OS support is available upstream.
  - One of the following testing approaches must be satisfied:
    1. Testing without third-party CI:
       1. The feature must be tested on the actual hardware at least once.
          - Question (to clarify): Is it necessary for us to run the tests ourselves and publish the test results on a public server?
       2. The contributor guide must include detailed instructions on how to set up and run tests in an emulated environment.
       - This will result in the feature being provided with limited testing methods available to developers and core reviewers, because an emulated environment is not recommended for production workloads. So, the feature will be experimental.
    2. Testing with third-party CI:
       - This will result in full support for the feature.

Please let me know if I've misunderstood something, and please answer the question. I will consider these testing options internally and decide which approach we want to proceed with. After that, I'd appreciate it if we could discuss this further.

Regards,
Taketani
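The requirements summarized above can be restated as a small decision function. This is only a sketch of the summary as written in this message, which was still awaiting confirmation from the reviewers; the class, field, and function names are hypothetical and do not correspond to anything in nova.

```python
from dataclasses import dataclass


@dataclass
class MergeReadiness:
    hardware_on_market: bool   # real Arm CCA hardware can be purchased
    upstream_released: bool    # libvirt/QEMU/OS support is available upstream
    third_party_ci: bool       # a third-party CI publishes results
    tested_on_hardware: bool   # tested at least once on the actual hardware
    contributor_guide: bool    # emulated test setup documented in the contributor guide


def support_level(r: MergeReadiness) -> str:
    """Map the summarized conditions to the resulting status of the feature."""
    # Both baseline conditions are required regardless of the testing approach.
    if not (r.hardware_on_market and r.upstream_released):
        return "not mergeable"
    # Approach 2: third-party CI leads to full support.
    if r.third_party_ci:
        return "full support"
    # Approach 1: one manual hardware test plus the guide leads to experimental status.
    if r.tested_on_hardware and r.contributor_guide:
        return "experimental"
    return "not mergeable"


print(support_level(MergeReadiness(True, True, False, True, True)))
```

Writing the conditions out this way makes it explicit that the two baseline requirements gate both testing approaches.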
Participants (3)
- Sean Mooney
- Takashi Kajinami
- taketani.ryo@fujitsu.com