On 11/20/23 18:50, Matthias Runge wrote:
On 16/11/2023 16:35, Takashi Kajinami wrote:
Hi,
As I mentioned in the heat patch, I have some concerns about this, and as a heat core I now disagree with the exception request.
1. The proposed change is a new feature, not a bug fix. Heat has asserted that it follows the stable branch policy.[1] IMO we should not accept this backport unless
- the TC updates the stable policy, or
- Heat decides to stop following the global stable policy and use its own
Thank you Takashi!
The argument here is that it is a feature backport that would not change existing behavior, and is thus safe for current users of Bobcat or Antelope.
My main point is not whether backporting this is "safe", but whether it is aligned with the global policies and whether there is any critical requirement which justifies an exception.
3. This is similar to 2, but heat would be broken if a user attempts an upgrade from 20.1.0 (antelope with the feature backported) to 21.0.0 (bobcat without the feature) after the new resource is created. We need a proper block to prohibit this upgrade path.
To me, the scenario of a user on the latest antelope branch updating to the earliest bobcat release seems artificial. In this case, backporting patches or features means backporting from Caracal to Bobcat and then to Antelope.
For a user to land in a "broken" state or regression, they would then have to:
- use the latest Antelope version AND the new prometheus feature
- upgrade to a Bobcat version where the feature does not exist (i.e. one not including the latest backports)
"Upgrade to an older version" might sound artificial, but it is still a possible scenario. It works under the current global policy but would be broken by approving an exception, and we should evaluate the risk of breaking some users' expectations.
For users of distributions, the distribution should guarantee that there are no regressions.
This may work for products, but we still have users on non-product distributions, such as upstream code or midstream ones like RDO/UCA.
4. The feature was added quite recently, and has never been tested upstream due to the lack of jobs with prometheus. Even if we backport the feature, we should have proper test coverage in CI, or at least manual testing done, to avoid backporting multiple bug fixes later.
We are working on jobs that involve prometheus, see
https://review.opendev.org/c/openstack/ceilometer/+/898087
https://review.opendev.org/c/openstack/ceilometer/+/900509
and following (functional tests for observabilityclient)
https://review.opendev.org/c/openstack/python-observabilityclient/+/899924
Ultimately, prometheus should be a drop-in replacement for gnocchi and the telemetry tests should continue to work as they do right now.
It's very nice to see some work going on around test coverage!