[all][interop][cinder][qa] API changes with/without microversion and Tempest verification of API interoperability
Hello Everyone,

As per discussion over the ML, Tempest started JSON schema strict validation of the Volume API responses [1]. Because it may affect interop certification, this was explained to the Interop team as well as in the Board of Directors meeting[2].

In Train, while implementing the validation, Tempest found an API change where a new field was added to an API response without versioning[3] (Cinder has an API microversion mechanism). IMO that was not the correct way to change the API: per the API-WG guidelines[4], any field added, modified, or removed in an API should be done under a microversion (meaning old versions/users are not affected by the change), which is a must for API interoperability.

With JSON schema validation, Tempest verifies the API interoperability behaviour recommended by the API-WG. But as per an IRC conversation with the cinder team, we have different opinions on API interoperability and on how an API should be changed under the microversion mechanism. I would like to reach a conclusion on this so that Tempest can either verify the Volume API with strict validation or leave it out.

[1] http://lists.openstack.org/pipermail/openstack-discuss/2018-November/000358....
[2] - http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003652.htm...
- http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003655.htm...
[3] https://bugs.launchpad.net/tempest/+bug/1843762 https://review.opendev.org/#/c/439461/
[4] https://specs.openstack.org/openstack/api-wg/guidelines/api_interoperability...

-gmann
---- On Mon, 16 Sep 2019 18:53:58 +0900 Ghanshyam Mann <gmann@ghanshyammann.com> wrote ----
I found the same flow chart that Sean created in Nova about "when to bump a microversion" in Cinder as well, which clearly says any addition to a response needs a new microversion. - https://docs.openstack.org/cinder/latest/contributor/api_microversion_dev.ht...

-gmann
On Mon, Sep 16, 2019 at 6:06 AM Ghanshyam Mann <gmann@ghanshyammann.com> wrote:
i would also expect any change in the request or response to result in a microversion bump as well.

peace o/
On 9/16/19 6:02 AM, Ghanshyam Mann wrote:
I don't believe that it is clear that a microversion bump was required for the "groups" response showing up in a GET quota-sets response, and here's why:

This API has, since at least Havana, returned dynamic fields based on quotas that are assigned to volume types. i.e.:

$ cinder --debug quota-show b73b1b7e82a247038cd01a441ec5a806
DEBUG:keystoneauth:RESP BODY: {"quota_set": {"per_volume_gigabytes": -1, "volumes_ceph": -1, "groups": 10, "gigabytes": 1000, "backup_gigabytes": 1000, "snapshots": 10, "volumes_enc": -1, "snapshots_enc": -1, "snapshots_ceph": -1, "gigabytes_ceph": -1, "volumes": 10, "gigabytes_enc": -1, "backups": 10, "id": "b73b1b7e82a247038cd01a441ec5a806"}}

"gigabytes_ceph" is in that response because there's a "ceph" volume type defined, same for "gigabytes_enc", etc.

This puts this API alongside something more like listing volume types -- you get a list of what's defined on the deployment, not a pre-baked list of defined fields.

Complaints about "groups" being added without a microversion imply that these other dynamic fields shouldn't be in this response either -- but this is how this API works.

There's a lot of talk here about interoperability problems... what are those problems, exactly? If we ignore Ocata and just look at Train -- why is this API not problematic for interoperability there, when requests on different clouds would return different data, depending on how types are configured?

It's not clear to me that rectifying the microversion concerns around the "groups" field is helpful without also understanding this piece, because if the concern is that different clouds return different fields for this API -- that will still happen. We need more detail to understand how to address this, and what the problem is that we are trying to solve, exactly. (Other than the problem that Tempest currently fails on Ocata. My inclination is still that the Tempest tests could just be wrong.)
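[To make the dynamic-key behaviour concrete, here is a minimal sketch -- not Cinder's actual code -- of how per-volume-type quota keys of the form "<resource>_<type>" arise:]

# Not Cinder's actual implementation -- just a sketch of how
# "<resource>_<volume type>" keys make the quota response dynamic.
base_resources = ["volumes", "snapshots", "gigabytes"]
volume_types = ["ceph", "enc"]  # admin-defined; varies per deployment

quota_keys = list(base_resources)
for vtype in volume_types:
    quota_keys.extend("%s_%s" % (res, vtype) for res in base_resources)

print(quota_keys)
# ['volumes', 'snapshots', 'gigabytes',
#  'volumes_ceph', 'snapshots_ceph', 'gigabytes_ceph',
#  'volumes_enc', 'snapshots_enc', 'gigabytes_enc']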
This puts this API alongside something more like listing volume types -- you get a list of what's defined on the deployment, not a pre-baked list of defined fields.
I think this is the root of the confusion, and why I still think that enforcement, at least as it is now, should be reverted from tempest.

This is not an API change where Cinder changed the columns in the response, it's the rows. This is a dynamic list. Like Eric points out, this really is no different than listing volumes or volume types. This definitely should *not* be a microversion bump, and the enforcement by tempest of the content (not the structure) is wrong.

Sean
On Mon, Sep 16, 2019 at 1:55 PM Sean McGinnis <sean.mcginnis@gmx.com> wrote:
these details definitely make a difference to me. perhaps i should clarify my previous statement: i would expect any changes to the request or response /schemas/ to be associated with a version bump. if these tolerances are allowed within the current schema, then it makes sense to me that no version change would occur.

thanks for the clarification Eric and Sean

peace o/
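[As a concrete illustration of that distinction: a minimal sketch, using Python's jsonschema, of a schema that pins the structure while tolerating dynamic per-type keys. Field names mirror the quota-show example above, with a trimmed field list; this is not Tempest's actual schema.]

import jsonschema

# Base quota fields are enumerated; per-type rows such as "gigabytes_ceph"
# are matched by a pattern instead of being listed one by one.
quota_set_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "volumes": {"type": "integer"},
        "snapshots": {"type": "integer"},
        "gigabytes": {"type": "integer"},
    },
    "required": ["id", "volumes", "snapshots", "gigabytes"],
    # Dynamic rows: <resource>_<volume type name>
    "patternProperties": {"^(volumes|snapshots|gigabytes)_.+$": {"type": "integer"}},
    "additionalProperties": False,
}

response = {"id": "b73b", "volumes": 10, "snapshots": 10,
            "gigabytes": 1000, "gigabytes_ceph": -1}
jsonschema.validate(response, quota_set_schema)  # passes: dynamic row allowed

[Note that a non-pattern addition like "groups" would still be rejected by additionalProperties under such a schema, which is exactly where the microversion question bites.]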
On 9/16/2019 12:40 PM, Eric Harney wrote:
There's a lot of talk here about interoperability problems... what are those problems, exactly? If we ignore Ocata and just look at Train -- why is this API not problematic for interoperability there, when requests on different clouds would return different data, depending on how types are configured?
It's not clear to me that rectifying the microversion concerns around the "groups" field is helpful without also understanding this piece, because if the concern is that different clouds return different fields for this API -- that will still happen. We need more detail to understand how to address this, and what the problem is that we are trying to solve exactly.
Backend/type specific information leaking out of the API dynamically like that is definitely an interoperability problem, and as you said it sounds like it's been that way for a long time. The compute servers diagnostics API had a similar problem for a long time, and the associated Tempest test for that API was disabled for a long time because the response body was hypervisor specific, so we eventually standardized it in a microversion so it was driver agnostic.

--
Thanks,
Matt
Except this isn't backend specific information that is leaking. It's just reflecting the configuration of the system.
On Mon, 2019-09-16 at 17:11 -0500, Sean McGinnis wrote:
Except this isn't backend specific information that is leaking. It's just reflecting the configuration of the system.

yes, and config-driven api behavior is also an interop problem. ideally you should not be able to tell if cinder is backed by ceph or emc from the api response at all.

sure, you might have a volume type called ceph and another called emc, but both should report capacity in the same field with the same unit. ideally you would have a snapshots or gigabytes quota and optionally associate that with a volume type, but snapshots_ceph is not interoperable across clouds if it exists with that name solely because ceph was used on the backend. as a client i would have to look at snapshots_* to figure out my quota, and in principle that is an unbounded set.
---- On Tue, 17 Sep 2019 07:59:19 +0900 Sean Mooney <smooney@redhat.com> wrote ----
Yeah, and this is a real pain point for end-users or apps using the API directly. Dynamic API behaviour based on system configuration is an interoperability issue.

In the bug#1687538 case, the new field is reflected for clouds with the same backend and same configuration. A cloud provider upgrades their cloud from ocata to anything later, and users start getting the new field without any mechanism to discover whether that field is expected to be present or not.

-gmann
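[To illustrate what such a discovery mechanism looks like when a change is microversioned -- a sketch only: the endpoint and the 3.50 number are made up, while the versions document and the OpenStack-API-Version header are the real mechanisms.]

import requests

CINDER_ENDPOINT = "http://cloud.example.com:8776/"  # hypothetical

def max_microversion(endpoint=CINDER_ENDPOINT):
    # The versions document advertises each API version's supported
    # microversion range; "version" holds the maximum for the v3 entry.
    doc = requests.get(endpoint).json()
    v3 = next(v for v in doc["versions"] if v["id"].startswith("v3"))
    return tuple(int(part) for part in v3["version"].split("."))

# Had "groups" been added under, say, microversion 3.50 (made-up number),
# an app could gate on it instead of guessing after an upgrade:
if max_microversion() >= (3, 50):
    headers = {"OpenStack-API-Version": "volume 3.50"}
    # ... request quota sets expecting the "groups" field ...
else:
    pass  # older cloud: the field is known not to exist at this version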
Seems we can hardly reach an agreement about whether to use a microversion for fields added in a response. But I think for tempest, things are simpler: we can add schema checks according to the api-ref, and if some issues are found (like the groups field) in an older version, we can simply remove that field from the required fields. That won't happen very often.
---- On Tue, 17 Sep 2019 01:41:23 -0700 <zhu.fanglei@zte.com.cn> wrote ----
I do not think we should do that. When I proposed the idea of doing strict JSON schema validation for the volume API, the main goal was to strictly block any backward-incompatible or non-interoperable change in the API. The compute JSON schema strict validation is a good example of the benefits of doing that.

The idea behind strict validation is that we have a predefined schema for the API response, with its optional/mandatory fields, names, types, ranges etc., and we compare it with the actual API response to see if any API field was added, removed, or had its type or range changed. If we do not enforce 'required' fields or additionalProperties=False, then there is no meaning to strict validation. I would rather leave those APIs out of strict validation; I have commented the same on all your open patches. JSON schema strict validation has to be completely strict validation or no validation.

As you mentioned the api-ref: I do not think we have a well-defined api-ref for the volume case. You can see that during the Tempest schema implementation you fixed 18 bugs in the api-ref[1]. I always go with what the code returns, as that is the response the end user actually gets.

[1] https://review.opendev.org/#/q/project:openstack/cinder+branch:master+topic:...

-gmann
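[To make "strict" concrete: a minimal sketch of the idea with Python's jsonschema -- a trimmed field list, not Tempest's actual schema -- showing how required + additionalProperties=False catches an unversioned addition like "groups":]

import jsonschema

# Trimmed strict schema in the spirit of the compute validation: every
# expected field is enumerated and required, and nothing else is allowed.
strict_quota_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "volumes": {"type": "integer"},
        "snapshots": {"type": "integer"},
        "gigabytes": {"type": "integer"},
    },
    "required": ["id", "volumes", "snapshots", "gigabytes"],
    "additionalProperties": False,
}

ocata_response = {"id": "b73b", "volumes": 10, "snapshots": 10, "gigabytes": 1000}
jsonschema.validate(ocata_response, strict_quota_schema)  # passes

pike_response = dict(ocata_response, groups=10)  # added without a microversion
try:
    jsonschema.validate(pike_response, strict_quota_schema)
except jsonschema.ValidationError as exc:
    print(exc.message)  # Additional properties are not allowed ('groups' was unexpected)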
On 9/16/19 6:59 PM, Sean Mooney wrote:
Except this isn't backend specific information that is leaking. It's just reflecting the configuration of the system.

yes, and config-driven api behavior is also an interop problem. ideally you should not be able to tell if cinder is backed by ceph or emc from the api response at all. sure, you might have a volume type called ceph and another called emc, but both should report capacity in the same field with the same unit. ideally you would have a snapshots or gigabytes quota and optionally associate that with a volume type, but snapshots_ceph is not interoperable across clouds if it exists with that name solely because ceph was used on the backend. as a client i would have to look at snapshots_* to figure out my quota, and in principle that is an unbounded set.
I think you are confusing types vs backends here. In my example, it was called "snapshots_ceph" because there was a type called "ceph". That's an admin choice, not a behavior of the API.
On Tue, 2019-09-17 at 09:23 -0400, Eric Harney wrote:
or it could have been expressed in the api with a dedicated type field, so you would always have a snapshots field regardless of the volume type, but have a single type field per quota set that identified what type it applied to.
---- On Tue, 17 Sep 2019 06:46:31 -0700 Sean Mooney <smooney@redhat.com> wrote ----
IMO, the best way is to make it a nested structure where volume_type specific quotas can be optional keys under a mandatory 'snapshots' field. For example:

{
    "quota_set": {
        ...
        "snapshots": {
            "total/project": 10,
            "ceph": -1,
            "lvm-thin": -1,
            "lvmdriver-1": -1
        }
    }
}

-gmann
On Tue, 2019-09-17 at 07:26 -0700, Ghanshyam Mann wrote:
well you can do it that way or invert it:

{
    "quota_set": {
        "ceph": {"snapshots": -1, "gigabytes": 100, ...},
        "lvm-1": {"snapshots": -1, "gigabytes": 100, ...},
        "lvm-2": {"snapshots": -1, "gigabytes": 100, ...},
        "project": {"snapshots": -1, "gigabytes": 100, ...}
    }
}

in either case the field names remain the same and the type is treated as an opaque string that is decoupled from the field names.
On Tue, 2019-09-17 at 15:39 +0100, Sean Mooney wrote:
i meant to say i was originally thinking of it slightly differently, by having a type column and the quota_set being a list:

{
    "quota_set": [
        {"snapshots": -1, "gigabytes": 100, "type": "ceph", ...},
        {"snapshots": -1, "gigabytes": 100, "type": "lvm-1", ...},
        {"snapshots": -1, "gigabytes": 100, "type": "lvm-2", ...},
        {"snapshots": -1, "gigabytes": 100, "type": "project", ...}
    ]
}

this is my preferred form of the 3, since you can validate the keys and values easily with json schema and it maps nicely to a db schema.
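[A sketch of what that validation could look like -- illustrative only, with a trimmed field list, using Python's jsonschema. Every row has the same fixed keys, so new volume types never change the schema and strict validation still works:]

import jsonschema

# Every row carries the same fixed keys; the volume type is just data,
# so admin-defined types never change the schema itself.
quota_row = {
    "type": "object",
    "properties": {
        "type": {"type": "string"},
        "snapshots": {"type": "integer"},
        "gigabytes": {"type": "integer"},
    },
    "required": ["type", "snapshots", "gigabytes"],
    "additionalProperties": False,
}
quota_set_schema = {
    "type": "object",
    "properties": {"quota_set": {"type": "array", "items": quota_row}},
    "required": ["quota_set"],
    "additionalProperties": False,
}

response = {"quota_set": [
    {"type": "project", "snapshots": 10, "gigabytes": 1000},
    {"type": "ceph", "snapshots": -1, "gigabytes": 100},
]}
jsonschema.validate(response, quota_set_schema)  # strict validation now works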
---- On Tue, 17 Sep 2019 02:40:36 +0900 Eric Harney <eharney@redhat.com> wrote ----
There's a lot of talk here about interoperability problems... what are those problems, exactly? If we ignore Ocata and just look at Train -- why is this API not problematic for interoperability there, when requests on different clouds would return different data, depending on how types are configured?
It's not clear to me that rectifying the microversion concerns around the "groups" field is helpful without also understanding this piece, because if the concern is that different clouds return different fields for this API -- that will still happen. We need more detail to understand how to address this, and what the problem is that we are trying to solve exactly.
There are two things here:

1. API behaviour depends on the backend. This was discussed two years back as well, and the Tempest team along with the cinder team decided not to test backend-specific behaviour in Tempest[1].

2. The API is changed without versioning.

The second one is the issue here. Any API changed without versioning causes the interoperability issue. The new field is returned for older microversions too, on the same backend.

*Why this is an interoperability issue: CloudA, with a given configuration and backend, is upgraded and its API returns the new field. I deploy my app on that cloud and use that field. Now CloudB, with the same configuration and same backend, is not upgraded yet, so its API does not return the new field. Now I want to move my app from CloudA to CloudB and it will fail, because the CloudB API does not have that new field. And I cannot check what version it got added in; there is no mechanism for the app to discover whether that field is expected in a given cloud. So this is a very clear case of an interoperability problem. There is no way for the end-user to discover the API change, which is a real pain point for them. Note: clouds with the same backend and same configuration have different API behaviour.

We should consider the addition of a new field the same as deleting or modifying (name or type) any field in the API.
(Other than the problem that Tempest currently fails on Ocata. My inclination is still that the Tempest tests could just be wrong.)
Ocata gate is going to be solved by https://review.opendev.org/#/c/681950/

-gmann

[1] http://lists.openstack.org/pipermail/openstack-dev/2017-May/116172.html
On 9/16/19 8:01 PM, Ghanshyam Mann wrote:
There are two things here: 1. API behaviour depends on the backend. This was discussed two years back as well, and the Tempest team along with the cinder team decided not to test backend-specific behaviour in Tempest[1].
This is wrong. Nothing about what is happening in this API is backend-specific.
2. The API is changed without versioning.

The second one is the issue here. Any API changed without versioning causes the interoperability issue. The new field is returned for older microversions too, on the same backend.
If the concern is that different fields can be returned as part of quota info, it's worth understanding that fixing the Ocata tempest failures won't fix your concern, because this API still returns dynamic fields when the deployment is using per-type quotas, even on master. Are those considered "changes"? Need concrete details here.
*Why this is an interoperability issue: CloudA, with a given configuration and backend, is upgraded and its API returns the new field. I deploy my app on that cloud and use that field. Now CloudB, with the same configuration and same backend, is not upgraded yet, so its API does not return the new field. Now I want to move my app from CloudA to CloudB and it will fail, because the CloudB API does not have that new field. And I cannot check what version it got added in; there is no mechanism for the app to discover whether that field is expected in a given cloud. So this is a very clear case of an interoperability problem.

There is no way for the end-user to discover the API change, which is a real pain point for them. Note: clouds with the same backend and same configuration have different API behaviour.

We should consider the addition of a new field the same as deleting or modifying (name or type) any field in the API.
This seems to imply that the whole Cinder per-type quota feature is invalid, or implemented in an invalid way. Is the concern about how things are expressed in the API, or the broader features?
(Other than the problem that Tempest currently fails on Ocata. My inclination is still that the Tempest tests could just be wrong.)
Ocata gate is going to be solved by https://review.opendev.org/#/c/681950/
Fixing Ocata is great, but I'd like to settle the bigger questions about this API that you are raising. I'd prefer to not end up worrying about these same problems the next time someone writes tests for this API, or makes a change to it. What would be a valid way to design it that meets the concerns around interop?
---- On Tue, 17 Sep 2019 06:22:09 -0700 Eric Harney <eharney@redhat.com> wrote ----
On 9/16/19 8:01 PM, Ghanshyam Mann wrote:
---- On Tue, 17 Sep 2019 02:40:36 +0900 Eric Harney <eharney@redhat.com> wrote ----
On 9/16/19 6:02 AM, Ghanshyam Mann wrote:
---- On Mon, 16 Sep 2019 18:53:58 +0900 Ghanshyam Mann <gmann@ghanshyammann.com> wrote ----
Hello Everyone,
As per discussion over ML, Tempest started the JSON schema strict validation for Volume APIs response [1]. Because it may affect the interop certification, it was explained to the Interop team as well as in the Board of Director meeting[2].
In Train, Tempest started implementing the validation and found an API change where the new field was added in API response without versioning[3] (Cinder has API microversion mechanism). IMO, that was not the correct way to change the API and as per API-WG guidelines[4] any field added/modified/removed in API should be with microverison(means old versions/user should not be affected by that change) and must for API interoperability.
With JSON schema validation, Tempest verifies the API interoperability recommended behaviour by API-WG. But as per IRC conversion with cinder team, we have different opinion on API interoperability and how API should be changed under microversion mechanism. I would like to have a conclusion on this so that Tempest can verify or leave the Volume API for strict validation.
I found the same flow chart what Sean created in Nova about "when to bump microverison" in Cinder also which clearly say any addition to response need new microversion. - https://docs.openstack.org/cinder/latest/contributor/api_microversion_dev.ht...
-gmann
I don't believe that it is clear that a microversion bump was required for the "groups" response showing up in a GET quota-sets response, and here's why:
This API has, since at least Havana, returned dynamic fields based on quotas that are assigned to volume types. i.e.:
$ cinder --debug quota-show b73b1b7e82a247038cd01a441ec5a806 DEBUG:keystoneauth:RESP BODY: {"quota_set": {"per_volume_gigabytes": -1, "volumes_ceph": -1, "groups": 10, "gigabytes": 1000, "backup_gigabytes": 1000, "snapshots": 10, "volumes_enc": -1, "snapshots_enc": -1, "snapshots_ceph": -1, "gigabytes_ceph": -1, "volumes": 10, "gigabytes_enc": -1, "backups": 10, "id": "b73b1b7e82a247038cd01a441ec5a806"}}
"gigabytes_ceph" is in that response because there's a "ceph" volume type defined, same for "gigabytes_enc", etc.
This puts this API alongside something more like listing volume types -- you get a list of what's defined on the deployment, not a pre-baked list of defined fields.
Complaints about the fact that "groups" being added without a microversion imply that these other dynamic fields shouldn't be in this response either -- but this is how this API works.
There's a lot of talk here about interoperability problems... what are those problems, exactly? If we ignore Ocata and just look at Train -- why is this API not problematic for interoperability there, when requests on different clouds would return different data, depending on how types are configured?
It's not clear to me that rectifying the microversion concerns around the "groups" field is helpful without also understanding this piece, because if the concern is that different clouds return different fields for this API -- that will still happen. We need more detail to understand how to address this, and what the problem is that we are trying to solve exactly.
There are two things here. 1. API behaviour depends on backend. This has been discussed two years back also and Tempest team along with cinder team decided not to test the backend-specific behaviour in Tempest[1].
This is wrong. Nothing about what is happening in this API is backend-specific.
I agree, and I was just giving the background information about backend-specific feature/API behaviour testing.
If the concern is that different fields can be returned as part of quota info, it's worth understanding that fixing the Ocata tempest failures won't fix your concern, because this API still returns dynamic fields when the deployment is using per-type quotas, even on master.
Are those considered "changes"? Need concrete details here.
+1. The ocata fix is separate from this discussion; that has to be done whenever Tempest starts failing on Ocata for any reason.
This seems to imply that the whole Cinder per-type quota feature is invalid, or implemented in an invalid way. Is the concern about how things are expressed in the API, or the broader features?
I am also concerned about the general rule for the Cinder API, which is important for the Tempest BP of strict volume validation[1]. Let me ask clearly:

1. Does Cinder consider that adding a new field to any API does not need a microversion bump, as opposed to this doc[2]?
2. Or is it only the quota API where adding a new field is considered to need no microversion bump? If so, how many such dynamic APIs does Cinder have?

[1] https://blueprints.launchpad.net/tempest/+spec/volume-response-schema-valida...
[2] https://docs.openstack.org/cinder/latest/contributor/api_microversion_dev.ht... https://specs.openstack.org/openstack/api-wg/guidelines/api_interoperability...

-gmann
On Tue, Sep 17, 2019 at 6:23 Eric Harney <eharney@redhat.com> wrote:
If the concern is that different fields can be returned as part of quota info, it's worth understanding that fixing the Ocata tempest failures won't fix your concern, because this API still returns dynamic fields when the deployment is using per-type quotas, even on master.
Are those considered "changes"? Need concrete details here.
It is not difficult to answer this question. The Cinder official API document[1] says the field "groups" is always returned (not optional) in the API response. IIUC, the above dynamic fields are not written in the document, right? In addition, Cinder implements microversions which clearly control the addition of API fields, as in [2]. But the field "groups" has actually been added without bumping a microversion, against [2]. Then Ghanshyam raises the interoperability concern.

What is the interoperability concern? The concern is that an application written based on the official API document can work on newer clouds (Pike+ in this case) but cannot work on older clouds (Ocata-). Actually, we can consider Tempest one such application consuming OpenStack APIs, and we implemented JSON-Schema validation based on the official API document. Then Tempest could not work on older clouds (stable/ocata), and that is clearly an interoperability issue. (We Tempest reviewers read the API document carefully when reviewing JSON-Schema patches; that is how we found Cinder document issues and fixed them :-)

BTW, IMO backwards compatibility is more important than interoperability, so we should not remove the field "groups" from the base microversion anyway.

Thanks
Kenichi Omichi

---
[1]: https://docs.openstack.org/api-ref/block-storage/v3/index.html?expanded=show...
[2]: https://docs.openstack.org/cinder/latest/contributor/api_microversion_dev.ht... (the part of "the list of attributes and data structures returned")
On 9/16/2019 4:53 AM, Ghanshyam Mann wrote:
I would like to have a conclusion on this so that Tempest can verify or leave the Volume API for strict validation.
In Nova, when changing the request or response contents, it's pretty simple - we always microversion those. Why? Because if my application is written to work against different OpenStack clouds, maybe multiple public clouds or maybe a hybrid environment where it's running in both my local private cloud and in, let's say, VEXXHOST, and I need to know what I can do on each cloud, I use microversions. E.g. just because something works on the version of OpenStack I have running in private doesn't mean it's going to work in the public cloud provider I'm using, and vice-versa.

I think generally the people that are OK with not versioning additions to a response justify it by saying if the app code isn't looking for that specific field anyway they won't notice or care, and if they are looking for a specific field, they should treat all fields as optional and fall back gracefully. That may work in some cases, but I don't think that's something we should follow as a general rule, since we don't know the situation for the app code or how the field is going to be used. For example, maybe the app code wouldn't make the initial request to perform some action on the resource if it knew it couldn't properly monitor it with a later GET response. So rather than say a microversion for response changes is OK in some cases and not others, just keep it simple and always require it.

The problem we've faced in nova several times when asking if we need a microversion is more about behavioral changes that signal when you can do something in the API, since that's a grey area. For example, we added a microversion when we added support for multiattach volumes even though the type of volume you're using or the compute driver your VM is on might not support multiattach. Feature discovery is still a problem in OpenStack, but at least with the microversion you can determine, via API version discovery, which cloud supports the feature at a basic level and which doesn't. Any issues you hit after that are likely due to the cloud provider's configuration, which as a user, yes, that sucks, but we as a community haven't solved the "capability discovery" problem and likely never will at this rate of development.

Anyway, that's a tangent, but my point is it's much easier to enforce a consistent development policy for request/response changes than it is for behavioral changes.

--
Thanks,
Matt
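[A sketch of the two client styles contrasted above. The URL, token, and the 3.50 microversion are made up for illustration; the OpenStack-API-Version and X-Auth-Token headers are the real mechanisms.]

import requests

ENDPOINT = "http://cloud.example.com:8776/v3/PROJECT_ID/os-quota-sets/PROJECT_ID"  # hypothetical
TOKEN = "gAAAA..."  # hypothetical

def get_quota_set(url, token, microversion=None):
    headers = {"X-Auth-Token": token}
    if microversion:
        # Pin the microversion so the response schema is the same on every
        # cloud that supports it; unsupported versions fail loudly (406).
        headers["OpenStack-API-Version"] = "volume %s" % microversion
    resp = requests.get(url, headers=headers)
    resp.raise_for_status()
    return resp.json()["quota_set"]

# Unversioned, sniff-and-fallback style (the approach argued against above):
quota = get_quota_set(ENDPOINT, TOKEN)
groups = quota.get("groups")  # None on an older cloud -- but you can't tell why

# Version-gated style ("3.50" is a made-up number for illustration):
quota = get_quota_set(ENDPOINT, TOKEN, microversion="3.50")
groups = quota["groups"]  # present by contract at this version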
participants (8):
- Eric Harney
- Ghanshyam Mann
- Kenichi Omichi
- Matt Riedemann
- Michael McCune
- Sean McGinnis
- Sean Mooney
- zhu.fanglei@zte.com.cn