[keystone][placement][neutron][api-sig] http404 to NotFound, or how should a http json error body look like?
Hi, Recently I planned to add tests to the neutron feature routed provider networks, which actually uses placement API to make possible the scheduling based on neutron segment information. I realized that the feature last worked on Queens. We can discuss separately the lack of test coverage, and similar things, but I would like to narrow or view now. The neutron bug report, with some background information: https://bugs.launchpad.net/neutron/+bug/1828543 After some debugging I realized that the problem comes from the fact that at a point neutron uses the information received in http404, like this: try: return self._get(url).json() except keystoneauth1.exceptions.NotFound as e: if 'foo' in e.details: do_something() see: https://opendev.org/openstack/neutron-lib/src/branch/master/neutron_lib/plac... keystoneauth1 expects the http body in case of for example http404 to be something like this (python dict comes now): body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} see: https://opendev.org/openstack/keystoneauth/src/branch/master/keystoneauth1/e... But placement started to adopt one suggestion from the API-SIG: http://specs.openstack.org/openstack/api-wg/guidelines/errors.html, and that is something like this (again python): body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The resource could not be found.... ', 'request_id': '...'}]} Shall I ask for help how I should make this bug in neutron fixed? As a quick and dirty solution in neutron I can use the response from the exception, but I feel that there should be a better way to fix this:-) Thanks in advance for the help Regards LAjos
On Fri, 17 May 2019, Lajos Katona wrote:
keystoneauth1 expects the http body in case of for example http404 to be something like this (python dict comes now): body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} see: https://opendev.org/openstack/keystoneauth/src/branch/master/keystoneauth1/e...
But placement started to adopt one suggestion from the API-SIG: http://specs.openstack.org/openstack/api-wg/guidelines/errors.html,
and that is something like this (again python): body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The resource could not be found.... ', 'request_id': '...'}]}
Thanks for all the detail in this message and on the bug report. It helps make understanding what's going on a lot easier. As you've discovered placement is following the guidelines for how errors are supposed to be formatted. If keystoneauth1 can't speak that format, that's probably the primary bug. However, it also sounds like you're encountering a few different evolutions in placement that may need to be addressed in older versions of neutron's placement client: * For quite a while placement was strict about responding to Accept headers appropriately. This was based on its interaction with Webob. If you didn't ask for json in the Accept header, errors could come in HTML or Text. The most reliable fix for this in any client of any API is to always send an Accept header that states how you want responses to be presented (in this case application/json). This can lead to interesting parsing troubles if you are rely on the bodies of responses. * In microversion 1.23 we started sending 'code' with error responses in an effort to avoid needing to parse error responses. I've got a few different suggestions that you might want to explore. None of them are a direct fix for the issue you are seeing but may lead to some ideas. First off, this would be a big change, but I do not think it is good practice to raise exceptions when getting 4xx responses. Instead in the neutron-lib placement client it would be better to branch on status code in the resp object. If it doesn't matter why you 404d, just that you did, you could log just that, not the detail. Another thing to think about is in the neutron placement client you have a get_inventory [1] method which has both resource provider uuid and resource class in the URL and thus can lead to the "two different types of 404s" issue that is making parsing the error response required. You could potentially avoid this by implementing get_inventories [2] which would only 404 on bad resource provider uuid and would return inventory for all resource classes on that rp. You can use PUT on the same URL to replace all inventory on that rp. Make sure you send Accept: application/json in all your requests in the client. Make keystoneauth1 interpret two diferent types of errors response: the one it does, and the one in the api-sig guidelines. Note that I've yet to see any context where there is more than one error in the list, so it is always errors[0] that gets inspected. On the placement side we should probably add codes to the few different places where a 404 can happen for different reasons (they are almost all combinations of not finding a resource provider or not finding a resource class). If that were present you could branch the library code on the code in the errors structure instead of the detail. However if its possible to avoid choosing which 404, that's probably better. I hope some of that helps. Hopefully some keystoneauth folk will chime in too. [1] https://opendev.org/openstack/neutron-lib/src/branch/master/neutron_lib/plac... [2] https://developer.openstack.org/api-ref/placement/#resource-provider-invento... -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent
Thanks Chris for the reply, let's wait for keystone folks for comments. On 2019. 05. 17. 13:42, Chris Dent wrote:
On Fri, 17 May 2019, Lajos Katona wrote:
keystoneauth1 expects the http body in case of for example http404 to be something like this (python dict comes now): body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} see: https://protect2.fireeye.com/url?k=e824ec9f-b4f7fcf7-e824ac04-8691b328a8b8-1c7b10b9a8d8e9c8&q=1&u=https%3A%2F%2Fopendev.org%2Fopenstack%2Fkeystoneauth%2Fsrc%2Fbranch%2Fmaster%2Fkeystoneauth1%2Fexceptions%2Fhttp.py%23L406-L415
But placement started to adopt one suggestion from the API-SIG: http://specs.openstack.org/openstack/api-wg/guidelines/errors.html,
and that is something like this (again python): body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The resource could not be found.... ', 'request_id': '...'}]}
Thanks for all the detail in this message and on the bug report. It helps make understanding what's going on a lot easier.
As you've discovered placement is following the guidelines for how errors are supposed to be formatted. If keystoneauth1 can't speak that format, that's probably the primary bug.
However, it also sounds like you're encountering a few different evolutions in placement that may need to be addressed in older versions of neutron's placement client:
* For quite a while placement was strict about responding to Accept headers appropriately. This was based on its interaction with Webob. If you didn't ask for json in the Accept header, errors could come in HTML or Text. The most reliable fix for this in any client of any API is to always send an Accept header that states how you want responses to be presented (in this case application/json). This can lead to interesting parsing troubles if you are rely on the bodies of responses.
* In microversion 1.23 we started sending 'code' with error responses in an effort to avoid needing to parse error responses.
I've got a few different suggestions that you might want to explore. None of them are a direct fix for the issue you are seeing but may lead to some ideas.
First off, this would be a big change, but I do not think it is good practice to raise exceptions when getting 4xx responses. Instead in the neutron-lib placement client it would be better to branch on status code in the resp object.
If it doesn't matter why you 404d, just that you did, you could log just that, not the detail.
Another thing to think about is in the neutron placement client you have a get_inventory [1] method which has both resource provider uuid and resource class in the URL and thus can lead to the "two different types of 404s" issue that is making parsing the error response required. You could potentially avoid this by implementing get_inventories [2] which would only 404 on bad resource provider uuid and would return inventory for all resource classes on that rp. You can use PUT on the same URL to replace all inventory on that rp.
Make sure you send Accept: application/json in all your requests in the client.
Make keystoneauth1 interpret two diferent types of errors response: the one it does, and the one in the api-sig guidelines. Note that I've yet to see any context where there is more than one error in the list, so it is always errors[0] that gets inspected.
On the placement side we should probably add codes to the few different places where a 404 can happen for different reasons (they are almost all combinations of not finding a resource provider or not finding a resource class). If that were present you could branch the library code on the code in the errors structure instead of the detail. However if its possible to avoid choosing which 404, that's probably better.
I hope some of that helps. Hopefully some keystoneauth folk will chime in too.
[2] https://developer.openstack.org/api-ref/placement/#resource-provider-invento...
On Mon, May 20, 2019, at 05:40, Lajos Katona wrote:
Thanks Chris for the reply, let's wait for keystone folks for comments.
Hi!
On 2019. 05. 17. 13:42, Chris Dent wrote:
On Fri, 17 May 2019, Lajos Katona wrote:
keystoneauth1 expects the http body in case of for example http404 to be something like this (python dict comes now): body= {'error': {'message': 'Foooooo', 'details': 'Baaaaar'}} see: https://protect2.fireeye.com/url?k=e824ec9f-b4f7fcf7-e824ac04-8691b328a8b8-1c7b10b9a8d8e9c8&q=1&u=https%3A%2F%2Fopendev.org%2Fopenstack%2Fkeystoneauth%2Fsrc%2Fbranch%2Fmaster%2Fkeystoneauth1%2Fexceptions%2Fhttp.py%23L406-L415
But placement started to adopt one suggestion from the API-SIG: http://specs.openstack.org/openstack/api-wg/guidelines/errors.html,
and that is something like this (again python): body={'errors': [{'status': 404, 'title': 'Not Found', 'detail': 'The resource could not be found.... ', 'request_id': '...'}]}
Thanks for all the detail in this message and on the bug report. It helps make understanding what's going on a lot easier.
As you've discovered placement is following the guidelines for how errors are supposed to be formatted. If keystoneauth1 can't speak that format, that's probably the primary bug.
[snipped]
Make keystoneauth1 interpret two diferent types of errors response: the one it does, and the one in the api-sig guidelines. Note that I've yet to see any context where there is more than one error in the list, so it is always errors[0] that gets inspected.
We'll happily accept patches to keystoneauth that make it compliant with the API-SIG's guidelines (as long as it is backwards compatible). I gotta say, though, this guideline on error handling really takes me aback. Why should a single HTTP request ever result in a list of errors, plural? Is there any standard, pattern, or example *outside* of OpenStack where this is done or recommended? Why? Colleen
On Mon, 20 May 2019, Colleen Murphy wrote:
Make keystoneauth1 interpret two diferent types of errors response: the one it does, and the one in the api-sig guidelines. Note that I've yet to see any context where there is more than one error in the list, so it is always errors[0] that gets inspected.
We'll happily accept patches to keystoneauth that make it compliant with the API-SIG's guidelines (as long as it is backwards compatible).
I gotta say, though, this guideline on error handling really takes me aback. Why should a single HTTP request ever result in a list of errors, plural? Is there any standard, pattern, or example *outside* of OpenStack where this is done or recommended? Why?
I can't remember the exact details (it was 4 years ago [1]) but I think the rationale was that if there was a call behind the call it would be useful and important to be able to report a stack of errors: I called neutron and it failed like this, and I failed as a result of that failure, like this. I agree it is a bit weird but it seems the guideline had some acclaim at the time so... Ed or Monty may have additional recollections.k [1] https://review.opendev.org/#/c/167793/ -- Chris Dent ٩◔̯◔۶ https://anticdent.org/ freenode: cdent
On Mon, May 20, 2019 at 10:59 AM Colleen Murphy <colleen@gazlene.net> wrote:
I gotta say, though, this guideline on error handling really takes me aback. Why should a single HTTP request ever result in a list of errors, plural? Is there any standard, pattern, or example *outside* of OpenStack where this is done or recommended? Why?
hi Colleen, i will attempt to answer your questions, but it has been quite awhile since i (apparently) approved that guideline. i don't remember the specifics of that discussion, but as Chris pointed out the idea was that there are conditions where a single call will result in several errors that /may/ be of interest to caller. we decided that adding a list for this object was a low bar to pass for being able to support those conditions. for example, a call to sahara to deploy a hadoop cluster might result in several nodes of the cluster failing to deploy and thus an error for a call to "create an entire hadoop cluster" might need to return a dense error message. (note, i don't think this is how it's actually done, but i just wanted to provide an example and i happen to know sahara better than other services) to your question about other examples outside of openstack, i can't provide anything further. i vaguely recall that we had a suggested format that included the list of errors, but in looking back through my materials i can't find it. i want to say that it was something like a json-home (it wasn't json-home, but similar in spirit) type spec in which it was suggested. these guidelines are not written in stone, and perhaps we need to revisit this one. although given that folks are now implementing it, that might be troublesome in other ways. i hope that helps at least shed some light on the thought process. peace o/
hi, just wanted to post a followup to this discussion. during our office hours today we discussed this conversation again, i think the ultimate output is that i will take a look at adding a patch for keystoneauth to be more tolerant of the error payload variances. thanks again for the discussions =) peace o/
Hi Michael, Thanks. That would be the best. I checked another way to do the error processing on the client side like nova does (see the wip patches: https://review.opendev.org/662204 & https://review.opendev.org/662205) but that would mean everybody has to do the same who start using an API which was changed to list of errors. Regards Lajos Michael McCune <elmiko@redhat.com> ezt írta (időpont: 2019. máj. 30., Cs, 19:15):
hi, just wanted to post a followup to this discussion.
during our office hours today we discussed this conversation again, i think the ultimate output is that i will take a look at adding a patch for keystoneauth to be more tolerant of the error payload variances.
thanks again for the discussions =)
peace o/
participants (5)
-
Chris Dent
-
Colleen Murphy
-
Lajos Katona
-
Lajos Katona
-
Michael McCune