[Openstack-operators] [Product] [openstack-dev] [all][log] Openstack HTTP error codes

Duncan Thomas duncan.thomas at gmail.com
Sat Jan 31 10:24:56 UTC 2015


Hi

This discussion came up at the cinder mid-cycle last week too, specifically
in the context of 'Can we change the details text in an existing error, or
is that an unacceptable API change'.

I have to second the security / operational concerns about exposing too
much granularity of failure in these error codes.

For cases where there is something wrong with the request (item out of
range, invalid names, feature not supported, etc) I totally agree that we
should have a good, clear, parsable response, and standardisation would be
good. Having some fixed part of the response (whether a numeric code or, as
I tend to prefer, a CamelCaseDescription so that I don't have to go look it
up) and a human readable description section that is subject to change
seems sensible.
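
A minimal sketch of that split, assuming a JSON body. The field names and
the build_error helper are made up for illustration and are not an existing
OpenStack API:

```python
import json


def build_error(code, message, http_status=400):
    """Build a JSON error body with a fixed code and a free-form message."""
    return json.dumps({
        "error": {
            "code": code,            # fixed, machine-readable part
            "message": message,      # human-readable, wording may change
            "status": http_status,
        }
    })


# Two releases may word the message differently; the code stays the same.
old_body = json.loads(build_error("SecurityGroupNotFound",
                                  "Security group bar not found."))
new_body = json.loads(build_error("SecurityGroupNotFound",
                                  "No security group named 'bar' exists."))
```

A client that keys on error.code keeps working when the message text is
reworded.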

What I would rather not see is leakage of information when something
internal to the cloud goes wrong that the tenant can do nothing about.
We certainly shouldn't be leaking internal implementation details like
vendor details - that is what request IDs and logs are for. The whole point
of the cloud, to me, is that separation between the things a tenant
controls (what they want done) and what the cloud provider controls (the
details of how the work is done).

For example, if a create volume request fails because cinder-scheduler has
crashed, all the tenant should get back is 'Things are broken, try again
later or pass request id 1234-5678-abcd-def0 to the cloud admin'. They
should not need to, or even be allowed to, care about the details of the
failure; it is not their domain.
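
Roughly what I mean, as a sketch only. RequestError, call_api and the
response fields are hypothetical names, not code from any OpenStack
project:

```python
import logging
import uuid

LOG = logging.getLogger("example.api")


class RequestError(Exception):
    """Hypothetical: the request itself is wrong, so it is safe to report."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def call_api(operation):
    """Run an operation and hide internal failure details from the tenant."""
    request_id = "req-" + str(uuid.uuid4())
    try:
        return {"status": 200, "result": operation(), "request_id": request_id}
    except RequestError as exc:
        # The tenant can fix this, so tell them exactly what was wrong.
        return {"status": 400, "code": exc.code, "message": str(exc),
                "request_id": request_id}
    except Exception:
        # Internal failure: full details go to the log, keyed by request id.
        LOG.exception("internal failure, request_id=%s", request_id)
        return {"status": 500, "code": "InternalError",
                "message": "Things are broken, try again later or pass "
                           "the request id to the cloud admin.",
                "request_id": request_id}


def broken_scheduler():
    # Stand-in for an internal failure such as a crashed scheduler.
    raise ConnectionError("cinder-scheduler unreachable on host sched-01")


response = call_api(broken_scheduler)
```

The host name and exception text end up in the operator's log only; the
tenant sees a generic message plus the request id.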



On 30 January 2015 at 02:34, Rochelle Grober <rochelle.grober at huawei.com>
wrote:

> Hi folks!
>
> Changed the tags a bit because this is a discussion for all projects and
> dovetails with logging rationalization/standards/
>
> At the Paris summit, we had a number of sessions on logging that kept
> circling back to Error Codes.  But these codes would not be HTTP codes,
> rather, as others have pointed out, codes related to the calling entities
> and referring entities and the actions that happened or didn’t.  Format
> suggestions were gathered from the Operators and from some senior
> developers.  The Logging Working Group is planning to put forth a spec for
> discussion on formats and standards before the Ops mid-cycle meetup.
>
> Working from a Glance proposal on error codes:
> https://review.openstack.org/#/c/127482/ and discussions with operators
> and devs, we have a strawman to propose.  We also have a number of
> requirements from Ops and some Devs.
>
> Here is the basic idea:
>
> Code for logs would have four segments:
>
>       Project          Vendor/Component      Error catalog no.  Criticality
> Def   [A-Z][A-Z][A-Z]- [{0-9}|{A-Z}][A-Z]-   [0000-9999]-       [0-9]
> Ex.   CIN-             NA-                   0001-              2
>       Cinder           NetApp driver         error no.          criticality
> Ex.   GLA-             0A-                   0051-              3
>       Glance           API                   error no.          criticality
>
> Three letters for project, either a two-letter vendor code or a digit and
> letter (0 + letter for an internal component of the project, like API=0A,
> Controller=0C, etc.), a four-digit error number which could be subsetted
> for even finer granularity, and a criticality number.
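
To make the strawman format concrete, here is a small parser sketch. The
regular expression is my reading of the Def row above, and the LogCode and
parse_log_code names are invented for this example; none of it comes from
the spec or from Oslo.log:

```python
import re
from typing import NamedTuple

# project (3 letters) - vendor/component (letter or digit, then letter)
# - error catalog number (4 digits) - criticality (1 digit)
CODE_RE = re.compile(r"([A-Z]{3})-([0-9A-Z][A-Z])-([0-9]{4})-([0-9])")


class LogCode(NamedTuple):
    project: str
    component: str
    error: int
    criticality: int


def parse_log_code(text):
    """Split a code such as 'CIN-NA-0001-2' into its four segments."""
    match = CODE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid log code: {text!r}")
    project, component, error, criticality = match.groups()
    return LogCode(project, component, int(error), int(criticality))


cinder = parse_log_code("CIN-NA-0001-2")   # Cinder, NetApp driver
glance = parse_log_code("GLA-0A-0051-3")   # Glance, API component
```
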
>
> This is for logging purposes and tracking down root cause faster for
> operators, but if an error is generated, why can't the same codes be used
> internally for the code as externally for the logs?  This also allows for a
> unique message to be associated with the error code that is more
> descriptive and that can be pre translated.  Again, for logging purposes,
> the error code would not be part of the message payload, but part of the
> headers.  Referrer IDs and other info would still be expected in the
> payload of the message and could include instance ids/names, NICs or VIFs,
> etc.  The message header code is in Oslo.log and, when using the Oslo.log
> library, will be easy to use.
>
> Since this discussion came up, I thought I needed to get this info out to
> folks and advertise that anyone will be able to comment on the spec to
> drive it to agreement.  I will be advertising it here and on Ops and
> Product-WG mailing lists.  I’d also like to invite anyone who wants to
> participate in discussions to join them.  We’ll be starting a bi-weekly or
> weekly IRC meeting (also announced in the stated MLs) in February.
>
> And please realize that other than Oslo.log, the changes to make the
> errors more useable will be almost entirely community created standards
> with community created tools to help enforce them.  None of which exist
> yet, FYI.
>
> --RockyG
>
> From: Eugeniya Kudryashova [mailto:ekudryashova at mirantis.com]
> Sent: Thursday, January 29, 2015 8:33 AM
> To: openstack-dev at lists.openstack.org
> Subject: [openstack-dev] [api][nova] Openstack HTTP error codes
>
>
> Hi, all
>
>
>
> Openstack APIs interact with each other and external systems partly by
> passing HTTP errors. The only meaningful difference between types of
> exceptions is the HTTP code, but the current codes are generalized, so an
> external system can’t distinguish what actually happened.
>
>
> As an example, the two different failures below differ only by error message:
>
>
> request:
>
> POST /v2/790f5693e97a40d38c4d5bfdc45acb09/servers HTTP/1.1
> Host: 192.168.122.195:8774
> X-Auth-Project-Id: demo
> Accept-Encoding: gzip, deflate, compress
> Content-Length: 189
> Accept: application/json
> User-Agent: python-novaclient
> X-Auth-Token: 2cfeb9283d784cfba694f3122ef413bf
> Content-Type: application/json
>
>
> {"server": {"name": "demo", "imageRef":
> "171c9d7d-3912-4547-b2a5-ea157eb08622", "key_name": "test", "flavorRef":
> "42", "max_count": 1, "min_count": 1, "security_groups": [{"name": "bar"}]}}
>
> response:
>
> HTTP/1.1 400 Bad Request
> Content-Length: 118
> Content-Type: application/json; charset=UTF-8
> X-Compute-Request-Id: req-a995e1fc-7ea4-4305-a7ae-c569169936c0
> Date: Fri, 23 Jan 2015 10:43:33 GMT
>
>
> {"badRequest": {"message": "Security group bar not found for project
> 790f5693e97a40d38c4d5bfdc45acb09.", "code": 400}}
>
>
> and
>
>
> request:
>
> POST /v2/790f5693e97a40d38c4d5bfdc45acb09/servers HTTP/1.1
> Host: 192.168.122.195:8774
> X-Auth-Project-Id: demo
> Accept-Encoding: gzip, deflate, compress
> Content-Length: 192
> Accept: application/json
> User-Agent: python-novaclient
> X-Auth-Token: 24c0d30ff76c42e0ae160fa93db8cf71
> Content-Type: application/json
>
>
> {"server": {"name": "demo", "imageRef":
> "171c9d7d-3912-4547-b2a5-ea157eb08622", "key_name": "foo", "flavorRef":
> "42", "max_count": 1, "min_count": 1, "security_groups": [{"name":
> "default"}]}}
>
> response:
>
> HTTP/1.1 400 Bad Request
> Content-Length: 70
> Content-Type: application/json; charset=UTF-8
> X-Compute-Request-Id: req-87604089-7071-40a7-a34b-7bc56d0551f5
> Date: Fri, 23 Jan 2015 10:39:43 GMT
>
>
> {"badRequest": {"message": "Invalid key_name provided.", "code": 400}}
>
>
> The former specifies an incorrect security group name, and the latter an
> incorrect keypair name. The problem is that, just by looking at the
> response body and HTTP response code, an external system can’t tell
> what exactly went wrong. Parsing the error messages is not the way
> we’d like to solve this problem.
>
>
> One existing approach to this problem is the AWS EC2 exception codes [1].
>
>
> So if we have some service based on Openstack projects, it would be useful
> to have concrete error codes (textual or numeric) that identify what
> actually went wrong, so that the exception can later be processed
> correctly. These codes should be predefined for each exception, have a
> documented structure, and allow the exception to be parsed correctly at
> each step of exception handling.
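
As an illustration of what a client could then do, here is a sketch that
dispatches on a stable code instead of the message text. The "error_code"
field and both code names are assumptions for the example; the real
responses quoted above carry no such field:

```python
import json

# Hypothetical bodies: same HTTP 400, but each carries a distinct code.
SECGROUP = json.dumps({"badRequest": {
    "message": "Security group bar not found.",
    "code": 400, "error_code": "SecurityGroupNotFound"}})
KEYPAIR = json.dumps({"badRequest": {
    "message": "Invalid key_name provided.",
    "code": 400, "error_code": "InvalidKeyName"}})


def error_code(body_text):
    """Return the stable code from a fault body, or 'Unknown' if absent."""
    body = json.loads(body_text)
    fault = next(iter(body.values()))   # the single fault object
    return fault.get("error_code", "Unknown")


# What the calling system does next depends on the code, not the wording.
ACTIONS = {
    "SecurityGroupNotFound": "create or pick another security group",
    "InvalidKeyName": "upload or pick another keypair",
}


def next_step(body_text):
    return ACTIONS.get(error_code(body_text), "report to operator")
```
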
>
>
> So I’d like to discuss implementing such codes and their usage in
> Openstack projects.
>
> [1] -
> http://docs.aws.amazon.com/AWSEC2/latest/APIReference/errors-overview.html
>



-- 
Duncan Thomas