[openstack-dev] [openstack-sdk-php] discussion: json schema to define apis
Shaunak Kashyap
shaunak.kashyap at RACKSPACE.COM
Tue Apr 29 02:33:49 UTC 2014
Yes, thanks Jamie for the very thorough write-ups and Matt for the thoughtful comments.
My comments are inline below. A point of clarification: I’m using “SDK authors” to mean all of us who are building the SDK and I’m using “SDK consumers” to mean the end-user developers who will be using the SDK.
Thanks,
Shaunak
On Apr 28, 2014, at 6:29 PM, Matthew Farina <matt at mattfarina.com> wrote:
> Jamie, thanks for going into so much detail.
>
>
> On Mon, Apr 28, 2014 at 9:28 PM, Matthew Farina <matt at mattfarina.com> wrote:
> While reading this it struck me that we should prioritize the experience of end users, that is, application developers, over the experience of those working on the SDK. I don't think we've ever talked about this directly, so I wanted to take a moment and state it.
>
> What I put in below isn't my full set of questions but I think it's enough for now.
>
> On Mon, Apr 28, 2014 at 11:34 AM, Jamie Hannaford <jamie.hannaford at rackspace.com> wrote:
>
> Thanks Matt for bringing up these questions - I think having this kind of discussion is essential for such a big idea. It also helps me clarify my own thinking towards this issue.
>
> Before I answer, I want to point out that I'm not staunchly for or against any particular idea. I do think that schemas offer a lot of advantages over writing user-land code, but I'm more than willing to abandon the proposal if we all determine there's a stronger and more compelling alternative.
>
> 1. Why use schemas instead of userland code?
>
> I've put my answer to this question here: https://wiki.openstack.org/wiki/OpenStack-SDK-PHP/JSON-schema
>
>> Can we look at this from the experience an end user would have? In the Python SDK they are working on an ORM style system. It's sorta similar to the system currently in the PHP SDK. For example you could do something like this in Python,
>>
>> o = Container.get_by_id('foo').get_object('bar/baz.awesome')
>>
>> I would imagine something similar in PHP like,
>>
>> $o = $objectStore->getContainer('foo')->getObject('bar/baz.awesome');
>>
>> I don't think you can do this using the json schema code I've seen so far. Can you touch on the experience for developers who are using the library? For example, the coding style or ability to know what they have access to? I was just thinking of how magic methods using a schema aren't going to work for tools that do autocompletion.
Good point about the discoverability aspect. SDK consumers would need to refer to schemas during development and IDEs might not be able to grok these like they would a PHP API.
That said, the benefits mentioned on the wiki page still stand. Is there a way for SDK authors to get the benefits of using schemas while giving SDK consumers the benefits of an easily- and automatically-discoverable PHP API?
>>
>> I'm curious about blueprints for the schema support. Things on the mailing list are great. I'm curious about plans and what's in the blueprints. Do you have any info on that?
>>
>> If the other SDKs aren't interested in using json schema, wouldn't that be a lack of consistency?
This concerns me less. Consistency is useful if there’s a significant overlap amongst the consumers of different SDKs or, secondarily, amongst the authors of the different SDKs. I’m not sure there are such significant overlaps.
>
>
> 2. How will debugging work?
>
> I'll highlight two conceivable issues which might need debugging. The first issue is the API rejecting a request for whatever reason (e.g. a proxy modifying headers); the second issue is when a data structure returned from the API fails to validate against a particular schema file.
>
> Issue 1: Malformed requests
> There are two reasons why a request would fail: if an end-user stocks it with bad data, or if something in the middle deforms it. A very easy solution to the first problem is using schemas to perform basic parameter checking before a request is serialized. If we know, for example, that the API is expecting a particular value - or a particular header - the schema is in charge of making that happen. Performing basic validation catches most errors - and debugging is very easy due to the exception thrown. If you're ever in doubt, you just refer to the schema to see what was serialized into a request in the same way you do for a concrete class method.
If I understand this right, the same code path would be used to perform basic parameter checking regardless of the upstream service. The specific validation rules for each upstream service would be represented in the schema. Given this setup, when an exception is thrown, it would be awesome if we could point the SDK consumer to the line/section of the schema where the violated validation rule was defined. This way the exception is actionable for the SDK consumer.
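
As a rough illustration of the idea above (not code from the SDK), an exception could carry a pointer to the schema rule that was violated. Everything here is an assumption made for the sketch: the class SchemaValidationException, the function validateParams(), the file name server.json, and the simplified schema layout with a per-property "required" flag.

```php
<?php
// Hypothetical sketch: none of these names exist in the SDK.

class SchemaValidationException extends InvalidArgumentException
{
    private $schemaPointer;

    public function __construct($message, $schemaPointer)
    {
        parent::__construct($message . ' (see schema at ' . $schemaPointer . ')');
        $this->schemaPointer = $schemaPointer;
    }

    // Location of the violated rule, e.g. "server.json#/properties/name/type".
    public function getSchemaPointer()
    {
        return $this->schemaPointer;
    }
}

/**
 * Checks user-supplied parameters against the "properties" section of a
 * schema before the request is serialized. Only "required" and the
 * "string" type are handled, to keep the sketch short.
 */
function validateParams(array $params, array $schema, $schemaFile)
{
    foreach ($schema['properties'] as $name => $rule) {
        $pointer = $schemaFile . '#/properties/' . $name;

        if (!array_key_exists($name, $params)) {
            if (!empty($rule['required'])) {
                throw new SchemaValidationException(
                    "Missing required parameter '$name'",
                    $pointer . '/required'
                );
            }
            continue;
        }

        if (isset($rule['type']) && $rule['type'] === 'string' && !is_string($params[$name])) {
            throw new SchemaValidationException(
                "Parameter '$name' must be a string",
                $pointer . '/type'
            );
        }
    }
}

// Usage: the same code path serves every service; only the schema differs.
$schema = array(
    'properties' => array(
        'name' => array('type' => 'string', 'required' => true),
    ),
);

try {
    validateParams(array('name' => array('oops')), $schema, 'server.json');
} catch (SchemaValidationException $e) {
    echo $e->getMessage(), "\n";
    // prints: Parameter 'name' must be a string (see schema at server.json#/properties/name/type)
}
```

The pointer format is only one option; a line number in the schema file would serve the same purpose if the schema loader can track it.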
>
> If something in the middle deforms the request, the API will naturally reject it. When it comes to debugging this issue, all you need to do is wrap your original code in a try/catch block and use Guzzle's BadResponseException to return the API's response. You can easily see the type of failure through the HTTP status code, and the exact reason why the request failed. So it doesn't matter where the failure happens - all that matters is that there's a way to catch and spit out the API's response and the originating request.
>
>
>> First, this assumes Guzzle. Since we aren't tightly coupled to Guzzle we can't always assume that. But, for practical purposes we can assume it for now.
>>
>> I was curious how things would work in PHP, such as the stack trace. For magic methods you'll have a call to the magic method and to __call() where the logic actually sits. In a debugger you'll be able to step through this just fine.
>>
>> One thing that may be more difficult is that you need to know how the json schema system works in order to debug and understand what's going on: how the schemas work and how something gets translated into a method. Walking through a few extended methods would mean less logic to understand in the process.
>>
>> I'm curious how the debugging experience would be for an end user who doesn't know the json schema system but is using the library. Out of curiosity I might try to find some time to sit down with some PHP developers and see how they handle the debugging experience.
>
> Issue 2: Incorrect API data
> Say we've defined that a Server has two properties: a name (which is a string) and metadata (which is an object). If the API returns a name as an array, that obviously fails validation. When the schema code goes to validate the API data, it will raise a validation error when it comes to validate that "name" property. How you subsequently use this validation error is completely up to you: you could output it to STDOUT, you could save it to a local log on the filesystem, you could buffer it temporarily in memory.
>
> Any API data that does not validate successfully against a schema should not be presented to the end-user. So if a "created_date" property is returned, that isn't defined in our schema, it should not be populated in the resulting model. The model returned to the end-user would be a simple object that implements \ArrayAccess, meaning that it can be accessed like a simple array.
It may be okay to hide unexpected information from upstream services (such as the “created_date” property in your example) from SDK consumers but, as SDK authors, we’ll want to know when this happens. To this end it might be useful to introduce a notion of strictness here. By default strictness would be turned off but we would want to turn it on when running our integration tests.
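
A minimal sketch of the strictness idea, under stated assumptions: the function filterAgainstSchema(), the exception class UnexpectedPropertyException and the $strict flag are hypothetical names, and the schema is reduced to a flat "properties" map. PHP's built-in ArrayObject stands in for the \ArrayAccess model mentioned above.

```php
<?php
// Hypothetical sketch: function and exception names are illustrative only.

class UnexpectedPropertyException extends UnexpectedValueException
{
}

/**
 * Builds the model handed to the SDK consumer from decoded API data.
 *
 * Properties not declared in the schema are silently dropped when $strict
 * is false (the proposed default) and raise an exception when $strict is
 * true (what we would enable for integration tests).
 */
function filterAgainstSchema(array $data, array $schema, $strict = false)
{
    $unknown = array_diff_key($data, $schema['properties']);

    if ($strict && count($unknown) > 0) {
        throw new UnexpectedPropertyException(
            'Properties not defined in schema: ' . implode(', ', array_keys($unknown))
        );
    }

    // ArrayObject implements ArrayAccess, so consumers can use $model['name'].
    return new ArrayObject(array_intersect_key($data, $schema['properties']));
}

// Usage
$schema = array(
    'properties' => array(
        'name'     => array('type' => 'string'),
        'metadata' => array('type' => 'object'),
    ),
);
$data = array('name' => 'web01', 'created_date' => '2014-04-28');

$model = filterAgainstSchema($data, $schema);        // lenient: created_date is dropped
var_dump(isset($model['created_date']));             // bool(false)

try {
    filterAgainstSchema($data, $schema, true);       // strict: SDK authors find out
} catch (UnexpectedPropertyException $e) {
    echo $e->getMessage(), "\n";
    // prints: Properties not defined in schema: created_date
}
```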
>
> 3. Where would JSON schemas come from?
>
> It depends on each OpenStack service. Glance and Marconi (soon) offer schemas directly through the API - so they are directly responsible for maintaining this - we'd just consume it. We could probably cache a local version to minimize requests.
>
> For services that do not offer schemas yet, we'd have to use local schema files. There's a project called Tempest which does integration tests for OpenStack clusters, and it uses schema files. So there might be a possibility of using their files in the future. If this is not possible, we'd write them ourselves. It took me 1-2 days to write the entire Nova API. Once a schema file has been fully tested and signed off as 100% operational, it can be frozen as a set version.
>
>> Can we convert the schema files from Tempest into something we can use?
>>
>
> 4. What would the workflow look like?
>
> I don't really understand what you mean: can you elaborate?
>
>> For example, when would validation happen? Is that for testing or runtime for use in an application?
>>
>
> 5. How do schema files handle business logic?
>
> That's a really great question. I've written a brief write-up here: https://wiki.openstack.org/wiki/OpenStack-SDK-PHP/JSON-schema-business-logic
>
>
>> I think what you're proposing is that the methods map to API calls. There isn't any logic in these objects that isn't an API call.
>>
>
Let me see if I’m understanding the meaning of business logic here. One example I can think of is higher-level methods that might encompass multiple API calls, for example, deleting a non-empty container in an object store. Is this what you meant by “There isn't any logic in these objects that isn't an API call”?
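
To make the example concrete, here is a sketch of such a higher-level method. The interface ObjectStoreApi and the function deleteContainerAndContents() are hypothetical; each interface method stands for a single API call that a schema could describe, while the loop that combines them is the business logic that would have to live in code. Pagination of the object listing is ignored.

```php
<?php
// Hypothetical sketch: this interface is not part of the SDK.

interface ObjectStoreApi
{
    // Each method maps to exactly one API call.
    public function listObjects($container);
    public function deleteObject($container, $object);
    public function deleteContainer($container);
}

/**
 * Deletes a container that may still hold objects: first every object,
 * then the container itself. This is several API calls plus ordering
 * logic, which a one-operation-per-method schema does not express.
 */
function deleteContainerAndContents(ObjectStoreApi $api, $container)
{
    foreach ($api->listObjects($container) as $objectName) {
        $api->deleteObject($container, $objectName);
    }

    $api->deleteContainer($container);
}
```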
> Jamie
>
> From: Matthew Farina <matt at mattfarina.com>
> Date: Thursday, April 24, 2014 at 5:42 PM
> To: Jamie Hannaford <jamie.hannaford at rackspace.com>, "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Cc: "sam.choi at hp.com" <sam.choi at hp.com>
> Subject: [openstack-sdk-php] discussion: json schema to define apis
>
> Jamie (and whoever else wants to jump in),
>
> It's been proposed to use json schema to describe the API calls rather
> than code. The operations to perform and what they do would be
> described rather than coded and then some code would use the schema to
> know how to act.
>
> Others are already doing this. For example, the AWS SDK for PHP. Take
> their S3 structure as an example
> https://github.com/aws/aws-sdk-php/blob/master/src/Aws/S3/Resources/s3-2006-03-01.php.
> The ability to do this goes beyond this one example. It just appears
> to be something similar to what we're considering.
>
> Given this in the scope of PHP I've got a number of questions. Several
> of these I've compiled while discussing this with others so they don't
> represent my point of view. Rather, they are just a list of
> outstanding questions. Since this is a different method for handling
> the API calls from the other SDKs being built the concept should be
> really vetted.
>
> Here are the questions:
>
> 1. Why use json schema rather than other reuse methods? I've discussed
> the use of json schemas with others and those working on the other
> languages have not been interested in json schema at the moment. Why
> do it differently given the context?
>
> Note, it might be worth looking at the python SDK which is doing
> things differently. If I understand it right they are moving away
> from using managers and resources altogether.
>
> 2. How will debugging work in practice? For example, a call is made
> from behind a proxy. The proxy alters the HTTP headers so the request
> fails and an exception is thrown. The schema and endpoint are valid.
> It's something in the middle that changed things. Walking through the
> code goes through magic methods to handle the schema. How would
> debugging that work to understand what's happening compared to what
> was expected?
>
> 3. Where would the json schemas for services come from and who would
> manage them?
>
> 4. What would the workflow look like for working with the schemas at
> both execution time for everyday use and for testing?
>
> 5. How would logic happen? Sometimes a request to an API is more than
> just a request and response. For example, calling to something in
> object storage where the object does not exist. The transport layer
> will throw an exception (this goes all the way down to Guzzle throwing
> one) that needs to be caught and managed. How should cases with some
> logic like this be handled and easy to understand?
>
> Thanks for looking into this. The topic has really sparked my
> interest. I for one am really curious about the practicalities of
> using json schema and the developer experience around it.
>
> - Matt Farina