[Openstack] Keystone API Design Issues

Ziad Sawalha ziad.sawalha at rackspace.com
Sun Nov 27 05:21:11 UTC 2011


Hi Paul -

A few of these items I would like to take offline.

See responses below for the others…

On 11/26/11 11:45 AM, "Paul Querna" <pquerna at apache.org> wrote:

On Thu, Nov 24, 2011 at 9:10 AM, Ziad Sawalha
<ziad.sawalha at rackspace.com> wrote:
Hi Paul - thank you for the good feedback.

I'm going to address your points individually below, but before I do, I want
to set some context and address some of your broader concerns.

The 2.0 API for Keystone is released and multiple implementers are already
working on it (in fact, we at Rackspace have just released ours). There
were many calls for comments on the API throughout the year, but we locked
down the spec finally in September to try to deliver an implementation in
time for Diablo.

Sorry I couldn't be more involved earlier in the process; we had many
other projects to get done before we were worried about looking into
Keystone integration.  I hope my feedback is helpful as (one of the
first?) non-core integrations with Keystone.

We've added a lot more documentation on integration (and more to come) - you can see much of it here: http://keystone.openstack.org. We would be happy to be directly involved in helping you out with Keystone integration (including adding functionality or new APIs if needed). Please let us know what you need.



... snip ...
A) The Token Validation API is fail deadly, because of support for
Tokens without a Tenant ID scope:


<http://docs.openstack.org/api/openstack-identity-service/2.0/content/GET_validateToken_v2.0_tokens__tokenId__Admin_API_Service_Developer_Operations-d1e1356.html>

When you are implementing a service that needs to validate tokens,
you pass in the tenant scope as the belongsTo parameter with the
Tenant ID.  However, this parameter is optional.  If a malicious
Tenant ID is passed in, for example when a service doesn't perform
sufficient validation and lets a user pass a & into the tenantId, a
token is considered valid for _all_ contexts.  Now, in theory, you
should be looking at the roles provided under the user, and the
examples given in the OpenStack documentation echo back the validated
Tenant ID to you.  However, in practice, and as seen in production
environments, this response body includes a default identity role and
does not echo back the validated Tenant ID.

Tokens without scope are supported by the API - we had requests with use
cases for them - but they are not required. In fact, the Rackspace
implementation always returns a scoped token.

Maybe I am misunderstanding what you mean here by scoped token -- in
the Rackspace production implementation, belongsTo is optional.  If
belongsTo isn't present, isn't that an unscoped token?

I really wish that the validate token call would echo back the
tenantIds that the token is valid for to the validator.  This would
enable validators to ensure that their parsing of the tenantId and
Keystone's was the same.

belongsTo is just a method for validation. It does not determine whether a token is scoped or not. Generally, if you do a check with belongsTo on an unscoped token it should fail with a 401 Unauthorized. If that's not the behavior you are seeing, let us know; it might be a bug.

Whether a token is scoped or not is determined when a user authenticates. Here are the combinations:

1. If a user supplies a tenant when authenticating, they should get a token scoped to that tenant. A check with belongsTo where the value is anything but that tenant should fail.

2. A user does not supply a tenant and they get an unscoped token. That's how Keystone should behave. I say should because we've been working hard to undo a concept that was introduced earlier this year called a default tenant. The default tenant concept always gives you back a scoped token. We decided to undo that late before the Diablo release, but there are deployments of the code out there that already have it, and removing it from our code without breaking compatibility has been a challenge. We have a review in progress to remove that: https://review.openstack.org/#change,1068

3. A user does not supply a tenant and they get back a scoped token. That's how Rackspace Auth works and how Keystone behaves if default tenant is set. The difference between Rackspace and Keystone is that Rackspace scopes your token to two tenants by default. Keystone only supports scoping to one tenant at a time.
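
To make combinations 1 and 2 concrete, here is a minimal Python sketch of the
authenticate call (the endpoint URL and credentials are placeholders, and the
requests library is just one HTTP client you could use):

import json
import requests  # generic HTTP client, used here for illustration

KEYSTONE = "http://localhost:5000/v2.0"  # placeholder service endpoint

def authenticate(username, password, tenant_name=None):
    """POST /v2.0/tokens; include tenantName only when a scoped token is wanted."""
    body = {"auth": {"passwordCredentials": {"username": username,
                                             "password": password}}}
    if tenant_name:
        body["auth"]["tenantName"] = tenant_name  # combination 1: scoped token
    resp = requests.post(KEYSTONE + "/tokens",
                         data=json.dumps(body),
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()
    return resp.json()["access"]["token"]

# Combination 1: the returned token carries a "tenant" element (scoped).
scoped = authenticate("joeuser", "secret", tenant_name="customer-x")

# Combination 2: no tenant supplied, so no "tenant" element (unscoped).
unscoped = authenticate("joeuser", "secret")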

We have some sequence diagrams to describe these (attached). They've been merged into trunk but don't show up on the OpenStack docs site yet. I'll check on that with the doc team.

The validate call does echo back the tenants the token is scoped to. Here's the response:

{
    "access": {
        "token": {
            "expires": "2012-02-05T00:00:00",
            "id": "887665443383838",
            "tenant": {
                "id": "1",
                "name": "customer-x"
            }
        },
        "user": {
            "id": "1",
            "name": "joeuser",
            "roles": [
                {
                    "id": "3",
                    "name": "Member",
                    "serviceId": "1"
                }
            ],
            "tenantId": "1",
            "tenantName": "customer-x",
            "username": "joeuser"
        }
    }
}
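
And to guard against the fail-deadly case you describe, a validating service
can check the echoed tenant itself rather than trust belongsTo alone. A rough
Python sketch (the admin endpoint and admin token below are placeholders):

import requests  # generic HTTP client, used here for illustration

ADMIN_URL = "http://localhost:35357/v2.0"  # placeholder admin endpoint
ADMIN_TOKEN = "999888777666"               # placeholder service credential

def token_valid_for_tenant(user_token, expected_tenant_id):
    """Validate a user token and confirm it is scoped to the expected tenant."""
    resp = requests.get("%s/tokens/%s" % (ADMIN_URL, user_token),
                        params={"belongsTo": expected_tenant_id},  # server-side scope check
                        headers={"X-Auth-Token": ADMIN_TOKEN})
    if resp.status_code != 200:
        return False  # 401/404 means the token is not valid for that tenant
    # Defense in depth: also compare the tenant echoed back in the response.
    tenant = resp.json()["access"]["token"].get("tenant", {})
    return tenant.get("id") == expected_tenant_id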



... snip ...
B) Requiring consumers to pass Tenant IDs around is not a common
pattern in other cloud APIs.  A consumer was already keeping track of
their username, apikey, and temporal token, and now they essentially
need to keep another piece of information around, the Tenant ID.
This seems like it is an unneeded variable.  For example, Amazon
implements AWS Identity and Access Management by changing the API key
& secret that is used against the API depending on the role of the
account -- this hides the abstraction away from both client libraries
and validating services -- they still just care about the API key and
secret, and do not need to pass around an extra Tenant ID.

This sounds like a concern with the OpenStack implementation and not the
API spec.

The Keystone API spec doesn't require consumers to pass Tenant IDs around.
It even allows for a full implementation without the consumer having to
know or manage their tenant IDs. We've done that at Rackspace where you
auth with your credentials, get URLs back for the services you have, and
then you call those URLs using your token. Granted, the tenant ID (a.k.a.
account number) is embedded in the URL, but this comes from the Rackspace
Cloud Servers and Swift API and is not a Keystone API design requirement.

Can you give an example of where you are having to maintain tenant IDs?

Consumers do, because as a client language API consumer, none of the
services I need to access are actually in the Service Catalog.
Things like staging, beta, and even production services are not listed
in the Service Catalog, so I need to extract out the tenant Id, and
then munge it onto a base url supplied by the user.

I'm confused by this. You say "none of the services I need to access are actually in the Service Catalog." Why is that? What's stopping you from registering your services in the catalog? (see latest docs on doing that in the attached PDF).

Note also that endpoints contain a tenantId attribute PRECISELY so you don't have to parse it out of the URL.
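
To illustrate, here is a short Python sketch that reads the endpoint and
tenant out of the catalog returned at auth time instead of regexing the URL
(it assumes the 2.0 serviceCatalog shape; the function and variable names are
mine, not from the spec):

def compute_endpoints(access):
    """Yield (publicURL, tenantId) pairs for compute services in the catalog.

    `access` is the parsed "access" element of an authenticate response.
    """
    for service in access.get("serviceCatalog", []):
        if service.get("type") != "compute":
            continue
        for endpoint in service.get("endpoints", []):
            # The tenantId is an attribute on the endpoint itself, so there
            # is no need to parse it out of the publicURL.
            yield endpoint["publicURL"], endpoint.get("tenantId")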

I think I need more data on which Keystone-compatible server you are calling, what your relationship is to it and the operator, and what's the data you expect to get out of it. You may have a unique use case we may not have covered. If so, we can work on getting that covered.


On Services: How would a service implement authentication of a user
for a tenant without either burning that tenant ID into the URL, or
requiring the consumer to pass in an X-Tenant-Id header?  With the
current token structure you must somehow pass this to the service --
otherwise the only thing the service can assert is that a Username has
a Valid token -- it cannot tell what tenant(s) it is valid for,
because a single token spans multiple tenants.

As long as the service receives a token scoped to a tenant, it can find out that tenant from Keystone.

Here's the call the service would make:
$ curl -H "X-Auth-Token: 999888777666" http://localhost:35357/v2.0/tokens/887665443383838

Note that the service only knows where the Keystone server is, how to authenticate to it, and the token from the user (no tenant).

The response has the tenant ID in it:
{
    "access": {
        "token": {
            "expires": "2012-02-05T00:00:00",
            "id": "887665443383838",
            "tenant": {
                "id": "1",
                "name": "customer-x"
            }
        },
        "user": {
            "id": "1",
            "name": "joeuser",
            "roles": [
                {
                    "id": "3",
                    "name": "Member",
                    "serviceId": "1"
                }
            ],
            "tenantId": "1",
            "tenantName": "customer-x",
            "username": "joeuser"
        }
    }
}

So neither the service nor the consumer needs to know or pass the tenant around.




C) Requiring Services to encode the Tenant ID into their URLs is not a
common design, and can cause issues.  By encoding identity into both
the Token and the Tenant in the URL, there are now multiple attack
vectors from a security perspective, and URL routing in some
frameworks becomes more painful.

As I mention above, that's not a Keystone requirement and was not driven
by Keystone. As a user who has multiple accounts myself, I find the model
works for me, but it is certainly not forced upon or required of services
in order to work with Keystone.

I know that much thought went into that design, taking into consideration
things like rate limiting, caching, etc.

This is probably a broader OpenStack API conversation. Fair warning, those
conversations have historically led to bikeshedding and bloodshed :-)

I don't understand the assertion that rate limiting or caching have to
do with the URL structure in this case.  For example with caching,
you'll want to Vary on X-Auth-Token regardless, as a user may have
different roles inside the same Tenant.  If a URL was cached solely
based on tenantId, a user with a more restrictive role (granted, this
doesn't quite exist in OpenStack yet?) could view content they
shouldn't have access to.

I'll connect you with someone who knows more about this than I to continue the conversation.

But from a Keystone perspective, this is not required (per the previous example).


D) The Service Catalog makes it difficult to run testing, beta, and
staging environments.  This is because in practice many services are
not registered in the service catalog.  To work around this, we
commonly see that a regex is used to extract the Tenant Id from
another URL, and then the client builds a new URL.  You can see this
even being recommended by Rackspace in the Disqus comments on the
Cloud DNS API here:
<http://docs.rackspace.com/cdns/api/v1.0/cdns-devguide/content/Service_Access_Endpoints-d1e753.html>

Is the problem "in practice many services are not registered in the
service catalog"?

I'm wondering if we need to make it easier for services to register
themselves in the catalog or evolve the service catalog to handle use
cases like staging, beta, and testing environments?

Yes, this is a huge integration problem right now in both Reach and
$UnannouncedProduct.  For $UnannouncedProduct our welcome packet is
going to tell users to regex their TenantId out of the Compute
URL.  I don't see an alternative right now.

We can do this. This is a Rackspace conversation so I'll follow up with you offline.


For Reach we had to add the ex_force_base_url feature to Apache
Libcloud to ignore the URLs returned by the Service Catalog, because
they never contain the endpoints actually used by Reach.

... snip ...
E) The Service catalog should make it easy to expose many tenants to
the same service.  Currently you need to build a unique tenant-specific
URL for each service.  The most common use case is enabling
all or a large set of tenants to access the same service, and it seems
like this use case is not well covered.

I'm not sure I get the pain point you have here, but I can talk to the
Keystone API aspect of this.

Given that Keystone has support for services that want to embed a tenant
in their URL (as well as those that don't), the spec includes a definition
of the service catalog that doesn't assume either. All the catalog focuses
on is providing endpoints and data about each endpoint.

Are you asking specifically for support to return one endpoint and have a
list of tenants that can exist under that endpoint? Something like this:


"serviceCatalog":[{
       "name":"Cloud Servers",
       "type":"compute",
       "endpoints":[{
               "publicURL":"https://compute.north.host.com/v1/",
               "tenants": ["t1000", "t2000", "t3000", etc...]
               }]


It can be done. We've seen (and as you pointed out) that the usability
decreases with the number of use cases we support. For example, the use
case above would not work for anyone using tenants in their URLs, so
they'd be limited to one tenant in the array, and someone could feasibly
complain about having to put one tenant in an array every time.

But if this is a needed use case for consumers it can be done.

The process we would follow would be to add it as an extension (since the
core API is released) and, if it gains traction, propose it for core in the
next API. The next API will likely be a topic brought up at the next
summit in April.

No, this isn't really what I would like.

On the Consumer side: I would like a Token per Tenant.  Under each
tenant is the list of known service URLs.  Preferably I never have to
concatenate a tenant ID onto an alpha/beta/staging URL.

On the Validation side: I would really like there to be a mapping of
Tokens -> Single Tenant.  And when I validate that token, I don't need
another parameter like belongsTo; it returns to me the single tenant
and roles this token is valid for.

A single token per tenant is available in Keystone. If you're looking for this in Rackspace Auth, we can take that conversation offline as well.



F) Features like API Keys are marked as Extensions to the API, and
not part of the core API.

Correct. We asserted that all the backends we would support (and that
operators of OpenStack would be running) have support for passwords. Not
all backends have support for API Keys, though. If API Keys go into the
core API they would need to be supported by all implementers and that
raises the bar to deploying OpenStack in some enterprises.

Example: say Rackspace wants to deploy OpenStack internally for employee
use and wants to integrate it with an enterprise directory running LDAP.
Password authentication is a simple tie-in. If we also had to find a way
to support issuing employees API keys, that becomes a much more complex
project, as most enterprise directories don't support API keys natively.



G) Support for multifactor authentication (MFA) is not currently
available in any way.  This is very worrisome since the only 'in core'
Authentication mechanism is a username and a password.  With previous
systems, where the Username and Password were not 'owned' by a service
like Keystone, products like the control panel could implement MFA
themselves, but now that you can authenticate to the backend APIs
using only a password, Keystone must also support MFA.   Password Auth
is very different from API Key auth -- yes, they are both in theory
randomly generated and big, but in practice Passwords are weak, and
reused widely, while only API keys are big and random.

Same argument as above. The extension mechanism is robust and would allow
any operator to add an MFA extension and offer that to their customers.
Putting it in core would make the prerequisites for deploying OpenStack
heavy.



H) There doesn't seem to be an open discussion about where the
Keystone API is going -- I hear mumbles about $OtherBigCompanies
wanting to extend the Keystone APIs in various ways, but there is no
discussion on this mailing list.  I know of at least 4 implementations
of Keystone within just Rackspace, but all of them are adding custom
extensions to even accomplish their baseline of use cases.

Correct. As I mentioned initially, all the focus now is on documenting and
stabilizing.

There are blueprints out there and work being done on extensions which
will "show up". IMO, that's a sign of an active community and evidence
that the extension mechanism allows people to work on their features and
evolve the domain without having to drive the core project one way or the
other or be gated by others.

The four implementations within Rackspace is a testament to the level of
innovation going on. My expectation is that the implementation that best
serves our customers will prevail and the learnings should feed back into
the Keystone API to continue evolving it.

Multiple implementations and extensions less than 6 months into the
existence of the API are, to me, a sign that the base API and reference
implementation are not suitable for anyone's needs.  If no one can use
the reference implementation without any extensions in a production
environment, then we are doing something wrong.

I disagree. The implementations all work fully without the need for extensions. Extensions exist to support features that exist only in one implementation and to expose features not in the core API but proposed for future use cases or use cases that were not incorporated when we developed the API.

The multiple extensions speak to the dynamic community debate going on
about the direction we should take the API. And the fact that all the
extensions can coexist is a testament to the work that went into the
extension mechanism to support that ecosystem.

The proliferation of extensions is by design and intent. The fact that
extensions allow dynamic change while providing a stable, predictable
contract that can be coded against for core functionality means they are
achieving their design goal.


It seems we either missed too many base use cases, and/or that
something is seriously wrong with the reference implementation.  The
reference implementation doesn't need to be everything to everyone,
but it should be a reasonable default that even large providers are
comfortable using in production -- but that doesn't seem to be the
case at all.

We did intentionally set aside many use cases. The goal for Diablo was to get Keystone in OpenStack Core and support only existing use cases. The extension mechanism is what is allowing us to deliver on new use cases without modifying the Keystone 2.0 API. We will continue delivering new functionality that way and based on feedback and success of extensions, we will derive the 3.0 API.

Meanwhile, if you have use cases that were not solved for, we can deliver them either by adding additional functionality within the API or by providing an extension.


Part 2: My Thoughts / Recommendations


A) More open discussion about the issues in Keystone.  Issues won't
get fixed until there is an open and honest dialog between the various
groups which depend on Keystone.  Identity, Authentication and
Authorization must be rock solid -- everything else in OpenStack and
many other commercial products are built upon it.

I fully support that. Thank you for driving it. There are a number of
threads and items you bring up here that I think could and should drive
blueprints and new features.



B)  I believe fundamentally that the data model for service catalogs
as exposed to the user is overcomplicated and broken.  I would prefer
that you hit the auth service with your username + apikey or password.
This returns to you a list of tenantIds, aka billable cloud accounts.
Each of those records has a unique API Token. When you hit a service
using that Token, you do not include a TenantId. The service validates
the token, and it's always in the context of a single tenant.

I believe we should consider changing the get Token API to return
something more like this:

{
    "access": [
        {
            "tenantId": "42",
            "token": "XXXXXX",
            "serviceCatalog": [
                {"type": "compute", "baseURL": "https://..../", "region": "us-west"},
                ...
            ]
        }
    ]
}
The major change here is to make tenants a first-level object, instead
of being a component of the service catalog -- each tenant has its
potentially unique service catalog, and each tenant has a unique API
token that is valid for this user's roles on it.
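
To illustrate the consumer side of this proposal (a purely hypothetical
sketch -- this response shape is not part of the current 2.0 API), client
code could reduce to something like:

def index_by_tenant(response):
    """Build tenantId -> (token, serviceCatalog) from the proposed response.

    Assumes the hypothetical per-tenant "access" list sketched above; nothing
    here exists in the current 2.0 API.
    """
    return dict((entry["tenantId"], (entry["token"], entry["serviceCatalog"]))
                for entry in response["access"])

# The consumer picks a tenant and already has a token scoped to it, with no
# tenant ID to concatenate onto service URLs.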

We intentionally made the API token-centric (other standards like OAuth
and SAML have come to that conclusion). That's the most flexible model.
For example, Rackspace has at least two tenants for each customer (cloud
files and cloud servers). We don't want to force customers to have
separate tokens for each. Your proposal would force us to do that.

This could be accomplished inside Rackspace by merging their 'two'
tenant Ids. This wouldn't actually be that complicated, because we
could easily add a cloudFilesLegacyId to the validate token response.
If we are already requiring services to migrate to the Keystone API,
their needing to reference a specific variable for their legacy needs
would be a small price to pay, and would eliminate the need for
scopeless tokens.

That's an internal Rackspace conversation. Let's take that offline.


The current spec lets our customers manage only one token and we, as an
operator, can decide what goes in the token. That's what we were solving
for. Maybe there's a simpler model for the service catalog, but I haven't
seen it yet. But a good topic to keep iterating on.



This would slightly complicate "service users" with access to many
tenants, but it would massively simplify the common use cases of a
user with a small number of tenants and of a service which needs to
validate the token.  A service user with access to many tenants would
need to fetch a unique token for each tenant, but this is the place to
put the complexity -- people writing a service that spans hundreds or
thousands of tenants are already doing something complicated --
fetching a unique auth token is the least of their issues.

This reduces the number of variables that both the consumer and the
service need to pass around, and makes it less fail-deadly.

This approach also eliminates the need to encode the tenant ID into
the URLs of services.


C) There needs to be a reduction in the number of implementations of
Keystone, and a reduction in the number of extensions needed to even
operate a reasonable baseline of services.   More things, like API
keys and how one would do a multifactor authentication backend, must be
part of the core specification, and we should strive to have the
reference implementation actually used by companies in the ecosystem,
instead of how today pretty much every $BigCo using OpenStack is
implementing their own Keystone service from scratch -- people will
always have custom needs, but those should be plugins to a solid
reference implementation as much as possible, instead of today where
everyone is rebuilding it from scratch.


We can do better, yes. And we can hope to settle towards one
implementation. There's a world of identity solutions out there and we
didn't start Keystone to enter that market, but to enable easier
integration in a cloud paradigm.

And not everyone is going to want to run a Python identity server. We've
split the API spec from the implementation to allow for diversity.

Genericism and diversity are helpful once the problem space is well
understood.  I don't believe the space in which Keystone lives is
actually well worn.  The space of Cloud Service Identity,
authentication, authorization and token management as a whole is still
much younger than Identity on the web -- and Identity on the web in
general is still a mess.

I agree. The space is not mature or easy to navigate. We've been taking a very pragmatic approach while balancing more stakeholders than most are aware of. If we've erred too far away from usability, we can adjust. But I believe one of the biggest gaps we have had is in documentation and education around what the API provides and what Keystone (and other implementations) provide.

We've been heavily focused on remedying that.





Thoughts?

Thank you again, Paul. Good topics. And please keep the discussion going.
I'd be happy to continue on this thread and also to work with anyone on
blueprints or enhancements to Keystone and the API.

I have a few questions for you and would like to hear your thoughts on how
to optimize the user experience, especially around the service catalog.

Thank you!

Thanks,

Paul

-------------- next part --------------
A non-text attachment was scrubbed...
Name: use_case_1.png
Type: image/png
Size: 46469 bytes
Desc: use_case_1.png
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20111127/f5358a3e/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: use_case_2.png
Type: image/png
Size: 32030 bytes
Desc: use_case_2.png
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20111127/f5358a3e/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: use_case_3.png
Type: image/png
Size: 42759 bytes
Desc: use_case_3.png
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20111127/f5358a3e/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Endpoints and Endpoint Templates - Keystone 2012.1-dev documentation.pdf
Type: application/pdf
Size: 134541 bytes
Desc: Endpoints and Endpoint Templates - Keystone 2012.1-dev documentation.pdf
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20111127/f5358a3e/attachment.pdf>

