[openstack-dev] [keystone] Pagination

Henry Nash henryn at linux.vnet.ibm.com
Tue Aug 13 15:43:44 UTC 2013


On 13 Aug 2013, at 16:03, Dolph Mathews wrote:

> 
> On Tue, Aug 13, 2013 at 3:10 AM, Henry Nash <henryn at linux.vnet.ibm.com> wrote:
> Hi
> 
> So, a few comebacks to the various comments:
> 
> 1) While I understand the idea that a client would follow the next/prev links returned in collections, I wasn't aware that we considered 'page'/'per-page' as not standardized. We list these explicitly throughout the identity API spec (look in each List 'entity' example).
> 
> They were essentially relics from a very early draft of the spec that were thoughtlessly copy/pasted around (I'm guilty of this myself)... they were recently cleaned up and removed from the spec.

Unfortunately every API list call in the spec (e.g. Users, Groups, Projects, Domains etc.) has the following in its list of items supported for query:

query_string: page (optional)
query_string: per_page (optional, default 30)

Are you suggesting that both are driver dependent, or just the 'page' item?  I assume it can't be both - otherwise a client would need to specify page size differently based on the driver in use.
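
To illustrate the distinction I'm asking about, here is a rough sketch (Python; every name in it is made up for illustration and none of it is Keystone code) in which per_page means the same thing for every driver, while the page value is handed to the driver untouched:

```python
# Hypothetical sketch: these names are illustrative, not Keystone APIs.
def list_page(driver_list, query_string, default_per_page=30):
    """Return (items, next_page) for one page of results.

    per_page is interpreted identically for every driver; the page
    value is passed through as-is, so a driver may treat it as a page
    number, an offset, or a cursor.
    """
    per_page = int(query_string.get('per_page', default_per_page))
    page = query_string.get('page')  # opaque to the controller
    return driver_list(per_page, page)


def offset_driver(users):
    """Build a toy driver whose page token is a 1-based page number."""
    def driver_list(per_page, page):
        number = int(page or 1)
        start = per_page * (number - 1)
        items = users[start:start + per_page]
        has_more = start + per_page < len(users)
        # The next token is whatever this driver wants to see next time.
        return items, (str(number + 1) if has_more else None)
    return driver_list
```

Under that assumption a client only ever needs to know per_page; whatever it gets back as the next page value it simply sends again.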

>  
> How I imagined it would work would be:
> 
> a) If a client did not include 'page' in the url we would not paginate
> 
> Make that a deployment option? per_page could simply default to a very high value.
>  
> b) Once we are paginating, a client can either build the next/prev URLs themselves if they want (by incrementing/decrementing the page number), or just follow the next/prev links (which come with the appropriate 'page=x' in them) returned in the collection, which saves them having to do this.
> 
> I'm obviously very opposed to this because it unreasonably forces a single approach to pagination across all drivers.

So I can see the advantage of not having to do that - although I guess the counter argument is that it is the job of a driver to map non-specific APIs to a particular implementation based on the underlying technology.  You could consider page definitions as just part of the API URL to be mapped (at least that's how I had been thinking about it to date).  The pro of using standardized terms is that we can support a mixture of clients that do and do not support pagination (since we can infer if they support pagination based on whether they specify 'page' in the query string).
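
A minimal sketch of that inference, assuming a controller-side helper (the function name, the base_url argument and the 'previous'/'next' link keys are assumptions for illustration, not existing Keystone code):

```python
from urllib.parse import urlencode


def paginate(refs, query_string, base_url, default_per_page=30):
    """Paginate only when the client opted in by sending 'page'."""
    if 'page' not in query_string:
        # Client did not ask for pagination: full list, no links.
        return refs, {}

    # Query-string values arrive as text, so convert before slicing.
    page = max(int(query_string['page']), 1)
    per_page = max(int(query_string.get('per_page', default_per_page)), 1)
    start = per_page * (page - 1)
    items = refs[start:start + per_page]

    def link(number):
        params = {'page': number, 'per_page': per_page}
        return base_url + '?' + urlencode(params)

    links = {
        'previous': link(page - 1) if page > 1 else None,
        'next': link(page + 1) if start + per_page < len(refs) else None,
    }
    return items, links
```

A client that never sends 'page' sees exactly what it sees today; one that does can either follow the returned links or compute the page numbers itself.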

>  
> c) Regarding implementation, the controller would continue to be able to paginate on behalf of drivers that couldn't, but pagination-aware drivers would take over that capability (and indicate the state of the pagination to the controller so that it can build the correct next/prev links).
> 
> 2) On the subject of huge enumerates, options are:
> a) Support a backend manager scoped (i.e. identity/assignment/token) limit in the conf file which would be honored by drivers.  Assuming that you set this larger than your pagination limit, this would make sense whether your driver is paginating or not, in terms of minimizing the delay in returning data as well as not messing up pagination.  In the non-paginated case when we hit the limit, should we indicate this to the client?  Maybe a 206 return code?  Although i) not quite sure that meets HTTP standards, and ii) would we break a bunch of clients by doing this?
> 
> I'm not clear on what kind of limit you're referring to? A 206 sounds unexpected for this use case though.
>  
> b) We scrap the whole idea of pagination, and just set a conf limit as in 2a).  To make this work of course, we must implement any defined filters in the backend (otherwise we still end up with today's performance problems - remember that today, in general, filtering is done in the controller on a full enumeration of the entities in question).  I was planning to implement this backend filtering anyway as part of (or on top of) my change, since we are holding (at least one of) our hands behind our backs right now by not doing so.  And our filters need to be powerful: do we support wildcards, for example GET /users?name=fred* ?
> 
> There were some discussions on this topic from about a year ago that I'd love to continue. I don't want to invent a new "language," but we do need to settle on an approach that we can apply across a wide variety of backends. That probably means keeping it very simple (like your example). Asterisks need to be URL encoded, though. One suggestion I particularly liked (which happens to avoid claiming perfectly valid characters - asterisks - as special characters) was to adopt the syntax used in the django ORM's filter function:
> 
>   ?name__startswith=Fred
>   ?name__istartswith=fred
>   ?name__endswith=Fred
>   ?name__iendswith=fred
>   ?name__contains=Fred
>   ?name__icontains=fred
> 
> This probably represents the immediately useful subset of parameters for us, but for more:
> 
>   https://docs.djangoproject.com/en/dev/topics/db/queries/
> 
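
As a rough illustration of how small that subset is, here is a sketch of the six lookups above applied to an in-memory list of entities. The LOOKUPS table and both function names are hypothetical (neither Keystone nor Django code), and a real backend would presumably translate each lookup into its own query form rather than filter in Python:

```python
# Hypothetical lookup table covering only the six suffixes listed above.
LOOKUPS = {
    'startswith': lambda value, arg: value.startswith(arg),
    'istartswith': lambda value, arg: value.lower().startswith(arg.lower()),
    'endswith': lambda value, arg: value.endswith(arg),
    'iendswith': lambda value, arg: value.lower().endswith(arg.lower()),
    'contains': lambda value, arg: arg in value,
    'icontains': lambda value, arg: arg.lower() in value.lower(),
}


def matches(entity, query_string):
    """Return True if the entity satisfies every filter in query_string."""
    for key, arg in query_string.items():
        attr, sep, lookup = key.partition('__')
        value = entity.get(attr)
        if not sep:
            # A plain ?name=Fred is treated as an exact match.
            if value != arg:
                return False
            continue
        compare = LOOKUPS.get(lookup)
        if compare is None:
            raise ValueError('unsupported lookup: %s' % lookup)
        if value is None or not compare(value, arg):
            return False
    return True


def filter_entities(entities, query_string):
    """Keep only the entities that match all of the given filters."""
    return [entity for entity in entities if matches(entity, query_string)]
```

For example, filter_entities(users, {'name__istartswith': 'fred'}) keeps users named 'Fred' and 'fredrick' but not 'Alfred'.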
>  
> Henry
> 
> On 13 Aug 2013, at 04:40, Adam Young wrote:
> 
>> On 08/12/2013 09:22 PM, Miller, Mark M (EB SW Cloud - R&D - Corvallis) wrote:
>>> The main reason I use user lists (i.e. keystone user-list) is to get the list of usernames/IDs for other keystone commands. I do not see the value of showing all of the users in an LDAP server when they are not part of the keystone database (i.e. do not have roles assigned to them). Performing a “keystone user-list” command against the HP Enterprise Directory locks up keystone for about 1 ½ hours in that it will not perform any other commands until it is done.  If it is decided that user lists are necessary, then at a minimum they need to be paged to return control back to keystone for another command.
>>> 
>> 
>> We need a way to tell HP ED to limit the number of rows, and to do filtering.
>> 
>> We have a bug for the second part.  I'll open one for the limit.
>> 
>>>  
>>> 
>>> Mark
>>> 
>>>  
>>> 
>>> From: Adam Young [mailto:ayoung at redhat.com] 
>>> Sent: Monday, August 12, 2013 5:27 PM
>>> To: openstack-dev at lists.openstack.org
>>> Subject: Re: [openstack-dev] [keystone] Pagination
>>> 
>>>  
>>> 
>>> On 08/12/2013 05:34 PM, Henry Nash wrote:
>>> 
>>> Hi
>>> 
>>>  
>>> 
>>> I'm working on extending the pagination into the backends.  Right now, we handle the pagination in the v3 controller class....and in fact it is disabled right now and we return the whole list irrespective of whether page/per-page is set in the query string, e.g.:
>>> 
>>> Pagination is a broken concept. We should not be returning lists so long that we need to paginate.  Instead, we should have query limits, and filters to refine the queries.
>>> 
>>> Some people are doing full user lists against LDAP.  I don't need to tell you how broken that is.  Why do we allow user-list at the Domain (or unscoped level)?  
>>> 
>>> I'd argue that we should drop enumeration of objects in general, and certainly limit the number of results that come back.  Pagination in LDAP requires cursors, and thus continuous connections from Keystone to LDAP...this is not a scalable solution.
>>> 
>>> Do we really need this?
>>> 
>>> 
>>> 
>>> 
>>>  
>>> 
>>>     def paginate(cls, context, refs):
>>>         """Paginates a list of references by page & per_page query strings."""
>>>         # FIXME(dolph): client needs to support pagination first
>>>         return refs
>>> 
>>>         page = context['query_string'].get('page', 1)
>>>         per_page = context['query_string'].get('per_page', 30)
>>>         return refs[per_page * (page - 1):per_page * page]
>>> 
>>>  
>>> 
>>> I wonder, both for the V3 controller (which still needs to handle pagination for backends that do not support it) and for the backends that do, whether we could use the presence of 'page' in the query string as an indicator of whether we should paginate or not?  That way clients who can handle it can ask for it, and those that don't will just get everything.
>>> 
>>>  
>>> 
>>> Henry
>>> 
>>>  
>>> 
>>> 
>>> 
>>> 
>>> 
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> 
> -- 
> 
> -Dolph
