[OpenStack-Infra] [storyboard] Paging results

Heald, Mike mike.heald at hp.com
Wed Nov 19 16:38:52 UTC 2014


Hello, fellow Storyboarders!

After some discussions with the folks working on oslo.db, and after going over our requirements for paging with krotscheck, we came to the conclusion that paging based only on a marker does not give us the reliability or functionality that we need in Storyboard. I said I'd outline the paging model I've used before for companies whose requirements were quite demanding (e.g. if we missed a record while paging, we could potentially lose thousands in profits), so here's an overview to start discussion. (Sorry for the delay, I've been ill.)

Goals:

1) Paged list where records do not jump from page to page, even if the underlying data changes, so that the user never misses records that match search criteria, and never sees the records she is dealing with change unexpectedly
2) Able to order the list by column
3) Able to jump to an arbitrary page

To fully satisfy (1) while having (2) and (3), we need to snapshot the result set, and have a mechanism for notifying the user of any records matching the list criteria that have appeared since they started browsing.

This can be achieved with a three-column table holding a record ID, the data used to order the results, and an ID for the result set. We can then left join from this table to the main data table, so that record deletions do not affect paging and the UI can show that a record was deleted without messing up the page boundaries. Ordering by the snapshotted data means that if that data later changes in the main data table, records do not jump pages, and the UI can show that the data has changed. This does require some result set metadata too (e.g. the column sorted by).
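To make the snapshot table concrete, here is a minimal sketch using Python's sqlite3 module. The table and column names (stories, result_sets, story_snapshots, sort_value) and the helper functions are made up for illustration and are not Storyboard's actual schema; a real implementation would presumably go through the existing database layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT, status TEXT);
    -- Result set metadata (hypothetical): which column the snapshot is sorted by.
    CREATE TABLE result_sets (id INTEGER PRIMARY KEY, sort_column TEXT);
    -- The three-column snapshot table: result set ID, record ID, ordering data.
    CREATE TABLE story_snapshots (
        result_set_id INTEGER,
        record_id INTEGER,
        sort_value TEXT
    );
""")
conn.executemany(
    "INSERT INTO stories VALUES (?, ?, ?)",
    [(1, "Apple", "open"), (2, "Banana", "open"),
     (3, "Cherry", "open"), (4, "Date", "closed")],
)


def take_snapshot(conn, status):
    """Copy the IDs and sort data of all matching stories into a new result set."""
    cur = conn.execute("INSERT INTO result_sets (sort_column) VALUES ('title')")
    result_set_id = cur.lastrowid
    conn.execute(
        "INSERT INTO story_snapshots (result_set_id, record_id, sort_value) "
        "SELECT ?, id, title FROM stories WHERE status = ?",
        (result_set_id, status),
    )
    return result_set_id


def fetch_page(conn, result_set_id, page, page_size):
    """Return (record_id, snapshotted sort value, current title) for one page.

    The LEFT JOIN keeps deleted records in place: their current title is None.
    """
    return conn.execute(
        "SELECT snap.record_id, snap.sort_value, s.title "
        "FROM story_snapshots AS snap "
        "LEFT JOIN stories AS s ON s.id = snap.record_id "
        "WHERE snap.result_set_id = ? "
        "ORDER BY snap.sort_value, snap.record_id "
        "LIMIT ? OFFSET ?",
        (result_set_id, page_size, page * page_size),
    ).fetchall()


rs = take_snapshot(conn, "open")
# The underlying data changes after the snapshot was taken.
conn.execute("DELETE FROM stories WHERE id = 1")
conn.execute("UPDATE stories SET title = 'Zucchini' WHERE id = 2")

page0 = fetch_page(conn, rs, 0, 2)
page1 = fetch_page(conn, rs, 1, 2)
```

In this sketch the deleted story still occupies its slot on the first page (with no current title), and the renamed story stays on the first page because ordering uses the snapshotted value. Comparing the snapshotted value with the current one is how the UI could flag that a record has changed. Jumping to an arbitrary page is just a different offset.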

Notifications of new records can be done with a web worker that checks whether there are records that match the search criteria but do not appear in the snapshot. The user can be notified that new records have come in, and can opt either to view a list of these records or to rerun the search to include them.
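The check that such a worker would poll for could look roughly like the following. This is only a sketch of the server-side query, again with hypothetical table names and a hypothetical find_new_matches helper; how the worker calls it (and how often) is left open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stories (id INTEGER PRIMARY KEY, title TEXT, status TEXT);
    CREATE TABLE story_snapshots (
        result_set_id INTEGER,
        record_id INTEGER,
        sort_value TEXT
    );
    INSERT INTO stories VALUES (1, 'Apple', 'open'), (2, 'Banana', 'open');
    -- Result set 1 is a snapshot of all open stories at this point in time.
    INSERT INTO story_snapshots
        SELECT 1, id, title FROM stories WHERE status = 'open';
""")


def find_new_matches(conn, result_set_id, status):
    """Return IDs of stories that match the criteria but are not in the snapshot."""
    rows = conn.execute(
        "SELECT s.id FROM stories AS s "
        "WHERE s.status = ? "
        "AND NOT EXISTS ("
        "    SELECT 1 FROM story_snapshots AS snap "
        "    WHERE snap.result_set_id = ? AND snap.record_id = s.id"
        ") "
        "ORDER BY s.id",
        (status, result_set_id),
    )
    return [row[0] for row in rows]


before = find_new_matches(conn, 1, "open")

# New stories arrive while the user is browsing; only one matches the criteria.
conn.execute("INSERT INTO stories VALUES (3, 'Cherry', 'open')")
conn.execute("INSERT INTO stories VALUES (4, 'Date', 'closed')")

after = find_new_matches(conn, 1, "open")
```

A non-empty result is the signal to tell the user that new records have come in. The returned IDs are also enough to show just those records without rerunning the search.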

Downsides of this approach are:
 - It's relatively heavy. The search table for each datatype will have lots of inserts and deletes, so careful consideration of partitions, or use of temporary tables, or other webscale stuff needs to happen
 - Housekeeping needs to be thought about. Just viewing a list of stuff will result in a snapshot so it can be manipulated correctly, so we need to think about whether there are situations when we *don't* want this to happen (for example, viewing a list of all stories might not warrant it, but viewing a list of stories with a particular status might), how to clean up result set snapshots that are no longer in use, etc.
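One possible shape for the cleanup, purely as a sketch: record when each result set was last read and periodically purge the ones that have been idle too long. The last_accessed column, the MAX_IDLE_SECONDS threshold, and the table names are all assumptions made for illustration, not a worked-out proposal.

```python
import sqlite3

MAX_IDLE_SECONDS = 3600  # hypothetical idle limit before a snapshot is purged

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- last_accessed is a hypothetical metadata column (Unix timestamp).
    CREATE TABLE result_sets (
        id INTEGER PRIMARY KEY,
        sort_column TEXT,
        last_accessed INTEGER
    );
    CREATE TABLE story_snapshots (
        result_set_id INTEGER,
        record_id INTEGER,
        sort_value TEXT
    );
    INSERT INTO result_sets VALUES (1, 'title', 1000), (2, 'title', 9000);
    INSERT INTO story_snapshots VALUES
        (1, 10, 'Apple'), (1, 11, 'Banana'), (2, 10, 'Apple');
""")


def purge_stale_result_sets(conn, now):
    """Delete result sets (and their snapshot rows) idle for too long.

    Returns the number of result sets removed.
    """
    cutoff = now - MAX_IDLE_SECONDS
    with conn:  # run both deletes in one transaction
        conn.execute(
            "DELETE FROM story_snapshots WHERE result_set_id IN "
            "(SELECT id FROM result_sets WHERE last_accessed < ?)",
            (cutoff,),
        )
        cur = conn.execute(
            "DELETE FROM result_sets WHERE last_accessed < ?", (cutoff,)
        )
    return cur.rowcount


removed = purge_stale_result_sets(conn, now=10000)
remaining = conn.execute(
    "SELECT result_set_id, record_id FROM story_snapshots"
).fetchall()
```

Whether this runs from a cron job, on each request, or is replaced by per-session temporary tables is exactly the kind of decision that still needs discussion.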

Upsides are:
 - It works. The results you are working with remain consistent, you get notified of any data changes, and you can page through the results easily. You could even use marker-based paging for the snapshotted results if you wanted to, and it would be reliable (although some functionality, like jumping to an arbitrary page, would be sacrificed).
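For comparison, marker-based paging over the snapshot might look like the sketch below. The marker here is assumed to be the (sort value, record ID) pair of the last row on the previous page; the table name and the fetch_after_marker helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE story_snapshots (
        result_set_id INTEGER,
        record_id INTEGER,
        sort_value TEXT
    );
    INSERT INTO story_snapshots VALUES
        (1, 1, 'Apple'), (1, 2, 'Banana'), (1, 3, 'Banana'), (1, 4, 'Cherry');
""")


def fetch_after_marker(conn, result_set_id, marker, page_size):
    """Return up to page_size (record_id, sort_value) rows following the marker.

    marker is None for the first page, otherwise a (sort_value, record_id) pair.
    The record ID breaks ties between rows with equal sort values.
    """
    if marker is None:
        return conn.execute(
            "SELECT record_id, sort_value FROM story_snapshots "
            "WHERE result_set_id = ? "
            "ORDER BY sort_value, record_id LIMIT ?",
            (result_set_id, page_size),
        ).fetchall()
    marker_value, marker_id = marker
    return conn.execute(
        "SELECT record_id, sort_value FROM story_snapshots "
        "WHERE result_set_id = ? "
        "AND (sort_value > ? OR (sort_value = ? AND record_id > ?)) "
        "ORDER BY sort_value, record_id LIMIT ?",
        (result_set_id, marker_value, marker_value, marker_id, page_size),
    ).fetchall()


first = fetch_after_marker(conn, 1, None, 2)
last_id, last_value = first[-1]
second = fetch_after_marker(conn, 1, (last_value, last_id), 2)
```

Because the snapshot rows never change, the marker always points at the same position, which is what makes this reliable. The cost is that a page can only be reached by walking forward from a known marker.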

Hope that makes sense, what are everyone's thoughts?

Mike



