In a different thread I had another possible suggestion; it's probably more appropriate for this one. [1]

It would also be helpful to give the project a way to prefer certain infra providers for certain jobs. For the most part Fort Nebula is terrible at CPU-bound, long-running jobs... I wish I could make it better, but I cannot.

Is there a method we could come up with that would allow us to exploit certain traits of a given provider? Maybe some additional metadata that says what each provider is best at doing? For example, highly IO-bound jobs work like gangbusters on FN because the underlying storage is very fast, but CPU-bound jobs do the direct opposite.

Thoughts?

~/DonnyD

1. http://lists.openstack.org/pipermail/openstack-discuss/2019-September/009592...

On Mon, Sep 23, 2019 at 11:14 AM Clark Boylan <cboylan@sapwetik.org> wrote:
On Mon, Sep 23, 2019, at 8:03 AM, Donny Davis wrote:
*These are only observations, so please keep in mind I am only trying to get to the bottom of efficiency with our limited resources.* Please feel free to correct my understanding.
We have some core projects which many other projects depend on: Nova, Glance, Keystone, Neutron, Cinder, etc. In the CI it's equal access for any project. If feature A in a non-core project depends on feature B in a core project, why is feature B not prioritized?
The priority queuing happens per "gate queue". The integrated gate (nova, cinder, keystone, etc.) has one queue, TripleO has another, OSA has one, and so on. We do this so that important work can happen across disparate efforts.
What this means is that if Nova and the rest of the integrated gate have a set of priority changes, they should stop approving other changes while they work to merge those priority items. I have suggested that OpenStack needs an "air traffic controller" to help coordinate these efforts, particularly around feature freeze time (I suggested it to both the QA team and the release team). Any queue could use one if they wanted to.
All that to say you can do this today, but it requires humans to work together and communicate what their goals are, then give the CI system the correct information to act on these changes in the desired manner.
Can we solve this issue by breaking apart the current equal access structure into something more granular?
I understand that improving job efficiencies will likely result in more, smaller jobs, but will that actually solve the issue at the gate come this time in the cycle... every release? (I am sure it comes up every time.) More, smaller jobs will result in more jobs. If the job time is cut in half but the number of jobs is doubled, we will probably still have the same issue.
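To put rough numbers on the point above, here is a small sketch of the arithmetic. The job counts and durations are made up purely for illustration, and it assumes each job occupies one node for its whole runtime; it is not a measurement of the actual gate.

```python
def node_hours(job_count: int, hours_per_job: float) -> float:
    """Total node time consumed, assuming one node per job (an assumption)."""
    return job_count * hours_per_job


# Hypothetical numbers, chosen only to illustrate the argument.
baseline = node_hours(job_count=100, hours_per_job=2.0)

# Halve the runtime of each job, but double the number of jobs.
split = node_hours(job_count=200, hours_per_job=1.0)

print(f"baseline: {baseline} node-hours")
print(f"split:    {split} node-hours")
```

Under these assumptions both cases consume the same total node time, so splitting jobs alone would not free up any capacity.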
We have limited resources and without more providers coming online I fear this issue is only going to get worse as time goes on if we do nothing.
~/DonnyD
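
Coming back to the provider-preference idea at the top of this message, here is a rough sketch of what trait metadata could look like. This is not how Nodepool or Zuul select providers today; the `Provider` and `Job` classes, the `strengths` and `profile` fields, and the provider name "generic-cloud" are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    strengths: frozenset  # hypothetical trait metadata, e.g. {"io"}


@dataclass(frozen=True)
class Job:
    name: str
    profile: str  # hypothetical: what the job is mostly bound by, "io" or "cpu"


def preferred_providers(job, providers):
    """Order providers so those whose strengths match the job come first.

    Non-matching providers stay in the list as a fallback, so a job can
    still run when its preferred provider has no capacity.
    """
    # sorted() is stable, and False sorts before True, so matching
    # providers move to the front while keeping their relative order.
    return sorted(providers, key=lambda p: job.profile not in p.strengths)


providers = [
    Provider("generic-cloud", frozenset({"cpu"})),  # hypothetical provider
    Provider("FN", frozenset({"io"})),
]

io_job = Job("io-heavy-job", "io")
cpu_job = Job("cpu-heavy-job", "cpu")

print([p.name for p in preferred_providers(io_job, providers)])
print([p.name for p in preferred_providers(cpu_job, providers)])
```

The sketch treats the metadata as a preference rather than a hard requirement, so nothing is ever pinned to a single provider.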