[infra] A change to Zuul's queuing behavior

Ghanshyam Mann gmann at ghanshyammann.com
Sun Dec 9 14:14:37 UTC 2018


 ---- On Sat, 08 Dec 2018 07:53:27 +0900 James E. Blair <corvus at inaugust.com> wrote ---- 
 > Matt Riedemann <mriedemos at gmail.com> writes: 
 >  
 > > On 12/3/2018 3:30 PM, James E. Blair wrote: 
 > >> Since some larger projects consume the bulk of cloud resources in our 
 > >> system, this can be especially frustrating for smaller projects.  To be 
 > >> sure, it impacts everyone, but while larger projects receive a 
 > >> continuous stream of results (even if delayed) smaller projects may wait 
 > >> hours before seeing results on a single change. 
 > >> 
 > >> In order to help all projects maintain a minimal velocity, we've begun 
 > >> dynamically prioritizing node requests based on the number of changes a 
 > >> project has in a given pipeline. 
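
(A minimal sketch, not Zuul's actual code, of what such a relative priority could look like: each
project's first change in the pipeline gets the best priority, and every additional change from the
same project gets a lower one.  All names below are illustrative.)

    from collections import defaultdict

    def relative_priorities(queued_changes):
        """queued_changes: (project, change) pairs in enqueue order.
        Returns {change: priority}; lower numbers get nodes first."""
        seen = defaultdict(int)
        priorities = {}
        for project, change in queued_changes:
            priorities[change] = seen[project]
            seen[project] += 1
        return priorities

    # A large project with three queued changes and a small project with one:
    # the small project's single change is not stuck behind all three.
    print(relative_priorities([("nova", "A"), ("nova", "B"),
                               ("glance", "C"), ("nova", "D")]))
    # -> {'A': 0, 'B': 1, 'C': 0, 'D': 2}
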
 > > 
 > > FWIW, and maybe this is happening across the board right now, but it's 
 > > taking probably ~16 hours to get results on nova changes right now, 
 > > which becomes increasingly frustrating when they finally get a node, 
 > > tests run and then the job times out or something because the node is 
 > > slow (or some other known race test failure). 
 > > 
 > > Is there any way to determine or somehow track how long a change has 
 > > been queued up before and take that into consideration when it's 
 > > re-enqueued? Like take this change: 
 > > 
 > > https://review.openstack.org/#/c/620154/ 
 > > 
 > > That took about 3 days to merge with constant rechecks from the time 
 > > it was approved. It would be cool if there was a way to say, from 
 > > within 50 queued nova changes (using the example in the original 
 > > email), let's say zuul knew that 10 of those 50 have already gone 
 > > through one or more times and weigh those differently so when they do 
 > > get queued up, they are higher in the queue than maybe something that 
 > > is just going through its first time. 
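
(A toy sketch of that weighting, with made-up names; it glosses over where Zuul would actually have
to persist the re-enqueue history, which is the hard part:)

    from collections import defaultdict

    times_enqueued = defaultdict(int)   # change -> times seen in the pipeline

    def priority_with_history(project_position, change):
        """project_position: the change's position among its project's queued
        changes (0 = first).  A change that has already been through the
        pipeline N times is bumped up by N positions."""
        times_enqueued[change] += 1
        return max(0, project_position - (times_enqueued[change] - 1))
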
 >  
 > This suggestion would be difficult to implement, but also, I think it 
 > runs counter to some of the ideas that have been put into place 
 > in the past.  In particular, the idea of clean-check was to make it 
 > harder to merge changes with gate failures (under the assumption that 
 > they are more likely to introduce racy tests).  This might make it 
 > easier to recheck-bash bad changes in (along with good). 
 >  
 > Anyway, we chatted in IRC a bit and came up with another tweak, which is 
 > to group projects together in the check pipeline when setting this 
 > priority.  We already to in gate, but currently, every project in the 
 > system gets equal footing in check for their first change.  The change 
 > under discussion would group all tripleo projects together, and all the 
 > integrated projects together, so that the first change for a tripleo 
 > project had the same priority as the first change for an integrated 
 > project, and a puppet project, etc. 
 >  
 > The intent is to further reduce the priority "boost" that projects with 
 > lots of repos have. 
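
(Continuing the sketch above, the grouping tweak only changes the key that is counted: queued changes
are tallied per group rather than per project.  The mapping here is invented for illustration, not
the real configuration.)

    from collections import defaultdict

    # Hypothetical repo -> group mapping; the real grouping lives in config.
    PROJECT_GROUPS = {
        "openstack/tripleo-common": "tripleo",
        "openstack/tripleo-heat-templates": "tripleo",
        "openstack/nova": "integrated",
        "openstack/cinder": "integrated",
        "openstack/puppet-nova": "puppet",
    }

    def grouped_priorities(queued_changes):
        """Like relative_priorities(), but counts per group, so a family with
        many repos no longer gets a separate priority boost for each repo."""
        seen = defaultdict(int)
        priorities = {}
        for project, change in queued_changes:
            key = PROJECT_GROUPS.get(project, project)
            priorities[change] = seen[key]
            seen[key] += 1
        return priorities
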
 >  
 > The idea is still to try to find a simple and automated way of more 
 > fairly distributing our resources.  If this doesn't work, we can always 
 > return to the previous strict FIFO method.  However, given the extreme 
 > delays we're seeing across the board, I'm trying to avoid the necessity 
 > of actually allocating quota to projects.  If we can't make this work, 
 > and we aren't able to reduce utilization by improving the reliability of 
 > tests (which, by *far* would be the most effective thing to do -- please 
 > work with Clark on that), we may have to start talking about that. 
 >  
 > -Jim 

We could also optimize node usage by removing a change's remaining jobs from the running queue on the
first failure and releasing their nodes, instead of letting the full run complete and only then releasing them.
This is a trade-off against getting all of the failures at once and fixing them together, but I am not sure that
is what happens most of the time.  For example, if a change has a pep8 error, there is no need to run the
integration test jobs for it.  This could save nodes to at least some extent.
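
Roughly like this, treating a change's jobs as if they ran one after another for simplicity (all
names below are illustrative, not Zuul's API):

    def run_change_jobs(jobs, run_job, release_node):
        """jobs: job names in order, cheapest first (e.g. pep8 before tempest).
        run_job(name) returns True on success; release_node(name) returns that
        job's node to the pool without running it."""
        results = {}
        for i, name in enumerate(jobs):
            results[name] = run_job(name)
            if not results[name]:
                # First failure: skip the remaining jobs and free their nodes.
                for skipped in jobs[i + 1:]:
                    release_node(skipped)
                    results[skipped] = None   # not run
                break
        return results

    # Example: a pep8 failure means the expensive integration jobs never run.
    print(run_change_jobs(
        ["openstack-tox-pep8", "tempest-full", "grenade"],
        run_job=lambda name: name != "openstack-tox-pep8",
        release_node=lambda name: print("released node for", name)))
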


-gmann
