[OpenStack-Infra] Status of check-tempest-dsvm-f20 job

Jeremy Stanley fungi at yuggoth.org
Wed Jun 18 00:07:14 UTC 2014


On 2014-06-17 17:05:05 -0400 (-0400), Sean Dague wrote:
[...]
> If nodepool (conceptually) filled the longest outstanding requests with
> higher priority, I'd be uber happy. This would also help with more fully
> using our capacity, because the mix of nodes that we need any given hour
> kind of changes. But as jeblair said, this is non trivial to implement.
> Ensuring a minimum number of nodes (where that might be 1 or 2) for each
> class would have helped this particular situation. We actually had 0
> nodes in use or ready of the type at the time.
[...]

Part of the complication here is that nodepool's current demand
determination algorithm counts nodes in a "building" state as
satisfying the stated demand, so that it doesn't madly overprovision
when nodes are already on the way to take care of the pending load.
Unfortunately, cloud provider incidents can leave nodepool thinking
a node is building when it's actually stuck. If a lower-demand node
type ends up with all of its proportion occupied by nodes stuck in
that state, this can quickly lead to zero capacity for that type.
The same applies to ready workers which nodepool thinks got added to
a Jenkins master but which didn't actually "stick" (we see this
sometimes and it probably indicates a Jenkins bug of some kind).
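
To make the failure mode concrete, here is a rough sketch of that
counting rule. It is not nodepool's actual code: the Node class, the
"stuck" flag, the per-label quota and both function names are
assumptions invented for illustration, and the real allocation logic
is more involved.

```python
from dataclasses import dataclass


@dataclass
class Node:
    label: str           # node type, e.g. "f20"
    state: str           # "building", "ready" or "used"
    stuck: bool = False  # hypothetical: the node will never become usable


def nodes_to_launch(demand, nodes, label, quota):
    """Return how many new nodes of this label to request.

    Nodes that are building or ready count as already satisfying
    demand, which is what prevents overprovisioning. The function
    cannot see the "stuck" flag, just as nodepool cannot.
    """
    mine = [n for n in nodes if n.label == label]
    satisfied = sum(1 for n in mine if n.state in ("building", "ready"))
    shortfall = max(demand - satisfied, 0)
    headroom = max(quota - len(mine), 0)
    return min(shortfall, headroom)


def usable_capacity(nodes, label):
    """Count nodes of this label that can actually run a job."""
    return sum(
        1
        for n in nodes
        if n.label == label and n.state in ("ready", "used") and not n.stuck
    )


# A low-demand label whose whole share is held by stuck nodes: one
# hung build and one "ready" worker that never registered in Jenkins.
stuck_nodes = [
    Node("f20", "building", stuck=True),
    Node("f20", "ready", stuck=True),
]
launch = nodes_to_launch(demand=2, nodes=stuck_nodes, label="f20", quota=2)
capacity = usable_capacity(stuck_nodes, "f20")
```

With both slots held by stuck nodes, the demand looks satisfied and
nothing new gets launched, yet no node of that type can run a job.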
-- 
Jeremy Stanley


