[OpenStack-Infra] Public numbers about the scale of the infrastructure/CI ?

James E. Blair corvus at inaugust.com
Mon Mar 26 20:30:11 UTC 2018


David Moreau Simard <dmsimard at redhat.com> writes:

> On Mon, Mar 26, 2018 at 10:20 AM, James E. Blair <corvus at inaugust.com> wrote:
>>> - # of jobs and Ansible playbooks per month run by Zuul
>>
>> I'm curious about this one -- how were you planning on defining these
>> values and obtaining them?
>>
>
> I've needed to pull statistics out of Zuul in the past for RDO (i.e.,
> justifying budget for CI resources), and I use the SQL reporter data
> to do it.
> It looks like this:
>
> -- Builds started in February 2018 (TIMEDIFF is MySQL syntax)
> SELECT job_name,
>        result,
>        start_time,
>        end_time,
>        TIMEDIFF(end_time, start_time) AS duration
> FROM zuul_build
> WHERE start_time BETWEEN '2018-02-01 00:00:00'
>                      AND '2018-02-28 23:59:59';
>
> This gets me the number of monthly *jobs*, and I can extrapolate to a
> playbook count ("over N playbooks") by estimating a number, knowing
> that:
> - base and post playbooks are fairly consistently X playbooks
> - there is at least one "run" playbook
>
> So pretending that 1000 jobs ran, I can say something like:
> 1000 jobs and over [1000 * (X+1)] playbooks
>
> It's not a perfect number but we know we run more playbooks than that.
>
> What I have also been thinking about is that, if I want a more
> accurate number, I could sum all the executor playbook results
> (which are in graphite), but the history for those doesn't go too far
> back.
> Ex: stats.zuul.executor.ze*_openstack_org.phase.*.*
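
The lower-bound estimate described in the quoted text can be sketched
in a few lines of Python.  The function name and the value used for X
are hypothetical; the real X depends on how many base and post
playbooks a given deployment runs per job.

```python
def min_playbooks(job_count: int, base_and_post_playbooks: int) -> int:
    """Lower bound on playbooks run for a given number of reported jobs.

    Each job is assumed to run the base/post playbooks (X of them)
    plus at least one "run" playbook, so the floor is jobs * (X + 1).
    """
    if job_count < 0 or base_and_post_playbooks < 0:
        raise ValueError("counts must be non-negative")
    return job_count * (base_and_post_playbooks + 1)


# X = 4 is a made-up placeholder, not a measured value.
print(min_playbooks(1000, 4))
```

This only gives a floor: jobs with several "run" playbooks, and jobs
that were launched but never reported, push the real number higher.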

The SQL query gets the number of completed jobs that were *reported*.
It doesn't get you two other numbers: the jobs *launched* (many of
which may have been aborted before completion) and the jobs
*completed* (the results of many of which may have been discarded due to
changes in the environment).  In reality, the system is likely to be
significantly busier than the number of jobs reported indicates.

Both of the other values can be obtained from graphite or by parsing
logs.  I think for this purpose, graphite might be sufficient.  (The
only time I'd recommend going to logs is when we need to find
project-specific resource usage information.)

stats_counts.zuul.executor.*.builds should be all jobs launched.
stats_counts.zuul.tenant.*.pipeline.*.all_jobs should be all jobs completed.
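
As a rough sketch of how those two counters might be pulled, the Python
below builds Graphite render API URLs that sum every matching series
and then totals the returned datapoints.  The host name and both helper
names are hypothetical, and it assumes the stats_counts series are
stored with sum aggregation so that totals survive roll-up.

```python
from urllib.parse import urlencode

GRAPHITE_URL = "https://graphite.example.org"  # hypothetical host

LAUNCHED = "stats_counts.zuul.executor.*.builds"
COMPLETED = "stats_counts.zuul.tenant.*.pipeline.*.all_jobs"


def render_url(metric, start, end, base=GRAPHITE_URL):
    """Build a render API URL that sums all series matching `metric`."""
    params = {
        "target": f"sumSeries({metric})",
        "from": start,   # e.g. "20180201"
        "until": end,    # e.g. "20180301"
        "format": "json",
    }
    return f"{base}/render?{urlencode(params)}"


def total(render_json):
    """Add up the non-null datapoints of a parsed render API response.

    The JSON format is a list of series, each with a "datapoints" list
    of [value, timestamp] pairs; gaps are reported as null (None).
    """
    return sum(
        value
        for series in render_json
        for value, _timestamp in series["datapoints"]
        if value is not None
    )


print(render_url(LAUNCHED, "20180201", "20180301"))
print(render_url(COMPLETED, "20180201", "20180301"))
```

Fetching each URL and passing the decoded JSON to total() would give
one number per counter for the chosen window, subject to how far back
the graphite retention actually goes.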

-Jim


