<div dir="ltr">Hi Jay,<div><br></div><div>A couple of points.</div><div><br></div><div>I agree that we need to define what "success" is.</div><div>I believe that the metrics that should be used are "Voted +1" and "Skipped".</div>

<div>But apart from certain valid cases, I would say that "Voted -1" is mostly a metric of the bad health of a CI.</div><div>Most of the -1 votes are due to environment issues, configuration problems, etc.<br></div><div>

In my case, the -1 votes are cast manually, since I want to avoid giving extra work to the developers.</div><div><br></div><div>What are some possible solutions?</div><div><br></div><div>On the Jenkins side, I think we could develop a script that parses the result HTML file.</div>

<div>Jenkins would then vote (+1, 0, -1) on behalf of the third-party CI.</div><div>- It would prevent abusive +1 votes</div><div>- If the result HTML is empty, it would indicate that the CI health is bad</div><div>- If all the results are failing, it would also indicate that the CI health is bad</div>
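<div><br></div><div>A minimal sketch of what such a script could look like, in Python. Everything specific here is an assumption for illustration: the result page is assumed to mark each test as a table cell whose class is "pass", "fail" or "skip", and the names ResultCounter, count_outcomes and decide_vote are hypothetical. The vote policy is one possible reading of the points above (empty or all-failing results are treated as a CI health problem and get a 0 rather than a -1). Posting the vote to Gerrit is left out.</div>

```python
from collections import Counter
from html.parser import HTMLParser

# Assumed (hypothetical) report layout: each test result is a table cell
# whose class attribute contains "pass", "fail" or "skip".
OUTCOMES = ("pass", "fail", "skip")


class ResultCounter(HTMLParser):
    """Count test outcomes found in a result HTML page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # Only table cells are considered result markers in this sketch.
        if tag != "td":
            return
        classes = (dict(attrs).get("class") or "").split()
        for outcome in OUTCOMES:
            if outcome in classes:
                self.counts[outcome] += 1


def count_outcomes(html_text):
    """Return a Counter of pass/fail/skip cells in the given HTML text."""
    parser = ResultCounter()
    parser.feed(html_text)
    parser.close()
    return parser.counts


def decide_vote(counts):
    """Map outcome counts to a vote of +1, 0 or -1 (assumed policy)."""
    passed, failed = counts["pass"], counts["fail"]
    if passed + failed == 0:
        # Empty report, or nothing but skips: CI health is suspect, abstain.
        return 0
    if passed == 0:
        # Every executed test failed: more likely a broken CI than a bad patch.
        return 0
    if failed:
        return -1
    return 1
```

<div>A Jenkins job could read the result file, call decide_vote(count_outcomes(text)), and use the returned value as the vote. A 0 would then be a signal for the CI owner to check the environment, not something the developer has to act on.</div>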

<div><br></div><div><br></div><div>Franck</div><div><br></div></div><div class="gmail_extra"><br clear="all">

<br><br><div class="gmail_quote">On Mon, Jun 30, 2014 at 1:22 PM, Jay Pipes <span dir="ltr">&lt;<a href="mailto:jaypipes@gmail.com" target="_blank">jaypipes@gmail.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

Hi Stackers,<br>

<br>

Some recent ML threads [1] and a hot IRC meeting today [2] brought up some legitimate questions around how a newly-proposed Stackalytics report page for Neutron External CI systems [3] represented the results of an external CI system as "successful" or not.<br>


<br>

First, I want to say that Ilya and all those involved in the Stackalytics program simply want to provide the most accurate information to developers in a format that is easily consumed. While there need to be some changes in how data is shown (and the wording of things like "Tests Succeeded"), I hope that the community knows there isn't any ill intent on the part of Mirantis or anyone who works on Stackalytics. OK, so let's keep the conversation civil -- we're all working towards the same goals of transparency and accuracy. :)<br>


<br>

Alright, now, Anita and Kurt Taylor were asking a very poignant question:<br>

<br>

"But what does CI tested really mean? just running tests? or tested to pass some level of requirements?"<br>

<br>

In this nascent world of external CI systems, we have a set of issues that we need to resolve:<br>

<br>

1) All of the CI systems are different.<br>

<br>

Some run Bash scripts. Some run Jenkins slaves and devstack-gate scripts. Others run custom Python code that spawns VMs and publishes logs to some public domain.<br>

<br>

As a community, we need to decide whether it is worth putting in the effort to create a single, unified, installable and runnable CI system, so that we can legitimately say "all of the external systems are identical, with the exception of the driver code for vendor X being substituted in the Neutron codebase."<br>


<br>

If the goal of the external CI systems is to produce reliable, consistent results, I feel the answer to the above is "yes", but I'm interested to hear what others think. Frankly, in the world of benchmarks, it would be unthinkable to say "go ahead and everyone run your own benchmark suite", because you would get wildly different results. A similar problem has emerged here.<br>


<br>

2) There is no mediation or verification that the external CI system is actually testing anything at all<br>

<br>

As a community, we need to decide whether the current system of self-policing should continue. If it should, then language on reports like [3] should be very clear that any numbers derived from such systems should be taken with a grain of salt. Use of the word "Success" should be avoided, as it has connotations (in English, at least) that the result has been verified, which is simply not the case as long as no verification or mediation occurs for any external CI system.<br>


<br>

3) There is no clear indication of what tests are being run, and therefore there is no clear indication of what "success" is<br>

<br>

I think we can all agree that a test has three possible outcomes: pass, fail, and skip. The result of a test suite run is therefore nothing more than the aggregation of which tests passed, which failed, and which were skipped.<br>


<br>

As a community, we must document, for each project, the expected set of tests that must be run for each patch merged into the project's source tree. This documentation should be discoverable so that reports like [3] can be crystal-clear on what the data shown actually means. The report is simply displaying the data it receives from Gerrit. The community needs to be proactive in saying "this is what is expected to be tested." This alone would allow the report to give information such as "External CI system ABC performed the expected tests. X tests passed. Y tests failed. Z tests were skipped." Likewise, it would also make it possible for the report to give information such as "External CI system DEF did not perform the expected tests.", which is excellent information in and of itself.<br>


<br>

===<br>

<br>

In thinking about the likely answers to the above questions, I believe it would be prudent to change the Stackalytics report in question [3] in the following ways:<br>

<br>

a. Change the "Success %" column header to "% Reported +1 Votes"<br>

b. Change the phrase " Green cell - tests ran successfully, red cell - tests failed" to "Green cell - System voted +1, red cell - System voted -1"<br>

<br>

and then, when we have more and better data (for example, # tests passed, failed, skipped, etc), we can provide more detailed information than just "reported +1" or not.<br>

<br>

Thoughts?<br>

<br>

Best,<br>

-jay<br>

<br>

[1] <a href="http://lists.openstack.org/pipermail/openstack-dev/2014-June/038933.html" target="_blank">http://lists.openstack.org/pipermail/openstack-dev/2014-June/038933.html</a><br>

[2] <a href="http://eavesdrop.openstack.org/meetings/third_party/2014/third_party.2014-06-30-18.01.log.html" target="_blank">http://eavesdrop.openstack.org/meetings/third_party/2014/third_party.2014-06-30-18.01.log.html</a><br>


[3] <a href="http://stackalytics.com/report/ci/neutron/7" target="_blank">http://stackalytics.com/report/ci/neutron/7</a><br>

<br>


</blockquote></div><br></div>