<div dir="ltr"><div dir="ltr"><div><div>Today we had a sync-up call to discuss this. To summarize:</div><div><br></div><div>Attendees:</div><div>Aleksandr Didenko</div><div>Alex Schultz</div><div>Andrew Woodward</div><div>Alexey Shtokolov</div><div>Bartek Kupidura</div><div>Bogdan Dobrelya</div><div>Denis Egorenko</div><div>Ivan Berezovskiy</div><div>Kyrylo Galanov</div><div>Maksim Malchuk</div><div>Matthew Mosesohn</div><div>Max Yatsenko</div><div>Oleg Gelbukh</div><div>Oleksiy Molchanov</div><div>Petr Zhurba</div><div>Sergey Kolekonov</div><div>Sergey Vasilenko</div><div>Sergii Golovatiuk</div><div>Stanislav Makar</div><div>Stanislaw Bogatkin</div><div>Vladimir Eremin</div><div>Vladimir Kuklin</div><div><br></div><div>Issue: moving to puppet-openstack on master has exposed fuel-library to breakage, and there are many concerns about changes landing that can break it.</div><div><br></div><div>Alex S. proposed that we stay the course and finish setting up Check voting on the relevant puppet-openstack modules. The participants agreed with this.</div><div><br></div><div>Action: Sergii G. &amp; Aleksandra Fedorova will propose the needed changes to project-config to add the tests.</div><div><br></div><div>Issue: closing the regressions gap until fuel-ci votes on puppet-openstack check</div><div><br></div><div>It was proposed that we build a system that holds back the module versions nightly and automatically moves them forward once automated testing has completed.</div><div><br></div><div>Action: There was no consensus on this; it should be discussed further on this thread.</div><div><br></div><br><br><div class="gmail_quote"><div dir="ltr">On Sun, Mar 6, 2016 at 11:33 PM Dmitry Borodaenko &lt;<a href="mailto:dborodaenko@mirantis.com" target="_blank">dborodaenko@mirantis.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Aleksandra,<br>

<br>

Very good point on separating the concerns about integration tests for<br>

Fuel as a whole and verifying commits to a single component such as<br>

fuel-library. In theory, it could support the right balance between<br>

stable CI and up-to-date code, but only if we resolve the two remaining<br>

problems: one small and technical and the other large and social.<br>

<br>

You've already pointed out the first problem: update of fuel-library CI<br>

environment is not yet fully automated, and so the environment is liable<br>

to lag behind all involved components for days if not weeks.<br>

<br>

This by itself is simple enough, if laborious, to work around (update it<br>

manually every day, or after every successful BVT), but still leaves us<br>

with the problem of motivation.<br>

<br>

We've been discussing the CI duty for fuel-library integration with<br>

puppet-openstack for more than a month now [0], and it has<br>

continuously failed to materialize. Within days of getting an action<br>

item in that IRC meeting to arrange it, Andrew Maksimov has responded<br>

privately that nobody in his team has time for this. And we all know<br>

what "I don't have time" actually means [1]. Two weeks later, we were<br>

ready to launch the integration and the question of CI duty came up<br>

again [2], with the same result.<br>

<br>

[0] <a href="http://eavesdrop.openstack.org/meetings/fuel/2016/fuel.2016-02-04-16.02.log.html#l-66" rel="noreferrer" target="_blank">http://eavesdrop.openstack.org/meetings/fuel/2016/fuel.2016-02-04-16.02.log.html#l-66</a><br>

[1] <a href="http://lifehacker.com/5892948/instead-of-saying-i-dont-have-time-say-its-not-a-priority" rel="noreferrer" target="_blank">http://lifehacker.com/5892948/instead-of-saying-i-dont-have-time-say-its-not-a-priority</a><br>

[2] <a href="http://eavesdrop.openstack.org/meetings/fuel/2016/fuel.2016-02-18-16.00.log.html#l-190" rel="noreferrer" target="_blank">http://eavesdrop.openstack.org/meetings/fuel/2016/fuel.2016-02-18-16.00.log.html#l-190</a><br>

<br>

Here we are two more weeks later, the integration is on, and the first<br>

reaction from fuel-library core reviewers is "we don't have time to deal<br>

with this, turn it back off right now". And I'm not just summarizing<br>

Vladimir's email: on Friday we had a long thread on an internal mailing<br>

list with exactly this in the subject line (my apologies, but my disgust<br>

at the fact that it was started behind closed doors drowns any qualms<br>

about dragging it back into the open).<br>

<br>

After we change Fuel CI to use fixed revisions of puppet-openstack modules<br>

(the most recent ones to have passed BVT), the first thing that will happen is<br>

that BVT on Fuel ISO will start failing again, while fuel-library CI<br>

will continue to work. Without the pressure of failing commit<br>

verification CI, fuel-library developers will have even less incentive<br>

to keep fuel-library up to date with puppet-openstack (not to mention<br>

pro-actively reviewing puppet-openstack commits to catch potential<br>

regressions before they happen), and very soon Fuel QA team will get fed<br>

up with not having a stable ISO for the swarm test, and will demand that<br>

we go back to using fixed puppet-openstack revisions for the ISO, too.<br>

<br>

Both here and on the internal thread, many technical and organizational<br>

concerns were raised, and I'll get to them in a bit, but a concern<br>

without the will to resolve it is only an excuse; we won't get far if we<br>

don't want to make it work.<br>

<br>

So why don't fuel-library developers want to spend time on<br>

puppet-openstack integration?<br>

<br>

I see two dimensions to this problem. On one axis, there's the<br>

cost/benefit balance: how much work does it take, and what do we gain<br>

from doing it? On the other is the question of who benefits and who<br>

carries the costs?<br>

<br>

Without tracking HEAD of puppet-openstack in fuel-library, the primary<br>

cost is carried by puppet-openstack developers who maintain the upstream<br>

modules in the first place, and a small fraction of fuel-library<br>

contributors (5+ out of 50+ [3][4]) who periodically have to spend<br>

a significant amount of effort to bring fuel-library up to date with the<br>

current state of puppet-openstack. Even though the conversion to<br>

librarian has made the upstream sync simpler and safer, preparing the<br>

update to Mitaka still took a full month of work for 5-7 people.<br>

<br>

[3] <a href="http://stackalytics.com/?module=puppet%20openstack-group&amp;company=mirantis&amp;metric=commits" rel="noreferrer" target="_blank">http://stackalytics.com/?module=puppet%20openstack-group&amp;company=mirantis&amp;metric=commits</a><br>

[4] <a href="http://stackalytics.com/?module=fuel-library&amp;company=mirantis&amp;metric=commits" rel="noreferrer" target="_blank">http://stackalytics.com/?module=fuel-library&amp;company=mirantis&amp;metric=commits</a><br>

<br>

Secondary costs are carried by Fuel Infra and QA teams who have to<br>

support CI based on two OpenStack releases in parallel during that<br>

month, fuel-library and puppet-openstack developers who have to deal<br>

with a spike in code churn, all Fuel contributors who are blocked by<br>

the merge freeze during the transition, and once again the Fuel QA team, who<br>

occasionally get blocked by bugs that were fixed in upstream and not yet<br>

pulled into fuel-library.<br>

<br>

In short, under that model, most fuel-library developers don't have to<br>

do much to gain the benefit of being up to date with upstream, such as<br>

getting support of the next OpenStack release. The integration cost,<br>

around 7-10 man-months per release, is carried mostly by other people.<br>

<br>

Transition to full integration with upstream via tracking HEAD of<br>

puppet-openstack in fuel-library dramatically alters this balance.<br>

Massive upstream sync is gone, and so are the associated costs of<br>

parallel CI, transition merge freeze, and missing upstream bugfixes. The<br>

code churn is still there, but more evenly spread over time.<br>

<br>

Instead, the primary cost becomes the CI duty that requires a<br>

fuel-library developer to watch upstream commits for Fuel CI failures<br>

and prevent those from impacting fuel-library. According to the same<br>

internal thread, that's "over 50% of one developer's time every day", so<br>

3-5 man-months per release, or roughly half of the cost of the periodic<br>

sync.<br>

<br>

The secondary cost is the risk of upstream commits causing regressions<br>

that block the whole fuel-library team for several hours at a time. Is<br>

this risk a good excuse to revert the change that reduces the cost of<br>

supporting a new OpenStack release by half and reduces Fuel's lag behind<br>

puppet-openstack by a month? Only if we can't mitigate it.<br>

<br>

The problem is, most fuel-library developers don't stand to gain<br>

anything from this change: they now have to participate in something<br>

that was previously taken care of, however inefficiently, by other<br>

people. And that is why, instead of constructive proposals about<br>

mitigating the risk of regressions, we see demands to go back to the<br>

time when they didn't need to bother.<br>

<br>

As promised, moving on to specific concerns and questions.<br>

<br>

On Tue, Mar 01, 2016 at 02:21:48PM +0300, Vladimir Kuklin wrote:<br>

> Dmitry, could you please point me at the person who will be strictly<br>

> responsible for creating this 'ketchup' commit? Do you know that this<br>

> may take up the whole day (couple of hours to do RCA, couple of hours<br>

> on writing and debugging and couple of hours for FUEL CI tests run)<br>

> and block the entire Fuel project from having ANY code merged?<br>

<br>

It's not reasonable to expect a single person, or even a small team, to<br>

do this every day all year round. That's why we've been discussing CI<br>

duty. Even if it takes all day every day, between 50+ fuel-library<br>

developers that's just one week per person per year, not that much of a<br>

burden.<br>

<br>

And it doesn't have to block anyone from merging code to Fuel<br>

repositories: there are many ways to mitigate that, such as the ones that<br>

Sergey and Aleksandra have proposed in this thread. We just need to<br>

start discussing these ways instead of arguing about why we shouldn't<br>

bother.<br>

<br>

> I have always thought that building software is about verification<br>

> being more important than 'trust'. There should not be any<br>

> humanitarian stuff involved - we are not in a relationship with<br>

> Puppet-OpenStack folks,<br>

<br>

I have explained above why motivation is the blocking issue here, and<br>

not the technical concerns. Of course we are in a relationship with<br>

Puppet OpenStack: both projects are part of OpenStack Big Tent, we have<br>

the same six-month release cycle, and on the code level their modules<br>

are so tightly coupled into fuel-library that we can't treat them as a<br>

third-party library. The fact that we've started to pull them from<br>

separate git repositories shouldn't have stopped us from treating them<br>

as a part of our codebase. Like it or not, our relationship with them is<br>

more "in the same boat" than it is a "zero-sum game".<br>

<br>

> although I really admire their work very much.<br>

<br>

  lip service<br>

      n 1: an expression of agreement that is not supported by real<br>

           conviction [syn: {hypocrisy}, {lip service}]<br>

<br>

> We should not follow sliding git references without being 100% sure<br>

> that we have mutual gating of the code.<br>

<br>

Setting up mutual gating is impossible without the mutual trust that you<br>

have so easily dismissed. Sliding git references and the CI duty to<br>

support them are all parts of establishing that mutual trust; it won't<br>

just appear out of thin air and empty promises.<br>

<br>

Even at the level of trust we already have, I'm sure puppet-openstack<br>

core reviewers can agree to hold off merging a commit if a fuel-library<br>

developer votes -1 with a comment like "Fuel CI failed for this one,<br>

please give me a couple of hours to figure out why". A poor man's<br>

substitute for mutual gating, but serviceable nonetheless.<br>

<br>

> Moreover, having such git ref as a source in our Puppetfile will lead<br>

> to the situation when we have UNREPRODUCIBLE build of Fuel project.<br>

<br>

Easily mitigated with tooling, same as the undeservedly maligned removal<br>

of version.yaml.<br>

<br>

On Fri, Mar 04, 2016 at 04:51:34PM +0300, Dmitry Pyzhov wrote:<br>

> 1) It takes more than 50% time of a senior engineer;<br>

<br>

As explained above, even at 100% time it's less than the time we've been<br>

spending on periodic upstream syncs.<br>

<br>

> 2) There is a lot of noise in tests results because of broken CI<br>

> and/or broken Fuel master;<br>

<br>

Can be fixed by Aleksandra's proposal.<br>

<br>

> 3) There is a lot of noise in tests results because of big number of<br>

> WIP commits that nobody is going to merge;<br>

<br>

Once we make Fuel CI votes visible (I see no reason to delay that any<br>

longer), it's going to be trivial to filter out commits with WIP flag or<br>

with a -1 from a voting gate job (why investigate Fuel CI failure if the<br>

commit can't pass a beaker test).<br>

<br>

> 4) There is no quick way to understand if the test failure is caused by<br>

> commit or by other reasons;<br>

<br>

Is this a duplicate of #2 or a general observation about how difficult<br>

it is to investigate Fuel CI failures? If the latter, this problem is<br>

not limited to puppet-openstack and is causing us pain in all our repos;<br>

we should either fix it soon or give up on Fuel CI altogether.<br>

<br>

> 5) There is no quick way to understand if the issue should be fixed in<br>

> the commit or in Fuel;<br>

<br>

Yes there is: simply pick the side where it's easier to fix.<br>

<br>

> 6) Most important. Our monitoring doesn't protect us. Our master will<br>

> be broken by upstream manifests again sooner or later. And nobody<br>

> knows how much time it will take to fix it.<br>

<br>

Our master gets broken by our own mistakes at least as often as by<br>

upstream manifests; anything we can do to protect ourselves from that is<br>

applicable to puppet-openstack just the same.<br>

<br>

--<br>

Dmitry Borodaenko<br>

<br>


</blockquote></div></div></div></div><div dir="ltr">-- <br></div><div dir="ltr"><p dir="ltr"><span style="font-size:13.1999998092651px">Andrew Woodward</span></p><p dir="ltr"><span style="font-size:13.1999998092651px">Mirantis</span></p><p dir="ltr"><span style="font-size:13.1999998092651px">Fuel Community Ambassador</span></p><p dir="ltr"><span style="font-size:13.1999998092651px">Ceph Community</span></p>

</div>