[OpenStack-Infra] Report from Gerrit User Summit

James E. Blair corvus at inaugust.com
Wed Sep 4 16:53:54 UTC 2019


Hi,

Monty and I attended the Gerrit User Summit and hackathon last week.  It
was very productive: we learned some good information about upgrading
Gerrit, received offers of help doing so if we need it, formed closer
ties with the Gerrit community, and fielded a lot of interest in Reno
and Zuul.  In general, people were happy that we attended as
representatives of the OpenDev/OpenStack/Zuul communities and (re-)
engaged with the Gerrit community.

Gerrit Upgrade
--------------

We learned some practical things about upgrading to 3.0:

* We can turn off rebuilding the secondary index ("reindexing") on
  startup, both to speed up our normal restarts and to prevent unwanted
  reindexes during upgrades.  (Monty pushed a change for this; see the
  config sketch at the end of this section.)

* We can upgrade from 2.13 -> 2.14 -> 2.15 -> 2.16 during a relatively
  quick downtime.  We could actually do some of that while up, but Monty
  and I advocate just taking a downtime to keep things simple.

* We should, under no circumstances, enable NoteDB before 2.16.  The
  migration implementation in 2.15 is flawed and will cause delays or
  errors in later upgrades.

* Once on 2.16, we should enable NoteDB and perform the migration.  This
  can happen online in the background.

* We should GC the repos before starting, to make reindexing faster.

* We should ensure that we have a sufficiently sized diff cache, so that
  Gerrit will be able to re-use previously computed patchset diffs when
  reindexing.  This can considerably speed up an online reindex.

* We should probably run 2.16 in production for some time (1 month?) to
  allow users to acclimate to polygerrit, and deal with hideCI.

* Regarding hideCI -- will someone implement that for polygerrit?  Will
  it be obviated by improvements in Zuul reporting (tagged or robot
  comments)?  Even if we improve Zuul, will third-party CIs upgrade?
  Do we just ignore it?

* The data in the AccountPatchReviewDb are not very important, and we
  don't need to be too concerned if we lose them during the upgrade.

* We need to pay attention to H2 tuning parameters, because many of the
  caches use H2.

* Luca has offered to help if we need it.

I'm sure there's more, but that's a pretty good start.  Monty has
submitted several changes to our configuration of Gerrit with the topic
"gus2019" based on some of this info.

Gerrit Community
----------------

During the hackathon, Monty and I bootstrapped our workstations with a
full development environment for Gerrit.  We learned a bit about the new
build system (bazel) -- mostly that it's very complicated, changes
frequently from version to version, and many of the options are black
magic.  However, the bazel folks have been convinced that stability is
in the community's interest, and an initial stable version is
forthcoming.

The key practical things we learned are:

* Different versions of Gerrit may want different bazel versions
  (however, I was able to build the tips of all 3 supported branches
  with the latest bazel).

* There is a tool to manage bazel for you (bazelisk), which will fetch
  the right version of bazel for a given branch/project.  It is highly
  recommended, though in theory (especially with the forthcoming stable
  release) it should not be required -- see the previous point.  A
  rough sketch of the build workflow appears after this list.

* The configuration options specified in the developer documentation are
  important and correct.  Monty fixed instability in our docker image
  builds by reverting to just those options.

* Eclipse (or IntelliJ) is the IDE of choice.  Note that the latest
  version of Eclipse (which may not be in distros) is required.  Of
  course, an IDE isn't strictly required, but it's Java, so one helps a
  lot.  There is a helper script to generate the Eclipse project file.

* Switching between branches requires a full rebuild with bazel, and a
  regeneration/re-import of the Eclipse project.  Given that, I suggest
  this pro-tip: maintain a git repo+Eclipse project for each Gerrit
  branch you work on.  Same for your test Gerrit instance (so you don't
  have to run "gerrit init" over and over).

* The Gerrit maintainers are most easily reachable on Slack.

* Monty and I have been given some additional permissions to edit bugs
  in the issue tracker.  They seem fairly willing to give out those
  permissions if others are interested.

* The issue tracker, like most, doesn't receive enough attention when
  it comes to old issues, but it still seems practically useful for
  newer ones.

* The project has formed a steering committee and adopted a
  design-driven contribution process[1] (not dissimilar to our own specs
  process).  More on this later.
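
To make the build-system points concrete, here is roughly what the
bootstrap looked like on our workstations.  This is a sketch from
memory rather than authoritative documentation -- target and script
names should be checked against the Gerrit developer docs:

  # install bazelisk (a single binary, available from its releases
  # page); it reads the .bazelversion file in the checkout and fetches
  # a matching bazel, so version mismatches between branches mostly
  # go away

  git clone --recurse-submodules https://gerrit.googlesource.com/gerrit
  cd gerrit

  # build the release war (output lands under bazel-bin/)
  bazelisk build release

  # generate the Eclipse project files for this checkout
  tools/eclipse/project.py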

Reno
----

The Gerrit maintainers like to make releases at the end of hackathons,
and so we all (most especially the maintainers) observed that the
current process of manually curating release notes is cumbersome and
error-prone.  Monty demonstrated Reno to an enthusiastic reception, and
he will be working on integrating it into Gerrit's release process.
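
For those who haven't seen it, the Reno workflow is roughly: each
change carries a small YAML release note, and the notes are compiled
from git history at release time.  A minimal sketch (the note slug and
paths here are just examples):

  # add a note alongside the change that needs one; this creates a
  # file under releasenotes/notes/ to edit
  reno new fix-frobnicator-timeout

  # at release time, render the accumulated notes from git history
  reno report .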

Zuul
----

Zuul is happily used by the propulsion team at Volvo (currently v2,
working on moving to v3) [2].  Other teams there are looking into it.

The Gerrit maintainers are interested in using Zuul to run Gerrit's
upstream CI system.  Monty and I plan on helping to implement that.

We spoke at length with Edwin and Alice, who are largely driving the
development of the new "checks" API in Gerrit.  It is partially
implemented now and operational in upstream Gerrit.  As written, we
would have some difficulty using it effectively with Zuul.  However,
with Zuul as a use case, some further changes can be made so that it
integrates quite well, and with more work it could become a very nice
integration.

At a very high level, a "checker" in Gerrit represents a single
pass/fail result from a CI system or code analyzer, and must be
configured on the project in advance by an administrator.  Since we want
Zuul to report the result of each job it runs on a change, and we don't
know that set of jobs until we start, the current implementation doesn't
fit the model very well.  For the moment, we can use the checks API to
report the overall buildset result, but not the jobs.  We can, of
course, still use Gerrit change messages to report the results of
individual jobs just as we do now.  But ideally, we'd like to take full
advantage of the structured data and reporting that the checks API
provides.

To that end, I've offered to write a design document describing an
implementation of support for "sub-checks" -- an idea which appeared in
the original checks API design as a potential follow-up.

Sub-checks would simply be structured data about individual jobs which
are reported along with the overall check result.  With this in place,
Zuul could get out of the business of leaving comments with links to
logs, as each sub-check would support its own pass/fail, duration, and
log URL.

Later, we could extend this to support reporting artifact locations as
well, so that within Gerrit, we would see links to the log URL and docs
preview sites, etc.
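
Since that design document is still to be written, the following is
purely an illustration of the idea and not an actual API: a checker
result reported by Zuul might carry structured data along these lines
(field names invented for this sketch, durations in seconds):

  {
    "state": "FAILED",
    "url": "https://zuul.opendev.org/t/openstack/buildset/...",
    "sub_checks": [
      {"name": "openstack-tox-py36", "state": "SUCCESSFUL",
       "duration": 312,  "url": "https://.../job-output.txt"},
      {"name": "tempest-full",       "state": "FAILED",
       "duration": 3104, "url": "https://.../job-output.txt"}
    ],
    "artifacts": [
      {"name": "docs preview", "url": "https://.../docs/"}
    ]
  }

With something like that in place, Gerrit could render the per-job
table (and artifact links) itself, instead of Zuul composing one in a
change message.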

There is an opportunity to do some cross-repo testing between Zuul and
Gerrit as we work on this.

Upstream Gerrit's own Gerrit instance does not have the SSH event
stream available, so before we can do any work against it, we need an
alternative.  I think the best way forward is to implement partial
(experimental) support for the checks API in Zuul, so that we can at
least use it to trigger and report on changes, get OpenDev's Zuul added
as a checker, and then work on implementing sub-checks in upstream
Gerrit and then in Zuul.

Conclusion
----------

I'm sure I'm leaving stuff out, so feel free to prompt me with
questions.  In general we got a lot of work done and I think we're set
up very well for future collaboration.

-Jim

[1] https://gerrit-review.googlesource.com/Documentation/dev-contributing.html#design-driven-contribution-process
[2] https://model-engineers.com/en/company/references/success-stories/volvo-cars/


