[OpenStack-Infra] Your Gerrit Account has been temporarily disabled
Znoinski, Waldemar
waldemar.znoinski at intel.com
Mon Jun 15 12:56:30 UTC 2015
Hi Jeremy (and a note to everybody),
As discussed on IRC after our troubleshooting of the problem...
It looks like, in my case, the culprit is my small Python script that uses 'pygerrit'. The problem appears when the script is stopped with CTRL+Z in the shell and left that way for some time, i.e. 15+ minutes (depending on the number/size of Gerrit events coming in). That may then cause stream-events to block on your Gerrit server. If I remember your suggestion correctly, it may have to do with a buffer/queue on the Gerrit server filling up and 'halting' the process. Once the script is brought back from the Stopped state (fg <job_number>), it resumes normal operation, and stream-events on the server resumes normally too.
I could NOT reproduce the same behavior by stopping a normal 'ssh review.openstack.org ... gerrit query...' command.
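The buffer-filling effect described above can be illustrated locally. The following is only a minimal sketch, not Gerrit's actual implementation: it uses a local socket pair in place of the SSH connection, and the names fill_until_blocked, drain and demo are hypothetical helpers made up for this illustration. One end plays the server writing events, the other plays a client that has stopped reading (as a process suspended with CTRL+Z would).

```python
import socket


def fill_until_blocked(writer, chunk=b"x" * 4096):
    """Write to a non-blocking socket until the kernel buffers are full.

    Returns the number of bytes accepted before the socket would block.
    A blocking writer (such as a server worker thread) would hang at
    that point instead of getting an exception.
    """
    writer.setblocking(False)
    total = 0
    while True:
        try:
            total += writer.send(chunk)
        except BlockingIOError:
            return total


def drain(reader, nbytes):
    """Read up to nbytes from a blocking socket; return the bytes read."""
    remaining = nbytes
    while remaining:
        data = reader.recv(min(65536, remaining))
        if not data:
            break
        remaining -= len(data)
    return nbytes - remaining


def demo():
    # server_side stands in for the event writer, client_side for the
    # stopped client; both names are assumptions for this sketch.
    server_side, client_side = socket.socketpair()
    try:
        # The "client" reads nothing, so the buffers eventually fill up.
        queued = fill_until_blocked(server_side)
        # The "client" resumes (like 'fg') and reads the backlog.
        drained = drain(client_side, queued)
        # With the backlog gone, the writer can send again.
        resumed = server_side.send(b"event\n")
        return queued, drained, resumed
    finally:
        server_side.close()
        client_side.close()


if __name__ == "__main__":
    print(demo())
```

The amount of data that fits before the writer blocks depends on the operating system's buffer sizes, which would be consistent with the delay depending on the number/size of events.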
I'd advise anybody who writes their own Gerrit clients to avoid stopping (suspending) commands that run gerrit stream-events, at least with the Gerrit server version OpenStack infra is using.
Thanks for your time troubleshooting the problem, Jeremy.
Yet another lesson learned.
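One possible way for a client to follow the advice above is to catch the signal that CTRL+Z sends (SIGTSTP) and close the connection instead of letting the process be stopped with the connection still open. This is a sketch under assumptions, not something pygerrit provides as far as I know: close_on_suspend and the close_stream callback are hypothetical names, and SIGTSTP handling is POSIX-only.

```python
import signal


def close_on_suspend(close_stream):
    """Install a SIGTSTP handler that calls close_stream().

    With a handler installed, CTRL+Z no longer stops the process;
    the callback runs instead. Returns the previous handler so the
    caller can restore it. Note that SIGSTOP (kill -STOP) cannot be
    caught, so this only covers the interactive CTRL+Z case.
    Must be called from the main thread.
    """
    def _handler(signum, frame):
        # Hypothetical cleanup: close the stream-events connection.
        close_stream()

    return signal.signal(signal.SIGTSTP, _handler)
```

In a real client, close_stream would close the SSH channel carrying stream-events, and the client would reconnect when it next wants events.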
Waldek
> -----Original Message-----
> From: Jeremy Stanley [mailto:fungi at yuggoth.org]
> Sent: Friday, June 5, 2015 4:52 PM
> To: Znoinski, Waldemar
> Cc: openstack-infra at lists.openstack.org
> Subject: Re: [OpenStack-Infra] Your Gerrit Account has been temporarily
> disabled
>
> On 2015-06-05 15:12:42 +0000 (+0000), Znoinski, Waldemar wrote:
> [...]
> > I'd like to know more about what you saw and/or what was causing (or
> > you think was) the problem.
>
> Specifically, we saw all available Gerrit stream-events worker threads busy
> servicing your connections, and all other stream-events tasks queued and
> waiting for an available worker thread.
> Unfortunately when Gerrit gets into this situation, it seems that merely killing
> the tasks being serviced does not wake up the waiting tasks and so all the
> other stream-events connections get no new updates until we restart the
> entire Gerrit service.
>
> > * Was it too many connections spawned in a given amount of time?
>
> No, it looked like they had been opened at different times over the course of
> at least several hours.
>
> > * Were the connections long lasting (possible lack of closing the
> > connections)?
>
> I think this may be the problem (not the long lasting, but the not being closed
> while actually defunct).
>
> > * Was the command inside the ssh session not finishing/hanging (or
> > long running) ?
>
> It's unfortunately hard to tell from what little detail we get in Gerrit logs and
> thread dumps.
>
> > What I see on my side for the last 24h period is ~10 connections to
> > review.openstack.org which were hanging and not closed on my side, yet
> > not doing anything as far as I can tell. From your description of the
> > problem that may be it - Gerrit threads consumed unnecessarily. If you
> > have connection details of the problematic ssh sessions (source port
> > at least) it would be great.
>
> I don't, but we might be able to recreate this problem now that we have a
> little better idea of the surrounding circumstances.
>
> > As I understand it, listening to the Gerrit event stream is not causing the
> > issue but the second part (to run 'gerrit query' over ssh) is
> > - correct me if I'm wrong.
> [...]
>
> It's actually the gerrit stream-events connections that are the problem, not
> gerrit query from what we can tell. Was there anything unusual about your
> open stream-events connections from your end, as far as you know? I'm sort
> of wondering whether connections which get uncleanly terminated at the
> client (firewall drops an existing state and doesn't spoof a FIN or RST or send
> a relevant ICMP error) cause the socket buffer to fill up and then the worker
> threads block on write once that happens. Speculation for now, but once we
> can nail this down hopefully we'll be able to provide an actionable bug report
> to the Gerrit developer community.
> --
> Jeremy Stanley