[largescale-sig][nova][neutron][oslo] RPC ping
Bogdan Dobrelya
bdobreli at redhat.com
Tue Jul 28 14:25:04 UTC 2020
On 7/28/20 4:11 PM, Ken Giusti wrote:
>
>
> On Tue, Jul 28, 2020 at 4:48 AM Bogdan Dobrelya <bdobreli at redhat.com> wrote:
>
> On 7/27/20 7:08 PM, Dan Smith wrote:
> >> Tagging with Nova and Neutron as they are mentioned and I thought some
> >> people from those teams had opinions on this.
> >
> > Nova already implements ping() on the compute RPC interface, which we
> > use to make sure compute waits to start up until conductor is available
> > to do its bidding. So if a new obligatory RPC server method is actually
> > added called ping(), it will break us.
> >
> >> Can you refresh my memory on why we dropped this before? I recall
> >> talking about it in Denver, but I can't for the life of me remember
> >> what the conclusion was. Did we intend to use something else for this
> >> that has since fallen through?
> >
> > The prior conversation I recall was about helm sitting on our bus to
> > (ab)use our ping method for health checks:
> >
> >
> > https://opendev.org/openstack/openstack-helm/commit/baf5356a4fb61590a95f64a63c0dcabfebb3baaa
> >
> > I believe that has since been reverted.
> >
> > The primary concern was about something other than nova sitting on our
> > bus making calls to our internal services. I imagine that the proposal
> > to bake it into oslo.messaging is for the same purpose, and I'd probably
> > have the same concern. At the time I think we agreed that if we were
> > going to support direct-to-service health checks, they should be teensy
> > HTTP servers with oslo healthchecks middleware. Further loading down
> > rabbit with those pings doesn't seem like the best plan to
> > me. Especially since Nova (compute) services already check in over RPC
> > periodically and the success of that is discoverable en masse through
> > the API.
>
> Having an RPC ping in the common messaging library could improve liveness
> handling for long-running APIs, such as listing multiple Neutron ports or
> Heat objects with full details, or perhaps running a longish Mistral
> workflow. Of course, it should be implemented so that it does not break
> what already exists in Nova.
>
>
> Not sure this is related to your concern about long-running APIs, but
> O.M. has an optional RPC call heartbeat monitor that verifies the
> connectivity to the server while the call is in progress. See the
> description of call_monitor_timeout in the RPC client docs [0].
Correct, but heartbeats have not proven to be a reliable solution. There
were WSGI- and eventlet-related issues [1] with running heartbeats. I can't
recall what the final outcome of that discussion was or what the fix
was. So relying on explicit pings sent by clients might work better.
[1] https://bugs.launchpad.net/tripleo/+bug/1829062
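
To make the "explicit ping sent by a client" idea concrete, here is a
minimal sketch. It is a toy, in-process model built only on the Python
standard library, not the oslo.messaging API: ToyRpcServer, ping() and
the "pong" reply are hypothetical names chosen for illustration. The
client sends a ping request and treats the absence of a reply within a
timeout as "server not alive".

```python
import queue
import threading


class ToyRpcServer:
    """Hypothetical stand-in for an RPC server (not oslo.messaging)."""

    def __init__(self):
        # Each request is a (method_name, reply_queue) tuple.
        self.requests = queue.Queue()
        self._stopped = threading.Event()
        self._thread = threading.Thread(target=self._serve, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stopped.set()

    def _serve(self):
        # Poll for requests until asked to stop.
        while not self._stopped.is_set():
            try:
                method, reply_q = self.requests.get(timeout=0.05)
            except queue.Empty:
                continue
            if method == "ping":
                reply_q.put("pong")


def ping(server, timeout=0.5):
    """Send an explicit ping; return True only if a reply arrives in time."""
    reply_q = queue.Queue(maxsize=1)
    server.requests.put(("ping", reply_q))
    try:
        return reply_q.get(timeout=timeout) == "pong"
    except queue.Empty:
        # No reply within the timeout: treat the server as not alive.
        return False


if __name__ == "__main__":
    server = ToyRpcServer()
    server.start()
    print(ping(server))
    server.stop()
```

In this sketch the liveness decision rests entirely with the caller: a
server that was never started (or is stuck) simply never answers, and
ping() returns False once the timeout expires. A real implementation
would go over the message bus, and would have to avoid clashing with the
ping() method Nova already exposes.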
>
> 0: https://docs.openstack.org/oslo.messaging/latest/reference/rpcclient.html
>
>
>
> >
> > --Dan
> >
>
>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando
>
>
>
>
> --
> Ken Giusti (kgiusti at gmail.com)
--
Best regards,
Bogdan Dobrelya,
Irc #bogdando