[tc][all] Wallaby Cycle Community Goals

Thomas Goirand zigo at debian.org
Wed Sep 30 20:08:35 UTC 2020


Hi Sean,

Thanks for your reply and sharing your concerns. I still don't agree
with you, and here's why.

On 9/30/20 12:29 PM, Sean Mooney wrote:
> On Thu, 2020-09-24 at 22:39 +0200, Thomas Goirand wrote:
>> I'm still available to attempt the /healthcheck thingy, I kind of
>> succeed in all major project but ... nova. Unfortunately, it was decided
>> in the project that we should put this on hold until the /healthcheck
>> can implement more check than just to know if the API is alive. 5 months
>> forward, I believe my original patch [1] should have been approved first
>> as a first approach. Nova team: any reaction?
> i actually dont think the healthcheck enpoint is userful in it current from for any porject
> that has distibuted process like nova, neutron or cinder.

With my operator's hat on, I assure you that it is very useful to
wire-up haproxy. This is btw exactly what it was designed for.

The way haproxy works, is that it actually has to perform an http
connection to check if the web service is alive. Without a specific URL,
we can't filter-out that /healthcheck URL from the saved logs in Elastic
search, which is very annoying.

Not doing a real HTTP check means one falls back to a TCP check, which
means that your logs are polluted with so many "client disconnected
unexpectedly" (that's not the actual message, but that's what it is
doing) since haproxy otherwise does a TCP connection, then closes it
before what a normal HTTP query would do.

I've been told to use other URLs, which isn't the way to go. I very much
think the /healthcheck URL was well designed, and should be used.

> that was not the only concern raised
> either as the default content of the detail responce wich include package infomation was considerd
> a possible security vulnerablity so with out agreeing on what kindo fo info can be retruned, its format and
>  wether this would be a admin only endpoint or a public endpoint tenant can check potentially without auth
> i dont think we should be procedding with this as a goal.

Are you aware that one could simply use "/" of most projects without
auth, and get the version of the project? Example with Nova:

curl -s https://api.example.com/compute/ | jq .

{
  "versions": [
    {
      "id": "v2.0",
      "status": "SUPPORTED",
      "version": "",
      "min_version": "",
      "updated": "2011-01-21T11:33:21Z",
      "links": [
        {
          "rel": "self",
          "href": "https://api.example.com/v2/"
        }
      ]
    },
    {
      "id": "v2.1",
      "status": "CURRENT",
      "version": "2.79",
      "min_version": "2.1",
      "updated": "2013-07-23T11:33:21Z",
      "links": [
        {
          "rel": "self",
          "href": "https://clint1-api.cloud.infomaniak.ch/v2.1/"
        }
      ]
    }
  ]
}

I believe "version": "2.79" is the microversion of the Nova API, which
therefore, exposes what version of Nova (here: Train). Am I correct? I
believe we also must leave this, because clients must be able to
discover the micro-version of the API, right?

Then, if I'm right, isn't this a much more annoying problem than having
a /healtcheck URL which could anyway be filtered by an HAProxy in front?
(note: I don't think the above is really a problem, since the
micro-version doesn't tell if Nova has been patched for security or not...)

> https://github.com/openstack/oslo.middleware/blob/master/oslo_middleware/healthcheck/__init__.py#L150-L152
> ^ is concerning from a security point of view.

This needs to be explicitly enabled in the api-paste.ini, and is
otherwise not displayed. Here's what I get in my Train deployment with
the query suggested in the example you gave:

{
    "detailed": false,
    "reasons": []
}

The code in here:
https://github.com/openstack/oslo.middleware/blob/master/oslo_middleware/healthcheck/__init__.py#L387

shows I'm right. That's also well documented here:
https://docs.openstack.org/oslo.middleware/latest/reference/healthcheck_plugins.html

See where it says this:

# set to True to enable detailed output, False is the default
detailed = False

in the api-paste.ini example.

So here, the point you are making isn't IMO valid.

> nova support configurable middelway still through the api-paste.ini file
> so untill we have a useful health check

As we discussed earlier, what is *not* useful, is any healthcheck that
would do more than what oslo.middleware does right now without caching
things, because that /healthcheck URL is typically called multiple times
per seconds (in our deployment: at least 6 times per seconds), so it
needs to reply fast. So I hope that the super-nice-healthcheck thingy
wont fry my server CPUs ... :)

What is *not* useful as well, is delaying such a trivial patch for more
than 6 months, just in the hope that in a distant future, we may have
something better.

Sure, take your time, get something implemented that does a nice
healtcheck with db access and rabbitmq connectivity checks. But that
should in no way get in the path of having a configuration which works
for everyone by default.

Cheers,

Thomas Goirand (zigo)



More information about the openstack-discuss mailing list