[openstack-dev] [stable] Re: [neutron] the hostname regex pattern fix also changed behaviour :(

Angus Lees gus at inodes.org
Mon Dec 1 00:56:57 UTC 2014


On Fri Nov 28 2014 at 10:49:21 PM Ihar Hrachyshka <ihrachys at redhat.com>
wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA512
>
> On 28/11/14 01:26, Angus Lees wrote:
> > Context: https://review.openstack.org/#/c/135616
> >
> > If we're happy disabling the check for components being all-digits, then
> > a minimal change to the existing regex that could be backported might be
> > something like
> >   r'(?=^.{1,254}$)(^(?:[a-zA-Z0-9_](?:[a-zA-Z0-9_-]{,61}[a-zA-Z0-9])\.)*(?:[a-zA-Z]{2,})$)'
> >
> > Alternatively (and clearly preferable for Kilo), Kevin has a replacement
> > underway that rewrites this entirely to conform to modern RFCs in
> > I003cf14d95070707e43e40d55da62e11a28dfa4e
>
> With the change, will existing instances work as before?
>

Yes, this cuts straight to the heart of the matter: what's the purpose of
these validation checks?  Specifically, if someone is using an "invalid"
hostname that passed the previous check but doesn't pass an
improved/updated check, should we continue to allow it?

I figure our role here is either to allow exactly what the relevant
standards say and deliberately reject/break anything that falls outside
that, or to be liberal: restrict input only to some sort of 'safe' set and
then let the underlying system perform the final validation.  I can see
plenty of reasons for either approach, but somewhere in the middle doesn't
seem to make much sense, and I think the approach chosen also dictates any
migration path.

As they currently stand, I think both Kevin's and my alternative above
_should_ be more liberal than the original (before the "fix") regex.
Specifically, they now allow all-digits hostname components - in line with
newer RFCs.

However, TLD handling is a little different between the variants:
- Kevin's continues to reject an all-digits TLD (also following RFC
guidance)
- mine and the original force TLDs to be all alpha (a-z; no digits or
dash/underscore)

The TLD handling is more interesting because an unqualified hostname (with
no '.' characters) hits the TLD logic in all variants, but the original has
a "\.?" quirk that means an unqualified hostname is forced to end with at
least 2 alpha-only chars.  As written above, mine is probably too
restrictive for unqualified names (the whole name has to be at least 2
alpha-only chars), and this would need to be fixed.
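
To make that concrete, here is a rough Python sketch that runs a few sample
names through my variant quoted above.  The pattern is copied from the
quoted text with the mail line-wrap joined back together; the names PATTERN
and is_valid and the sample hostnames are made up for illustration and are
not taken from the Neutron code or from Kevin's change.

```python
import re

# Pattern copied from the proposal quoted above (line-wrap rejoined,
# otherwise unchanged). Split over three raw strings only for readability.
PATTERN = re.compile(
    r'(?=^.{1,254}$)'
    r'(^(?:[a-zA-Z0-9_](?:[a-zA-Z0-9_-]{,61}[a-zA-Z0-9])\.)*'
    r'(?:[a-zA-Z]{2,})$)'
)


def is_valid(name):
    """Hypothetical helper: True if name matches the proposed pattern."""
    return PATTERN.match(name) is not None


# Illustrative inputs only.
samples = [
    "123.example.com",  # all-digits component: accepted
    "localhost",        # unqualified, alpha-only: accepted
    "host1",            # unqualified with a digit: rejected by the TLD rule
    "example.123",      # all-digits TLD: rejected
]

for name in samples:
    print(name, is_valid(name))
```

The "host1" case is the one I mean by "too restrictive": with no '.' in the
name, the whole thing is matched by the alpha-only TLD part of the pattern.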

As the above shows, describing regex patterns in prose is long, boring and
inaccurate.  Someone who is going to have to approve the change should just
dictate what they want here and then we'll go do that :P

I suggest they also consider the DoS-fix-backport and the Kilo-and-forwards
cases separately.

 - Gus