[openstack-dev] Documentation containing external resource links & privacy breaches

Thomas Goirand zigo at debian.org
Thu Dec 3 13:37:30 UTC 2015


On many occasions, I've seen our docs contain external links, with
images using src=foo.com/bar. In Debian, I package this documentation,
and lintian (the packaging linter for Debian) yells that these are
privacy breaches. Below is an example with oslo.context, but that is
only one case. There are many which I have to patch out manually in
Debian, which is both time-consuming and quite frustrating.
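
To illustrate the kind of check involved, here is a minimal sketch of a
scanner for generated HTML. It is not lintian's actual implementation;
the names AUTO_LOADED, ExternalResourceFinder, is_remote,
find_external_resources and scan_tree are hypothetical, and the list of
tags is an assumption that only covers the most common cases.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urlparse

# (tag, attribute) pairs that a browser fetches as soon as the page loads.
# <a href> is left out on purpose: a plain link fetches nothing until clicked.
# <link href> over-approximates, since some rel values are never fetched.
AUTO_LOADED = {
    ("img", "src"),
    ("script", "src"),
    ("iframe", "src"),
    ("link", "href"),
}


def is_remote(url):
    """True for http(s)://host/... and protocol-relative //host/... URLs."""
    parsed = urlparse(url)
    return bool(parsed.netloc) and parsed.scheme in ("", "http", "https")


class ExternalResourceFinder(HTMLParser):
    """Collect (tag, url) pairs for resources loaded from a remote host."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        for name, value in attrs:
            if (tag, name) in AUTO_LOADED and value and is_remote(value):
                self.found.append((tag, value))


def find_external_resources(html_text):
    """Return the remote resources referenced by one HTML document."""
    finder = ExternalResourceFinder()
    finder.feed(html_text)
    return finder.found


def scan_tree(root):
    """Map each .html file under root to its remote resources, if any."""
    report = {}
    for path in sorted(Path(root).rglob("*.html")):
        found = find_external_resources(path.read_text(errors="replace"))
        if found:
            report[str(path)] = found
    return report
```

Calling scan_tree() on the directory holding the built HTML returns a
dict of offending files, which could be turned into a failing check in
the docs build rather than something each packager patches afterwards.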

Another good example is our openstackdocstheme including references to
Google Analytics. While I am already a bit annoyed to see that we're
giving (for free) all of our data to Google (sic!), this means that the
docs we generate for absolutely all projects will contain this Google
Analytics JavaScript. This is also a security issue: we're allowing not
only Google, but anyone able to do a man-in-the-middle attack, to
replace Google Analytics' ga.js with anything (remember: when browsing
the docs locally on your computer, you're not using HTTPS, so ga.js is
fetched without TLS). This is not just theory: we've seen real-life
examples of country-wide firewalls (no, no no no, I won't name
China...) playing with these.

I've filed a bug against our docs theme, but it was marked as wontfix,
and the patch I started was reviewed negatively. I've been told
that we use Google Analytics for the openstack.org site, which I don't
think is the right answer. I do believe we should think twice here.
There are many alternatives to Google Analytics, such as web log
analysis (webalizer and the like), and others involving local JavaScript
of the same type as Google Analytics but without the privacy breach.
There are also ways to serve website-wide footers (mod_footer for
Apache, for example). So I do believe there are better approaches to
"we want statistics for openstack.org" than just Google Analytics.
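
As a rough illustration of the web log analysis option, here is a
minimal sketch that counts page views from server access logs, with no
JavaScript in the pages at all. It assumes Apache's common or combined
log format; REQUEST_RE and count_page_hits are hypothetical names, and a
real tool such as webalizer does far more than this.

```python
import re
from collections import Counter

# Matches the request and status fields of a common/combined log line,
# e.g.:  "GET /some/page.html HTTP/1.1" 200 5120
REQUEST_RE = re.compile(r'"GET (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')


def count_page_hits(log_lines):
    """Count successful GET requests for HTML pages, keyed by path."""
    hits = Counter()
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match is None or match.group("status") != "200":
            continue
        # Drop any query string so /page.html?x=1 counts as /page.html.
        path = match.group("path").split("?", 1)[0]
        # Only count pages, not the CSS, images or scripts they pull in.
        if path.endswith((".html", "/")):
            hits[path] += 1
    return hits
```

Feeding it an open access log, for instance count_page_hits(open(path)),
and calling most_common() on the result gives a per-page ranking
computed entirely on the server side.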

So, could we have a general policy that we stop having such external
resources in our documentation? What's the broader view of the
community on this issue?


Thomas Goirand (zigo)

P.S: Here's the output of lintian when generating the
python-oslo.context-doc package:

X: python-oslo.context-doc: privacy-breach-generic
N:    This package creates a potential privacy breach by fetching data from an
N:    external website at runtime. Please remove these scripts or external
N:    HTML resources.
N:    Please replace any scripts, images, or other remote resources with
N:    non-remote resources. It is preferable to replace them with text and
N:    links but local copies of the remote resources are also acceptable as
N:    long as they don't also make calls to remote services. Please ensure
N:    that the remote resources are suitable for Debian main before making
N:    local copies of them.
N:    Severity: important, Certainty: wild-guess
N:    Check: files, Type: binary, udeb
N:    This tag is marked experimental, which means that the code that
N:    generates it is not as well-tested as the rest of Lintian and might
N:    still give surprising results. Feel free to ignore experimental tags
N:    that do not seem to make sense, though of course bug reports are
N:    welcome.

