[openstack-dev] [doc][i18n][infra][tc] plan for PDF and translation builds for documentation

Matthew Treinish mtreinish at kortar.org
Fri Sep 14 03:21:43 UTC 2018

On Fri, Sep 14, 2018 at 10:09:26AM +0900, Ian Y. Choi wrote:
> First of all, thanks a lot for the nice summary - I would like to read
> it closely and add comments later.
> And @mtreinish, please see my reply inline:
> Matthew Treinish wrote on 9/14/2018 5:09 AM:
> > On Thu, Sep 13, 2018 at 07:23:53AM -0600, Doug Hellmann wrote:
> >> Excerpts from Michel Peterson's message of 2018-09-13 10:04:27 +0300:
> >>> On Thu, Sep 13, 2018 at 1:09 AM, Doug Hellmann <doug at doughellmann.com>
> >>> wrote:
> >>>
> >>>> The longer version is that we want to continue to use the existing
> >>>> tox environment in each project as the basis for the job, since
> >>>> that allows teams to control the version of python used, the
> >>>> dependencies installed, and add custom steps to their build (such
> >>>> as for pre-processing the documentation). So, the new or updated
> >>>> job will start by running "tox -e docs" as it does today. Then it
> >>>> will run Sphinx again with the instructions to build PDF output,
> >>>> and copy the results into the directory that the publish job will
> >>>> use to sync to the web server. And then it will run the scripts to
> >>>> build translated versions of the documentation as HTML, and copy
> >>>> the results into place for publishing.
> >>>>
> >>> Just a question out of curiosity. You mention that we still want to use the
> >>> docs environment because it allows fine-grained control over how the
> >>> documentation is created. However, as I understand it, the PDF output will
> >>> be produced in a more standardized way, outside of that fine-grained control,
> >>> right? Couldn't that lead to differences between the two sets of
> >>> documentation? Do we even have to worry about that?
> >> Good question.  The idea is to run "tox -e docs" to get the regular
> >> HTML, then something like
> >>
> >>    .tox/docs/bin/sphinx-build -b latex doc/source doc/build/latex
> >>    make -C doc/build/latex
> >>    cp doc/build/latex/*.pdf doc/build/html
> > To be fair, I've looked at this several times in the past, and sphinx's latex
> > generation is good enough for the simple case, but on more complex documents
> > it doesn't really work too well. For example, on nova I added this a while ago:
> >
> > https://github.com/openstack/nova/blob/master/tools/build_latex_pdf.sh
> After seeing what the script is doing, I want to divide it into several parts
> and suggest a generic approach for each:
> - svg -> png
>  : PDF builds ideally convert all svg files into PDF with no problems,
> but there are some practical problems,
>    such as determining the bounding box size of vector svg
> files, and heavy memory use with svg files that contain lots of tags.
>  : Maybe it would be solved if we check that all svg files are correctly
> formatted,
>    or if all svg files are converted to png files with temporary changes
> to the rst files (.svg -> .png), wouldn't it?

Yeah, we will have to do one or the other. In my experience, just converting
to png images is normally easier.
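
A minimal sketch of the png route, assuming GNU sed and rsvg-convert (from
librsvg) are installed. The function names and the choice of converter are
illustrative only and are not taken from the nova script. It edits rst files
in place, so it is meant to be run on a throwaway copy of the doc tree:

```shell
# Rasterize every .svg under a directory to a .png next to it.
# Assumption: rsvg-convert is available; any other converter would do.
convert_svgs() {
    find "$1" -name '*.svg' | while IFS= read -r svg; do
        rsvg-convert -o "${svg%.svg}.png" "$svg"
    done
}

# Point image/figure directives in rst files at the .png instead of the .svg.
# Only directive lines ending in .svg are touched; prose mentions are left alone.
# Assumption: GNU sed (for -i without a backup suffix).
rewrite_svg_refs() {
    find "$1" -name '*.rst' -exec sed -i -E \
        -e 's/^([[:space:]]*\.\. (image|figure)::.*)\.svg[[:space:]]*$/\1.png/' \
        {} +
}
```

Usage would be something like "convert_svgs doc/source" followed by
"rewrite_svg_refs doc/source" before running the latex builder.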

> - non-Latin character problems:
>  : By default, Sphinx uses the latex builder, which doesn't support
> non-Latin characters and customized fonts [1].
>    The documentation team tried using xelatex instead of latex in the
> Sphinx configuration, and that setting is now overridden
>    in openstackdocstheme >=1.20. So non-Latin characters should not cause
> problems if you use openstackdocstheme >=1.20.

Ok sure, using XeTeX will solve this problem. I typically still just use
pdflatex, so back when I pushed that script (which was over 3 years ago)
I was trying to fix it by converting the non-latin characters to their latex
symbol equivalents (which is a feature built in to sphinx, but it just
misses a lot of symbols).

> - other things
>  : I could not work out the background of the other changes, such as the
> additional packages.
>    If you provide more background on the other things, I would like to
> investigate how to approach them, either by changing an rst file
>    to make it compatible with pdf builds, or by supporting pdf builds
> on as many project repos as possible.

The extra packages were part of the attempt to fix the non-latin characters
using latex symbols. Those packages are just added there so you can call
\checkmark and \ding{54} instead of ✔ and ✖.
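
The substitutions described above can be sketched roughly like this, assuming
GNU sed (for -i and for \n in the replacement) and a UTF-8 .tex file.
patch_latex_symbols is a made-up name and the actual script may differ.
\checkmark comes from amssymb and \ding from pifont:

```shell
# Replace the raw unicode check/cross marks in a generated .tex file with
# latex symbol commands, and load the packages that define those commands.
# Assumption: the file has a single \documentclass line at the top.
patch_latex_symbols() {
    sed -i -E \
        -e 's/✔/\\checkmark{}/g' \
        -e 's/✖/\\ding{54}/g' \
        -e 's/^(\\documentclass.*)$/\1\n\\usepackage{amssymb}\n\\usepackage{pifont}/' \
        "$1"
}
```

It would be run on the .tex file that the latex builder writes, before
calling make. Newer sphinx releases may already load some of these packages,
in which case the extra \usepackage lines are harmless duplicates.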

> When I tested PDF builds on the current nova repo with the master branch,
> it seemed that the rst document is too big
> (876 pages with error) and that more work on overcoming memory problems
> was needed.
> I would like to think about how to overcome this, but it would also be nice
> if someone shared advice or comments on this.

Hmm, I wasn't able to even get that far. When I tried a vanilla pdf build
from nova master it only compiled 540 pages before it errored out with
"capacity exceeded". I know that the limit is adjustable in a config file,
but I'm not sure if there is a more dynamic method for adjusting it.
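
For reference, a small sketch for checking which limits a TeX install is
currently using, assuming a TeX Live style setup where kpsewhich is on the
PATH. show_tex_limits is a made-up name, and the variables listed are only
some of the capacity settings found in texmf.cnf:

```shell
# Print a few of the memory-related settings TeX would pick up.
# Assumption: kpsewhich (part of Kpathsea / TeX Live) is installed.
show_tex_limits() {
    if ! command -v kpsewhich >/dev/null 2>&1; then
        echo "kpsewhich not found; is TeX Live installed?" >&2
        return 1
    fi
    for var in main_memory extra_mem_bot pool_size save_size; do
        printf '%s=%s\n' "$var" "$(kpsewhich -var-value="$var")"
    done
}
```

"kpsewhich texmf.cnf" shows which config file is in effect. Kpathsea also
generally lets an environment variable of the same name override a texmf.cnf
value for a single run, which may be the more dynamic option, though whether
that is enough for nova's docs is untested here.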

-Matt Treinish

> [1] https://tug.org/pipermail/xetex/2011-September/021324.html
> [2] https://review.openstack.org/#/c/552070/5/openstackdocstheme/ext.py@227
> > To work around some issues with this workflow. It was enough to get the
> > generated latex to actually compile back then. But, that script has bitrotted
> > and needs to be updated, because the latex from sphinx for nova's docs no
> > longer compiles. (also I submitted a patch to sphinx in the meantime to
> > fix the check mark latex output) I'm afraid that it'll be a constant game
> > of cat and mouse trying to get everything to build.
> >
> > I think that we'll find that on most projects' documentation we will need
> > to massage the latex output from sphinx to build pdfs.
> >
> > -Matt Treinish
> >
> >> We would run the HTML translation builds in a similar way by invoking
> >> sphinx-build from the virtualenv repeatedly with different locale
> >> settings based on what translations exist.
> >>
> >> In my earlier comment, I was thinking of the case where a team runs
> >> a script to generate rst content files before invoking sphinx to
> >> build the HTML. That script would have been run before the PDF
> >> generation happens, so the content should be the same. That also
> >> applies for anyone using sphinx add-ons, which will be available
> >> to the latex builder because we'll be using the version of sphinx
> >> installed in the virtualenv managed by tox.