[Openstack] Blueprint proposal: Drop setuptools_git for including data/config files

Thomas Goirand thomas at goirand.fr
Tue Dec 18 16:22:51 UTC 2012

On 12/18/2012 05:29 PM, Sascha Peilicke wrote:
> On 12/17/2012 04:47 PM, Thomas Goirand wrote:
>> This
>> means that absolutely all of our packages have to embed a patch in
>> debian/patches to "fix" the "wrong" MANIFEST.in.
>> We've spent quite some time on that. Or rather, should I say: it's a
>> real time waster.
>> While I do agree that the MANIFEST.in should be generated automatically,
>> I don't think it should be stored in a "wrong" way on github.
> So it should either contain something meaningful or be removed. In their
> current state, these files are just worthless.


On 12/18/2012 07:50 PM, Mark McLoughlin wrote:
>   1) You don't build packages from the tarballs produced by upstream.
>      I don't understand why you don't use tarballs, but I'm willing
>      to assume it's not something you can (or want to) change.

In many ways, it's very convenient to do what we do. We could go back to
using tarballs, but if it is avoidable, I want to keep the current system.

>   2) Instead you use git-buildpackage which I know nothing about.

In short, git-buildpackage uses an upstream branch (often called
"master") containing upstream code, and a Debian branch containing that,
plus the debian folder. Typing "git-buildpackage" does all the magic
needed to build a Debian package, with the added bonus that you don't
have to build where your package is stored (usually, we build in
../build-area), which means no risk to modify any files in your Git

>   3) You work from git repositories forked from upstream e.g.
>      which AFAICT just add a debian/ directory to the source tree


>   4) 'get-vcs-source' somehow generates a tarball from this tree. I'm
>      guessing it does 'git archive' rather than
>      'python setup.py sdist' but I'm not sure. Some more digging
>      turns up:
>      and it seems that yes, you're using 'git archive'

That's correct, you've worked it out quite well. Also, "get-vcs-source"
fetches the master branch from upstream (git remote add, git fetch...).

The repository is either git://github.com/openstack/<package-name>.git,
or we just override the UPSTREAM_GIT variable before including
/usr/share/openstack-pkg-tools/pkgos.make (that's needed for python
modules which are not on the github.com/openstack repo).
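To make the "git archive" step more concrete, here is a minimal Python sketch of how such a tarball could be produced from any tag or commit. It is not the actual get-vcs-source code from openstack-pkg-tools: the function names, the "<package>-<version>/" prefix and the default destination directory are assumptions made for illustration only.

```python
import os
import subprocess


def orig_tarball_name(package, version):
    """Name of the Debian orig tarball for a given upstream version."""
    return "%s_%s.orig.tar.xz" % (package, version)


def build_archive_commands(package, version, treeish):
    """Return the 'git archive' and 'xz' argument lists (nothing is run).

    The '<package>-<version>/' prefix is an assumption of this sketch.
    """
    prefix = "%s-%s/" % (package, version)
    archive_cmd = ["git", "archive", "--format=tar",
                   "--prefix=" + prefix, treeish]
    xz_cmd = ["xz", "-c"]
    return archive_cmd, xz_cmd


def write_orig_tarball(package, version, treeish, dest_dir=".."):
    """Pipe 'git archive' into 'xz' and write the result to dest_dir."""
    archive_cmd, xz_cmd = build_archive_commands(package, version, treeish)
    path = os.path.join(dest_dir, orig_tarball_name(package, version))
    with open(path, "wb") as out:
        archive = subprocess.Popen(archive_cmd, stdout=subprocess.PIPE)
        xz = subprocess.Popen(xz_cmd, stdin=archive.stdout, stdout=out)
        archive.stdout.close()  # xz now owns the read end of the pipe
        xz_rc = xz.wait()
        archive_rc = archive.wait()
    if archive_rc != 0 or xz_rc != 0:
        raise RuntimeError("could not create %s" % path)
    return path
```

Because "git archive" only exports tracked files, the result never contains a .git directory or a generated versioninfo file, which is exactly the situation described in point 5 below.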

>   5) I'm guessing there are two issues with these 'git archive'
>      generated tarballs - (a) there's no versioninfo file so
>      setuptools doesn't have a version number and (b) there's
>      no .git directory so setuptools doesn't have an accurate way of
>      building a manifest
>      I'm not completely clear what setuptools commands are failing
>      because of issue (b)

All the above is exactly right!

One of the reasons we are happy to use git archive is that this produces
a Debian orig.tar.xz (notice the xz, and not just gz) from *any* commit,
if we want to. Let me explain how it works.

Let's say I want to make a snapshot release of Ceilometer for the commit
hash 23ff2f9bbfc14e435c4c04ddddfba473cf2a829b (this is an actual real
life example, in fact...). Then, to do that, I just do:

git checkout master
git log # find out what's the name of the tag...
git tag 2013.1_g0.4+23ff2f9bbf 23ff2f9bbfc14e435c4c04ddddfba473cf2a829b
git checkout debian/experimental
git merge -X theirs 23ff2f9bbfc14e435c4c04ddddfba473cf2a829b

Then I edit my debian/changelog so it matches the tag:

<example debian/changelog>
ceilometer (2013.1~g0.4+23ff2f9bbf-1) experimental; urgency=low

  * Initial release (Closes: #693406).

 -- Thomas Goirand <zigo at debian.org>  Wed, 14 Nov 2012 14:41:52 +0000
<end of example debian/changelog>

(the important bit is of course the version number above)
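The relationship between the version in debian/changelog and the Git tag can be sketched in a few lines of Python. Git does not allow "~" in ref names, which is why the tag above uses "_" where the Debian version uses "~". The helper names below are hypothetical, and the sketch ignores epochs (a leading "N:").

```python
def upstream_version(debian_version):
    """Strip the Debian revision, i.e. the part after the last '-'."""
    upstream, sep, _revision = debian_version.rpartition("-")
    # Without a '-' there is no revision, so the whole string is upstream.
    return upstream if sep else debian_version


def git_tag_for(debian_version):
    """Map a Debian version to a tag name by replacing '~' with '_'."""
    return upstream_version(debian_version).replace("~", "_")
```

For the changelog entry above, git_tag_for("2013.1~g0.4+23ff2f9bbf-1") gives the tag name used in the "git tag" command earlier.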

Then I generate the orig.tar.xz:

./debian/rules get-vcs-source
cp ../ceilometer_2013.1~g0.4+23ff2f9bbf.orig.tar.xz ../build-area

Then I just build:

Another reason git is convenient is that we can cherry-pick -x
stuff from upstream, do branching, etc. Well, do I really have to
convince people on this list that git is convenient? :) Probably not.

>   6) However, you seem to be saying that issue (b) isn't an issue but
>      rather inaccurate MANIFEST.in files are the problem. How exactly
>      are they causing a problem. If we delete those files from
>      upstream git, does that work for you?

Well, it would be better if the MANIFEST.in could be stored correctly in
upstream Git repositories, so we wouldn't have to deal with them.
Currently, we embed a patch in debian/patches to fix them.

>   7) You're also describing some issue where 'clean targets' (I don't
>       know what they are? Similar to 'make clean'?) are causing
>       commands like 'git fetch origin' to be run? What exactly is
>       going on here? Is this because the versioninfo file gets
>       deleted by the cleanup and our setup.py logic attempts to
>       recreate it? If we had a way of disabling network access during
>       the build (e.g. OS_SETUP_NONET=1) would that solve the problem?

Let me give an example from Ceilometer. Without "cleaner = true",
git-buildpackage attempts to run "./debian/rules clean", which calls
setup.py clean -a. The shell output is then:

zigo at GPLHost:dom0.node4407>_
~/openstack-auto-builder/sources/ceilometer/ceilometer$ git-buildpackage
dh clean  --with python2
   debian/rules override_dh_auto_clean
make[1]: Entering directory
Could not create directory '/root/.ssh'.
The authenticity of host 'git.debian.org (' can't be
RSA key fingerprint is 8c:c0:b8:9f:0a:79:ee:1c:77:c4:b8:a1:70:55:b7:31.
Are you sure you want to continue connecting (yes/no)?

because in ceilometer/openstack/common/setup.py, there's a "git fetch
origin". Openstack common (or Oslo...) makes the wrong assumption that
"origin" is the repository on github, which isn't true in my case (for
us, origin is alioth.debian.org). Worse: it's trying to do this as root,
because the Debian tool "fakeroot" is used for the build. A "ps axuf"
shows a bit more of what's going on:

\_ /usr/bin/python -u /usr/bin/git-buildpackage
 \_ /bin/sh -c debuild -d clean
  \_ /usr/bin/perl /usr/bin/debuild -d clean
   \_ /bin/bash /usr/bin/fakeroot debian/rules clean
    \_ /usr/bin/make -f debian/rules clean
     \_ /usr/bin/perl -w /usr/bin/dh clean --with python2
      \_ /usr/bin/make -f debian/rules override_dh_auto_clean
       \_ /usr/bin/perl -w /usr/bin/dh_auto_clean
        \_ python2.6 setup.py clean -a
         \_ /bin/sh -c git fetch origin +refs/meta/*:refs/remotes/meta/*
          \_ git fetch origin +refs/meta/*:refs/remotes/meta/*
           \_ ssh git.debian.org git-upload-pack '/git/openstack/ceilome

I have discussed this problem already, and Julien Danjou made an attempt
at fixing it (but I'm not sure if his patch has been approved already or
not in gerrit).

Of course, I can add "cleaner = true" so that no "setup.py clean -a" is
called before the build, but I think this is both suboptimal and
potentially dangerous.
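For illustration, the kind of guard suggested in point 7 could look roughly like the Python sketch below. This is not the code in openstack/common/setup.py, and it is not Julien's patch: OS_SETUP_NONET is the variable name proposed earlier in this thread, and both function names are hypothetical.

```python
import os
import subprocess


def network_allowed(environ=None):
    """Return False when the packager has set OS_SETUP_NONET=1."""
    environ = os.environ if environ is None else environ
    return environ.get("OS_SETUP_NONET", "0") != "1"


def fetch_meta_refs(environ=None, run=subprocess.call):
    """Run the 'git fetch origin' seen in the ps output, unless disabled.

    Returns True if the fetch command was run, False if it was skipped.
    """
    if not network_allowed(environ):
        return False
    run(["git", "fetch", "origin", "+refs/meta/*:refs/remotes/meta/*"])
    return True
```

With something like this, debian/rules could export OS_SETUP_NONET=1 and "setup.py clean -a" would no longer try to reach "origin", whatever that remote happens to point to.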

> Honestly, I've just spent a good amount of time trying to figure out
> your problems

Thanks a lot!

> If you want upstream to change its build process to reduce your
> packaging pain, then you really need to take the time to very clearly
> explain what you need in a way that doesn't require upstream folks to
> know anything about your packaging process.

I'm trying ... :)
I hope that with this mail, it's going to be more clear. Let me know if
it's not.

What would be great, for us in Debian, would be if every (PGP signed,
please... ;) ) tag upstream would also include the generation of a
correct MANIFEST.in and versioninfo files.

Just an idea here, I'm not sure if it's really practical. This could be
done simply with a script in Oslo, for example. It would be used to do
the actual work before tagging, to make sure MANIFEST.in and versioninfo
files are correct.
Optionally, I could run this script manually when I need to do a
snapshot release (which is pretty rare) of one of the projects, then
compute the diff file, and include the result in debian/patches, if
needed. The goal being that I'd be reassured that it is the official way
to generate the MANIFEST.in, and that I wouldn't risk forgetting some
files.
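As a rough idea of what such a script could do for MANIFEST.in, here is a minimal Python sketch that turns the list of files tracked by Git into "include" lines. No such script exists in Oslo as far as I know: the function names are hypothetical, and skipping files whose name starts with ".git" (.gitignore, .gitreview, ...) is an assumption about what should not be shipped.

```python
import os
import subprocess


def tracked_files(repo_dir="."):
    """List the files tracked by Git in repo_dir, using 'git ls-files'."""
    out = subprocess.check_output(["git", "ls-files"], cwd=repo_dir)
    return out.decode("utf-8").splitlines()


def render_manifest(paths):
    """Return MANIFEST.in content with one 'include' line per file."""
    lines = []
    for path in sorted(set(paths)):
        if os.path.basename(path).startswith(".git"):
            continue  # assumption: Git housekeeping files are not shipped
        lines.append("include " + path)
    return "\n".join(lines) + "\n"
```

Run just before tagging, render_manifest(tracked_files()) would give a MANIFEST.in that matches the repository content at that commit, so a tarball made with "git archive" would carry a correct manifest without needing the .git directory.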

Please let me know if you need to know more, or if I wasn't clear enough.


Thomas Goirand (zigo)
