[openstack-dev] improving PyPi modules design & FHS (was: the future of angularjs development in Horizon)

Donald Stufft donald at stufft.io
Fri Nov 14 13:44:24 UTC 2014


> On Nov 13, 2014, at 6:29 PM, Thomas Goirand <zigo at debian.org> wrote:
> 
> On 11/14/2014 06:40 AM, Donald Stufft wrote:
>>>>> Sure! That's how I do most of my Python modules these days. I don't just
>>>>> create them from scratch, I use my own "debpypi" script, which generates
>>>>> a template for packaging. But it can't be fully automated. I could
>>>>> almost do it in a fully automated manner for PEAR packages for PHP (see
>>>>> "debpear" in the Debian archive), but it's harder with Python and pip/PyPi.
>>>> 
>>>> I would be interested to know what makes Python harder in this regard, I
>>>> would like to fix it.
>>> 
>>> The fact that the standard from PyPi is very fuzzy is one of the issues.
>>> There's nothing in the format (for example in the DOAP.xml record) that
>>> tells if a module supports Python3 for example. Then the short and long
>>> descriptions aren't respected, often, you get some changelog entries
>>> there. Then there's no real convention for the location of the sphinx
>>> doc. There's also the fact that dependencies for Python have to be
>>> written by hand on a Debian package. See for example, dependencies on
>>> argparse, distribute, ordereddict, which I never put in a Debian package
>>> as it's available in Python 2.7. Or the fact that there's no real unique
>>> place where dependencies are written on a PyPi "package" (is it hidden
>>> somewhere in setup.py, or is it explicitly written in
>>> requirements.txt?). Etc. On the PHP world, everything is much cleaner,
>>> in the package.xml, which is very easily parse-able.
>> 
>> (This is fairly off topic, so if you want to reply to this in private that’s
>> fine):
> 
> Let's just change the subject line, so that those not interested in the
> discussion can skip the topic entirely.
> 
>> Nothing that says if it supports py3:
>>    Yea, this is a problem, you can somewhat estimate it using the Python 3
>>    classifier though.
> 
> The issue is that this is a not-mandatory tag. And often, it isn't set.
> 
>> Short and Long descriptions aren’t respected:
>>    I’m not sure what you mean by isn’t respected?
> 
> On my templating script, I grab what's supposed to be the short and long
> description. But this leads to importing some RST format long
> description that do include unrelated things. In fact, I'm not even sure
> there's such things as long and short desc in the proper way, so that it
> could just be included in debian/control without manual work.

I suspect this is just a difference between the two systems, then. We do have
such concepts as short and long description, but we support markup (via RST)
in the long description, and obviously since PyPI is not a curated index there’s
nothing stopping people from doing whatever they want in those descriptions.

> 
>> Have to write dependencies by hand:
>>    Not sure what you mean by not depending on argparse, distribute, ordereddict,
>>    etc? argparse and ordereddict are often depended on because of Python 2.6,
> 
> Right. I think this is an issue in Debian: we should have had a
> Provides: in python 2.7, so that it wouldn't have mattered. I just hope
> this specific issue will just fade away as Python 2.6 gets older and
> less used.

For those particular cases, probably. The general issue likely won’t go away though:
it’ll occur any time a new version of Python adds a module that is either already
available separately or that someone backports as a package for older versions
of Python. On the plus side, the newer formats support conditional dependencies, so
you can say things like:

Requires-Dist: argparse; python_version == '2.6'

which will cause it to only be a dependency on Python 2.6. The sdist format doesn’t
yet support this (although since setup.py is executable you can approximate it by
generating a list of dependencies that varies depending on Python version).
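
As a rough sketch of that approximation (the helper name build_install_requires
is made up for illustration, and the package list is only an example), a setup.py
could compute its dependency list from the interpreter that runs it:

```python
import sys


def build_install_requires(version_info=sys.version_info):
    """Return the dependency list for the interpreter running setup.py.

    Hypothetical helper: mirrors the marker python_version == '2.6'.
    """
    requires = []  # unconditional dependencies would be listed here
    if tuple(version_info[:2]) == (2, 6):
        # argparse and OrderedDict only joined the standard library in 2.7,
        # so Python 2.6 needs the separately packaged versions.
        requires.extend(["argparse", "ordereddict"])
    return requires


# In a real setup.py this would be used as:
#     setup(..., install_requires=build_install_requires())
```

The catch is the one that comes up with sdists in general: the result is only
known after the code has run, and only for the Python version that ran it.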

> 
>>    setuptools/distribute should only be depended on if the project is using
>>    entry points or something similar.
> 
> If only everyone was using PBR... :)
> 
>> No unique place where dependencies are written:
>>    If the project is using setuptools (or is usable from pip) then dependencies
>>    should be inside of the install_requires field in the setup.py. I can send
>>    some code for getting this information. Sadly it’s not in a static form yet
>>    so it requires executing the setup.py.
> 
> Executing blindly setup.py before I can inspect it would be an issue.
> However, yes please, I'm curious on how to extract the information, so
> please do send the code!

I just woke up so I’ll extract it from pip and send it later today, however
the general gist is that you execute ``setup.py egg_info`` which will generate
a .egg-info directory alongside the setup.py file, and then inside of that
is a requires.txt file which can be parsed to extract the dependencies. The
gotchas here are that the egg_info command, and the idea of dependencies at all,
are setuptools features rather than distutils ones, so this only works if the project
has a setuptools style setup.py. Even if it doesn’t, you can force the setup.py
to use setuptools with a nasty hack; however, unless it also specifies install_requires
the requires.txt will be empty. The other gotcha is that install_requires
is set when setup.py is executed, so the list of dependencies that it reports may
vary depending on platform, Python version used to execute the setup.py, time of day,
amount of entropy in the universe, etc.
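
Until I dig the real code out of pip, here is a minimal sketch of the idea. It is
not pip's implementation; the function names are invented for this example, and it
assumes requires.txt is a list of requirement lines with optional section headers in
square brackets (which is how setuptools groups extras, as far as I recall):

```python
import glob
import os
import subprocess
import sys


def parse_requires_txt(text):
    """Group requirement lines by section header ('' means unconditional)."""
    sections = {"": []}
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]  # e.g. an extra name
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return sections


def read_requirements(project_dir):
    """Run ``setup.py egg_info`` in project_dir and parse requires.txt.

    Warning: this executes the project's setup.py, so only run it on
    code you have inspected or inside a throwaway environment.
    """
    subprocess.check_call(
        [sys.executable, "setup.py", "egg_info"], cwd=project_dir
    )
    pattern = os.path.join(project_dir, "*.egg-info", "requires.txt")
    matches = glob.glob(pattern)
    if not matches:
        return {"": []}  # no requires.txt: nothing was declared
    with open(matches[0]) as handle:
        return parse_requires_txt(handle.read())
```

Only parse_requires_txt is safe to call on untrusted input; read_requirements has
all of the caveats above, since whatever it reports depends on the interpreter
and platform that executed setup.py.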

> 
>>> No, that's for arch independent *things*. Like for example, javascript.
>>> In Debian, these are going in /usr/share/javascript. Python code used to
>>> live within /usr/share/pyshared too (but we stopped the symlink forest
>>> during the Jessie cycle).
>> 
>> Why does the FHS webpage say differently?
>> 
>> From [1]:
>> 
>>    The /usr/share hierarchy is for all read-only architecture independent data files.
> 
> Which is exactly what I wrote. Oh, maybe it's the "data files" that
> bothers you? Well, in some ways, javascript can be considered as data
> files. But let's take another example. PHP, java and perl library files
> are all stored into /usr/share as well (though surprisingly, ruby is in
> /usr/lib... but maybe because it also integrates compiled-in .so files).

Yea, it’s the “data files” part (which is why I added the * * around it in my original message).
Maybe the FHS uses confusing terminology here, but I wouldn’t classify software that is designed
to be executed on the server as “data”, and I suspect the NPM maintainers feel the same way.

> 
>>>> I believe it also states that
>>>> /usr/lib is for object files, libraries, and internal binaries.
>>> 
>>> It's for arch dependent things.
>> 
>> Why does the FHS webpage say differently?
>> 
>> From [2]:
>> 
>>    /usr/lib includes object files, libraries, and internal binaries that are not
>>    intended to be executed directly by users or shell scripts.
> 
> That's nothing that goes against what I wrote. "object files, libraries,
> and internal binaries" are all arch-dependent things if you know how to
> read between the lines, especially if you know that /usr/share is for
> "architecture independent" stuff.
> 
> Thomas
> 
> 
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

---
Donald Stufft
PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA



