[openstack-dev] battling stale .pyc files

Lucas Alvares Gomes lucasagomes at gmail.com
Mon Sep 15 11:34:04 UTC 2014


Hi Mike,

Thanks for bringing it up. I should say up front that I'm not a CPython
expert, but I personally like the fix: I've hit stale .pyc files in
Ironic before, and they are pretty annoying.

On Fri, Sep 12, 2014 at 4:18 PM, Mike Bayer <mbayer at redhat.com> wrote:
> I’ve just found https://bugs.launchpad.net/nova/+bug/1368661, “Unit tests sometimes fail because of stale pyc files”.
>
> The issue as stated in the report refers to the phenomenon of .pyc files that remain inappropriately, when switching branches or deleting files.
>
> Specifically, the kind of scenario that in my experience causes this looks like this.  One version of the code has a setup like this:
>
>    mylibrary/mypackage/somemodule/__init__.py
>
> Then, a different version we switch to changes it to this:
>
>    mylibrary/mypackage/somemodule.py
>
> But somemodule/__init__.pyc will still be sitting around, and then things break - the Python interpreter skips the module (or perhaps the other way around. I just ran a test by hand and it seems like packages trump modules in Python 2.7).
>
> This is an issue for sure, however the fix that is proposed I find alarming, which is to use the PYTHONDONTWRITEBYTECODE=1 flag written directly into the tox.ini file to disable *all* .pyc file writing, for all environments unconditionally, both human and automated.
>
> I think that approach is a mistake. .pyc files have a definite effect on the behavior of the interpreter. They can, for example, be the factor that causes a dictionary to order its elements in one way versus another; I’ve had many relying-on-dictionary-ordering issues (which, make no mistake, are bugs) smoked out by the fact that a .pyc file would reveal the issue. .pyc files also naturally have a profound effect on performance. I’d hate for the OpenStack community to just forget that .pyc files ever existed, our tox.ini files safely protecting us from them, and then we start seeing profiling results getting published that forgot to run the Python interpreter in its normal state of operation. If we put this flag into every tox.ini, it means the totality of OpenStack testing will not only run more slowly, it also means our code will never be run within the Python runtime environment that will actually be used when code is shipped. The Python interpreter is incredibly stable and predictable and a small change like this is hardly something that we’d usually notice…until something worth noticing actually goes wrong, and automated testing is where that should be found, not after shipment.
>

As for the ordering issue, I don't think it's caused by
PYTHONDONTWRITEBYTECODE. I searched but couldn't find anything
relating this option to the way Python hashes things (please point me
to a document or code if I'm wrong). Are you sure you're not confusing
it with the PYTHONHASHSEED option?

PYTHONHASHSEED, on the other hand, does affect the ordering of dict
keys[1][2]. And I think you'll find this more alarming: in tox.ini we
are already disabling the random hash seed[3] (but note that there's a
comment there, and disabling it seems to be a temporary thing).
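
A quick way to see the PYTHONHASHSEED effect for yourself is the
sketch below. It is only an illustration, not something taken from
nova: the helper name str_hash_with_seed is made up, and it assumes
the interpreter honours an integer PYTHONHASHSEED as described in [1].
It starts a child interpreter with a given seed and reports hash() of
a string, which is the value dict and set ordering is derived from.

```python
import os
import subprocess
import sys


def str_hash_with_seed(seed, text="openstack"):
    """Return hash(text) as computed by a child interpreter.

    The child is started with PYTHONHASHSEED set to `seed`, so the
    result is reproducible for a given seed and interpreter build.
    (Hypothetical helper, for illustration only.)
    """
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    out = subprocess.check_output(
        [sys.executable, "-c", "print(hash(%r))" % text], env=env)
    return int(out.decode().strip())


if __name__ == "__main__":
    for seed in (1, 2):
        print("seed", seed, "hash", str_hash_with_seed(seed))
```

The same seed should always give the same hash, and different seeds
should give different ones. On Python 3.7+ dicts keep insertion order,
so there the effect shows up in sets rather than in dicts.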

About the performance, this also doesn't seem to be true. I don't
think .pyc files affect how fast the code runs at all; they are not
meant to be a runtime optimization in Python. Not having them DOES
affect the startup time of the application though, because the
bytecode has to be regenerated on every run, see [4]:

"A program doesn't run any faster when it is read from a ‘.pyc’ or
‘.pyo’ file than when it is read from a ‘.py’ file; the only thing
that's faster about ‘.pyc’ or ‘.pyo’ files is the speed with which
they are loaded."
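
In other words, the flag only decides whether the bytecode cache gets
written to disk. Here is a small sketch to check that, under a few
assumptions: the function name pyc_written and the module name
somemodule are invented for the example, and it relies on the child
interpreter being able to write into a temporary directory.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def pyc_written(dont_write_bytecode):
    """Import a throwaway module in a child interpreter and report
    whether any .pyc file was left behind (hypothetical helper)."""
    workdir = tempfile.mkdtemp()
    try:
        with open(os.path.join(workdir, "somemodule.py"), "w") as f:
            f.write("VALUE = 1\n")

        env = dict(os.environ, PYTHONPATH=workdir)
        # Start from a clean slate so the caller's settings don't leak in.
        env.pop("PYTHONDONTWRITEBYTECODE", None)
        env.pop("PYTHONPYCACHEPREFIX", None)
        if dont_write_bytecode:
            env["PYTHONDONTWRITEBYTECODE"] = "1"

        subprocess.check_call(
            [sys.executable, "-c", "import somemodule"], env=env)

        # Covers both somemodule.pyc and __pycache__/somemodule.*.pyc.
        for _root, _dirs, files in os.walk(workdir):
            if any(name.endswith(".pyc") for name in files):
                return True
        return False
    finally:
        shutil.rmtree(workdir)


if __name__ == "__main__":
    print("default:", pyc_written(False))
    print("PYTHONDONTWRITEBYTECODE=1:", pyc_written(True))
```

The module is imported and runs the same way in both cases; only the
cached file on disk differs.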

[1] https://docs.python.org/2/using/cmdline.html#envvar-PYTHONHASHSEED
[2] https://docs.python.org/2/using/cmdline.html#cmdoption-R
[3] https://github.com/openstack/nova/blob/master/tox.ini#L12
[4] http://www.network-theory.co.uk/docs/pytut/CompiledPythonfiles.html

> The issue of the occasional unmatched .pyc file whose name happens to still be imported by the application is not that frequent, and can be solved by just making sure unmatched .pyc files are deleted ahead of time. I’d favor a utility, for example in oslo.utils, which performs this simple step of finding all unmatched .pyc files and deleting them (taking care to be aware of __pycache__ / PEP 3147), and which can be invoked from tox.ini as a startup command.
>
> But guess what - suppose you totally disagree and you really want to not have any .pyc files in your dev environment.   Simple!  Put PYTHONDONTWRITEBYTECODE=1 into *your* environment - it doesn’t need to be in tox.ini, just stick it in your .profile.   Let’s put it up on the wikis, let’s put it into the dev guides, let’s go nuts.   Banish .pyc files from your machine all you like.   But let’s *not* do this on our automated test environments, and not force it to happen in *my* environment.
>

So, although I like the proposed fix and would +1 the idea, I'm also
not very concerned if most people don't want it, because as you just
said we can easily fix it locally. I didn't set it in my .local, but
what I do nowadays is keep a small bash function in my .bashrc that
deletes the .pyc files under the current directory:

function delpyc () {
   # Delete compiled bytecode files (files only) under the current directory.
   find . -type f -name "*.pyc" -exec rm -f {} +
}

So I just invoke it when needed :)
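
For comparison, the "delete only unmatched .pyc files" idea you
describe could look roughly like the sketch below. This is not an
existing oslo.utils function; the names source_for_pyc and
remove_orphaned_pyc are made up, and skipping .tox and .git is my own
assumption about which directories should be left alone.

```python
import os

# Directories we assume should never be touched (assumption, adjust as needed).
SKIP_DIRS = (".tox", ".git")


def source_for_pyc(pyc_path):
    """Return the path of the .py file a .pyc file was compiled from."""
    directory, name = os.path.split(pyc_path)
    if os.path.basename(directory) == "__pycache__":
        # PEP 3147 layout: pkg/__pycache__/mod.cpython-34.pyc -> pkg/mod.py
        module = name.split(".", 1)[0]
        return os.path.join(os.path.dirname(directory), module + ".py")
    # Legacy layout: pkg/mod.pyc -> pkg/mod.py
    return pyc_path[:-1]


def remove_orphaned_pyc(top):
    """Delete .pyc files under `top` whose source file no longer exists.

    Returns the list of paths that were removed.
    """
    removed = []
    for root, dirs, files in os.walk(top):
        # Prune skipped directories in place so os.walk does not descend.
        dirs[:] = [d for d in dirs if d not in SKIP_DIRS]
        for name in files:
            if not name.endswith(".pyc"):
                continue
            pyc_path = os.path.join(root, name)
            if not os.path.exists(source_for_pyc(pyc_path)):
                os.remove(pyc_path)
                removed.append(pyc_path)
    return removed


if __name__ == "__main__":
    for path in remove_orphaned_pyc("."):
        print("removed", path)
```

Unlike delpyc, this keeps the .pyc files that still have a matching
.py next to them, so the interpreter keeps running in its normal
state.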

> I also want to note that the issue of stale .pyc files should only apply to within the library subject to testing as it lives in its source directory. This has nothing to do with the packages that are installed under .tox, as those are full packages; unless there’s some use case I’m not aware of (possible), we don’t check out code into .tox, nor do we manipulate files there as a matter of course.
>
> Just my 2.5c on this issue as to the approach I think is best. Leave the Python interpreter’s behavior as “normal” as possible in our default test environment.

My 0.5c :)

Cheers,
Lucas


