[openstack-dev] testr help

Doug Hellmann doug.hellmann at dreamhost.com
Tue Mar 11 16:46:37 UTC 2014


On Mon, Mar 10, 2014 at 7:20 PM, Zane Bitter <zbitter at redhat.com> wrote:

> On 10/03/14 16:04, Clark Boylan wrote:
>
>> On Mon, Mar 10, 2014 at 11:31 AM, Zane Bitter <zbitter at redhat.com> wrote:
>>
>>> Thanks Clark for this great write-up. However, I think the solution to
>>> the
>>> problem in question is richer commands and better output formatting, not
>>> discarding information.
>>>
>>>
>>> On 07/03/14 16:30, Clark Boylan wrote:
>>>
>>>>
>>>> But running tests in parallel introduces some fun problems. Like where
>>>> do you send logging and stdout output. If you send it to the console
>>>> it will be interleaved and essentially useless. The solution to this
>>>> problem (for which I am probably to blame) is to have each test
>>>> collect the logging, stdout, and stderr associated with that test and
>>>> attach it to that test's subunit reporting. This way you get all of the
>>>> output associated with a single test attached to that test and don't
>>>> have crazy interleaving that needs to be demuxed. The capturing of
>>>>
>>>
>>>
>>> This is not really a problem unique to parallel test runners. Printing to
>>> the console is just not a great way to handle stdout/stderr in general
>>> because it messes up the output of the test runner, and nose does exactly
>>> the same thing as testr in collecting them - except that nose combines
>>> messages from the 3 sources and prints the output for human consumption,
>>> rather than in separate groups surrounded by lots of {{{random braces}}}.
>>>
>> Except nose can make them all the same file descriptor and let
>> everything multiplex together. Nose isn't demuxing arbitrary numbers
>> of file descriptors from arbitrary numbers of processes.
>>
>
> Can't each subunit process do the same thing?
>
> As a user, here's how I want it to work:
>  - Each subunit process works like nose - multiplexing the various streams
> of output together and associating it with a particular test - except that
> nothing is written to the console but instead returned to testr in subunit
> format.
>  - testr reads the subunit data and saves it to the test repository.
>  - testr prints a report to the console based on the data it just
> received/saved (see the sketch just below this list).
>
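> For what it's worth, that last part of the pipeline already exists as
> plumbing you can drive by hand (command names from testrepository and
> python-subunit; the test id is a placeholder and options may differ
> between versions):
>
>     # once per checkout: create the .testrepository/ store
>     testr init
>     # the runner emits subunit on stdout; testr stores it and reports on it
>     python -m subunit.run yourpackage.tests.test_something | testr load
>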
> How it actually seems to work:
>  - A magic pixie creates a TestCase class with a magic incantation to
> capture your stdout/stderr/logging without breaking other test runners (a
> sketch of that incantation follows below this list).
>  - Or they don't! You're hosed. The magic incantation is undocumented.
>  - You change all of your TestCases to inherit from the class with the
> magic pixie dust.
>  - Each subunit process associates the various streams of output (if you
> set it up to) with a particular test, but keeps them separate so that if
> you want to figure out the order of events you have to direct them all to
> the same channel - which, in practice, means you can only use logging
> (since some of the events you are interested in probably already exist in
> the code as logs).
>  - when you want to debug a test, you have to do all the tedious logging
> setup if it doesn't already exist in the file. It probably won't, because
> flake8 would have made you delete it unless it's being used already.
>  - testr reads the subunit data and saves it to the test repository.
>  - testr prints a report to the console based on the data it just
> received/saved, though parts of it look like a raw data dump.
>
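> To spell the incantation out, it boils down to something like the
> following sketch (modelled on the OpenStack base test classes; the
> OS_*_CAPTURE variable names are the convention the .testr.conf files
> use):
>
>     import os
>
>     import fixtures
>     import testtools
>
>     _TRUE = ('true', '1', 'yes')
>
>
>     class BaseTestCase(testtools.TestCase):
>
>         def setUp(self):
>             super(BaseTestCase, self).setUp()
>             # Capture only when the runner exports the toggles, so the
>             # tests behave normally under nose or plain unittest.
>             if os.environ.get('OS_STDOUT_CAPTURE', '').lower() in _TRUE:
>                 out = self.useFixture(fixtures.StringStream('stdout')).stream
>                 self.useFixture(fixtures.MonkeyPatch('sys.stdout', out))
>             if os.environ.get('OS_STDERR_CAPTURE', '').lower() in _TRUE:
>                 err = self.useFixture(fixtures.StringStream('stderr')).stream
>                 self.useFixture(fixtures.MonkeyPatch('sys.stderr', err))
>             if os.environ.get('OS_LOG_CAPTURE', '').lower() in _TRUE:
>                 # FakeLogger attaches the captured log output as a detail
>                 # on the test, which ends up in the subunit stream.
>                 self.useFixture(fixtures.FakeLogger())
>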
> While there may be practical reasons why it currently works like the
> latter, I would submit that there is no technical reason it could not work
> like the former. In particular, there is nothing about the concept of
> running the tests in parallel that would prevent it, just as there is
> nothing about what nose does that would prevent two copies of nose from
> running at the same time on different sets of tests.
>
>
>>>> this data is toggleable in the test suite using environment variables
>>>> and is off by default so that when you are not using testr you don't
>>>> get this behavior [0]. However we seem to have neglected log capture
>>>> toggles.
>>>>
>>>
>>>
>>> Oh wow, there is actually a way to get the stdout and stderr? Fantastic!
>>> Why
>>> on earth are these disabled?
>>>
>> See above, testr has to deal with multiple writers to stdout and
>> stderr, you really don't want them all going to the same place when
>> using testr (which is why stdout and stderr are captured when running
>> testr but not otherwise).
>>
>
> Ah, OK, I think I understand now. testr passes the environment variables
> automatically, so you only have to know the magic incantation at the time
> you're writing the test, not when you're running it.
>
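> Concretely, the "passing" happens in each project's .testr.conf, which
> looks roughly like this (exact variables and defaults vary from project
> to project):
>
>     [DEFAULT]
>     test_command=OS_STDOUT_CAPTURE=${OS_STDOUT_CAPTURE:-1} \
>                  OS_STDERR_CAPTURE=${OS_STDERR_CAPTURE:-1} \
>                  OS_LOG_CAPTURE=${OS_LOG_CAPTURE:-1} \
>                  ${PYTHON:-python} -m subunit.run discover -t ./ . $LISTOPT $IDOPTION
>     test_id_option=--load-list $IDFILE
>     test_list_option=--list
>
> The capture variables default to "on" there, but stay unset when the
> tests are run any other way.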
>
>>> Please, please, please don't turn off the logging too. That's the only
>>> tool
>>> left for debugging now that stdout goes into a black hole.
>>>
>> Logging goes into the same "black hole" today, I am suggesting that we
>> make this toggleable like we have made stdout and stderr capturing
>> toggleable. FWIW this isn't a black hole it is all captured on disk
>> and you can refer back to it at any time (the UI around doing this
>> could definitely be better though).
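>>
>> For example, to pull the stored data back out (exact options vary by
>> version):
>>
>>     # replay the subunit stream saved for the most recent run
>>     testr last --subunit | subunit2pyunit
>>     # list just the test ids recorded as failing
>>     testr failing --list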
>>
>
> Hmm, now that you mention it, I remember Clint did the setup work in Heat
> to get the logging working. So maybe we have to do the same for stdout and
> stderr. At the moment they really do go into a black hole - we can't see
> them in the testr output at all, nor are they stored in the repository
> on disk.
>

If you're going to do this in your project, would you contribute the patch
to oslo.test as well (or instead)?

http://git.openstack.org/cgit/openstack/oslo.test

Doug


>
> It just seems bizarre to me that the _tests_ have to figure out what test
> runner they are being run by and redirect their output to the correct
> location to oblige it. Surely the point of a test runner is to do the Right
> Thing(TM) for any test, without the test having to know anything about the
> runner.
>
>
>>> BTW, since I'm on the subject, testr would be a lot more
>>> confidence-inspiring if running `testr failing` immediately after running
>>> `testr` reported the same number of failures, or indeed if running
>>> `testr`
>>> twice in a row was guaranteed to report the same number of failures
>>> (assuming that the tests themselves are deterministic). I can't imagine
>>> why
>>> one of the subunit processes reporting a test failure is considered a
>>> failure in itself (that's what it's supposed to do!), particularly when
>>> `testr failing` manages to filter them out OK.
>>>
>> IIRC the test runner exiting non-zero is a failure in itself to handle
>> cases where test runners die a horrible death without outputting any
>> failed subunit data. It could probably be cleaner but this way it is
>> preserved in the subunit log.
>>
>
> It makes sense for the case where the test runner has died without
> reporting data, but why should it be reported as a separate failure when it
> has reported data that testr has regarded as valid enough to use and
> display?
>
> My machine has 4 cores with hyperthreads (a common configuration, I would
> imagine), so testr runs 8 worker processes and each worker with at least
> one failing test adds an extra "failure" of its own. That means I have to
> do this mental translation every time:
>
> 1    failure  -> everything failed
> 2    failures -> 1 failure
> 3    failures -> 2 failures
> 4    failures -> 2-3 failures
> 5    failures -> 3-4 failures
> 6    failures -> 3-5 failures
> 7    failures -> 4-6 failures
> 8    failures -> 4-7 failures
> 9    failures -> 5-8 failures
> 10   failures -> 5-9 failures
> 11   failures -> 6-10 failures
> 12   failures -> 6-11 failures
> 13   failures -> 7-12 failures
> 14   failures -> 7-13 failures
> 15   failures -> 8-14 failures
> 16   failures -> 8-15 failures
> 17   failures -> 9-16 failures
> 18   failures -> 10-17 failures
> n>18 failures -> (n-8)-(n-1) failures
>
> This means that the change statistic that testr prints is completely
> useless for determining what I really needed to know, which is whether the
> thing I just modified fixed any tests or not (or, alternatively, whether I
> have an unstable test). For low-ish numbers of test failures (say <15),
> it's dominated by random noise. Call me old-fashioned, but random noise is
> not the distribution I'm looking for in reports from my test runner :p
>
>
>>> (I understand that some of the things mentioned above may already have
>>> been
>>> improved since the latest 0.0.18 release. I can't actually find the repo
>>> at
>>> the moment to check, because it's not linked from PyPI or Launchpad and
>>> 'testrepository' turns out to be an extremely common name for things on
>>> the
>>> Internet.)
>>>
>> https://launchpad.net/testrepository is linked from pypi....
>>
>
> Ah, thanks. I looked at that page, but in my head I thought I had once
> seen a Git repo for it and I couldn't see a link from there... I never
> guessed that the upstream was in Launchpad's bzr.
>
> cheers,
> Zane.
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>