[openstack-dev] Nova + testr?
Robert Collins
robertc at robertcollins.net
Sun Feb 3 01:37:55 UTC 2013
On 2 February 2013 18:10, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
> Howdy all,
>
> Just was running the nova unit tests and getting used to the testr running
> and was wondering if there is any way to see exactly what tests it's running
> (since the nova tests take a long long time to complete).
>
> Is this a feature that is hidden, or something that others have been using
> to do this.
Hi, so let me explain a little about what is happening, and suggest my
preferred way to fix it.
Firstly, as Monty says, testr is running the actual test executors -
they are the 'python -m subunit.run' processes you see started.
subunit is a streaming protocol that only identifies one test at a
time, but to run multi-core, each worker gets its own stream. (I touch
on this as something we might improve in subunit in my testrepository
LCA talk (slides at tinyurl.com/testr-lca-2013 , video soon)). As a
result, it needs multiplexing before it can be handed off to either
store on disk (in the .testrepository/$runid file) -or- handed out
over a single subunit stream (which is what the --subunit option that
Monty suggests will do). The problem is that this will only show you
multiplexed tests - that is, tests that have actually finished
executing.
Exacerbating that, the various stages may well be buffering (because
forcing actual writes when they aren't needed is poor for network
utilisation etc).
So --subunit will *at best* [even if we fix all layers to permit
no-buffering] only ever show you 'the most recently completed test'.
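
To make that concrete, here is a minimal sketch of the multiplexing
behaviour in plain Python. It does not use the subunit library; the
(worker, kind, test) event tuples and the multiplex() function are
made up for illustration and are not testr's actual code. It only
shows why a test that has started but not finished cannot appear in
the single output stream.

```python
def multiplex(events):
    """Merge per-worker events into one serial list of finished tests.

    `events` is an iterable of (worker_id, kind, test_id) tuples in
    arrival order; kind is "start" or an outcome such as "success".
    (This tuple format is an assumption for the sketch.)
    """
    in_progress = {}  # worker_id -> test_id currently running
    output = []       # what a single downstream stream would carry
    for worker_id, kind, test_id in events:
        if kind == "start":
            # Held back: the single stream can describe one test at a time.
            in_progress[worker_id] = test_id
        else:
            in_progress.pop(worker_id, None)
            output.append((test_id, kind))
    return output, in_progress


events = [
    (0, "start", "test_a"),    # worker 0 starts test_a and never finishes
    (1, "start", "test_b"),
    (1, "success", "test_b"),
    (1, "start", "test_c"),
    (1, "success", "test_c"),
]
completed, still_running = multiplex(events)
```

Here completed only ever mentions test_b and test_c; test_a is known
to the multiplexer but invisible to anything reading its output.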
Now, there are several different use cases conflated here into the one CLI UI.
Firstly, there is 'is it hung' : this is actually a per-worker
question, and we could, for instance, print out the name of a hung
test (assuming the test process didn't buffer the test start :)) when
no input has been received for > <some sensible number of seconds -
5/10 whatever>. This has to happen within testr, as the output stream
is multiplexed - see above. Unless we do another protocol revision to
support concurrent in-progress tests on different timelines, but this
is complex and will need some care. I'd like to do that, I think, but
it isn't amenable to a quick patch - if someone wants to take a
subunit protocol rev, expect some deep thinking to take place, with
due care for compat, C libraries etc.
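
A rough sketch of the per-worker 'is it hung' check, assuming testr
can see each worker's events before multiplexing. The HangWatch class,
its method names and the event format are hypothetical, and the clock
is injectable only so the example can be run without waiting.

```python
import time


class HangWatch:
    """Hypothetical per-worker watchdog; not part of testr or subunit."""

    def __init__(self, timeout=10.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.current = {}    # worker_id -> test_id in progress
        self.last_seen = {}  # worker_id -> time of last input from worker

    def on_event(self, worker_id, kind, test_id):
        """Record any input from a worker ("start" or an outcome)."""
        self.last_seen[worker_id] = self.clock()
        if kind == "start":
            self.current[worker_id] = test_id
        else:
            self.current.pop(worker_id, None)

    def suspects(self):
        """Return (worker_id, test_id) pairs silent for longer than timeout."""
        now = self.clock()
        return sorted(
            (worker_id, test_id)
            for worker_id, test_id in self.current.items()
            if now - self.last_seen[worker_id] > self.timeout
        )


fake_now = [0.0]  # fake clock so the example is deterministic
watch = HangWatch(timeout=10.0, clock=lambda: fake_now[0])
watch.on_event(0, "start", "test_a")
watch.on_event(1, "start", "test_b")
fake_now[0] = 8.0
watch.on_event(1, "success", "test_b")
fake_now[0] = 12.0
hung = watch.suspects()
```

This only works if the test process does not buffer the test start,
as noted above.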
Secondly there is 'how far through is it' - we have a progress bar
abstraction in subunit for this precise use case, but it's really a bit
of a yagni in the sense that it's not widely used, and may not handle
all corner cases correctly; I'd happily accept patches to leverage it
in testr and do a progress bar - either per backend [though this does
not scale] or for the overall run. This could easily be written as a
subunit filter to consume the multiplexed output stream. In fact, I
suggest it be done as three patches: firstly, a small standalone unittest
TestResult object that does progress bars, supports the
subunit/testtools extended TestResult API and can be used from
testtools, testrepository and other programs. Secondly, a patch to
testrepository to import and use that result object, so that users can
use it trivially. Lastly a patch to subunit to give a command line
filter that exposes it, for folk not using testrepository.
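
As a starting point for the first of those patches, a minimal sketch
of a progress-bar result built on the standard library's
unittest.TestResult. The ProgressResult name is made up, the total
test count is assumed to be known up front, and the subunit/testtools
extended TestResult API is deliberately left out, so treat this as an
outline rather than the finished object.

```python
import io
import unittest


class ProgressResult(unittest.TestResult):
    """Hypothetical result object that redraws a bar after each test."""

    def __init__(self, total, stream, width=20):
        super().__init__()
        self.total = total   # assumed known before the run starts
        self.out = stream
        self.width = width

    def stopTest(self, test):
        super().stopTest(test)
        # testsRun is incremented by TestResult.startTest.
        if self.total:
            filled = self.width * self.testsRun // self.total
        else:
            filled = self.width
        bar = "#" * filled + "-" * (self.width - filled)
        self.out.write("\r[%s] %d/%d" % (bar, self.testsRun, self.total))
        self.out.flush()


class Demo(unittest.TestCase):
    def test_one(self):
        pass

    def test_two(self):
        pass


suite = unittest.defaultTestLoader.loadTestsFromTestCase(Demo)
buf = io.StringIO()  # stand-in for a terminal
result = ProgressResult(suite.countTestCases(), buf)
suite.run(result)
```

Because it is just a TestResult, the same object could in principle be
handed to anything that drives a unittest-style run.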
Many folk want to know 'has a test hung', though.
Back to use cases - thirdly dropping into pdb is a core use case -
currently you need to run testtools.run instead of subunit.run, but
subunit was designed to support dropping into pdb; the problem is that
with 5 or 10 or more backends, two may drop into pdb at once and *they
both share stdin*, so it's a mug's game to figure out what will happen.
This needs testr to arbitrate stdin between backends, and a minor
tweak to the subunit parser to drop into character by character mode
when a debugger starts.
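
One possible shape for the stdin arbitration, sketched in plain
Python. The StdinArbiter class and its methods are invented for
illustration; how a backend would signal that its debugger has started
or exited is not shown and would depend on the parser tweak described
above.

```python
from collections import deque


class StdinArbiter:
    """Hypothetical helper: one backend at a time owns the shared stdin."""

    def __init__(self):
        self.owner = None       # backend currently attached to stdin
        self.waiting = deque()  # backends whose debugger is waiting its turn

    def request(self, backend_id):
        """A backend's debugger started; return True if it owns stdin now."""
        if self.owner is None:
            self.owner = backend_id
        elif backend_id != self.owner and backend_id not in self.waiting:
            self.waiting.append(backend_id)
        return self.owner == backend_id

    def release(self, backend_id):
        """The owner's debugger exited; hand stdin to the next in line."""
        if backend_id == self.owner:
            self.owner = self.waiting.popleft() if self.waiting else None
        return self.owner

    def route(self, data):
        """Return (backend_id, data) for typed input, or None if unowned."""
        if self.owner is None:
            return None
        return (self.owner, data)


arbiter = StdinArbiter()
first = arbiter.request(3)    # backend 3 hits a breakpoint first
second = arbiter.request(7)   # backend 7 hits one while 3 is debugging
routed = arbiter.route("next\n")
next_owner = arbiter.release(3)
```

First come, first served is the simplest policy; anything smarter
would need the user to be told which backend they are talking to.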
Lastly, sometimes it's nice to see what order things ran in to
determine conflicts and so on - that's well served by post-mortem
queries against the testr database today, and given the realities of
testing on 4/8/16 or more core machines, human mk1 eyeball is no
longer up to it.
HTH,
Rob
--
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Cloud Services