[openstack-qa] Speeding up Tempest -- testr run --parallel questions
Robert Collins
robertc at robertcollins.net
Mon Jan 7 04:32:38 UTC 2013
On 7 January 2013 15:41, Jay Pipes <jaypipes at gmail.com> wrote:
> Hi Robert, Ivan, QAers,
>
> Ivan Zhu put together what seems like a trivial patch that he explains
> will get around a potential issue when running Tempest via testr run
> --parallel:
>
> https://review.openstack.org/#/c/19033/
>
> Currently, when running with allow_tenant_isolation=True (the default),
> every test case that inherits from
> tempest.tests.compute.base.BaseComputeTest gets an isolated tenant and
> user called <ClassName>-tenant and <ClassName>-user.
>
> So, for example, the ServersTestJSON test in
> tempest/tests/compute/servers/test_create_server.py will get a tenant
> called ServersTestJSON-tenant and a user called ServersTestJSON-user. At
> the end of the test, this tenant and user will be destroyed in the
> tearDownClass method.
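>
> For context, the naming logic in base.py amounts to roughly the
> following paraphrased sketch (not the exact code):
>
>     import unittest
>
>     class BaseComputeTest(unittest.TestCase):
>
>         @classmethod
>         def setUpClass(cls):
>             # credential names are derived from the class name, so
>             # every process running this class derives the *same* names
>             tenant_name = '%s-tenant' % cls.__name__
>             username = '%s-user' % cls.__name__
>             # (the admin identity client creates this tenant/user here,
>             # and tearDownClass deletes them again)
>             cls.isolated_creds = (tenant_name, username)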
>
> Ivan's patch attempts to solve the following problem: running Tempest
> with testr run --parallel spreads a test class's methods across
> multiple processes, which results in the setUpClass and tearDownClass
> methods being run once in each process. When process 1 executes
> tearDownClass and deletes the tenant for that test case class while
> process 2 is still running a test that depends on that tenant/user
> existing, you will run into errors in process 2 like this:
>
> "AuthenticationFailure: Authentication with user
> ServersTestJSON-user and password pass failed"
>
> Ivan's patch ostensibly fixes this issue by randomizing the tenant and
> user names, ensuring that tenants created for the same test in different
> processes won't be the same.
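>
> Concretely, the randomisation amounts to something like this
> simplified sketch (rand_name stands in for whatever helper the patch
> actually uses):
>
>     import random
>
>     def rand_name(prefix):
>         # append a random suffix so two processes building credentials
>         # for the same class get distinct names
>         return '%s-%d' % (prefix, random.randint(1, 999999))
>
>     # inside setUpClass, instead of the fixed '<ClassName>-tenant':
>     tenant_name = rand_name('ServersTestJSON-tenant')
>     username = rand_name('ServersTestJSON-user')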
>
> This would indeed solve the problem of tenant/user deletion race
> conditions. BUT, and this is a very big BUT... the deeper problem is
> that testr should NOT be creating N copies of a test case class, one
> for each process it parallelizes to. The reason is that for the vast
> majority of Tempest test cases, the setUpClass and tearDownClass
> methods are extremely expensive and often account for 90% or more of
> the total time taken to run the test.
>
> So, taking ListServerFiltersTestJSON as an example, the test's
> setUpClass method creates 3 server instances. This can take 45
> seconds or more to complete, whereas the test methods themselves (a
> dozen or so of them) finish in under 1 second total, since all they
> do is issue simple list-server calls. If testr "parallelizes" this
> particular test case by spreading it across 4 processes, the total
> amount of work has been multiplied by 4 with no reduction in runtime,
> since the setUpClass method is what takes so darn long to complete.
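>
> To make the cost skew concrete, the shape of the class is roughly
> this (a simplified illustration, not the real code):
>
>     class ListServerFiltersTestJSON(base.BaseComputeTest):
>
>         @classmethod
>         def setUpClass(cls):
>             super(ListServerFiltersTestJSON, cls).setUpClass()
>             # ~45 seconds: boot three servers and wait for ACTIVE
>             cls.s1 = cls.create_server()
>             cls.s2 = cls.create_server()
>             cls.s3 = cls.create_server()
>
>         def test_list_servers_filter_by_name(self):
>             # milliseconds: one GET /servers call with a name filter
>             self.client.list_servers({'name': self.s1['name']})
>
> Duplicate that class into 4 processes and you pay the 45-second
> setUpClass 4 times over.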
>
> The real solution to this parallelization problem must be to notify
> testr that the cost of a test case is in the setUpClass call, so that it
> does not attempt to instantiate a single test case class in multiple
> processes.
>
> In other words, it should divide the total work among the processes
> by treating the test case class, not the individual test method, as
> the indivisible unit of work.
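>
> A partition policy along these lines (purely illustrative, this is
> not testr code) would bucket test ids by class before assigning them
> to processes:
>
>     from collections import defaultdict
>
>     def partition_by_class(test_ids, n_procs):
>         # group "pkg.module.Class.test_method" ids by their class
>         groups = defaultdict(list)
>         for test_id in test_ids:
>             groups[test_id.rsplit('.', 1)[0]].append(test_id)
>         # deal out whole classes, never splitting one across processes
>         partitions = [[] for _ in range(n_procs)]
>         for i, group in enumerate(groups.values()):
>             partitions[i % n_procs].extend(group)
>         return partitions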
>
> Robert, I'm looking for some guidance from you on how to get testr to
> view the test case as the atomic unit of work for Tempest.
>
> Thanks in advance, and sorry for the long email!
> -jay

Ok, so the direct answer is: testr needs a pluggable partition policy,
which it doesn't have today - feel free to file a bug.

However! Firstly - you should most definitely randomise that data. It
will, amongst other things, make it possible to run tempest twice on a
single live cluster, which could be quite useful.

Secondly, experience from other projects in similar situations has
been that as the number of cores grows, such manual assignment becomes
more and more of a liability. How many such fixtures do you have in
total? Can they be shared across larger contexts than a class (e.g. as
scenarios or resources) - where you'd get even greater amortisation?
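
To illustrate the resources idea, here is a sketch using the
testresources library (API from memory, so treat it as illustrative;
boot_three_servers and delete_servers are hypothetical helpers):

    import testresources

    class ServerFarm(testresources.TestResource):
        def make(self, dependency_resources):
            return boot_three_servers()   # hypothetical expensive setup

        def clean(self, resource):
            delete_servers(resource)      # hypothetical teardown

    class ListServerFiltersTest(testresources.ResourcedTestCase):
        # 'servers' becomes an attribute on the test during setUp
        resources = [('servers', ServerFarm())]

        def test_filter_by_name(self):
            self.assertEqual(3, len(self.servers))

Run under an optimising suite, tests sharing the resource get grouped
so the farm is made and cleaned as few times as possible.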

Lastly, if you really want them treated as an atomic unit, I would
make them one test, not a set of tests: that's the basic model of
xUnit across the board, and you may well run into other friction
elsewhere if they are not separate, concurrently safe, independently
isolated things.
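
For that last option, "one test" would look roughly like this sketch
(inside the existing test class; create_servers and delete_servers are
hypothetical helpers):

    def test_list_server_filters(self):
        servers = self.create_servers(3)        # the expensive part
        self.addCleanup(self.delete_servers, servers)
        # the dozen cheap checks all run here, in one atomic test
        found = self.client.list_servers({'name': servers[0]['name']})
        self.assertIn(servers[0]['id'],
                      [s['id'] for s in found['servers']])
        # ...and so on for the remaining filters
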
HTH, sorry if it's too brief - dashing off an answer before family time :)
-Rob
--
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Cloud Services