[openstack-dev] [all] In defence of faking

Matthew Booth mbooth at redhat.com
Mon Sep 22 15:58:16 UTC 2014


If you missed the inaugural OpenStack Bootstrapping Hour, it's here:
http://youtu.be/jCWtLoSEfmw . I think this is a fantastic idea and big
thanks to Sean, Jay and Dan for doing this. I liked the format, the
informal style and the content. Unfortunately I missed the live event,
but I can confirm that watching it after the event worked just fine
(thanks for reading out live questions for the stream!).

I'd like to make a brief defence of faking, which perhaps predictably in
a talk about mock took a bit of a bashing.

Firstly, when not to fake. As Jay pointed out, faking adds an element of
complexity to a test, so if you can achieve what you need to with a
simple mock then you should. But, as the quote goes, you should "make
things as simple as possible, but not simpler."

Here are some simple situations where I believe fake is the better solution:

* Mock assertions aren't sufficiently expressive on their own

For example, imagine your code is calling:

def complex_set(key, value)

You want to assert that on completion of your unit, the final value
assigned to <key> was <value>. This is difficult to capture with mock
without risking false assertion failures if complex_set sets other keys
which you aren't interested in, or if <key>'s value is set multiple
times, but you're only interested in the last one. A little fake
function which stores the final value assigned to <key> does this simply
and accurately without adding a great deal of complexity. e.g.

mykey = [None]
def fake_complex_set(key, value):
  if key == 'FOO':
    mykey[0] = value

with mock.patch.object(unit, 'complex_set', side_effect=fake_complex_set):
  run_test()
self.assertEquals('expected', mykey[0])

Summary: fake method is a custom mock assertion.

* A simple fake function is simpler than a complex mock dance

For example, you're mocking 2 functions: start_frobincating(key) and
wait_for_frobnication(key). They can potentially be called overlapping
with different keys. The desired mock return value of one is dependent
on arguments passed to the other. This is better mocked with a couple of
little fake functions and some external state, or you risk introducing
artificial constraints on the code under test.

Jay pointed out that faking methods creates more opportunities for
errors. For this reason, in the above cases, you want to keep your fake
function as simple as possible (but no simpler). However, there's a big
one: the fake driver!

This may make less sense outside of driver code, although faking the
image service came up in the talk. Without looking at the detail, that
doesn't necessarily sound awful to me, depending on context. In the
driver, though, the ultimate measure of correctness isn't a Nova call:
it's the effect produced on the state of an external system.

For the VMware driver we have nova.tests.virt.vmwareapi.fake. This is a
lot of code: 1599 lines as of writing. It contains bugs, and it contains
inaccuracies, and both of these can mess up tests. However:

* It's vastly simpler than the system it models (vSphere server)
* It's common code, so gets fixed over time
* It allows tests to run almost all driver code unmodified

So, for example, it knows that you can't move a file before you create
it. It knows that creating a VM creates a bunch of different files, and
where they're created. It knows what objects are created by the server,
and what attributes they have. And what attributes they don't have. If
you do an object lookup, it knows which objects to return, and what
their properties are.

All of this knowledge is vital to testing, and if it wasn't contained in
the fake driver, or something like it[1], would have to be replicated
across all tests which require it. i.e. It may be 1599 lines of
complexity, but it's all complexity which has to live somewhere anyway.

Incidentally, this is fresh in my mind because of
https://review.openstack.org/#/c/122760/ . Note the diff stat: +70,
-161, and the rewrite has better coverage, too :) It executes the
function under test, it checks that it produces the correct outcome, and
other than that it doesn't care how the function is implemented.

TL;DR

* Bootstrap hour is awesome
* Don't fake if you don't have to
* However, there are situations where it's a good choice

Thanks for reading :)

Matt

[1] There are other ways to skin this cat, but ultimately if you aren't
actually spinning up a vSphere server, you're modelling it somehow.
-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490



More information about the OpenStack-dev mailing list