[openstack-dev] [all] In defence of faking
Joshua Harlow
harlowja at outlook.com
Mon Sep 22 20:01:48 UTC 2014
I would have to agree, fakes certainly have their benefits.
IMHO, sometimes when you are testing a complex set of interactions you don't
want to have to mock every one of those interactions when all you really want to
determine is whether the end result worked or not.
For example, let's say I have a workflow:
A -> B -> C
where B is composed of 20 different things (or B could itself be composed of 20 sub-workflows...).
To use mock here you would have to tightly couple your test to whatever those workflows are doing
internally just to make mock work correctly, when in reality all you want to know is whether
A -> B -> C worked as expected, with the expected C being created/adjusted/whatever for the given input to A.
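As a rough sketch of what I mean (all the names here are hypothetical), the test runs the
whole workflow against a simple in-memory fake and asserts only on the end result:

    # Hypothetical: a trivial in-memory fake standing in for whatever
    # backing store the workflow steps read from and write to.
    class FakeStore(dict):
        pass

    def test_workflow_end_result(self):
        store = FakeStore()
        # run_workflow is a hypothetical entry point that executes A -> B -> C.
        run_workflow(store, a_input='A')
        # Assert only on the final outcome, not on B's 20 internal steps.
        self.assertEqual('expected-C', store.get('C'))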
An example that I think is useful is one that I created for faking ZooKeeper:
https://pypi.python.org/pypi/zake*
*mentioned @ http://kazoo.readthedocs.org/en/latest/testing.html#zake
Since it's not typically possible to run ZooKeeper during testing, or to have a hard dependency
on ZooKeeper when running unit tests, the above was created to behave like a real kazoo ZooKeeper
client. It triggers the same set of events that a real ZooKeeper client would trigger, but without
requiring ZooKeeper, and it makes it possible to inject data into that fake kazoo client, delete
data, and test how that affects your code.
This allows users of kazoo to test the interactions of complex systems without trying to mock out
the entire interaction, and without having to set up ZooKeeper to do the same (coupling your tests
to a functioning ZooKeeper deployment).
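For instance, a minimal sketch using zake's FakeClient (which mimics the kazoo client API,
so the method behavior below follows kazoo's):

    from zake import fake_client

    client = fake_client.FakeClient()
    client.start()

    # The fake speaks the kazoo client API, so code that takes a kazoo
    # client can be handed this instead, no ZooKeeper required.
    client.ensure_path("/test")
    client.create("/test/node", b"hello")
    data, znode_stat = client.get("/test/node")
    assert data == b"hello"

    client.stop()
    client.close()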
TL;DR: IMHO testing the interaction and expected outcome is easier with fakes, and doesn't tightly
couple you to an implementation. Mock is great for testing methods, small APIs, simple checks, and
returned/raised results, but I believe the better test is one that tests an interaction in a complex
system for a desired result (without requiring that full system to be set up to test that result),
because in the end that complex system is what users use as a whole.
My 2 cents.
On Sep 22, 2014, at 12:36 PM, Robert Collins <robertc at robertcollins.net> wrote:
> On 23 September 2014 03:58, Matthew Booth <mbooth at redhat.com> wrote:
>> If you missed the inaugural OpenStack Bootstrapping Hour, it's here:
>> http://youtu.be/jCWtLoSEfmw . I think this is a fantastic idea and big
>> thanks to Sean, Jay and Dan for doing this. I liked the format, the
>> informal style and the content. Unfortunately I missed the live event,
>> but I can confirm that watching it after the event worked just fine
>> (thanks for reading out live questions for the stream!).
>>
>> I'd like to make a brief defence of faking, which, perhaps predictably
>> in a talk about mock, took a bit of a bashing.
>>
>> Firstly, when not to fake. As Jay pointed out, faking adds an element of
>> complexity to a test, so if you can achieve what you need to with a
>> simple mock then you should. But, as the quote goes, you should "make
>> things as simple as possible, but not simpler."
>>
>> Here are some simple situations where I believe a fake is the better solution:
>>
>> * Mock assertions aren't sufficiently expressive on their own
>>
>> For example, imagine your code is calling:
>>
>> def complex_set(key, value)
>>
>> You want to assert that on completion of your unit, the final value
>> assigned to <key> was <value>. This is difficult to capture with mock
>> without risking false assertion failures if complex_set sets other keys
>> which you aren't interested in, or if <key>'s value is set multiple
>> times, but you're only interested in the last one. A little fake
>> function which stores the final value assigned to <key> does this simply
>> and accurately without adding a great deal of complexity. e.g.
>>
>> mykey = [None]
>>
>> def fake_complex_set(key, value):
>>     # Only capture assignments to the key we care about; the last
>>     # assignment wins, which is what we want to assert on.
>>     if key == 'FOO':
>>         mykey[0] = value
>>
>> with mock.patch.object(unit, 'complex_set', side_effect=fake_complex_set):
>>     run_test()
>> self.assertEqual('expected', mykey[0])
>>
>> Summary: fake method is a custom mock assertion.
>>
>> * A simple fake function is simpler than a complex mock dance
>>
>> For example, you're mocking 2 functions: start_frobnicating(key) and
>> wait_for_frobnication(key). They can potentially be called overlapping
>> with different keys. The desired mock return value of one is dependent
>> on arguments passed to the other. This is better mocked with a couple of
>> little fake functions and some external state, or you risk introducing
>> artificial constraints on the code under test.
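>> A sketch of that shape (reusing the names above; unit, run_test and the
>> result value are hypothetical):
>>
>> frobnicated = {}
>>
>> def fake_start_frobnicating(key):
>>     # Remember what was started so the wait fake can answer for it.
>>     frobnicated[key] = 'result-for-%s' % key
>>
>> def fake_wait_for_frobnication(key):
>>     # Return the result for *this* key, however the calls interleave.
>>     return frobnicated.pop(key)
>>
>> with mock.patch.object(unit, 'start_frobnicating',
>>                        side_effect=fake_start_frobnicating), \
>>      mock.patch.object(unit, 'wait_for_frobnication',
>>                        side_effect=fake_wait_for_frobnication):
>>     run_test()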
>>
>> Jay pointed out that faking methods creates more opportunities for
>> errors. For this reason, in the above cases, you want to keep your fake
>> function as simple as possible (but no simpler). However, there's a big
>> one: the fake driver!
>>
>> This may make less sense outside of driver code, although faking the
>> image service came up in the talk. Without looking at the detail, that
>> doesn't necessarily sound awful to me, depending on context. In the
>> driver, though, the ultimate measure of correctness isn't a Nova call:
>> it's the effect produced on the state of an external system.
>>
>> For the VMware driver we have nova.tests.virt.vmwareapi.fake. This is a
>> lot of code: 1599 lines as of writing. It contains bugs, and it contains
>> inaccuracies, and both of these can mess up tests. However:
>>
>> * It's vastly simpler than the system it models (vSphere server)
>> * It's common code, so gets fixed over time
>> * It allows tests to run almost all driver code unmodified
>>
>> So, for example, it knows that you can't move a file before you create
>> it. It knows that creating a VM creates a bunch of different files, and
>> where they're created. It knows what objects are created by the server,
>> and what attributes they have. And what attributes they don't have. If
>> you do an object lookup, it knows which objects to return, and what
>> their properties are.
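>> As a toy illustration (this is not the actual vmwareapi fake), a
>> stateful fake can enforce an invariant like "you can't move a file
>> before you create it":
>>
>> class FakeDatastore(object):
>>     def __init__(self):
>>         self._files = set()
>>
>>     def create_file(self, path):
>>         self._files.add(path)
>>
>>     def move_file(self, src, dst):
>>         # Mirror the real server: moving a file that was never
>>         # created is an error the test will trip over.
>>         if src not in self._files:
>>             raise IOError('no such file: %s' % src)
>>         self._files.remove(src)
>>         self._files.add(dst)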
>>
>> All of this knowledge is vital to testing, and if it wasn't contained in
>> the fake driver, or something like it[1], it would have to be replicated
>> across all tests which require it. i.e. It may be 1599 lines of
>> complexity, but it's all complexity which has to live somewhere anyway.
>>
>> Incidentally, this is fresh in my mind because of
>> https://review.openstack.org/#/c/122760/ . Note the diff stat: +70,
>> -161, and the rewrite has better coverage, too :) It executes the
>> function under test, it checks that it produces the correct outcome, and
>> other than that it doesn't care how the function is implemented.
>>
>> TL;DR
>>
>> * Bootstrap hour is awesome
>> * Don't fake if you don't have to
>> * However, there are situations where it's a good choice
>
> I'm going to push on this a bit further. Mocking is fine in many
> cases, but it encodes dependencies from within your objects into the
> test suite: e.g. that X will call Y and then Z. If that's what you
> want to *test*, that's great, but outside of the very smallest unit
> tests it often isn't, because it makes your tests fragile to change:
> tests for a class Foo need to change when a parent class Bar alters,
> or when a utility class Quux alters. And that coupling has a cost and
> is a pain.
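> (A sketch of the kind of test that encodes "Foo calls Bar.helper" -
> Foo, Bar and helper are hypothetical names:)
>
> with mock.patch.object(Bar, 'helper') as helper:
>     Foo().do_thing()
> # This breaks whenever Bar, or Foo's use of Bar, changes - even if
> # do_thing's observable behaviour is identical.
> helper.assert_called_once_with('x')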
>
> Ad-hoc fakes are no better than mocks in this regard, but verified
> fakes are substantially better: they give nearly the same performance
> as mocks, with (generally) better diagnostics, higher fidelity and
> lower fragility. Where we don't have verified fakes, making a single
> robust fake for the interface is a good alternative - and I'd like to
> see OpenStack do more of this, not less.
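> "Verified" here means the fake is held to the real implementation's
> contract by running one shared test suite against both; a minimal
> sketch, with hypothetical RealClient/FakeClient classes:
>
> import unittest
>
> class KVContract(object):
>     # Shared contract tests; subclasses supply make_client().
>     def test_set_then_get(self):
>         client = self.make_client()
>         client.set('k', 'v')
>         self.assertEqual('v', client.get('k'))
>
> class RealClientTest(KVContract, unittest.TestCase):
>     def make_client(self):
>         return RealClient()   # the real thing
>
> class FakeClientTest(KVContract, unittest.TestCase):
>     def make_client(self):
>         return FakeClient()   # the fake, verified by the same tests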
>
> https://thoughtstreams.io/glyph/test-patterns/5952/ is a good post
> from Glyph describing them.
>
> -Rob
>
> --
> Robert Collins <rbtcollins at hp.com>
> Distinguished Technologist
> HP Converged Cloud
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev