13 January 2008
Avoid mega unit tests
We've just had a posting to the jMock user list that included the following:
I was involved in a project recently where JMock was used quite heavily. Looking back, here's what I found:
- The unit tests where at times unreadable (no idea what they were doing).
- Some tests classes would reach 500 lines in addition to inheriting an abstract class which also would have up to 500 lines.
- Refactoring would lead to massive changes in test code.
I've seen this failure mode on another project I've been helping with, so I think there might be a common pattern.
I don't think any unit test code should get that large, except for unusual circumstances. Unit tests are supposed to focus on at most a few classes and shouldn't need a large amount of set up. What I saw on the other project was enormous amounts of preparation and positioning to get the objects into a state where the expected feature could be exercised. Of course it's hard to understand the point of a test when there's just so much code. If you see this pattern, then I'd suggest that the code (or at least the test code) needs breaking up a bit. On the other hand, integration and acceptance tests that happen to be written using a unit test framework might well be larger.
One thing I need to explore is whether the naive use of interaction-testing is particularly susceptible to this failing, or whether it happens all the time and we're the only ones who get complained to. I am, however, convinced that the emphasis on mainly using Mocks to substitute external systems (some of which I perpetrated myself in the early days) is a deeply bad idea which pushes teams towards the sort of problem described here.
Labels: design, listening to the tests, testability
23 May 2007
Test Smell: Too many expectations
When a test has too many expectations it's hard to see what's important, what's really under test. For example, here's a test:
@Test public void decidesCasesWhenFirstPartyIsReady() {
context.checking(new Expectations(){{
one(firstPart).isReady(); will(returnValue(true));
one(organiser).getAdjudicator(); will(returnValue(adjudicator));
one(adjudicator).findCase(firstParty, issue); will(returnValue(case));
one(thirdParty).proceedWith(case);
}});
claimsProcessor.adjudicateIfReady(thirdParty, issue);
}
that might be implemented like this:
public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
if (firstParty.isReady()) {
Adjudicator adjudicator = organisation.getAdjudicator();
Case case = adjudicator.findCase(firstParty, issue);
thirdParty.proceedWith(case);
} else{
thirdParty.adjourn();
}
}
What makes the test hard to read is that everything is an expectation, so everything seems to be equally important. I can't tell what's significant and what's just there to get me through the test.
In fact, if I look at all the methods I call, there are only two that have any side effects outside this class: thirdParty.proceedWith() and thirdParty.adjourn(). It would be an error to call these more than once. All the other methods are queries, I can call organisation.getAdjudicator() repeatedly without breaking any behaviour. adjudicator.findCase() might go either way, but it happens to be a lookup so it has no side effects.
I can make my intentions clearer by distinguishing between Stubs, simulations of real behaviour that help me get my test to pass, and Expectations, assertions I want to make about how my object interacts with its neighbours. Reworking my test, I get:
@Test public void decidesCasesWhenFirstPartyIsReady() {
context.checking(new Expectations(){{
allowing(firstPart).isReady(); will(returnValue(true));
allowing(organiser).getAdjudicator(); will(returnValue(adjudicator));
allowing(adjudicator).findCase(firstParty, issue); will(returnValue(case));
one(thirdParty).proceedWith(case);
}});
claimsProcessor.adjudicateIfReady(thirdParty, issue);
}
which is explicit about how I expect my object to change the world.
Stub queries, mock actions
For the more formally minded, you might want to think (very, very loosely) of Stubs as pre-conditions and Expectations as post-conditions. The test now says that given the state described in the Stubs, the events described by the Expectations should have happened. Once I've sorted out which calls are Stubs, I can usually move some of them into common setup code, making the tests clearer still:
@Test public void decidesCasesWhenFirstPartyIsReady() {
context.checking(new Expectations(){{
allowing(firstParty).isReady(); will(returnValue(true));
one(thirdParty).proceedWith(case);
}});
claimsProcessor.adjudicateIfReady(thirdParty, issue);
}
It is, of course, possible that all the methods calls are actions. In which case, either I just have a complicated domain and there's not much I can do to simplify it or, more likely, there are some intermediate concepts in there that need to be extracted into objects.
Clearly, when I'm thinking about the whole system, I can't ignore implementation issues such as the cost of database queries, but that's not relevant at this point. What matters now is that I have tests and code that clearly express my intentions.
On a side note, this is why I'm not keen on an expectation syntax like this: expect(firstParty.isReady()).stubReturn(true). I find the leading "expect" confusing, when my intention is to stub.
Special bonus prize
I always have problems coming up with good examples. There's actually a better improvement to this code, which is to notice that that I've pulled out a chain of objects to get to the case object, exposing dependencies that aren't relevant here. What I should have done is told the nearest object to do the work for me, like this:
public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
if (firstParty.isReady()) {
organisation.adjudicateBetween(firstParty, thirdParty, issue);
} else{
thirdParty.adjourn();
}
}
or, possibly,
public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
if (firstParty.isReady()) {
thirdParty.startAdjudication(organisation, firstParty, issue);
} else{
thirdParty.adjourn();
}
}
which looks more balanced. If you spotted this then I award you a Moment of Smugness™ to be exercised at your convenience.
Labels: design, listening to the tests
14 May 2007
What the tests will tell you
We've found these benefits from learning to Listen to the Test Smells.
- Keeping knowledge local
- Some of the test smells we've identified, such as needing magic to create mocks, are to do with knowledge leaking between components. If I can keep knowledge local to an object, either internal or passed in, then its implementation is independent of its context. I can safely move it wherever I like. Do this consistently and I can build an application as a web of communicating objects.
- If it's explicit I can name it
- This is why I don't like mocking concrete classes, I like to have names for the relationships between objects as well the objects themselves. As the legends say, if I have something's true name then I can control it. If I can see it, I have more chance of finding other uses and so reducing duplication.
- More names mean more domain information
- I find that when we emphasise how objects communicate, rather than what they are, I end up with types and roles defined more in terms of the domain than of the implementation. I think this might be because I have more, smaller abstractions which gets me a further away from the underlying language. Somehow I seem to get more domain vocabulary into the code.
- Pass behaviour rather than data
- I find that TDD with mocks encourages me to write objects that tell each other what to do, rather then requesting and manipulating values. Applied consistently, I end up with a coding style where I pass behaviour (in listeners and callbacks) from the higher levels of the code to the data, rather than pulling data up through the stack. It's an unusual style in the Java/C# world, but I find that we get better encapsulation and clearer abstractions because I have to clarify what the pieces will do, not just what values they hold.
Labels: design, listening to the tests
09 May 2007
Object Collaboration Stereotypes
Another diagnosis for a Bloated Constructor might be that not all of the arguments are dependencies, services that must be passed to the object before it can do its job. Other arguments might be more to do with structuring the implementation of the object or varying its behaviour to suit its neighbours. To help me think about this distinction, I've found it useful to categorise the peers of an object into four stereotypes:
- Dependencies are services that the object needs from its environment so that it can fulfil its responsibilities.
- Notifications are other parts of the system that need to know when the object changes state or performs an action.
- Policies are objects that tweak or adapt the object's behaviour to the needs of the system.
- Parts are components in the implementation that are not controlled from outside the object after being set.
Dependencies must be passed in through the constructor. An object needs the system to provide implementations of its dependencies to function at all, so there's no point in creating an instance until they're available. Partially creating an object and then finishing it off by setting properties is risky and brittle because the programmer has to know whether all the dependencies have been set. As Yoda said, "New, or new not. There is no try."
Other types of peer can be passed to the constructor as a convenience but, if the list is becoming too long, they can also be initialised to safe defaults and overridden later (one way to distinguish the two is that there are no safe defaults for a Dependency). Policies and Parts can be initialised to commonly used cases, and collections of Parts can be initialised as empty. I can then add setters and add/remove methods to the class to allow the object's clients to configure it. Similarly, Notifications can be implemented as events. Unicast events can be initialised with a null implementation of the listener interface, and multicast events can be initialised with an empty collection of listeners. I always create a valid object, and I also have the hooks I need for testing.
A high speed example
Here's an example, it's probably not quite bad enough to need fixing, but it'll do to make the point. The application is a Formula 1 racing game, players can try out different configurations of car and driving style to see which one wins. A RacingCar represents a competitor within a race.
public class RacingCar {
private final Track track;
private Tyres tyres;
private Suspension suspension;
private Wing frontWing;
private Wing backWing;
private double fuelLoad;
private CarListener listener;
private DrivingStrategy driver;
public RacingCar(Track track, DrivingStrategy driver,
Tyres tyres, Suspension suspension,
Wing frontWing, Wing backWing, double fuelLoad,
CarListener listener)
{
this.track = track;
this.driver = driver;
this.tyres = tyres;
this.suspension = suspension;
this.frontWing = frontWing;
this.backWing = backWing;
this.fuelLoad = fuelLoad;
this.listener = listener;
}
}
It turns out that the track is the only Dependency of a RacingCar, the hint is that it's the only field that's final. The driver is a Policy, the listener is a Notification, and the rest are Parts; all of these can be modified by the user before or during the race. Here's a reworked constructor:
public class RacingCar {
private final Track track;
private DrivingStrategy driver = DriverTypes.borderlineAggressiveDriving();
private Tyres tyres = TyreTypes.mediumSlicks();
private Suspension suspension = SuspensionTypes.mediumStiffness();;
private Wing frontWing = WingTypes.mediumDownforce();
private Wing backWing = WingTypes.mediumDownforce();
private double fuelLoad = 0.5;
private CarListener listener = CarListener.NONE;
public RacingCar(Track track) {
this.track = track;
}
public void setSuspension(Suspension suspension) { …
public void setTyres(Tyres tyres) { …
public void setEngine(Engine engine) { …
public void setListener(CarListener listener) { …
…
}
Now I've initialised the peers to most common defaults, the user can then configure them through the user interface. I've initialised the listener to a null implementation, again this can be changed later by the object's environment.
A matter of taste
These stereotypes are only heuristics to help me think about the design, not hard rules, so I don't get obsessed with finding just the right classification of an object's peers. What matters most is the context in which I'm developing the object, my mental model of the domain. For example, an auditing log might be a Dependency, since it's a legal requirement for the business, or a Notification, since the object will otherwise function without it. Policies belong to the environment and can usually be constants, since they're often immutable and so don't need to be instantiated more than once. Parts belong to the object and usually cannot be stateless, so they need a new instance. For example, in my Racing Car, tyres might be a Policy if TyreTypes.mediumSlicks() describes only how the choice of tyre affects the car's handling, but a Part if it stores mutable values such as air pressure and wear.
Labels: design, listening to the tests
02 May 2007
Cross-referencing Code Smells
Labels: design, listening to the tests
Test Smell: Confused object
Another diagnosis for a Bloated Constructor might be that the object itself is too large because it has too many responsibilities. For example,
public class Handset {
public Handset(Network network, Camera camera, Display display,
DataNetwork dataNetwork, AddressBook addressBook,
Storage storage, Tuner tuner, …) {
// set the fields here
}
public void placeCallTo(DirectoryNumber number) {
network.openVoiceCallTo(number);
}
public void takePicture() {
Frame frame = storage.allocateNewFrame();
camera.takePictureInto(frame);
display.showPicture(frame);
}
public void showWebPage(URL url) {
display.renderHtml(dataNetwork.retrievePage(url));
}
public void showAddress(SearchTerm searchTerm) {
display.showAddress(addressBook.findAddress(searchTerm));
}
public void playRadio(Frequency frequency) {
tuner.tuneTo(frequency);
tuner.play();
}
// and so on …
}
Like my mobile phone, this class has several unrelated responsibilities which force it to pull in many dependencies. Like my phone, the class is confusing to use so I keep pressing the wrong button, and I find that unrelated features interfere with each other. I'm prepared to put up with these compromises in my handset because I don't have enough pockets for all the devices it implements, but that doesn't apply to my code. This class should be broken up. The sort of fault lines I look for are fields that have no relationship with each other, particularly if the methods that reference them aren't used together either.
An associated smell for this kind of class is that its test suite will look confused too. The tests for its various features will have no relationship with each other, so I'll be able to make major changes in one area without touching any others. If I can break up the test class into slices that don't share anything, then the best thing might be to do just that and then slice up the object too.
Labels: design, listening to the tests
30 April 2007
Test Smell: Bloated Constructor
Sometimes, during the TDD process, I end up with a constructor that has a long, unwieldy list of arguments. Most likely, I got there by adding the object's peers (other objects it has to communicate with) one at a time, and it got out of hand. This is not dreadful, since the process helped me sort out the design of the object and its neighbours, but now it's time to clean up. The object must still be constructed in a valid state as required by the application. I will still need all the current constructor arguments, so I should see if there's any implicit structure there that I can tease out.
One possibility is that some of the arguments together define a concept that should be packaged up and replaced with a new object that represents it. Here's a small example.
public class MessageProcessor {
public MessageProcessor(MessageUnpacker unpacker,
AuditTrail auditer,
CounterPartyFinder counterpartyFinder,
LocationFinder locationFinder,
DomesticNotifier domesticNotifier,
ImportedNotifier importedNotifier) {
// set the fields here
}
public void onMessage(Message rawMessage) {
UnpackedMessage unpacked = unpacker.unpack(rawMessage, counterpartyFinder);
auditer.recordReceiptOf(unpacked);
// some other activity here
if (locationFinder.isDomestic(unpacked)) {
domesticNotifier.notify(unpacked.asDomesticMessage());
} else {
importedNotifier.notify(unpacked.asImportedMessage())
}
}
}
Just the thought of writing expectations for all these objects makes me wilt, which suggests that things are too complicated. A first step is to notice that the unpacker and counterpartyFinder are always used together, they're fixed at construction and one calls the other. I can remove one argument by pushing the counterpartyFinder into the unpacker.
public class MessageProcessor {
public MessageProcessor(MessageUnpacker unpacker,
AuditTrail auditer,
LocationFinder locationFinder,
DomesticNotifier domesticNotifier,
ImportedNotifier importedNotifier) {
…
public void onMessage(Message rawMessage) {
UnpackedMessage unpacked = unpacker.unpack(rawMessage);
…
Then there's the triple of locationFinder and the notifiers, which seem to go together. It might make sense to package them together into a MessageDispatcher.
public class MessageProcessor {
public MessageProcessor(MessageUnpacker unpacker,
AuditTrail auditer,
MessageDispatcher dispatcher) {
…
public void onMessage(Message rawMessage) {
UnpackedMessage unpacked = unpacker.unpack(rawMessage);
auditer.recordReceiptOf(unpacked);
// some other activity here
dispatcher.dispatch(unpacked);
}
}
Although I've forced this example to get it to fit within a blog post, it shows that being sensitive to complexity in the tests can help me clarify my designs. Now I have a message handling object that clearly performs the usual three stages: receive, process, and forward. I've pulled out the message routing code (the MessageDispatcher), so the MessageProcessor has fewer responsiblities and I know where to put routing decisions when things get more complicated. You might also notice that this code is easier to test.
When I'm extracting implicit components, I look first for two conditions: arguments that are always used together in the class, and that have the same lifetime. That usually finds me the concept, then I have the harder task of finding a good name.
As an aside, one of the signs that the design is developing nicely is that this kind of change is easy to integrate. All I have to do is find where the MessageProcessor is created and change this
messageProceesor =
new MessageProcessor(new XmlMessageUnpacker(), auditer, counterpartyFinder,
locationFinder, domesticNotifier, importedNotifier);
to this
messageProceesor =
new MessageProcessor(
new XmlMessageUnpacker(counterpartyFinder), auditer
new MessageDispatcher(locationFinder, domesticNotifier,
importedNotifier));
Later I can cut down the noise by extracting out the creation of the MessageDispatcher.
Labels: design, listening to the tests, testability
24 April 2007
Test Smell: Mocking concrete classes
One approach to Interaction Testing is to mock concrete classes rather than interfaces. The technique is to inherit from the class you want to mock and override the methods that will be called within the test, either manually or with any of the mocking frameworks. I think it's a technique that should be used only when you really have no other options.
Here's an example of mocking by hand, the test verifies that the music centre starts the CD player at the requested time. Assume that setting the schedule on a CdPlayer object involves triggering some behaviour we don't want in the test, so we override the scheduleToStartAt and verify afterwards that we've called it with the right argument.
public class MusicCentreTest {
@Test public void startsCdPlayerAtTimeRequested() {
final MutableTime scheduledTime = new MutableTime();
CdPlayer player = new CdPlayer() {
@Override
public void scheduleToStartAt(Time startTime) {
scheduledTime.set(startTime);
}
}
MusicCentre centre = new MusicCentre(player);
centre.startMediaAt(LATER);
assertEquals(LATER, scheduledTime.get());
}
}
The problem with this approach is that it leaves the relationship between the objects implicit. I hope we've made clear by now that the intention of Test-Driven Development with Mock Objects is to discover relationships between objects. If I subclass, there's nothing in the domain code to make such a relationship visible, just methods on an object. This makes it harder to see if the service that supports this relationship might be relevant elsewhere and I'll have to do the analysis again next time I work with the class. To make the point, here's a possible implementation of CdPlayer:
public class CdPlayer {
public void scheduleToStartAt(Time startTime) { …
public void stop() { …
public void gotoTrack(int trackNumber) { …
public void spinUpDisk() { …
public void eject() { …
}
It turns out that my MusicCentre object only uses the starting and stopping methods on the CdPlayer, the rest are used by some other part of the system. I'm over-specifying my MediaCentre by requiring it to talk to a CdPlayer, what it actually needs is a ScheduledDevice. Robert Martin made the point (back in 1996) in his Interface Segregation Principle that "Clients should not be forced to depend upon interfaces that they do not use", but that's exactly what we do when we mock a concrete class.
There's a more subtle but powerful reason for not mocking concrete classes. As part of the TDD with Mocks process, I have to think up names for the relationships I discover—in this example the ScheduledDevice. I find that this makes me think harder about the domain and teases out concepts that I might otherwise miss. Once something has a name, I can talk about it.
In case of emergency
There are a few occasions when I have to put up with this smell. The least unacceptable situation is where I'm working with legacy code that I control but can't change all at once. Alternatively, I might be working with third party code that I can't change. As I wrote before, it's usually better to write a veneer over an external library rather than mock it directly, but sometimes it's just too hard. Either way, these are unfortunate but necessary compromises that I would try to work my way out of as soon as possible. The longer I leave them in the code, the more likely it is that some brittleness in the design will cause me grief.
Above all, do not mock by overriding a class's internal features, which just locks down your test to the quirks of the current implementation. Override only visible methods. This rule also prohibits exposing internal methods just so you can override them in a test. If you can't get to the structure you need, then the tests are telling you that it's time to break up the class into smaller, composable features.
Labels: design, listening to the tests, testability
19 April 2007
Test Smell: Logging is also a feature
I have a more contentious example of working with objects that are hard to replace: logging. Take a look at these two lines of code:
log.error("Lost touch with Reality after " + timeout + "seconds);
log.trace("Distance travelled in the wilderness: " + distance);
These are two separate features that happen to share an implementation. Let me explain.
Support logging (errors and info) is part of the user interface of the application. These messages are intended to be tracked by support staff, perhaps system administrators and operators, to diagnose a failure or monitor the progress of the running system.
Diagnostic logging (debug and trace), is infrastructure for programmers. These messages should not be turned on in production because they're intended to help the programmers understand what's going on whilst they're developing the system.
Given this distinction, I should consider using different techniques to develop these two type of logging. Support logging should be Test-Driven from someone's requirements, such as auditing or responsiveness to failure. The tests will make sure that I've thought about what each message is for and that it works. The tests will also protect me from breaking any tools and scripts that other people write to analyse these log messages. Diagnostic logging, on the other hand, is driven by the programmers' need for fine-grained tracking of what's happening in the system. It's scaffolding so it probably doesn't need to be Test-Driven and the messages might not need to be as consistent as those for support logs. These messages are not to be used in production, right?
Support, not logging
To get back to the point of the series, writing unit tests against static global objects, including loggers, is noisy. Either I have to read from the file system or I have to manage an extra testing Appender. I have to remember to clean up afterwards so that tests don't interfere with each other, and set the right level on the right Logger. The noise in the test reminds me that my code is working at two levels: my domain and the logging infrastructure. Here's a common example of code with logging:
Location location = tracker.getCurrentLocation();
for (Filter filter : filters) {
filter.selectFor(location);
if (logger.isInfoEnabled()) {
logger.info("Filter " + filter.getName() + ", " + filter.getDate()
+ " selected for " + location.getName()
+ ", is current: " + tracker.isCurrent(location));
}
}
Notice the shift in vocabulary and style between the functional part of the loop and the logging part (in italics)? At a micro level, the code is doing two things at once, something to do with locations and rendering support information, which breaks the Single Responsibility Principle. Maybe I could do this instead:
Location location = tracker.getCurrentLocation();
for (Filter filter : filters) {
filter.selectFor(location);
support.notifyFiltering(tracker, location, filter);
}
where the support object might be implemented by a logger, a message bus, pop-up windows, or whatever's appropriate; the detail is not relevant to the code at this level. This code is also easier to test. We, rather than the logging framework, own the support object so we can pass in a mock implementation at our convenience, perhaps in the constructor, and keep it local to the test case. The other simplification is that now we're testing for objects rather than the formatted contents of a string. Of course, we will still need to write an implementation of support and some focussed integration tests to go with it; Jeff Langr's Agile Java is one source of advice on how to do that.
But that's crazy talk…
The idea of encapsulating support reporting sounds like over-design, but it's worth thinking about for a moment. It means I'm writing code in terms of my intent (helping the support people) rather than the implementation (logging), so it's more expressive. All the support reporting is handled in a few known places, so it's easier to be consistent about how things are reported and to encourage reuse. It can also help me structure and control my reporting in terms of the application domain, rather than in terms of java packages. Finally, the act of writing a test for each report helps me avoid the "I don't know what to do with this exception, so I'll log it and carry on" syndrome, which leads to log-bloat and production failures because I haven't handled obscure error conditions.
One objection is "I can't pass in a logger for testing because I've got logging all over my domain objects. I'd have to pass one around everywhere". I think this is a test smell that is telling me that I haven't clarified my design enough. Perhaps some of my support logging should really be diagnostic logging, or I'm logging more than I need to because I left some in that I wrote when I hadn't yet understood the behaviour. Most likely, there's still too much duplication in my domain code and I haven't yet found the "choke points" where most of the production logging should go.
So what about diagnostic logging? Is it disposable scaffolding that should be taken down once the job is done, or essential infrastructure that should be tested and maintained? That depends on the system, but once I've made the distinction I have more freedom to think about using different techniques for support and diagnostic logging. I might even decide that in-line code is the wrong technique to implement diagnostic logging because it interferes with the readability of the production code that matters. Perhaps I could weave in some Aspects instead (since that's the canonical example of their use). Perhaps not, but at least I now have a choice.
One final data point. A colleague of mine once worked on a system where so much pointless content was written to the logs that they had to rolled off the disks after a week. This made maintenance very difficult as the relevant logs had usually gone by the time a bug was assigned to be fixed. If they'd not logged at all they could have made the system go faster with no loss of useful information.
Labels: design, domain-specific-language, listening to the tests, testability
14 April 2007
Test Smell: Everything is mocked
This Test Smell is about a weakness in the design of tests, rather than in the code to be tested.
It may sound obvious, but you don't have to mock every class in your system.
Don't mock value objects
There isn't much point in writing mocks for simple value objects (which should be immutable anyway), just create an instance and use it. For example, in this test:
@Test public void sumsTotalRunningTime() {
Show show = new Show();
Video video1 = context.mock(Video.class);
Video video2 = context.mock(Video.class);
context.checking(new Expectations(){{
one(video1).getTime(); will(returnValue(40));
one(video2).getTime(); will(returnValue(23));
}});
show.add(video1);
show.add(video2);
assertEqual(63, show.getRunningTime())
}
where Video holds details of a part of a show. It's not worth creating an interface / implementation pair to control which time values are returned, just create instances with the appropriate values and use them. There are a couple of heuristics for when a class is not worth mocking. First, it has only accessors or simple methods that act on values it holds, it doesn't have any interesting behaviour. Second, you can't think of a meaningful name for the class other than VideoImpl or some such vague term.
Don't mock third-party libraries
There are two problems when mocking a third-party library. First, you can't use the tests to drive the design, the API belongs to someone else. Mock-based tests for external libraries often end up contorted and messy to get through to the functionality that you want to exercise. These tests are giving off smells that highlight designs that are inappropriate for your system but, instead of fixing the problem and simplifying the code, you end up carrying the weight of the test complexity. The second issue is that you have to be sure that the behaviour you implement in a mock (or stub) matches the external library. How difficult this is depends on the API, which has to be specified (and implemented) well enough for you to be certain that the tests are meaningful.
To develop against an external API, first TDD your code to define interfaces for the services that your application needs in terms of your domain. Then write a thin layer using the library to implement these interfaces, with focussed integration tests to confirm your assumptions about how it works. How to make integration tests work effectively will depend on your environment, but you will need to do it anyway so you might as well start early and get the benefit. There will be relatively few integration tests, compared to the number of unit tests, so they should not get in the way of the build even if they're not as fast as the in-memory tests.
There are some exceptions when mocking third-party libraries can be helpful. You might use mocks to simulate behaviour that is hard to trigger with the real library, such as throwing exceptions. Similarly, you might use mocks to test a sequence of calls, for example making sure that a transaction is rolled back if there's a failure. There should not be many of these in a test suite.
Labels: design, listening to the tests, tip
11 April 2007
Test Smell: I need to mock an object I can't replace (without magic)
Date now = new Date();Under the covers, the constructor calls the singleton System and sets the new instance to the current time using
System.currentTimeMillis(). This is a convenient technique, but it comes at a cost. Let's say we want to write a test like this:
@Test public void RejectsRequestsNotWithinTheSameDay() {
receiver.acceptRequest(FIRST_REQUEST);
// the next day
assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}The implementation looks like this:
public boolean acceptRequest(Request request) {
final Date now = new Date();
if (dateOfFirstRequest == null) {
dateOfFirstRequest = now;
} else if (firstDateIsDifferentFrom(now)) {
return false;
}
// process the request
return true;
}where dateOfFirstRequest is a field and firstDateIsDifferentFrom is a helper method that hides the unpleasantness of working with the Java date library. To test this timeout I must either make the test wait overnight or do something clever, perhaps with Aspects or byte-code manipulation, to intercept the constructor and return suitable Date values for the test. This difficulty in testing is a hint that I should change the code. To make the test easier, I need to control how Date objects are created, so I introduce a Clock and pass it into the Receiver. If I stub the Clock, the test might look like this:
@Test public void RejectsRequestsNotWithinTheSameDay() {
Receiver receiver = new Receiver(stubClock);
stubClock.setNextDate(TODAY);
receiver.acceptRequest(FIRST_REQUEST);
stubClock.setNextDate(TOMORROW);
assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}and the implementation like this:
public boolean acceptRequest(Request request) {
final Date now = clock.now();
if (dateOfFirstRequest == null) {
dateOfFirstRequest = now;
} else if (firstDateIsDifferentFrom(now)) {
return false;
}
// process the request
return true;
}Now I can test the Receiver without any special tricks. More importantly, however, I've made it obvious that a Receiver is dependent on time, I can't even create one without a Clock. One could argue that this is breaking encapsulation by exposing the internals of a Receiver, I should be able to just create an instance and not worry, but personally I want to know about this dependency — especially when the service is rolled out across timezones and New York and London start complaining about different results.
From procedural code to objects
The introduction of a Clock suggests to me that I've been missing a concept in my code: date checking in terms of my domain. A Receiver doesn't need to know all the details of a Calendar system, such as time zones and locales, it just need to know whether, for this application, the date has changed. There's a clue in the fragment:firstDateIsDifferentFrom(now)which means that I've had to wrap up some date manipulation code in the Receiver. It's the wrong object, that kind of work should be done by the Clock. I'll write the test again (in jMock2 syntax):
@Test public void RejectsRequestsNotWithinTheSameDay() {
Receiver receiver = new Receiver(clock);
context.checking(new Expectations() {{
allowing(clock).now(); will(returnValue(NOW));
one(clock).dayHasChangedFrom(NOW); will(returnValue(false));
}});
receiver.acceptRequest(FIRST_REQUEST);
assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}The implementation looks like this:
public boolean acceptRequest(Request request) {
if (dateOfFirstRequest == null) {
dateOfFirstRequest = clock.now();
} else if (clock.dayHasChangedFrom(dateOfFirstRequest)) {
return false;
}
// process the request
return true;
}This version of Receiver is more focussed, it doesn't need to know how to distinguish one date from another and it only needs to get a date to set the first value. The Clock interface defines exactly those date services a Receiver needs from its environment.
But I think I can push this further. The Receiver only stores a date so it can detect a change of day, perhaps I should delegate all the date functionality to another object which, for want of a better name, I'll call a SameDayChecker.
@Test public void RejectsRequestsOutsideAllowedPeriod() {
Receiver receiver = new Receiver(sameDayChecker);
context.checking(new Expectations() {{
allowing(sameDayChecker).hasExpired(); will(returnValue(false));
}});
assertFalse("too late now", receiver.acceptRequest(REQUEST));
}With an implementation like this:
public boolean acceptRequest(Request request) {
if (sameDayChecker.hasExpired()) {
return false;
}
// process the request
return true;
}
All the logic about dates has been separated out from the Receiver, which can concentrate on processing the request. With two objects I can make sure that each behaviour (date checking and request processing) is unit tested clearly.
Implicit dependencies are still dependencies
I can hide a dependency from the caller of a component by using a global to bypass encapsulation, but that doesn't make the dependency go away, it just makes it inaccessible. For example, I once had to work with a .Net library that could not be loaded without installing ActiveDirectory — which I couldn't do on my machine and wasn't actually required for the features that I wanted to use. The author was trying to be helpful and make the library "just work", but the result was that I couldn't even try it out.
The point about Objects, as a technique for structuring code, is that the boundaries of an object should be clearly visible. An object should only deal with values and instances that are either local, created and managed within its scope, or passed in explicitly.
In this example, the act of making date checking testable forced me to make the Receiver's requirements more explicit and then to think more clearly about its structure.
Labels: design, listening to the tests, testability
22 March 2007
Synaesthesia: Listening to Test Smells
We just gave our "Synaesthesia:
Listening to the Tests" talk at QCon. The exercise helped us to clarify what we want to say about our ideas on Test-Driven Development.
For a more eye-churning version of the image, try Nat's blog.
In our experience, where we find that our tests are awkward to write, it's usually because the design of our target code can be improved. The trick is to let your tests drive your development—that's a hint as to why it's called Test-Driven Development. You can sensitize yourself to find the rough edges in your tests and use them for rapid feedback about what to do with the code. It's rather like Systems Thinking for code: I don't stop at the immediate problem (an ugly test) but look deeper for why I'm in this situation (weakness in the design) and address that†.
As Avi Naparstek once posted to the Test-Driven Development list, the flow for state- and interaction-based testing is different. With state-based testing, where objects are exercised in small clusters, I tend to clean up within the cluster when there's enough code to see what's happening; the design effort happens in bursts. With interaction-based testing, the design is more continuous. The tests are finer grain, so they're more active in shaping the code.
We'll be writing up some test "smells" and what they might mean on this blog.
† With proper Systems Thinking, I should carry on a few more levels and think about how I ended up with a weak design in the first place, but that's for another day.
Labels: design, listening to the tests, testability
© The authors
