You can see a lot just by observing.
Sometimes we find it difficult to write a test for some functionality we want to add to our code. In our experience, this is usually because our design can be improved, perhaps because the class is too tightly coupled to its environment or does not implement a clear abstraction. We try to use this as a opportunity to improve our code, rather than working around the design by introducing ever more sophisticated tools. We find that the qualities that make an object easy to test are those that make our code flexible and able to accommodate change.
The trick is to let our tests drive our design (that's a hint as to why it's called Test-Driven Development). TDD is about testing code, that is about verifying its externally visible qualities such as reliability, performance, and accuracy. TDD is also about feedback on the code's internal qualities such as the coupling and cohesion of its classes, dependencies that are explicit or hidden, and effective information hiding; these are the qualities that will keep the code maintainable. With practice, we’ve become more sensitive to the rough edges in our tests, so we can use them for rapid feedback about the design. Now when we find a feature that’s difficult to test we don’t just ask ourselves how to test it, but also why is it difficult to test.
In this chapter, we look at some common “Test Smells” that we’ve encountered and discuss what they might imply about design.
One approach to reducing complexity in code is to make commonly useful objects accessible through a global name, usually implemented with a Singleton to fool oneself into thinking that its not really a global variable. The idea is that any code that needs access to a feature can just refer to it by its global name instead of receiving it as a parameter. Here's a common example:
Date now = new Date();
Under the covers, the constructor calls the singleton
System and sets the new instance to the current time
using System.currentTimeMillis(). This is a convenient
technique, but it comes at a cost. Let's say we want to write a test like
this:
@Test public void
rejectsRequestsNotWithinTheSameDay() {
receiver.acceptRequest(FIRST_REQUEST);
// the next day
assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}The implementation looks like this:
public boolean acceptRequest(Request request) {
final Date now = new Date();
if (dateOfFirstRequest == null) {
dateOfFirstRequest = now;
} else if (firstDateIsDifferentFrom(now)) {
return false;
}
// process the request
return true;
}where dateOfFirstRequest is a field and
firstDateIsDifferentFrom is a helper method that hides
the unpleasantness of working with the Java date library.
To test this timeout we must either make the test wait overnight
or do something clever, perhaps with Aspects or byte-code manipulation, to
intercept the constructor and return suitable Date
values for the test. This difficulty in testing is a hint that we should
change the code. To make the test easier, we need to control how
Date objects are created, so we introduce a
Clock and pass it into the
Receiver. If we stub the Clock, the test might look
like this:
@Test public void
rejectsRequestsNotWithinTheSameDay() {
Receiver receiver = new Receiver(stubClock);
stubClock.setNextDate(TODAY);
receiver.acceptRequest(FIRST_REQUEST);
stubClock.setNextDate(TOMORROW);
assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}and the implementation like this:
public boolean acceptRequest(Request request) {
final Date now = clock.now();
if (dateOfFirstRequest == null) {
dateOfFirstRequest = now;
} else if (firstDateIsDifferentFrom(now)) {
return false;
}
// process the request
return true;
}Now we can test the Receiver without any special tricks. More importantly, however, we've made it obvious that a Receiver is dependent on time, we can't even create one without a Clock. One could argue that this is breaking encapsulation by exposing the internals of a Receiver, we should be able to just create an instance and not worry, but we want to know about this dependency—especially when the service is rolled out across timezones, and New York and London start complaining about different results.
The introduction of a Clock suggests to me that we've been missing a concept in our code: date checking in terms of our domain. A Receiver doesn't need to know all the details of a Calendar system, such as time zones and locales, it just need to know whether, for this application, the date has changed. There's a clue in the fragment:
firstDateIsDifferentFrom(now)
which means that we've had to wrap up some date manipulation code in the Receiver. It's the wrong object, that kind of work should be done by the Clock. We'll write the test again:
@Test public void
rejectsRequestsNotWithinTheSameDay() {
Receiver receiver = new Receiver(clock);
context.checking(new Expectations() {{
allowing(clock).now(); will(returnValue(NOW));
one(clock).dayHasChangedFrom(NOW); will(returnValue(false));
}});
receiver.acceptRequest(FIRST_REQUEST);
assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}The implementation looks like this
public boolean acceptRequest(Request request) {
if (dateOfFirstRequest == null) {
dateOfFirstRequest = clock.now();
} else if (clock.dayHasChangedFrom(dateOfFirstRequest)) {
return false;
}
// process the request
return true;
}This version of Receiver is more focussed, it doesn't need to know how to distinguish one date from another and it only needs to get a date to set the first value. The Clock interface defines exactly those date services a Receiver needs from its environment.
But we think we can push this further. The Receiver only stores a date so it can detect a change of day, perhaps I should delegate all the date functionality to another object which, for want of a better name, we'll call a SameDayChecker.
@Test public void
rsejectsRequestsOutsideAllowedPeriod() {
Receiver receiver = new Receiver(sameDayChecker);
context.checking(new Expectations() {{
allowing(sameDayChecker).hasExpired(); will(returnValue(false));
}});
assertFalse("too late now", receiver.acceptRequest(REQUEST));
}With an implementation like this
public boolean acceptRequest(Request request) {
if (sameDayChecker.hasExpired()) {
return false;
}
// process the request
return true;
}All the logic about dates has been separated out from the Receiver, which can concentrate on processing the request. With two objects we can make sure that each behaviour (date checking and request processing) is unit tested clearly.
We can hide a dependency from the caller of a component by using a global to bypass encapsulation, but that doesn't make the dependency go away, it just makes it inaccessible. For example, Steve once had to work with a .Net library that could not be loaded without installing Microsoft’s ActiveDirectory—which I couldn't do on my machine and wasn't actually required for the features that I wanted to use. The author was trying to be helpful and make the library "just work", but the result was that I couldn't even try it out.
The point about Objects, as a technique for structuring code, is that the boundaries of an object should be clearly visible. An object should only deal with values and instances that are either local, created and managed within its scope, or passed in explicitly.
In this example, the act of making date checking testable forced us to make the Receiver's requirements more explicit and then to think more clearly about its structure.
There are several frameworks available that use techniques such as manipulating class loaders or byte codes to allow unit tests to break dependencies without changing the target code. As a rule, these are advanced techniques that most developers would not use when writing production code. Sometimes these are really necessary, but they come with a substantial, and hidden, cost. Unit-testing tools that let the programmer side-step poor dependency management in the design also waste a valuable source of feedback. When the developers eventually do need to address these design weaknesses, to add some urgent feature, they can find it even harder. The poor structure will have influenced other parts of the system that rely on it, and any understanding of the original intent will have evaporated. Like dirty dishes, it's easier to get the grease off before it's been baked in.
We have a more contentious example of working with objects that are hard to replace: logging. Take a look at these two lines of code:
log.error("Lost touch with Reality after " + timeout + "seconds);
log.trace("Distance travelled in the wilderness: " + distance);These are two separate features that happen to share an implementation. Let us explain.
Support logging (errors and info) is part of the user interface of the application. These messages are intended to be tracked by support staff, perhaps system administrators and operators, to diagnose a failure or monitor the progress of the running system
Diagnostic logging (debug and trace), is infrastructure for programmers. These messages should not be turned on in production because they're intended to help the programmers understand what's going on whilst they're developing the system.
Given this distinction, we should consider using different techniques to develop these two type of logging. Support logging should be Test-Driven from someone's requirements, such as auditing or responsiveness to failure. The tests will make sure that we've thought about what each message is for and that it works. The tests will also protect us from breaking any tools and scripts that other people write to analyse these log messages. Diagnostic logging, on the other hand, is driven by the programmers' need for fine-grained tracking of what's happening in the system. It's scaffolding so it probably doesn't need to be Test-Driven and the messages might not need to be as consistent as those for support logs. These messages are not to be used in production, right?
To get back to the point of the chapter, writing unit tests against static global objects, including loggers, is noisy. We have to either read from the file system or manage an extra testing Appender, and we have to remember to clean up afterwards so that tests don't interfere with each other, and set the right level on the right Logger. The noise in the test reminds me that our code is working at two levels: our domain and the logging infrastructure. Here's a common example of code with logging:
Location location = tracker.getCurrentLocation();
for (Filter filter : filters) {
filter.selectFor(location);
if (logger.isInfoEnabled()) {
logger.info("Filter " + filter.getName() + ", " + filter.getDate()
+ " selected for " + location.getName()
+ ", is current: " + tracker.isCurrent(location));
}
}Notice the shift in vocabulary and style between the functional part of the loop and the (emphasised) logging part? At a micro level, the code is doing two things at once, something to do with locations and rendering support information, which breaks the Single Responsibility Principle. Maybe we could do this instead
Location location = tracker.getCurrentLocation();
for (Filter filter : filters) {
filter.selectFor(location);
support.notifyFiltering(tracker, location, filter);}where the support object might be
implemented by a logger, a message bus, pop-up windows, or whatever's
appropriate; the detail is not relevant to the code at this level. This
code is also easier to test. We, rather than the logging framework, own
the support object so we can pass in a mock implementation at our
convenience, perhaps in the constructor, and keep it local to the test
case. The other simplification is that now we're testing for objects
rather than the formatted contents of a string. Of course, we will still
need to write an implementation of support and some
focussed integration tests to go with it.
The idea of encapsulating support reporting sounds like over-design, but it's worth thinking about for a moment. It means we're writing code in terms of our intent (helping the support people) rather than the implementation (logging), so it's more expressive. All the support reporting is handled in a few known places, so it's easier to be consistent about how things are reported and to encourage reuse. It can also help us structure and control our reporting in terms of the application domain, rather than in terms of Java packages. Finally, the act of writing a test for each report helps us avoid the "I don't know what to do with this exception, so I'll log it and carry on" syndrome, which leads to log-bloat and production failures because we haven't handled obscure error conditions.
One objection we've heard is "I can't pass in a logger for testing because I've got logging all over my domain objects. I'd have to pass one around everywhere". We think this is a test smell that is telling us that we haven't clarified our design enough. Perhaps some of our support logging should really be diagnostic logging, or we're logging more than we need to because we left some in that we wrote when we hadn't yet understood the behaviour. Most likely, there's still too much duplication in our domain code and we haven't yet found the "choke points" where most of the production logging should go.
So what about diagnostic logging? Is it disposable scaffolding that should be taken down once the job is done, or essential infrastructure that should be tested and maintained? That depends on the system, but once we've made the distinction we have more freedom to think about using different techniques for support and diagnostic logging. We might even decide that in-line code is the wrong technique to implement diagnostic logging because it interferes with the readability of the production code that matters. Perhaps we could weave in some Aspects instead (since that's the canonical example of their use). Perhaps not, but at least we've now clarified the the choice.
One final data point. One of us once worked on a system where so much pointless content was written to the logs that they had to rolled off the disks after a week. This made maintenance very difficult as the relevant logs had usually gone by the time a bug was assigned to be fixed. If they'd not logged at all they could have made the system go faster with no loss of useful information.
This Test Smell is more usually about weaknesses in the design of the tests, rather than in the code to be tested. It may sound obvious, but not every class in a system has to be mocked.
There isn't much point in writing mocks for simple value objects (which should be immutable anyway), just create an instance and use it. For example, in this test
@Test public void
sumsTotalRunningTime() {
Show show = new Show();
Video video1 = context.mock(Video.class);
Video video2 = context.mock(Video.class);
context.checking(new Expectations(){{
one(video1).getTime(); will(returnValue(40));
one(video2).getTime(); will(returnValue(23));
}});
show.add(video1);
show.add(video2);
assertEqual(63, show.getRunningTime())
}where Video holds details of a part of a
show. It's not worth creating an interface / implementation pair to
control which time values are returned, just create instances with the
appropriate values and use them. There are a couple of heuristics for
when a class is not worth mocking. First, it has only accessors or
simple methods that act on values it holds, it doesn't have any
interesting behaviour. Second, we can't think of a meaningful name for
the class other than VideoImpl or some such vague
term.
There are two problems when mocking a third-party library. First, we can't use the tests to drive the design, the API belongs to someone else. Mock-based tests for external libraries often end up contorted and messy to get through to the functionality that we want to exercise. These tests are giving off smells that highlight designs that are inappropriate for our system but, instead of fixing the problem and simplifying the code, we end up carrying the weight of the test complexity. The second issue is that we have to be sure that the behaviour we implement in a mock (or stub) matches the external library. How difficult this is depends on the API, which has to be specified (and implemented) well enough for us to be certain that the tests are meaningful.
To develop against an external API, we first TDD the code to define interfaces for the services that our application needs in terms of our domain. Then we write a thin layer using the library to implement these interfaces, with focussed integration tests to confirm our assumptions about how it works, as in Figure 19.1, “An adaption layer above a third-party API”. How to make integration tests work effectively will depend on the environment, but we will need to do it anyway so we might as well start early and get some benefit. There will be relatively few integration tests, compared to the number of unit tests, so they should not get in the way of the build even if they're not as fast as the in-memory tests.
There are some exceptions when mocking third-party libraries can be helpful. We might use mocks to simulate behaviour that is hard to trigger with the real library, such as throwing exceptions. Similarly, we might use mocks to test a sequence of calls, for example making sure that a transaction is rolled back if there's a failure. There should not be many of these in a test suite.
One approach to Interaction Testing is to mock concrete classes rather than interfaces. The technique is to inherit from the class you want to mock and override the methods that will be called within the test, either manually or with any of the mocking frameworks. We think it's a technique that should be used only when you really have no other options.
Here's an example of mocking by hand, the test verifies that the
music centre starts the CD player at the requested time. Assume that
setting the schedule on a CdPlayer object involves
triggering some behaviour we don't want in the test, so we override
scheduleToStartAt() and verify afterwards that we've
called it with the right argument.
public class MusicCentreTest {
@Test public void
startsCdPlayerAtTimeRequested() {
final MutableTime scheduledTime = new MutableTime();
CdPlayer player = new CdPlayer() {
@Override public void scheduleToStartAt(Time startTime) {
scheduledTime.set(startTime);
}
}
MusicCentre centre = new MusicCentre(player);
centre.startMediaAt(LATER);
assertEquals(LATER, scheduledTime.get());
}
}The problem with this approach is that it leaves the relationship
between the objects implicit. We hope we've made clear by now that the
intention of our approach to Test-Driven Development
is to use Mock Objects to bring out relationships between objects. If we
subclass, there's nothing in the domain code to make such a relationship
visible, just methods on an object. This makes it harder to see if the
service that supports this relationship might be relevant elsewhere and
we'll have to do the analysis again next time we work with the class. To
make the point, here's a possible implementation of
CdPlayer:
public class CdPlayer {
public void scheduleToStartAt(Time startTime) { …
public void stop() { …
public void gotoTrack(int trackNumber) { …
public void spinUpDisk() { …
public void eject() { …
}It turns out that our MusicCentre object
only uses the starting and stopping methods on the
CdPlayer, the rest are used by some other part of
the system. We would be over-specifying the
MusicCentre by requiring it to talk to a
CdPlayer, what it actually needs is a
ScheduledDevice. Robert Martin made the point (back
in 1996) in his Interface Segregation Principle
that "Clients should not be forced to depend upon interfaces that they do
not use", but that's exactly what we do when we mock a concrete
class.
There's a more subtle but powerful reason for not mocking
concrete classes. As part of the TDD with Mocks process, we have to think
up names for the relationships we discover—in this example the
ScheduledDevice. We find that this makes us think
harder about the domain and teases out concepts that we might otherwise
miss. Once something has a name, we can talk about it.
There are a few occasions when we have to put up with this smell. The least unacceptable situation is where we're working with legacy code that we control but can't change all at once. Alternatively, we might be working with third party code that we can't change at all. As we wrote above, it's usually better to write a veneer over an external library rather than mock it directly, but sometimes it's just too hard. Either way, these are unfortunate but necessary compromises that we would try to work our way out of as soon as possible. The longer we leave them in the code, the more likely it is that some brittleness in the design will cause us grief.
Above all, do not mock by overriding a class's internal features, which just locks down your test to the quirks of the current implementation. Override only visible methods. This rule also prohibits exposing internal methods just so we can override them in a test. If we can't get to the structure you need, then the tests are telling us that it's time to break up the class into smaller, composable features.
Sometimes, during the TDD process, we end up with a constructor that has a long, unwieldy list of arguments. Most likely, we got there by adding the object's peers (other objects it has to communicate with) one at a time, and it got out of hand. This is not dreadful, since the process helped us sort out the design of the object and its neighbours, but now it's time to clean up. The object must still be constructed in a valid state as required by the application. We will still need all the current constructor arguments, so we should see if there's any implicit structure there that we can tease out.
One possibility is that some of the arguments together define a concept that should be packaged up and replaced with a new object that represents it. Here's a small example.
public class MessageProcessor {
public MessageProcessor(MessageUnpacker unpacker,
AuditTrail auditer,
CounterPartyFinder counterpartyFinder,
LocationFinder locationFinder,
DomesticNotifier domesticNotifier,
ImportedNotifier importedNotifier)
{
// set the fields here
}
public void onMessage(Message rawMessage) {
UnpackedMessage unpacked =
unpacker.unpack(rawMessage, counterpartyFinder);
auditer.recordReceiptOf(unpacked);
// some other activity here
if (locationFinder.isDomestic(unpacked)) {
domesticNotifier.notify(unpacked.asDomesticMessage());
} else {
importedNotifier.notify(unpacked.asImportedMessage())
}
}
}Just the thought of writing expectations for all these objects
makes us wilt, which suggests that things are too complicated. A first
step is to notice that the unpacker and
counterpartyFinder are always used together, they're
fixed at construction and one calls the other. We can remove one argument
by pushing the counterpartyFinder into the
unpacker.
public class MessageProcessor {
public MessageProcessor(MessageUnpacker unpacker,
AuditTrail auditer,
LocationFinder locationFinder,
DomesticNotifier domesticNotifier,
ImportedNotifier importedNotifier) {
// etc.
public void onMessage(Message rawMessage) {
UnpackedMessage unpacked = unpacker.unpack(rawMessage);
// etc.Then there's the triple of locationFinder and
the two notifiers, which seem to go together. It might make sense to
package them together into a
MessageDispatcher.
public class MessageProcessor {
public MessageProcessor(MessageUnpacker unpacker,
AuditTrail auditer,
MessageDispatcher dispatcher) {
// etc.
public void onMessage(Message rawMessage) {
UnpackedMessage unpacked = unpacker.unpack(rawMessage);
auditer.recordReceiptOf(unpacked);
// some other activity here
dispatcher.dispatch(unpacked);
}
}Although we've forced this example to get it to fit within a section, it shows that being sensitive to complexity in the tests can help us clarify our designs. Now we have a message handling object that clearly performs the usual three stages: receive, process, and forward. We've pulled out the message routing code (the MessageDispatcher), so the MessageProcessor has fewer responsiblities and we know where to put routing decisions when things get more complicated. You might also notice that this code is easier to test.
When we're extracting implicit components, we look first for two conditions: arguments that are always used together in the class, and that have the same lifetime. That usually finds us the concept, then we have the harder task of finding a good name.
As an aside, one of the signs that the design is developing
nicely is that this kind of change is easy to integrate. All we have to do
is find where the MessageProcessor is created and
change this
messageProceesor =
new MessageProcessor(new XmlMessageUnpacker(),
auditer, counterpartyFinder,
locationFinder, domesticNotifier,
importedNotifier);to this
messageProceesor = new MessageProcessor(new XmlMessageUnpacker(counterpartyFinder), auditer, new MessageDispatcher( locationFinder, domesticNotifier, importedNotifier));
Later we can reduce the syntax noise by extracting out the
creation of the MessageDispatcher.
Another diagnosis for a Bloated Constructor might be that the object itself is too large because it has too many responsibilities. For example,
public class Handset {
public Handset(Network network, Camera camera, Display display,
DataNetwork dataNetwork, AddressBook addressBook,
Storage storage, Tuner tuner, …) {
// set the fields here
}
public void placeCallTo(DirectoryNumber number) {
network.openVoiceCallTo(number);
}
public void takePicture() {
Frame frame = storage.allocateNewFrame();
camera.takePictureInto(frame);
display.showPicture(frame);
}
public void showWebPage(URL url) {
display.renderHtml(dataNetwork.retrievePage(url));
}
public void showAddress(SearchTerm searchTerm) {
display.showAddress(addressBook.findAddress(searchTerm));
}
public void playRadio(Frequency frequency) {
tuner.tuneTo(frequency);
tuner.play();
}
// and so on
}Like our mobile phones, this class has several unrelated responsibilities which force it to pull in many dependencies. Like our phones, the class is confusing to use so we keep pressing the wrong button, and find that unrelated features interfere with each other. We're prepared to put up with these compromises in a handset because we don't have enough pockets for all the devices it implements, but that doesn't apply to code. This class should be broken up. The sort of fault lines we look for are fields that have no relationship with each other, particularly if the methods that reference them aren't used together either.
An associated smell for this kind of class is that its test suite will look confused too. The tests for its various features will have no relationship with each other, so we'll be able to make major changes in one area without touching any others. If we can break up the test class into slices that don't share anything, then the best thing might be to do just that and then slice up the object too.
Bring this up to date with only three stereotypes
A third diagnosis for a Bloated Constructor might be that not all of the arguments are dependencies, services that must be passed to the object before it can do its job. Other arguments might be more to do with structuring the implementation of the object or varying its behaviour to suit its neighbours. To help us think about this distinction, we've found it useful to categorise the peers of an object into stereotypes:
| Dependencies are services that the object needs from its environment so that it can fulfil its responsibilities. |
| Notifications are other parts of the system that need to know when the object changes state or performs an action. |
| Policies are objects that tweak or adapt the object's behaviour to the needs of the system. |
| Parts are components in the implementation that are not controlled from outside the object after being set. |
Dependencies must be passed in through the constructor. An object needs the system to provide implementations of its dependencies to function at all, so there's no point in creating an instance until they're available. Partially creating an object and then finishing it off by setting properties is risky and brittle because the programmer has to know whether all the dependencies have been set. As Yoda said, "New, or new not. There is no try."
Other types of peer can be passed to the constructor as a convenience but, if the list is becoming too long, they can also be initialised to safe defaults and overridden later (one way to distinguish the two is that there are no safe defaults for a Dependency). Policies and Parts can be initialised to commonly used cases, and collections of Parts can be initialised as empty. We can then add setters and add/remove methods to the class to allow the object's clients to configure it. Similarly, Notifications can be implemented as events. Unicast events can be initialised with a null implementation of the listener interface, and multicast events can be initialised with an empty collection of listeners. We always create a valid object, and we also have the hooks we need for testing.
Here's an example, it's probably not quite bad enough to need
fixing, but it'll do to make the point. The application is a Formula 1
racing game, players can try out different configurations of car and
driving style to see which one wins. A RacingCar
represents a competitor within a race.
public class RacingCar {
private final Track track;
private Tyres tyres;
private Suspension suspension;
private Wing frontWing;
private Wing backWing;
private double fuelLoad;
private CarListener listener;
private DrivingStrategy driver;
public RacingCar(Track track, DrivingStrategy driver,
Tyres tyres, Suspension suspension,
Wing frontWing, Wing backWing, double fuelLoad,
CarListener listener)
{
this.track = track;
this.driver = driver;
this.tyres = tyres;
this.suspension = suspension;
this.frontWing = frontWing;
this.backWing = backWing;
this.fuelLoad = fuelLoad;
this.listener = listener;
}
}It turns out that the track is the only
Dependency of a RacingCar, the hint is that it's
the only field that's final. The driver is a Policy,
the listener is a Notification, and the rest are
Parts; all of these can be modified by the user before or during the
race. Here's a reworked constructor:
public class RacingCar {
private final Track track;
private DrivingStrategy driver =
DriverTypes.borderlineAggressiveDriving();
private Tyres tyres = TyreTypes.mediumSlicks();
private Suspension suspension = SuspensionTypes.mediumStiffness();
private Wing frontWing = WingTypes.mediumDownforce();
private Wing backWing = WingTypes.mediumDownforce();
private double fuelLoad = 0.5;
private CarListener listener = CarListener.NONE;
public RacingCar(Track track) {
this.track = track;
}
public void setSuspension(Suspension suspension) { […]
public void setTyres(Tyres tyres) { […]
public void setEngine(Engine engine) { […]
public void setListener(CarListener listener) { […]
}Now we've initialised the peers to most common defaults, the
user can then configure them through the user interface. We've
initialised the listener to a null implementation,
again this can be changed later by the object's environment.
These stereotypes are only heuristics to help me think about
the design, not hard rules, so we don't get obsessed with finding just
the right classification of an object's peers. What matters most is the
context in which we're developing the object, our mental model of the
domain. For example, an auditing log might be a Dependency, since it's a
legal requirement for the business, or a Notification, since the object
will otherwise function without it. Policies belong to the environment
and can usually be constants, since they're often immutable and so don't
need to be instantiated more than once. Parts belong to the object and
usually cannot be stateless, so they need a new instance. For example,
in RacingCar, tyres might be a Policy if
TyreTypes.mediumSlicks() describes only how the
choice of tyre affects the car's handling, but a Part if it stores
mutable values such as air pressure and wear.
When a test has too many expectations it's hard to see what's important, what's really under test. For example, here's a test
@Test public void
decidesCasesWhenFirstPartyIsReady() {
context.checking(new Expectations(){{
one(firstPart).isReady(); will(returnValue(true));
one(organiser).getAdjudicator(); will(returnValue(adjudicator));
one(adjudicator).findCase(firstParty, issue); will(returnValue(case));
one(thirdParty).proceedWith(case);
}});
claimsProcessor.adjudicateIfReady(thirdParty, issue);
}that might be implemented like this
public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
if (firstParty.isReady()) {
Adjudicator adjudicator = organisation.getAdjudicator();
Case case = adjudicator.findCase(firstParty, issue);
thirdParty.proceedWith(case);
} else{
thirdParty.adjourn();
}
}What makes the test hard to read is that everything is an expectation, so everything seems to be equally important. We can't tell what's significant and what's just there to get through the test.
In fact, if we look at all the methods we call, there are only
two that have any side effects outside this class:
thirdParty.proceedWith() and
thirdParty.adjourn(). It would be an error to call
these more than once. All the other methods are queries, we can call
organisation.getAdjudicator() repeatedly without
breaking any behaviour. adjudicator.findCase() might
go either way, but it happens to be a lookup so it has no side
effects.
We can make our intentions clearer by distinguishing between Stubs, simulations of real behaviour that help me get our test to pass, and Expectations, assertions we want to make about how an object interacts with its neighbours. Reworking the test, we get
@Test public void
decidesCasesWhenFirstPartyIsReady() {
context.checking(new Expectations(){{
allowing(firstPart).isReady(); will(returnValue(true));
allowing(organiser).getAdjudicator(); will(returnValue(adjudicator));
allowing(adjudicator).findCase(firstParty, issue); will(returnValue(case));
one(thirdParty).proceedWith(case);
}});
claimsProcessor.adjudicateIfReady(thirdParty, issue);
}which is more explicit about how we expect the object to change the world around it.
For the more formally minded, you might want to think (very, very loosely) of Stubs as pre-conditions and Expectations as post-conditions. The test now says that given the state described in the Stubs, the events described by the Expectations should have happened. Once we've sorted out which calls are Stubs, we can usually move some of them into common setup code, making the tests clearer still
@Test public void
decidesCasesWhenFirstPartyIsReady() {
context.checking(new Expectations(){{
allowing(firstParty).isReady(); will(returnValue(true));
one(thirdParty).proceedWith(case);
}});
claimsProcessor.adjudicateIfReady(thirdParty, issue);
}It is, of course, possible that all the methods calls are actions. In which case, either we just have a complicated domain and there's not much we can do to simplify it or, more likely, there are some intermediate concepts in there that need to be extracted into objects.
Clearly, when we're thinking about the whole system, we can't ignore implementation issues such as the cost of database queries, but that's not relevant at this point. What matters now is that we have tests and code that clearly express our intentions.
We always have problems coming up with good examples. There's actually a better improvement to this code, which is to notice that that we've pulled out a chain of objects to get to the case object, exposing dependencies that aren't relevant here. What we should have done is told the nearest object to do the work for we, like this:
public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
if (firstParty.isReady()) {
organisation.adjudicateBetween(
firstParty, thirdParty, issue);
} else {
thirdParty.adjourn();
}
}or, possibly,
public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
if (firstParty.isReady()) {
thirdParty.startAdjudication(
organisation, firstParty, issue);
} else{
thirdParty.adjourn();
}
}which looks more balanced. If you spotted this then we award you a Moment of Smugness™ to be exercised at your convenience.
We had one posting to the jMock user list that included this comment
I was involved in a project recently where JMock was used quite heavily. Looking back, here's what I found:
The unit tests where at times unreadable (no idea what they were doing).
Some tests classes would reach 500 lines in addition to inheriting an abstract class which also would have up to 500 lines.
Refactoring would lead to massive changes in test code.
We've seen this failure mode on other projects we've been helping with, so we think there might be a common pattern.
Unit test code shouldn't get that large, except for unusual circumstances. Unit tests are supposed to focus on at most a few classes and shouldn't need a large amount of set up. What we've seen on projects was enormous amounts of preparation and positioning to get the objects into a state where the expected feature could be exercised. Of course it's hard to understand the point of a test when there's just so much code. If you see this pattern, then we suggest that the code (or at least the test code) needs breaking. Integration and acceptance tests that happen to be written using a unit test framework, on the other hand, might have to be large.
It's not yet clear whether the naive use of interaction-testing is particularly susceptible to this failing, or whether it happens all the time and we're the only ones who get complained to. We are, however, convinced that an emphasis on using Mock Objects mainly to substitute external systems (some of which we perpetrated in the early days) is a deeply bad idea which leads teams towards this sort of problem.
Link smells to possible solutions: Test Smell -> Code Smell -> Refactoring. Maybe as an appendix?
We've found these benefits from learning to Listen to Test Smells
This is one reason why we don't like mocking concrete classes, we like to have names for the relationships between objects as well the objects themselves. As the legends say, if we have something's true name then we can control it. If we can see it, we have more chance of finding other uses and so reducing duplication.
Keeping the code clean as we go along minimizes the risk of nasty surprises later when a new requirement triggers significant changes to the design and the developers end up being pressured into botching the code to get the job done. Just a few of these episodes can make further changes increasingly difficult and dissipate the good intentions of the team.
Copyright © 2008 Steve Freeman and Nat Pryce