Chapter 19. Listening to the Tests

You can see a lot just by observing.

--Yogi Berra

Introduction

Sometimes we find it difficult to write a test for some functionality we want to add to our code. In our experience, this is usually because our design can be improved, perhaps because the class is too tightly coupled to its environment or does not implement a clear abstraction. We try to use this as a opportunity to improve our code, rather than working around the design by introducing ever more sophisticated tools. We find that the qualities that make an object easy to test are those that make our code flexible and able to accommodate change.

The trick is to let our tests drive our design (that's a hint as to why it's called Test-Driven Development). TDD is about testing code, that is about verifying its externally visible qualities such as reliability, performance, and accuracy. TDD is also about feedback on the code's internal qualities such as the coupling and cohesion of its classes, dependencies that are explicit or hidden, and effective information hiding; these are the qualities that will keep the code maintainable. With practice, we’ve become more sensitive to the rough edges in our tests, so we can use them for rapid feedback about the design. Now when we find a feature that’s difficult to test we don’t just ask ourselves how to test it, but also why is it difficult to test.

In this chapter, we look at some common “Test Smells” that we’ve encountered and discuss what they might imply about design.

Test Smell: I need to mock an object I can't replace (without magic)

One approach to reducing complexity in code is to make commonly useful objects accessible through a global name, usually implemented with a Singleton to fool oneself into thinking that its not really a global variable. The idea is that any code that needs access to a feature can just refer to it by its global name instead of receiving it as a parameter. Here's a common example:

Date now = new Date();

Under the covers, the constructor calls the singleton System and sets the new instance to the current time using System.currentTimeMillis(). This is a convenient technique, but it comes at a cost. Let's say we want to write a test like this:

@Test public void 
rejectsRequestsNotWithinTheSameDay() {
  receiver.acceptRequest(FIRST_REQUEST);
  // the next day
  assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}

The implementation looks like this:

public boolean acceptRequest(Request request) {
  final Date now = new Date();
  if (dateOfFirstRequest == null) {
    dateOfFirstRequest = now;
   } else if (firstDateIsDifferentFrom(now)) {
    return false;
  }
  // process the request
  return true;
}

where dateOfFirstRequest is a field and firstDateIsDifferentFrom is a helper method that hides the unpleasantness of working with the Java date library.

To test this timeout we must either make the test wait overnight or do something clever, perhaps with Aspects or byte-code manipulation, to intercept the constructor and return suitable Date values for the test. This difficulty in testing is a hint that we should change the code. To make the test easier, we need to control how Date objects are created, so we introduce a Clock and pass it into the Receiver. If we stub the Clock, the test might look like this:

@Test public void 
rejectsRequestsNotWithinTheSameDay() {
  Receiver receiver = new Receiver(stubClock);
  stubClock.setNextDate(TODAY);
  receiver.acceptRequest(FIRST_REQUEST);

  stubClock.setNextDate(TOMORROW);
  assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}

and the implementation like this:

public boolean acceptRequest(Request request) {
  final Date now = clock.now();
  if (dateOfFirstRequest == null) {
   dateOfFirstRequest = now;
  } else if (firstDateIsDifferentFrom(now)) {
   return false;
  }
  // process the request
  return true;
}

Now we can test the Receiver without any special tricks. More importantly, however, we've made it obvious that a Receiver is dependent on time, we can't even create one without a Clock. One could argue that this is breaking encapsulation by exposing the internals of a Receiver, we should be able to just create an instance and not worry, but we want to know about this dependency—especially when the service is rolled out across timezones, and New York and London start complaining about different results.

From procedural code to objects

The introduction of a Clock suggests to me that we've been missing a concept in our code: date checking in terms of our domain. A Receiver doesn't need to know all the details of a Calendar system, such as time zones and locales, it just need to know whether, for this application, the date has changed. There's a clue in the fragment:

firstDateIsDifferentFrom(now)

which means that we've had to wrap up some date manipulation code in the Receiver. It's the wrong object, that kind of work should be done by the Clock. We'll write the test again:

@Test public void 
rejectsRequestsNotWithinTheSameDay() {
  Receiver receiver = new Receiver(clock);
  context.checking(new Expectations() {{
   allowing(clock).now(); will(returnValue(NOW));
   one(clock).dayHasChangedFrom(NOW); will(returnValue(false));
  }});

  receiver.acceptRequest(FIRST_REQUEST);
  assertFalse("too late now", receiver.acceptRequest(SECOND_REQUEST));
}

The implementation looks like this

public boolean acceptRequest(Request request) {
  if (dateOfFirstRequest == null) {
   dateOfFirstRequest = clock.now();
  } else if (clock.dayHasChangedFrom(dateOfFirstRequest)) {
   return false;
  }
  // process the request
  return true;
}

This version of Receiver is more focussed, it doesn't need to know how to distinguish one date from another and it only needs to get a date to set the first value. The Clock interface defines exactly those date services a Receiver needs from its environment.

But we think we can push this further. The Receiver only stores a date so it can detect a change of day, perhaps I should delegate all the date functionality to another object which, for want of a better name, we'll call a SameDayChecker.

@Test public void 
rsejectsRequestsOutsideAllowedPeriod() {
  Receiver receiver = new Receiver(sameDayChecker);
  context.checking(new Expectations() {{
    allowing(sameDayChecker).hasExpired(); will(returnValue(false));
  }});

  assertFalse("too late now", receiver.acceptRequest(REQUEST));
}

With an implementation like this

public boolean acceptRequest(Request request) {
  if (sameDayChecker.hasExpired()) {
   return false;
  }
  // process the request
  return true;
}

All the logic about dates has been separated out from the Receiver, which can concentrate on processing the request. With two objects we can make sure that each behaviour (date checking and request processing) is unit tested clearly.

Implicit dependencies are still dependencies

We can hide a dependency from the caller of a component by using a global to bypass encapsulation, but that doesn't make the dependency go away, it just makes it inaccessible. For example, Steve once had to work with a .Net library that could not be loaded without installing Microsoft’s ActiveDirectory—which I couldn't do on my machine and wasn't actually required for the features that I wanted to use. The author was trying to be helpful and make the library "just work", but the result was that I couldn't even try it out.

The point about Objects, as a technique for structuring code, is that the boundaries of an object should be clearly visible. An object should only deal with values and instances that are either local, created and managed within its scope, or passed in explicitly.

In this example, the act of making date checking testable forced us to make the Receiver's requirements more explicit and then to think more clearly about its structure.

Break dependencies in unit tests only with techniques that you would use in production code.

There are several frameworks available that use techniques such as manipulating class loaders or byte codes to allow unit tests to break dependencies without changing the target code. As a rule, these are advanced techniques that most developers would not use when writing production code. Sometimes these are really necessary, but they come with a substantial, and hidden, cost. Unit-testing tools that let the programmer side-step poor dependency management in the design also waste a valuable source of feedback. When the developers eventually do need to address these design weaknesses, to add some urgent feature, they can find it even harder. The poor structure will have influenced other parts of the system that rely on it, and any understanding of the original intent will have evaporated. Like dirty dishes, it's easier to get the grease off before it's been baked in.

Logging is also a feature

We have a more contentious example of working with objects that are hard to replace: logging. Take a look at these two lines of code:

log.error("Lost touch with Reality after " + timeout + "seconds);
log.trace("Distance travelled in the wilderness: " + distance);

These are two separate features that happen to share an implementation. Let us explain.

  • Support logging (errors and info) is part of the user interface of the application. These messages are intended to be tracked by support staff, perhaps system administrators and operators, to diagnose a failure or monitor the progress of the running system

  • Diagnostic logging (debug and trace), is infrastructure for programmers. These messages should not be turned on in production because they're intended to help the programmers understand what's going on whilst they're developing the system.

Given this distinction, we should consider using different techniques to develop these two type of logging. Support logging should be Test-Driven from someone's requirements, such as auditing or responsiveness to failure. The tests will make sure that we've thought about what each message is for and that it works. The tests will also protect us from breaking any tools and scripts that other people write to analyse these log messages. Diagnostic logging, on the other hand, is driven by the programmers' need for fine-grained tracking of what's happening in the system. It's scaffolding so it probably doesn't need to be Test-Driven and the messages might not need to be as consistent as those for support logs. These messages are not to be used in production, right?

Support, not logging

To get back to the point of the chapter, writing unit tests against static global objects, including loggers, is noisy. We have to either read from the file system or manage an extra testing Appender, and we have to remember to clean up afterwards so that tests don't interfere with each other, and set the right level on the right Logger. The noise in the test reminds me that our code is working at two levels: our domain and the logging infrastructure. Here's a common example of code with logging:

Location location = tracker.getCurrentLocation();
for (Filter filter : filters) {
  filter.selectFor(location);
  if (logger.isInfoEnabled()) {
    logger.info("Filter " + filter.getName() + ", " + filter.getDate()
                 + " selected for " + location.getName() 
                 + ", is current: " + tracker.isCurrent(location));
  }
}

Notice the shift in vocabulary and style between the functional part of the loop and the (emphasised) logging part? At a micro level, the code is doing two things at once, something to do with locations and rendering support information, which breaks the Single Responsibility Principle. Maybe we could do this instead

Location location = tracker.getCurrentLocation();
for (Filter filter : filters) {
  filter.selectFor(location);
  support.notifyFiltering(tracker, location, filter);}

where the support object might be implemented by a logger, a message bus, pop-up windows, or whatever's appropriate; the detail is not relevant to the code at this level. This code is also easier to test. We, rather than the logging framework, own the support object so we can pass in a mock implementation at our convenience, perhaps in the constructor, and keep it local to the test case. The other simplification is that now we're testing for objects rather than the formatted contents of a string. Of course, we will still need to write an implementation of support and some focussed integration tests to go with it.

But that's crazy talk…

The idea of encapsulating support reporting sounds like over-design, but it's worth thinking about for a moment. It means we're writing code in terms of our intent (helping the support people) rather than the implementation (logging), so it's more expressive. All the support reporting is handled in a few known places, so it's easier to be consistent about how things are reported and to encourage reuse. It can also help us structure and control our reporting in terms of the application domain, rather than in terms of Java packages. Finally, the act of writing a test for each report helps us avoid the "I don't know what to do with this exception, so I'll log it and carry on" syndrome, which leads to log-bloat and production failures because we haven't handled obscure error conditions.

One objection we've heard is "I can't pass in a logger for testing because I've got logging all over my domain objects. I'd have to pass one around everywhere". We think this is a test smell that is telling us that we haven't clarified our design enough. Perhaps some of our support logging should really be diagnostic logging, or we're logging more than we need to because we left some in that we wrote when we hadn't yet understood the behaviour. Most likely, there's still too much duplication in our domain code and we haven't yet found the "choke points" where most of the production logging should go.

So what about diagnostic logging? Is it disposable scaffolding that should be taken down once the job is done, or essential infrastructure that should be tested and maintained? That depends on the system, but once we've made the distinction we have more freedom to think about using different techniques for support and diagnostic logging. We might even decide that in-line code is the wrong technique to implement diagnostic logging because it interferes with the readability of the production code that matters. Perhaps we could weave in some Aspects instead (since that's the canonical example of their use). Perhaps not, but at least we've now clarified the the choice.

One final data point. One of us once worked on a system where so much pointless content was written to the logs that they had to rolled off the disks after a week. This made maintenance very difficult as the relevant logs had usually gone by the time a bug was assigned to be fixed. If they'd not logged at all they could have made the system go faster with no loss of useful information.

Test Smell: Everything is mocked

This Test Smell is more usually about weaknesses in the design of the tests, rather than in the code to be tested. It may sound obvious, but not every class in a system has to be mocked.

Don't mock value objects

There isn't much point in writing mocks for simple value objects (which should be immutable anyway), just create an instance and use it. For example, in this test

@Test public void 
sumsTotalRunningTime() {
  Show show = new Show();
  Video video1 = context.mock(Video.class);
  Video video2 = context.mock(Video.class);
  
  context.checking(new Expectations(){{
    one(video1).getTime(); will(returnValue(40));
    one(video2).getTime(); will(returnValue(23));
  }});
  
  show.add(video1);
  show.add(video2);
  assertEqual(63, show.getRunningTime())
}

where Video holds details of a part of a show. It's not worth creating an interface / implementation pair to control which time values are returned, just create instances with the appropriate values and use them. There are a couple of heuristics for when a class is not worth mocking. First, it has only accessors or simple methods that act on values it holds, it doesn't have any interesting behaviour. Second, we can't think of a meaningful name for the class other than VideoImpl or some such vague term.

Don't mock third-party libraries

There are two problems when mocking a third-party library. First, we can't use the tests to drive the design, the API belongs to someone else. Mock-based tests for external libraries often end up contorted and messy to get through to the functionality that we want to exercise. These tests are giving off smells that highlight designs that are inappropriate for our system but, instead of fixing the problem and simplifying the code, we end up carrying the weight of the test complexity. The second issue is that we have to be sure that the behaviour we implement in a mock (or stub) matches the external library. How difficult this is depends on the API, which has to be specified (and implemented) well enough for us to be certain that the tests are meaningful.

To develop against an external API, we first TDD the code to define interfaces for the services that our application needs in terms of our domain. Then we write a thin layer using the library to implement these interfaces, with focussed integration tests to confirm our assumptions about how it works, as in Figure 19.1, “An adaption layer above a third-party API”. How to make integration tests work effectively will depend on the environment, but we will need to do it anyway so we might as well start early and get some benefit. There will be relatively few integration tests, compared to the number of unit tests, so they should not get in the way of the build even if they're not as fast as the in-memory tests.

An adaption layer above a third-party API
mockable-adaptor.svg

Figure 19.1. An adaption layer above a third-party API


There are some exceptions when mocking third-party libraries can be helpful. We might use mocks to simulate behaviour that is hard to trigger with the real library, such as throwing exceptions. Similarly, we might use mocks to test a sequence of calls, for example making sure that a transaction is rolled back if there's a failure. There should not be many of these in a test suite.

Test Smell: Mocking concrete classes

One approach to Interaction Testing is to mock concrete classes rather than interfaces. The technique is to inherit from the class you want to mock and override the methods that will be called within the test, either manually or with any of the mocking frameworks. We think it's a technique that should be used only when you really have no other options.

Here's an example of mocking by hand, the test verifies that the music centre starts the CD player at the requested time. Assume that setting the schedule on a CdPlayer object involves triggering some behaviour we don't want in the test, so we override scheduleToStartAt() and verify afterwards that we've called it with the right argument.

public class MusicCentreTest {
  @Test public void 
  startsCdPlayerAtTimeRequested() {
    final MutableTime scheduledTime = new MutableTime();
    CdPlayer player = new CdPlayer() {
      @Override public void scheduleToStartAt(Time startTime) {
        scheduledTime.set(startTime);
      }
    }

    MusicCentre centre = new MusicCentre(player);
    centre.startMediaAt(LATER);

    assertEquals(LATER, scheduledTime.get());
  }
}

The problem with this approach is that it leaves the relationship between the objects implicit. We hope we've made clear by now that the intention of our approach to Test-Driven Development is to use Mock Objects to bring out relationships between objects. If we subclass, there's nothing in the domain code to make such a relationship visible, just methods on an object. This makes it harder to see if the service that supports this relationship might be relevant elsewhere and we'll have to do the analysis again next time we work with the class. To make the point, here's a possible implementation of CdPlayer:

public class CdPlayer {
  public void scheduleToStartAt(Time startTime) { …
  public void stop() { …
  public void gotoTrack(int trackNumber) { …
  public void spinUpDisk() { …
  public void eject() { …
}

It turns out that our MusicCentre object only uses the starting and stopping methods on the CdPlayer, the rest are used by some other part of the system. We would be over-specifying the MusicCentre by requiring it to talk to a CdPlayer, what it actually needs is a ScheduledDevice. Robert Martin made the point (back in 1996) in his Interface Segregation Principle that "Clients should not be forced to depend upon interfaces that they do not use", but that's exactly what we do when we mock a concrete class.

There's a more subtle but powerful reason for not mocking concrete classes. As part of the TDD with Mocks process, we have to think up names for the relationships we discover—in this example the ScheduledDevice. We find that this makes us think harder about the domain and teases out concepts that we might otherwise miss. Once something has a name, we can talk about it.

Break glass in case of emergency

There are a few occasions when we have to put up with this smell. The least unacceptable situation is where we're working with legacy code that we control but can't change all at once. Alternatively, we might be working with third party code that we can't change at all. As we wrote above, it's usually better to write a veneer over an external library rather than mock it directly, but sometimes it's just too hard. Either way, these are unfortunate but necessary compromises that we would try to work our way out of as soon as possible. The longer we leave them in the code, the more likely it is that some brittleness in the design will cause us grief.

Above all, do not mock by overriding a class's internal features, which just locks down your test to the quirks of the current implementation. Override only visible methods. This rule also prohibits exposing internal methods just so we can override them in a test. If we can't get to the structure you need, then the tests are telling us that it's time to break up the class into smaller, composable features.

Test Smell: Bloated Constructor

Sometimes, during the TDD process, we end up with a constructor that has a long, unwieldy list of arguments. Most likely, we got there by adding the object's peers (other objects it has to communicate with) one at a time, and it got out of hand. This is not dreadful, since the process helped us sort out the design of the object and its neighbours, but now it's time to clean up. The object must still be constructed in a valid state as required by the application. We will still need all the current constructor arguments, so we should see if there's any implicit structure there that we can tease out.

One possibility is that some of the arguments together define a concept that should be packaged up and replaced with a new object that represents it. Here's a small example.

public class MessageProcessor {
  public MessageProcessor(MessageUnpacker unpacker, 
                          AuditTrail auditer, 
                          CounterPartyFinder counterpartyFinder,
                          LocationFinder locationFinder,
                          DomesticNotifier domesticNotifier,
                          ImportedNotifier importedNotifier) 
  {
    // set the fields here
  }
  public void onMessage(Message rawMessage) {
    UnpackedMessage unpacked = 
       unpacker.unpack(rawMessage, counterpartyFinder);
    auditer.recordReceiptOf(unpacked);
    // some other activity here
    if (locationFinder.isDomestic(unpacked)) {
      domesticNotifier.notify(unpacked.asDomesticMessage());
    } else {
      importedNotifier.notify(unpacked.asImportedMessage())
    }
  }
}

Just the thought of writing expectations for all these objects makes us wilt, which suggests that things are too complicated. A first step is to notice that the unpacker and counterpartyFinder are always used together, they're fixed at construction and one calls the other. We can remove one argument by pushing the counterpartyFinder into the unpacker.

public class MessageProcessor {
  public MessageProcessor(MessageUnpacker unpacker, 
                          AuditTrail auditer, 
                          LocationFinder locationFinder,
                          DomesticNotifier domesticNotifier,
                          ImportedNotifier importedNotifier) {
  // etc.

  public void onMessage(Message rawMessage) {
    UnpackedMessage unpacked = unpacker.unpack(rawMessage);
    // etc.

Then there's the triple of locationFinder and the two notifiers, which seem to go together. It might make sense to package them together into a MessageDispatcher.

public class MessageProcessor {
  public MessageProcessor(MessageUnpacker unpacker, 
                          AuditTrail auditer, 
                          MessageDispatcher dispatcher) {
  // etc.

  public void onMessage(Message rawMessage) {
    UnpackedMessage unpacked = unpacker.unpack(rawMessage);
    auditer.recordReceiptOf(unpacked);
    // some other activity here
    dispatcher.dispatch(unpacked);
  }
}

Although we've forced this example to get it to fit within a section, it shows that being sensitive to complexity in the tests can help us clarify our designs. Now we have a message handling object that clearly performs the usual three stages: receive, process, and forward. We've pulled out the message routing code (the MessageDispatcher), so the MessageProcessor has fewer responsiblities and we know where to put routing decisions when things get more complicated. You might also notice that this code is easier to test.

When we're extracting implicit components, we look first for two conditions: arguments that are always used together in the class, and that have the same lifetime. That usually finds us the concept, then we have the harder task of finding a good name.

As an aside, one of the signs that the design is developing nicely is that this kind of change is easy to integrate. All we have to do is find where the MessageProcessor is created and change this

messageProceesor = 
  new MessageProcessor(new XmlMessageUnpacker(), 
                       auditer, counterpartyFinder, 
                       locationFinder, domesticNotifier,
                       importedNotifier);

to this

messageProceesor = 
  new MessageProcessor(new XmlMessageUnpacker(counterpartyFinder),
                       auditer,
                       new MessageDispatcher(
                         locationFinder, 
                         domesticNotifier, importedNotifier));

Later we can reduce the syntax noise by extracting out the creation of the MessageDispatcher.

Test Smell: Confused object

Another diagnosis for a Bloated Constructor might be that the object itself is too large because it has too many responsibilities. For example,

public class Handset {
  public Handset(Network network, Camera camera, Display display, 
                DataNetwork dataNetwork, AddressBook addressBook,
                Storage storage, Tuner tuner, …) {
    // set the fields here
  }
  public void placeCallTo(DirectoryNumber number) { 
    network.openVoiceCallTo(number);
  }
  public void takePicture() { 
    Frame frame = storage.allocateNewFrame();
    camera.takePictureInto(frame);
    display.showPicture(frame);
  }
  public void showWebPage(URL url) {
    display.renderHtml(dataNetwork.retrievePage(url));
  }
  public void showAddress(SearchTerm searchTerm) {
    display.showAddress(addressBook.findAddress(searchTerm));
  } 
  public void playRadio(Frequency frequency) {
    tuner.tuneTo(frequency);
    tuner.play();
  }
  // and so on
}

Like our mobile phones, this class has several unrelated responsibilities which force it to pull in many dependencies. Like our phones, the class is confusing to use so we keep pressing the wrong button, and find that unrelated features interfere with each other. We're prepared to put up with these compromises in a handset because we don't have enough pockets for all the devices it implements, but that doesn't apply to code. This class should be broken up. The sort of fault lines we look for are fields that have no relationship with each other, particularly if the methods that reference them aren't used together either.

An associated smell for this kind of class is that its test suite will look confused too. The tests for its various features will have no relationship with each other, so we'll be able to make major changes in one area without touching any others. If we can break up the test class into slices that don't share anything, then the best thing might be to do just that and then slice up the object too.

Test Smell: Too many dependencies

Bring this up to date with only three stereotypes

A third diagnosis for a Bloated Constructor might be that not all of the arguments are dependencies, services that must be passed to the object before it can do its job. Other arguments might be more to do with structuring the implementation of the object or varying its behaviour to suit its neighbours. To help us think about this distinction, we've found it useful to categorise the peers of an object into stereotypes:

Dependencies are services that the object needs from its environment so that it can fulfil its responsibilities.
Notifications are other parts of the system that need to know when the object changes state or performs an action.
Policies are objects that tweak or adapt the object's behaviour to the needs of the system.
Parts are components in the implementation that are not controlled from outside the object after being set.

Dependencies must be passed in through the constructor. An object needs the system to provide implementations of its dependencies to function at all, so there's no point in creating an instance until they're available. Partially creating an object and then finishing it off by setting properties is risky and brittle because the programmer has to know whether all the dependencies have been set. As Yoda said, "New, or new not. There is no try."

Other types of peer can be passed to the constructor as a convenience but, if the list is becoming too long, they can also be initialised to safe defaults and overridden later (one way to distinguish the two is that there are no safe defaults for a Dependency). Policies and Parts can be initialised to commonly used cases, and collections of Parts can be initialised as empty. We can then add setters and add/remove methods to the class to allow the object's clients to configure it. Similarly, Notifications can be implemented as events. Unicast events can be initialised with a null implementation of the listener interface, and multicast events can be initialised with an empty collection of listeners. We always create a valid object, and we also have the hooks we need for testing.

A high speed example

Here's an example, it's probably not quite bad enough to need fixing, but it'll do to make the point. The application is a Formula 1 racing game, players can try out different configurations of car and driving style to see which one wins. A RacingCar represents a competitor within a race.

public class RacingCar {
  private final Track track;
  private Tyres tyres;
  private Suspension suspension;
  private Wing frontWing;
  private Wing backWing;
  private double fuelLoad;
  private CarListener listener;
  private DrivingStrategy driver;

  public RacingCar(Track track, DrivingStrategy driver, 
                  Tyres tyres, Suspension suspension, 
                  Wing frontWing, Wing backWing, double fuelLoad,
                  CarListener listener)
  {
    this.track = track;
    this.driver = driver;
    this.tyres = tyres;
    this.suspension = suspension;
    this.frontWing = frontWing;
    this.backWing = backWing;
    this.fuelLoad = fuelLoad;
    this.listener = listener;
  }
}

It turns out that the track is the only Dependency of a RacingCar, the hint is that it's the only field that's final. The driver is a Policy, the listener is a Notification, and the rest are Parts; all of these can be modified by the user before or during the race. Here's a reworked constructor:

public class RacingCar {
  private final Track track;

  private DrivingStrategy driver = 
                           DriverTypes.borderlineAggressiveDriving();
  private Tyres tyres = TyreTypes.mediumSlicks();
  private Suspension suspension = SuspensionTypes.mediumStiffness();
  private Wing frontWing = WingTypes.mediumDownforce();
  private Wing backWing = WingTypes.mediumDownforce();
  private double fuelLoad = 0.5;

  private CarListener listener = CarListener.NONE;

  public RacingCar(Track track) {
    this.track = track;
  }
    
  public void setSuspension(Suspension suspension) { […]
  public void setTyres(Tyres tyres) { […]
  public void setEngine(Engine engine) { […]
   
  public void setListener(CarListener listener) { […]
}

Now we've initialised the peers to most common defaults, the user can then configure them through the user interface. We've initialised the listener to a null implementation, again this can be changed later by the object's environment.

A matter of taste

These stereotypes are only heuristics to help me think about the design, not hard rules, so we don't get obsessed with finding just the right classification of an object's peers. What matters most is the context in which we're developing the object, our mental model of the domain. For example, an auditing log might be a Dependency, since it's a legal requirement for the business, or a Notification, since the object will otherwise function without it. Policies belong to the environment and can usually be constants, since they're often immutable and so don't need to be instantiated more than once. Parts belong to the object and usually cannot be stateless, so they need a new instance. For example, in RacingCar, tyres might be a Policy if TyreTypes.mediumSlicks() describes only how the choice of tyre affects the car's handling, but a Part if it stores mutable values such as air pressure and wear.

Test Smell: Too many expectations

When a test has too many expectations it's hard to see what's important, what's really under test. For example, here's a test

@Test public void 
decidesCasesWhenFirstPartyIsReady() {
  context.checking(new Expectations(){{
    one(firstPart).isReady(); will(returnValue(true));
    one(organiser).getAdjudicator(); will(returnValue(adjudicator));
    one(adjudicator).findCase(firstParty, issue); will(returnValue(case));
    one(thirdParty).proceedWith(case);
  }});
  
  claimsProcessor.adjudicateIfReady(thirdParty, issue);
}

that might be implemented like this

public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
  if (firstParty.isReady()) {
    Adjudicator adjudicator = organisation.getAdjudicator();
    Case case = adjudicator.findCase(firstParty, issue);
    thirdParty.proceedWith(case);
  } else{
    thirdParty.adjourn();
  }
}

What makes the test hard to read is that everything is an expectation, so everything seems to be equally important. We can't tell what's significant and what's just there to get through the test.

In fact, if we look at all the methods we call, there are only two that have any side effects outside this class: thirdParty.proceedWith() and thirdParty.adjourn(). It would be an error to call these more than once. All the other methods are queries, we can call organisation.getAdjudicator() repeatedly without breaking any behaviour. adjudicator.findCase() might go either way, but it happens to be a lookup so it has no side effects.

We can make our intentions clearer by distinguishing between Stubs, simulations of real behaviour that help me get our test to pass, and Expectations, assertions we want to make about how an object interacts with its neighbours. Reworking the test, we get

@Test public void 
decidesCasesWhenFirstPartyIsReady() {
  context.checking(new Expectations(){{
    allowing(firstPart).isReady(); will(returnValue(true));
    allowing(organiser).getAdjudicator(); will(returnValue(adjudicator));
    allowing(adjudicator).findCase(firstParty, issue); will(returnValue(case));
    
    one(thirdParty).proceedWith(case);
  }});
  
  claimsProcessor.adjudicateIfReady(thirdParty, issue);
}

which is more explicit about how we expect the object to change the world around it.

Stub queries, mock actions

For the more formally minded, you might want to think (very, very loosely) of Stubs as pre-conditions and Expectations as post-conditions. The test now says that given the state described in the Stubs, the events described by the Expectations should have happened. Once we've sorted out which calls are Stubs, we can usually move some of them into common setup code, making the tests clearer still

@Test public void 
decidesCasesWhenFirstPartyIsReady() {
  context.checking(new Expectations(){{
    allowing(firstParty).isReady(); will(returnValue(true));
    
    one(thirdParty).proceedWith(case);
  }});
  
  claimsProcessor.adjudicateIfReady(thirdParty, issue);
}

It is, of course, possible that all the methods calls are actions. In which case, either we just have a complicated domain and there's not much we can do to simplify it or, more likely, there are some intermediate concepts in there that need to be extracted into objects.

Clearly, when we're thinking about the whole system, we can't ignore implementation issues such as the cost of database queries, but that's not relevant at this point. What matters now is that we have tests and code that clearly express our intentions.

Special bonus prize

We always have problems coming up with good examples. There's actually a better improvement to this code, which is to notice that that we've pulled out a chain of objects to get to the case object, exposing dependencies that aren't relevant here. What we should have done is told the nearest object to do the work for we, like this:

public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
  if (firstParty.isReady()) {
    organisation.adjudicateBetween(
        firstParty, thirdParty, issue);
  } else {
    thirdParty.adjourn();
  }
}

or, possibly,

public void adjudicateIfReady(ThirdParty thirdParty, Issue issue) {
  if (firstParty.isReady()) {
    thirdParty.startAdjudication(
        organisation, firstParty, issue);
  } else{
    thirdParty.adjourn();
  }
}

which looks more balanced. If you spotted this then we award you a Moment of Smugness™ to be exercised at your convenience.

Test Smell: Mega unit tests

We had one posting to the jMock user list that included this comment

I was involved in a project recently where JMock was used quite heavily. Looking back, here's what I found:

  1. The unit tests where at times unreadable (no idea what they were doing).

  2. Some tests classes would reach 500 lines in addition to inheriting an abstract class which also would have up to 500 lines.

  3. Refactoring would lead to massive changes in test code.

We've seen this failure mode on other projects we've been helping with, so we think there might be a common pattern.

Unit test code shouldn't get that large, except for unusual circumstances. Unit tests are supposed to focus on at most a few classes and shouldn't need a large amount of set up. What we've seen on projects was enormous amounts of preparation and positioning to get the objects into a state where the expected feature could be exercised. Of course it's hard to understand the point of a test when there's just so much code. If you see this pattern, then we suggest that the code (or at least the test code) needs breaking. Integration and acceptance tests that happen to be written using a unit test framework, on the other hand, might have to be large.

It's not yet clear whether the naive use of interaction-testing is particularly susceptible to this failing, or whether it happens all the time and we're the only ones who get complained to. We are, however, convinced that an emphasis on using Mock Objects mainly to substitute external systems (some of which we perpetrated in the early days) is a deeply bad idea which leads teams towards this sort of problem.

Cross-referencing Code Smells

Link smells to possible solutions: Test Smell -> Code Smell -> Refactoring. Maybe as an appendix?

What the tests will tell you (if you're listening)

We've found these benefits from learning to Listen to Test Smells

Keeping knowledge local
Some of the test smells we've identified, such as needing magic to create mocks, are to do with knowledge leaking between components. If we can keep knowledge local to an object, either internal or passed in, then its implementation is independent of its context. We can safely move it wherever we like. Do this consistently and we can build an application as a web of communicating objects.
If it's explicit we can name it

This is one reason why we don't like mocking concrete classes, we like to have names for the relationships between objects as well the objects themselves. As the legends say, if we have something's true name then we can control it. If we can see it, we have more chance of finding other uses and so reducing duplication.

More names mean more domain information
We find that when we emphasise how objects communicate, rather than what they are, we end up with types and roles defined more in terms of the domain than of the implementation. This might be because we have more, smaller abstractions which gets us a further away from the underlying language. Somehow we seem to get more domain vocabulary into the code.
Pass behaviour rather than data
We find that TDD with Mock Objects encourages us to write objects that tell each other what to do, rather then requesting and manipulating values. Applied consistently, we end up with a coding style where we pass behaviour (in listeners and callbacks) from the higher levels of the code to the data, rather than pulling data up through the stack. It's an unusual style in the Java/C# world, but we find that we get better encapsulation and clearer abstractions because we have to clarify what the pieces will do, not just what values they hold.

Keeping the code clean as we go along minimizes the risk of nasty surprises later when a new requirement triggers significant changes to the design and the developers end up being pressured into botching the code to get the job done. Just a few of these episodes can make further changes increasingly difficult and dissipate the good intentions of the team.