Every day you may make progress. Every step may be fruitful. Yet there will stretch out before you an ever-lengthening, ever-ascending, ever-improving path. You know you will never get to the end of the journey. But this, so far from discouraging, only adds to the joy and glory of the climb.
Once you have kick-started the TDD process, you need to keep it running smoothly. In this chapter we'll introduce how a TDD process runs once started. The rest of the book explores in some detail how we ensure it runs smoothly: how we write tests as we build the system, how we use tests to get early feedback on internal and external quality issues, and how we ensure that the tests continue to support change and do not become an obstacle to further development.
As we described in Chapter 1, What's the point of Test Driven Development?, we start work on a new feature by writing failing acceptance tests that demonstrate that the system does not yet have the feature we're about to write and track our progress towards completion of the feature.
We write the acceptance test using only language from the application's domain, not the underlying technologies. This helps us understand what the system should do, without tying us to any initial assumptions we make about the implementation or complicating the test with technological details. This also shields our acceptance test suite from changes to the system's technical infrastructure. For example, if a third-party organisation changes the protocol used by their services (say, from FTP and binary files to web services and XML), our tests of the system's application logic should not be affected.
We find that writing this kind of test before coding makes us clarify what we need to achieve. The precision of expressing requirements in a form that can be automatically checked helps to uncover implicit assumptions. The failing tests keep us focused on just what we need now, improving our chances of delivering it. More subtly, by taking the client's point of view, they help us concentrate on features that are actually needed, rather than on features that we assume we ought to provide but that the client does not need.
Unit tests, on the other hand, exercise objects, or small clusters of objects, in isolation. They're important to help us design classes, or small clusters of classes, and give us confidence that they work, but they don't say anything about whether they work together with the rest of the system. Acceptance tests both test the integration of unit-tested objects and push the project forwards.
When we write acceptance tests to describe a new feature, we expect them to fail until that feature has been implemented: new acceptance tests describe work yet to be done. The activity of turning acceptance tests from red to green gives the team a sense of the progress it's making, and is an essential part of the rhythm of iterative Test-Driven Development, as shown in Figure 1.1, “Nested Feedback Loops”.
Once passing, the acceptance tests now represent completed features and should not fail again. A failure means that there's been a regression: that we've broken our existing code.
So, we organise our test suites to reflect the different roles that the tests fulfil: unit and integration tests support the development team, should run quickly and should always pass. Acceptance tests for completed features catch regressions and should always pass, although they might take longer. New acceptance tests represent work in progress and will not pass until a feature is ready.
If requirements change, we must then move affected acceptance tests back out of the regression suite and into the in-progress suite, edit them to reflect the new requirements and change the system to make them pass again.
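The way these suites behave can be sketched in a few lines of Java. This is only an illustration under our own assumptions: the names `Suite`, `TestOutcome` and `BuildPolicy` are hypothetical and do not come from any test framework. The point is that a failure in the unit or regression suites breaks the build, whereas a failure in the in-progress suite is simply a report of work still to be done.

```java
import java.util.List;

// Hypothetical classification of tests by the role they play.
enum Suite { UNIT, REGRESSION, IN_PROGRESS }

// Outcome of a single test run; the field names are illustrative.
final class TestOutcome {
    final String name;
    final Suite suite;
    final boolean passed;

    TestOutcome(String name, Suite suite, boolean passed) {
        this.name = name;
        this.suite = suite;
        this.passed = passed;
    }

    // When requirements change, an affected test moves to another suite.
    TestOutcome movedTo(Suite newSuite) {
        return new TestOutcome(name, newSuite, passed);
    }
}

final class BuildPolicy {
    // Only suites that should always pass can break the build.
    static boolean buildIsBroken(List<TestOutcome> outcomes) {
        return outcomes.stream()
                .anyMatch(o -> !o.passed && o.suite != Suite.IN_PROGRESS);
    }
}

final class SuiteDemo {
    public static void main(String[] args) {
        TestOutcome changed =
                new TestOutcome("sniper wins auction by bidding higher", Suite.REGRESSION, false);

        // A failing regression test is a broken build ...
        System.out.println(BuildPolicy.buildIsBroken(List.of(changed)));
        // ... until we move it back into the in-progress suite to rework it.
        System.out.println(BuildPolicy.buildIsBroken(List.of(changed.movedTo(Suite.IN_PROGRESS))));
    }
}
```

In practice the same separation is usually achieved with whatever grouping mechanism the build tool or test framework offers; the sketch only shows the rule we want that mechanism to enforce.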
Where do we start when we have to write a new class or feature? It's often tempting to start with degenerate or error cases because they're so easy. It seems that we're following the XP maxim to do the "Simplest Thing That Could Possibly Work" [Beck2002]. But simple is not the same as easy. Those degenerate cases do not really add much to the value of the system and, because they're not doing anything very useful, do not give us much feedback about our ideas. We also find it's bad for morale, since it's easy to get so bogged down in degenerate cases and error handling that it feels like we're going nowhere.
So, we prefer to start by testing the simplest success case. Once that's working, we find we have a better idea of the real structure of the solution and can decide whether to work on the error cases we've noticed whilst implementing the success case, or add more success cases.
Of course, we have to write tests for the error cases before we finish the task. This isn't an excuse not to bother with error handling. It's useful to keep a notepad or index cards by the keyboard so that as we work we can jot down the error cases, refactoring opportunities and other technical tasks that need to be addressed. The work is not finished until all the tasks we have jotted down have been crossed off.
For example, we started the auction sniper application by testing that a single sniper could make a bid that is accepted and win the auction. We then went on to deal with what happens when its bid is not accepted, when the sniper is subsequently outbid, and so on. Once those were complete, we dealt with error cases such as when the sniper loses its connection to the auction.
We want each test to be as clear as possible an expression of the behaviour to be performed by the system or object. While writing the test, we ignore the fact that the test won't run, or even compile, and just concentrate on its text. While writing it, we assume the existence of any supporting code we need to let us run the test.
When the test reads well, we then build up the infrastructure to support the test. We know we've implemented enough of the supporting code when the test fails in the way we'd expect with a clear error message describing what needs to be done.
Only then do we start writing the code to make the test pass.
In the example, we wrote the first acceptance test to express our sniper's bidding behaviour as clearly as possible, assuming the existence of a FakeAuction, the GUI drivers and other parts of the test infrastructure. We then built those up underneath the test until it compiled, ran and produced useful failure diagnostics.
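To make the order of work concrete, here is a minimal sketch of what such a test and its supporting code might look like. Only the name `FakeAuction` comes from the text above; `ApplicationRunner`, the method names and the status strings are assumptions made for illustration, and both classes are in-memory stand-ins rather than drivers for a real application. The scenario method at the bottom is the part we would write first; the classes above it are what we would then build underneath it.

```java
// Hypothetical in-memory stand-in for the auction used by the test.
final class FakeAuction {
    private final String itemId;
    private ApplicationRunner joinedSniper; // null until a sniper joins

    FakeAuction(String itemId) { this.itemId = itemId; }

    void startSellingItem() { /* nothing to set up in this in-memory sketch */ }

    void receiveJoinRequestFrom(ApplicationRunner sniper) { joinedSniper = sniper; }

    // Fails with a message that says what was expected and for which item.
    void hasReceivedJoinRequestFromSniper() {
        if (joinedSniper == null) {
            throw new AssertionError(
                    "auction for " + itemId + " expected a join request but received none");
        }
    }

    void announceClosed() {
        if (joinedSniper != null) joinedSniper.auctionClosed();
    }
}

// Hypothetical stand-in for the code that starts and observes the application.
final class ApplicationRunner {
    private String status = "Joining";

    void startBiddingIn(FakeAuction auction) { auction.receiveJoinRequestFrom(this); }

    void auctionClosed() { status = "Lost"; }

    void showsSniperHasLostAuction() {
        if (!"Lost".equals(status)) {
            throw new AssertionError("expected sniper status <Lost> but was <" + status + ">");
        }
    }
}

final class EndToEndSketch {
    // Written first, in the words we would use to describe the feature.
    static void sniperJoinsAuctionUntilAuctionCloses() {
        FakeAuction auction = new FakeAuction("item-54321");
        ApplicationRunner application = new ApplicationRunner();

        auction.startSellingItem();
        application.startBiddingIn(auction);
        auction.hasReceivedJoinRequestFromSniper();
        auction.announceClosed();
        application.showsSniperHasLostAuction();
    }

    public static void main(String[] args) {
        sniperJoinsAuctionUntilAuctionCloses();
        System.out.println("scenario passed");
    }
}
```

Note that the scenario contains no mention of protocols, sockets or widgets; those details belong in the supporting classes, which is what lets the test read as a description of behaviour.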
We look at the readability of tests in more detail in Chapter 16.
We always watch the test fail before writing the code to make it pass, and check the diagnostic message. If the test fails in a way we didn't expect then we know we've misunderstood something or that the code is incomplete, so we fix that. When we get the “right” failure, we check that the diagnostics are helpful. If their description isn't clear then someone (probably us) will have to struggle when the code breaks in a few weeks' time. We adjust the test code and rerun the tests until the error messages guide us to the problem with the code.
As we write the production code, we keep running the test to see progress and to check the error diagnostics as the system is built up behind the test. Where necessary, we extend or modify the support code to ensure the error messages are always clear and relevant.
There's more than one reason for insisting on checking the error messages. First, it checks our assumptions about the code we're working on. Sometimes we're wrong, as we saw in Chapter 8, Getting ready to bid. Second, more subtly, we find that our emphasis on (or, perhaps, mania for) expressing our intentions is a fundamental practice for developing reliable, maintainable systems, and for us that includes tests and failure messages. Taking the trouble to generate a useful diagnostic helps us clarify what the test, and therefore the code, is supposed to do.
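The difference between a poor diagnostic and a useful one is easy to show. The following is a hedged sketch using hand-written check methods in a hypothetical `Diagnostics` class, not the API of any particular test framework; real frameworks provide richer equivalents. The first check can only report that something was false, whereas the second says what was being examined and shows both values.

```java
import java.util.Objects;

final class Diagnostics {
    // A bare check: on failure it can say only that a condition was false.
    static void assertTrue(boolean condition) {
        if (!condition) throw new AssertionError("expected true");
    }

    // A check that names what was examined and reports both values.
    static void assertEquals(String description, Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError(
                    description + ": expected <" + expected + "> but was <" + actual + ">");
        }
    }
}

final class DiagnosticsDemo {
    public static void main(String[] args) {
        String sniperStatus = "Joining"; // hypothetical state read from the application

        try {
            Diagnostics.assertTrue(sniperStatus.equals("Bidding"));
        } catch (AssertionError e) {
            System.out.println("unhelpful: " + e.getMessage());
        }

        try {
            Diagnostics.assertEquals("sniper status after price update", "Bidding", sniperStatus);
        } catch (AssertionError e) {
            System.out.println("helpful:   " + e.getMessage());
        }
    }
}
```

When the second check fails we can tell from the message alone which expectation was broken and how, which is the property we look for when we watch a new test fail.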
We look at error diagnostics and how to improve them in more detail in Chapter 17.
We start developing a feature with the objects that receive input from the end-to-end tests. What events cause the system to do something? Which object, or objects, listen to those events? What must they do when the events occur? As we write the objects that handle events, we discover the services they need to perform their responsibilities and write those objects, discovering the services that they need in turn. In this way we work through the system, from the objects that receive events from outside the process, through intermediate layers, to the central domain model, and then back out to the objects that send events back out.
figure to go here
For example, when building the auction sniper we started by unit-testing and implementing objects that sent and received XMPP messages to the auction server and developed the application from there towards the GUI. As we added more application logic, we factored that logic out into separate objects that modeled the domain: Auction, Bid, AuctionSniper, Portfolio, and so on.
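Working from the inputs inwards tends to produce interfaces that are defined by what their callers need. The sketch below illustrates this under stated assumptions: the type names `Auction` and `AuctionSniper` appear in the text above, but the method signatures, the bidding rule and the `RecordingAuction` test double are invented here for illustration. While writing the sniper's response to a price event, we discover that it needs something that can place a bid, so we name that role and leave its real implementation for later.

```java
// The service the sniper needs, named from the sniper's point of view.
// How bids actually reach the auction is decided later, elsewhere.
interface Auction {
    void bid(int amount);
}

// Receives price events from outside and decides how to respond.
final class AuctionSniper {
    private final Auction auction;

    AuctionSniper(Auction auction) { this.auction = auction; }

    // Assumed rule for this sketch: always bid the minimum increment higher.
    void currentPrice(int price, int increment) {
        auction.bid(price + increment);
    }
}

// Hand-written test double that records what the sniper asked for.
final class RecordingAuction implements Auction {
    int lastBid = -1; // -1 means no bid has been made

    @Override
    public void bid(int amount) { lastBid = amount; }
}

final class OutsideInDemo {
    public static void main(String[] args) {
        RecordingAuction auction = new RecordingAuction();
        AuctionSniper sniper = new AuctionSniper(auction);

        sniper.currentPrice(1000, 25);

        System.out.println("sniper bid " + auction.lastBid); // prints "sniper bid 1025"
    }
}
```

The next step in this style would be to implement `Auction` for real, at which point we would discover what that implementation needs from its own neighbours.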
It's tempting to start by unit-testing new domain model objects and then try to hook them into the rest of the application. It seems easier at the start (we feel we're making rapid progress working on the domain model when we don't have to hook it into anything) but we're more likely to get bitten by integration problems later, and then find we've wasted a lot of time building unnecessary or incorrect functionality.
We've learned the hard way that just writing lots of tests, even when it produces high test coverage, does not guarantee a codebase that's easy to work with. Many developers who adopt TDD find their early tests hard to understand when they revisit them later. One common mistake is to think in terms of testing methods; this habit is particularly easy to fall into for those of us who've learned to think in terms of class invariants. A test called testBidAccepted() tells us something about what it does, but not what it's for.
We do better when we focus on the features that the object under test should provide, each of which may require collaboration with its neighbours and may involve calling more than one of its methods. We need to know how to use the class to achieve a goal, not how to exercise all the paths through its code.
It helps to choose test names that describe how the object behaves in the scenario being tested. We look at this in more detail in the section called “Expressive Test Names”.
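As an illustration, here is a small sketch of tests named after behaviour rather than after methods. The `Sniper` class is a hypothetical, much-simplified subject written only for this example, and the tests are plain static methods so that the sketch runs without a test framework. Each name says what the object does in a given scenario, and the second scenario needs more than one method call to set up.

```java
// Hypothetical, much-simplified sniper used only to give the tests a subject.
final class Sniper {
    private boolean winning = false;
    private String status = "Joining";

    void priceChanged(boolean bidIsOurs) {
        winning = bidIsOurs;
        status = winning ? "Winning" : "Bidding";
    }

    void auctionClosed() {
        status = winning ? "Won" : "Lost";
    }

    String status() { return status; }
}

final class SniperBehaviour {
    // Compare these names with testAuctionClosed(), which would say
    // which method is called but not what the sniper is meant to do.
    static void reportsLostIfAuctionClosesBeforeAnyPriceArrives() {
        Sniper sniper = new Sniper();
        sniper.auctionClosed();
        expectStatus("Lost", sniper);
    }

    static void reportsWonIfAuctionClosesWhileSniperIsWinning() {
        Sniper sniper = new Sniper();
        sniper.priceChanged(true);
        sniper.auctionClosed();
        expectStatus("Won", sniper);
    }

    private static void expectStatus(String expected, Sniper sniper) {
        if (!expected.equals(sniper.status())) {
            throw new AssertionError(
                    "sniper status: expected <" + expected + "> but was <" + sniper.status() + ">");
        }
    }

    public static void main(String[] args) {
        reportsLostIfAuctionClosesBeforeAnyPriceArrives();
        reportsWonIfAuctionClosesWhileSniperIsWinning();
        System.out.println("all scenarios passed");
    }
}
```

Read together, the test names form a rough specification of the object, which is what we want a newcomer to see when they open the test class.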
When writing unit and integration tests, we stay alert for areas of the code that are difficult to test. When we find a feature that’s difficult to test we don’t just ask ourselves how to test it, but also why it is difficult to test.
In our experience, when code is difficult to test our design needs to be improved. The same flaws that make it difficult to test the code now will make it difficult to change the code sometime in the future. But in the future we will have lost a lot of our understanding of the system that we're keeping in our short-term memory as we write the code, and so maintenance will be that much harder. And that's assuming that we will still be around to change what we've written. If we've done our job well, our code will continue being maintained for years, long after we've moved on to pastures new.
So writing unit tests gives us valuable early warning of maintenance problems in the future at the time when it is easiest to fix them. This is an example of how our maxim to expect unexpected changes guides development: we don't know which area of the code we will have to change, so we work to make the entire system as well factored as possible.
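One common instance, offered here as our own illustration rather than an example from the auction sniper, is code that reaches out for the current time. The class names below are hypothetical. The first version is awkward to test because the test cannot control what "now" means; asking why leads us to make the dependency explicit, which also makes the class easier to change later. The sketch uses the standard `java.time.Clock` class to supply the time.

```java
import java.time.Clock;
import java.time.Instant;
import java.time.ZoneOffset;

// Hard to test: the current time is fetched inside the method,
// so a test cannot decide whether the deadline has passed.
final class HardToTestDeadline {
    private final Instant closesAt;

    HardToTestDeadline(Instant closesAt) { this.closesAt = closesAt; }

    boolean hasPassed() { return Instant.now().isAfter(closesAt); }
}

// Easier to test and to change: the dependency on time is passed in.
final class Deadline {
    private final Instant closesAt;
    private final Clock clock;

    Deadline(Instant closesAt, Clock clock) {
        this.closesAt = closesAt;
        this.clock = clock;
    }

    boolean hasPassed() { return clock.instant().isAfter(closesAt); }
}

final class DeadlineDemo {
    public static void main(String[] args) {
        Instant closesAt = Instant.parse("2008-06-01T12:00:00Z");

        // Fixed clocks let the test choose the moment it runs at.
        Clock justBefore = Clock.fixed(closesAt.minusSeconds(1), ZoneOffset.UTC);
        Clock justAfter = Clock.fixed(closesAt.plusSeconds(1), ZoneOffset.UTC);

        System.out.println(new Deadline(closesAt, justBefore).hasPassed()); // false
        System.out.println(new Deadline(closesAt, justAfter).hasPassed());  // true
    }
}
```

The difficulty of testing the first version was the warning; the second version is what we get when we treat that warning as feedback about the design rather than as a testing problem to be worked around.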
We examine how to listen to the tests in more detail in Chapter 19.
Copyright © 2008 Steve Freeman and Nat Pryce