So now we've got an idea of what to build, can we get on with it and write our first unit test?
Not yet.
Our first task is to create a Walking Skeleton—an implementation of the thinnest possible slice of real functionality that we can automatically build, deploy, and test end-to-end [Cockburn2004]. Writing a Walking Skeleton forces us to understand the requirements well enough to propose a broad-brush system architecture—the major components of the system and how they will fit together. Of course we start by writing a test for our Walking Skeleton which, in turn, will force us to work out how to build, package, and deploy the application into a production-like environment and what "production-like" means.
The development of a Walking Skeleton is the moment when we start to make choices about the high-level structure of our application, which means we need to have a high-level understanding of the client’s requirements, both functional and non-functional. There’s a lot of preparatory work that’s outside the scope of this book. We can always change our minds later, when we learn more, but it’s important to start with something that maps out the landscape of our solution. It’s also very important that we can test the approach we’ve chosen, both to validate our choices and so that we can make changes with confidence later.
It's also important to realise that the “end” in “end-to-end” refers to the process, as well as the system. We want our test to start from scratch, build a deployable system, deploy it into a production-like environment, and then run the tests through the deployed system. Including the deployment step in the testing process is critical for two reasons. First, this is the sort of error-prone activity that should not be done by hand, so we want our scripts to have been thoroughly exercised by the time we have to deploy for real. Second, in many organisations this is where the development team interfaces with other teams and has to follow their procedures. If it’s going to take six weeks and four signatures to set up a database, you want to know now, not two weeks before delivery.
All this effort means that teams are frequently surprised by the time it takes to get a Walking Skeleton working, considering it hardly does anything. That’s because this first step involves establishing a lot of infrastructure and doing a lot of research and analysis. The alternative, which usually involves manual processes and tweaking the system to keep it live, is risky and can waste unpredictable amounts of time when deadlines are close. The Walking Skeleton will flush out issues early in the project when there’s still time, budget, and goodwill to address them.
In practice, of course, end-to-end sometimes takes so long to achieve that we have to start with infrastructure that implements our current understanding of what the real system will do and its environment. We have to keep in mind, however, that this is a stop-gap, a temporary patch until we can finish the job, and that we will have unknown risks until our tests really run end-to-end.
The Walking Skeleton must cover all the components of our Auction Sniper system: the user interface, the sniping component, and the communication with an Auction Server. The thinnest slice we can imagine testing is that the AuctionSniper can join an Auction and then wait for the Auction to close, which was the first item on our To Do list. This slice is so minimal that we're not even concerned with sending a Bid; we just want to know that the two sides can communicate and that we can test the system from outside, through the client's GUI and by injecting events as if from the third-party auction server. Once that's working, we have a solid base on which to build the features that the clients want.
We like to start by writing a test as if its implementation already exists, and then filling in whatever's needed to make it work—what Abelson and Sussman call Programming by Wishful Thinking [SICP1996]. Working backwards from the test helps us focus on what we want the system to do, rather than getting caught up in the complexity of how we will make it do that. So, first we will code up a test to describe our intentions as clearly as we can, given the expressive limits of a programming language. Then we will build the infrastructure to support the way we want to test the system, rather than writing the tests to fit in with an existing infrastructure. This usually takes a large part of our initial effort because there is so much to get ready. Once we have this infrastructure in place, we can implement the feature and make the test pass.
An outline of the test we want would be:
When an Auction is selling an Item,
and an AuctionSniper has started to bid in that Auction,
then the Auction will receive a Join request from the AuctionSniper.
When an Auction announces that it is Closed,
then the AuctionSniper will show that it lost the Auction.
which we need to translate into something executable. We'll use JUnit as our test framework since it's familiar and widely supported. We also need mechanisms to control the application and the Auction that the client is talking to.
Southabee's On-Line's test services are not freely available. We have to book ahead and pay for each test session, which is not practical if we want to run tests all the time. We’ll need a fake Auction service that we can control from our tests to behave like the real thing, or at least how we think the real thing will behave until we get a chance to test against it for real. This fake Auction, or “stub”, will be as simple as we can make it. It will connect to an XMPP message broker, receive Commands from the Sniper to be checked by the test, and allow the test to send back Events. We’re not trying to reimplement Southabee's ourselves, just reproduce test scenarios.
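To make the shape of such a fake concrete, here is a minimal sketch. It is not the implementation we will build: it assumes an in-memory queue in place of the XMPP connection, and the class name SketchFakeAuction, its method names, and the "JOIN" and "CLOSE" message strings are all hypothetical, chosen only for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: an in-memory stand-in for a fake auction.
// A real fake would talk to an XMPP broker; here a queue plays that role.
class SketchFakeAuction {
    private final String itemId;
    // Commands from the Sniper are queued so the test thread can check them later.
    private final BlockingQueue<String> receivedCommands = new ArrayBlockingQueue<>(10);
    // Where events sent by the test are delivered; does nothing until registered.
    private volatile Consumer<String> eventListener = event -> { };

    SketchFakeAuction(String itemId) {
        this.itemId = itemId;
    }

    String itemId() {
        return itemId;
    }

    // Called by the (simulated) transport when the Sniper sends a command.
    void deliverCommand(String command) {
        receivedCommands.add(command);
    }

    // Registers the Sniper side as the recipient of events.
    void onEvent(Consumer<String> listener) {
        this.eventListener = listener;
    }

    // Waits up to the timeout for the next command and fails if it is not the expected one.
    void hasReceivedCommand(String expected, long timeoutMillis) {
        String actual;
        try {
            actual = receivedCommands.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new AssertionError("interrupted while waiting for <" + expected + ">");
        }
        if (!expected.equals(actual)) {
            throw new AssertionError("expected <" + expected + "> but got <" + actual + ">");
        }
    }

    // Lets the test send an event back towards the Sniper.
    void announce(String event) {
        eventListener.accept(event);
    }
}
```

The queue matters because the command arrives on a different thread from the one running the test, so the check has to wait for a bounded time rather than assume the message is already there.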
Controlling the Sniper application is more complicated. We want our
Skeleton test to exercise our application as close to end-to-end as
possible, to show that the main() method initializes
the application correctly and that the components really work together.
This means that we should start by working through the publicly visible
features of the application, in this case its user interface, rather than
directly invoking its domain objects. We also want our test to be clear
about what is being checked, written in terms of the relationship between
a Sniper and its Auction, so we'll hide all the messy code for
manipulating Swing in an ApplicationRunner class.
We'll start by writing the test as if all the code it needs exists and
then fill in the implementations.
public class AuctionSniperEndToEndTest {
  private final FakeAuctionServer auction = new FakeAuctionServer("item-54321");
  private final ApplicationRunner application = new ApplicationRunner();

  @Test
  public void sniperJoinsAuctionUntilAuctionCloses() throws Exception {
    auction.startSellingItem();                 // Step 1
    application.startBiddingIn(auction);        // Step 2
    auction.receivesJoinRequestFromSniper();    // Step 3
    auction.announceClosed();                   // Step 4
    application.showsSniperHasLostAuction();    // Step 5
  }

  // Additional cleanup
  @After
  public void stopAuction() {
    auction.stop();
  }

  @After
  public void stopApplication() {
    application.stop();
  }
}
To make our intentions clear in the test, we've adopted naming
conventions for the methods of the helper objects. If a method triggers an
event to drive the test, its name will be a command, such as
startBiddingIn(). If a method
asserts that something should have happened, its name will be descriptive,
such as
showsSniperHasLostAuction()[1]. JUnit will call the two
stop() methods after the test has run to clean up the
runtime environment.
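The naming convention can be illustrated with a small sketch of a runner. This is an assumption-laden stand-in, not the ApplicationRunner we will write: SketchSniperUi replaces the real Swing user interface with a single in-memory label, and the status strings "Joining" and "Lost" are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the application's user interface (the real one is Swing).
class SketchSniperUi {
    private final AtomicReference<String> statusLabel = new AtomicReference<>("");

    void showStatus(String status) {
        statusLabel.set(status);
    }

    String displayedStatus() {
        return statusLabel.get();
    }
}

// Hypothetical runner that hides the user-interface details from the test.
class SketchApplicationRunner {
    private SketchSniperUi ui;

    // Imperative name: a command that drives the test forward.
    void startBidding() {
        ui = new SketchSniperUi();
        ui.showStatus("Joining");
    }

    // Indicative name: an assertion about what should have happened.
    void showsSniperStatus(String expected) {
        String actual = ui.displayedStatus();
        if (!expected.equals(actual)) {
            throw new AssertionError("expected status <" + expected + "> but was <" + actual + ">");
        }
    }

    // Cleanup, safe to call even if the application never started.
    void stop() {
        ui = null;
    }

    // Exposed only so this sketch can simulate the application reacting to an event.
    SketchSniperUi ui() {
        return ui;
    }
}
```

A test reads startBidding() as something it does and showsSniperStatus("Lost") as something it expects, which is the distinction the convention is meant to make visible.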
In writing the test, one of the assumptions we've made is that a
FakeAuctionServer is tied to a given item. This
matches the structure of our intended architecture, where Southabee's
On-Line hosts multiple auctions, each of which sells a single
item.
The language of this test is concerned with Auctions and Snipers. There's nothing about messaging layers or components in the User Interface; that's all incidental detail here. Keeping the language consistent helps us understand what's significant in this test, with a nice side effect of protecting us when the implementation inevitably changes.
Now we have to make the test pass, which will require a lot of preparation. We need to find or write four components: an XMPP message broker, a stub Auction that can communicate over XMPP, a GUI testing framework, and a test harness that can cope with our multi-threaded, asynchronous architecture. We also have to get the project under version control with an automated build/deploy/test process. Compared to unit testing a single class, there is a lot to do, but it’s essential. Even at this high level, the exercise of writing tests drives the development of the system. Working through our first end-to-end test will force some of the structural decisions we need to make, such as packaging and deployment.
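One common way for a test harness to cope with asynchrony is to poll for a condition until a timeout expires, rather than asserting once and failing because the application has not caught up yet. The sketch below shows that idea only; SketchPoller and its parameters are hypothetical and do not represent the API of any particular testing library.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

// Hypothetical sketch of a polling assertion for asynchronous systems.
class SketchPoller {
    private final long timeoutMillis;
    private final long pollDelayMillis;

    SketchPoller(long timeoutMillis, long pollDelayMillis) {
        this.timeoutMillis = timeoutMillis;
        this.pollDelayMillis = pollDelayMillis;
    }

    // Samples the condition repeatedly; passes as soon as it holds,
    // fails with an AssertionError once the timeout has expired.
    void check(BooleanSupplier condition, String description) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline > 0) {
                throw new AssertionError("timed out waiting for: " + description);
            }
            try {
                Thread.sleep(pollDelayMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting for: " + description);
            }
        }
    }
}
```

With something like this, a check such as "the Sniper shows that it lost" succeeds as soon as the state becomes visible and only fails after the full timeout, so the test is neither flaky nor slower than it needs to be.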
First, the package selection. We will need an XMPP message broker to let the application talk to our stub auction house. After some investigation, we decide on an open-source implementation called Openfire and its associated client library, Smack. We also need a high-level test framework that can work with Swing and Smack, both of which are multi-threaded and event-driven. Luckily for us, there are several frameworks for testing Swing applications, and the way that they deal with Swing's multi-threaded, event-driven architecture copes equally well with Smack. We pick WindowLicker, which is open source and supports the asynchronous approach that we need in our tests. Together, the infrastructure will look like Figure 6.1, “The end-to-end test rig”.
You might have noticed that we skipped over one point: this first test is not really end-to-end. It doesn’t include the real auction service because that is not easily available. An important part of the skill of Test-Driven Development is judging where to set the boundaries of what to test and how to cover everything eventually. In this case, we have to start with a fake Auction service based on the documentation from Southabee's On-Line. The documentation might or might not be correct, so we will record that as a known risk in the project plan and schedule time to test against the real server as soon as we have enough functionality to complete a meaningful transaction, even if we end up buying a hideous (but cheap) pair of candlesticks in a real Auction. The sooner we find a discrepancy, the less code we have based on that misunderstanding and the more time we have left to fix it.
We'd better get on with it.
[1] For the grammatically pedantic, the names of methods that trigger events are clauses in the imperative mood whereas the names of assertions are clauses in the indicative mood.
Copyright © 2008 Steve Freeman and Nat Pryce