Friday, December 24, 1999

Extreme Programming, Cathedral and the Bazaar

This article appears in slightly different form in the November/December 1999 issue of IEEE Micro © 1999 IEEE.

Changes

In my final column of the 1900s I look at two books that challenge twentieth century ideas about developing software and making money from it.

Extreme Programming Explained by Kent Beck (Addison-Wesley, 2000, 212pp, ISBN 0-201-61641-6, www.awl.com, $29.95)

Kent Beck knows how to design and implement software. He advocated using design patterns years before most people had heard of them, and he pioneered CRC (class, responsibility, collaboration) cards, a low-overhead design methodology.

Programming lies between art and engineering. Too much influence from either can make it unprofitable. Isolated artists, working from an inner vision, can produce brilliant work, often quickly. But it's a rare artist whose inner vision coincides with what users want.

Engineering, on the other hand, is thorough and methodical. Engineers insist on complete functional specifications. They devise comprehensive testing regimens, prepare and verify detailed design plans, then build the product. They constantly review each other's work and monitor their adherence to the plan.

This style of engineering requires long product cycles, especially when measured by Internet time. Even a thorough analysis phase usually fails to produce a functional specification that perfectly foresees and communicates what end users need and how they will use the product. This leads to changes, which often produce delays and weaken the overall design.

Extreme programming (EP) adopts the proven techniques that make engineering robust but replaces the long product cycles. An EP project compresses the cycle of specification, design, implementation, and testing into a time period of a week or so. Within that period it uses smaller cycles -- some as short as a few minutes. In this way it replaces the ballistic approach (that is, plan the whole trajectory, then blast off) with frequent course corrections based on feedback.

Beck calls his style of programming extreme because he pushes accepted techniques and principles as far as he can. For example, he requires two programmers to participate in any programming session, leading to non-stop code reviews. Programmers write tests before they write code, and they perform unit testing every time they change anything.
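The test-first habit can be pictured with a minimal sketch. Nothing here comes from Beck's book: the function word_count, its behavior, and the helper run_tests are hypothetical, chosen only to show the order of work. The tests are written first and state what the code must do; the code that follows is the simplest thing that makes them pass.

```python
import unittest


# Step 1: the tests come first and describe the required behavior.
# word_count is a hypothetical function used for illustration only.
class WordCountTest(unittest.TestCase):
    def test_empty_string_has_no_words(self):
        self.assertEqual(word_count(""), 0)

    def test_counts_whitespace_separated_words(self):
        self.assertEqual(word_count("release early, release often"), 4)


# Step 2: the simplest code that makes the tests pass.
def word_count(text):
    return len(text.split())


def run_tests():
    """Run the whole suite and return True if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WordCountTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


if __name__ == "__main__":
    run_tests()
```

Running the suite after every change, however small, is what turns the tests into the safety net that the rest of the method relies on.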

Beck integrates and tests the entire system many times per day. Customers (he requires one to be part of the team) continually test the system's functionality.

Beck clearly separates the roles of customers, management, and developers in defining the functionality and dates of releases. He requires releases to be as small as possible, consisting of just the most important new features. Customers receive new capabilities every few weeks. As they use them, they discover important possibilities or gaps, and these help define the content of subsequent releases.

Because the project meanders in this way, EP requires each piece of software to be the simplest that supports the current requirement. It avoids designing on speculation. As a result EP calls for frequent redesign or refactoring. In other design methodologies, redesigning something that already works is anathema. EP's frequent testing makes this a low-risk activity. EP programmers who see ways to simplify the design or eliminate duplication are obliged to do so. Anyone can modify any code at any time -- no matter who wrote it.

Because communication is such an important part of an EP project, Beck requires the project to have a single overarching metaphor. He bases all project jargon on the metaphor, and he uses it for everything from talking with customers to deciding how the code should work.

Beck's book is easy to read. He explains the principles well, and he doesn't bring in a lot of unnecessary detail. You can read it in a few hours. Then you'll know what EP is and how it works, but you won't really know how to do it. For that, Beck advises, "you will have to go online, talk to some of the coaches mentioned here, wait for the how-to books to follow, or just make up your own version."

If your programming projects aren't proceeding as effectively as you'd like, read this book. You'll probably find something you can use.


The Cathedral and the Bazaar by Eric S. Raymond (O'Reilly, Sebastopol CA, 1999, 280pp, ISBN 1-56592-724-9, www.oreilly.com, $19.95)

In the summer of 1968 the company I worked for hired two high school students to modify Digital's PDP-8 assembler. They moved the symbol table from core memory to disk, maintained it in alphabetical order, and replaced the linear search with a binary search. In the early 1970s I modified programs from DEC, Hewlett-Packard, Data General, and Varian. All of those companies provided source code. Nowadays, few companies do so.
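To make the students' change concrete, here is a rough sketch of the idea in Python. It is not their code, which was PDP-8 assembly language working against a disk file; the class and function names are invented for illustration, and the table lives in memory. It shows only the core idea: keep the symbols in alphabetical order so that a lookup can halve the search range at each step instead of scanning every entry.

```python
import bisect


class SymbolTable:
    """Symbols kept in alphabetical order so lookups can use binary search."""

    def __init__(self):
        self._names = []   # symbol names, always sorted
        self._values = []  # values, parallel to _names

    def define(self, name, value):
        # Find the position where name belongs in the sorted list.
        i = bisect.bisect_left(self._names, name)
        if i < len(self._names) and self._names[i] == name:
            self._values[i] = value        # redefine an existing symbol
        else:
            self._names.insert(i, name)    # insert, preserving the order
            self._values.insert(i, value)

    def lookup(self, name):
        # Binary search: about log2(n) comparisons for n symbols.
        i = bisect.bisect_left(self._names, name)
        if i < len(self._names) and self._names[i] == name:
            return self._values[i]
        return None


def linear_lookup(pairs, name):
    """The approach being replaced: scan the entries until one matches."""
    for entry_name, value in pairs:
        if entry_name == name:
            return value
    return None
```

With a thousand symbols, the linear scan averages about five hundred comparisons for a successful lookup, while the binary search needs about ten.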

Eric Raymond has been producing open source code for about fifteen years. Much of it is still popular today. Nonetheless, Linus Torvalds's Linux project went against much of what Raymond thought he knew. His efforts to understand the Linux model and his experiences leading the Fetchmail project led to the essays in this book. They contain interesting technical and sociological observations of the open source process, and they contain aphorisms embodying the lessons Raymond learned from the Fetchmail project. My favorites are:
  • Release early. Release often. And listen to your users.
  • Smart data structures and dumb code works a lot better than the other way around.
The first of these leads to a style similar to Beck's extreme programming. It responds to the impatience and the parallelism of the Internet culture.

I've heard the second in many forms over the years. Raymond cites Brooks's The Mythical Man-Month (1975) for one version. I remember the stress Butler Lampson placed on this point in his 1968 lectures on operating systems at UC Berkeley. It's so important, and so often ignored.
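A small sketch may help picture the point. It is a hypothetical example, not one drawn from Raymond or Brooks: converting a number to Roman numerals. All the knowledge, including the subtractive forms such as CM and IV, sits in the table. The code that walks the table is deliberately dumb and has no special cases.

```python
# The table carries the knowledge, ordered from largest value to smallest.
ROMAN = [
    (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
    (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
    (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
]


def to_roman(n):
    """Convert a positive integer to a Roman numeral using the table."""
    out = []
    for value, symbol in ROMAN:
        # Emit this symbol as many times as its value still fits.
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


print(to_roman(1999))  # MCMXCIX
```

The other way around, with a bare list of seven letters and the subtractive rules buried in nested conditionals, is longer, harder to check, and harder to change.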

If Raymond had stopped at the observations and the aphorisms, this would have been an excellent book. He goes on, however, to frame broader general principles. They seem plausible, but he presents many of them as facts and mixes them with the other material. I think this detracts from the clarity and value of the book.

Despite the quibble, you should read this book. It provides useful insights into the open source movement and its place in the future of software development.