Friday, December 24, 1999

Extreme Programming, Cathedral and the Bazaar

This article appears in slightly different form in the November/December 1999 issue of IEEE Micro © 1999 IEEE.

Changes

In my final column of the 1900s I look at two books that challenge twentieth-century ideas about developing software and making money from it.

Extreme Programming Explained by Kent Beck (Addison-Wesley, 2000, 212pp, ISBN 0-201-61641-6, www.awl.com, $29.95)

Kent Beck knows how to design and implement software. He advocated using design patterns years before most people had heard of them, and he pioneered CRC (class, responsibility, collaboration) cards, a low-overhead design methodology.

Programming lies between art and engineering. Too much influence from either can make it an economically unprofitable activity. Isolated artists, working from an inner vision, can produce brilliant work, often quickly. But it's a rare artist whose inner vision coincides with what users want.

Engineering, on the other hand, is thorough and methodical. Engineers insist on complete functional specifications. They devise comprehensive testing regimens, prepare and verify detailed design plans, then build the product. They constantly review each other's work and monitor their adherence to the plan.

This style of engineering requires long product cycles, especially when measured by Internet time. Even a thorough analysis phase usually fails to produce a functional specification that perfectly foresees and communicates what end users need and how they will use the product. This leads to changes, which often produce delays and weaken the overall design.

Extreme programming (EP) adopts the proven techniques that make engineering robust but replaces the long product cycles. An EP project compresses the cycle of specification, design, implementation, and testing into a time period of a week or so. Within that period it uses smaller cycles -- some as short as a few minutes. In this way it replaces the ballistic approach (that is, plan the whole trajectory, then blast off) with frequent course corrections based on feedback.

Beck calls his style of programming extreme because he pushes accepted techniques and principles as far as he can. For example, he requires two programmers to participate in any programming session, leading to non-stop code reviews. Programmers write tests before they write code, and they perform unit testing every time they change anything.
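The test-before-code discipline is easy to show in miniature. The sketch below uses a plain main method as the test harness so it stays self-contained; the Money class and all its names are invented for illustration, not taken from Beck's book.

```java
// The test in main() is written first and pins down the desired
// behavior; the add method is then written just to make it pass.
class Money {
    private final int cents;
    Money(int cents) { this.cents = cents; }
    int cents() { return cents; }
    // Written only after the test below existed and failed.
    Money add(Money other) { return new Money(cents + other.cents); }
}

public class MoneyTest {
    public static void main(String[] args) {
        Money sum = new Money(150).add(new Money(250));
        if (sum.cents() != 400) {
            throw new AssertionError("expected 400, got " + sum.cents());
        }
        System.out.println("ok");  // prints ok
    }
}
```

Because the test runs every time anything changes, a regression in add shows up within minutes of being introduced.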

Beck integrates and tests the entire system many times per day. Customers (he requires one to be part of the team) continually test the system's functionality.

Beck clearly separates the roles of customers, management, and developers in defining the functionality and dates of releases. He requires releases to be as small as possible, consisting of just the most important new features. Customers receive new capabilities every few weeks. As they use them, they discover important possibilities or gaps, and these help define the content of subsequent releases.

Because the project meanders in this way, EP requires each piece of software to be the simplest that supports the current requirement. It avoids designing on speculation. As a result EP calls for frequent redesign or refactoring. In other design methodologies, redesigning something that already works is anathema. EP's frequent testing makes this a low-risk activity. EP programmers who see ways to simplify the design or eliminate duplication are obliged to do so. Anyone can modify any code at any time -- no matter who wrote it.
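A tiny example may make the duplication-removal obligation concrete. The pricing scenario and every name below are invented for this sketch, not drawn from the book.

```java
// Before refactoring, retailTotal and wholesaleTotal each repeated the
// expression "price * quantity * (1 - discount)" inline. EP's rule --
// eliminate duplication the moment you see it -- leads to factoring
// the shared arithmetic into one place.
public class Pricing {
    static double discounted(double price, int quantity, double discount) {
        return price * quantity * (1 - discount);
    }
    static double retailTotal(double price, int quantity) {
        return discounted(price, quantity, 0.0);
    }
    static double wholesaleTotal(double price, int quantity) {
        return discounted(price, quantity, 0.25);  // 25% trade discount
    }
    public static void main(String[] args) {
        System.out.println(retailTotal(10.0, 3));     // prints 30.0
        System.out.println(wholesaleTotal(10.0, 3));  // prints 22.5
    }
}
```

The refactoring is low-risk precisely because of EP's frequent testing: if consolidating the arithmetic broke anything, the next test run would say so immediately.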

Because communication is such an important part of an EP project, Beck requires the project to have a single overarching metaphor. He bases all project jargon on the metaphor, and he uses it for everything from talking with customers to deciding how the code should work.

Beck's book is easy to read. He explains the principles well, and he doesn't bring in a lot of unnecessary detail. You can read it in a few hours. Then you'll know what EP is and how it works, but you won't really know how to do it. For that, Beck advises, "you will have to go online, talk to some of the coaches mentioned here, wait for the how-to books to follow, or just make up your own version."

If your programming projects aren't proceeding as effectively as you'd like, read this book. You'll probably find something you can use.


The Cathedral and the Bazaar by Eric S. Raymond (O'Reilly, Sebastopol CA, 1999, 280pp, ISBN 1-56592-724-9, www.oreilly.com, $19.95)

In the summer of 1968 the company I worked for hired two high school students to modify Digital's PDP-8 assembler. They moved the symbol table from core memory to disk, maintained it in alphabetical order, and replaced the linear search with a binary search. In the early 1970s I modified programs from DEC, Hewlett-Packard, Data General, and Varian. All of those companies provided source code. Nowadays, few companies do so.

Eric Raymond has been producing open source code for about fifteen years. Much of it is still popular today. Nonetheless, Linus Torvalds's Linux project went against much of what Raymond thought he knew. His efforts to understand the Linux model and his experiences leading the Fetchmail project led to the essays in this book. The essays contain interesting technical and sociological observations of the open source process, along with aphorisms embodying the lessons Raymond learned from the Fetchmail project. My favorites are:
  • Release early. Release often. And listen to your users.
  • Smart data structures and dumb code works a lot better than the other way around.
The first of these leads to a style similar to Beck's extreme programming. It responds to the impatience and the parallelism of the Internet culture.

I've heard the second in many forms over the years. Raymond cites Brooks's The Mythical Man-Month (1975) for one version. I remember the stress Butler Lampson placed on this point in his 1968 lectures on operating systems at UC Berkeley. It's so important, and so often ignored.
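The aphorism about smart data structures and dumb code can be illustrated in a few lines. This sketch is mine, not Raymond's: the knowledge lives in a table, and the code that consults the table is trivial, where the alternative would bury the same facts inside a twelve-way conditional.

```java
import java.util.Map;

// "Smart data structures and dumb code": month lengths live in one
// table; the lookup code carries no knowledge of its own.
public class MonthLength {
    static final Map<String, Integer> DAYS = Map.ofEntries(
        Map.entry("Jan", 31), Map.entry("Feb", 28), Map.entry("Mar", 31),
        Map.entry("Apr", 30), Map.entry("May", 31), Map.entry("Jun", 30),
        Map.entry("Jul", 31), Map.entry("Aug", 31), Map.entry("Sep", 30),
        Map.entry("Oct", 31), Map.entry("Nov", 30), Map.entry("Dec", 31));

    static int daysIn(String month) {
        Integer d = DAYS.get(month);
        if (d == null) throw new IllegalArgumentException("unknown month: " + month);
        return d;  // dumb code: the table is the design
    }

    public static void main(String[] args) {
        System.out.println(daysIn("Sep"));  // prints 30
    }
}
```

Changing the design (say, to handle leap years) means changing the table or swapping in a richer structure, not rethreading logic through a tangle of branches.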

If Raymond had stopped at the observations and the aphorisms, this would have been an excellent book. He goes on, however, to frame broader general principles. They seem plausible, but he presents many of them as facts and mixes them with the other material. I think this detracts from the clarity and value of the book.

Despite the quibble, you should read this book. It provides useful insights into the open source movement and its place in the future of software development.

Wednesday, October 27, 1999

Pot Pourri

This article appears in slightly different form in the September/October 1999 issue of IEEE Micro © 1999 IEEE.

This time I look at a loosely related collection of interesting ideas. Jini is a new technology from Sun. It aims at enabling you to network anything, anytime, anywhere. Java is a programming language and a platform. It has matured considerably since I first wrote about it (Micro Review, June, 1996). WinWriters is a support organization for online help developers. They recently put on a conference about JavaHelp. Adobe Acrobat is the principal tool for working with Adobe's portable document format (PDF).


Jini

Many people have recognized the potential of large numbers of mildly intelligent communicating devices. Devices as disparate as switches, lights, cameras, sensors, printers, vending machines, and large relational databases might all benefit from exchanging information. In the 1980s, Micro's editor-in-chief, Ken Sakamura, headed the TRON Project, a large cooperative effort based on this idea.

Java's roots are in embedded systems and consumer electronics. From the first, it was also intimately associated with networking. Jini brings these threads together to provide a model for networking arbitrary devices.

Many systems for distributed computing run afoul of the seven fallacious assumptions identified by computer scientist Peter Deutsch:
  • The network is reliable.
  • Latency is zero.
  • Bandwidth is infinite.
  • The network is secure.
  • Topology doesn't change.
  • There is one administrator.
  • Transport cost is zero.
Jini provides explicit support for the difficulties and failure modes that these assumptions try to hide. The result is a powerful mechanism built on a few simple ideas and protocols. Because it builds on Java, Jini leads developers to this mechanism along a short, familiar path.

Here is a slightly oversimplified explanation of what Jini is and how it works. It begins when someone starts a lookup service, which is essentially the only piece of system administration required. Devices that have services to offer can register with a nearby lookup service when they connect to a network. They don't need to know its location in advance.

Registering means leasing space on the lookup service for a small proxy program to be downloaded to devices seeking services of the registered provider's type. The proxy knows how to communicate with the provider and implements a standard interface for services of that type.
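The register/lease/lookup pattern can be modeled in a few dozen lines of ordinary Java. To be clear, the sketch below deliberately uses plain collections in a single process; it is not the real Jini API (the net.jini packages), whose types and signatures are beyond this overview. All names here are invented.

```java
import java.util.HashMap;
import java.util.Map;

// A service type a provider might offer and a client might seek.
interface PrintService { String print(String doc); }

// Toy model of a Jini-style lookup service: proxies are registered
// under a lease, and an unrenewed lease simply lapses -- which is how
// the model survives providers that crash or disconnect silently.
class LookupService {
    private final Map<Class<?>, Object> proxies = new HashMap<>();
    private final Map<Class<?>, Long> leaseExpiry = new HashMap<>();

    void register(Class<?> type, Object proxy, long now, long leaseMillis) {
        proxies.put(type, proxy);
        leaseExpiry.put(type, now + leaseMillis);
    }

    @SuppressWarnings("unchecked")
    <T> T lookup(Class<T> type, long now) {
        Long expiry = leaseExpiry.get(type);
        if (expiry == null || expiry < now) return null;  // lease lapsed
        return (T) proxies.get(type);
    }
}

public class JiniSketch {
    public static void main(String[] args) {
        LookupService lookup = new LookupService();
        // In real Jini the proxy is downloaded code that knows how to
        // talk to the remote device; here it is a local stand-in.
        PrintService printerProxy = doc -> "printed: " + doc;
        lookup.register(PrintService.class, printerProxy, 0, 1000);

        System.out.println(lookup.lookup(PrintService.class, 500).print("report"));
        System.out.println(lookup.lookup(PrintService.class, 2000));  // prints null
    }
}
```

The lease is the interesting part: it turns "the network is reliable" from an assumption into something the system never needs to believe.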

Jini has a concept of remote events. It is a simple model that separates the event mechanism from event policy, leaving the latter to individual applications. It facilitates the use of third-party event listening services, and it supports Jini transactions. Transactions are the key to designing robust, reliable distributed applications.

This brief overview of Jini doesn't capture all of the details--even all of the major details. I hope it gives you the general idea.


Core Jini by W. Keith Edwards (Prentice Hall, 1999, 812pp, www.phptr.com, ISBN 0-13-014469-X, $49.99)

Keith Edwards is a researcher at Xerox PARC, where he works on distributed object technology and user interfaces for information management. He was an early Jini user -- long before its first public release. He understands the underlying issues and philosophy, and he explains them clearly in this book.

Edwards' book is especially valuable, because he combines an intelligent discussion of the underlying concepts with a clear explanation of how to perform the basic tasks. He understands what developers need to know, and he explains the material patiently, using complete, working examples to illustrate the points.

Edwards sees Jini as resting on five key concepts: discovery, lookup, leasing, remote events, and transactions. He explains these concepts lucidly and shows how they fit together to make a complete workable system. If you read this explanation carefully, you will understand Jini at a significant and useful level.

On one hand, I'd like to say that everyone should read this book. This is surely true of the first 120 pages, which cover the conceptual material. On the other hand, this is a highly technical book. If you're not a working Java programmer, most of the book will make difficult reading. With that caveat, I recommend it highly.


Java

Java Power Reference by David Flanagan (O'Reilly, 1999, CD-ROM, www.oreilly.com, ISBN 1-56592-589-0, $19.95)

The Java Power Reference is a CD-based top-level quick reference for the Java 2 platform. It provides useful functionality not available elsewhere, but it falls short of what Flanagan could have done with the material he has already produced for his books.

Flanagan's Java in a Nutshell (see Micro Review, June 1996) has grown into a series of four books, making quick access and effective cross-referencing difficult. The material cries out for an online approach. Rather than taking this approach, Flanagan has decided to leave the detailed material to the books. He says:
So, while Java in a Nutshell contains a paragraph or two about each class and interface, the Java Power Reference contains a paragraph or two about each package.
At 107 megabytes, the Java Power Reference falls far short of filling the CD. I think Flanagan would do his users a service by copying those extra paragraphs from the books onto the CD. In fact, I'll be surprised if he doesn't do something like that after he brings all the nutshell books up to date.

The best thing about the Java Power Reference is that it gives you an overview of the Java 2 platform and lets you explore it. You can find a lot of tantalizing packages, but you won't find many words about what they do and how the designers intended you to use them.

The other important thing the Java Power Reference gives you is a search capability. You can search for any package, class, method, or field name, without having to hunt through the hierarchy for it. You can even perform wildcard searches for names you're not sure of.

The Java Power Reference is a useful supplement to the online documentation that comes with the JDK distribution. It's not everything it could be, but it's well worth the price.



WinWriters

The WinWriters organization (www.winwriters.com), based in Seattle WA, specializes in training and supporting developers of online help systems. Their principal, Joe Welinske, is an expert in the field and a good organizer. In May, 1999, I attended their JumpStart Conference for JavaHelp Technology.

JavaHelp is one of many HTML-based formats for online help. Because the JavaHelp API is based on the Java foundation classes (JFC), the JavaHelp format is well suited to providing online help for Java programs. It provides many excellent capabilities not available in other formats.

All the major vendors of online help authoring tools have announced support for JavaHelp, but Sun is not supporting the JavaHelp API aggressively. You can find the JavaHelp page on the java.sun.com website if you already know the URL, but the main pages have no obvious links to it. The field of HTML-based online help methodologies is extremely competitive, so JavaHelp is unlikely to succeed if Sun's support remains lukewarm.

Despite JavaHelp's uncertain future, I'm glad I attended the JumpStart Conference. WinWriters put together a focused series of presentations that covered the material. They integrated vendor presentations with talks by relatively impartial (and highly regarded) online help developers. They accommodated questions, but kept the program on schedule and deflected the occasional "I just want to hear myself talk" contributions.

WinWriters puts on many conferences of this sort. I haven't attended others, but I have heard only good reports of them. The WinWriters website contains summaries and supplemental information from recent ones. Be sure to visit it if you are interested in online help systems.


Adobe Acrobat

Adobe's PostScript language created the desktop publishing industry. It decoupled publishing applications from printer drivers by providing a universal output format. It made WYSIWYG possible by carrying the same output format to display screens.

While PostScript works well for communicating between computers and printing devices, PostScript files are not very good for document storage and interchange, because they are large and difficult to manipulate. Adobe's portable document format (PDF) addresses this situation. Adobe Acrobat and its companion program, the Distiller, convert PostScript files into much smaller PDF files. The freely available PDF viewer allows anyone to view or print PDF files.

These programs have been evolving for several years, and Adobe has just brought out version 4 in a convenient package.

Adobe Acrobat 4 (Adobe Systems, www.adobe.com, $99 upgrade)

Because the PDF reader is freely available and many websites provide some documents in PDF, I assume you are generally aware of the features and capabilities of version 3. Version 4 provides substantial improvements in many areas.

Acrobat 4 facilitates collaborative review of PDF documents through its support of annotations and digital signatures. PDF places annotations in separate layers on top of the original, so you can always see and recover the original. It lets you view and manipulate annotations in a number of ways. Acrobat 4 includes many annotation tools (pencil, clip art, highlighter, and so forth) that were not available in earlier versions.

Acrobat 4 also provides better and clearer options for font bundling and file compression. These were sources of problems with prior versions. It is increasingly likely that high-end print shops will accept PDF 4 documents as input. Adobe provides a mechanism for print shops to specify packages of options to allow you to prepare PDF files to their specifications.

You can convert documents to PDF much more easily than with earlier versions. In many cases you can simply drag documents to the distiller icon. Microsoft Office 2000 programs can all write PDF directly.

One of my favorite features is Acrobat's ability to capture individual HTML pages or entire websites. The resulting document contains all of the links in the original but resides in a single file. You can view or print it in formatted pages, without the usual awkward breaks of printed HTML pages.

Acrobat 4 is a significant improvement over Acrobat 3. It will make many people's lives a lot easier. If you haven't got it yet, rush out and buy it. It's a real winner.

Friday, August 27, 1999

Dynamics in Document Design. WebWorks Publisher

This article appears in slightly different form in the July/August 1999 issue of IEEE Micro © 1999 IEEE.

This time I review a book and a software package. The book sets out to help you create documents that are "less ugly and less confusing." The software package helps you produce printed and online hypertext documents from a single source. Its documentation could have benefited from the precepts in the book.   


Documents for Readers

Dynamics in Document Design by Karen A. Schriver (Wiley, New York NY, 1997, 592pp, ISBN 0-471-30636-3, $44.99)

Karen Schriver holds a PhD degree in rhetoric and document design from Carnegie Mellon University. She is a former professor and a former co-director of CMU's late lamented Communications Design Center. She now runs her own research and consulting firm. The essential messages of her book are the following: 
  • The way readers interact with documents is more important and more complex than most people realize.
  • Words and images interact in surprising ways to produce their combined effects on readers.
  • Designers can obtain useful information from classical rhetoric, experimental psychology, and usability testing. 
"Know your audience" is one of the main rules of technical writing. Creators of instruction manuals and online help systems routinely perform an audience analysis before they begin. Before Schriver's book, however, document designers didn't really know what to do with the information. Schriver shows that designers hit this dead end because the next step is neither simple nor mechanical.

Engineers, scientists, physicians, and many other professionals routinely carry out tasks that are neither simple nor mechanical. Their education and training give them the tools they need. Document designers, on the other hand, usually lack comparable education and training in document design. Their main resources are collections of valuable but superficial rules, and software packages that help them enforce those rules.

Designers who get beyond the cookbook level usually do so haphazardly -- slowly accumulating principles and techniques that pertain to their own contexts, but rarely generalizing them for the benefit of the entire profession. Schriver tries to "capture the texture of the choices" experienced document designers make and "represent the subtlety of the knowledge they rely on in carrying out their work."

Schriver's book is important, because it makes a real contribution in this direction. Through examples from her research and consulting, she helps us form a realistic picture of the people who use documentation and the problems they have doing so. Document designers who read this book carefully will come away with increased sensitivity to their audience. They will also have a clearer picture of the gaps they need to fill in their own knowledge and skills. Educators will come away with good ideas for improving curricula in technical and professional communication.

The book is scholarly, well organized, and a gold mine of ideas, but it would benefit from sharper editing. One example is the way Schriver describes her research work. Her audience -- "those who would like to further improve their work with verbal and visual language" -- has to wade through details of research procedure and methodology that have little to do with improving document designs.

Sharp editing would also reduce the 150 pages Schriver devotes to "situating document design." This is extremely interesting background material, and some of it is central to the main themes, but Schriver gives us little help in separating the wheat from the chaff.

VCRs are the symbol for inscrutable technical products with frustrating documentation. Two of Schriver's best examples deal with VCRs. I especially enjoyed reading about her attempts to use a pair of VCRs to edit videotapes she had made.

Schriver and a colleague recorded their efforts to connect the two VCRs, a cable outlet, a converter box, and a TV so that they could edit tapes and at other times watch or record cable TV programs. This is a predictable use of the equipment. The manufacturers could have foreseen and provided instructions for this task. Instead, before they found ways to simplify the problem, Schriver and her colleague confronted a problem space of over 5000 plausible combinations of wiring and settings. It took them ten hours to find the right combination.

Schriver concludes that changes to the product design and the documentation could have simplified the task. The main product changes she suggests are
  • Standard arrangement and naming of inputs, outputs, and functionality.
  • A display to help users monitor the effects of different wiring and settings.
The documentation changes are
  • Anticipation of and instructions for tasks users might wish to perform.
  • Illustrations to give users a mental picture of how signals pass through the system.
  • A clear distinction between settings that are essential for connecting and operating the equipment and those that merely support "creeping featurism."
  • Clear explanations of the effects and interrelationships of the possible settings and connections.

Her ten-hour ordeal also gave Schriver a chance to experience personally another of her research findings: people tend to blame themselves for not understanding faulty products and documentation, but in fact, most people read instructions, try hard to make things work, and don't give up easily.

A key part of Schriver's book deals with typography and space. Schriver weaves together theories about rhetoric, facts about type legibility, experimental findings of the gestalt psychologists, and practical techniques for arranging material in grids.  The strength of this section is the way it draws together many factors -- making it difficult to summarize. Document designers will come away with few prescriptive rules but with enhanced sensitivity to helping readers see the structure and interrelationships of the text.

The material about typography and space is complex but clear. Schriver's discussion of the interplay of words and pictures is a little fuzzier. Even though she is able to distill a page of guidelines from the material, Schriver acknowledges that this is an area of ongoing research. She analyzes an attractive but ineptly designed website to help show how important it is to pay attention to this area -- and how easy it is to make words and pictures work at cross purposes.

Schriver finishes this substantial book by offering evidence, in the form of research studies, that paying attention to reader feedback improves document quality. At the same time she acknowledges that we don't fully understand how to obtain, interpret, and use such feedback.

In summary, this is a valuable book for anyone who wishes to inform or persuade others through words and images. But don't expect answers on a silver platter. If you don't intend to work hard to put the lessons of this book into practice in your own work, you won't get much benefit from reading it.



Single-Sourcing

WebWorks Publisher 2000 for Windows (Quadralay, Austin TX, www.quadralay.com, $895)

In Silicon Valley and much of the rest of the world, Adobe FrameMaker is the tool of choice for designing printed manuals. Many FrameMaker users wish to produce hypertext versions of the same material for use on websites or as online help. The principal tools for producing hypertext, however, can do little with FrameMaker output.

WebWorks Publisher is the exception. It transforms FrameMaker output into a variety of hypertext formats. It achieves great flexibility through the following features:
  • Exploiting the structure that FrameMaker's paragraph, character, and table styles impose on the content.
  • Generating hyperlinks from FrameMaker markers (such as those used for cross-references and generated tables).
  • Enabling users to design templates for different kinds of hypertext pages (for example, contents, index, body).
  • Specifying all of its mappings as programs in a powerful macro language.
In short, WebWorks Publisher extracts every bit of structure that it can from the FrameMaker output, then allows you to control the mapping of each element to achieve whatever effect you wish.
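The core idea, mapping each named style to a hypertext rendering, is worth a small sketch. To be clear, this is not WebWorks Publisher's macro language; it is just the underlying style-to-markup mapping expressed in plain Java, with names invented for the example.

```java
import java.util.Map;

// Each FrameMaker paragraph style maps to an HTML element; unmapped
// styles fall back to a plain paragraph. A real tool would let you
// override every entry in this table, which is the source of the
// flexibility described above.
public class StyleMapper {
    static final Map<String, String> TAG_FOR_STYLE = Map.of(
        "Heading1", "h1",
        "Body", "p",
        "CodeSample", "pre");

    static String toHtml(String style, String text) {
        String tag = TAG_FOR_STYLE.getOrDefault(style, "p");
        return "<" + tag + ">" + text + "</" + tag + ">";
    }

    public static void main(String[] args) {
        System.out.println(toHtml("Heading1", "Installing the Widget"));
        // prints <h1>Installing the Widget</h1>
    }
}
```

Character styles, markers, and templates extend the same table-driven idea to links and page skeletons; the programming burden discussed below comes from writing those mappings yourself.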

So far, so good, but all this flexibility presents you with a substantial programming problem. Quadralay provides project templates that let you map your FrameMaker output in acceptable but unimaginative ways. To achieve the kind of flexibility most designers are likely to want, however, you need to write some macros. At that point you must confront Quadralay's documentation.

The product has no printed documentation, but Quadralay's online help is thorough, informative, and authoritative. It appears to have been lovingly written by the product's designers. The only problem is that it is not very helpful. For example, many users will start at the first topic under Using WebWorks Publisher, namely, Managing WebWorks 2000 Projects. This largely vacuous topic leads to the detailed Setting Project Options topic. This topic states clearly what each setting means, but it gives no information about why you might choose one setting over another. It doesn't help you think about how to set up a project that fits your work style.

I found myself slowly increasing my understanding as I dived into topic after topic. The information seems to be there. The authors provide lots of facts but little indication of their significance.

WebWorks Publisher is a powerful tool -- perhaps the only sensible choice for a FrameMaker user -- but learning how to use it is a research project. If you decide to use it, allocate plenty of time for learning, or hire a consultant to bring you up to speed.

Sunday, June 27, 1999

Bringing up the Rear

This article appears in slightly different form in the May/June 1999 issue of IEEE Micro © 1999 IEEE.

The last few years have seen substantial changes in the enterprise applications market. The traditional client/server architecture is giving way, except on Windows platforms, to multi-tier distributed architectures with generic browser clients.

The installed base is large enough to justify new client/server applications in the Windows environment. As Roger Sessions explained in his book COM and DCOM -- Microsoft's Vision for Distributed Objects (Micro Review, Mar/Apr 1998), Microsoft has a coherent strategy for distributed applications. This strategy leads to architectures somewhere between Windows-specific clients on a Windows-specific network and generic browsers on a generic intranet.

While strategists lay out grand plans on the marketing front, application developers and customizers scramble to identify and learn to use appropriate tools. This column looks at some old tools that assume new significance in this situation: Visual Basic, Perl, and online help authoring tools.


Visual Basic

Microsoft's Visual Basic 6 comes with excellent documentation. Within the framework of Visual Studio and the MSDN library, Microsoft has put together an exemplary package of procedural help and tutorial examples. Nonetheless, large numbers of third-party books try to complement the Microsoft documentation.

The three authors whose books on Visual Basic appear here look at the elephant from different viewpoints. Steven Holzner begins with Visual Basic fundamentals and small local problems. After the first 600 pages or so, he introduces more global issues, and by the time he reaches the index, he has covered everything thoroughly.

Dan Appleman doesn't bother with the material in Holzner's first 600 pages. He jumps into global issues from the beginning and focuses on giving you a thorough understanding of how to approach the design of reusable components. By the time he reaches the index, he has given back most of that 600-page head start. If Holzner wants to answer all your questions, Appleman wants to give you a deep understanding of the fundamental principles.

Ted Pattison uses only 300 pages to get from the introduction to the index. He covers much of the same material as Appleman, but a lot more concisely. He also gives examples of how to interface with Microsoft transaction processing capabilities. Pattison's book is more about Visual Basic's environment than it is about Visual Basic itself.

 
Visual Basic 6 Black Book by Steven Holzner (Coriolis, Scottsdale AZ, 1998, 1132pp plus CD, ISBN 1-57610-283-1, www.coriolis.com, $49.99)

Holzner and the team at Coriolis have put together a logically arranged, attractive, well designed, and extremely thorough book. It's a small thing, but I appreciate the fact that, except for the first and last fifty pages or so, the book lies flat and open to whatever page you turn to.

Holzner makes it easy to find out how to accomplish specific tasks. Most chapters begin with a table with columns labeled "If you need an immediate solution to" and "See page." After a brief "in depth" section, the chapter provides the promised solutions in short, well illustrated procedural essays.

If you plan to work with Visual Basic and expect to have questions like "How do I add buttons to a toolbar at runtime" or "How do I use a code component without creating an object," this is the book for you.


Developing COM/ActiveX Components with Visual Basic 6: A Guide to the Perplexed by Dan Appleman (Sams, Indianapolis IN, 1998, 888pp plus CD, ISBN 1-56276-576-0, www.samspublishing.com, $49.99)

Appleman's subtitle imitates the title of Maimonides' 800-year-old philosophical work, The Guide for the Perplexed. Appleman's point is that he hopes to augment the Microsoft documentation, not replace it. You may wish to draw other parallels between Appleman's book and talmudic commentary.

ActiveX (the technology formerly known as OLE) is a collection of object-oriented capabilities based on COM. Visual Basic is the front end that successfully hides ActiveX's complexities -- making it possible for average programmers to develop powerful reusable software components in minimal time.

Appleman's complaint is that Visual Basic hides ActiveX's complexities so successfully that programmers are left in the dark. The philosophy behind the Microsoft documentation is something like "You don't need to know how a car works to drive it." Appleman's answer might be "Yes, but you ought to know enough to understand why it's a bad idea to drive on a rough road or into a river or off a cliff."

Appleman's definition of an expert is someone who understands the fundamentals of a subject. He hopes to give you such a good grounding in using Visual Basic for ActiveX and COM development that you'll think all the techniques he describes are obvious.

Appleman's day job is creating reusable components for sale. That forces him to focus continually on all the right issues. He wrote this book from that perspective. If you want to build reusable components of high quality, this book is a very good place to start. 


Programming Distributed Applications with COM and Microsoft Visual Basic 6.0 by Ted Pattison (Microsoft Press, Redmond WA, 1998, 344pp plus CD, ISBN 1-57231-961-5, mspress.microsoft.com, $44.99)

Pattison thinks COM is the most important thing a Windows programmer can learn. Unfortunately, the early COM documentation was hard to understand without thorough grounding in C++. Pattison wrote this book to help Visual Basic programmers understand COM and to help C++ programmers understand how Visual Basic handles COM.

Pattison's book is short on examples, but the accompanying CD contains complete, functioning examples that you can run and examine. This is in keeping with Pattison's general approach. The material is all there. It's concise. You have to figure out the subtleties for yourself.

If you're a fairly sophisticated programmer and you want a quick tour of the COM basics without someone holding your hand, this book is an excellent choice.


Perl

Larry Wall's Perl language (Micro Review, October 97) is a wonderful example of collaborative software development. A key to making this collaborative effort successful has been the commitment and support of Tim O'Reilly and his company, O'Reilly & Associates. Originally known for a few definitive Unix books, O'Reilly & Associates has grown into the leading publisher of books about Perl, Linux, Apache, Python, Tcl, and other collaborative open source projects.

The common gateway interface (CGI) is the standard protocol for using server-side computer programs to add dynamic content to web pages. Perl has become the leading language for CGI programming.
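The protocol itself is simple: the web server hands the request to the program through environment variables and standard input, and the program writes headers, a blank line, and the page body to standard output. A minimal sketch in Python (the function name and HTML are illustrative, and a real script would also escape user input):

```python
# Minimal CGI-style response. The server passes request data in
# environment variables (QUERY_STRING, REQUEST_METHOD, ...); the
# program emits headers, a blank line, then the body on stdout.
import os

def respond(environ):
    name = environ.get("QUERY_STRING", "") or "world"
    body = "<html><body>Hello, %s!</body></html>" % name
    # Headers are separated from the body by a blank line.
    return "Content-Type: text/html\r\n\r\n" + body

if __name__ == "__main__":
    print(respond(os.environ), end="")
```

In 1999 the program behind such a script was most often written in Perl, for the reasons the column describes.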
  

Perl Resource Kit for Windows (O'Reilly, Sebastopol CA, 1998, 4 volumes plus CD, ISBN 1-56592-409-6, www.oreilly.com, $149.95)

The software in this kit is free. You pay for the books and the CD.

Perl is a moving target. As soon as a book or CD appears it begins to become outdated. Nonetheless, this kit is a definitive Perl distribution for the Windows environment. It includes hundreds of Perl modules from the comprehensive Perl archive network (CPAN). You have to start somewhere, and this snapshot of Perl is as good a place as any.

The Perl distribution contains both client side and server side components. The Perl Utilities Guide by Brian Jepson, one of the four books in the kit, leads you through the complexities of installing and configuring those components. It also gives you an overview of how to write, debug, and run Perl programs within the framework of the Windows component object model (COM).

Programming with Perl Modules by Erik Olson, the second book in the kit, gives examples of how to use the most popular CPAN modules to develop applications. Olson devotes a chapter to Lincoln Stein's CGI.pm, a module that supports CGI applications.

The Perl community values and rewards good documentation. Perl programmers produce documentation for their modules using a markup language called pod (for plain old documentation). The final two books in the kit contain David Futato's compilation of written documentation for many of the CPAN modules.


Learning Perl/Tk by Nancy Walsh (O'Reilly, Sebastopol CA, 1999, 376pp, ISBN 1-56592-314-6, www.oreilly.com, $32.95)

Visual Basic gives designers an easy, visual way to specify, arrange, customize, and program graphical elements. This has made it the most popular tool for producing graphical user interfaces (GUIs) for distributed applications.

The Tk toolkit, originally developed for the Tcl language, gives Perl many of the same GUI development tools that Visual Basic has. This book explains how Perl and Tk work together, and it provides step-by-step instructions for using all of Tk's graphical elements.

Unlike Visual Basic, which hides such details, Tk makes its geometry management explicit. Walsh covers this material carefully in a lengthy chapter. Reading it is a good way to learn to use Tk effectively.

The book does not teach Perl programming, but it is a worthwhile introduction to Perl/Tk.


Perl and CGI for the World Wide Web by Elizabeth Castro (Peachpit, Berkeley CA, 1999, 272pp, ISBN 0-201-35358-X, www.peachpit.com, $18.99)

The Peachpit Visual Quickstart Guide series, and Elizabeth Castro's books in particular, are examples of user help at its best.

Do you want to know how to reverse the contents of an array? Look in the index. Turn to page 99. There you find the simple procedure, an example of the code in context, two helpful (but tiny) screen shots, and three tips, including one that refers to the material about sorting arrays on page 98.
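Castro's examples are in Perl, where the built-in reverse function does the job in one line. For comparison, the same operation sketched in Python (the function name is mine):

```python
# Perl: @reversed = reverse @array;
# Python equivalent: a slice with step -1 yields a reversed copy,
# leaving the original list untouched.
def reverse_list(items):
    return items[::-1]
```

The point of Castro's format is that even an operation this small gets its own page, with context and cross-references.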

If you need to do Perl and CGI programming from time to time, this is a good reference to keep nearby.
 

Online Help Authoring Tools

Online help was once almost exclusively a Windows phenomenon. Help authors prepared RTF files and passed them through the Windows help compiler. Microsoft's WinHelp engine was the only way to display the resulting HLP files.

Nowadays there are other possibilities. The WinHelp engine, as it exists in current Windows systems, is more powerful than its predecessors. Nonetheless, Microsoft is phasing it out in favor of HTML Help. HTML Help is still Windows based and compiled, but an ActiveX-enabled browser can view much of an HTML Help file on any platform. More important from Microsoft's point of view is the fact that HTML Help blends seamlessly with web content.

At the same time, Sun and Netscape, trying to support their own views of distributed computing, have introduced formats for HTML-based online help. These have evolved into WebHelp, which uses both a Java applet and an ActiveX control to display help files on any platform.

Steve Wexler's book describes HTML Help and Microsoft's tools for producing help systems in that format. But the best tool for all formats, including HTML Help, is RoboHELP from Blue Sky Software.


The Official Microsoft HTML Help Authoring Kit by Steve Wexler (Microsoft Press, Redmond WA, 1998, 312pp plus CD, ISBN 1-57231-603-9, mspress.microsoft.com, $39.99)

Steve Wexler is a well-respected expert in the field of online help for Windows systems. He is the ideal person to describe Microsoft's approach.

The WinHelp engine is a standalone Windows executable and runs only on Windows systems. The HTML Help engine, on the other hand, is an ActiveX control, so it integrates tightly with a container application -- including an ActiveX-enabled browser on any platform.

If you want a clear description of how this works and how to use Microsoft's tools to produce HTML Help systems, this is the definitive book.


RoboHELP Office 7.0 (Blue Sky Software, La Jolla CA, www.blue-sky.com, $799)

I have talked about RoboHELP many times in these columns (see Micro Review, May/June 1998). Blue Sky has long been the premier producer of tools for creating online help, and they are working aggressively to stay that way. Building on the basic RoboHELP package, which originally supported only WinHelp development, Blue Sky has moved to support development of all online help formats.

Blue Sky uses a two-pronged approach to this problem. They offer a single source capability based on the original RoboHELP approach. They also provide a separate authoring environment, with essentially the same user interface, for producing HTML Help. From this environment, you can import a WinHelp-oriented project, then fine tune it for HTML Help.

Version 7 also introduces a number of productivity improvements over earlier versions. These make it easier to produce indexes, browse sequences, and modular help systems, and they simplify macro programming.

If you need to produce online help in any format for any platform, this is the product to use, even if you hate Windows.

Tuesday, April 27, 1999

Words of Wisdom

This article appears in slightly different form in the March/April 1999 issue of IEEE Micro © 1999 IEEE.

This time I look at two thin books in which highly respected engineers offer words of wisdom to undergraduates and experienced engineers alike.

Logical Effort: Designing Fast CMOS Circuits by Ivan E. Sutherland, Robert F. Sproull, and David Harris (Morgan Kaufmann, San Francisco CA, 1999, 240pp, ISBN 1-55860-557-6, www.mkp.com, $42.95)

Ivan Sutherland is the recipient of both the ACM Turing Award and the IEEE John von Neumann Medal. He and Bob Sproull are Sun fellows (and vice-presidents). Both are highly acclaimed experts in the design of graphics hardware and software. They published the theoretical kernel of this book in a paper in 1991.

David Harris teaches engineering at Harvey Mudd College. After developing many of the same techniques independently, he persuaded Sutherland and Sproull to let him rework their unpublished notes into this book.

Logical effort is a theoretical construct that measures the relative cost of computation inherent in the circuit topology that implements a logic gate's function. A logic gate may contain many transistors, and MOS transistors in series conduct electricity relatively poorly. Thus, some logic gates require substantially more logical effort than others.

Electrical effort is the ratio of the logic gate's output to input capacitances. The product of logical effort and electrical effort, plus a term that represents the parasitic capacitance of the logic gate, gives a relative measure of the time it takes a signal to pass through the logic gate.
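Putting the two preceding paragraphs into symbols (g for logical effort, h for electrical effort, p for the parasitic term, d for the resulting normalized delay), the model the book builds on is:

```latex
d = g\,h + p, \qquad h = \frac{C_{\mathrm{out}}}{C_{\mathrm{in}}}
```

Because d is a sum of simple terms, minimizing the total delay along a path reduces to elementary algebra, which is what makes the method tractable by hand.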

Because input capacitance depends largely on the sizes of the transistors that make up the logic gate, this analysis allows the authors to decompose signal delays into separate logical, transistor size, output load, and parasitic terms. The authors recognize that this is an approximation, but it is a useful one. It allows them, through relatively elementary mathematics, to arrive quickly at circuit designs that they can prove are close to optimal in speed.

The authors know that many good circuit designers have independently discovered some of their results, and they acknowledge that there are many heuristics that produce good results. Nonetheless, many designers -- especially beginners -- rely on the "simulate and tweak" method, and for those designers, the authors' method is a big improvement. As Sutherland says about the original work he did on this method in 1985, "I resorted to calculus for lack of circuit simulation tools. Instead of computing I had to think about the problem, a formula for success that I recommend highly."

This leads to one of the reasons the authors give for reading this book. It gives you a new way to think about circuits -- new rules of thumb to help you build your intuitive understanding. I think there is another good reason to read it. It's a wonderful example of how world-class engineers think about a problem. The problem is easy to understand, and with a little effort, most designers can follow the authors' reasoning and understand their results.

The book is excellent for students, because it works through many examples in great detail. Exercises encourage you to work through examples on your own -- the best way to incorporate the method into your intuitive thinking about CMOS circuits.

Like many good books nowadays, this one comes with its own website: www.mkp.com/Logical_Effort. There you can find answers to exercises, an extensive design example, and some software.

If designing circuits interests you at all, I think you'll appreciate this book.


The Practice of Programming by Brian Kernighan and Rob Pike (Addison Wesley, Reading MA, 1999, 288pp, ISBN 0-201-61586-X, 800-822-6339, tpop.awl.com, $24.95)

There are few software engineers more distinguished than Brian Kernighan and Rob Pike. Both have been at Bell Labs for as long as I can remember. Pike was an architect of the Plan 9 and Inferno operating systems. He works on software that makes writing software easier.

Kernighan and Pike's previous book, The Unix Programming Environment (Prentice-Hall, 1984), is a well-known classic. Kernighan is also known as the K in K&R, the name by which many refer to The C Programming Language (Prentice-Hall, 1978), an even more celebrated classic he coauthored with Dennis Ritchie. My favorites, though, are the works Kernighan coauthored with PJ Plauger: The Elements of Programming Style (McGraw Hill, 1974) and Software Tools (Addison-Wesley, 1976).

Steve McConnell's Code Complete (Microsoft Press, 1993) addresses the fact that most programmers know only scattered bits of the large body of practical knowledge about programming. They often go directly from school, where they don't learn practical programming, into isolated jobs without mentoring or apprenticeship. McConnell's answer to that problem is an 877-page systematic survey of professional programming practices. You can go off by yourself and, given enough time, read and understand everything in it.

Kernighan and Pike's book covers many of the same topics, but it is much shorter. The authors aim at being clear and concise, but they don't waste time explaining things you can find easily elsewhere and should probably already know anyway. To get the most out of the book, you should read it with the help of an experienced programmer or teacher who can explain some of the subtleties.

The cover of the book bears the message "simplicity, clarity, generality," and the book exemplifies those principles. Seven of the nine chapters have one-word titles. A section on error handling is called Abort, Retry, Fail? The text does not mention that famous message, but the error-handling principles it teaches help you make the connection.

It's hard to summarize a book that is already clear and concise. An appendix contains three pages of one-line statements of rules that appear in the book. They range from generalities (be consistent, be accurate, stick to the standard) to specifics (use else-ifs for multi-way decisions, parenthesize the macro body and its arguments).

The rules the authors list under debugging and testing are all generalities. This illustrates one of the difficulties of trying to teach programming from a book. Testing and debugging depend heavily on the facts of the case and the details of the environment. It is much easier to learn these skills at the side of a master programmer who is willing to think out loud while testing real programs and pursuing real bugs. Given that limitation, the authors do an excellent job of pointing out ways to make testing and debugging easier and more systematic.

Programming is a form of communication. The programmer communicates with the machine or operating system and with other programmers. Programs often communicate with each other. The book's chapters on style, interfaces, portability, and notation make you aware of these issues and give you tools to ensure that you and your programs communicate clearly and efficiently.

Programming languages are the foundation of programming communication. In one chapter the authors solve the same problem five times, using a different programming language each time. Today's top programmers must know many languages and know how to choose the right language for each job.

Computer books come and go -- usually quickly. Very few become classics. This one won't go quickly. I think it will become a classic.

Saturday, February 27, 1999

Macworld, Microsoft Woes, Handbook of Programming Languages

This article appears in slightly different form in the January/February 1999 issue of IEEE Micro © 1999 IEEE.

January in San Francisco means it's Macworld Expo time again, and I made my annual pilgrimage. January also means trying to recover from my annual revamp of my computer systems. Despite that effort and all the distractions of the holidays, I did find time to look at an amazing set of books.


Macworld Expo

Last year I wrote about how encouraging Macworld expo was. Long ago most of my work moved to the PC, but Steve Jobs made me feel glad that I had bought a new Macintosh. It was the same this year, only more so.

Macintosh users are loyal. Their numbers have declined over the years, but the ones who remain are nearly unshakable. All they need is a plausible excuse to hope for better times. This year's excuses are more than adequate.

In his keynote, Steve Jobs focused on Apple's desktop and server business. Reports of exciting new portables remained just reports.

Since releasing its entry-level machine, the iMac, in August, 1998, Apple has sold more than 800,000 of them, giving the Mac its first solid gain in market share in a long time. Jobs announced a faster processor, a larger hard drive, and a lower price, but he saved the most important announcement for last. The iMac now comes in five yummy colors: strawberry, lime, blueberry, tangerine, and grape. "Collect the whole set!" Jobs urged. They really are beautiful -- and apparently the result of much plastics research. I picked up a free full-color poster of them instead.

The Mac has a long uphill battle to regain anywhere near its early success. Much of the computing world has standardized on the PC and Windows, but Apple is using standards to help it come back. The Power Macintosh G3 line has done well, but Jobs rolled out a reinvented version that had his huge audience cheering wildly. The new G3 Macs have FireWire and USB ports, high-speed PCI slots, and ethernet. Apple has also licensed the OpenGL 3D graphics platform from Silicon Graphics. With an inexpensive software adapter, the new Macs can also play game CDs designed for Sony Playstations.

Apple has worked hard to give the G3 Macs superior graphics, and Jobs showed some impressive Photoshop and game benchmarks against 450 MHz Pentium machines. What impressed me the most, however, was the door. The case has the system board on the inside of one of the sides. A simple latch allows you to open the door and fold out the board. Everything you might want to get to is right there, easily accessible (but you can lock it if you need to). It's breathtakingly simple, and such a good idea, that I can't imagine why nobody ever made one like it before.

I have never seen a Steve Jobs show that wasn't exciting -- even when it was about something as ultimately insignificant as launching the Wingz spreadsheet on the Next machine (I still have my "Wingz World Tour" jacket). Jobs' keynote was long, but there were few letdowns. One announcement showed that Apple is committed to the thin client model -- an obvious winner in the education market, for example. Jobs showed a G3 Mac running Mac OS X server, a Unix system based on the Mach kernel. This system supports booting from the network, and he illustrated this by bringing out a rack of 49 diskless iMacs, booting them all from the server, and running different Quicktime movies on them simultaneously. Jobs didn't spell out the thin client story, but it's clear that Apple intends to be in on the big growth areas of the next decade.


Why Microsoft is Hard to Love

The subject of my January/February 1992 column was how I spent my vacation. Back then I had just spent nearly $500 to upgrade my Mac SE-30 from 2 mbytes to 8. This year I spent not too much more than that to upgrade one PC from 32 mbytes to 128 and another from 64 to 288.

Back then I upgraded my Mac SE-30 to System 7. This year I installed an ethernet network to connect my Power Mac 8600 and two of my PCs, and I upgraded my Mac to OS 8.5. I tried to find some single or dual-boot combination of Windows 98, Windows NT 4 Server, and Windows NT 5 beta Server or Workstation that would work with my PCs and their largely, but not entirely, plug and play peripherals.

I installed OS 8.5 with no trouble, but my Windows attempts have yet to pan out. After a great deal of trouble I managed to get my PCs working (on Windows 98) almost as well as they did before I started. But I learned a lot, and I expect to do better on my next attempt.

As I've said many times, I'm far from being a Microsoft hater. Microsoft software is amazingly powerful, and it keeps getting better. But even Microsoft can't keep up with the furious pace of change and competition in the industry. Like everyone else, they rush bug-filled products into the market in an attempt to stay on the cutting edge. While they invest huge sums in attempts to make their products easy to use, even helpful, they just can't seem to make them simple.

I am not a naive user, but I find the task of installing or upgrading Microsoft software to be a mine field. Partially present programs clutter my disks but defy removal. Mystifying error messages leap out at me. Devices and drivers don't quite match. Printing, scanning, fonts, graphics -- all work well most of the time, then fail mysteriously at inopportune moments.

There is a huge opportunity here. If someone can provide simplicity, reliability, and guaranteed usability, they can make a big dent in Microsoft's monopoly.

There! Now that I have that off my chest, I can go back to trying to install Windows NT on one of my PCs. In the end I'll be happy with it.


Handbook of Programming Languages

For one of my recent projects, I had to understand Tcl/Tk quickly. I didn't need detailed knowledge, but I needed good information quickly. I turned to the Handbook of Programming Languages. This review isn't about Tcl, but if you have anything to do with developing CAD software, you ought to find out about it. Check out www.tcltk.com.

Handbook of Programming Languages, Peter Salus, ed (Macmillan, Indianapolis IN, 1998, 2432 pp, 4 vol, www.mcp.com, ISBN 1-57870-008-6, 1-57870-009-4, 1-57870-010-8, 1-57870-011-6, $49.99/volume)

Peter Salus is a linguist and an author of computer books. He has been executive director of USENIX and the Sun User Group, and vice president of the Free Software Foundation. For ten years he was managing editor of MIT Press's Computing Systems. The 38 authors of articles in this handbook are a Who's Who of programming language design: Adele Goldberg on Smalltalk, James Gosling on Java, Bjarne Stroustrup on C++, Jon Bentley on little languages, Brian Kernighan on eqn, Bertrand Meyer on Eiffel, and many, many more.

The aim of the Handbook of Programming Languages is to provide a single comprehensive source of information for computing professionals about a broad range of programming languages. It achieves that aim. The handbook combines reprints, derivative articles, and original work, yet it keeps a uniform tone -- simple, clear writing about relatively arcane topics.

I haven't read every word of this handbook, of course, but every time I've turned to it, I've found what I wanted. It's comprehensive, well organized, and well written. If you work with programming languages, it belongs on your shelf.