Tuesday, October 27, 1998

Working on the Web

This article appears in slightly different form in the September/October 1998 issue of IEEE Micro © 1998 IEEE.

Everything in this column relates in some way to working on the web. I have been working with website design issues recently, and it has been a frustrating experience. I attribute this to several factors: 
  • Industry innovation is far out in front of standardization.
  • New tools and new ideas arise quickly, then disappear just as they become familiar.
  • Many tools are primitive, bug-ridden, and ill-documented -- the results of brutal competition and impossible schedules.
  • Efficient operation requires integration of operating systems, browsers, communication hardware and software, transmission protocols, and Internet service -- all from a varied array of suppliers.
  • I am my own system integrator.

Too Many Pieces

If the above seems overly critical and pessimistic, I hope the following story puts my remarks in perspective.
 
Two years ago my Internet service provider (ISP) was C2Net, which at the time had an office three blocks from mine. I dialed into their system for my Internet connection and maintained a Unix shell account there. I used SoftQuad's HoTMetaL Pro to build a website on my local PC, then transferred the files to the C2Net machine via FTP. I used the Eudora email program on my local PC, interacting over the Internet connection with C2Net's POP and SMTP servers. I was pretty much a one-stop shopper.
 
As a backup I acquired another Internet connection via a local call to the Microsoft Network (MSN). If, as sometimes happened, I had trouble dialing into the C2Net modem bank, I connected to the Internet via MSN. This gave me direct access to the C2Net POP and SMTP servers and access via Telnet to my shell account.

This worked well until C2Net decided to get out of the ISP business about a year ago. They arranged a smooth transition of my website, shell account, and electronic mail to my current ISP, Infonex. Unfortunately, Infonex is 400 miles from my office, and they have no local access number, so my MSN backup became my only path to the Internet. Fortunately, it is a highly reliable path. I should have been suspicious when everything went so smoothly.

Electronic junk mail (spam) is a growing problem. It has motivated a large number of security measures, and one of them affected me. Last year, Infonex informed me that I could no longer use their SMTP server to send mail, because I connect to the Internet via MSN.

Fortunately, MSN had recently installed an SMTP server, and I started to use it, giving me an unusual but logical configuration. I was sending Infonex mail via the MSN SMTP server and receiving it via the Infonex POP server, using Qualcomm's Eudora email client for both. This worked well for about a year. Then, recently, I once again became a casualty of the anti-spam wars.

The MSN SMTP server had become a favorite tool of spammers, so MSN instituted a secure procedure to prevent its unauthorized use. Unfortunately, the procedure also prevented me, an authorized user, from sending email, because Eudora does not have that protocol in its repertoire. In fact, the protocol appears to be proprietary to Microsoft.
 
I am not usually as critical of Microsoft as many of my colleagues are, but I am disappointed by the way Microsoft handled this situation. They sent no notice of the change, and they sent no error indications. The server simply accepted my mail and appears to have thrown it away. It took me two days to figure out why nobody was replying to my messages. Then after listening to baffled mumblings from Microsoft's technical support staff for another two days, I finally reached someone who gave me a workaround. "We've had this for a week," he said.
 
The workaround was a temporary alternate SMTP server. It worked for another week, then started doing the same thing. This time it took me only six hours to notice that my mail wasn't arriving at its destination. MSN technical support assured me that their SMTP server was working properly, and it did in fact work from my MSN account. I couldn't get Eudora (or Microsoft's Outlook Express) to send my Infonex mail, and the MSN folks informed me that they didn't feel obliged to help. Two days later, without notice, it started to work again. The delayed mail arrived in a bunch. Perhaps the mail that disappeared a few weeks ago will show up some day too.
 
One advantage of the piecemeal approach is that not everything fails at once. My MSN Internet connection has trouble passing email through its own SMTP server, but it handles my web publishing FTP transfers without difficulty. I use HoTMetaL Pro 4 (Micro Review, Oct 97) to manage those transfers. I still like HoTMetaL Pro 4, though I have run into a few problems. I won't go into all I've found out about its strengths and weaknesses, because a new version is imminent. I expect the new version to support XML, and I look forward to reviewing it when it appears.
 

Books

The above story illustrates how many and varied are the things that can go wrong. Your best protection is an informed guide. Most of us can't rely on a system administrator for every problem. The next best thing is usually a good reference work. Nowadays, you can look for information on many useful websites, but a good book is often a better choice. It may not be as current, but it is usually better edited, better organized, and more carefully checked.

 Web Site Engineering by Thomas A. Powell et al. (Prentice Hall, Upper Saddle River NJ, 1998, 334pp, ISBN 0-13-650920-7, (800) 382-3419, www.phptr.com, $39.95)

Thomas Powell is a computer science instructor at UC San Diego, the developer of a web publishing certificate program, and an independent Internet consultant. His book is the first I've seen that restates well-known principles of engineering and software development management in the context of website development.

Website designers -- if they take a systematic approach at all -- frequently apply principles of document design to building their sites. Those principles are helpful, but they don't address the most important problems that designers of large, active sites face.

Powell looks at a complex website as a piece of software designed to run in a variety of configurations -- not all of which the designer can anticipate. Software design principles have not made software design easy, but they have made it manageable. Powell expects his design principles to have the same effect.

Powell recognizes that websites, unlike most software, depend for their success on their look and feel and on their content. Traditional software development methodologies don't address those aspects effectively. Websites must also function in an environment that is more complex and unpredictable than traditional client/server configurations. Powell outlines methods for assessing the characteristics and capabilities of each user's environment and deciding dynamically what files, formatting, and client-side scripts to send into that environment.
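
To make the idea concrete, here is a minimal sketch, in 1998-vintage JavaScript, of the kind of capability test such a method builds on. The target page names are hypothetical, and the code illustrates the general technique rather than anything taken from Powell's book:

    // Gateway page script: route the visitor to a version of the site
    // that matches the browser's capabilities.  The file names
    // (home-ie4.html and so on) are invented examples.
    var version = parseInt(navigator.appVersion);
    if (document.all) {
        // Internet Explorer 4 exposes the document.all collection
        window.location.replace("home-ie4.html");
    } else if (document.layers) {
        // Navigator 4 exposes the document.layers collection
        window.location.replace("home-nav4.html");
    } else if (version >= 3) {
        // An older scriptable browser: send a simpler page
        window.location.replace("home-basic.html");
    } else {
        // No useful capabilities detected: plain HTML only
        window.location.replace("home-plain.html");
    }

A server-side variant can make a similar decision from the User-Agent header before sending anything at all.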

Powell's approach starts with traditional project management steps: defining the problem, exploring the concept and analyzing its feasibility, then formalizing the requirements analysis and specification. These become the head of a cascade of project lifecycle steps: prototyping, implementation and module testing, system integration and testing, deployment, and maintenance.

Engineering entails measurement. A designer following Powell's methods must establish measurable criteria for success. Correlating site activity with objective measures of business success is difficult. Powell offers suggestions, but if he had a good way to do this, he wouldn't need to write books for a living.

While focusing on process, Powell also provides practical advice on how to address specific aspects of design and implementation. Nonetheless, this is not really a reference book, as its skimpy index emphasizes. Plan to read the book once to get a good overview of how to approach website design, then turn to other books for detailed help.


 Dynamic HTML: The Definitive Reference by Danny Goodman (O'Reilly, Sebastopol CA, 1998, 1096pp, ISBN 1-56592-494-0, (800) 998-9938, www.oreilly.com, $39.95)

Danny Goodman is a well-known and well-respected author of computer books. His Complete HyperCard Handbook (Bantam, 1987) was justifiably famous and sold enormously well. His JavaScript Bible has recently come out in its third edition (IDG, 1998). His books have won prestigious awards.

Goodman wrote this enormous tome because he couldn't keep straight all of the changing standards, undocumented features, and browser incompatibilities and idiosyncrasies. And if he, totally immersed in this material, can't keep it straight, what chance do the rest of us have?

The first thing Goodman explains is that there is no such thing as dynamic HTML (DHTML) -- or rather that there is a large, ill-defined collection of concepts, standards, and browser features that fall under that heading. In many cases the various elements are battlegrounds for commercial competition -- mainly involving Microsoft and Netscape. Another of Goodman's objectives for the book is to present the elements of DHTML in a non-partisan way.

Goodman devotes the first part of his book -- less than 200 pages -- to an explanation of how the Netscape and Microsoft approaches differ. He explains how to develop applications that run on both browsers.
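
To give a flavor of what such cross-browser code looked like at the time, here is a hedged example of the pattern Goodman documents; the element name "banner" is invented, and the sketch is mine, not Goodman's:

    // Move a positioned element under whichever document object model
    // the browser provides.  The element name passed in ("banner" in
    // the call below) is a hypothetical illustration.
    function moveElement(name, x, y) {
        if (document.all) {
            // IE4: pixelLeft and pixelTop accept plain numbers
            document.all[name].style.pixelLeft = x;
            document.all[name].style.pixelTop = y;
        } else if (document.layers) {
            // Navigator 4: positioned elements are Layer objects
            document.layers[name].moveTo(x, y);
        }
    }
    moveElement("banner", 100, 50);

Every such operation needs its own pair of branches, which goes a long way toward explaining why a reference this size earns its keep.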

The remainder of the book -- over 800 pages -- is reference material, including nearly 100 pages of indexes. It provides complete documentation of every feature of HTML, the document object model (DOM), style sheets, and JavaScript. It identifies undocumented features, features that behave differently on Netscape and Microsoft browsers, features that behave differently from the way they are supposed to, and features that don't behave logically or consistently at all.

If you plan to produce professional websites with dynamic content, you need a reference like this one.


 HTML 4 for Dummies Quick Reference by Deborah S. Ray & Eric J. Ray (IDG, Foster City CA, 1998, 240pp, ISBN 0-7645-0332-4, www.idgbooks.com, $14.99)

I have to say at the outset that I know Eric and Deb Ray and have the highest personal regard for them, so you have to take all the wonderful things I say about their books with a grain of salt. Still, they are all true.

The Rays have produced several books about HTML 4. Their Mastering HTML 4.0 (Sybex, 1997) is a piece of engineering in its own right -- almost too heavy and thick to lift with one hand, yet opening easily to any page and lying nearly flat. It looks wonderful, and the prose is easy to read. Someday I'd like to read it. My only reservation is that it might have come out too early to be authoritative.

They aimed their Dummies 101: HTML 4 (IDG, 1998) at beginning to intermediate website designers. If you work your way through it, you can wind up with a professional-looking personal site.

The quick reference aims to serve all HTML 4 users. Its explanations are accessible to beginners, but they are sufficiently thorough to satisfy advanced users.

Elizabeth Castro's Visual QuickStart Guide: HTML for the World Wide Web (Peachpit, 1996) is an outstanding example of purely task-oriented documentation (see Micro Review, Aug 96). The Rays aim at task orientation in their quick reference, but they blend it with explanatory material. Not only can you find a step-by-step procedure for the task you're trying to accomplish, but you can also read the relevant background material. This kind of just-in-time exposition helps you learn and remember facts and general principles that you might not absorb as easily in another context.

The book is small and light, with a spiral binding that makes it lie flat at any page. Unfortunately, the small pages have forced the Rays to reduce their screen shots to an almost unreadable size. The text, however, is clear and easy to read. Navigation and orientation aids make the book easy to use. A little more attention to the index, however, would have made the book an even better tool.

I think the best thing about this quick reference is the blend of task orientation and exposition. If you keep it by your side while you work on your website, it will answer your questions and help you deepen your knowledge. I recommend it.


 XML: Principles, Tools, and Techniques -- World Wide Web Journal, v.2, no. 4, Winter 1997, Dan Connolly, ed. (O'Reilly, Sebastopol CA, 1997, 258pp, ISBN 1-56592-349-9, (800) 998-9938, www.oreilly.com, $29.95)

Pundits have described the Extensible Markup Language (XML) as SGML lite or heavy-duty HTML. In fact, it is a subset of the Standard Generalized Markup Language (SGML), surrounded by a collection of web-oriented tools and standards. Unlike SGML, which made a lot of noise but few inroads into mainline applications, XML is creeping, relatively silently, into the underpinnings of basic PC and web-based tools.
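
For readers who have never seen XML, here is a minimal, hypothetical fragment; the element names are invented, which is exactly the point, since XML lets each document type define its own vocabulary:

    <?xml version="1.0"?>
    <review>
      <title>Web Site Engineering</title>
      <author>Thomas A. Powell</author>
      <publisher year="1998">Prentice Hall</publisher>
      <price currency="USD">39.95</price>
    </review>

Unlike HTML, every element must be closed, tags are case-sensitive, and the document must have a single root element -- the strictness that makes XML easy for software to parse.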

It's almost too late to talk about this issue of the World Wide Web Journal. When it appeared about a year ago, it gathered in one place the relevant XML specifications and technical papers. Many of these are now a little dated. The expository papers, on the other hand, provide a great deal of insight into the underlying ideas.

Perhaps you don't feel a need to learn anything about XML right now. Buy this issue of the World Wide Web Journal and keep it on your shelf. When you suddenly find yourself needing a roadmap to the new technology, you'll know where to look.