Tuesday, September 1, 2015

DITA


The Darwin Information Typing Architecture (DITA) emerged from a long line of internal IBM projects based on an earlier markup language called SGML. Approximately 10 years ago, IBM bequeathed DITA to the world as an open source project. You can read all about it at dita.xml.org. DITA is based on the idea of semantic markup, that is, embedding metadata in a document to describe the structural roles of its elements without prescribing formatting for those elements. This has many benefits but can impose a large overhead cost on writing projects. Managers of small projects find it hard to justify that overhead, but tools keep getting better and simpler.

DITA is highly flexible, but in its most common use it marries semantic markup with another long-developing trend in technical writing: topic-based writing, that is, writing small, independent topics that can be assembled into documents and help systems by means of external structural descriptions called maps. This enables reuse and single-sourcing. When the resulting material needs to be translated into additional languages, this approach can save large sums of money. If you say something in only one place, then you don’t have to translate many similar versions of the same information.
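To make the reuse idea concrete, here is a minimal sketch in Python. It assumes two made-up maps and made-up topic file names, and it treats a map as nothing more than an ordered list of topicref elements. Real maps carry much more (nesting, metadata, processing attributes), so read it as an illustration of single-sourcing, not as a description of how any publishing tool works.

```python
import xml.etree.ElementTree as ET

# Two hypothetical maps. Both pull in the same topic file,
# so that topic is written (and translated) only once.
USER_GUIDE = """<map>
  <title>User Guide</title>
  <topicref href="installing.dita"/>
  <topicref href="configuring.dita"/>
</map>"""

QUICK_START = """<map>
  <title>Quick Start</title>
  <topicref href="installing.dita"/>
</map>"""


def topics_in(map_xml):
    """Return the topic files a map references, in document order."""
    root = ET.fromstring(map_xml)
    return [ref.get("href") for ref in root.iter("topicref") if ref.get("href")]


# Topics that appear in both deliverables.
shared = set(topics_in(USER_GUIDE)) & set(topics_in(QUICK_START))
print(sorted(shared))
```

The topic files themselves never mention either map; all of the assembly information lives in the maps.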

DITA’s version of topic-based writing rests on the idea that each topic can consist purely of one type of information: concepts, reference material, or procedural instructions. Unlike its underlying markup language, XML, DITA uses a system of specialization and constraints rather than arbitrary extensions, so DITA-based projects with different customizations can make sense of each other’s markup and share topics easily.

DITA also provides mechanisms for decoupling cross-references from content, making sharing and reuse easier. Using maps to define documents as combinations of topics is one aspect of that decoupling. The other is an indirection method called keys, which enables dependencies to be confined to maps. A topic can refer to another topic -- or even a bit of text -- using a key, and different maps can associate that key with different topics or bits of text. The contents of the topic do not need to change.
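The following Python sketch illustrates that indirection under simplifying assumptions. The key names, file names, and product names are invented, and the resolver is a toy: it lets the first definition of a key win and ignores the rest of the precedence rules in the DITA specification. It is not how the DITA Open Toolkit resolves keys.

```python
import xml.etree.ElementTree as ET

# A topic fragment that names its targets only by key (hypothetical content).
TOPIC = '<p>Install <ph keyref="product-name"/> before you <xref keyref="setup"/>.</p>'

# Two hypothetical maps that bind the same keys to different things.
MAP_BASIC = """<map>
  <keydef keys="product-name">
    <topicmeta><keywords><keyword>Widget Basic</keyword></keywords></topicmeta>
  </keydef>
  <keydef keys="setup" href="setup-basic.dita"/>
</map>"""

MAP_PRO = """<map>
  <keydef keys="product-name">
    <topicmeta><keywords><keyword>Widget Pro</keyword></keywords></topicmeta>
  </keydef>
  <keydef keys="setup" href="setup-pro.dita"/>
</map>"""


def key_space(map_xml):
    """Collect key bindings from a map; the first definition of a key wins."""
    bindings = {}
    for element in ET.fromstring(map_xml).iter():
        keys = element.get("keys")
        if not keys:
            continue
        keyword = element.find("topicmeta/keywords/keyword")
        text = keyword.text if keyword is not None else None
        value = element.get("href") or text
        for key in keys.split():
            bindings.setdefault(key, value)
    return bindings


def resolve(topic_xml, map_xml):
    """Map each keyref in the topic to whatever the given map binds it to."""
    bindings = key_space(map_xml)
    return {
        element.get("keyref"): bindings.get(element.get("keyref"))
        for element in ET.fromstring(topic_xml).iter()
        if element.get("keyref")
    }


print(resolve(TOPIC, MAP_BASIC))
print(resolve(TOPIC, MAP_PRO))
```

The same unchanged topic yields different product names and link targets depending on which map it is processed with.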

While you are free to define your own means of transforming a map and the topics it refers to into a document, most projects build their DITA production on top of the DITA Open Toolkit (www.dita-ot.org), a set of Java-based open source publishing tools. The combination of DITA and the toolkit presents a steep learning curve for most writers, and the available support – not bad, but about average for open source projects – makes the climb even harder. This situation cries out for third-party books, and there are a few.

In the July/August 2014 Micro Review I recommended DITA Best Practices: A Roadmap for Writing, Editing, and Architecting in DITA by Laura Bellamy et al. It’s an excellent book, but I have nothing new to say about it. In the July/August 2006 Micro Review, I wrote about the first edition of the Comtech book Introduction to DITA -- A User Guide to the Darwin Information Typing Architecture. DITA was just out, and the book showed signs of being rushed into print. This time I look at the second edition. Finally, anyone who wants to understand the thinking behind the DITA standard should read Eliot Kimber’s book, DITA for Practitioners, supposedly the first of two volumes, though it has been out for more than three years and he hasn’t started writing the second volume yet. I talk about that book here as well.


DITA for Practitioners, Volume 1: Architecture and Technology by Eliot Kimber (XML Press, Laguna Hills CA, 2012, 348pp, ISBN 978-1-937434-06-9, xmlpress.net, $29.95)

Eliot Kimber really knows DITA. He is a DITA consultant and a voting member of the OASIS DITA Technical Committee. He has written a book for “people who are or will be in some way involved with the design, implementation, or support of DITA-based systems.” The book is not for authors who just want to use DITA, though everyone who works with DITA can benefit from learning its architecture and main technical features. For example, many authors would benefit from understanding the indirect addressing provided by keys, but books aimed mainly at authors usually tiptoe around that topic. Because I have a technical background that includes system architecture and design, this is my favorite DITA book. But I certainly understand why DITA users without that background might prefer books that more specifically target their needs and concerns. JoAnn Hackos’s book, described elsewhere in this column, is closer to that category.

Kimber makes the point that just as there are many XMLs -- which makes it difficult to teach someone to use XML -- there are also many DITAs. Authors of how-to books must pick a specific way of using DITA (usually, something akin to designing topic-based online help systems) before they can provide clear, simple instructions and examples. Kimber’s approach is to survey the architecture as an introduction to the DITA standard, focusing on the parts that might confuse experienced XML practitioners. With that background you can then read the standard. With this approach it might be months before you can apply DITA to your documentation projects, but when you do, you’ll know what you’re doing, why you’re doing it, and how to investigate and correct problems.

Fortunately, Kimber provides an intermediate path. The longest chapter of his book (102 pp) is a tutorial, though a more conceptual than procedural one. It covers all the main steps in producing a DITA-based publication. Reading it exposes you to the main aspects of using DITA. His procedural steps, however, are not always simple and direct. Here, for example, is a step in a procedure to reuse the topics of an online help system to create a printed version:

4. In the DOCTYPE declaration, change “DITA Map//” to “DITA BookMap//” and “map.dtd” to “bookmap.dtd”

Note the uppercase “M” in BookMap.

You don’t actually have to change “map.dtd” to “bookmap.dtd” because you should always be resolving the public ID, not the system ID, for the DTD. But people will get confused if you don’t change it.
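For readers who would rather script that edit than make it by hand, here is a rough Python sketch of the two substitutions the step describes. The function name and the sample document are my own inventions, and the sketch assumes a simple DOCTYPE declaration with no internal subset. It does only what this one step says; the root element name and anything else the surrounding procedure changes are not handled.

```python
def map_doctype_to_bookmap(source):
    """Apply the step's two substitutions inside the DOCTYPE declaration only."""
    start = source.find("<!DOCTYPE")
    if start == -1:
        return source  # no DOCTYPE declaration, nothing to do
    end = source.find(">", start)
    if end == -1:
        return source  # malformed declaration, leave it alone
    declaration = source[start:end + 1]
    if "DITA Map//" not in declaration:
        return source  # not a plain map declaration (or already converted)
    converted = (
        declaration
        .replace("DITA Map//", "DITA BookMap//")  # public ID, uppercase M
        .replace("map.dtd", "bookmap.dtd")        # system ID
    )
    return source[:start] + converted + source[end + 1:]


# Hypothetical input document for illustration.
HELP_MAP = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<!DOCTYPE map PUBLIC "-//OASIS//DTD DITA Map//EN" "map.dtd">\n'
    '<map/>'
)

print(map_doctype_to_bookmap(HELP_MAP))
```

Running the function a second time on its own output changes nothing, because the guard looks for the original public ID before substituting.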

The best thing about this book is the sense it gives you of an ongoing technical conversation within the DITA community. For example, in discussing the DITA 1.2 key reference facility, he talks about a limitation in the way DITA constructs the global key space, then adds, “Without [the limitation], we would not have had time to get any indirection facility into DITA 1.2.” This tells me that key scoping is not a mysterious fact of life, set in stone, but a technical feature that DITA architects continue to try to make more flexible.

Sometimes the conversation goes against Kimber. For example, he notes that many DITA users use keys for variable text like product names. He points out that this implementation falls short of how programmers expect variables to behave and advocates that DITA provide a separate variable mechanism – a position that the rest of the DITA Technical Committee disagrees with. This sort of information is fascinating, but of little use to readers. It is one of the ways in which this book is like no other DITA book.

If you really want to know how DITA works, if the idea of understanding and even participating in this kind of technical conversation appeals to you, you should read this book.



Introduction to DITA, 2nd ed: A User Guide to the Darwin Information Typing Architecture Including DITA 1.2 by JoAnn Hackos (Comtech, Denver CO, 2011, 430pp, ISBN 978-0-9778634-3-3, www.comtech-serv.com, $50.00)

JoAnn Hackos’s name did not appear on the first edition of this book, but she founded Comtech Services in 1978 and has been its leader ever since. She is the author of several well-known and highly respected books on managing technical communication. She is a Fellow and former President of the Society for Technical Communication (STC). She is known for being thorough and methodical – in her books and in highly regarded seminars, workshops, and conferences. Her workshops are expensive, but people seem to find them worth the price.

This book is much more clearly a tutorial than Kimber’s book, but Hackos does not aim just at authors. She includes tutorials for system architects as well. She covers every aspect of setting up and using DITA to support topic-based authoring, but she says little about the technical decisions that underlie the publishing system she helps you set up. She calls the book a reference manual as well as a learning tool; that is true in the sense that most readers will not go through all of the tutorial topics. They will learn the basics, start writing their own documents, then come back for the more advanced parts when they run into something they don’t know how to do.

Hackos spells everything out, and the result uses the print medium inefficiently. This is typical of workshop handouts, which are often distributed as large three-ring binders, but not so common in published books. If you buy this book, you pay extra for the redundancy, but you’re never in doubt about the context of what Hackos is saying.

While Hackos is careful about the technical accuracy of her examples, the text is, surprisingly, not well copyedited. I bought my copy of the book directly from Comtech just a few weeks before I started writing this column – years after this edition came out – but the book still contains errors that a competent editor could have corrected before publication, or even in a subsequent reprinting. Sadly, the lack of editing of technical books is widespread, but given the cost of this one and the prominence of its author, I’m disappointed that the editing isn’t better.

Many readers will find this book too thorough and methodical for their taste. They will be frustrated by the slow pace of the tutorials. But if you persist, you will know the basics, and the later chapters cover material that most how-to DITA books don’t. If you’re new to DITA and you want to buy just one DITA book, this one is a good choice.


Windows 10


Recently Microsoft started bombarding me with notices about Windows 10. I had been running Windows 7 and had seen – and disliked – instances of Windows 8.1. I am usually cautious about operating system upgrades. I wait until the new version has been out a while. But I had heard good things about Windows 10 and sensed that Microsoft was making a special effort, so when the little window popped up at the bottom of my screen to tell me that my free upgrade to Windows 10 was on my machine and ready to install, I said “Go for it!”

I had seen a number of posts about how to respond to the Windows 10 privacy options. So when the installer asked if I’d like the default settings, I said no and turned off anything that seemed at all problematic. I am sure there are other settings that they don’t let you turn off as easily, or at all, but I felt I had done what I could. If you search online for information about Windows 10 privacy settings, you should find lots of guidance.

The installation and startup were the simplest and smoothest I have ever seen, and I have seen all Windows upgrades since version 3.1. When it was done, everything was in place and running, and it was hard to notice the small differences from Windows 7. I have been running Windows 10 for more than a week and have had no trouble. Chrome and Firefox quickly adapted, and it wasn’t hard to turn off Edge (the new Internet Explorer).

Once everything was running, I upgraded Office to Office 2013. That also went relatively smoothly, though I had some trouble with Outlook PST files. Microsoft had known for more than a year, but didn’t bother to tell me, that I had to upgrade them explicitly to Office 2013 format. When I did so, they worked fine.


That one glitch aside, I am amazed at how smoothly it all went. Watch out for the privacy settings, but if you run Windows, be sure to upgrade to Windows 10.

This article appears in slightly different form in the Sept/Oct 2015 issue of IEEE Micro © 2015 IEEE.
