The Myth of Single-Source Authoring
Single-source publishing is a zombie idea that revives itself periodically and refuses to stay dead. Its zombie supporters chant its purported benefits as a "write once, publish to many" promise and ploddingly pursue it as the ultimate goal of mechanized authoring and machine translation. As an object-oriented writing methodology, it is about as human as present-day robot technology: good only for conveyor-belt assembly or specialized tasks, and always very expensive to implement. Single-source publishing has no purpose in today's world of rapid information turnover and the dynamic shift from Web 2.0 to Web 3.0.
But there is hope that this living-dead entity can finally be buried once and for all. And who will be the emerging heroes to fulfill the promise of content reuse and localization savings? Knowledge mashups and applications built on cloud-based linked data and the emerging semantic Web.
Quick Note: This posting starts a periodic thread on content authoring and the collaboration of knowledge providers, for the individual as well as the enterprise. In my mind, integrating content to form ad hoc knowledge and combining services to form unique applications are interchangeable concepts when talking about mashups, cloud computing, and linked data. All come from the same fountainhead of processes and the same philosophy, with the same benefits.
Not Working as Promised…Again
After its death knell in the early '90s with SGML markup and the DocBook DTD, single-source authoring rose again with XML and DITA in the late nineties and early 21st century. IBM liked the idea early on but, with the advent of Web publishing and the quickening pace of information management, gave it up. Novell and other large corporations adopted it as a multi-platform solution for rendering content in different formats, relying on syntactical processing rather than semantic markup. But the overhead of highly structured writing and the evolution of multiple publishing formats made the practice obsolete. Still, some companies hang on to their archaic ideas of single-source authoring and, like the plot of the movie Weekend at Bernie's, continue to prop up this dead thing as a living entity just to keep the party going (and to justify their now-entrenched systems and management decisions). But like today's updated version of the zombie, single-source authoring will never be more than a corpulent, well-dressed stiff.
Taking on the Purported Benefits of Single-Sourcing
Single-source publishing promises that the same content can be used in different documents and formats to reduce writing and localization costs. Proponents claim that the expensive, labor-intensive work of setting metadata and reusing topics across documents and formats can be handled mechanically by automated tools, saving time and money. I want to dispel some of these arguments:
Publishing to Multiple Formats. The need for highly structured content in order to publish online help, PDF, printed materials, and Web content died in the nineties. These days, any content can be saved to all of these formats using the authoring tools already in the marketplace.
Reusing Topics. In theory, the ability to reuse content from a library of already written, edited, and translated topics saves time and leads to cost savings. It's like the difference between procedural, top-down coding and the now ubiquitous use of object-oriented programming. Sounds good as a theory, but writing and coding are far different endeavors.
In practice, single-source authoring rarely works. Code classes can be organized, accessed, and extended from root classes for specific needs, but they always rely on the same base-class functionality. The rules of a programming language are objective and rely on a compiler to translate exacting syntax into program operations. Conveying information, by contrast, is subjective, moving from the context of the writer to the context of the reader. One needs to satisfy finite compiler rules; the other needs to move information from the synaptic interconnects of one person's brain to those of another's. Writing relies on the de facto connotations of a language as it evolves organically in a society, while programming languages are de jure: unbending rules set by a software vendor or an open-source committee.
Trusting Others. Single-sourcing within a company requires one writer to generate a topic to be used by another. Not a problem for object-oriented programmers; it's an effective process and the status quo for programmers today. After all, each coding language includes a library of classes to be implemented or extended rather than reinventing each procedure. However, in my experience, information developers seldom reuse another writer's topic unless it is a basic glossary entry link. The needs of each communicator in imparting knowledge through an e-book, guide, or document are so vast and different as to make reuse not worth the time or effort.
And then there is my empirical knowledge of content reuse. As the manager of a technical writing team engaged in single-sourcing methods, I have seen that a writer seldom grabs a topic wholesale and places it into his or her document. Topics rarely meet all the author's needs and usually throw off the context and purpose of the document. At best, some parts of a document (a paragraph or two) can be referenced and reused as content references, the conref feature in DITA for example. But then, cut and paste proves effective here too.
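For readers who have not worked with conref, here is a minimal sketch of what such a content reference looks like in DITA. The file name, IDs, and wording are hypothetical, and the markup is stripped of the usual DOCTYPE declarations.

<!-- shared.dita: a topic holding elements meant for reuse (hypothetical example) -->
<topic id="shared">
  <title>Shared content</title>
  <body>
    <!-- the paragraph that other topics can pull in by reference -->
    <p id="backup-warning">Back up your data before upgrading.</p>
  </body>
</topic>

<!-- elsewhere, another topic reuses that paragraph with a content reference -->
<p conref="shared.dita#shared/backup-warning"/>

Even in a sketch this small, the reused paragraph carries none of its new document's context, which is exactly why wholesale reuse so often throws a document off.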
I can see where a single writer or a close-knit group of writers (two or three at most) can collaborate seamlessly at a workable level of lockstep writing. But for most organizations, planning and practicing content reuse rarely succeeds beyond the publication of a cookbook or other basic reference materials.
Localizing Documents. Translating content built from individual, reusable topics gives information developers a chance to demonstrate a real cost savings from single-sourcing. The argument goes that because translation is so expensive, reuse produces savings that can be shown on the ledger sheet.
I profoundly doubt this argument. Writing topic-based content requires dumbed-down, standardized information to fit the assembly-line process of single-source authoring. Topics need to follow a formula of concept, task, and reference topics strung together and watered down to meet the lowest common denominator of all the translated languages. This means the author has to omit the richness of each language. I understand that diluted language reflects the nature of writing for localization, but single-source strategies only add to the banal explanations and reference content not really needed by users. Customers need context and real-world knowledge.
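To make the formula concrete, this is roughly what a single task topic looks like in DITA. The element names are standard DITA, but the topic itself is a hypothetical sketch.

<task id="install-widget">
  <title>Installing the widget</title>
  <taskbody>
    <prereq><p>Obtain the installation package.</p></prereq>
    <context><p>Install the widget before configuring the service.</p></context>
    <steps>
      <step><cmd>Run the installer.</cmd></step>
      <step><cmd>Restart the workstation.</cmd></step>
    </steps>
  </taskbody>
</task>

Every task in the library follows this same skeleton; that uniformity is what makes the content translatable at scale, and it is also what flattens it.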
Many proponents of topic-based writing point to the savings of pairing it with automated machine translation (MT) rather than human translation, an argument that is probably hotly contested across localization companies. That debate is not something I want to get into here. I have even heard a high-ranking manager forecast that one day machines will do all the writing as well. Machine writing to machine translation to human readers. Good luck with that.
For now, setting up machine translation on top of topic-based writing requires a large investment that rarely, if ever, realizes cost savings. To become a believer in the merits of single-source authoring, I would need to see the total costs of staffing a localization team to painstakingly set up MT, and then have a forensic accountant study the time and effort spent adding metadata to each document and compiling its various components into a readable document. And then I would have to see a customer satisfaction study on how the lack of quality affects sales.
Authoring In-House. Practices in writing and reading content are changing rapidly. Just ask Rupert Murdoch or any newspaper publisher. Aggregators, bloggers, and social networks stand as the future for imparting much of the information we will consume. In-house authoring now competes with bloggers who are experienced subject matter experts and with the group editing inherent in social network postings critiqued by multiple readers. For the sole writer confronting the horror of the blank page, it's hard to compete with so much experience and intellect. Instead, social writing practices should be embraced and fostered.
In addition, the logistics of single-source authoring, which requires all writers to use a common database and authoring tools regardless of their location, cause many performance and security problems. Add to those logistical problems the emergence of online translation through Google Translate and other services, and the argument for single-sourcing and proprietary machine translation practices looks weak.
Single-Sourcing After Publishing
For information developers working with product, service, or development teams, the goal is to describe the features of the product or service as presented to them by internal experts. Consequently, they produce a feature-by-feature description of the product from the inside out. You want to travel on vacation? Well, first let me give you an encyclopedia of the features of the combustion engine. I may get around to a travel guide and maps later. In-house authors write from the perspective of features developed by the R&D, marketing, or product support teams rather than from the outside-in best practices and innovative uses needed by the customer. See Shotgun Communication for an in-depth view of corporate information problems and examples.
The main choice for me is between authoring static in-house documents with single-sourcing methods before publishing, and capturing information dynamically after publishing from online social networks, linked data sources, and knowledge mashups. The myth of single-source authoring is that it actually has a life in the future and remains a viable goal for many information developers. With so many mega-trends against it, the belief that static authoring from a single vantage point, by a single author paid by a single organization, is a workable system seems ludicrous. Instead, we should be looking to capture, sequence, and give context to the wealth of rich content already published on the Web. Collaborating with the many subject experts, authors, videographers, bloggers, tweeters, and writers coming together on the Web around shared interests will be powerful if it can be harnessed.
In a future posting, I will present my ideas for knowledge mashups and linked data objects that utilize the best of in-house authors to prime key discussions while giving stakeholders the knowledge and impetus they need to perform tasks specific to their unique needs.
November 18, 2009
· Michael Hiatt · 25 Comments
Tags: knowledge mashups, Linked data, single-source authoring, Single-source publications · Posted in: Contextual Data, Information management, Knowledge management, Linked data, Mashups, Single-source publishing