Bumps on the Road to Document Exchange Nirvana

The OpenDocument Foundation has announced its plans to sever itself from participation in or further advocacy of its namesake office document format in favor of the World Wide Web Consortium’s XHTML (Extensible HTML)-based Compound Document Format.

Although the OpenDocument Foundation is a fairly small organization, the group sports a certain cachet that stems from the ODF-to-MS Office plug-in that the group announced–but did not release publicly–about a year and a half ago.

At the heart of the rift between the Foundation and the rest of the ODF backers–led by Sun and IBM–lies a dispute over the proper strategy for achieving round-trip document fidelity between Microsoft Office and ODF-consuming applications, such as Sun’s OpenOffice.org or IBM’s Lotus Symphony.

When you open an MS Office document with one of these applications, a conversion engine attempts to map every formatting element it finds to a feature of the application doing the importing. If some formatting elements are unknown or otherwise unmappable, those elements are stripped and thrown away.

Stripped formatting elements mean formatting inconsistencies in documents passed between MS Office and other applications, and these inconsistencies have made it extremely tough to sell organizations on MS Office alternatives–even alternatives with zero licensing fees.

The OpenDocument Foundation wanted to see ODF applications pick up the capability to pass along unknown elements in order to maintain formatting fidelity, albeit at the cost, at times, of file format purity.
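To make the difference between the two strategies concrete, here is a minimal sketch in Python. It is not how OpenOffice.org, Lotus Symphony, or the Foundation's plug-in actually work; the element names, the `KNOWN_MAPPINGS` table, and the `convert` and `export` functions are all hypothetical and exist only to illustrate stripping unknown elements versus passing them along.

```python
from dataclasses import dataclass, field

# Hypothetical table of source-format elements the importing application understands.
KNOWN_MAPPINGS = {"bold": "text:bold", "italic": "text:italic"}


@dataclass
class ConvertedDocument:
    mapped: list = field(default_factory=list)       # elements the importer understood
    passthrough: list = field(default_factory=list)  # unknown elements carried along untouched


def convert(elements, preserve_unknown):
    """Import (name, value) pairs, mapping what is known.

    Unknown elements are dropped unless preserve_unknown is True.
    """
    doc = ConvertedDocument()
    for name, value in elements:
        target = KNOWN_MAPPINGS.get(name)
        if target is not None:
            doc.mapped.append((target, value))
        elif preserve_unknown:
            doc.passthrough.append((name, value))
        # Otherwise the element is stripped and thrown away.
    return doc


def export(doc):
    """Write the document back out in the source format's vocabulary."""
    reverse = {target: name for name, target in KNOWN_MAPPINGS.items()}
    restored = [(reverse[target], value) for target, value in doc.mapped]
    return restored + doc.passthrough


# "vendor_extension" stands in for any formatting the importer cannot map.
source = [("bold", "on"), ("vendor_extension", "x"), ("italic", "on")]
stripped = export(convert(source, preserve_unknown=False))
preserved = export(convert(source, preserve_unknown=True))
```

In this toy model the stripping importer loses `vendor_extension` on the round trip, while the pass-through importer returns every element it was given, at the price of storing data it does not understand.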

It turns out, however, that the backers of ODF care a great deal about file format purity–they’re out to create a group of MS Office-killers, and as they see it, perpetuating bits of proprietary MS document formatting runs directly counter to their Office-slaying plans.

For my part, I don’t care about file format purity, and I don’t care about vendor hopes of building or maintaining supremacy for some particular brand of office application. I care about my data, and I care about having as broad a set of options as possible for accessing and manipulating that data.

I would, however, like to see more diversity in the office applications space, because Microsoft is currently dominant enough to get away with offering a very slim set of options for accessing MS Office-formatted data. The only way to access and manipulate MS Office documents with full fidelity is to do so from a fat Windows client machine running a fat Office suite.

In reference to the OpenDocument Foundation’s abandonment of ODF, Microsoft’s director of corporate standards, Jason Matusow, posted a telling comment on his blog: “….when you are speaking about document formats, you are really speaking about an adjunct technology to the applications which are the real ‘solutions’ in this discussion.”

On paper, file formats may be of little intrinsic value, but consider the real world, where file formats serve as containers for our data–it’s ludicrous to argue that our data should take a back seat to the tools we use to access and manipulate it.

Back on the ODF side of the aisle, we have a format that’s undoubtedly better suited to offering a broad range of access and manipulation options. However, if it really is possible to boost fidelity between ODF-consuming applications and MS Office, then ODF’s backers should be working to make it happen.

As for the OpenDocument Foundation and the CDF, I must admit to keeping my file exchange nirvana expectations low. While the Foundation has some promising ideas, I question its track record for actually making things happen. The group’s Office file converter took an inordinately long time to become publicly available, and unless I’m mistaken, the project’s source code has yet to see the light of day.

This time around, I suggest that the CompoundDocument Foundation keep in mind the mantra on which all successful community projects are based: Release Early, Release Often. Ideas and proposal text are fine, but it’s tough to rally around a white paper. If the ODF backers have it wrong, then show us the code and prove it.