Measuring Office Format Fidelity with Acrobat 9

When considering alternatives to Microsoft’s Office productivity suite, one of the most important issues to evaluate is how well Office rivals such as OpenOffice.org handle Microsoft’s ubiquitous binary file formats.


Over the past few years, eWEEK Labs has approached the MS Office-to-OpenOffice.org file format fidelity issue several times. Our conclusions haven’t changed much since 2004, when Anne Chen and I helped one of our corporate partners test the productivity suite pair for themselves:

“Although OpenOffice.org does a good job of handling Microsoft Office file formats, small formatting inconsistencies will require reworking of complex documents.”

While the phrase “small formatting inconsistencies” still sums up the situation fairly accurately, organizations and individuals out to bring the open source suite into their application mix could use a more rigorous means of measuring OpenOffice.org’s handling of MS Office formats.

That’s why, when Adobe briefed me on Acrobat 9, I was particularly interested in Acrobat’s new “compare documents” feature, which analyzes two PDF documents and parses out all of the inconsistencies between them.

I grabbed a Word-formatted reviewer’s guide document from Microsoft’s Web site, opened it up in Word 2007, and printed it to a PDF using Acrobat 9.

Next, I opened the document in OpenOffice.org 2.4 and used Acrobat 9 to print it to a PDF document. I could have used OpenOffice.org’s built-in PDF export function, or Office 2007’s plugin-based PDF exporter, but I opted to stick with Acrobat in order to minimize inconsistencies that the differing PDF exporters might have introduced.

I fired up Acrobat 9 (I tested with a beta version of the software) and pointed the application’s compare documents feature at my Office and OpenOffice.org PDF documents. The result? Good fidelity overall, but various inconsistencies remained. This time, however, I had Acrobat 9 on hand to point the inconsistencies out to me.

For instance, right on the first page of the document, OpenOffice.org rendered a 935 by 227 pixel logo at 936 by 234 pixels, a formatting inconsistency that resulted in a slightly misplaced logo, but one that I would have had a tough time putting my finger on without Acrobat 9’s help.
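To put those logo numbers in perspective, here is a quick back-of-the-envelope calculation sketched in Python. The helper name `scale_change` is my own invention for illustration; only the pixel dimensions come from the comparison above.

```python
def scale_change(reference_px: int, rendered_px: int) -> float:
    """Return the percentage change of a rendered dimension
    relative to the reference dimension."""
    return (rendered_px - reference_px) / reference_px * 100.0


# Pixel dimensions reported for the first-page logo:
# 935 x 227 in the reference rendering, 936 x 234 in the other.
width_change = scale_change(935, 936)
height_change = scale_change(227, 234)

print(f"width: {width_change:+.2f}%  height: {height_change:+.2f}%")
# prints: width: +0.11%  height: +3.08%
```

In other words, the width is essentially unchanged while the height grows by about 3 percent, so the logo is stretched slightly rather than scaled uniformly. That is small enough to miss by eye, but enough to nudge the content around it.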

Another odd, slight inconsistency came in the document’s table of contents, in which OpenOffice.org rendered 146 periods between the section name and page number, where Office had rendered 145 periods.
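I don’t know how Acrobat’s compare feature works internally, but the same kind of check can be approximated on plain text once it has been extracted from each PDF by whatever tool you prefer. The Python sketch below uses only the standard library; the function names and the sample lines (the “Introduction” entry and its page number) are made up for illustration, and only the 145 and 146 period counts come from my test.

```python
import difflib
import re


def leader_dot_count(line: str) -> int:
    """Return the length of the longest run of periods in a line."""
    runs = re.findall(r"\.+", line)
    return max((len(run) for run in runs), default=0)


def differing_lines(reference, candidate):
    """Return (reference_lines, candidate_lines) pairs for every
    region where the two line lists are not identical."""
    matcher = difflib.SequenceMatcher(a=reference, b=candidate, autojunk=False)
    return [
        (reference[i1:i2], candidate[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Hypothetical text extracted from the two PDFs; the entry text and
# page number are invented, the period counts are from the article.
word_lines = ["Contents", "Introduction" + "." * 145 + "3"]
other_lines = ["Contents", "Introduction" + "." * 146 + "3"]

diffs = differing_lines(word_lines, other_lines)
for ref_block, cand_block in diffs:
    for ref_line, cand_line in zip(ref_block, cand_block):
        print(leader_dot_count(ref_line), "vs", leader_dot_count(cand_line))
```

A text-only comparison like this would catch the table of contents discrepancy, but not layout problems such as the resized logo, which is where a PDF-aware comparison tool earns its keep.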

I also downloaded a test version of the upcoming OpenOffice.org version 3, and compared that version’s Word document rendering to that of OpenOffice.org 2.4. Both versions appeared to render my test Word document exactly the same, a result that Acrobat’s compare function confirmed.

Since support for Microsoft’s new Office Open XML formats is one of the new features in OpenOffice.org 3.0, I fetched another document from Microsoft’s Web site, this time in the DOCX format, and cheffed up some PDFs to gauge the open-source suite’s OOXML chops. This time, the formatting differences were much more pronounced and included misplaced images and jumbled bullet lists.

I expect to see OpenOffice.org 3.0 improve its handling of OOXML documents as it moves closer to its release. I’ll be testing the suite’s OOXML capabilities as subsequent test releases emerge, and I expect that I’ll be using Acrobat 9 to help with those tests.

For a walkthrough of my Acrobat-fueled Office vs. OpenOffice.org file format adventures, see our slide show.