Word
Word typesetting
Word is a popular tool among authors and supports numerous publishing workflows with features such as Track Changes. For actual page makeup, however, more suitable tools such as InDesign or TeX are used in most cases. In principle, conversion into these formats is relatively straightforward, although it often involves reworking or checking, which incurs costs. This is particularly the case if the author does not use style sheets systematically and consistently.
Finalizing a Word manuscript in Word can offer a cost-effective alternative to page makeup in TeX, InDesign, or other typesetting systems, provided certain requirements are met. Problems may be encountered with complex layouts, certain image types, or particularly large publications, for example. However, many manuscripts are suited to a Word-only workflow.
Handling a workflow like this requires specific experience. le-tex has developed methods and tools that enable efficient handling of even very large or technically demanding Word documents, and has proved its skill in this field in over a hundred projects totaling many tens of thousands of pages.
The following services are provided:
- Typesetting of books, journals, and proceedings;
- Consulting for authors and publishers;
- Repair of Word files;
- Finalizing of Word documents (i.e. page makeup, image import, and index creation);
- Organization and integration of individual files into complete documents;
- Editing of large documents (> 1000 pages);
- Generation of complex indexes (e.g. for proceedings series with 5000-page volumes);
- Creation and adaptation of templates;
- Macro programming;
- Conversion of Word data into PDF documents;
- Conversion of Word documents with math content and tables into TeX and vice versa; and
- Conversion of Word documents generally into XML and vice versa.
Word as an authoring tool: Word to XML, XML to Word
Approximately 90% of all authors (depending on the subject area) insist on Word as a tool for editing their manuscripts. le-tex creates style sheets and converters that not only meet the author's needs in this regard but also fulfil the publisher's requirements for standardized, reusable content with an attractive layout.
Word author templates can be restricted to the required paragraph and character styles so that the content can be converted to XML or other formats without manual intervention.
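To make the idea of a restricted template concrete, here is a minimal Python sketch that checks which paragraph styles a .docx file uses and reports any that fall outside an allowed set. It is not le-tex's tooling: the style IDs in `ALLOWED_PARA_STYLES` and both function names are hypothetical, and the sample package built by `make_sample_docx` contains only `word/document.xml`, so it is enough for this check but is not a file Word itself would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical whitelist; a real author template would define its own style IDs.
ALLOWED_PARA_STYLES = {"Heading1", "Heading2", "BodyText", "ListBullet"}


def find_disallowed_styles(docx_bytes, allowed=ALLOWED_PARA_STYLES):
    """Return the sorted paragraph style IDs that are used but not allowed."""
    # A .docx file is a ZIP package; the main text lives in word/document.xml.
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    used = set()
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find("w:pPr/w:pStyle", NS)
        # Paragraphs without an explicit style use the template default
        # and are not reported here.
        if style is not None:
            used.add(style.get(f"{{{W_NS}}}val"))
    return sorted(used - set(allowed))


def make_sample_docx(style_ids):
    """Build a minimal in-memory package for demonstration purposes only."""
    paras = "".join(
        f'<w:p><w:pPr><w:pStyle w:val="{s}"/></w:pPr>'
        f"<w:r><w:t>text</w:t></w:r></w:p>"
        for s in style_ids
    )
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{paras}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("word/document.xml", xml)
    return buf.getvalue()
```

Calling `find_disallowed_styles(make_sample_docx(["Heading1", "MyFancyStyle"]))` flags only the style that is not on the whitelist. A check like this could run before conversion so that manuscripts with stray styles are caught early instead of requiring manual intervention later.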
However, there is also a demand for conversion in the opposite direction, particularly among the growing number of publishers who store their books as full-text XML. le-tex can efficiently convert structured XML content including tables, math content, index entries, or nested lists into Word. Authors can then use their familiar Word environment to edit follow-up editions of their works. This also enables copy editing in Word. Depending on the strictness of the Word style sheet and with the author's cooperation, the text can then be converted back to XML automatically and rendered with the help of other tools such as LaTeX, InDesign, XSL-FO processors, or HTML converters.
For the conversion, le-tex uses the Word 2007 format Office Open XML (.docx), which Microsoft guarantees to be compatible with older Word versions. In other words, although the actual conversion always takes place between Word XML and the XML target format, authors can still be supplied with the previous edition as conventional Word data (e.g. Word 2003).
Technologically, le-tex has implemented the conversion between Word and other XML formats predominantly in XSLT/XPath 2.0. The approach is similar to that of the roundtrip component of the DocBook XSL project, but the le-tex converter stands out for the following reasons:
- Extensive coverage of markup features (including tables and lists);
- Support for XML formats other than DocBook (with DocBook serving as a neutral interim or hub format); and
- Use of powerful XSLT/XPath 2.0 constructs, which yield not only more compact, easy-to-maintain code but also, most importantly, solutions to some of the roundtrip shortcomings.
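The style-driven mapping from Word XML to a hub format can be illustrated with a short sketch. The le-tex converter is written in XSLT/XPath 2.0; the Python below is only an illustration of the principle, not that implementation. The style IDs, the `STYLE_TO_DOCBOOK` table, and the function name are assumptions, and the output uses simplified, namespace-free DocBook-like elements.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W_NS}

# Hypothetical mapping from Word paragraph style IDs to DocBook element names.
STYLE_TO_DOCBOOK = {"Heading1": "title", "BodyText": "para"}


def word_body_to_docbook(document_xml):
    """Map each Word paragraph to a DocBook-like element based on its style."""
    root = ET.fromstring(document_xml)
    section = ET.Element("section")
    for para in root.iter(f"{{{W_NS}}}p"):
        style = para.find("w:pPr/w:pStyle", NS)
        style_id = style.get(f"{{{W_NS}}}val") if style is not None else None
        # Unknown or missing styles fall back to a plain paragraph.
        target = ET.SubElement(section, STYLE_TO_DOCBOOK.get(style_id, "para"))
        # Concatenate the text of all runs in the paragraph.
        target.text = "".join(t.text or "" for t in para.iter(f"{{{W_NS}}}t"))
    return ET.tostring(section, encoding="unicode")


SAMPLE = (
    f'<w:document xmlns:w="{W_NS}"><w:body>'
    '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr>'
    "<w:r><w:t>Intro</w:t></w:r></w:p>"
    "<w:p><w:r><w:t>Hello </w:t></w:r><w:r><w:t>world</w:t></w:r></w:p>"
    "</w:body></w:document>"
)

print(word_body_to_docbook(SAMPLE))
```

The sketch covers flat paragraphs only. Tables, nested lists, index entries, and character styles, which the list above names as strengths of the actual converter, need considerably more logic than a lookup table.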