Microsoft Office 2007 — Getting Closer to What STM Publishers Need

Over the last year, Microsoft has really engaged with the STM publishing community and has been maintaining a steady dialog on how they can help publishers start to use OOML and the OpenXML (DOCX) format.

I had the honor of moderating a seminar session at the SSP Annual Meeting on 28 May titled “The View from the New Office: Opportunities and Issues Incorporating Office 2007 into Scholarly Publishing.” Microsoft staff Pablo Fernicola, Murray Sargent, and Alex Wade all did a wonderful job of presenting many of the new features of Office 2007 and the value of OOML and OpenXML/DOCX formats. Cyndy Wessling, also of Microsoft, chimed in from the front row on the latest release dates. Paul Topping of Design Science presented the amazing progress he has made on MathType. Tim Ingoldsby and Bruce Rosenblum presented the issues surrounding the new format that are still causing problems for publishers. Presentations can be found at inera.com/word2007directory.shtml. Here are my observations.

Microsoft has Created a New XSLT to Fix MathML Support in OOML

Great news! Unfortunately, Microsoft currently plans to ship this only when Service Pack 2 for Office 2007 is ready, which means it is not likely to happen until early 2009. This would be a real shame. Apparently the XSLT transform has been tested and really would make a significant difference. Without this XSLT, it is not really advisable for publishers that need MathML in their workflows to start supporting the OOML/DOCX format. If anyone from Microsoft is listening out there, I highly recommend you simply publish this XSLT as a beta on your developer forum or, even better, post it to your scholarly website at microsoft.com/mscorp/tc/scholarly_communication.mspx. The XSLT is not technically part of the Office 2007 offering. It is a transform that publishers and their partners (as well as others) can readily start inserting into our automation routines. You have come so far! Give us the tools so we can start the migration. You want it, we want it, let’s get moving together.

STIX Fonts Go Beta

October 31, 2007 will forever be remembered as an important day in publishing history. After more than ten years in research and development, the STIX fonts (Scientific and Technical Information Exchange) have finally launched and are freely available in beta! This new web font set properly renders mathematical symbols on any browser, alleviating the need for publishers to assemble symbols from a variety of fonts. It includes over 8,000 glyphs. By making the fonts freely available, the STIX project hopes to encourage the development of widespread applications that make use of these fonts. The TeX version of the fonts should be available soon after the production version is released.

Many thanks go to the six publishers who collaborated to design, fund, and manage the STIX project: the American Chemical Society (ACS), American Institute of Physics (AIP), American Mathematical Society (AMS), American Physical Society (APS), Elsevier, and Institute of Electrical and Electronics Engineers (IEEE), and to their technical partner, MicroPress, Inc.

Congratulations to Tim Ingoldsby (Project Chairman) and his terrific team for seeing this through.

I recommend all publishers download the fonts from the STIX web site at www.stixfonts.org today.

Microsoft and STM Publishers Meet to Discuss DOCX / Word 2007

Since my post to Nascent in June on the best way for publishers to deal with Word 2007 within their publishing ecosystems, I have been involved in the conversation about how publishers can adopt Word 2007 on two fronts:

  • Tracking the conversation (see my tagged Connotea links).

  • Participating by hosting a meeting on 25 July 2007 at the NPG office in New York between staff from Microsoft (Jennifer Michelstein, Lee Dirks, Murray Sargent), AIP (Tim Ingoldsby), AGU (Carter Glass), AAAS/Science (Brooks Hanson), Inera (Bruce Rosenblum), Aries (Ben Peterson and Lyndon Holmes), and Nature (Howard Ratner and Chris Flammang).

The meeting agenda was quite simple: discuss a typical journal publisher’s workflow from authoring through to publication and archiving. This was not specific to any of the publishers present, but rather a high-level overview of the various stages involved, the types of software systems that have been built to aid these workflows, and the standards that are important to this community. The four publishers in the room did an excellent job of relaying this to the Microsoft staff. This was followed by presentations from Inera and Aries detailing the problems Word 2007 was causing for editing automation tools (eXtyles) and manuscript tracking systems (Editorial Manager).

The open and fruitful conversation quickly turned toward how Microsoft, third-party vendors and publishers can work together moving forward to make Word 2007 work within the STM publishing ecosystem. Here were some of the outcomes:

  • Microsoft will establish a page on one of its websites with more advanced details on how best to use Word 2007 in a publishing environment. (For example, the image of an equation created when a Word 2007 file is saved to Word 2003 format actually carries important semantic information that can be reused when the file is reopened in Word 2007. Microsoft refers to this as dehydrating and rehydrating.)
  • Microsoft will consider adding text to the Word 2007 help file, especially about its Math Markup Language support.
  • Microsoft will make real efforts to educate publishers by presenting more often at publisher events.
  • Microsoft will consider supporting STIX fonts when they are available. I just heard from Tim Ingoldsby of AIP, “Microsoft seems to be trying to cooperate with the STIX Fonts project. They have supplied us with a Font Edit program that will let us get the STIX Fonts ready for Word 2007 (so they can be used in place of Cambria Math).”

If you can’t wait for Microsoft’s information site to go live, have a look at Bruce Rosenblum’s excellent summary of the current state of play.

It is also worth reviewing the press release from Design Science, producers of MathType, about Word 2007, equations, and scientific journal submissions. Also be sure to read their toolbar tip, which explains how to add a button for the legacy equation editor to the Word 2007 quick-access toolbar.

More information is sure to be published on this topic in the next few months as vendors ramp up to better accept the new DOCX format. Be sure to check out my Connotea bookmarks on this topic.

Word 2007 and the STM Publisher Ecosystem

As the CTO of Nature Publishing Group, I have become involved in a very lively conversation with Microsoft staff about why Word 2007 is not being actively endorsed by STM publishers. It has recently come to Microsoft’s attention (see the blogs of Murray Sargent and Brian Jones) that Nature ( https://www.nature.com/nature/authors/submissions/template/index.html ), Science ( https://www.sciencemag.org/about/authors/prep/docx.dtl ), and many other scholarly publishers do not accept files authored in Word 2007. Both Science and NPG have been in correspondence with Microsoft staff on this important issue, and the staff there have been very willing to engage in the conversation. As Inera is one of NPG’s main suppliers of Word macros (eXtyles) and a general expert on Word, I asked Bruce Rosenblum of Inera to enter the discussion. The following was sent to Microsoft on 12 June 2007 by Bruce Rosenblum to explain why this situation exists.

“Over the past 10 years, Microsoft Word has become the standard for almost all content authoring. As a result of Microsoft’s success with Office, and the relative stability of the Office environment and DOC format over that time, third parties have built sophisticated applications to address specific vertical market requirements for integration of Word into highly efficient workflows.

eXtyles is one such application; eXtyles is a suite of editorial and XML tools for Word in wide use by scholarly publishers. But eXtyles is only one organism in the larger ecosystem of domain-specific applications dedicated to scholarly publishing. Other tools include online submission and peer review applications, and other applications used in the post-editorial production workflow.

Like eXtyles, most of the applications in this workflow ecology are not yet compatible with DOCX format. For example, I surveyed the four largest vendors of online submission and peer review systems this week, and none support DOCX files. Nor could any of the four provide me with a date when they expect to have native DOCX compatibility.

If you detect no sense of urgency to upgrade systems in this vertical market, you are not mistaken. For most scholarly publishers, the challenge is to publish high quality and accurate information on a regular schedule. Software upgrades to critical publishing systems, unless they are seamless or provide a significant immediate benefit, are often not a priority.

In the case of Word 2007, upgrading is not seamless. Because files incorporating OMML equations are not semantically backwards compatible with older versions of Word, publishers must update an entire ecology of systems before they can accept DOCX files. Completing such updates requires work with third parties, careful testing, training, and finally deployment — often one system at a time — of updated applications. All of this takes time.

In the meantime, because a DOCX file with OMML equations renders the equations as graphics when used with today’s systems, it’s easier for publishers to ask authors to refrain from submitting DOCX files until every part of the workflow ecology is DOCX-compatible. And not just updated to accept DOCX, but also updated so that OMML can seamlessly be integrated into systems today that provide publishers with full text XML and tagged math according to the NLM DTD or other 12083-derived DTDs.

Had the conversion from DOCX to DOC provided a conversion from OMML to Equation Editor format, it would have provided the necessary backwards compatibility for publishers to upgrade one system at a time. But because this compatibility is not available, it’s created the need for a “big bang” upgrade, or a delay until the ecosystem of inter-dependent systems is deliberately updated over time. In the environment of scholarly publishing, such substantive upgrades often take years, not months.

I hope this post clarifies some of the core issues DOCX format presents scholarly publishers and explains Word 2007 issues that are cause for publisher upgrade reticence. Those of us in the scientific community look forward to a dialog to articulate scholarly publishing requirements to Microsoft so that Microsoft can provide products that serve the needs of the entire scholarly community.”
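
For readers who want to see what "a DOCX file with OMML equations" means in practice, here is a minimal sketch of how a submission system might flag such files. It is not taken from eXtyles, Editorial Manager, or any Microsoft tool. It assumes the main document part lives at word/document.xml (the usual location in files written by Word) and that equations appear as m:oMath elements in the Office math namespace. The function names and the sample-building helper are hypothetical, made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by Office Math Markup Language (OMML) elements.
OMML_NS = "http://schemas.openxmlformats.org/officeDocument/2006/math"
# Namespace used by the main WordprocessingML elements.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
# Assumption: the main document part sits at this path inside the package.
MAIN_PART = "word/document.xml"


def count_omml_equations(docx_file):
    """Count the m:oMath elements in the main part of a DOCX package."""
    with zipfile.ZipFile(docx_file) as package:
        root = ET.fromstring(package.read(MAIN_PART))
    return sum(1 for _ in root.iter(f"{{{OMML_NS}}}oMath"))


def build_sample_docx(body_xml):
    """Hypothetical helper: build a stripped-down in-memory package.

    The result holds only the main document part, enough for this demo
    but not a complete DOCX that Word would open.
    """
    document = (
        f'<w:document xmlns:w="{W_NS}" xmlns:m="{OMML_NS}">'
        f"<w:body>{body_xml}</w:body></w:document>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(MAIN_PART, document)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = build_sample_docx(
        "<w:p><m:oMath><m:r><m:t>x</m:t></m:r></m:oMath></w:p>"
    )
    found = count_omml_equations(sample)
    if found:
        print(f"Submission contains {found} OMML equation(s); route for review.")
```

A check like this only detects the presence of OMML. Converting it to MathML or to Equation Editor format is the harder problem described in the letter above.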
