The World Wide Web and emerging Internet resource discovery standards for scholarly literature - Networked Scholarly Publishing

Library Trends, Spring, 1995 by Stuart L. Weibel

As HTML evolves (its enhancement is mediated by a Working Group of the Internet Engineering Task Force), it is likely to acquire greater expressive power, but it is not likely to be imbued with the full richness of SGML. If it were, Web browsers would become far more complex to develop and maintain, and, more importantly, the simplicity of Web publishing as we know it today would be overwhelmed by the complexity of SGML.

A more likely scenario is the promulgation of SGML display applications that act in tandem with Web browsers, much as external graphics viewers now display image data without being part of the Web browser itself.

The first such SGML display engine has recently been announced by SoftQuad, a Toronto-based vendor of SGML software and systems (SoftQuad, 1994). This product, named Panorama, is being made available in a public domain version and a commercial version (Panorama Pro, with somewhat enhanced capabilities).

Formal publishers will thus have a mechanism for distributing typographically complex, SGML-encoded materials, while occasional or less formal publishing will benefit from the simpler idiom that HTML affords.

An Interim Solution: The Translation of SGML to HTML

When SGML viewers are commonly available and widely supported by information providers, many of the representational problems of HTML text will become moot. During the transitional period leading to that state, the delivery of complex scholarly text requires an interim solution: the translation of more complex markup (such as SGML) to HTML. The translation facility developed in the OCLC Office of Research is being used to provide Web-based access to Electronic Journals Online (Weibel et al., 1994). The first journal to be supported in this way is the American Institute of Physics journal, Applied Physics Letters Online.

The translator parses an SGML document and decomposes it into a grammar tree. Each SGML entity in the document is translated into either its HTML counterpart or a bitmap of the appropriate font character. Each formula (that is, each equation or piece of mathematical notation) is extracted, translated from ISO 12083 SGML to TeX, a computer-based typesetting system, and then rendered to generate a corresponding bitmap. The 12083 SGML standard is a recommended style of SGML for book and journal publishing (12083, 1994).
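The entity-translation step can be illustrated with a minimal Python sketch. It is not the OCLC translator: the mapping table, the function name translate_entities, the glyphs directory, and the GIF naming scheme are all assumptions made for illustration. An entity reference with a known HTML counterpart is passed through as that counterpart; any other entity is replaced by a reference to a bitmap of the character.

```python
import re

# Hypothetical table of SGML entities that have a direct HTML counterpart.
# The entries are illustrative only, not the translator's actual table.
HTML_COUNTERPARTS = {
    "amp": "&amp;",
    "lt": "&lt;",
    "gt": "&gt;",
    "eacute": "&eacute;",
}

# Matches an entity reference such as &alpha; and captures its name.
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9.]*);")


def translate_entities(text, bitmap_dir="glyphs"):
    """Replace each entity reference with its HTML counterpart, or with an
    inline image of the character when no counterpart is known.

    bitmap_dir and the <name>.gif naming scheme are assumptions.
    """
    def replace(match):
        name = match.group(1)
        if name in HTML_COUNTERPARTS:
            return HTML_COUNTERPARTS[name]
        # No HTML counterpart: fall back to a bitmap of the font character.
        return '<img src="{}/{}.gif" alt="{}">'.format(bitmap_dir, name, name)

    return ENTITY_REF.sub(replace, text)
```

For example, translate_entities("caf&eacute; &alpha;") leaves &eacute; in place and turns &alpha; into an image reference, since the sketch's table has no counterpart for it.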

Figures and tables are handled similarly to equations. In these cases, however, a reduced-size or thumbnail image is embedded in the running text. The thumbnail is in turn linked to a corresponding full-size image that can be downloaded at the user's discretion. The thumbnail images reduce initial image-loading burdens and provide a better-proportioned page display (full-resolution figures in electronic documents are typically of awkward proportion when included inline in running text). Selecting the thumbnail invokes the appropriate external viewer, which displays the full-size image.
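The thumbnail arrangement corresponds to a simple HTML pattern: an inline image wrapped in an anchor that points to the full-size file. The following Python sketch generates that markup. The function name thumbnail_link and the file names in the usage note are hypothetical, and the sketch does not attempt to reproduce the translator's actual output.

```python
from html import escape


def thumbnail_link(full_image_url, thumbnail_url, caption):
    """Return HTML that shows a thumbnail inline and links it to the
    full-size image, which the browser fetches only when selected.

    All three arguments are escaped for use inside attribute values.
    """
    return '<a href="{}"><img src="{}" alt="{}"></a>'.format(
        escape(full_image_url, quote=True),
        escape(thumbnail_url, quote=True),
        escape(caption, quote=True),
    )
```

Calling thumbnail_link("fig1.gif", "fig1_thumb.gif", "Figure 1") yields an anchor around the small image. Only the thumbnail is loaded with the page, which is the source of the reduced loading burden described above.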


 


Content provided in partnership with Thomson Gale