A hands-on guide for multilingual Web sites - includes related article on creating legible graphics files from text
Communication World, June-July, 1999 by Gerry Dempsey, Robert Sussman
Globalization is on everyone's mind. Big companies are printing brochures in Korean, Finnish and Japanese, and publishing their company newsletters in Spanish and French. But check out the web sites of the Fortune 500, and time and again you see only one language: English. This is surprising.
The Internet is the ultimate global information source, but American companies - even multinationals - are largely ignoring its ability to reach literally every human being with access to a computer. Perhaps English-only companies feel that making foreign-language web pages is just too hard to do. The fact is that translating a web site is not that difficult. Here is how it's done.
PLANNING THE PROJECT
To start the process, a company must decide which languages are right for its web site. That choice will, of course, depend on the target readership. The most popular languages in corporate America are French, Spanish, German, Japanese and Chinese; a second tier would include Swedish, Portuguese, Italian, Russian and Korean. In addition, for some languages, a dialect must be specified. For example, Portuguese has two dialects, European and Brazilian, and Spanish has three main variants: European, Mexican and South American. For the most part, the differences among dialects are fairly subtle. The contrast between British and American English is a good illustration, with differences in spelling (colour vs. color) and vocabulary (lorry vs. truck) accounting for most of the variation. Nevertheless, these subtle differences are immediately apparent to a native reader, so the proper choice of dialect is an important consideration. Chinese calls for a similar choice between two distinct writing systems: simplified characters, used in China and Singapore, and traditional characters, used in Taiwan, Hong Kong and other Asian countries.
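One way to record these decisions is a small lookup table. This is only a sketch, and it assumes the site labels each language version with a standard IETF language tag, which nothing above prescribes. The tag values are real, but the table name `SITE_VARIANTS` and the helper `variants_for` are hypothetical. The tag `es-419` (Latin America) stands in for the South American variant as the closest commonly used regional tag.

```python
# Hypothetical planning table: language -> {variant name: IETF language tag}.
# The structure and names are illustrative assumptions, not a required scheme.
SITE_VARIANTS = {
    "Portuguese": {"European": "pt-PT", "Brazilian": "pt-BR"},
    "Spanish": {"European": "es-ES", "Mexican": "es-MX", "Latin American": "es-419"},
    "Chinese": {"Simplified": "zh-Hans", "Traditional": "zh-Hant"},
    "English": {"British": "en-GB", "American": "en-US"},
}


def variants_for(language):
    """Return the sorted tags a planner must choose among for a language."""
    return sorted(SITE_VARIANTS.get(language, {}).values())


print(variants_for("Portuguese"))
```

A language with no entry in the table returns an empty list, which signals that no dialect decision has been recorded for it yet.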
The next step is to identify which web pages should have language versions. The home page will be translated, naturally, as will most if not all pages that directly link to the home page. Beyond that, there should be two criteria for translation: marketing and maintenance. A page that emphasizes marketing, sales and corporate identity is a better candidate for translation than, say, a page about employment opportunities. As for maintenance, a page that changes regularly (for example a weekly events update) is a poorer candidate for translation than a page that is expected to remain stable for a long time.
THE LANGUAGE PHASE
Now that the languages are chosen and a set of relatively high-value, low-maintenance web pages is decided upon, translation can begin. The ideal translator is a professional writer with a knowledge of languages and expertise in the subject matter being covered. Translators should write exclusively in their native language, no matter how good their competence in other languages may be. Like all writers, translators need their editors. In fact, a translation is simply not complete until it has been thoroughly vetted by a professional editor. Assembling a professional translator-editor team for each language is crucial to the success of any multi-language web site. With the final translation in hand, the project can move on to the HTML phase.
HTML
HTML (short for HyperText Markup Language) is a set of codes that define the content and structure of web pages. Embedded in these codes is the text that users see on the screen. Within HTML, the basic unit of text is the paragraph. So there will be a sequence of codes, then a paragraph of text, then more codes, then another paragraph of text, and so on. The task here is simply to replace each English-language paragraph with its foreign-language equivalent, using the same cut-and-paste techniques that everyone is familiar with. If the operator is careful enough to avoid mixing up the paragraphs or making inadvertent changes to the code, it's easy. Of course, the new pages must be renamed, relinked, and placed within the hierarchy of the site, but these are standard webmaster functions that have nothing to do with the language aspect of the project. Once that is done, voila: the page is in another language.
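The same paragraph-by-paragraph substitution can be sketched as a short script. This is a minimal illustration, not a production tool. It assumes every paragraph is wrapped in an explicit `<p>...</p>` pair with no nested paragraphs, and that the translations arrive in the same order as the original paragraphs. The function name `replace_paragraphs` and the sample page are hypothetical.

```python
import re

# Matches one <p ...>...</p> element.
# Group 1 = opening tag, group 2 = paragraph text, group 3 = closing tag.
PARAGRAPH = re.compile(r"(<p\b[^>]*>)(.*?)(</p>)", re.IGNORECASE | re.DOTALL)


def replace_paragraphs(html, translations):
    """Swap the text of each <p> element, in order, for its translation."""
    found = len(PARAGRAPH.findall(html))
    if found != len(translations):
        # Guard against mixing up paragraphs: the counts must agree.
        raise ValueError(
            f"{found} paragraphs in page, {len(translations)} translations"
        )
    remaining = iter(translations)
    # Keep the opening and closing tags untouched; replace only the text.
    return PARAGRAPH.sub(
        lambda m: m.group(1) + next(remaining) + m.group(3), html
    )


english_page = '<body>\n<p class="intro">Hello.</p>\n<p>Goodbye.</p>\n</body>'
french_page = replace_paragraphs(english_page, ["Bonjour.", "Au revoir."])
print(french_page)
```

Because only the text between the tags is replaced, the surrounding codes, including attributes such as `class="intro"`, are carried over unchanged. The count check plays the role of the careful operator who makes sure no paragraph is skipped or duplicated.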
Cutting and pasting paragraphs within HTML is the ideal solution, as it's easy to do and the resulting page looks just like the original. Unfortunately, this technique works only with languages that use the Roman alphabet as it is found in standard Western fonts. This includes all the Western European languages, but none of the Eastern European languages, which either use a different alphabet (Russian uses Cyrillic) or add accented letters that Western fonts lack (Polish). Also excluded are Chinese, Japanese and Korean. The reason for this limitation resides with the nature of fonts.
THE PROBLEM WITH FONTS
A font is a set of characters. A typical font contains the letters from A to Z, the numbers 0 through 9, plus punctuation marks and other special characters. Each character is identified in the computer by one byte of information, such as 00110010. This "1 byte = 1 character" system can accommodate only 256 different characters (each bit has 2 possible values, 0 and 1, and there are 8 bits per byte: 2^8 = 256). That's plenty for the Western languages, because our alphabet has just 26 letters, leaving more than enough slots for curiosities like π and β.
But the 256-character limit poses a problem for languages such as Chinese, which has well over 10,000 characters. They simply cannot be represented on the computer using only one byte of information each. So each character in a Chinese font is identified in the computer by two bytes, for example 00011100 10101000. This solution allows for 65,536 characters (2^16) but it creates another problem: most Internet browsers in the West are designed for one-byte fonts, so they cannot read two-byte fonts. To get around this obstacle, it is necessary to have special "reader" software running in the background. Without that software, Chinese text looks like this: ???. As you cannot expect visitors to your site to have "reader" software, the only solution is to display these languages in art files, not as text pasted into the HTML code.
Languages such as Russian or Polish do use one-byte fonts, but to read those fonts, a user's computer must be specially configured. This configuration is easy to do, and is available on all computers. The problem is that most browsers in the West are not configured for the Eastern European languages, so Russian text will look like Chinese text: gobbledygook. Again, the solution is to display the text as art files.
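The byte arithmetic, and the garbling that results when a browser assumes the wrong kind of font, can be checked with a few lines of Python. The specific encodings here are assumptions chosen for illustration, since none are named above: GB2312 as a two-byte Chinese encoding, Windows-1251 (`cp1251`) as a one-byte Cyrillic encoding, and Latin-1 as the Western one-byte set.

```python
# How many distinct characters one byte and two bytes can identify.
ONE_BYTE_SLOTS = 2 ** 8     # 256
TWO_BYTE_SLOTS = 2 ** 16    # 65,536

# Encode one Chinese and one Russian character (encodings are assumptions).
chinese = "中"
chinese_bytes = chinese.encode("gb2312")    # two bytes per Chinese character
russian = "Я"
russian_bytes = russian.encode("cp1251")    # one byte per Cyrillic letter

# A browser set up only for Western text reads each byte as a Western character.
chinese_as_western = chinese_bytes.decode("latin-1")
russian_as_western = russian_bytes.decode("latin-1")

print(ONE_BYTE_SLOTS, TWO_BYTE_SLOTS)
print(len(chinese_bytes), len(russian_bytes))
print(repr(chinese_as_western), repr(russian_as_western))
```

The Chinese character comes back as two unrelated Western characters, and the Russian letter comes back as a single wrong one. In both cases the bytes are intact; only the interpretation is wrong.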