Nov 19, 2014

ATA 2014: Is Machine Translation Your Friend or Foe? Challenges for English-Japanese Translators

Presented by Dr. Takako Aikawa, Sr. Lecturer in Japanese, MIT Global Studies and Languages
Summarized by Kazumasa Aoyama, Park IP Translations

In her excellent presentation, Dr. Aikawa discussed the use of machine translation (MT) in English-to-Japanese translation. Using many examples, she answered the question “Why is English-to-Japanese translation so challenging for MT?” She also showed us how human translators can help improve the quality of MT.

As an introduction, she covered the history of MT and its advancement from rule-based MT to statistical MT (SMT), and explained how an SMT system works. SMT has great advantages over rule-based MT: it is scalable and sustainable.

She then showed us the challenges that MT faces when translating from one language to another, particularly from English to Japanese.

She listed the following open problems for MT in general:
1.      Lexical Ambiguity: Many words have different meanings depending on the context.
2.      Syntactic Ambiguity: One sentence can be syntactically interpreted in more than one way.
3.      Idioms
4.      Pronoun resolution: MT has to know what a pronoun refers to and its gender and number.

She then gave an answer to the question. The reason the quality of English-to-Japanese MT/SMT is so poor is that Japanese and English are so different from each other:
1.      Word order (SVO (English) vs. SOV (Japanese))
2.      Case markers. The Japanese language uses case markers such as が and を, so the word order of a sentence in Japanese is not as critical as in English (free word order). 太郎がそのリンゴを食べた and そのリンゴを太郎が食べた are both acceptable Japanese sentences with almost the same meaning.
3.      Postpositions. An English preposition may need to be translated into different Japanese postpositions depending on the context:
Taro ate the apple at school/at 3 pm.
太郎がその林檎を学校で/午後3時に食べた。
Taro will come by train/by noon.
太郎は電車で/正午までに来ます。
4.      Pronouns. “Pronoun resolution requires the understanding of a given context! But MT is still at a sentence level.”
5.      Japanese counters (, , , etc.)
6.      Word-breaking. While the English language uses white space to indicate word boundaries, the Japanese language needs a word-breaker for NLP/MT-related tasks.
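
To make the word-breaking point concrete, here is a minimal sketch of one simple approach, greedy longest-match segmentation against a word list. It is an illustration only and not the method described in the talk; the function name break_words and the tiny LEXICON are hypothetical, and real word-breakers rely on large dictionaries and statistical models.

```python
def break_words(text, lexicon):
    """Split unspaced text into tokens by greedy longest match.

    Hypothetical helper for illustration: at each position it takes the
    longest lexicon entry that matches, and falls back to a single
    character when nothing in the lexicon matches.
    """
    max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Toy word list (an assumption made for this example only).
LEXICON = {"太郎", "が", "その", "林檎", "を", "学校", "で", "食べた"}

if __name__ == "__main__":
    # A sentence modeled on the postposition example above.
    print(" | ".join(break_words("太郎がその林檎を学校で食べた", LEXICON)))
```

Even this toy version shows why the step matters: particles such as が, を, and で only become visible to later processing once the sentence has been split into words.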

She then discussed how human translators can improve the quality of MT for English-to-Japanese translation. She listed three ways.
1.      Training data
“The more the training data, the better the quality of an SMT system,” and “the cleaner the training data, the better the quality of an SMT system.”
2.      Controlled English. Using controlled/simplified English helps MT produce better-quality translations.
3.      Post-editing. Post-editing “is the process of improving a machine-generated translation with a minimum of manual labor.” (http://en.wikipedia.org/wiki/Postediting)

She emphasized the importance of human post-editing in our workflow when translating English to Japanese with MT.

Machine Translation Post-editing Guidelines

Skills for Post-editing
      Excellent word-processing and editing skills; ability to work and make corrections directly on screen
      General knowledge of the problems and challenges faced by MT
      Specific knowledge of the weaknesses of the particular MT system
      Knowledge of source and target languages
      Quickness in deciding what to correct and how (or which errors to ignore)
      Ability to balance PE speed and cost with respect to required quality
      Ability to adapt to different specifications required for each job

Here, she answered one of our big questions: Can SMT replace human translators?

Her answer was, “No. Instead, it will create a new field of translation.”

Finally, she answered her first question: Machine translation is our FRIEND, if we use it appropriately and wisely.
