Proceedings of the 11th Joint Conference on Information Sciences (JCIS 2008)

A Phrase Combination Approach to Patent SMT

Authors
Junguo Zhu¹, Muyun Yang, Tiejun Zhao, Sheng Li, Qi Haoliang
¹ Harbin Institute of Technology
Corresponding Author
Junguo Zhu
Available Online December 2008.
DOI
https://doi.org/10.2991/jcis.2008.99
Keywords
statistical machine translation, patent, phrase combination, word segmentation
Abstract

This paper presents a phrase combination approach to patent SMT (statistical machine translation) from Japanese to English. Patent texts are rich in OOV (out-of-vocabulary) words, which cause word segmentation problems. To minimize these problems, character-based translation phrases are introduced first, so that segmentation errors are avoided in translation modeling. Word-based translation phrases, which are built to utilize the dependent word-level information, are then combined with the character-based translation table by linearly integrating their probabilities. Our experiments on the NTCIR corpus indicate that the proposed method significantly outperforms the original word-based approach.
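
The linear integration of the two phrase tables described above can be illustrated with a minimal sketch. The abstract does not give the exact formula, so this assumes the common interpolation form p(e|f) = λ · p_char(e|f) + (1 − λ) · p_word(e|f). The names `combine_phrase_tables`, `normalize`, and `lam`, the whitespace-stripping of the source side, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# A phrase table maps (source phrase, target phrase) to a translation probability.
PhraseTable = Dict[Tuple[str, str], float]


def normalize(table: PhraseTable) -> PhraseTable:
    """Strip whitespace from the source side so that character-segmented and
    word-segmented entries for the same surface string share one key.
    Assumption: if two entries collide, the higher probability is kept."""
    out: PhraseTable = {}
    for (source, target), prob in table.items():
        key = ("".join(source.split()), target)
        out[key] = max(prob, out.get(key, 0.0))
    return out


def combine_phrase_tables(
    char_table: PhraseTable, word_table: PhraseTable, lam: float = 0.5
) -> PhraseTable:
    """Linearly interpolate two phrase tables (assumed form, see lead-in).
    A pair missing from one table contributes probability 0.0 from that table."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    char_norm = normalize(char_table)
    word_norm = normalize(word_table)
    combined: PhraseTable = {}
    for pair in set(char_norm) | set(word_norm):
        p_char = char_norm.get(pair, 0.0)
        p_word = word_norm.get(pair, 0.0)
        combined[pair] = lam * p_char + (1.0 - lam) * p_word
    return combined


if __name__ == "__main__":
    # Toy data only; these numbers are invented for illustration.
    char_table = {("特 許", "patent"): 0.6}
    word_table = {("特許", "patent"): 0.8, ("特許", "patents"): 0.2}
    for pair, prob in sorted(combine_phrase_tables(char_table, word_table).items()):
        print(pair, round(prob, 3))
```

In this sketch the weight `lam` would be tuned on held-out data. A real system would also need to decide how the combined table is fed to the decoder, which the abstract does not specify.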

Copyright
© 2008, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Series
Advances in Intelligent Systems Research
Publication Date
December 2008
ISBN
978-90-78677-18-5
ISSN
1951-6851

Cite this article

TY  - CONF
AU  - Junguo Zhu
AU  - Muyun Yang
AU  - Tiejun Zhao
AU  - Sheng Li
AU  - Qi Haoliang
PY  - 2008/12
DA  - 2008/12
TI  - A Phrase Combination Approach to Patent SMT
BT  - Proceedings of the 11th Joint Conference on Information Sciences (JCIS 2008)
PB  - Atlantis Press
SP  - 590
EP  - 594
SN  - 1951-6851
UR  - https://doi.org/10.2991/jcis.2008.99
DO  - https://doi.org/10.2991/jcis.2008.99
ID  - Zhu2008/12
ER  -