Development of a Syntax-Based Model for English - Igbo Statistical Machine Translation

Authors

  • Esan Adebimpe Federal University Oye-Ekiti, Ekiti State, Nigeria
  • John B. Oladosu LAUTECH, Ogbomoso, Oyo State, Nigeria

Keywords:

Syntax, Religious, Language model, Translation model, Domain and Corpora

Abstract

Existing statistical machine translators produce semantic errors because of syntactic differences between English and Igbo. This research therefore developed a syntax-based model for English-Igbo statistical machine translation (SMT). A parallel corpus was obtained from the religious domain, and the English and Igbo corpora were word-aligned with GIZA++. The Hidden Markov Model used the word alignments produced by GIZA++ to estimate a maximum-likelihood translation table. The language model for the target language was built with the IRSTLM toolkit, and the model was tuned using minimum error rate training (MERT). The developed SMT system was evaluated using BLEU and NIST, and the results were compared with those of an existing related work. The developed model outperformed the previous system by up to 0.3 in BLEU score and 3.0 in NIST score.
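
The step of estimating a maximum-likelihood translation table from word alignments can be illustrated with a minimal sketch. This is not the authors' implementation or the GIZA++ pipeline itself; it only shows the relative-frequency idea, t(target | source) = count(source, target) / count(source), on toy data. The function name `lexical_translation_table`, the alignment format (pairs of source and target word positions), and the example tokens are all assumptions made for illustration; the tokens are not drawn from the paper's corpus and have not been checked as correct Igbo.

```python
from collections import Counter, defaultdict


def lexical_translation_table(sentence_pairs, alignments):
    """Estimate t(target | source) by relative frequency over aligned word pairs.

    sentence_pairs: list of (source_tokens, target_tokens) tuples.
    alignments: one list of (source_index, target_index) links per sentence
                pair (hypothetical format, assumed for this sketch).
    """
    pair_counts = Counter()
    source_counts = Counter()
    for (src, tgt), links in zip(sentence_pairs, alignments):
        for i, j in links:
            pair_counts[(src[i], tgt[j])] += 1
            source_counts[src[i]] += 1

    # Maximum-likelihood estimate: normalise each pair count by the
    # number of alignment links leaving the source word.
    table = defaultdict(dict)
    for (s, t), count in pair_counts.items():
        table[s][t] = count / source_counts[s]
    return table


# Toy, illustrative data only (not from the paper's corpus).
sentence_pairs = [
    (["god", "is", "good"], ["chineke", "di", "mma"]),
    (["god", "gives", "water"], ["chukwu", "na-enye", "mmiri"]),
]
alignments = [
    [(0, 0), (1, 1), (2, 2)],
    [(0, 0), (1, 1), (2, 2)],
]

table = lexical_translation_table(sentence_pairs, alignments)
for source_word, targets in table.items():
    print(source_word, targets)
```

In this toy setting the source word "god" is linked to two different target tokens once each, so each receives half of the probability mass, while a word linked to a single target receives all of it. A full system would estimate these counts over the whole parallel corpus and combine them with the language model during decoding.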

Author Biographies

Esan Adebimpe, Federal University Oye-Ekiti, Ekiti State, Nigeria

Department of Computer Engineering

John B. Oladosu, LAUTECH, Ogbomoso, Oyo State, Nigeria

Department of Computer Engineering


Published

2021-07-07

How to Cite

Adebimpe, E., & Oladosu, J. B. (2021). Development of a Syntax-Based Model for English - Igbo Statistical Machine Translation. LAUTECH JOURNAL OF COMPUTING AND INFORMATICS, 2(1), 69-78. Retrieved from https://laujci.lautech.edu.ng/index.php/laujci/article/view/43

Section

Articles