

Proceedings of the International Conference on Applied Innovations in IT
2025/08/29, Volume 13, Issue 4, pp. 275-281

Improving an English-Arabic Transformer-Based Machine Translation Model


Diadeen Ali Hameed and Belal Al-Khateeb


Abstract: Arabic is widely recognized as one of the most challenging languages for translation due to its rich morphology, complex syntax, and context-sensitive structures. Despite its global importance, Arabic has received significantly less attention in machine translation research than European languages, highlighting the pressing need for further work on high-quality Arabic machine translation systems. This paper proposes an enhanced model for English–Arabic machine translation tailored to the news domain, based on the Transformer architecture, which currently underpins most state-of-the-art machine translation systems. The model is augmented by incorporating external lexical alignment data at each decoding step. This integration is designed to improve the system’s handling of polysemy, contextual ambiguity, and missing-word detection, thereby enhancing translation accuracy, cohesion, and semantic fidelity. The proposed model was evaluated using BLEU and F-measure scores. It achieved a BLEU score of 87% and an F-measure of 84%, surpassing Google Translate by 6% on the same test data. These findings confirm that incorporating lexical knowledge into the Transformer framework significantly improves English–Arabic translation quality. The model not only produces more accurate translations but also demonstrates improved cohesion and contextual understanding, particularly in morphologically rich and domain-specific texts. This research underscores the value of domain-aware and linguistically informed machine translation approaches, paving the way for more effective Arabic-language translation systems in practical applications.
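The abstract reports BLEU and F-measure scores for the proposed model. As an illustration only (this is not the authors' evaluation code, and production work would typically use a standard toolkit such as sacreBLEU), a minimal sentence-level sketch of both metrics over whitespace-tokenized text might look like:

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Multiset of n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU: uniform-weight geometric mean of n-gram
    precisions with a brevity penalty; add-one smoothing when an
    n-gram order has zero matches."""
    precisions = []
    for n in range(1, max_n + 1):
        cand = ngram_counts(candidate, n)
        ref = ngram_counts(reference, n)
        overlap = sum(min(count, ref[g]) for g, count in cand.items())
        total = max(sum(cand.values()), 1)
        precisions.append(overlap / total if overlap > 0
                          else (overlap + 1) / (total + 1))
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty: punish candidates shorter than the reference.
    cand_len = max(len(candidate), 1)
    bp = 1.0 if cand_len > len(reference) else math.exp(1 - len(reference) / cand_len)
    return bp * math.exp(log_avg)

def f_measure(candidate, reference):
    """Harmonic mean of token-overlap precision and recall."""
    overlap = sum((Counter(candidate) & Counter(reference)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(candidate)
    recall = overlap / len(reference)
    return 2 * precision * recall / (precision + recall)
```

Both functions return values in [0, 1]; the percentages quoted above correspond to these scores multiplied by 100 and averaged over the test set.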

Keywords: Machine Translation, Arabic Language, Deep Learning Techniques, Transformer Model, BLEU Score.

DOI: 10.25673/122126

Download: PDF



         Proceedings of the International Conference on Applied Innovations in IT by Anhalt University of Applied Sciences is licensed under CC BY-SA 4.0




           ISSN 2199-8876
           Publisher: Edition Hochschule Anhalt
           Location: Anhalt University of Applied Sciences
           Email: leiterin.hsb@hs-anhalt.de
           Phone: +49 (0) 3496 67 5611
           Address: Building 01 - Red Building, Top floor, Room 425, Bernburger Str. 55, D-06366 Köthen, Germany

