A Survey on Deep Learning for Natural Language Processing: Models, Techniques, and Open Research Problems

Authors

  • Nguyễn Nhật Hào Can Tho University, Can Tho City, Vietnam
  • Trần Khánh Vy Can Tho University, Can Tho City, Vietnam
  • Lê Văn Phước Can Tho University, Can Tho City, Vietnam

DOI:

https://doi.org/10.63876/ijtm.v4i2.137

Keywords:

Natural Language Processing, Deep Learning, Vietnamese Language, Pretrained Language Models, Low-Resource NLP, Transformers

Abstract

In recent years, deep learning has emerged as a powerful paradigm in natural language processing (NLP), enabling significant breakthroughs in tasks such as machine translation, sentiment analysis, and question answering. This survey provides a comprehensive overview of the deep learning models and techniques that have shaped the evolution of NLP, with a particular focus on Vietnamese as a representative low-resource language. We review foundational models including recurrent neural networks (RNNs), convolutional neural networks (CNNs), and Transformer-based architectures such as BERT and GPT, and analyze their applications in Vietnamese NLP tasks. Special attention is given to the development and adaptation of Vietnamese-specific pretrained language models such as PhoBERT and ViT5, as well as to multilingual approaches for addressing data scarcity. In addition, the paper discusses practical deployments in Vietnam, including sentiment analysis of social media, Vietnamese question answering systems, and machine translation, highlighting the opportunities and challenges in this context. We also identify open research problems, including limited training data, dialectal variation, code-switching, and ethical concerns, and offer insights and directions for future work. This survey aims to serve as a resource for researchers and practitioners seeking to advance NLP capabilities in low-resource languages using deep learning.




Published

2025-08-30

How to Cite

Hào, N. N., Vy, T. K., & Phước, L. V. (2025). A Survey on Deep Learning for Natural Language Processing: Models, Techniques, and Open Research Problems. International Journal of Technology and Modeling, 4(2), 91–103. https://doi.org/10.63876/ijtm.v4i2.137

Issue

Section

Articles