Liqing Wang*, Yiheng Xiao
Intelligent Automation & Soft Computing, Vol.39, No.4, pp. 619-631, 2024, DOI:10.32604/iasc.2024.052971
- 06 September 2024
Abstract Neural Machine Translation is one of the key research directions in Natural Language Processing. However, limited by the scale and quality of parallel corpora, the translation quality of low-resource Neural Machine Translation has always been unsatisfactory. When Reinforcement Learning from Human Feedback (RLHF) is applied to low-resource machine translation, it commonly encounters substandard preference data quality and the high cost of collecting manual feedback data. Therefore, a more cost-effective method for obtaining feedback data is proposed. First, the quality of preference data is optimized through prompt engineering of a Large Language Model (LLM), More >
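The LLM-based preference construction the abstract alludes to can be sketched as follows. This is an illustrative assumption, not the paper's actual pipeline: `query_llm` is a hypothetical stand-in for a real LLM API call, and the prompt wording is invented for demonstration.

```python
# Illustrative sketch of building an RLHF preference pair by prompting an LLM
# to judge which of two candidate translations is better. All names here
# (build_ranking_prompt, make_preference_pair, query_llm) are hypothetical.

def build_ranking_prompt(source: str, cand_a: str, cand_b: str) -> str:
    """Compose a prompt asking the LLM which translation is better."""
    return (
        "You are a translation quality judge.\n"
        f"Source sentence: {source}\n"
        f"Translation A: {cand_a}\n"
        f"Translation B: {cand_b}\n"
        "Answer with a single letter, A or B, for the better translation."
    )

def make_preference_pair(source, cand_a, cand_b, query_llm):
    """Return a (chosen, rejected) pair based on the LLM's judgment."""
    answer = query_llm(build_ranking_prompt(source, cand_a, cand_b))
    if answer.strip().upper().startswith("A"):
        return cand_a, cand_b
    return cand_b, cand_a

# Usage with a stubbed LLM that always answers "A":
chosen, rejected = make_preference_pair(
    "Bonjour le monde", "Hello world", "Hi earth", lambda prompt: "A"
)
```

Pairs produced this way can then feed a standard preference-based objective (e.g. a reward model or DPO-style loss), replacing costly human annotation with LLM judgments.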