Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables

Ao Shen; Zhiquan Lai; Dongsheng Li; Xiaoyu Hu

doi:10.32604/cmc.2024.057491

Open Access

ARTICLE

Ao Shen¹, Zhiquan Lai¹,*, Dongsheng Li¹,*, Xiaoyu Hu²

¹ National Key Laboratory of Parallel and Distributed Computing, National University of Defense Technology, Changsha, 410073, China
² Strategic Assessments and Consultation Institute, Academy of Military Science, Beijing, 100091, China

* Corresponding Authors: Zhiquan Lai; Dongsheng Li

Computers, Materials & Continua 2025, 82(1), 307-325. https://doi.org/10.32604/cmc.2024.057491

Received 19 August 2024; Accepted 10 October 2024; Issue published 03 January 2025

Abstract

Large-scale Language Models (LLMs) have achieved significant breakthroughs in Natural Language Processing (NLP), driven by the pre-training and fine-tuning paradigm. While this approach allows models to specialize in specific tasks at reduced training cost, the substantial memory requirements of fine-tuning remain a barrier to broader deployment. Parameter-Efficient Fine-Tuning (PEFT) techniques, such as Low-Rank Adaptation (LoRA), and parameter quantization methods have emerged to address these challenges by reducing memory usage and improving computational efficiency. Among these, QLoRA, which combines PEFT and quantization, has demonstrated notable success in reducing the memory footprint of fine-tuning, prompting the development of various QLoRA variants. Despite these advancements, the quantitative impact of key variables on the fine-tuning performance of quantized LLMs remains underexplored. This study presents a comprehensive analysis of these key variables, focusing on their influence across different layer types and depths within LLM architectures. Our investigation uncovers several critical findings: (1) Larger layers, such as MLP layers, can maintain performance despite reductions in adapter rank, while smaller layers, like self-attention layers, are more sensitive to such changes; (2) The effectiveness of balancing factors depends more on their specific values than on layer type or depth; (3) In quantization-aware fine-tuning, larger layers can effectively utilize smaller adapters, whereas smaller layers struggle to do so. These insights suggest that layer type is a more significant determinant of fine-tuning success than layer depth when optimizing quantized LLMs. Moreover, for the same reduction in trainable parameters, removing them from a larger layer preserves fine-tuning accuracy better than removing them from a smaller one. This study provides valuable guidance for more efficient fine-tuning strategies and opens avenues for further research into optimizing LLM fine-tuning in resource-constrained environments.
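
To make the adapter-rank and layer-size terms concrete, the sketch below counts the trainable parameters a LoRA adapter adds to one linear layer. It relies only on the standard LoRA shape (a rank-r pair of matrices, giving r × (d_in + d_out) parameters). The layer dimensions and the helper name `lora_params` are illustrative assumptions, not values or code from the paper.

```python
def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters of one LoRA adapter.

    LoRA adds two matrices: A with shape (rank, d_in) and
    B with shape (d_out, rank), so the count is rank * (d_in + d_out).
    """
    return rank * (d_in + d_out)


# Assumed, illustrative dimensions (not taken from the paper).
HIDDEN = 4096         # width of a self-attention projection
INTERMEDIATE = 11008  # width of the MLP up-projection

for rank in (16, 8):
    attn = lora_params(HIDDEN, HIDDEN, rank)
    mlp = lora_params(HIDDEN, INTERMEDIATE, rank)
    print(f"rank={rank:2d}  attention adapter={attn:,}  MLP adapter={mlp:,}")
```

Under these assumed sizes, halving the rank removes more trainable parameters from the MLP adapter than from the attention adapter, which is the kind of trade-off the findings above compare.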

Keywords

Large-scale Language Model; Parameter-Efficient Fine-Tuning; parameter quantization; key variable; trainable parameters; experimental analysis

Cite This Article

APA Style

Shen, A., Lai, Z., Li, D., Hu, X. (2025). Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables. Computers, Materials & Continua, 82(1), 307–325. https://doi.org/10.32604/cmc.2024.057491

Vancouver Style

Shen A, Lai Z, Li D, Hu X. Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables. Comput Mater Contin. 2025;82(1):307–325. https://doi.org/10.32604/cmc.2024.057491

IEEE Style

A. Shen, Z. Lai, D. Li, and X. Hu, “Optimizing Fine-Tuning in Quantized Language Models: An In-Depth Analysis of Key Variables,” Comput. Mater. Contin., vol. 82, no. 1, pp. 307–325, 2025. https://doi.org/10.32604/cmc.2024.057491

Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
