Open Access

ARTICLE

VPM-Net: Person Re-ID Network Based on Visual Prompt Technology and Multi-Instance Negative Pooling

Haitao Xie, Yuliang Chen, Yunjie Zeng, Lingyu Yan, Zhizhi Wang, Zhiwei Ye*
School of Computer Science, Hubei University of Technology, Wuhan, 430068, China
* Corresponding Author: Zhiwei Ye.
(This article belongs to the Special Issue: Machine Vision Detection and Intelligent Recognition, 2nd Edition)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.060783

Received 09 November 2024; Accepted 17 February 2025; Published online 27 March 2025

Abstract

With the rapid development of intelligent video surveillance technology, pedestrian re-identification has become increasingly important in multi-camera surveillance systems, where it plays a critical role in enhancing public safety. However, traditional methods typically process images and text separately and apply upstream models directly to downstream tasks, which significantly increases training complexity and computational cost. Furthermore, class imbalance, which is common in existing training datasets, limits further gains in model performance. To address these challenges, we propose a framework named Person Re-ID Network Based on Visual Prompt Technology and Multi-Instance Negative Pooling (VPM-Net). First, we incorporate the pre-trained Contrastive Language-Image Pre-training (CLIP) model to map visual and textual features into a unified embedding space, mitigating inconsistencies in data distribution and in the training process. Second, to enhance adaptability and generalization, we introduce an efficient, task-specific Visual Prompt Tuning (VPT) technique that improves the model’s relevance to specific tasks. Additionally, we design two key modules: the Knowledge-Aware Network (KAN) and the Multi-Instance Negative Pooling (MINP) module. The KAN module enhances the model’s understanding of complex scenarios through deep contextual semantic modeling. The MINP module handles samples in a way that improves the model’s ability to distinguish fine-grained features. Experiments on diverse datasets show that VPM-Net performs strongly, demonstrating its advantages and reliability in fine-grained retrieval tasks.

Keywords

Person re-identification; multi-instance negative pooling; visual prompt tuning