
Open Access

ARTICLE

Event-Driven Attention Network: A Cross-Modal Framework for Efficient Image-Text Retrieval in Mass Gathering Events

Kamil Yasen1,#, Heyan Jin2,#, Sijie Yang2, Li Zhan2, Xuyang Zhang2, Ke Qin1,3, Ye Li2,3,*
1 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
2 School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
3 Kashi Institute of Electronics and Information Industry, Kashi, 844508, China
* Corresponding Author: Ye Li. Email: email
# Both Kamil Yasen and Heyan Jin contributed equally to this work

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.061037

Received 15 November 2024; Accepted 09 January 2025; Published online 18 March 2025

Abstract

Research on mass gathering events is critical for ensuring public security and maintaining social order. However, most existing work focuses on crowd behavior analysis tasks such as anomaly detection and crowd counting, and research on mass gathering behaviors remains relatively scarce. We believe real-time detection and monitoring of mass gathering behaviors are essential for mitigating potential security risks and emergencies. It is therefore imperative to develop a method capable of accurately identifying and localizing mass gatherings before disasters occur, enabling prompt and effective responses. To address this problem, we propose an innovative Event-Driven Attention Network (EDAN), which, for the first time, achieves effective image-text matching in the scenario of mass gathering events. Traditional image-text retrieval methods based on global alignment struggle to capture the local details within complex scenes, limiting retrieval accuracy. Local alignment-based methods are more effective at extracting detailed features, but they frequently process raw textual features directly; these features often contain ambiguities and redundant information that can diminish retrieval efficiency and degrade model performance. To overcome these challenges, EDAN introduces an Event-Driven Attention Module that adaptively focuses attention on image regions or textual words relevant to the event type. By calculating the semantic distance between event labels and textual content, this module significantly reduces computational complexity and enhances retrieval efficiency. To validate the effectiveness of EDAN, we construct a dedicated multimodal dataset tailored for the analysis of mass gathering events, providing a reliable foundation for subsequent studies. We conduct comparative experiments with other methods on our dataset, and the experimental results demonstrate the effectiveness of EDAN.
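The core idea of the Event-Driven Attention Module can be illustrated with a minimal sketch: word features are weighted by their semantic similarity to an embedding of the event label, so attention concentrates on event-relevant words. All names, shapes, and the cosine-similarity-plus-softmax formulation below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def event_driven_attention(event_emb, word_embs):
    """Weight word features by semantic similarity to an event-label embedding.

    event_emb: (dim,) embedding of the event label (assumed given).
    word_embs: (num_words, dim) embeddings of the words in the text.
    Returns an attended text representation of shape (dim,).
    """
    # Cosine similarity between the event embedding and each word embedding
    event_n = event_emb / np.linalg.norm(event_emb)
    words_n = word_embs / np.linalg.norm(word_embs, axis=1, keepdims=True)
    scores = words_n @ event_n                       # (num_words,)
    # Softmax turns similarities into attention weights over the words
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Attended representation: similarity-weighted sum of word features
    return weights @ word_embs                       # (dim,)

rng = np.random.default_rng(0)
attended = event_driven_attention(rng.normal(size=8), rng.normal(size=(5, 8)))
print(attended.shape)
```

Because words with low similarity to the event label receive near-zero weight, downstream alignment only needs to attend to a small, event-relevant subset of the text, which is one way the described efficiency gain could arise.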
In the image-to-text retrieval task, EDAN achieved the best performance on the R@5 metric, while in the text-to-image retrieval task, it showed superior results on both the R@5 and R@10 metrics. Additionally, EDAN achieved the best overall performance on the Rsum metric. Finally, ablation studies further verified the effectiveness of the Event-Driven Attention Module.
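The evaluation metrics above follow the standard cross-modal retrieval convention: R@K is the fraction of queries whose ground-truth match appears among the top K retrieved items, and Rsum is conventionally the sum of R@1, R@5, and R@10 over both retrieval directions. A small sketch with toy ranks (the rank lists below are made up for illustration):

```python
def recall_at_k(ranks, k):
    """Fraction of queries whose ground-truth item ranks within the top k.

    ranks: for each query, the 1-based rank of its correct match.
    """
    return sum(r <= k for r in ranks) / len(ranks)

# Toy 1-based ranks of the ground-truth item for five queries per direction
i2t = [1, 3, 7, 12, 2]   # image-to-text retrieval
t2i = [2, 1, 6, 4, 15]   # text-to-image retrieval

# Rsum: sum of R@1, R@5, R@10 (in percent) across both directions
rsum = sum(recall_at_k(r, k) * 100 for r in (i2t, t2i) for k in (1, 5, 10))
print(rsum)  # 320.0
```

A higher Rsum indicates better overall retrieval quality, since it aggregates performance at all three cutoffs in both directions.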

Keywords

Mass gathering events; image-text retrieval; attention mechanism