Open Access

ARTICLE

TIPS: Tailored Information Extraction in Public Security Using Domain-Enhanced Large Language Model

Yue Liu1, Qinglang Guo2, Chunyao Yang1, Yong Liao1,*
1 School of Cyber Science and Technology, University of Science and Technology of China, Hefei, 230026, China
2 National Engineering Research Center for Public Safety Risk Perception and Control by Big Data, China Academy of Electronics and Information Technology, Beijing, 100041, China
* Corresponding Author: Yong Liao.

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.060318

Received 29 October 2024; Accepted 17 February 2025; Published online 26 March 2025

Abstract

Processing police incident data in public security involves complex natural language processing (NLP) tasks, including information extraction. These data contain extensive entity information, such as people, locations, and events, and also require reasoning tasks such as personnel classification, relationship judgment, and implicit inference. Moreover, extracting information from police incident data faces a significant challenge: data scarcity, which limits the effectiveness of traditional rule-based and machine-learning methods. To address these challenges, we propose TIPS. In collaboration with public security experts, we used de-identified police incident data to create templates whose data slots large language models (LLMs) populate to generate simulated data, enhancing data density and diversity. We then designed schemas to efficiently manage complex extraction and reasoning tasks, constructed a high-quality dataset, and fine-tuned multiple open-source LLMs. Experiments showed that the fine-tuned ChatGLM-4-9B model achieved an F1 score of 87.14%, nearly 30% higher than the base model, significantly reducing error rates. Manual corrections further improved performance by 9.39%. This study demonstrates that combining large-scale pre-trained models with limited high-quality domain-specific data can greatly enhance information extraction in low-resource environments, offering a new approach for intelligent public security applications.

Keywords

Public security; information extraction; large language model; prompt engineering
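
As a rough illustration of the template-and-slot idea described in the abstract, the sketch below fills a template's slots and scores extracted fields with a micro-averaged F1. It is not the authors' implementation: the template text, the slot names, the `slot_filler` callable (a stand-in for an LLM call), and the choice of micro-averaging over (field, value) pairs are all assumptions made for this example.

```python
from string import Formatter
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical template; the actual templates used in TIPS are not given in the abstract.
TEMPLATE = "At {time}, {reporter} reported that {suspect} took a {item} near {location}."


def fill_template(template: str, slot_filler: Callable[[str], str]) -> Tuple[str, Dict[str, str]]:
    """Fill every slot in `template` and return (text, gold labels).

    `slot_filler` stands in for an LLM call that proposes a value for a slot name.
    The returned labels double as the ground-truth extraction for the generated text.
    """
    labels: Dict[str, str] = {}
    for _, field, _, _ in Formatter().parse(template):
        # `field` is None for trailing literal text; skip slots already filled.
        if field and field not in labels:
            labels[field] = slot_filler(field)
    return template.format(**labels), labels


def micro_f1(gold: List[Set[Tuple[str, str]]], pred: List[Set[Tuple[str, str]]]) -> float:
    """Micro-averaged F1 over (field, value) pairs, one set per record."""
    true_pos = sum(len(g & p) for g, p in zip(gold, pred))
    if true_pos == 0:
        return 0.0
    precision = true_pos / sum(len(p) for p in pred)
    recall = true_pos / sum(len(g) for g in gold)
    return 2 * precision * recall / (precision + recall)
```

Because each generated sentence comes with the slot values that produced it, a simulated record carries its own labels; comparing a model's extracted (field, value) pairs against those labels with `micro_f1` gives one possible way to compute an F1 score of the kind the abstract reports.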