Transparent and Accountable Training Data Sharing in Decentralized Machine Learning Systems

Siwan Noh; Kyung-Hyune Rhee

doi:10.32604/cmc.2024.050949

Open Access

ARTICLE

Siwan Noh¹, Kyung-Hyune Rhee²,*

¹ Industrial Science Technology Research Center, Pukyong National University, Busan, 48513, South Korea
² Division of Computer Engineering, Pukyong National University, Busan, 48513, South Korea

* Corresponding Author: Kyung-Hyune Rhee. Email: email

(This article belongs to the Special Issue: Innovative Security for the Next Generation Mobile Communication and Internet Systems)

Computers, Materials & Continua 2024, 79(3), 3805-3826. https://doi.org/10.32604/cmc.2024.050949

Received 23 February 2024; Accepted 10 May 2024; Issue published 20 June 2024

Abstract

In Decentralized Machine Learning (DML) systems, participants contribute their resources to help others develop machine learning solutions. Identifying malicious contributions in such systems is challenging, which has led researchers to explore blockchain technology, whose transparency and immutability can be used to record the provenance and reliability of training data. However, storing massive datasets or implementing model evaluation processes on smart contracts incurs high computational costs. In addition, current research on preventing malicious contributions in DML systems focuses primarily on protecting models from workers who contribute incorrect or misleading data. Less attention has been paid to the scenario in which malicious requesters intentionally manipulate test data during evaluation to gain an unfair advantage. This paper proposes a transparent and accountable training data sharing method that securely shares data among potentially malicious system participants. First, we introduce a blockchain-based DML system architecture that supports secure training data sharing through the IPFS network. Second, we design a blockchain smart contract that transparently splits training datasets into training and test subsets without involving system participants. Within this system, transparent and accountable training data sharing is achieved with attribute-based proxy re-encryption. We present a security analysis of the system and conduct experiments on the Ethereum and IPFS platforms to show its feasibility and practicality.
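The idea of splitting a dataset without involving any participant can be illustrated with a small sketch. The abstract does not specify how the authors' smart contract performs the split, so the following is only an assumption-laden illustration, not the paper's algorithm: it ranks record identifiers by a SHA-256 hash of a public seed (for instance, a value already recorded on-chain) and takes the lowest-ranked fraction as the test set. The function name `split_dataset`, the `seed` parameter, and the `chunk-N` identifiers are all hypothetical.

```python
import hashlib


def split_dataset(record_ids, seed, test_ratio=0.2):
    """Deterministically split record IDs into (train, test) lists.

    Hypothetical illustration only: the split depends solely on the
    public `seed` (bytes) and the record IDs, so any observer can
    recompute it and check that neither party chose the test set.
    """
    def score(record_id):
        # Hash the public seed together with the record ID.
        digest = hashlib.sha256(seed + record_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    # Sort by hash score; break ties by ID so input order is irrelevant.
    ranked = sorted(record_ids, key=lambda r: (score(r), r))
    n_test = int(len(ranked) * test_ratio)
    return ranked[n_test:], ranked[:n_test]


if __name__ == "__main__":
    ids = [f"chunk-{i}" for i in range(10)]
    train, test = split_dataset(ids, b"public-seed")
    print("train:", train)
    print("test:", test)
```

Because the result depends only on public inputs, a requester who swaps in a hand-picked test set would produce a split that no one else can reproduce. An on-chain version would also need a seed that neither party can influence, which this sketch does not address.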

Keywords

Decentralized machine learning; data accountability; dataset sharing

Cite This Article

APA Style

Noh, S., Rhee, K. (2024). Transparent and Accountable Training Data Sharing in Decentralized Machine Learning Systems. Computers, Materials & Continua, 79(3), 3805–3826. https://doi.org/10.32604/cmc.2024.050949

Vancouver Style

Noh S, Rhee K. Transparent and Accountable Training Data Sharing in Decentralized Machine Learning Systems. Comput Mater Contin. 2024;79(3):3805–3826. https://doi.org/10.32604/cmc.2024.050949

IEEE Style

S. Noh and K. Rhee, “Transparent and Accountable Training Data Sharing in Decentralized Machine Learning Systems,” Comput. Mater. Contin., vol. 79, no. 3, pp. 3805–3826, 2024. https://doi.org/10.32604/cmc.2024.050949

Copyright © 2024 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.