Evaluating the Risk of Disclosure and Utility in a Synthetic Dataset

Kang-Cheng Chen; Chia-Mu Yu; Tooska Dargahi

doi:10.32604/cmc.2021.014984

ARTICLE

Kang-Cheng Chen¹, Chia-Mu Yu²,*, Tooska Dargahi³

1 Industrial Technology Research Institute, Hsinchu, 310, Taiwan
2 National Chiao Tung University, Hsinchu, 320, Taiwan
3 University of Salford, Manchester, M5 4WT, United Kingdom

* Corresponding Author: Chia-Mu Yu. Email: email

Computers, Materials & Continua 2021, 68(1), 761-787. https://doi.org/10.32604/cmc.2021.014984

Received 31 October 2020; Accepted 13 January 2021; Issue published 22 March 2021

Abstract

Advances in information technology have improved the delivery of financial services through the introduction of Financial Technology (FinTech). To enhance customer satisfaction, FinTech companies leverage artificial intelligence (AI) to collect fine-grained data about individuals, which enables them to provide more intelligent and customized services. However, although these services promise to make customers’ lives easier, they also raise major security and privacy concerns for users. Differential privacy (DP) is a common privacy-preserving data publishing technique that has been proven to ensure a high level of privacy preservation. However, an important concern arises from the trade-off between data utility and the risk of data disclosure (RoD), which has not been well investigated. In this paper, to address this challenge, we propose data-dependent approaches for evaluating whether sufficient privacy is guaranteed in a differentially private data release. At the same time, taking into account the utility of the differentially private synthetic dataset, we present a data-dependent algorithm that, through a curve fitting technique, measures the error introduced into statistical results on the original dataset by the injection of random noise. Moreover, we propose a method that ensures a proper privacy budget, i.e., ε, is chosen so as to maintain the trade-off between privacy and utility. Our comprehensive experimental analysis demonstrates both the efficiency and the estimation accuracy of the proposed algorithms.
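The trade-off between privacy and utility described above can be illustrated with a minimal sketch. It is not the algorithm proposed in the paper: it assumes the standard Laplace mechanism applied to a single counting query with sensitivity 1, and the names `private_count` and `mean_abs_error`, the true count of 1000, and the chosen values of ε are all hypothetical, introduced only for illustration. The sketch estimates empirically how the error in a noisy count grows as the privacy budget ε shrinks.

```python
import random


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Return true_count plus Laplace noise of scale sensitivity / epsilon.

    Assumption: standard Laplace mechanism on a counting query.
    The difference of two i.i.d. exponential draws with rate 1/scale
    follows a Laplace(0, scale) distribution.
    """
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


def mean_abs_error(epsilon, trials=20000, seed=0):
    """Estimate the mean absolute error of the noisy count for a given epsilon."""
    rng = random.Random(seed)
    true_count = 1000  # hypothetical query answer, for illustration only
    total = 0.0
    for _ in range(trials):
        total += abs(private_count(true_count, epsilon, rng) - true_count)
    return total / trials


if __name__ == "__main__":
    # Smaller epsilon means stronger privacy but a larger expected error.
    for eps in (0.1, 0.5, 1.0, 2.0):
        print(f"epsilon={eps}: mean absolute error ~ {mean_abs_error(eps):.3f}")
```

For Laplace noise the expected absolute error equals the noise scale, so under these assumptions the measured error should be close to 1/ε. A data-dependent evaluation such as the one the abstract describes would go beyond this closed-form relationship.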

Keywords

Differential privacy; risk of disclosure; privacy; utility

Cite This Article

APA Style

Chen, K., Yu, C., Dargahi, T. (2021). Evaluating the Risk of Disclosure and Utility in a Synthetic Dataset. Computers, Materials & Continua, 68(1), 761–787. https://doi.org/10.32604/cmc.2021.014984

Vancouver Style

Chen K, Yu C, Dargahi T. Evaluating the Risk of Disclosure and Utility in a Synthetic Dataset. Comput Mater Contin. 2021;68(1):761–787. https://doi.org/10.32604/cmc.2021.014984

IEEE Style

K. Chen, C. Yu, and T. Dargahi, “Evaluating the Risk of Disclosure and Utility in a Synthetic Dataset,” Comput. Mater. Contin., vol. 68, no. 1, pp. 761–787, 2021. https://doi.org/10.32604/cmc.2021.014984

Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.