Computer Modeling in Engineering & Sciences |
DOI: 10.32604/cmes.2021.015210
ARTICLE
Attribute-Based Keyword Search over the Encrypted Blockchain
1Faculty of Information Technology, Beijing University of Technology, Beijing, 100124, China
2School of Computing and Information Systems, The University of Melbourne, Victoria, 3010, Australia
*Corresponding Author: Haiyang Yu. Email: yuhaiyang@bjut.edu.cn
Received: 02 December 2020; Accepted: 15 March 2021
Abstract: To address privacy concerns, data in the blockchain should be encrypted in advance to avoid data access from all users in the blockchain. However, encrypted data cannot be directly retrieved, which hinders data sharing in the blockchain. Several works have been proposed to deal with this problem. However, the data retrieval in these schemes requires the participation of data owners and lacks finer-grained access control. In this paper, we propose an attribute-based keyword search scheme over the encrypted blockchain, which allows users to search encrypted files over the blockchain based on their attributes. In addition, we build a file chain structure to improve the efficiency of searching files with the same keyword. Security analysis proves the security of the proposed scheme. Theoretical analysis and experimental results in performance evaluation show that our scheme is feasible and efficient.
Keywords: Blockchain; searchable encryption; attribute-based encryption; smart contract
As an emerging decentralized technology, blockchain has gained attention worldwide over the past decade, which can be applied to many scenarios such as IoT, 5G networks [1], etc. Among the many appealing features of blockchain, highly redundant storage enables each user to possess a full copy of the blockchain, which means users can share their data with anyone in the blockchain network directly for many data driven applications [2]. This feature makes the blockchain a perfect technique for constructing data sharing schemes. However, since data is public to all users, it can be accessed by both data users and external (undesirable) users in the blockchain network, thus giving rise to privacy issues and malicious attacks [3]. To achieve confidentiality, data owners usually opt to encrypt their data before submitting it to the blockchain. Although encryption protects the data from being deciphered and understood by anyone in the network, the encrypted files hinder data users from conducting search operations on the blockchain. A search mechanism over the encrypted blockchain is thus desirable to achieve both data confidentiality and usability.
Several works [4–8] have focused on keyword search over the encrypted blockchain. However, existing schemes mostly adopted symmetric encryption and thereby have several major drawbacks. Firstly, since data users have to interact with each data owner before searching the data, these schemes cannot support efficient data retrieval over the whole blockchain. Secondly, interactions require data owners and users to share either the secret key or keyword, which can give rise to privacy erosion. Thirdly, since the authorization is entirely dependent on data owners, access control [9] in searching the blockchain is not supported. As an example of these issues, consider the following situation. There are a lot of users in the blockchain network belonging to the same university or company who have similar features or attributes and want to search shared data within the group (and only within the group). Contacting data owners individually to get the search permission makes the data sharing process highly inflexible and inefficient.
Traditional symmetric encryption schemes cannot address these issues well. Therefore, we propose an attribute-based keyword search scheme on blockchain. Data users can get the secret key from a trusted third party and use it to search on the blockchain without any interactions from the data owner, which avoids the possible privacy leakage of users. By improving the structure of the encrypted index, we achieved fine-grained access control and efficient search, which allows users to search encrypted files over the blockchain based on their attributes efficiently and accurately.
In summary, our contributions include:
1. We proposed an attribute-based keyword search scheme enabling the searching of encrypted data over blockchains. By introducing Attribute-Based Encryption (ABE), the proposed scheme can achieve data retrieval over the whole blockchain with support for fine-grained access control.
2. Motivated by the blockchain data structure, we design a data structure that is comprised of chained files with the same keyword to address the efficiency of encrypted data retrieval over blockchains.
3. We analyze the security of our design and further implement a prototype of the proposed scheme on the Hyperledger Fabric platform. Theoretical analysis and experimental evaluation show the efficiency and feasibility of the proposed scheme.
The concept of searchable encryption was first proposed by Song et al. [10] in 2000. This scheme could search encrypted data stored in remote servers by leveraging symmetric encryption methods. In 2004, Boneh et al. [11] first proposed a public-key encryption scheme with keyword search. However, the keyword privacy was not protected by this scheme. Subsequently, extensive research has been carried out on both symmetric [12–15] and asymmetric searchable encryption [16–19].
Searchable encryption has been well studied in cloud environments, however, limited research has been conducted on searching keywords on blockchains. Li et al. [20] presented a searchable encryption scheme using blockchain, which utilized blockchain techniques to ensure transaction security during the process of outsourced data retrieval. Cai et al. [21] designed an encrypted and distributed search protocol. This scheme considered submitted the retrieval results to the blockchain to achieve verifiability. Zhang et al. [22] proposed a trustworthy keyword search scheme over encrypted data based on blockchain. They adopted blockchains to ensure the security and fairness of payment for data retrieval. However, all these schemes did not search encrypted data over the blockchain, rather they only enhanced the reliability of transaction and payment occurrences during data retrieval over encrypted blockchains. Jiang et al. [7] proposed a stealth authorization scheme on blockchains to achieve private authorization delivery between data users and data owners, but the data user still had to inform the data owner of the search keywords. Tian et al. [23] designed a storage framework based on blockchain to protect digital data, which ensures the integrity and validity of evidence and better balances privacy and traceability. Yang et al. [8] proposed a blockchain based multi-keyword ranked search system, which realized the top-k ranked ciphertext retrieval on blockchain and used smart contracts to only return the most relevant encrypted files to data users. However, this scheme limited the structure of the index and therefore limited its scalability; it also increased the size of the index. Tahir et al. [24] proposed a privacy preserving searchable encryption framework to achieve data retrieval over encrypted blockchains. By introducing a probabilistic trapdoor, this scheme was able to protect the privacy of keywords. Unfortunately, this scheme only supported data retrieval by the data owner. Hu et al. [4] considered the use of smart contracts to support data retrieval. However, this scheme required the data user to send keywords to the data owner and the data owner would then search the dataset on behalf of the data user due to the adoption of symmetric searchable encryption. When considering searching over blockchains, the data user even needed to send keywords to all data owners, which is neither efficient nor privacy-preserving. Therefore, none of the above schemes meet the needs of data owners who want to share their data and achieve fine-grained access control while users who want to search independently on the blockchain without relying on the data owner.
Let G1 and GT be two multiplicative cyclic groups of large prime order q. Denote g and gT as the generators of G1 and GT, respectively. A bilinear map is a map e: G1 × G1 → GT which satisfies the following three properties:
1) Bilinearity:
for
2) Non-degeneracy:
3) Computability:
there exists an efficient algorithm to compute the map e.
The system model of the proposed scheme consists of three entities, as illustrated in Fig. 1: the attribute authority (AA), the data owner and the data user. An AA is a trusted third party responsible for distributing secret keys to users in the blockchain based on their attributes. Data owners possess files that can be shared within a group of data users with similar attributes. A data owner generates encrypted indexes and submits them to the blockchain. The data user encrypts the searching keyword and sends an encrypted token to the blockchain to retrieve the files it wants. The blockchain system is maintained and shared by all data owners and data users. A smart contract in the blockchain retrieves files and returns the search results to the data user.
The design goals of the proposed scheme are as follows:
1) Flexibility:
The data user can search files of all data owners in the blockchain without interacting individually with each data owner.
2) Attributed-Based Retrieval:
The encrypted files can only be retrieved and decrypted by data users with specific security attributes.
3) Reliability:
The search should return all matched results to the data user and the results should be the same each time the same query is executed.
4) Efficiency:
The retrieval performance should be higher than traditional traversal search in the blockchain.
3.4 Definition of the Proposed Scheme
Definition 1. The proposed scheme is comprised of six polynomial time algorithms including Setup, KeyGen, IndexGen, TokenGen, Search and Decrypt.
In this paper, the Selective-Set Security Game [25,26] is introduced to prove the security of the proposed scheme. The detailed game between a challenger
Init. An adversary
Setup. The challenger
Phase 1. By submitting any access trees with attribute set
Challenge. The adversary
Phase 2. Same as Phase 1.
Guess. Based on the forementioned steps, the adversary
We say that the proposed scheme is secure if for any probabilistic polynomial time adversary
4 Attribute-Based Keyword Search over Encrypted Blockchain
In a lot of cases, there may be several files with the same keyword when searching files in the blockchain. For instance, a stock analyst in a security company may submit several reports related to the company Apple to his boss. Thus, all these files may have the same keyword ‘Apple.’ A data owner can build a structure in advance to address this issue and improve the overall retrieval performance. Motivated by the data structure of chained blocks in blockchain, we designed a File Chain Structure (FCS) as shown in Fig. 2. An FCS is a chain structure links all files with the same keyword in the blockchain. Each node in the FCS contains the address of previous file and an encrypted file. All nodes linked end to end comprise the FCS.
The FCS can simplify the retrieval process. When users search on any file with certain keywords in the FCS, the smart contract only needs to search an index table and then traverse the FCS to retrieve all files with the same keyword instead of searching for other files by different tokens.
As noted, the proposed scheme consists of six algorithms: Setup, KeyGen, IndexGen, TokenGen, Search, Decrypt.
Setup. k denotes the security parameter. Let G1 and GT be two multiplicative cyclic groups of prime order q. Let g be the generator of group G1.
KeyGen. AA generates a random value
IndexGen. The data owner first generates a keyword
The data owner then chooses random values
TokenGen. The data user randomly chooses
Search. The data user invokes the search smart contract Sc with the search token tk. The smart contract Sc will search over each ciphertext keyword
If the equation holds, it splits the file
Decrypt. The data user invokes the smart contract Sc to read the
The algorithms in our scheme are described in Fig. 3.
In this section, we illustrate the proof of correctness of Eq. (1) as follows:
thus, we get
In this paper, the AA and the blockchain infrastructure are assumed to be reliable. Thus, we focus on the secure search related privacy requirements of the proposed scheme, which mainly involves the keyword semantic security and token unlinkability.
Definition 2. (The DBDH Assumption). Let
Theorem 1. Keyword semantic security. If there exists a probabilistic polynomial time adversary that can win the Selective-Set Security Game with non-negligible probability ɛ, then we can construct a simulator S to solve the DBDH problem with non-negligible probability
Proof. The DBDH challenger
Init. The adversary
Setup. The simulator
Phase 1. The adversary
Challenge. The adversary
Phase 2. The same as Phase 1.
Guess. The adversary
If k = 1,
If k = 0,
Finally, the overall advantage of the simulator
Theorem 2. Token Unlinkability. Data users are unable to directly distinguish two or more search tokens even those containing the same keyword.
Proof. The search information submitted to the blockchain should be encrypted before submitting since it can be accessed by all users. The data user encrypts the keyword kw into tk1 with a random value s. In this way, other users cannot directly distinguish different search tokens because they are obfuscated by the random value s. Therefore, the proposed scheme can achieve token unlinkability.
We implement a prototype of the proposed scheme using almost 3760 lines of code. We develop the blockchain system based on the Hyperledger Fabric blockchain platform. Both the data owner and the service peers were deployed on a machine with 8 GB of RAM, 2 Intel cores i5-8300H, running Ubuntu 20.04 LTS. The smart contract was implemented using chaincode in the Hyperledger Fabric platform. We implemented the pseudo-random function using the secure hash algorithm (SHA1).
The database of this experiment was randomly generated, each record consisted of a keyword and a file corresponding to the keyword. Furthermore, we generated two similar-sized databases with different numbers of keywords and of the same size to study the performance of our scheme on different datasets. The database DB1 consists of 3 K keyword-file pairs, 600 keywords where each keyword matched 5 files on average and the database DB2 also consisted of 3 K keyword-file pairs but had 300 keywords where each keyword matched 10 files on average. We ran each execution 20 times and averaged the results. The performance of our scheme for the different phases was recorded.
During this phase, the data owner encrypts keywords and corresponding files into encrypted indexes and associated encrypted files. Fig. 4 shows the index generation time varying with the number of keyword- file pairs. It can be observed that the time cost increases with the number of keyword-file pairs linearly. The time for generating indexes of 600 keyword-file pairs averages to 21.2 s.
To search the blockchain, the data owner first needs to generate a search token. The time cost of token generation is shown in Fig. 5. It is obvious that the time cost increases with the number of keywords linearly because the data owner generates a search token for each keyword. The time for generating tokens of 600 keywords was 10.3 s on average.
On the service side, the smart contract accepts the search request from the data owner/data user and executes the query operation. We execute the query operation with DB1 and DB2 separately. We do not implement FCS here, which helps us focus on the performance of our scheme without additional optimization considerations. The experimental results are shown in Fig. 6. We can see that the time costs for both DB1 and DB2 increase linearly with the number of matching files. In addition, the search time for DB1 takes about twice as long as that of DB2, since the smart contract needs to traverse all the ciphertext keywords (cph) in this phase. The time for each traversal of DB1 is the same as DB2 because DB1 and DB2 are the same size. Each query on DB2 returns 10 results while DB1 returns 5 results, resulting in the time to be half of that for DB2.
To demonstrate the efficiency of FCS, we present the search time cost with and without FCS respectively. Experimental results on DB1 are shown in Fig. 7a and those on DB2 are shown in Fig. 7b. It is obvious that the efficiency of the scheme with FCS is much higher than that without FCS for both DB1 and DB2. This shows the superiority of our FCS scheme. Besides, itis easy to see that both the time costs with and without FCS linearly increase with the number of matching files. This is because each query is independent of the other.
In this paper, we consider a scenario of encrypted data retrieval where data users could search over encrypted blockchains based on their attributes. Using attribute-based encryption, we designed an attribute-based encrypted keyword search scheme for blockchains, which allows a data user to search encrypted data across the blockchain. A novel File Chain Structure was proposed to improve the retrieval efficiency when searching files with the same keywords. The security analysis proves that our scheme can provide keyword semantic security and token unlinkability. The performance evaluation shows the feasibility and efficiency of the proposed scheme.
Funding Statement: This work was supported by the National Natural Science Foundation of China (61671030), Industrial Internet Innovation Development Project, China Postdoctoral Science Foundation (2019M660377), and National Key Research and Development Program of China (2020YFB2009501). It was also supported by Engineering Research Center of Intelligent Perception and Autonomous Control, Ministry of Education.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
1. Su, S., Tian, Z., Liang, S., Li, S., Du, S. et al. (2020). A reputation management scheme for efficient malicious vehicle identification over 5G networks. IEEE Wireless Communications, 27(3), 46–52. DOI 10.1109/MWC.001.1900456. [Google Scholar] [CrossRef]
2. Tian, Z., Su, S., Shi, W., Du, X., Guizani, M. et al. (2019). A data-driven method for future Internet route decision modeling. Future Generation Computer Systems, 95(4), 212–220. DOI 10.1016/j.future.2018.12.054. [Google Scholar] [CrossRef]
3. Li, M., Sun, Y., Lu, H., Maharjan, S., Tian, Z. (2019). Deep reinforcement learning for partially observable data poisoning attack in crowdsensing systems. IEEE Internet of Things Journal, 7(7), 6266–6278. DOI 10.1109/JIOT.2019.2962914. [Google Scholar] [CrossRef]
4. Hu, S., Cai, C., Wang, Q., Wang, C., Luo, X. et al. (2018). Searching an encrypted cloud meets blockchain: A decentralized, reliable and fair realization. IEEE Conference on Computer Communications, pp. 792–800. Honolulu, IEEE. [Google Scholar]
5. Chen, L., Lee, W. K., Chang, C. C., Choo, K. K. R., Zhang, N. (2019). Blockchain based searchable encryption for electronic health record sharing. Future Generation Computer Systems, 95(3), 420–429. DOI 10.1016/j.future.2019.01.018. [Google Scholar] [CrossRef]
6. Li, H., Zhang, F., He, J., Tian, H. (2017). A searchable symmetric encryption scheme using blockchain. arXiv preprint. arXiv: 1711.01030. [Google Scholar]
7. Jiang, S., Liu, J., Wang, L., Yoo, S. M. (2019). Verifiable search meets blockchain: A privacy-preserving framework for outsourced encrypted data. 2019 IEEE International Conference on Communications, pp. 1–6. Shanghai, IEEE. [Google Scholar]
8. Yang, Y., Lin, H., Liu, X., Guo, W., Zheng, X. et al. (2019). Blockchain-based verifiable multi-keyword ranked search on encrypted cloud with fair payment. IEEE Access, 7, 140818–140832. DOI 10.1109/ACCESS.2019.2943356. [Google Scholar] [CrossRef]
9. Qiu, J., Tian, Z., Du, C., Zuo, Q., Su, S. et al. (2020). A survey on access control in the age of internet of things. IEEE Internet of Things Journal, 7(6), 4682–4696. DOI 10.1109/JIOT.2020.2969326. [Google Scholar] [CrossRef]
10. Song, D. X., Wagner, D., Perrig, A. (2000). Practical techniques for searches on encrypted data. Proceeding 2000 IEEE Symposium on Security and Privacy, pp. 44–55. Oakland, IEEE. [Google Scholar]
11. Boneh, D., Di Crescenzo, G., Ostrovsky, R., Persiano, G. (2004). Public key encryption with keyword search. International Conference on the Theory and Applications of Cryptographic Techniques, pp. 506–522. Berlin, Heidelberg: Springer. [Google Scholar]
12. Curtmola, R., Garay, J., Kamara, S., Ostrovsky, R. (2011). Searchable symmetric encryption: Improved definitions and efficient constructions. Journal of Computer Security, 19(5), 895–934. DOI 10.3233/JCS-2011-0426. [Google Scholar] [CrossRef]
13. Cash, D., Jaeger, J., Jarecki, S., Jutla, C. S., Krawczyk, H. et al. (2014). Dynamic searchable encryption in very-large databases: Data structures and implementation. NDSS, vol. 14, pp. 23–26. San Diego, The Internet Society. [Google Scholar]
14. Xia, Z., Wang, X., Sun, X., Wang, Q. (2015). A secure and dynamic multi-keyword ranked search scheme over encrypted cloud data. IEEE Transactions on Parallel and Distributed Systems, 27(2), 340–352. DOI 10.1109/TPDS.2015.2401003. [Google Scholar] [CrossRef]
15. Fu, Z., Sun, X., Linge, N., Zhou, L. (2014). Achieving effective cloud search services: Multi-keyword ranked search over encrypted cloud data supporting synonym query. IEEE Transactions on Consumer Electronics, 60(1), 164–172. DOI 10.1109/TCE.2014.6780939. [Google Scholar] [CrossRef]
16. Waters, B. R., Balfanz, D., Durfee, G., Smetters, D. K. (2004). Building an encrypted and searchable audit log. NDSS, vol. 4, pp. 5–6. San Diego, The Internet Society. [Google Scholar]
17. Baek, J., Safavi-Naini, R., Susilo, W. (2008). Public key encryption with keyword search revisited. International Conference on Computational Science and Its Applications, pp. 1249–1259. Berlin, Heidelberg: Springer. [Google Scholar]
18. Rhee, H. S., Susilo, W., Kim, H. J. (2009). Secure searchable public key encryption scheme against keyword guessing attacks. IEICE Electronics Express, 6(5), 237–243. DOI 10.1587/elex.6.237. [Google Scholar] [CrossRef]
19. Chen, R., Mu, Y., Yang, G., Guo, F., Wang, X. (2015). Dual-server public-key encryption with keyword search for secure cloud storage. IEEE Transactions on Information Forensics and Security, 11(4), 789–798. DOI 10.1109/TIFS.2015.2510822. [Google Scholar] [CrossRef]
20. Li, H., Tian, H., Zhang, F., He, J. (2019). Blockchain-based searchable symmetric encryption scheme. Computers & Electrical Engineering, 73(3), 32–45. DOI 10.1016/j.compeleceng.2018.10.015. [Google Scholar] [CrossRef]
21. Cai, C., Yuan, X., Wang, C. (2017). Towards trustworthy and private keyword search in encrypted decentralized storage. 2017 IEEE International Conference on Communications, pp. 1–7. Paris, IEEE. [Google Scholar]
22. Zhang, Y., Deng, R. H., Shu, J., Yang, K., Zheng, D. (2018). TKSE: Trustworthy keyword search over encrypted data with two-side verifiability via blockchain. IEEE Access, 6, 31077–31087. DOI 10.1109/ACCESS.2018.2844400. [Google Scholar] [CrossRef]
23. Tian, Z., Li, M., Qiu, M., Sun, Y., Su, S. (2019). Block-DEF: A secure digital evidence framework using blockchain. Information Sciences, 491(3), 151–165. DOI 10.1016/j.ins.2019.04.011. [Google Scholar] [CrossRef]
24. Tahir, S., Rajarajan, M. (2018). Privacy-preserving searchable encryption framework for permissioned blockchain networks. 2018 IEEE International Conference on Internet of Thing and IEEE Green Computing and Communications and IEEE Cyber, Physical and Social Computing and IEEE Smart Data, pp. 1628–1633. Halifax, IEEE. [Google Scholar]
25. Sun, W., Yu, S., Lou, W., Hou, Y. T., Li, H. (2014). Protecting your right: Attribute-based keyword search with fine-grained owner-enforced search authorization in the cloud. IEEE Conference on Computer Communications, pp. 226–234. Toronto, IEEE. [Google Scholar]
26. Wang, N., Fu, J., Bhargava, B. K., Zeng, J. (2018). Efficient retrieval over documents encrypted by attributes in cloud computing. IEEE Transactions on Information Forensics and Security, 13(10), 2653–2667. DOI 10.1109/TIFS.2018.2825952. [Google Scholar] [CrossRef]
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |