Abstract:The encrypted query integrity authentication mechanism can provide assurance for the reliability of the query results while protecting the data privacy of artificial intelligence applications. However, the existing encrypted range query integrity authentication methods suffer from high overhead in authentication data structure construction and poor data scalability. To address these issues, the causes of performance bottlenecks in secure verifiable and efficient framework(ServeDB)are analyzed. Based on the analysis conclusions, a cube-cell-based authentication method(CubeTree) is proposed for the encrypted range query integrity authentication problem. A quantile-normalization-based data redistribution optimization is adopted to balance the data distribution in the domain. The encoding overheads of data records are reduced by the data redistribution optimization. Furthermore, a flat balanced K-ary tree structure and a cube-cell-based index authentication data structure are proposed. The redundancy of the authentication data structure is significantly reduced by merging data records with same codes and adopting cube cells as the basic units. Consequently, the computational and storage costs of the CubeTree construction are decreased. Experiments on real-world and synthetic datasets show that CubeTree can significantly reduce the construction costs of the authentication data structure and the generation/verification costs of query integrity proofs, while efficiently handling large-scale datasets with hundreds of millions of data records.
王肇康, 潘佳辉, 周璐. 支持亿级数据的高效密文范围查询完整性验证[J]. 模式识别与人工智能, 2024, 37(1): 27-46.
WANG Zhaokang, PAN Jiahui, ZHOU Lu. Efficient Encrypted Range Query Integrity Authentication for Hundreds of Millions of Records. Pattern Recognition and Artificial Intelligence, 2024, 37(1): 27-46.
[1] WANG J F, CHEN X F, HUANG X Y, et al. Verifiable Auditing for Outsourced Database in Cloud Computing. IEEE Transactions on Computers, 2015, 64(11): 3293-3303. [2] ZHANG B, DONG B X, WANG W H.Integrity Authentication for SQL Query Evaluation on Outsourced Databases: A Survey. IEEE Transactions on Knowledge and Data Engineering, 2021, 33(4): 1601-1618. [3] XIE M, WANG H X, YIN J, et al. Integrity Auditing of Outsourced Data // Proc of the 33rd International Conference on Very Large Data Bases. New York, USA: ACM, 2007: 782-793. [4] DEVANBU P, GERTZ M, MARTEL C, et al. Authentic Data Publication over the Internet. Journal of Computer Security, 2003, 11(3): 291-314. [5] PANG H H, TAN K L.Authenticating Query Results in Edge Computing // Proc of the 20th International Conference on Data Engineering. Washington, USA: IEEE, 2004. DOI: 10.1109/ICDE.2004.1320027. [6] YANG Y, PAPADIAS D, PAPADOPOULOS S, et al. Authenticated Join Processing in Outsourced Databases // Proc of the ACM SIGMOD International Conference on Management of Data. New York, USA: ACM, 2009: 5-18. [7] NUCKOLLS G.Verified Query Results from Hybrid Authentication Trees // Proc of the IFIP Annual Conference on Data and Applications Security and Privacy. Berlin, Germany: Springer, 2005: 84-98. [8] CHENG W W, PANG H, TAN K L.Authenticating Multi-dimensional Query Results in Data Publishing // Proc of the IFIP Annual Conference on Data and Applications Security and Privacy. Berlin, Germany: Springer, 2006: 60-73. [9] NARASIMHA M, TSUDIK G.Authentication of Outsourced Databases Using Signature Aggregation and Chaining // Proc of the 11th International Conference on Database Systems for Advanced Applications. Berlin, Germany: Springer, 2006: 420-436. [10] PANG H, ZHANG J L, MOURATIDIS K.Scalable Verification for Outsourced Dynamic Databases. Proceedings of the VLDB Endowment, 2009, 2(1): 802-813. [11] PAPADOPOULOS D, PAPADOPOULOS S, TRIANDOPOULOS N.Taking Authenticated Range Queries to Arbitrary Dimensions // Proc of the ACM SIGSAC Conference on Computer and Communications Security. New York, USA: ACM, 2014: 819-830. [12] ZHANG Y P, KATZ J, PAPAMANTHOU C.IntegriDB: Verifi-able SQL for Outsourced Databases // Proc of the 22nd ACM SIGSAC Conference on Computer and Communications Security. New York, USA: ACM, 2015: 1480-1491. [13] REN K, GUO Y, LI J Q, et al. HybrIDX: New Hybrid Index for Volume-Hiding Range Queries in Data Outsourcing Services // Proc of the IEEE 40th International Conference on Distributed Computing Systems. Washington, USA: IEEE, 2020: 23-33. [14] ZHAN Y, SHEN D F, DUAN P, et al. MDOPE: Efficient Multi-dimensional Data Order Preserving Encryption Scheme. Information Sciences, 2022, 595: 334-343. [15] ZHANG S N, RAY S, LU R X.SOREL: Efficient and Secure ORE-Based Range Query over Outsourced Data. IEEE Transactions on Big Data, 2022, 8(6): 1702-1715. [16] TIAN P X, GUO C, CHOO K K R, et al. EAFS: An Efficient, Accurate, and Forward Secure Searchable Encryption Scheme Su-pporting Range Search. IEEE Systems Journal, 2022, 16(2): 3450-3460. [17] GUO C, SU S H, CHOO K K R, et al. A Provably Secure and Effi-cient Range Query Scheme for Outsourced Encrypted Uncertain Data from Cloud-Based Internet of Things Systems. IEEE Internet of Things Journal, 2022, 9(3): 1848-1860. [18] TIAN P X, GUO C, CHOO K K R, et al. A Privacy-Preserving Hybrid Range Search Scheme over Encrypted Electronic Medical Data in IoT Systems. IEEE Internet of Things Journal, 2023, 10(17): 15314-15324. [19] WU S R, LI Q, LI G L, et al. ServeDB: Secure, Verifiable, and Efficient Range Queries on Outsourced Database // Proc of the IEEE 35th International Conference on Data Engineering. Washing-ton, USA: IEEE, 2019: 626-637. [20] MENG Q, WENG J, MIAO Y B, et al. Verifiable Spatial Range Query Over Encrypted Cloud Data in VANET. IEEE Transactions on Vehicular Technology, 2021, 70(12): 12342-12357. [21] CUI N N, YANG X C, CHEN Y L, et al. Secure Boolean Spatial Keyword Query with Lightweight Access Control in Cloud Environments. IEEE Internet of Things Journal, 2022, 9(12): 9503-9514. [22] MA W J, PENG Y G, LIU X M, et al. VeriRange: A Verifiable Range Query Model on Encrypted Geographic Data for loT Environment. IEEE Internet of Things Journal, 2023. DOI: 10.1109/JIOT.2023.3294589. [23] ZHANG M H, XIE Z L, YUE C, et al. Spitz: A Verifiable Database System. Proceedings of the VLDB Endowment, 2020, 13(12): 3449-3460. [24] WANG Q, ZHOU F C, ZHOU B Y, et al. Privacy-Preserving Publicly Verifiable Databases. IEEE Transactions on Dependable and Secure Computing, 2022, 19(3): 1639-1654. [25] YANG D Q, ZHANG D Q, ZHENG V W, et al. Modeling User Activity Preference by Leveraging User Spatial Temporal Characte-ristics in LBSNs. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2015, 45(1): 129-142. [26] CHO E, MYERS S A, LESKOVEC J.Friendship and Mobility: User Movement in Location-Based Social Networks // Proc of the 17th ACM SIGKDD International Conference on Knowledge Disco-very and Data Mining. New York, USA: ACM, 2011: 1082-1090. [27] BERNARDINELI D L. eBird Brazilian Birdwatching Observations: An Comprehensive Geo-Spatial Dataset for Bird Observation[EB/OL].[2023-09-20]. https://www.kaggle.com/datasets/danlessa/ebird-brazilian-observations. [28] YANG D Q, ZHANG D Q, QU B Q.Participatory Cultural Ma-pping Based on Collective Behavior Data in Location-Based Social Networks. ACM Transactions on Intelligent Systems and Technology, 2016, 7(3): 1-23. [29] SOHRABI B 2.3 Million US Wildfires(1992-2020) 6th Edition: Spatial Wildfire Occurrence Data for the United States, 1992-2020[EB/OL]. [2023-09-20]. https://www.kaggle.com/datasets/behroozsohrabi/us-wildfire-records-6th-edition.