Intelligent Semantic Location Privacy Protection Method for Location Based Services in Three-Dimensional Spaces
Abstract: This paper studies an intelligent semantic location privacy protection method based on 3D Geo-Indistinguishability (3D-GI). It targets the leakage of sensitive semantic locations (such as pharmacies and bookstores) in location-based services within three-dimensional (3D) spaces such as large hospitals and shopping malls. To remove the dependence on specific environments and attack models, Reinforcement Learning (RL) is used to dynamically optimize the user's semantic location privacy protection policy, and a 3D semantic location perturbation mechanism based on the Policy Hill Climbing (PHC) algorithm is proposed. This mechanism induces attackers to infer less sensitive semantic locations, thereby reducing the exposure of highly sensitive ones. To address the curse of dimensionality in complex 3D environments, a 3D semantic location perturbation mechanism based on the Proximal Policy Optimization (PPO) algorithm is further proposed. It captures environment features with a neural network and optimizes the network parameter updates with an offline policy gradient method, which improves the efficiency of perturbation policy selection. Simulation results show that the proposed methods improve both the user's semantic location privacy protection and service experience.
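Table 1 Notation
Symbol | Meaning
$ \varepsilon^{(k)} $ | Privacy budget
$ \boldsymbol{d}^{(k)} $ / $ c^{(k)} $ / $ l^{(k)} $ | Actual geographic location / semantic location / semantic location sensitivity
$ \tilde{\boldsymbol{d}}^{(k)} $ / $ \tilde{c}^{(k)} $ / $ \tilde{l}^{(k)} $ | Perturbed geographic location / semantic location / semantic location sensitivity
$ \hat{\boldsymbol{d}}^{(k)} $ / $ \hat{c}^{(k)} $ / $ \hat{l}^{(k)} $ | Inferred geographic location / semantic location / semantic location sensitivity
$ p^{(k)} $ | Privacy level
$ q^{(k)} $ | Quality of service (QoS) loss
$ u^{(k)} $ | User utility
Algorithm 1 PHC-based 3D semantic location perturbation mechanism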
Initialize the $ Q $ table, $ V $ table and $ \pi $ table, and the system parameters $ \alpha $, $ \gamma $, $ \delta $, $ \boldsymbol{d}^{(1)} $, $ c^{(1)} $, $ l^{(1)} $, $ \varpi^{(0)} $; set the number of learning iterations.
(1) For $ k = 1, 2, \cdots $ do
(2) Observe the current system state $ \boldsymbol{s}^{(k)} = \left[ \boldsymbol{d}^{(k)}, c^{(k)}, l^{(k)}, \varpi^{(k-1)} \right] $
(3) Select the location perturbation policy $ \boldsymbol{a}^{(k)} $ according to the $ \pi $ table
(4) Draw $ r $ corresponding to the privacy budget from the gamma distribution $ \varGamma\left( 3, 1/\varepsilon \right) $
(5) Obtain the perturbed location $ \tilde{\boldsymbol{d}}^{(k)} $ via Eq. (6), and look up $ \tilde{c}^{(k)} $ from the map information
(6) Request the LBS with the perturbed location $ (\tilde{\boldsymbol{d}}^{(k)}, \tilde{c}^{(k)}) $
(7) Obtain the utility $ u^{(k)} $ via Eq. (7)
(8) Update $ Q(\boldsymbol{s}^{(k)}, \boldsymbol{a}^{(k)}) $ via Eq. (8)
(9) Update $ V(\boldsymbol{s}^{(k)}) $ via Eq. (9)
(10) Update $ \pi(\boldsymbol{s}^{(k)}, \boldsymbol{a}) $ via Eq. (10)
(11) End
Algorithm 2 PPO-based 3D semantic location perturbation mechanism
Initialize the system parameters and network parameters $ \gamma $, $ \delta $, $ \boldsymbol{d}^{(1)} $, $ c^{(1)} $, $ l^{(1)} $, $ \varpi^{(0)} $, $ \theta^{(0)} $, $ \phi^{(0)} $.
(1) For $ k = 1, 2, \cdots $ do
(2) Observe the current system state $ \boldsymbol{s}^{(k)} = \left[ \boldsymbol{d}^{(k)}, c^{(k)}, l^{(k)}, \varpi^{(k-1)} \right] $
(3) Feed the state $ \boldsymbol{s}^{(k)} $ into the Actor network to obtain $ \boldsymbol{\mu}^{(k)} $ and $ \boldsymbol{\xi}^{(k)} $
(4) Obtain $ \pi_\theta(\boldsymbol{a}|\boldsymbol{s}^{(k)}) $ via Eq. (10)
(5) Select the perturbation policy $ \boldsymbol{a}^{(k)} $ according to $ \pi_\theta(\boldsymbol{a}|\boldsymbol{s}^{(k)}) $
(6) Obtain the perturbed location as in steps (4)–(5) of Algorithm 1
(7) Request the LBS with the perturbed location $ (\tilde{\boldsymbol{d}}^{(k)}, \tilde{c}^{(k)}) $
(8) Evaluate the utility via Eq. (7)
(9) Store the experience tuple $ \boldsymbol{\varPsi}^{(k)} = (\boldsymbol{s}^{(k)}, \boldsymbol{a}^{(k)}, u^{(k)}, \boldsymbol{s}^{(k+1)}) $ in the experience pool
(10) If then
(11) Sample a mini-batch of experiences from the experience pool and feed it into the Actor and Critic networks
(12) Compute the advantage function $ \hat{A}(\boldsymbol{s}^{(k)}, \boldsymbol{a}^{(k)}) $ via Eq. (12)
(13) Update the Actor network parameters $ \theta $ via Eq. (13)
(14) Update the Critic network parameters $ \phi $ via Eq. (14)
(15) End
(16) End
Table 2 Hyperparameter settings for the simulations
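Parameter | PHCLP mechanism | PPOLP mechanism
Learning rate $ \alpha $ | 0.5 | 0.001 / 0.003 (Actor/Critic)
Discount factor $ \gamma $ | 0.9 | 0.9
Clipping coefficient $ \sigma $ | - | 0.1
Batch size | - | 32
Activation function | - | Adam
Number of hidden layers (Actor/Critic) | - | 2 / 3
Hidden units (Actor/Critic) | - | 8,8 / 8,8,8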
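The core of Algorithm 1 (steps (4)–(5) and (8)–(10)) can be illustrated with a minimal Python sketch. Eqs. (6) and (8)–(10) are not reproduced in this section, so the code substitutes common textbook forms and should be read as an assumption rather than the authors' exact method. The perturbation adds a displacement of length $ r \sim \varGamma(3, 1/\varepsilon) $ in a uniformly random direction, and the learning step is the standard PHC update on tabular $ Q $, $ V $ and $ \pi $. The function names, the integer indexing of states and actions, and the value of $ \delta $ used in any call are hypothetical.

```python
import numpy as np

def perturb_3d(d, eps, rng):
    """Perturb a 3D point with r ~ Gamma(shape=3, scale=1/eps).

    Assumption: Eq. (6) adds r times a direction drawn uniformly on the
    unit sphere; the source does not show the equation.
    """
    r = rng.gamma(shape=3.0, scale=1.0 / eps)   # step (4) of Algorithm 1
    v = rng.normal(size=3)                      # isotropic Gaussian sample
    v /= np.linalg.norm(v)                      # normalize to a unit direction
    return np.asarray(d, dtype=float) + r * v   # assumed form of Eq. (6)

def phc_step(Q, V, pi, s, a, u, s_next, alpha, gamma, delta):
    """One textbook PHC update; Q, V and pi are modified in place.

    Assumption: states and actions are already discretized to integer
    indices, and Eqs. (8)-(10) follow the standard PHC rules.
    """
    n_actions = Q.shape[1]
    # Q-learning style update of the visited state-action pair
    Q[s, a] = (1.0 - alpha) * Q[s, a] + alpha * (u + gamma * V[s_next])
    # State value is the best Q-value in that state
    V[s] = Q[s].max()
    # Shift probability mass toward the greedy action, keeping pi[s] valid
    best = int(Q[s].argmax())
    for b in range(n_actions):
        if b != best:
            step = min(pi[s, b], delta / (n_actions - 1))
            pi[s, b] -= step
            pi[s, best] += step
```

With the PHCLP settings of Table 2, a call would use `alpha=0.5` and `gamma=0.9`. Capping each transfer at `pi[s, b]` keeps every probability non-negative and the row sum equal to 1.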

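Steps (3)–(5) and (12)–(13) of Algorithm 2 can be sketched in the same way. Eqs. (10)–(13) do not appear in this section, so the code assumes that $ \boldsymbol{\mu}^{(k)} $ and $ \boldsymbol{\xi}^{(k)} $ are the mean and standard deviation of a diagonal Gaussian policy, uses a one-step temporal-difference advantage, and applies the standard PPO clipped surrogate with the clipping coefficient 0.1 from Table 2. All function names are hypothetical, and the neural networks themselves are omitted.

```python
import numpy as np

def gaussian_log_prob(a, mu, xi):
    """Log-density of a diagonal Gaussian policy.

    Assumption: mu is the mean and xi the standard deviation output by
    the Actor network; inputs have shape (..., action_dim).
    """
    a, mu, xi = (np.asarray(x, dtype=float) for x in (a, mu, xi))
    per_dim = -0.5 * ((a - mu) / xi) ** 2 - np.log(xi) - 0.5 * np.log(2.0 * np.pi)
    return np.sum(per_dim, axis=-1)

def td_advantage(u, v, v_next, gamma=0.9):
    """One-step TD advantage u + gamma * V(s') - V(s) (assumed form of Eq. (12))."""
    return np.asarray(u, dtype=float) + gamma * np.asarray(v_next) - np.asarray(v)

def ppo_clip_objective(logp_new, logp_old, adv, clip=0.1):
    """Standard PPO clipped surrogate, averaged over the mini-batch.

    The Actor parameters would be updated to maximize this value
    (assumed form of Eq. (13)).
    """
    ratio = np.exp(np.asarray(logp_new) - np.asarray(logp_old))
    clipped = np.clip(ratio, 1.0 - clip, 1.0 + clip)
    adv = np.asarray(adv, dtype=float)
    return np.mean(np.minimum(ratio * adv, clipped * adv))
```

Clipping the probability ratio to $ [1 - \sigma, 1 + \sigma] $ limits how far a single mini-batch update can move the policy away from the one that collected the experience, which is the usual motivation for PPO over an unconstrained policy gradient step.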