Electrical and Electronic Engineering - Theses

    Privacy-preserving machine learning and data aggregation for Internet of Things
    Lyu, Lingjuan (2018)
    The proliferation of Internet of Things (IoT) devices has contributed to the emergence of participatory sensing (PS) and collaborative learning (CL). In the server-based framework, multiple participants collect and report their data to a cloud service, which analyses the union of the collected data; in the decentralized framework, multiple participants collaboratively train a more accurate global model or multiple local models. However, the possibility that the cloud service or any participant is semi-honest or malicious poses a serious challenge to preserving the participants' privacy. Privacy-preserving machine learning and data aggregation aim to discover or derive useful statistics without compromising privacy. This thesis systematically investigates state-of-the-art techniques for privacy-preserving machine learning and data aggregation in a range of IoT applications. Extensive theoretical and experimental results are provided to support the following primary contributions. First, we explore three privacy-preserving machine learning applications: collaborative anomaly detection, human activity recognition and decentralized collaboration in a biomedical domain. We tackle security challenges in collaborative anomaly detection with a two-stage scheme called RG+RT: in the first stage, participants individually perturb their data by passing them through a nonlinear function called repeated Gompertz (RG); in the second stage, the perturbed data are projected to a lower dimension using a participant-specific uniform random transformation (RT) matrix. The nonlinear RG function is designed to mitigate maximum a posteriori (MAP) estimation attacks, while the random transformation resists independent component analysis (ICA) attacks.
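The two-stage RG+RT idea described above can be illustrated with a minimal sketch. The abstract does not give the exact form of the repeated Gompertz function, so this example assumes it is the standard Gompertz curve a·exp(−b·exp(−c·x)) applied several times in succession, and it assumes the RT matrix has entries drawn uniformly from [0, 1). The function names, parameter values and seed are hypothetical and chosen only for illustration; this is not the thesis's implementation.

```python
import math
import random


def gompertz(x, a=1.0, b=1.0, c=1.0):
    """Standard Gompertz curve a * exp(-b * exp(-c * x)) (parameters assumed)."""
    return a * math.exp(-b * math.exp(-c * x))


def repeated_gompertz(x, repeats=2):
    """Assumed form of RG: apply the Gompertz curve `repeats` times in a row."""
    for _ in range(repeats):
        x = gompertz(x)
    return x


def make_transform(in_dim, out_dim, seed):
    """Participant-specific matrix with entries uniform in [0, 1) (assumption)."""
    rng = random.Random(seed)  # the seed stands in for a participant's secret
    return [[rng.uniform(0.0, 1.0) for _ in range(out_dim)]
            for _ in range(in_dim)]


def perturb_record(record, transform, repeats=2):
    """Stage 1: nonlinear perturbation. Stage 2: project to a lower dimension."""
    stage1 = [repeated_gompertz(v, repeats) for v in record]
    out_dim = len(transform[0])
    return [sum(stage1[i] * transform[i][j] for i in range(len(stage1)))
            for j in range(out_dim)]


# One participant perturbs a 4-dimensional record down to 2 dimensions.
record = [0.2, 0.5, 0.9, 0.1]
T = make_transform(in_dim=4, out_dim=2, seed=42)
reported = perturb_record(record, T)
print(reported)
```

In this sketch only `reported` would leave the participant's device; the raw record and the matrix `T` stay local. Because each participant draws its own matrix, the projected records from different participants do not share a common linear mixing.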
For human activity recognition, a similar two-stage scheme called RG+RP is proposed. The difference lies in the second stage, where participants project their perturbed data to a lower dimension in an (almost) distance-preserving manner, using a random projection (RP) matrix. The random projection can both resist ICA attacks and maintain model accuracy. These proposed two-stage randomisation schemes are assessed in terms of their recovery resistance to MAP estimation attacks. Preliminary theoretical analysis as well as experimental results on synthetic and real-world datasets indicate that both RG+RT and RG+RP exhibit better recovery resistance to MAP estimation attacks than most state-of-the-art techniques, while still guaranteeing high utility. To mitigate the inherent limitations of the centralized framework and to investigate the applicability of the decentralized framework, we study decentralized collaboration in a biomedical domain. In particular, we develop an efficient Decentralized Privacy-Preserving Centroid Classifier (DPPCC) that considers three practical scenarios, in which distributed differential privacy (DDP) is combined with a distributed exponential ElGamal cryptosystem to preserve privacy and maintain utility. We realize DDP using a discrete Gaussian mechanism, which avoids the restriction on ε imposed by the traditional Gaussian mechanism, and only the encrypted noisy model parameters or test results are shared among all parties. This ensures that each party learns nothing but the noisy sum of local statistics. Second, we examine privacy-preserving data aggregation in a smart grid application. To this end, we propose a multi-level aggregation framework based on a fog architecture, which