With the rapid development of deep learning techniques, the popularity of voice services implemented on various Internet of Things (IoT) devices is ever increasing. In this paper, we examine user-level membership inference in the problem space of voice services by designing an audio auditor that verifies whether a specific user unwillingly contributed audio used to train an automatic speech recognition (ASR) model, under strict black-box access. Using a user-level representation of the input audio and its corresponding transcribed text, our trained auditor performs user-level audits effectively. We also observe that an auditor trained on specific data generalizes well across ASR model architectures. We validate the auditor on ASR models trained with LSTM, RNN, and GRU architectures on two state-of-the-art pipelines, the hybrid ASR system and the end-to-end ASR system. Finally, we conduct a real-world trial of our auditor on iPhone Siri, achieving an overall accuracy exceeding 80%. We hope the methodology and findings developed in this paper can inform privacy advocates seeking to overhaul IoT privacy.
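For intuition, a user-level auditor of this kind can be realized as a binary classifier over per-user statistics of the black-box ASR model's outputs. The sketch below is a minimal illustration under assumed features (a crude per-query error proxy and transcription-length statistics); the feature set and the `token_error_rate` helper are illustrative stand-ins, not the paper's exact representation.

```python
# A minimal sketch of a user-level audit classifier, assuming black-box
# access to an ASR model. Features and labels come from shadow models;
# the feature set here is a plausible stand-in, not the paper's.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def token_error_rate(truth, output):
    """Crude per-query error proxy (a stand-in for word error rate):
    fraction of ground-truth tokens missing from the ASR output."""
    t, o = truth.split(), set(output.split())
    return sum(w not in o for w in t) / max(len(t), 1)

def user_features(queries):
    """Aggregate (true_text, asr_output) pairs into one user-level vector."""
    errs = [token_error_rate(t, o) for t, o in queries]
    lens = [len(o.split()) for _, o in queries]
    return np.array([np.mean(errs), np.std(errs), np.mean(lens), np.std(lens)])

# X: one feature vector per user, built from shadow-model queries;
# y[i] = 1 if user i's audio was in the shadow ASR training set, else 0.
def train_auditor(X, y):
    auditor = RandomForestClassifier(n_estimators=100, random_state=0)
    auditor.fit(X, y)
    return auditor  # auditor.predict([user_features(qs)]) answers membership
```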
In a Functional Encryption (FE) scheme, a trusted authority enables designated parties to compute specific functions over encrypted data. As such, FE promises to break the tension between industrial interest in the potential of data mining and user concerns around the use of private data. FE allows the authority to decide who can compute and what can be computed, but it does not allow the authority to control which ciphertexts can be mined. This issue was recently addressed by Naveed et al., who introduced Controlled Functional Encryption (C-FE), a cryptographic framework that extends FE and allows the authority to exert fine-grained control over the ciphertexts being mined. In this work, we extend C-FE in several directions. First, we distribute the role of (and the trust in) the authority across several parties by defining multi-authority C-FE (mCFE). Next, we provide an efficient instantiation that enables computation of quadratic functions on inputs provided by multiple data owners, whereas previous work only provides an instantiation for linear functions over data supplied by a single data owner and resorts to garbled circuits for more complex functions. Our scheme leverages CCA2-secure encryption and linearly-homomorphic encryption. We also implement a prototype and use it to showcase the potential of our instantiation.
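To illustrate the linearly-homomorphic building block, the sketch below uses the python-paillier (`phe`) library to let an untrusted evaluator compute a linear function directly over ciphertexts. This is a toy illustration of the homomorphism only, not the authors' mCFE construction; it omits the CCA2 layer, the multi-authority setup, and the quadratic-function machinery.

```python
# A toy illustration of the linear homomorphism that linearly-homomorphic
# encryption provides, using the `phe` (python-paillier) library.
from phe import paillier

pub, priv = paillier.generate_paillier_keypair(n_length=2048)

# Data owners encrypt their inputs under the public key.
x = [pub.encrypt(v) for v in (3, 5, 7)]

# An evaluator computes the linear function w.x + b on ciphertexts alone.
w, b = [2, 4, 6], 10
c = sum(wi * xi for wi, xi in zip(w, x)) + b

assert priv.decrypt(c) == 2*3 + 4*5 + 6*7 + 10  # = 78
```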
The calibration of noise for a privacy-preserving mechanism depends on the sensitivity of the query and the prescribed privacy level. A data steward must make the non-trivial choice of a privacy level that balances the requirements of users and the monetary constraints of the business entity.
First, we analyse the roles of the two sources of randomness involved in the design of a privacy-preserving mechanism: the explicit randomness induced by the noise distribution and the implicit randomness induced by the data-generation distribution. This finer analysis enables us to provide stronger privacy guarantees with quantifiable risks. We therefore propose privacy at risk, a probabilistic calibration of privacy-preserving mechanisms, and provide a composition theorem that leverages it. We instantiate the probabilistic calibration for the Laplace mechanism by providing analytical results.
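For concreteness, the classical calibration that privacy at risk refines is the Laplace mechanism, which scales the noise to the query's sensitivity divided by the privacy level. The sketch below shows only this baseline calibration; the probabilistic privacy-at-risk analysis itself is analytical and not reproduced here.

```python
# A minimal sketch of the classical Laplace mechanism that privacy at
# risk re-calibrates: noise scale = sensitivity / epsilon.
import numpy as np

def laplace_mechanism(query_answer, sensitivity, epsilon, rng=None):
    """Release query_answer with epsilon-differential privacy."""
    rng = rng or np.random.default_rng()
    scale = sensitivity / epsilon  # the calibration step described above
    return query_answer + rng.laplace(loc=0.0, scale=scale)

# Example: a counting query (sensitivity 1) released at epsilon = 0.5.
noisy_count = laplace_mechanism(42, sensitivity=1.0, epsilon=0.5)
```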
Second, we propose a cost model that bridges the gap between the privacy level and the compensation budget estimated by a GDPR-compliant business entity. The convexity of the proposed cost model yields a unique privacy level that minimises the compensation budget. We demonstrate its effectiveness in a realistic scenario, showing that privacy at risk for the Laplace mechanism avoids overestimating the compensation budget. We quantitatively show that composition using the cost-optimal privacy at risk provides a stronger privacy guarantee than the classical advanced composition. Although the illustration is specific to the chosen cost model, it naturally extends to any convex cost model. We also provide realistic illustrations of how a data steward uses privacy at risk to balance the trade-off between utility and privacy.
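As an illustration of the steward's optimisation step, the sketch below minimises a convex total cost over the privacy level. The cost function is a hypothetical stand-in (a compensation term that grows with epsilon plus a utility-loss term that shrinks with it), not the paper's GDPR cost model; convexity guarantees the minimiser is unique, as the text notes.

```python
# A sketch of fine-tuning the privacy level by minimising a convex cost.
# The cost function is a hypothetical stand-in, not the paper's model.
from scipy.optimize import minimize_scalar

def total_cost(eps, comp_rate=100.0, utility_rate=25.0):
    # Compensation grows with eps; utility loss shrinks with it.
    return comp_rate * eps**2 + utility_rate / eps  # convex for eps > 0

res = minimize_scalar(total_cost, bounds=(1e-3, 10.0), method="bounded")
print(f"cost-optimal epsilon = {res.x:.3f}")
```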
We propose Falcon, an end-to-end 3-party protocol for efficient private training and inference of large machine learning models. Falcon presents four main advantages: (i) it is highly expressive, with support for high-capacity networks such as VGG16; (ii) it supports batch normalization, which is important for training complex networks such as AlexNet; (iii) it guarantees security with abort against malicious adversaries, assuming an honest majority; and (iv) it offers new theoretical insights for protocol design that make it highly efficient and allow it to outperform existing secure deep learning solutions. Compared to prior art for private inference, we are about 8× faster than SecureNN (PETS'19) on average and comparable to ABY3 (CCS'18), while being about 16–200× more communication-efficient than either. For private training, we are about 6× faster than SecureNN, 4.4× faster than ABY3, and about 2–60× more communication-efficient. Our experiments in the WAN setting show that over large networks and datasets, compute operations, rather than communication, dominate the overall latency of MPC.
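For background, 3-party protocols in this family typically run on 2-out-of-3 replicated secret sharing over a ring, under which addition is a free local operation. The sketch below illustrates that substrate only; it is not Falcon's protocol and omits multiplication, truncation, and the malicious-security checks.

```python
# A minimal sketch of 2-out-of-3 replicated secret sharing over the
# ring Z_{2^32}: sharing, reconstruction, and local addition only.
import secrets

MOD = 2**32

def share(x):
    """Split x into three additive shares; party i holds the pair
    (s[i], s[(i+1) % 3]), so any two parties can reconstruct."""
    s = [secrets.randbelow(MOD) for _ in range(2)]
    s.append((x - s[0] - s[1]) % MOD)
    return [(s[i], s[(i + 1) % 3]) for i in range(3)]

def reconstruct(shares):
    """Parties 0 and 1 together hold all three additive shares."""
    s0, s1 = shares[0]
    _, s2 = shares[1]
    return (s0 + s1 + s2) % MOD

def add(a, b):
    """Addition is local: each party adds its pairs component-wise."""
    return [((a0 + b0) % MOD, (a1 + b1) % MOD)
            for (a0, a1), (b0, b1) in zip(a, b)]

assert reconstruct(add(share(7), share(35))) == 42
```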
Image hosting platforms are a popular way to store and share images with family members and friends. However, such platforms typically have full access to users' images, raising privacy concerns. These concerns are further exacerbated by the advent of Convolutional Neural Networks (CNNs) that can be trained on available images to automatically detect and recognize faces with high accuracy.
Recently, adversarial perturbations have been proposed as a potential defense against automated recognition and classification of images by CNNs. In this paper, we explore the practicality of adversarial perturbation-based approaches as a privacy defense against automated face recognition. Specifically, we first identify practical requirements for such approaches and then propose two practical approaches: (i) learned universal ensemble perturbations (UEP), and (ii) k-randomized transparent image overlays (k-RTIO), which are semantic adversarial perturbations. We demonstrate how users can generate effective transferable perturbations under realistic assumptions with less effort than existing approaches require.
We evaluate the proposed methods against state-of-the-art online and offline face recognition models, Clarifai.com and DeepFace, respectively. Our findings show that UEP and k-RTIO achieve more than 85% and 90% success, respectively, against face recognition models. Additionally, we explore potential countermeasures that classifiers can use to thwart the proposed defenses; in particular, we demonstrate one effective countermeasure against UEP.
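To make the k-RTIO idea concrete, the sketch below alpha-blends k overlays, chosen pseudorandomly from a pool by a user-held key, onto an input image. The overlay pool, blending weight, and key derivation are illustrative assumptions rather than the paper's exact construction.

```python
# A minimal sketch of a k-RTIO-style semantic perturbation: alpha-blend
# k key-selected overlays onto the image. Parameters are illustrative.
import numpy as np

def k_rtio(image, overlay_pool, key, k=3, alpha=0.15):
    """image and each overlay: float arrays in [0, 1] of the same shape."""
    rng = np.random.default_rng(key)  # key-derived, reproducible selection
    idx = rng.choice(len(overlay_pool), size=k, replace=False)
    out = image.astype(np.float64)
    for i in idx:
        out = (1 - alpha) * out + alpha * overlay_pool[i]
    return np.clip(out, 0.0, 1.0)

# Usage: perturbed = k_rtio(img, overlays, key=1234)
# The same key reproduces the same overlay choice on every image.
```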
Users trust IoT apps to control and automate their smart devices. These apps necessarily have access to sensitive data to implement their functionality. However, users lack visibility into how their sensitive data is used and often blindly trust the app developers. In this paper, we present IoTWatcH, a dynamic analysis tool that uncovers the privacy risks of IoT apps in real time. We designed and built IoTWatcH based on a comprehensive IoT privacy survey that addresses the privacy needs of users. IoTWatcH operates in four phases: (a) it provides users with an interface to specify their privacy preferences at app install time, (b) it adds extra logic to an app's source code to collect both IoT data and their recipients at runtime, (c) it uses Natural Language Processing (NLP) techniques to construct a model that classifies IoT app data into intuitive privacy labels, and (d) it informs users when their preferences do not match the privacy labels, exposing sensitive data leaks. We implemented and evaluated IoTWatcH on real IoT applications: we analyzed 540 IoT apps to train the NLP model and evaluate its effectiveness. IoTWatcH classifies IoT app data into privacy labels with an average accuracy of 94.25%, adding only 105 ms of latency to an app's execution.
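As a rough illustration of the NLP step, the sketch below maps strings an app transmits to coarse privacy labels with a TF-IDF and logistic-regression pipeline. The model choice, labels, and training strings are illustrative stand-ins, not IoTWatcH's classifier or its 540-app corpus.

```python
# A sketch of classifying strings an IoT app sends out into intuitive
# privacy labels. Labels and training data are illustrative only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts  = ["front door unlocked", "temperature is 72 F",
                "user arrived home", "camera motion detected"]
train_labels = ["device-state", "environment", "location", "device-state"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(train_texts, train_labels)

print(clf.predict(["back door was left unlocked"]))  # e.g. ['device-state']
```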