Software & Datasets
I maintain a collection of differential privacy (DP) and local differential privacy (LDP) experiments at ldp-protocols-mobility-cdrs, covering some of the work I carried out during my Ph.D. thesis.
Open-Source Libraries
- multi-freq-ldpy [GitHub] [PyPI] [ESORICS 2022] [arXiv]
Multi-Freq-LDPy is a Python library for performing multiple frequency estimation tasks (multidimensional, longitudinal, and both) under local differential privacy (LDP) guarantees. Multi-Freq-LDPy features various state-of-the-art LDP algorithms such as k-RR, SS, OUE, OLH, RS+FD, Google’s RAPPOR, and Microsoft’s dBitFlipPM. The package is distributed under the MIT license.
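As a rough illustration of what a frequency estimation protocol under LDP does, here is a minimal standalone sketch of k-RR (generalized randomized response) in plain Python. It does not use the Multi-Freq-LDPy API: the function names `krr_client` and `krr_aggregate`, the toy distribution, and the choice of ε = 1 are assumptions made for this example only. It also assumes a domain of at least two values encoded as integers 0, ..., k-1.

```python
import math
import random


def krr_client(value, k, epsilon, rng):
    """Randomize one value in {0, ..., k-1} with k-RR (hypothetical helper)."""
    # Probability of reporting the true value.
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p:
        return value
    # Otherwise report one of the k-1 other values uniformly at random.
    other = rng.randrange(k - 1)
    return other if other < value else other + 1


def krr_aggregate(reports, k, epsilon):
    """Unbiased frequency estimates from k-RR reports (hypothetical helper)."""
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    q = (1 - p) / (k - 1)  # probability of reporting one specific false value
    counts = [0] * k
    for r in reports:
        counts[r] += 1
    # Invert the expected count n * (f * p + (1 - f) * q) for each value.
    return [(c / n - q) / (p - q) for c in counts]


# Toy run: 50,000 simulated users over a domain of size 3.
rng = random.Random(0)
true_values = rng.choices(range(3), weights=[0.5, 0.3, 0.2], k=50_000)
reports = [krr_client(v, k=3, epsilon=1.0, rng=rng) for v in true_values]
estimates = krr_aggregate(reports, k=3, epsilon=1.0)
print([round(f, 3) for f in estimates])
```

Each user only ever sends the randomized report, and the aggregator corrects for the known noise rates p and q. The estimates are noisy for small populations or small ε, and individual entries can fall slightly outside [0, 1].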
Datasets
MS-FIMU (Mobility Scenario FIMU) [GitHub] [IWCMC 2020]
An open, longitudinal (7 days), multidimensional (7 attributes) synthetic dataset of virtual humans, generated by applying an optimization approach to a real-life anonymized database built from call detail records (CDRs).
This dataset can be used for ML classification tasks and for evaluating (locally) differentially private mechanisms on multidimensional and/or longitudinal data.
Multivariate-Mobility-Paris [GitHub] [NCAA Journal] [arXiv]
The original dataset, which contains anonymized and aggregated human mobility data, was provided by Orange telecom in France. The Multivariate-Mobility-Paris dataset comprises information from 2020-08-24 to 2020-11-04 (72 days during the COVID-19 pandemic), with a time granularity of 30 minutes and a spatial granularity of 6 coarse regions in Paris, France. In other words, it is a multivariate time series dataset. It can be used for several time-series tasks, such as univariate or multivariate forecasting and classification, with classic, machine learning, and privacy-preserving machine learning techniques.
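To make the multivariate forecasting use case concrete, here is a small sketch of how a series with 30-minute steps and 6 regions could be cut into (history, target) windows. The in-memory layout (one row per time step, one column per region), the helper name `make_windows`, and the random placeholder values are assumptions for illustration; the actual file format of the dataset is not described here.

```python
import random

SLOTS_PER_DAY = 24 * 60 // 30  # 48 half-hour steps per day
N_REGIONS = 6                  # coarse regions, as stated above


def make_windows(series, history, horizon):
    """Split a multivariate series into (past, future) pairs (hypothetical helper).

    series: list of rows, one per time step, each row holding one value per region.
    """
    samples = []
    for start in range(len(series) - history - horizon + 1):
        past = series[start:start + history]
        future = series[start + history:start + history + horizon]
        samples.append((past, future))
    return samples


# Placeholder data: 3 days of random counts standing in for the real series.
rng = random.Random(0)
series = [[rng.randint(0, 1000) for _ in range(N_REGIONS)]
          for _ in range(3 * SLOTS_PER_DAY)]

# Use one full day of history to predict the next 30-minute step.
windows = make_windows(series, history=SLOTS_PER_DAY, horizon=1)
print(len(windows), len(windows[0][0]), len(windows[0][0][0]))
```

Keeping only one column of each row turns the same windows into a univariate task for a single region.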
Selected Open-Source Code from Papers
LOLOHA [GitHub] [arXiv]
Python implementation of longitudinal LDP protocols (RAPPOR, dBitFlipPM) and our local hashing-based protocols for longitudinal frequency estimation (i.e., throughout time).
Risks LDP [GitHub] [arXiv]
Python implementation of re-identification and attribute inference attacks on LDP protocols for multidimensional data.
RS+FD [GitHub] [CIKM 2021] [arXiv]
Python implementation of our Random Sampling Plus Fake Data (RS+FD) algorithms for frequency estimation of multiple attributes under LDP.
Longitudinal LDP [GitHub] [DCAN Journal] [arXiv]
Python implementation of our longitudinal LDP protocols (L-GRR and L-UE) for frequency estimation at a single point in time (i.e., with ϵ1-LDP).
Geo-Indistinguishability [GitHub] [MCA Journal]
Python implementation of the planar Laplace mechanism, which satisfies Geo-Indistinguishability.
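For readers unfamiliar with the planar Laplace mechanism, the sketch below shows one way to sample it in plain Python. It is not taken from the repository: it works on points in a flat Euclidean plane (for example, coordinates in meters), and it draws the noise radius from a Gamma(2, 1/ϵ) distribution, which has the same density ϵ² r e^(-ϵr) as the radial part of the planar Laplace noise. Implementations often invert the radial CDF with the Lambert W function instead. The function name `planar_laplace` and the value ϵ = 0.01 per meter are assumptions for this example.

```python
import math
import random


def planar_laplace(x, y, epsilon, rng):
    """Perturb a planar point (x, y) with planar Laplace noise (hypothetical helper).

    epsilon is expressed per unit of distance (e.g., per meter).
    """
    # Direction: uniform angle around the true location.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    # Distance: radial density eps^2 * r * exp(-eps * r), i.e. Gamma(shape=2, scale=1/eps).
    r = rng.gammavariate(2.0, 1.0 / epsilon)
    return x + r * math.cos(theta), y + r * math.sin(theta)


rng = random.Random(0)
noisy_x, noisy_y = planar_laplace(1000.0, 2000.0, epsilon=0.01, rng=rng)
print(noisy_x, noisy_y)
```

With this parameterization the expected displacement is 2/ϵ, so ϵ = 0.01 per meter moves a point by about 200 meters on average. Applying the mechanism to latitude/longitude data would additionally require projecting to and from planar coordinates, which this sketch leaves out.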