Google today announced that it is open sourcing its differential privacy library, an internal tool the company uses to securely draw insights from data sets that contain the private and sensitive personal information of its users.
Differential privacy is a statistical approach to data analysis that allows someone relying on software-aided analysis to draw insights from massive data sets while protecting user privacy. It does so by mixing real user data with artificial “white noise,” as explained by Wired’s Andy Greenberg. That way, the results of any analysis cannot be used to unmask individuals or allow a malicious third party to trace any one data point back to an identifiable source.
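To make the “noise” idea concrete, here is a minimal sketch of the classic Laplace mechanism, one of the standard building blocks of differential privacy. This is an illustration of the general technique, not code from Google’s library; the function names and the choice of a simple count query are assumptions for the example.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    # Draw a sample from Laplace(0, scale) via inverse transform sampling.
    u = random.random() - 0.5
    sign = 1 if u >= 0 else -1
    return -scale * sign * math.log(1 - 2 * abs(u))

def private_count(records, epsilon: float) -> float:
    # A count query has sensitivity 1: adding or removing one person's
    # record changes the true answer by at most 1. Adding Laplace noise
    # with scale 1/epsilon therefore yields epsilon-differential privacy.
    return len(records) + laplace_noise(1.0 / epsilon)
```

With a small epsilon (strong privacy), the noise is large and any single record is well hidden; with a large epsilon, the answer is more accurate but leaks more. Tuning that trade-off is exactly what libraries like Google’s are meant to handle for the analyst.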
The technique is the bedrock of Apple’s approach to privacy-minded machine learning, for instance. It lets Apple collect data from iPhone users, statistically anonymize that data, and still draw useful insights that can help it improve, say, its Siri algorithms over time.
“Google wants differential privacy to be accessible to data scientists in any field”