Privacy-Preserving Data Mining in the Fully Distributed Model
Professor Rebecca Wright
Department of Computer Science
Stevens Institute of Technology
Monday, October 31, 2005
3:00pm
Kidde 104
Abstract:
Privacy-preserving data mining seeks to balance the ability to perform
useful computations on data held by many parties with the desire to
protect sensitive information. In the fully distributed model, each of
many users or devices holds information that we think of as one record
in a virtual database. In this talk, I will describe several
privacy-preserving methods for computing on this virtual database.
Specifically, I will present our results in two areas: (1)
privacy-preserving frequency mining and (2) privacy-preserving
k-anonymization. In privacy-preserving frequency mining, we present a
way for a data miner to learn the frequencies of combinations of data
values without learning the individual data values, and we discuss how
these frequencies can enable various classification tasks. In
privacy-preserving k-anonymization, we consider the previously
proposed method of protecting identities in data through
k-anonymization, which modifies data so that each individual is
"hidden" among at least k others. Previous algorithms for
k-anonymization of data have assumed centralized access to the entire
data set. In our work, we show how a data miner or data publisher can
learn a k-anonymized version of a fully distributed database without
learning the entire data set.
This research is joint work with Zhiqiang Yang and Sheng Zhong.
Refreshments will be available.