Abstract
Given the threat of re-identification in our growing digital society, guaranteeing privacy while providing worthwhile data for knowledge discovery has become a difficult problem. k-anonymity is a major technique used to ensure privacy by generalizing and suppressing attributes, and it has been the focus of intense research in the last few years. However, data modification techniques such as generalization may produce anonymous data that is unusable for medical studies because some attributes become too coarse-grained. In this paper, we propose a priority driven k-anonymisation that allows the degree of acceptable distortion to be specified for each attribute separately. We also define appropriate metrics to measure distance and information loss, which are suitable for both numerical and categorical attributes. Further, we formulate priority driven k-anonymisation as a k-nearest neighbor (KNN) clustering problem by adding the constraint that each cluster contains at least k tuples. We develop an efficient algorithm for priority driven k-anonymisation. Experimental results show that the proposed technique causes significantly less distortion. © 2008, Australian Computer Society, Inc.
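The abstract does not spell out the metrics or the algorithm, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes a per-attribute weight vector as a stand-in for priorities, a range-normalised difference for numerical attributes, a 0/1 mismatch for categorical ones, and a simple greedy nearest-neighbour grouping that keeps every cluster at size k or more. All function names (`weighted_distance`, `greedy_k_clusters`, `generalize`) are hypothetical.

```python
from typing import List, Sequence, Tuple, Union

Value = Union[int, float, str]
Record = Sequence[Value]


def weighted_distance(a: Record, b: Record,
                      weights: Sequence[float],
                      ranges: Sequence[float]) -> float:
    """Priority-weighted distance over mixed attributes (assumed metric).

    Numerical attributes contribute |a - b| / range, categorical ones
    contribute 0 when equal and 1 otherwise; each term is scaled by its weight.
    """
    total = 0.0
    for x, y, w, r in zip(a, b, weights, ranges):
        if isinstance(x, str):
            total += w * (0.0 if x == y else 1.0)
        else:
            total += w * (abs(x - y) / r if r else 0.0)
    return total


def greedy_k_clusters(records: List[Record], k: int,
                      weights: Sequence[float]) -> List[List[int]]:
    """Greedily group record indices so that every cluster has >= k members."""
    if k < 1 or len(records) < k:
        raise ValueError("need at least k records and k >= 1")

    # Value range per attribute; categorical attributes get a dummy range of 1.
    ranges = []
    for j in range(len(records[0])):
        column = [rec[j] for rec in records]
        if isinstance(column[0], str):
            ranges.append(1.0)
        else:
            ranges.append(float(max(column) - min(column)))

    def dist(i: int, j: int) -> float:
        return weighted_distance(records[i], records[j], weights, ranges)

    remaining = list(range(len(records)))
    clusters: List[List[int]] = []
    while len(remaining) >= k:
        seed = remaining.pop(0)
        # Take the k - 1 nearest unassigned neighbours of the seed.
        remaining.sort(key=lambda i: dist(seed, i))
        clusters.append([seed] + remaining[:k - 1])
        remaining = remaining[k - 1:]

    # Fewer than k records are left: attach each to the cluster whose seed is nearest.
    for i in remaining:
        nearest = min(clusters, key=lambda c: dist(c[0], i))
        nearest.append(i)
    return clusters


def generalize(records: List[Record], cluster: List[int]) -> Tuple:
    """Summarise a cluster: (min, max) for numbers, sorted value set for strings."""
    summary = []
    for j in range(len(records[0])):
        column = [records[i][j] for i in cluster]
        if isinstance(column[0], str):
            summary.append(tuple(sorted(set(column))))
        else:
            summary.append((min(column), max(column)))
    return tuple(summary)
```

In this sketch, a larger weight makes differences in that attribute more costly, so clusters tend to stay tight on high-priority attributes and the generalised values for those attributes stay finer-grained. The greedy seeding is a simplification chosen for brevity and makes no claim to minimise information loss.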
| Original language | English |
|---|---|
| Pages | 73-78 |
| Number of pages | 6 |
| Publication status | Published - 1 Dec 2008 |
| Externally published | Yes |
| Event | Proceedings of the 7th Australasian Data Mining Conference - Glenelg, Australia. Duration: 27 Nov 2008 → 28 Nov 2008 |
Conference
| Conference | Proceedings of the 7th Australasian Data Mining Conference |
|---|---|
| Abbreviated title | AusDM 2008 |
| Country/Territory | Australia |
| City | Glenelg |
| Period | 27/11/08 → 28/11/08 |
Keywords
- K-anonymity
- Privacy protection