Here you will find scientific publications from the Fraunhofer Institutes.

Towards a toolkit for utility and privacy-preserving transformation of semi-structured data using data pseudonymization

Kasem-Madani, Saffija; Meier, Michael; Wehner, Martin


Garcia-Alfaro, Joaquin:
Data Privacy Management, Cryptocurrencies and Blockchain Technology : ESORICS 2017 International Workshops, DPM 2017 and CBT 2017, Oslo, Norway, September 14-15, 2017, Proceedings
Cham: Springer International Publishing, 2017 (Lecture Notes in Computer Science 10436)
ISBN: 978-3-319-67815-3 (Print)
ISBN: 978-3-319-67816-0
ISBN: 3-319-67815-9
European Symposium on Research in Computer Security (ESORICS) <22, 2017, Oslo>
International Workshop on Data Privacy Management (DPM) <12, 2017, Oslo>
International Workshop on Cryptocurrencies and Blockchain Technology (CBT) <1, 2017, Oslo>
Conference Paper
Fraunhofer FKIE

We present a flexibly configurable toolkit for the automatic pseudonymization of datasets that preserves a certain degree of utility. The toolkit could be used to pseudonymize data in order to preserve the privacy of data owners during data processing and to meet the requirements of the new European General Data Protection Regulation (GDPR). We define some possible utility requirements and the corresponding utility options that a pseudonym can meet. Based on these, we define a policy language that can be used to write machine-readable utility policies. The utility policies are used to configure the toolkit to produce a pseudonymized dataset that offers the specified utility options. Here, we follow a confidentiality-by-default principle: only the data mentioned in the policy is transformed and included in the pseudonymized dataset, and all remaining data is kept confidential. This stands in contrast to common pseudonymization techniques, which replace only the personal or sensitive data of a dataset with pseudonyms while keeping all other information in plaintext. If applied appropriately, our approach allows for providing pseudonymized datasets that include less information that can be misused to infer personal information about the individuals to whom the data belong.
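
The confidentiality-by-default principle described in the abstract can be illustrated with a minimal Python sketch. It is not the toolkit or the policy language from the paper: the `POLICY` dictionary, the option names `linkable` and `generalized`, the `pseudonymize` function, the sample records, and the demo key are all hypothetical, chosen only to show that fields absent from the policy never reach the output.

```python
import hashlib
import hmac

# Hypothetical policy: field name -> utility option.
# This is an illustration only, not the paper's policy language.
POLICY = {
    "user": "linkable",     # equal plaintexts map to equal pseudonyms
    "age": "generalized",   # value is coarsened to a ten-year band
}


def pseudonymize(record, policy, key):
    """Return a new record that holds only the fields named in the policy."""
    out = {}
    for field, option in policy.items():
        if field not in record:
            continue
        value = record[field]
        if option == "linkable":
            # Keyed hash: deterministic under one key, so equality is preserved.
            digest = hmac.new(key, str(value).encode("utf-8"), hashlib.sha256)
            out[field] = digest.hexdigest()[:16]
        elif option == "generalized":
            low = (int(value) // 10) * 10
            out[field] = f"{low}-{low + 9}"
        else:
            raise ValueError(f"unknown utility option: {option}")
    return out


# Made-up sample data; "email" is not in the policy, so it is dropped.
records = [
    {"user": "alice", "age": 34, "email": "alice@example.org"},
    {"user": "alice", "age": 37, "email": "a@example.org"},
    {"user": "bob", "age": 52, "email": "bob@example.org"},
]
pseudonymized = [pseudonymize(r, POLICY, b"demo-key") for r in records]
```

In this sketch the output is built up from the policy rather than by deleting sensitive fields from the input, so a field that nobody thought to classify stays confidential by construction.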