Master Seminar in Security and Privacy: Anonymization Techniques (PSI-MSem)
- Prof. Dr. Dominik Herrmann, Henning Pridöhl, M. Sc.
Time and place: Tue 16:00 - 18:00, WE5/05.005; single session on 30.1.2018, 18:00 - 19:00, WE5/05.005
- Prerequisites / Organizational matters
- OBJECTIVE: This seminar will provide in-depth insight into an advanced topic in an important field of information security and privacy. On the methodological level you will learn how to find and discuss scientific literature. You will also considerably advance your abilities to write and present scientifically, and you will practice how to appreciate and review others' work. You will also advance your analytical skills, in particular when it comes to structuring your thoughts and organizing a scientific text in a concise way. These goals are achieved in this seminar by asking you to perform manageable chunks of work throughout the semester, for which you will receive timely feedback from both your peers and your instructors.
ORGANIZATION: Please sign up for the VC course (https://vc.uni-bamberg.de/moodle/course/view.php?id=26570) if you want to participate in this seminar, so that I know how many participants to expect. In principle, Bachelor students who already have some knowledge in information security can participate as well. However, if the capacity is not sufficient, Master students will be given precedence. The seminar will be taught in English unless all participants are fluent in German. In the first week (Oct 17) we will discuss the topics and logistics (participation is mandatory). It is recommended to read up on the available topics in advance.
The seminar consists of three phases:
- PHASE 1 (first four weeks): In each of the first three weeks, every student (or group of 2-3 students) writes two literature surveys, one for each of two topics from the list of available topics (see below). A literature survey consists of one paragraph of text that concisely summarizes at least three relevant publications on a topic. Thus, after three weeks everyone in the seminar will have a broad overview of six topics. In addition, each survey paragraph will be peer-reviewed anonymously by another student/group to provide feedback (i.e., each group writes reviews for two survey paragraphs per week).
- PHASE 2: On Nov 14 we meet again and assign the topics to students/groups. Each student/group writes a seminar paper. A draft of the paper is anonymously peer-reviewed by two other students/groups (i.e. each student/group writes two reviews). The draft is due before the Christmas break. Reviews of the draft are due after the Christmas break.
- PHASE 3 (last 4 weeks): Each student/group receives feedback from the peer reviews and can improve their paper based on that. Each student/group presents their topic and hands in the final paper by the end of the semester.
TIMELINE OF PHASE 1:
- 17.10. start survey s1 s2
- 24.10. start survey s3 s4, start review r1 r2 / submit survey s1 s2
- 31.10. start survey s5 s6, start review r3 r4 / submit survey s3 s4, submit review r1 r2
- 07.11. start review r5 r6 / submit survey s5 s6, submit review r3 r4
- 14.11. submit review r5 r6, assignment of topics (participation mandatory)
GRADED DELIVERABLES: literature surveys for 6 topics, peer reviews of the literature surveys of other groups, a written seminar paper (10-15 pages in LaTeX/LNCS format in English), and a presentation (20 minutes). The paper is due at the end of the semester (January/February 2018). Presentations will be scheduled in the last weeks of the semester.
RECOMMENDED SKILLS FOR THIS SEMINAR: sound command of English language (all literature is in English), basic knowledge of information security fundamentals (or willingness to study those on your own as part of the seminar).
- INTRODUCTION: One of the most daunting tasks in information security is protecting sensitive data in enterprise applications, which are often complex and distributed. What methods are available to protect sensitive data? Some of the methods available are cryptography, anonymization, and tokenization. In this seminar we will focus on anonymization. In order to give you an overview, the three approaches will be briefly discussed in the following.
Cryptographic techniques are probably among the oldest known techniques for data protection. When done right, they are probably among the safest techniques to protect data in motion and at rest. Encrypted data are highly protected, but they are not readable, so how can we use such data? Another issue associated with cryptography is key management. Any compromise of a key means complete loss of privacy. As a result, cryptographic techniques are not used widely for use cases in which an enterprise wants or has to work with data. Nevertheless, there is a lot of ongoing research on techniques like secure multiparty computation (MPC) and zero-knowledge proofs (ZKP). However, such techniques are out of scope for this seminar.
Tokenization is a technique that replaces the original sensitive data with non-sensitive placeholders referred to as tokens. The fundamental difference between tokenization and the other techniques is that in tokenization, the original data are completely replaced by a surrogate that has no connection to the original data. Tokens have the same format as the original data. As tokens are not derived from the original data, they exhibit very powerful data protection features. Another interesting property of tokens is that, although a token is usable within its native application environment, it is completely useless elsewhere. Therefore, tokenization is ideal to protect sensitive identifying information. However, tokenization makes it impossible to process and work with the protected pieces of information.
Anonymization is a set of techniques used to modify the original data in such a manner that the result does not resemble the original values but maintains their semantics and syntax. Regulatory compliance and ethical issues drive the need for anonymization. The intent is that anonymized data can be shared freely with other parties, who can perform their own analysis on the data. Anonymization is an optimization problem: when the original data are modified, they lose some of their utility, but modification of the data is required to protect them. An anonymization design is therefore a balancing act between data privacy and utility. Privacy goals are set by the data owners, and utility goals are set by data users. Now, is it really possible to optimally achieve this balance between privacy and utility? We will explore this and other questions throughout the seminar.
Note: The previous four paragraphs are the slightly edited introduction taken from Nataraj Venkataramanan & Ashwin Shriram: Data Privacy: Principles and Practice, Chapman & Hall/CRC, 2016.
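EXAMPLE: The balancing act between privacy and utility described in the introduction can be made concrete with k-anonymity, which appears in several of the seminar topics. The following is only a minimal sketch, not taken from the book or from any tool named on this page. The records, the choice of age and ZIP code as quasi-identifiers, and the function names are all hypothetical. It coarsens the quasi-identifiers (generalization) and reports the size of the smallest group of records that remain indistinguishable from each other.

```python
from collections import Counter

# Hypothetical toy records: (age, zip_code, diagnosis).
# Age and ZIP code are assumed to be the quasi-identifiers;
# the diagnosis is the sensitive attribute.
RECORDS = [
    (34, "96047", "flu"),
    (36, "96049", "asthma"),
    (38, "96047", "flu"),
    (52, "96052", "diabetes"),
    (57, "96050", "asthma"),
    (59, "96052", "flu"),
]


def generalize(record, age_bucket, zip_digits):
    """Coarsen the quasi-identifiers of one record.

    Ages are mapped to ranges of width `age_bucket`; only the first
    `zip_digits` characters of the ZIP code are kept.
    """
    age, zip_code, diagnosis = record
    low = (age // age_bucket) * age_bucket
    age_range = f"{low}-{low + age_bucket - 1}"
    masked_zip = zip_code[:zip_digits] + "*" * (len(zip_code) - zip_digits)
    return (age_range, masked_zip, diagnosis)


def k_of(records):
    """Return the size of the smallest group sharing the same quasi-identifiers."""
    groups = Counter((age, zip_code) for age, zip_code, _ in records)
    return min(groups.values())


def anonymity_level(records, age_bucket, zip_digits):
    """Generalize all records and return the resulting k."""
    return k_of([generalize(r, age_bucket, zip_digits) for r in records])


if __name__ == "__main__":
    for zip_digits in (5, 4):
        k = anonymity_level(RECORDS, age_bucket=10, zip_digits=zip_digits)
        print(f"ZIP digits kept: {zip_digits}, k = {k}")
```

With these toy records, ten-year age ranges combined with the full ZIP code leave some individuals unique (k = 1), whereas keeping only the first four ZIP digits yields k = 3. The price is utility: an analyst can no longer distinguish the individual ZIP code areas.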
AVAILABLE SEMINAR TOPICS
- Barriers to the implementation of k-anonymity and related microdata anonymization techniques in a real-world application
- On the tradeoff between utility and anonymity in privacy-preserving data mining in the medical sphere
- Known attacks on k-anonymity, l-diversity, and t-closeness and countermeasures
- Developing practical microdata anonymization case studies with ARX, Aircloak, iDASH, and Sharemind
- Strategies of heuristic optimization algorithms for k-anonymity
- Limiting the impact of intersection attacks on pseudonymized datasets
- Challenges for k-anonymity regarding incremental database updates
- Additional information
- www: https://www.uni-bamberg.de/informatik/psi/
- Institution: Lehrstuhl für Privatsphäre und Sicherheit in Informationssystemen