BB26.G Privacy labels (A-F)

Title Privacy labels (A-F)
Page Title BB26.G Privacy labels (A-F)
Technology Line Reference Architecture/Implementation
Lead partner UiO
Leader Christian Johansen
Contributors UiO, Smart Innovation Norway
Related to Use Cases SCOTT:WP7, SCOTT:WP11, SCOTT:WP21, SCOTT:WP14, SCOTT:WP12, SCOTT:WP8, SCOTT:WP09
Description In order to allow authorities or standardization and authorization bodies to evaluate a product with respect to privacy aspects before attaching a Privacy Label, we need to provide both a methodology and suggested tools to be used in the assessment. Moreover, we need to study closely how the Privacy Labels should "look and feel" to the end customers. For this we will apply interaction design techniques, including surveys and other user analysis techniques.

The work has to focus on several aspects:

  • Privacy Label content, which will study what kind of information the Privacy Label should contain and how this information is perceived by the end customer. This work is essential for end-user adoption.
  • Privacy evaluation tools, which are essential for a standardization company/authority to be able to evaluate the respective IoT product. Tools investigated in this BB include complex verification tools, but also simple testing tools and techniques.
  • Methodology, which will be designed and prepared for demonstration and application inside the various Use Cases of SCOTT where this BB is applicable. A methodology is something that certification bodies are familiar with. This includes thorough documentation (and, specific to us, we will investigate what kind of documentation is needed for Privacy aspects) as well as testing results (for Privacy such tests are not immediately available, and they will have to be investigated as part of this BB).
Main output Methodology for privacy evaluation and standardisation to be used in assessing products.

The methodology will be tested and developed together with the Use Cases of SCOTT. Further outputs are scales and recommendations for Privacy Labelling ranges for the different sectors in which SCOTT has Use Cases, and recommendations for how to achieve the standard required by each specific label in a specific domain. These recommendations will be tested together with the industry partners in the respective Use Cases to assess their feasibility.

BB category Methodology (for SW/HW development), Profile, Standard, Means for establishing cross-domain interoperability, Process, Other
Baseline We would like to introduce privacy labels for applications and components, similar to the energy labels (A++, A+, A, B, ..., F); see IoTSec:Privacy_Label. Customers in Europe already understand these labels for white goods, so we should use a similar approach to introduce "privacy" labelling. For example, you would like to buy a sports device (Fitbit, Google watch, ...) or application (Endomondo, Strava, ...). A potential difference between the tools might be expressed through the privacy label, e.g., a Polar device having an A privacy label while a Garmin device has a B. Our analysis can then show the relation between application goals and system capabilities (configuration of components) needed to achieve the required privacy level.
Current TRL TRL 1-2 for the ideas of Privacy Labels
Target TRL TRL 6

Overview

  • WPs of interest
    • WP7 can be a core WP for Privacy Labels BB
    • WP21 is also good for applying Privacy Labels
    • WP9 will be future (this was identified by WP9 participants as interesting and listed in their reports; we had initially planned to apply Privacy Labels to this UC. Christian is interacting with WP9.)
    • WP11 mentions Privacy Labels. It is also interesting for applying Privacy Labels because it works with complex systems that manipulate data of various kinds. Fine-grained access control is also applicable, like our 5th step in S-ABAC "Query-based AC", which can also help achieve better privacy.
    • We will not be involved in WP8, WP12, or WP14.
WPs in focus
Core: WP7
Extended: WP21
Future: WP9
Cancelled: WP8, WP11, WP12, WP14

Activities

  • Related activities include those started in BB26.F on Multi-Metrics and measurable aspects of Privacy
  • Privacy evaluation of the TellU Diabetics app demonstrator from WP21 is planned and under way, with a deadline in Spring 2019.
Activities chart
Title | Status | Responsible | Deadlines
Privacy evaluation of the TellU Diabetics app demonstrator from WP21 | Planned; initial MSc discussions | Simen Dagfinrud and Christian Johansen | tentative final in Spring 2019
User presentation of Privacy Label | Planned | Christian Johansen | start in Summer 2018
User Studies | Planned | Christian Johansen and others (Heidi, ++) | start in Autumn 2018
Automation through machine learning | Planned | Christian Johansen and Anders Jakob | start in January 2019

Division of Work and Research Directions

Planned Outcomes

The work on Privacy Labelling aims to provide the following tangible results.

Privacy Labelling for Decision Makers

The purpose is to have a lightweight version useful for persons who make decisions where privacy aspects are involved. Examples of use:

  1. A CEO or project manager, e.g., from Statsbygg, who needs to decide whether to include many sensors in a Smart office building for monitoring the indoor air quality. What are the privacy implications of this? How should she make a decision, and based on what information?
    • This Result should give her the tools needed to help with the decision.
  2. Before doing any expensive and time-consuming PL certification actions, a CEO or investor needs initial indications of the Privacy Aspects of the new App, IoT device, or technology that is planned.
    • This Result should enable economic and feasibility calculations of the privacy implications for the business development.
    • This work will be done in collaboration with Smart Innovation Norway.

These examples clarify the Requirement RQ561.

Privacy Labelling for Technical privacy engineers

This is similar to other certification methods, like the SIL levels or the ANSSI security classifications. The goal is two-fold: (i) to have a methodology, i.e., a decision tree to guide various technological decisions; and (ii) to have guidelines for which technologies can achieve which privacy guarantees. This includes a good, semantically meaningful overview of the various privacy-relevant concepts, like anonymization, personally identifiable data, machine-learning information extraction, aggregation techniques, deletion, etc. This also includes methods and tools for measuring the various privacy-relevant techniques and how well they achieve their purposes.
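As an illustration of point (i), the following is a minimal sketch of how such a decision tree could be represented and traversed. All questions, answers, and guideline texts are invented placeholders, not an existing SCOTT methodology.

```python
from dataclasses import dataclass, field

@dataclass
class DecisionNode:
    """One decision point in a privacy-certification methodology."""
    question: str
    branches: dict = field(default_factory=dict)  # answer -> DecisionNode or guideline str

def walk(node, answers):
    """Follow pre-recorded answers through the tree; return the guideline reached."""
    while isinstance(node, DecisionNode):
        node = node.branches[answers[node.question]]
    return node

# Illustrative fragment only: questions and guidelines are invented examples.
tree = DecisionNode(
    "Does the system collect personally identifiable data?",
    {
        "no": "No further privacy measures required at this decision point.",
        "yes": DecisionNode(
            "Is the data anonymized before leaving the device?",
            {
                "yes": "Document the anonymization technique and its known de-anonymization risks.",
                "no": "Consider on-device anonymization or aggregation (see the PL-Methods surveys).",
            },
        ),
    },
)

answers = {
    "Does the system collect personally identifiable data?": "yes",
    "Is the data anonymized before leaving the device?": "no",
}
print(walk(tree, answers))
```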

NOTE that all the above aspects are not related to security, i.e., confidentiality is assumed to be provided by other means. Otherwise, if the confidentiality of data in transit or at rest is broken, then no privacy guarantees can be met any more, making any Privacy Labelling irrelevant.

Security evaluation should be done independently of the privacy evaluation.

Privacy Labelling for the Users

This is not present in classical standardisation methods, because those, like the goal above, are meant only for technical people and have a very simple outcome, like a number or a yes/no answer.

The main purpose of Privacy Labelling is to present the outcome of the privacy certification to Users. However, privacy is much more difficult to present than classical aspects like the Energy Consumption labels, where the scale is simply the energy consumed in kWh. Moreover, privacy is also highly personal, i.e., what is highly private for one person may not be for another, and thus does not impact her decisions. The goal is to study how to best present the various concepts involved in technically deciding a privacy label, so that a User can understand, first, the privacy implications of some device/service/system that she wants to acquire, and second, how to compare it with other similar products based on the privacy aspects. Examples of use:

  1. An older lady wishes to buy a Smart Home Energy Management system, but she is concerned that all these new gadgets are too intrusive into her rather classical style of living; even more so since several of her family members, or friends at the Bridge Club, tell her about "how the TV listens to what people talk about in the house", or how "medical persons watch over people taking a bath". She wants some simple indication, given by an authority that she can trust (and she mostly trusts when government associations are involved), of how much this new system would expose her, so she can decide what to buy within her limited budget.
  2. A young adult who likes to have various new electronic devices is interested in installing a new app on his phone for tracking his weekend hiking trips. He wants to know how his location data is being used by the app provider, so that he can make an informed decision weighing the various functionalities that the app promises against the privacy exposure that he would have to accept.

The first example clarifies the Requirement RQ558.

Privacy Labelling for certifying experts and certification bodies

These include minimal requirements, alignment with existing regulations like GDPR, and adoption of and relation to existing standards of relevance. Examples of use:

  1. An expert needs to carry out a certification job for a specific new IoT device. She needs both methodological as well as tool support.
  2. A regulatory body, like a GDPR national authority, needs to make a compliance evaluation.

Sub-components

The work on Privacy Labelling is divided into several sub-components, each trying to achieve one of the goals above.

PL4Decisions

This sub-component works towards the following measurable outcomes.

  1. A simplified decision process derived from PL-CERT, with different domain-specific versions
  2. Tools that are easy for decision makers to use, like questionnaires (a hypothetical sketch follows)
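As a toy illustration of such a questionnaire tool, here is a sketch in which each "yes" answer contributes a weight to a risk score. All questions, weights, and thresholds are invented placeholders, not an agreed SCOTT instrument.

```python
# Hypothetical PL4Decisions questionnaire: each "yes" answer adds the
# question's weight to a risk score. Questions, weights, and thresholds
# are invented placeholders.
QUESTIONS = [
    ("Will the system collect data that can identify individuals?", 3),
    ("Will data be processed by a third-party cloud provider?", 2),
    ("Can individual users opt out of data collection?", -2),  # mitigating factor
]

def risk_score(answers):
    """answers: booleans aligned with QUESTIONS (True = 'yes')."""
    return sum(weight for (_, weight), yes in zip(QUESTIONS, answers) if yes)

def indication(score):
    if score <= 0:
        return "Low privacy risk: a light PL evaluation should suffice."
    if score <= 3:
        return "Moderate privacy risk: budget for a partial PL-CERT evaluation."
    return "High privacy risk: full PL-CERT certification recommended before launch."

print(indication(risk_score([True, True, False])))  # score 5 -> high-risk indication
```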

This sub-component works to attain the Requirement RQ561.

PL-Methods

This sub-component works towards the following measurable outcomes, and works to attain the Requirement RQ560.

  1. A decision process, with associated tools like UIs and databases of resources
    • This identifies concepts relevant for privacy, like
      • what constitutes personally identifiable data,
      • what data is collected, if any,
      • what inference algorithms (e.g., machine learning) are being applied, and for what purpose.
    • These concepts need to be (co-)related to each other (e.g., which one influences another, and how), and maybe prioritized or weighted.
    • Identify which of the concepts can be measured in any way.
    • Building this process in full should be done iteratively: first a minimal process, which is also applied to a use case; then, in further iterations, increase the number of concepts and decision nodes/questions, applied to subsequent use cases.
  2. Surveys of existing techniques that can be applied to answer the questions at any of the decision nodes. These include:
    1. Anonymization techniques
    2. De-anonymization methods, like machine learning algorithms
    3. Results about how anonymization of data influences desired learning results, e.g., before anonymization some learning and profiling can be done, whereas after anonymization such profiling cannot be done any more, and thus the "personalised pricing scheme" cannot be applied any more (see the survey chapter A General Survey of Privacy-Preserving Data Mining Models and Algorithms and the book it comes from [1], along with the standard book Data Mining: Concepts and Techniques, 2011)
    4. Differential privacy (see survey [2] and more resources from a simple search; a minimal sketch is given after this list)
    5. Survey of Privacy-by-Design concepts and current works. Examples include applications
      1. in Programming,
      2. in System architecture,
      3. in Databases,
      4. in Machine Learning
    6. Privacy for Location data (see old book Privacy, security and trust within the context of pervasive computing or survey A survey of computational location privacy)
    7. Privacy Patterns (starting from A Literature Study on Privacy Patterns Research), and the question of which such patterns involve measurable aspects
    8. Privacy in Biometrics (see book Biometrics: theory, methods, and applications)
    9. Clustering techniques for Privacy (see [3] along with 2017 Privacy and utility preserving data clustering for data anonymization and distribution on Hadoop or An overview of the use of clustering for data privacy or The Effect of Clustering on Data Privacy)
  3. Techniques for measuring various relevant privacy aspects, including tools to automatically do the measuring
    1. Include the multi-metrics framework
  4. Tools for aggregation of the privacy measurements
  5. Examples of use
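To make one of the surveyed techniques concrete, the following is a textbook sketch of the Laplace mechanism for an epsilon-differentially-private counting query. The data and the choice of epsilon are arbitrary illustrations; this is not a SCOTT tool.

```python
import random

def dp_count(records, predicate, epsilon):
    """Epsilon-differentially-private count of records satisfying predicate.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so adding Laplace noise of scale
    1/epsilon gives epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    # The difference of two i.i.d. Exponential(rate=epsilon) variables
    # is distributed as Laplace(0, 1/epsilon).
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    return true_count + noise

# Arbitrary illustrative data: ages recorded by some smart-home system.
ages = [23, 37, 45, 52, 29, 61, 34]
print(dp_count(ages, lambda a: a > 40, epsilon=0.5))
```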

PL-UX

This sub-component works towards the following measurable outcomes.

  1. Privacy Labelling color range and visual cues, including icons
  2. Privacy Label user-friendly explanations, i.e., details
  3. Usability studies to evaluate the User response

This sub-component works to attain the Requirement RQ559.

PL-CERT

This sub-component works towards the following measurable outcomes.

  1. A certification process with all decision points identified, and methods for making an evaluation and a decision at each respective point.
  2. Clearly described relationships with existing standards
  3. A method for aligning Privacy Labelling to existing regulations, including GDPR and the Norwegian national regulations from Datatilsynet

This sub-component works to attain the Requirement RQ560.

Assessment of the Technology Objectives

Baseline

The baseline for starting to develop this technology building block is formed by the existing technologies related to privacy. These are studied in the #PL-Methods sub-component, and are relevant to the outcome #Privacy_Labelling_for_Technical_privacy_engineers as well as to the outcome #Privacy_Labelling_for_certifying_experts_and_certification_bodies. This BB will not attempt to enhance these methods in any way; instead it will survey and evaluate their performance, with the goal of providing informed decision support for each privacy aspect considered in the process developed in the #PL-CERT sub-component.

Also part of the baseline for the work in this BB are various certification methods and processes, especially those related to security. These form the starting point for the #PL-CERT sub-component and the respective outcome #Privacy_Labelling_for_certifying_experts_and_certification_bodies.

There are no certification processes that can be used for certifying a service or IoT device with respect to privacy. Such processes will be created in this BB; in fact, several variants of the process, to accommodate the different objectives that we have.

Various privacy guidelines exist, like privacy-by-design or the EU's General Data Protection Regulation (GDPR). We will use these, and the concepts that they identify and discuss, in order to build this BB.

Enhancements during SCOTT

During SCOTT we enhance existing privacy-relevant technologies and guidelines in several respects, described by the objectives of this BB. From the point of view of the work, these are divided into the four sub-components of this BB.

One enhancement goes in the direction of technologies and the evaluation of their suitability to provide a specific privacy property/functionality. The degree to which a technology can attain some privacy property will be measured on a scale specifically designed for that property. In the certification process, each property with its scale will be part of a decision process. Using the multi-metric approach of the SCOTT BB on Measuring SPD, we will combine all the different scales and measures for each chosen privacy technology into a single value, which on the user side would correspond to a Privacy Label.
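A minimal sketch of how such a combination into a single value and label could look is given below. The property names, scores, weights, and A-F thresholds are invented placeholders; the actual combination method is the multi-metrics approach of the SCOTT BB on Measuring SPD, which this sketch only imitates with a plain weighted average.

```python
# Hypothetical weighted aggregation of per-property privacy scores (0-100,
# higher is better) into one value, then mapped to an A-F Privacy Label.
# Property names, scores, weights, and thresholds are all invented examples.
scores = {"anonymization": 80, "data_minimization": 60, "transparency": 40}
weights = {"anonymization": 0.5, "data_minimization": 0.3, "transparency": 0.2}

def aggregate(scores, weights):
    total_weight = sum(weights.values())
    return sum(scores[p] * w for p, w in weights.items()) / total_weight

def to_label(value):
    # Thresholds chosen arbitrarily, in the spirit of energy labels.
    for label, floor in [("A", 90), ("B", 75), ("C", 60), ("D", 45), ("E", 30)]:
        if value >= floor:
            return label
    return "F"

value = aggregate(scores, weights)
print(f"aggregated score {value:.1f} -> Privacy Label {to_label(value)}")
```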

Another enhancement goes in the direction of certification for privacy. This would be inspired by certifications for security. However, we want two certifications:

  1. One heavy-weight (corresponding to sub-component #PL-CERT), which would be used on a service, IoT device, or product. This would evaluate the privacy of the service and assign a privacy label.
  2. Another light-weight (corresponding to sub-component #PL4Decisions), which is novel in the certification domain and is intended to be used at decision time, to guide negotiations, purchase specifications, and decisions at higher levels. This would not have too many technical aspects; however, the heavy-weight technical certification process would be a detailed version of the light-weight one.

A final enhancement is on the user side. We need to present the certification to a user and explain the privacy label in such a way that it is useful for the end user. Usefulness is both in terms of helping the user make an informed decision when buying a service, and in terms of helping the user understand the privacy consequences of using the service. This is related to the sub-component #PL-UX and concerns both the graphical/visual presentation and how to explain privacy concepts to non-security experts.

Future vision 2025

The vision is that by the end of SCOTT we will have demonstrated how privacy labelling can be done. We will discuss with authorities so as to have the PL taken up in their evaluation processes.

We expect by 2020 to have experimental methods for all planned sub-components.

We expect by 2025 to have such methods taken up in a standardization committee on the technology side, and to be adopted as a certification process by major certification companies like DNV-GL in Norway. We also expect regulatory bodies to have national regulations asking for such labeling, even if most products would only get a default lowest label (as it is often done when a new labeling scheme is taken up by a government, e.g., see the energy efficiency labeling of houses in Norway).

Hindrances and requirements

Scientific, technological, standard or political perspectives:

Scientific hindrances come from the fact that there are very many technical solutions both for providing some privacy property and for breaking the same property. For example, there are anonymization techniques, but there are also various de-anonymization techniques, e.g., based on machine learning or on aggregation of data. Gathering and evaluating all of these is a serious challenge for this BB.

Standardization requirements include those for certification processes, where government and industry need to reach a common understanding.

Political hindrances come from the fact that privacy is a human right, demanded by citizens from their governments. However, private data is one of the most valuable assets nowadays, and thus companies are more and more interested in breaking individual privacy. As a consequence, there is a push-and-pull in the political arena between companies and citizens.

RoadMap

Deliverables and Documents

Practical Aspects

Implementations and User Testing

Demonstrations and Use Cases

  • In WP7 in an initial preliminary phase at M14
  • In WP21 in a more concrete phase at M24

Air Quality monitoring Use Case of WP7

The work in WP7 involves complex systems and algorithms for processing data coming from multiple sensors and other information sources (like weather reports) in order to achieve high-precision monitoring of indoor air quality (IAQ), both in real time and in the quality of the measurements. Thus WP7 provides good use cases for all components of the Privacy Labelling work. Refer to the respective Deliverable D7.1 for detailed scenario presentations. Consider the following aspects of the IAQ work and the related privacy questions/concerns.

An office environment, or an industrial facility (e.g., for storage of goods or for processing of food), is equipped with various sensors: temperature, humidity, pressure, various gasses, particles, sound, luminosity, 3D camera, motion, and window and door latch/open sensors (see slide 18 of the VTT presentation in the 14 Feb 2018 meeting).

  • One Privacy question is how many of these sensors are needed for each specific air-quality functionality that is desired. The more sensors there are, the more accurate the information that can be inferred about the occupancy and the processes/behaviours happening at a given time in the respective indoor location.

This sensor information is gathered in cloud-based systems (see the presentation of Centria from the 14 Feb 2018 meeting and their FOAM platform) that use powerful big-data processing tools like Apache Kafka, Spark, and Cassandra (see the presentation of ITI from the 14 Feb 2018 meeting).

  • One Privacy question is: how is this data being processed in these far-away cloud systems? Can transparency be achieved?
  • Another Privacy question is: what kind of information can these powerful systems extract from the sensor data? How much profiling can be done? These questions are related to research on privacy in static databases, like differential privacy.

However, the data from the sensors first goes through a gateway/router inside the facility before reaching the cloud platform (see the presentation from Centria, slide 5, or the one from F-Secure, from the 14 Feb 2018 meeting). Therefore, various forms of processing can be done on the gateway, some relevant for privacy.

  • Can the gateway do some anonymization pre-processing of the sensor data? This should be done in accordance with the cloud system, so that the amount of anonymization does not interfere with the cloud functionality (e.g., the same profiling should remain possible, as needed for various price agreements). A hypothetical sketch of such pre-processing follows.
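One simple form of gateway-side pre-processing is to coarsen the sensor stream by averaging over time windows before upload, trading minute-level behavioural detail for the aggregate values that the cloud functionality needs. This is only an assumed illustration; the reading format and the 15-minute window are invented.

```python
from statistics import mean

def coarsen(readings, window=15):
    """Aggregate per-minute sensor readings into window-sized averages.

    Coarser data can still support air-quality functionality in the cloud,
    but reveals less about minute-level occupancy and behaviour.
    The 15-minute window is an arbitrary illustrative choice.
    """
    return [mean(readings[i:i + window]) for i in range(0, len(readings), window)]

# One hour of per-minute CO2 readings (invented values).
co2 = [400 + (i % 7) * 5 for i in range(60)]
print(coarsen(co2))  # four 15-minute averages instead of 60 data points
```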

The work in WP7 uses both poor-quality sensors (see the presentation of IMEC from the 14 Feb 2018 meeting) and high-grade sensors (see the presentations of QPLOX and RDVELHO from the 14 Feb 2018 meeting).

  • One Privacy question is how much information needs to be collected from one sensor to achieve the expected functionality. For example, with a high-grade sensor, too-frequent measurements can provide enough information to extract various privacy-sensitive facts, like what activity is carried out in the office, i.e., typing at a computer, walking, sitting, reading (see recent research on this aspect from ESORICS 2017).
  • With poor-quality sensors one usually claims that more sensors are needed; however, how many, and where they are placed, is important so as not to allow inferring more information than needed.

The above example and questions are relevant for the PL-CERT and PL-Methods components. These components would be applied both at the level of the sensors and the immediate gateway (processing unit), and to the overall system, i.e., including the cloud systems.

  • Work on clustering (see presentation of Nokia from 14 Feb 2018 meeting) is relevant for anonymization. Can this be performed by the gateway, or only by the cloud?

The scenarios described in Deliverable D7.1 include several aspects regarding the presentation of the data to users, including presenting privacy aspects to the concerned users. Here is where the work of the PL-UX component would be applicable.

SmartIO works with users that are at decision levels, i.e., those responsible for building a new smart office facility or for retrofitting one with smart-* capabilities. For this kind of user, PL4Decisions would be applicable. Our work would focus on the air quality functionality and privacy questions.


Smart Grid Use Case (not related to any SCOTT WP)

Advanced Metering Infrastructures (AMI) and Smart Meters are deployed in Norway to automatically and continuously measure energy consumption. There are many Privacy Concerns around these:

  1. How much private information can be extracted from this data?
  2. How well is this data anonymized?
  3. How well can we measure the privacy implications of such Smart Systems?

Papers to start from (also see who cites these on scholar.google.com):

  1. "Smart grid privacy via anonymization of smart metering data" by Costas Efthymiou and Georgios Kalogridis, in IEEE International Conference on Smart Grid Communications (SmartGridComm), 2010.
  2. "Influence of data granularity on smart meter privacy" by Günther Eibl and Dominik Engel, in IEEE Transactions on Smart Grid 6.2 (2015): 930-939.
  3. "Do not snoop my habits: preserving privacy in the smart grid" by Félix Gómez Mármol, Christoph Sorge, Osman Ugus, and Gregorio Martínez Pérez, in IEEE Communications Magazine 50.5 (2012).
  4. "Achieving anonymity via clustering" by Aggarwal et al., in Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems (PODS), ACM, 2006.
  5. "An overview of the use of clustering for data privacy" by Vicenç Torra, Guillermo Navarro-Arribas, and Klara Stokes, in Unsupervised Learning Algorithms, Springer, Cham, 2016, 237-251.

There are well-known ways of using the Smart Meter data to extract behaviours and build profiles:

  1. Leveraging smart meter data to recognize home appliances
  2. Private memoirs of a smart meter
  3. Security and privacy challenges in the smart grid
  4. Privacy-Friendly Aggregation for the Smart-Grid
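To make the anonymization-via-clustering idea from the reading list above concrete, here is a toy sketch that groups household load profiles into clusters of at least k members and publishes only cluster averages. The data, the grouping heuristic, and k are invented; this is not the algorithm of any cited paper.

```python
from statistics import mean

def cluster_averages(profiles, k=3):
    """Sort profiles by total consumption, group into clusters of size >= k,
    and return one averaged profile per cluster (a simple k-member grouping,
    invented here for illustration)."""
    ordered = sorted(profiles, key=sum)
    clusters = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(clusters) > 1 and len(clusters[-1]) < k:  # merge an undersized tail cluster
        clusters[-2].extend(clusters.pop())
    return [[mean(values) for values in zip(*cluster)] for cluster in clusters]

# Eight households, four readings each (kWh per 6-hour slot, invented values).
profiles = [[1.2, 0.8, 2.5, 3.1], [0.9, 0.7, 2.0, 2.8], [1.5, 1.1, 3.0, 3.5],
            [0.4, 0.3, 1.0, 1.2], [2.0, 1.5, 3.8, 4.0], [0.6, 0.5, 1.4, 1.6],
            [1.1, 0.9, 2.2, 2.9], [1.8, 1.3, 3.3, 3.7]]
for avg in cluster_averages(profiles, k=3):
    print([round(v, 2) for v in avg])
```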

Requirements

Requirements chart
ID | BB | Short Name | Description | Rationale | Source | Status | Progress | Comments | Contact
67 | BB26.G Privacy labels | Understandable Privacy Label | Privacy labels must be understandable. | Since privacy labels are targeted at end users, they must be understandable, trustable, and user-friendly. | WP07 Air Quality; WP08 Smart Infrastr.; WP14 Car access; WP21 Assisted living | Confirmed | 0% | We have not started working towards this requirement yet, because it needs work in other sub-components, mainly technical work in PL-Methods and PL-CERT. | Christian Johansen
435 | BB26.G Privacy labels | Security_labels | The component should allow the definition of a dedicated security label for each of the system's components. | | WP09 Facilities Mgmt.; WP15 Vehicle within Infra. | Confirmed | NOT RELEVANT | Please cancel this requirement because it is no longer in line with any of the use cases of privacy labelling. | Christian Johansen
558 | BB26.G Privacy labels | Trustable Privacy Labels | Privacy labels must be made so that the end user can trust them. | Since the target is for privacy labels to be accepted by the end users, the labels must be trustable. This means endorsed by some trusted third party, or certified by some accepted method which the user trusts, etc. | WP07 Air Quality; WP08 Smart Infrastr.; WP11 Secure Cloud; WP12 Automot. Testing; WP14 Car access; WP21 Assisted living | Confirmed | 10% | We are working with the Norwegian regulatory body Datatilsynet and other organizations in this direction. We are also trying to start collaborations outside Norway. We have started developing a certification methodology. | Christian Johansen
559 | BB26.G Privacy labels | User-friendly Privacy Labels | Privacy labels must be user-friendly so that they correctly communicate the message to the end user. | Their presentation must be developed together with the users so that the labels appeal to the end user and correctly communicate what the user expects. | WP07 Air Quality; WP08 Smart Infrastr.; WP14 Car access; WP15 Vehicle within Infra.; WP21 Assisted living | Confirmed | 10% | Initial work, including an internal workshop, has been done to identify the user aspects and expectations. More work is planned at the end of this year, after more of the technical aspects are in place. | Christian Johansen
560 | BB26.G Privacy labels | Feasible evaluation | The method or procedure to certify that a system can be labelled with some specific Privacy Label should be feasible for normal IoT companies. | Costs should be manageable and the time for certification realistic; this puts requirements on the methodology and process for doing Privacy Labelling Certification. | WP07 Air Quality; WP08 Smart Infrastr.; WP11 Secure Cloud; WP12 Automot. Testing; WP14 Car access; WP21 Assisted living | Confirmed | 10% | Feasibility methods and reviews of relevant technology and tools are planned, especially targeting technology developers, to ease their efforts in providing privacy-preserving technologies that could comply with a desired privacy label. | Christian Johansen
561 | BB26.G Privacy labels | Flexibility | Privacy Labelling should be flexible so that the large diversity of IoT systems can be labelled. | The methodology and process of certifying an IoT system for a Privacy Label should be flexible enough to encompass the many varied aspects of IoT. It also needs room for unmeasurable aspects that might need to be considered. | WP07 Air Quality; WP08 Smart Infrastr.; WP11 Secure Cloud; WP12 Automot. Testing; WP14 Car access; WP21 Assisted living | Confirmed | 10% | We are planning different alternatives of privacy labels and certification requirements for different areas and services. Ongoing work is on a health app and on a large smart-building system. | Christian Johansen


SCOTT status

From Ramiro: An overview of the instructions for updating the building blocks and the collection of the requirements can be found in this presentation (slides 19-24). https://projects.avl.com/16/0094/WP26/Documents/02_Meetings%20and%20WebEx/20170703_SCOTT_Presentation_WP26.pptx?Web=1


The official and complete instructions can be found in the following presentation from SP1 requirements management. https://projects.avl.com/16/0094/WP01/Documents/03_Deliverables/SCOTT%20REQM%20Approach_Guidance_June2017.pptx?Web=1