Overview   |   Coloring Privacy    |

logo

Introduction

Color is one of the important features in visualization. For example, colors are used to represent the population of a region on a map. The darker the color uses, the more people a region could have. The coded information with colors or other visual attributes seems safer towards privacy breach compared to the raw data, but there could still be potential privacy attacks based on the visual information. This project is motivated by the possible privacy breach issue in the visualization process via colors. When applying common privacy-preserving techniques including Laplacian Mechanism and Exponential Mechanism for ε-differential privacy, we have addressed their corresponding challenges and possible solutions. Moreover, information visualization techniques have their own set of criteria for evaluation, such as traditional Nielsen's set of heuristic evaluation. Hence, we would like to explore the special utilities of visualization, such as the contrasting effect of colors, to minimize any loss in the effectiveness and efficiency of visualization techniques, or in other words, any loss in the utilities of visualization due to privacy-preserving mechanism.

Visualization Query Model

Visualization aims at enhancing data analysts' understanding on data given. We generalize this process in the following model, as shown in Figure 1. Intuitively, the model is an interaction between a data user and a database. The user will have a graphic interface to interact with the database. This interface will generate queries to the database, where raw information or semi-processed information is stored. A further query computation will be processed on the data and generate a query result. In the end, data will be mapped onto a view, which will be presented to the user.

 

 

VisualizationQueryModel

 

Figure 1: Visualization Query Model

ColorConversion

ColorConversionFormula

Figure 2: Coloring conversion for the query result

Frequency Coloring

Example: Given a fact table D of n tuples with column attributes: individual ID and their corresponding location. The user request to create a view on the number of individuals at each geographic grid on the map with corresponding colors. For instance, in Figure 2, given the fact table D, we could derive a frequency matrix FreqM. If we have nine colors available, denoted by the color set C = {0,1,..., 8}, we map color 1 to frequency 0, color 3 to frequency 1, color 5 to frequency 2 and color 7 to frequency 3. Based on this coloring assignment, we could get a coloring matrix ColorM.

Possible Privacy Breach: Simply using those colors to represent the frequency on the map could breach privacy. If a taxi changes its location from one grid cell to another, the frequency for these two grid cells will change and hence the coloring. For example, in Figure 4, the first individual's location changes from (0, 1) to (2, 1), it could result in di erent coloring matrix as shown in Figure 3(a). Then the adversary could see the di erence highlight in the red box of the two coloring matrix ColorM and ColorM', due to the change of one tuple in the fact table D. Common mechanism such as Laplacian Mechanism or Exponential Mechanism could be used to preserve ε-differential privacy, but there are challenges to each of them.

PrivacyAttacks

Figure 3: Potential privacy breach due to one tuple change in the fact table

Coloring Utility:

Approaches towards Frequency Coloring

Future Work

Managed By

Presentation Slide

Details and experiment results could refer to the projec presentation slide
Some testing maps can be found here