The purpose of the table below is to keep the table you are currently anonymizing visible. You do not have to click anything. The column for which you are currently defining rules is highlighted in yellow.
In Privacy-Preserving Data Publishing, the protection of privacy depends on generalization. The more general the value used, the better the privacy protection. However, generalizing data too much causes too much distortion and renders the data useless for data miners. For example, generalizing a person's age to intervals of size 5 would normally offer sufficient protection, while generalizing age to intervals of size 20 can cause too much distortion. In addition to numerical values, categorical values are also generalized.
For example, if Job is part of the QID, we can generalize tester, programmer and developer to information technology, and surgeon, family doctor and nurse to medicine. If this generalization does not offer sufficient privacy protection, the values can be generalized further to 'any job'.
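The two kinds of generalization described above can be sketched in a few lines of Python. This is only an illustration, not the tool's own implementation: the function names, the `JOB_TO_FIELD` mapping, the numbered levels, and the choice to align intervals at multiples of the interval size are all assumptions made for the example.

```python
def generalize_number(value: int, size: int) -> str:
    """Replace a number with the interval [low, low + size) that contains it.

    Assumption: intervals start at multiples of `size`.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    low = (value // size) * size
    return f"[{low}-{low + size})"


# Hypothetical rules taken from the Job example in the text.
JOB_TO_FIELD = {
    "tester": "information technology",
    "programmer": "information technology",
    "developer": "information technology",
    "surgeon": "medicine",
    "family doctor": "medicine",
    "nurse": "medicine",
}


def generalize_job(job: str, level: int) -> str:
    """Level 0 keeps the job, level 1 returns its field, level 2 or more returns 'any job'."""
    if level <= 0:
        return job
    if level == 1:
        # Jobs without a rule fall back to the most general value.
        return JOB_TO_FIELD.get(job, "any job")
    return "any job"


if __name__ == "__main__":
    print(generalize_number(23, 5))    # interval of size 5
    print(generalize_number(23, 20))   # interval of size 20, more distortion
    print(generalize_job("tester", 1))
    print(generalize_job("tester", 2))
```

With a size of 5 the ages 21 and 24 fall into the same interval but 37 does not, while with a size of 20 all ages from 20 to 39 become indistinguishable, which is where the extra distortion comes from.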
Your task is to decide how much generalization is needed. Depending on whether the column contains numerical or categorical values, you need to either specify an interval size or add generalization rules.
Select interval size.
min tells you the smallest value in this column and max tells you the largest value in this column. For example, if you are generalizing the column age and min is 18 and max is 65, the youngest person in this table is 18 years old and the oldest is 65 years old.
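As a rough aid for choosing an interval size from min and max, the sketch below counts how many intervals a given size would produce for a column. The function name `interval_count` is hypothetical, and the sketch assumes that intervals start at multiples of the interval size, which may differ from how the tool places its interval boundaries.

```python
def interval_count(min_value: int, max_value: int, size: int) -> int:
    """Count the aligned intervals of width `size` needed to cover min_value..max_value."""
    if size <= 0:
        raise ValueError("size must be positive")
    if min_value > max_value:
        raise ValueError("min_value must not exceed max_value")
    first = min_value // size   # index of the interval containing the minimum
    last = max_value // size    # index of the interval containing the maximum
    return last - first + 1


if __name__ == "__main__":
    # Age column from the example: min is 18 and max is 65.
    for size in (5, 10, 20):
        print(size, interval_count(18, 65, size))
```

Fewer intervals mean more people share each generalized value, which gives stronger protection but also more distortion.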
Click the next button.
Set generalization for this rule.
In the input box labelled 'Please specify general value for these values', enter the more general value. For example, if you have previously clicked on programmer, tester and developer, you might want to enter information technology there.
Click ok to close the input box.
Click the next button.