Replies: 2 comments
-
Regarding #4, "Do the number/size of geographies affect the k-values? E.g. if you have a large populated area, should the k-values be lower?" However, I do NOT think you can adjust both k and bin size, as making k too small creates re-identification potential regardless of the geographic or temporal size of the bins. Rather, for a specified k, think about what your bin size should be to meet the use case.
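To make that concrete, here is a rough sketch of holding k fixed and varying only the bin size. Everything in it is an assumption for illustration: the hourly trip counts are made up, `coarsen` and `share_of_trips_published` are hypothetical helpers, and the suppression rule (withhold any bin with fewer than k trips) is my reading rather than anything taken from the MDS spec.

```python
from typing import List


def coarsen(counts: List[int], factor: int) -> List[int]:
    """Merge every `factor` adjacent fine-grained bins into one coarser bin."""
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]


def share_of_trips_published(counts: List[int], k: int) -> float:
    """Fraction of all trips that land in bins with at least k trips.

    Assumed rule: a bin with fewer than k trips is withheld entirely.
    """
    total = sum(counts)
    if total == 0:
        return 0.0
    kept = sum(c for c in counts if c >= k)
    return kept / total


# Hypothetical hourly trip counts for one zone (made-up numbers).
hourly = [2, 3, 1, 4, 12, 15, 9, 3, 2, 6, 11, 4]
K = 10  # held fixed; only the bin size varies

for factor in (1, 2, 4):
    binned = coarsen(hourly, factor)
    print(factor, binned, round(share_of_trips_published(binned, K), 2))
```

With these made-up numbers, 1-hour bins publish about 53% of trips, 2-hour bins 75%, and 4-hour bins all of them. The use case then tells you whether 4-hour resolution is still useful; k itself never moves.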
-
This is such a difficult (and, to me, interesting) set of questions. If anyone else on the Privacy Committee would like to join an informal reading group on the topic, there are a bunch of whitepapers on the shortfalls of k-anonymity and different strategies for overcoming them. The TL;DR is that k-anonymity is not a silver bullet: even with k=10 there is a risk of re-identification and/or attribute disclosure, depending on what the underlying data is and what other data exists on the same individuals.
For this reason, I think it is difficult, and maybe harmful, for OMF to attempt to give generalized guidance for dynamic, queryable APIs like Metrics, which can be used or attacked in any number of ways. Instead, the safest recommendation today is to treat this data as no less sensitive than the raw MDS data it describes. I don't know if I am in the minority on this view…
For static, precomputed things like Reports, I think we can get to recommendations that make this data relatively safe to publish and/or share widely. Starting with a specific use case in mind, we can first determine the maximum k-value that meets the need (i.e., how much information loss is acceptable before the data set becomes useless?). Then, based on that use case and k-value, we can look at what risks might remain, determine whether there are ways to mitigate them, and/or make additional recommendations about how the data is shared. I'd love to see the Privacy Committee attempt this process for Reports.
-
The OMF is working to develop guidance around k-anonymity values and privacy, and we welcome feedback from implementers in this discussion area.
Currently, we reference k-values in two areas of the MDS 1.1.0 release, which is now in the OMF approval phase. You can review the guidance in the Data Redaction areas of Metrics and Provider Reports. The guidance as written will serve while these features are in beta and while testing and feedback from real-world scenarios and use cases begin. For the next release, we would like to refine this guidance based on community, agency, provider, and expert feedback.
The current recommendation of "10" is conservative, and does not return data for 0-10 aggregated count values. Feedback suggests there may be a more nuanced approach to this.
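As a rough illustration of the behavior under discussion, here is a minimal sketch of threshold-based redaction. It is not the MDS implementation. The function `redact_counts`, the zone names, the use of `None` as the withheld value, and the `return_zero` flag are all hypothetical. It also assumes that counts strictly below k are withheld; if the intent is that a count of exactly 10 is withheld too, the comparison would need to change.

```python
from typing import Dict, Optional


def redact_counts(
    counts: Dict[str, int],
    k: int = 10,
    return_zero: bool = False,
) -> Dict[str, Optional[int]]:
    """Withhold any aggregated count below k (assumed rule).

    If `return_zero` is True, an exact zero is passed through as its own
    value instead of being withheld with the other small counts.
    """
    redacted: Dict[str, Optional[int]] = {}
    for bin_id, count in counts.items():
        if count >= k or (return_zero and count == 0):
            redacted[bin_id] = count
        else:
            redacted[bin_id] = None  # value withheld
    return redacted


# Made-up aggregated counts per zone.
zone_counts = {"zone_a": 0, "zone_b": 7, "zone_c": 10, "zone_d": 42}

print(redact_counts(zone_counts))
print(redact_counts(zone_counts, return_zero=True))
```

In the first call both `zone_a` and `zone_b` come back withheld, so a consumer cannot tell "no activity" from "a little activity". In the second call the zero is visible and only `zone_b` is withheld.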
Here are some discussion questions to get you going:
1. Should 0 be returned as its own discrete value?
Thank you, and we welcome your discussions to help inform the Working Group and OMF Committee guidance.