Study | Data details | % of suspected matches verified as actual matches |
---|---|---|
| | Matched 15,000 Safe Harbor de-identified admission records from a regional hospital to a marketing dataset of 30,000 records | 10% (2/20) |
Elliot et al. [29] | Attempted to re-identify records sampled from the UK Labour Force Survey (LFS) and the Living Costs and Food Survey (LCF). Matches were performed with and without the Output Area Classifier (OAC), which provides more precise geography | • LFS: 12% (6/50) using web-based information for matching; 28% (14/50) using commercial data • LCF: 10% (2/20) for dataset without OAC; 43% (18/42) for dataset with OAC |
| | Data examined were tabular in nature, consisting of 89 tables that were determined to be potentially high risk | • 36% of claims of identifying a neighbor were correct • 61% of claims of identifying self/family were correct • All claims, except one, involved people the intruder knew |
Sweeney [45] | News reports of hospitalizations (n = 81) were used to identify individuals in a Washington state hospital inpatient dataset of 648,384 records | 23% (8/35) |
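
Each percentage in the last column is the number of verified matches divided by the number of suspected matches. The following is a minimal sketch that recomputes the rates from the counts reported in the table. The dictionary labels and the function name `verified_rate` are illustrative shorthand, not terms from the cited studies.

```python
# (verified, suspected) counts copied from the table; labels are shorthand
# introduced here for illustration only.
counts = {
    "Safe Harbor admission records vs. marketing data": (2, 20),
    "LFS, web-based information": (6, 50),
    "LFS, commercial data": (14, 50),
    "LCF, without OAC": (2, 20),
    "LCF, with OAC": (18, 42),
    "Sweeney [45], news reports": (8, 35),
}


def verified_rate(verified: int, suspected: int) -> float:
    """Return the percentage of suspected matches verified as actual matches."""
    if suspected <= 0:
        raise ValueError("suspected must be a positive count")
    if not 0 <= verified <= suspected:
        raise ValueError("verified must lie between 0 and suspected")
    return 100.0 * verified / suspected


# Round to whole percentages, as the table does.
rates = {label: round(verified_rate(v, s)) for label, (v, s) in counts.items()}

for label, pct in rates.items():
    print(f"{label}: {pct}%")
```

The denominators are the suspected matches only, not the full datasets, so these rates describe how often a claimed match was correct rather than the share of all records that were re-identified.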