Some publicly available data contain personal identifiers which may pose a risk to the reputation, employability, insurability, or privacy of individuals or to classes of individuals.
If the links are removed before the researcher sees the data, then the data are completely "anonymous." Complete anonymity provides reasonable assurance that individuals are protected. It does not protect against risk to classes of individuals.
If the links to individuals are removed by the researcher, and if privacy and confidentiality can be maintained, then in the absence of data relevant to reputation, employability, or insurability the risk can normally be considered negligible.
For example:
A publicly available dataset contains self-reported alcohol purchasing data among university faculty. There are no identifiers to the individuals surveyed or described in the data. A study is published and reports a rate of alcohol purchasing that is unexpectedly high by comparison to previously published reports of similar purchasing habits conducted among other professional groups.
In the course of the ensuing year by several processes of elimination, the campus is identified by certain otherwise innocuous characteristics such as: number of faculty, ethnic composition, gender composition, and zip codes, all of which are available from other reports and documents and "fit" the profile of "Santa Monica State University." Although the campus is never mentioned, the report is briefly mentioned in a 20 minute segment on "60 Minutes" later that year.
A member of the SMSU faculty is surprised when, applying for health insurance, the carrier declines on the basis of increased risk because the faculty is a member of a class "known" to have high rate of alcoholism.
Was there "risk" in the original data, or did the damage occur through highly unlikely "sleuthing" and inferences by the press and faulty deductions by the insurance carrier?
The data contained the risk because alcohol is not a risk-neutral subject. The researcher provided the information necessary for the identification of the subject campus and could have avoided doing so. The press and the insurance company acted irresponsibly. Purchasing alcoholic beverages does not equate to alcoholism, yet the public quickly ignored this logical inconsistency in its willingness to believe the bad news. Researchers should be aware of the context in which they work and take necessary precautions.