The topic of de-identification of personal information has come up in discussions with clients several times in the past year. In each scenario, our client or potential client is collecting and maintaining a store of personal information which must be protected from breach—customer records, payment card industry cardholder data, electronic protected health information (ePHI), etc. There are several reasons that an organization storing the personal information of others may wish to de-identify the data.
For just about any organization, de-identification of historical personal information not otherwise required to be retained mitigates privacy risks to individuals while also reducing the organization’s exposure to breach risk (e.g., reputational damage and remediation costs). This discussion often surfaces when we do a SOC 2 Privacy audit and discuss the requirement to keep personal information only for as long as necessary to fulfill the stated purposes or as required by law or regulations.
For healthcare industry organizations, de-identification of patient data allows covered entities under HIPAA to share their patient data with other organizations—such as for medical research and comparative effectiveness studies. De-identification of ePHI involves the removal of certain identifiers from patient data. Doing so detaches the identity of the patient from the patient data and effectively renders the health information no longer subject to HIPAA’s requirements. This can also reduce the cost of compliance by reducing the scope of HIPAA compliance assessments and audits.
If your organization is considering de-identification of personal information, we recommend taking a look at the HIPAA Privacy Rule’s standard for de-identification of protected health information. This is found in Section 164.514(a) of the rule. Under this standard, health information is not deemed individually identifiable if it does not identify an individual and if the covered entity has no reasonable basis to believe it can be used to identify an individual.
The Safe Harbor method for meeting this standard, set out in Section 164.514(b)(2), lists eighteen (18) specific identifiers of individuals or of relatives, employers, or household members of the individuals, that must be removed from patient records to meet the HIPAA Privacy Rule’s requirements. Although this list originates from HIPAA, it is useful to any organization seeking to reduce the business risks of maintaining stores of any type of personal information, including data that could be used for identity theft. The 18 identifiers include:
(1) Names
(2) All geographic subdivisions smaller than a state, including street address, city, county, precinct, ZIP code, and their equivalent geocodes, except for the initial three digits of the ZIP code if, according to the current publicly available data from the Bureau of the Census:
(a) The geographic unit formed by combining all ZIP codes with the same three initial digits contains more than 20,000 people; and
(b) The initial three digits of a ZIP code for all such geographic units containing 20,000 or fewer people is changed to 000
(3) All elements of dates (except year) for dates that are directly related to an individual, including birth date, admission date, discharge date, death date, and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older
(4) Telephone numbers
(5) Vehicle identifiers and serial numbers, including license plate numbers
(6) Fax numbers
(7) Device identifiers and serial numbers
(8) Email addresses
(9) Web Universal Resource Locators (URLs)
(10) Social security numbers
(11) Internet Protocol (IP) addresses
(12) Medical record numbers
(13) Biometric identifiers, including finger and voice prints
(14) Health plan beneficiary numbers
(15) Full-face photographs and any comparable images
(16) Account numbers
(17) Any other unique identifying number, characteristic, or code, except as permitted under the Privacy Rule’s implementation specification allowing the assignment of a unique code to the set of de-identified health information to permit re-identification by the covered entity (re-identification implementation specification)
(18) Certificate/license numbers
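Most of the identifiers above are simply removed, but the geographic and date items (2 and 3) call for generalization rather than deletion. The following is a minimal Python sketch of how those two rules might be applied to a single record, not a complete or authoritative de-identification routine. The function names, the returned dictionary layout, and the contents of `SPARSE_ZIP3_PREFIXES` are our own assumptions for illustration; in practice the set of low-population three-digit prefixes must be derived from current Census Bureau data.

```python
from datetime import date

# ASSUMPTION: illustrative placeholder values only. The real set of 3-digit
# prefixes covering 20,000 or fewer people must be built from current,
# publicly available Census Bureau data.
SPARSE_ZIP3_PREFIXES = frozenset({"036", "059"})


def generalize_zip(zip_code, sparse_prefixes=SPARSE_ZIP3_PREFIXES):
    """Reduce a ZIP code to its first three digits, or to "000" when the
    three-digit area holds 20,000 or fewer people (identifier 2)."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        # ASSUMPTION: treat malformed ZIP codes conservatively.
        return "000"
    prefix = digits[:3]
    return "000" if prefix in sparse_prefixes else prefix


def generalize_birth_date(birth, as_of):
    """Keep only the birth year, and fold ages over 89 into a single
    "90+" category with the year suppressed (identifier 3)."""
    # Subtract one year if the birthday has not yet occurred in as_of's year.
    had_birthday = (as_of.month, as_of.day) >= (birth.month, birth.day)
    age = as_of.year - birth.year - (0 if had_birthday else 1)
    if age >= 90:
        # The birth year would reveal an age over 89, so it is dropped too.
        return {"birth_year": None, "age": "90+"}
    return {"birth_year": birth.year, "age": str(age)}


if __name__ == "__main__":
    print(generalize_zip("90210-1234"))
    print(generalize_zip("03601"))
    print(generalize_birth_date(date(1930, 5, 1), date(2024, 1, 1)))
```

The sketch defaults to the more protective outcome whenever input is ambiguous, for example by returning "000" for a ZIP code it cannot parse. An organization could make a different choice, such as rejecting the record, depending on its data quality requirements.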