Goal: De-identify Elevance / CareFirst data so that it can be safely used to:
Improve NPS: One of NASCO's initiatives is to reduce duplicate claims. De-identified data will let us back-test candidate solutions and verify their accuracy.
Design better benefits: NASCO will be able to design better benefit solutions by analyzing a large pool of de-identified claim/member data.
Predict service needs: Analyzing the de-identified data will allow us to build better servicing solutions by matching features to the types of needs we see in population health data.
In scope:
Claim and member data are the highest priority for this project. Other data sets may be added (TBD).
Hash member ID to be able to link claim/member records?
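One way to approach the open question above is a keyed hash: the same member ID always maps to the same token, so claim and member records can still be joined, but the token cannot be recomputed without the key. The sketch below is only an illustration under assumptions. The function name, the key handling, and the record field names are all hypothetical, and whether a keyed hash is acceptable for this data set is a question for compliance review.

```python
import hashlib
import hmac


def pseudonymize_member_id(member_id: str, secret_key: bytes) -> str:
    """Return a stable keyed token for a member ID (HMAC-SHA256, hex-encoded)."""
    # Normalize so that " m001 " and "M001" map to the same token.
    normalized = member_id.strip().upper()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; a real key would be stored and
# rotated outside the de-identified data set.
SECRET_KEY = b"replace-with-a-managed-secret"

# Hypothetical sample records; the field names are assumptions.
members = [{"member_id": "M001", "plan": "PPO"}]
claims = [{"claim_ref": "C9", "member_id": "m001 ", "amount": 120.0}]

# Replace the raw member ID with its token on both sides, then join on the token.
plan_by_token = {
    pseudonymize_member_id(m["member_id"], SECRET_KEY): m["plan"] for m in members
}
linked = [
    {
        "claim_ref": c["claim_ref"],
        "plan": plan_by_token[pseudonymize_member_id(c["member_id"], SECRET_KEY)],
    }
    for c in claims
]
```

A plain unkeyed hash is deliberately avoided here, because anyone who can enumerate candidate member IDs could rebuild the mapping.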
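What to remove based on CMS (the 18 identifiers of the HIPAA Safe Harbor method):
Note: Fewer fields may need to be masked or removed if we work with a statistician (the HIPAA Expert Determination method). Jake to pursue.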
Names
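Geographic subdivisions smaller than a state (except the first 3 digits of the ZIP code, if the area formed by all ZIP codes sharing those 3 digits has more than 20,000 people; otherwise use 000)
All elements of dates (except year) directly related to an individual, e.g. birth, admission, discharge, and death dates; ages over 89 are aggregated into a single 90+ category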
Phone numbers
Fax numbers
Email addresses
Social Security numbers
Medical record numbers
Health plan beneficiary numbers
Account numbers
Certificate/license numbers
Vehicle identifiers and serial numbers
Device identifiers and serial numbers
URLs
IP addresses
Biometric identifiers (including fingerprints and voice prints)
Full-face photos and comparable images
Any other unique identifying numbers, characteristics, or codes
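To make the list above concrete, here is a minimal sketch of how a few of these rules might be applied to a single record: direct identifiers are dropped, the ZIP code is cut to 3 digits (or 000), and a date is reduced to its year. It covers only a handful of the categories and is not a complete implementation. The field names, the helper names, and the contents of the low-population ZIP set are placeholders for illustration; the real set would have to be built from current Census data.

```python
from datetime import date
from typing import AbstractSet, Any, Dict

# Hypothetical field names for identifiers that are removed outright.
DIRECT_IDENTIFIER_FIELDS = {"name", "phone", "fax", "email", "ssn", "medical_record_number"}

# Placeholder values for illustration only: 3-digit ZIP prefixes whose combined
# population is 20,000 or fewer.
LOW_POPULATION_ZIP3 = frozenset({"036", "059"})


def generalize_zip(zip_code: str, low_population_zip3: AbstractSet[str]) -> str:
    """Keep the first 3 ZIP digits, or return "000" for low-population or malformed ZIPs."""
    zip3 = zip_code.strip()[:3]
    if len(zip3) != 3 or not zip3.isdigit() or zip3 in low_population_zip3:
        return "000"
    return zip3


def deidentify_record(
    record: Dict[str, Any],
    low_population_zip3: AbstractSet[str] = LOW_POPULATION_ZIP3,
) -> Dict[str, Any]:
    """Drop direct identifiers and generalize ZIP and service date in one record."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIER_FIELDS}
    if "zip" in cleaned:
        cleaned["zip3"] = generalize_zip(cleaned.pop("zip"), low_population_zip3)
    if "service_date" in cleaned:
        # Only the year of a date may be retained.
        cleaned["service_year"] = cleaned.pop("service_date").year
    return cleaned


# Hypothetical sample record.
sample = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "21201",
    "service_date": date(2024, 3, 15),
    "amount": 120.0,
}
result = deidentify_record(sample)
```

Free-text fields and the catch-all "any other unique identifying numbers" category are not handled by a field list like this and would need separate review.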
Notes:
12/04/24
Informatica has masking and de-identification tools.
Re-use dormant Claims ODS storage