Identifying Records

This article provides in-depth advice on how to use record identifiers with the Identity API.

Contents

Using Identifiers
Business vs. Internal Identifiers
Selecting a Source Identifier
Other Considerations
- Encrypting Identifiers
- Synthetic Identifiers

Using Identifiers

When adding records using the Identity API, you provide a source name and a source-specific record identifier. For example, when adding John Smith’s record from Community Clinic, you might use the source name CommunityClinic and Community Clinic’s internal database ID 12345:

POST /mpi/v1/record/CommunityClinic/12345

{
  "firstName":"John",
  "lastName":"Smith",
  "dob":"2003/01/02",
  ...
}
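
As a rough sketch of how a client might assemble this request in Python, the snippet below builds the URL from the source name and record identifier. The base URL, the Content-Type header, and the helper name build_add_record_request are assumptions for illustration; only the path pattern and the demographic fields come from the example above. Authentication is omitted, and the request is built but not sent.

```python
import json
import urllib.request
from urllib.parse import quote

# Hypothetical base URL; substitute the host of your own Identity API deployment.
BASE_URL = "https://identity.example.com"


def build_add_record_request(source, identifier, demographics):
    """Build (but do not send) a POST request for one source record."""
    # Percent-encode both path segments so unusual characters stay inside the path.
    path = f"/mpi/v1/record/{quote(source, safe='')}/{quote(identifier, safe='')}"
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(demographics).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed, not from the article
        method="POST",
    )


request = build_add_record_request(
    "CommunityClinic",
    "12345",
    {"firstName": "John", "lastName": "Smith", "dob": "2003/01/02"},
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; how the service responds is not covered here.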

These record identifiers are also returned, in plain text, when you query for patients. For example, if you try to find records matching John’s demographics, you would receive:

{
    "matchedPerson": {
        "id": "a46ee8bd-9dd1-41e6-bced-f2578ce37ae0",
        "records": [
            {
                "source": "CommunityClinic",
                "identifier": "12345"
            }
        ],
        ...
}
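
To illustrate reading such a response, here is a minimal Python sketch that pulls the source and identifier pairs out of the matched person. The sample is trimmed to the fields shown above, the helper name record_keys is hypothetical, and returning an empty list when matchedPerson is absent is an assumption about how you might want to handle a non-match.

```python
import json

# Sample response trimmed to the fields shown above; real responses contain more.
response_text = """
{
    "matchedPerson": {
        "id": "a46ee8bd-9dd1-41e6-bced-f2578ce37ae0",
        "records": [
            {"source": "CommunityClinic", "identifier": "12345"}
        ]
    }
}
"""


def record_keys(response):
    """Return (source, identifier) pairs for the matched person, if any."""
    person = response.get("matchedPerson") or {}
    return [(r["source"], r["identifier"]) for r in person.get("records", [])]


print(record_keys(json.loads(response_text)))
```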

Business vs. Internal Identifiers

Identifiers come in two basic types:

Business Identifiers have business or real-world meaning outside the source system. These could be MRNs, SSNs, insurance account numbers, etc. Most business identifiers are considered PII/PHI.
Internal Identifiers have no meaning outside of the source system. These are typically primary keys from the source system’s internal database.

Note
We advise always using internal identifiers when sending data to the Identity API.

Internal identifiers are preferable for several reasons:

They preserve the blinded nature of the Identity service.
They are unique to the record and immutable, whereas business identifiers may be added to or removed from records.
Unlike the demographic elements, the source system and identifier are not irreversibly hashed and can be retrieved in subsequent operations, so a business identifier sent as the record identifier would be returned in plain text.

Selecting a Source Identifier

Selecting the appropriate source identifier requires knowledge of the source system. You want one that:

Appears reliably for every record.
Is ideally an internal identifier (described above).

For example, an HL7 system might send multiple identifiers in PID.3:

PID|1||111111^^^AA1^MR^ABC~123456789^^^SSA^SS^ABC~12345^^^AA2^PI^ABC||Smith^John|||||||||||||7890

The identifiers include:

An MRN (MR) of 111111 from assigning authority “AA1”.
An SSN (SS) of 123-45-6789 from assigning authority “SSA”.
An internal patient identifier (PI) of 12345 from assigning authority “AA2”.

If you know that this system reliably sends the PI identifier type for every record, that would probably be your best candidate.
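
The selection described above could be sketched in Python as follows. This is a simplified parser, not a full HL7 implementation: it assumes the default delimiters (|, ~, ^), ignores escape sequences, and the function names parse_pid3 and select_identifier are illustrative.

```python
def parse_pid3(pid_segment):
    """Split PID.3 into a list of dicts, one per repeated identifier."""
    fields = pid_segment.split("|")
    # fields[0] is "PID", so PID.3 sits at index 3.
    pid3 = fields[3] if len(fields) > 3 else ""
    identifiers = []
    for repetition in pid3.split("~"):
        components = repetition.split("^")
        components += [""] * (6 - len(components))  # pad short repetitions
        identifiers.append({
            "value": components[0],      # ID number
            "authority": components[3],  # assigning authority
            "type": components[4],       # identifier type code
        })
    return identifiers


def select_identifier(pid_segment, type_code="PI"):
    """Return the first non-empty identifier value of the given type, or None."""
    for identifier in parse_pid3(pid_segment):
        if identifier["type"] == type_code and identifier["value"]:
            return identifier["value"]
    return None


segment = (
    "PID|1||111111^^^AA1^MR^ABC~123456789^^^SSA^SS^ABC~12345^^^AA2^PI^ABC"
    "||Smith^John|||||||||||||7890"
)
print(select_identifier(segment))
```

Returning None when the chosen type is missing lets the caller decide whether to reject the message or fall back to another strategy.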

For a FHIR-based source system, you would similarly need to select from among the various identifiers in the Patient.identifier field:

  "identifier": [
    {
      "type": {
        "coding": [
          {
            "system": "http://terminology.hl7.org/CodeSystem/v2-0203",
            "code": "MR"
          }
        ]
      },
      "value": "11111",
      ...
    },
    {
      "type": {
        "coding": [
          {
            "system": "http://terminology.hl7.org/CodeSystem/v2-0203",
            "code": "PI"
          }
        ]
      },
      "value": "12345",
      ...
    },
    ...
  ]

As with the HL7 message, the identifier of type PI (for internal patient identifier) is likely the one you should use.
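
A minimal Python sketch of that selection, assuming the Patient resource has already been parsed into a dictionary. The function name select_fhir_identifier is hypothetical, and the match is on the type code only; a stricter version could also compare the coding system.

```python
def select_fhir_identifier(patient, type_code="PI"):
    """Return the first Patient.identifier value whose type carries the given code."""
    for identifier in patient.get("identifier", []):
        codings = identifier.get("type", {}).get("coding", [])
        if any(coding.get("code") == type_code for coding in codings):
            value = identifier.get("value")
            if value:
                return value
    return None


# Sample trimmed to the fields shown above.
patient = {
    "resourceType": "Patient",
    "identifier": [
        {"type": {"coding": [{"code": "MR"}]}, "value": "11111"},
        {"type": {"coding": [{"code": "PI"}]}, "value": "12345"},
    ],
}
print(select_fhir_identifier(patient))
```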

Note
Each source system may send identifiers differently, so you will need to determine the one that best fits your specific use case.

Other Considerations

This section describes a few special cases that you should consider when sending record identifiers to the Identity API.

Encrypting Identifiers

If you do not wish the Identity service to ever receive plain-text record identifiers, you can encrypt all source identifiers before sending them, and decrypt them in any query response.

Synthetic Identifiers

In the unlikely event that your source system has no suitable internal identifier, you may consider creating your own synthetic one. You would need to manage a mapping table that translates your synthetic identifiers to their original source ones.

For example, assume an HL7 source system (Community Clinic) reliably provides only an account number (a business identifier). You might receive a few record updates:

PID|1||111111^^^AA1^AN^ABC~12345^^^AA2^MR^ABC||Smith^John|||||||||||||7890
PID|1||111111^^^AA1^AN^ABC||Smith^John|||||||||||||7890
PID|1||222222^^^AA1^AN^ABC~12345^^^AA2^MR^DEF||Jones^Mary|||||||||||||7890

Your mapping table for synthetic identifiers would then be:

Synthetic Internal ID	Source System	Source Identifier
1	CommunityClinic	AA1-111111
2	CommunityClinic	AA1-222222

Any time your system receives a message from Community Clinic for John (account number AA1-111111), you would send CommunityClinic and 1 as the source system and identifier. Any time it receives a message from Community Clinic for Mary (account number AA1-222222), you would send CommunityClinic and 2.

When matching patients, the Identity service will return these synthetic IDs:

{
    "matchedPerson": {
        "id": "a46ee8bd-9dd1-41e6-bced-f2578ce37ae0",
        "records": [
            {
                "source": "CommunityClinic",
                "identifier": "1"
            }
        ],
        ...
}

You would then use your mapping table to determine that CommunityClinic identifier 1 was associated with account number AA1-111111.
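
The mapping table and both lookups can be sketched in Python as below. The class SyntheticIdMap and its method names are hypothetical, and the in-memory dictionaries stand in for what would need to be a persistent table in a real system.

```python
import itertools


class SyntheticIdMap:
    """In-memory stand-in for a persistent synthetic identifier mapping table."""

    def __init__(self):
        self._next_id = itertools.count(1)
        self._by_source = {}     # (source system, source identifier) -> synthetic ID
        self._by_synthetic = {}  # (source system, synthetic ID) -> source identifier

    def synthetic_id(self, source, source_identifier):
        """Return the existing synthetic ID, or allocate the next one."""
        key = (source, source_identifier)
        if key not in self._by_source:
            new_id = str(next(self._next_id))
            self._by_source[key] = new_id
            self._by_synthetic[(source, new_id)] = source_identifier
        return self._by_source[key]

    def source_identifier(self, source, synthetic_id):
        """Translate a synthetic ID from a query response back to the source identifier."""
        return self._by_synthetic[(source, synthetic_id)]


id_map = SyntheticIdMap()
# The three updates above carry these account numbers, in order.
for account in ["AA1-111111", "AA1-111111", "AA1-222222"]:
    print(account, "->", id_map.synthetic_id("CommunityClinic", account))
print(id_map.source_identifier("CommunityClinic", "1"))
```

Keeping both directions in the table means the forward lookup is used when sending records and the reverse lookup when interpreting query responses.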