Qual-IT - May 2005
Working Around Personal Identifiers: A Strategic and Tactical Challenge
In previous issues of Qual-IT we've discussed the need for standards governing the structure and content of health information to facilitate data linkage and exchange across disparate health care settings. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) established a legal framework for the development of such standards, including those related to creation of unique personal identifiers for electronic health care transactions. But while many HIPAA standards have been proposed and some have been finalized, a uniform approach to personal identifiers is unlikely to be promulgated, which further complicates the design and adoption of interoperable health information systems.
Nevertheless, strategies and methods to link health information are being developed and tested, at the national and regional levels, to address both technical and policy issues. This issue of Qual-IT describes two current initiatives, both based on a model that has been adopted in many forms across the country – a decentralized, federated information systems architecture that links information without relying on a common personal identifier.
The "Record Locator Service" Model
Developing practical strategies for linking health information in a secure and efficient manner, and improving access to that information by clinicians and consumers, was the goal of the Working Group on Accurately Linking Information for Health Care Quality and Safety, convened under the auspices of the Connecting for Health (CFH) initiative sponsored by the
Rather than creating large central databases, the working group proposed that health information exchange would best be accomplished by accessing and linking information that resides in a variety of local sources. These sources – health care institutions and providers, in consultation with consumers – would decide whether to participate, what information would be made available, to whom, and under what circumstances. Beyond maintaining the security of information within these individual repositories, a key challenge in designing this type of system is authenticating the identity of authorized users and giving each user access only to the level of information that user is permitted to see.
Following these assumptions, a working model for health information exchange would need to address three distinct but interrelated issues: identification and location of relevant health information, the architecture for information exchange, and structures to facilitate access to that information based on patient and provider authorization. The working group's model included these key elements:
- Identifiers would be established for institutions/providers and for patients, to be used in combination as pointers to specific records. Identifiers would be unique to each individual, but would not disclose any elements of the person's identity. Probability-weighted matching algorithms would be used to link records across disparate locations and over time, a technique used in a number of large-scale systems (such as the New York City immunization registry model described below).
- The architecture for information exchange and linkage would be decentralized and built on top of existing information systems and Internet capabilities. Standards and agreements among participating parties would be developed through a voluntary, federated model.
- The integration of these two features would occur through a new type of infrastructure, the Record Locator Service (RLS). The RLS would manage the key functions of the system: maintaining information about where health information resides (using the identifier scheme described above); managing access to the information by authorized users; and applying algorithmic matching processes to resolve issues of incomplete or conflicting data among various local sources. Regional and local health information networks would establish or subscribe to the RLS, and these organizations would need to collaborate on the rules for accessing information across various networks, consistent with the federated model construct.
This is a broad overview of the key assumptions and elements of the working group's model; the full CFH report goes into greater detail on these issues, including privacy, security, and data access authorization procedures. This report and related documents on interoperability can be downloaded from the CFH web site, http://www.connectingforhealth.org/.
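To make the division of labor concrete, the following is a minimal Python sketch of the locator idea, not the CFH design itself. It assumes a single in-memory service; the class and method names (RecordLocatorService, Pointer, grant, locate) and the use of random hexadecimal strings as patient identifiers are illustrative assumptions. The service stores only pointers to where records reside, never clinical content, and returns a pointer only when the holding institution has authorized the requesting user.

```python
import secrets
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Pointer:
    """Says where a record lives; carries no clinical content."""
    institution_id: str
    local_record_id: str


@dataclass
class RecordLocatorService:
    """Hypothetical in-memory locator; names and structure are assumptions."""
    # opaque patient identifier -> pointers to locally held records
    _index: dict = field(default_factory=dict)
    # user -> institutions whose records that user may be pointed to
    _grants: dict = field(default_factory=dict)

    def new_patient_id(self) -> str:
        """Issue a random identifier that discloses nothing about the person."""
        patient_id = secrets.token_hex(16)
        self._index[patient_id] = set()
        return patient_id

    def register(self, patient_id: str, pointer: Pointer) -> None:
        """A participating institution publishes the location of a record."""
        self._index[patient_id].add(pointer)

    def grant(self, user: str, institution_id: str) -> None:
        """Record that an institution lets this user locate its records."""
        self._grants.setdefault(user, set()).add(institution_id)

    def locate(self, user: str, patient_id: str) -> list:
        """Return only the pointers this user is authorized to see."""
        allowed = self._grants.get(user, set())
        pointers = self._index.get(patient_id, set())
        return sorted(
            (p for p in pointers if p.institution_id in allowed),
            key=lambda p: (p.institution_id, p.local_record_id),
        )
```

In this sketch a clinician granted access by one clinic would receive that clinic's pointer and nothing else, and would still have to retrieve the record itself from the clinic's own system. A federated deployment would also need the cross-network rules and the matching step described in the working group's model, both of which are omitted here.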
Combining Existing Registries
State and local health departments typically collect a combination of identifiable and non-identifiable data, but most traditional public health functions do not require linkage of health information across disparate providers and over time. Historically, any sharing of identifiable information has been explicitly proscribed, except when specific public health threats (e.g., tuberculosis case-finding and treatment compliance) are involved.
With the implementation of various child health information systems, including immunization registries, lead screening registries, and, more recently, systems that support newborn screening, assessment, and follow-up services, the situation has changed dramatically. Many of these systems were developed independently with funding from the Centers for Disease Control and Prevention and the Health Resources and Services Administration, but they have a lot in common: they need to capture information from a variety of sites where children and families receive services, make that information available at the point of care so clinicians and parents can ensure that all appropriate services are delivered (e.g., immunizations are complete and up to date), and maintain a longitudinal record so person-specific information for a particular age group, risk profile, or clinical guideline can be maintained and monitored by health officials, clinicians, and consumers.
Across the country the development of integrated child health information systems has been gaining momentum; this experience has been extensively documented, and a collaborative network for information sharing established, under the auspices of the Public Health Informatics Institute (PHII). PHII has developed several reports that include principles for collaborative system design, as well as detailed descriptions of models integrating person-level information using strategies and tools similar to those articulated by the CFH working group.
The experience of integrating two large databases – a child immunization registry database (CIR) and a lead screening database (LQ) – is described in a recent article by New York City Department of Health and Mental Hygiene (DOHMH) staff (Papadouka 2004). DOHMH initiated planning to integrate these databases in 1999, and completed the process early in 2004. Both of the databases were large (2.4 million children in CIR, 2.2 million in LQ); there was significant overlap in the target populations (newborns through school-age children); and both programs were committing substantial resources (including twelve full-time staff devoted to data review and clean-up activities) to matching and merging records within their discrete systems environments.
The immunization registry, populated directly from birth certificate data, has a very accurate record of names and dates of birth of children born in
Both programs wanted to maintain their existing systems and continue their operations with minimal disruption during the integration process. They decided to use a modular approach, creating a master child index that would combine data from the two systems and assign each child a unique identifier for purposes of matching and merging records from each system. The application would need to handle large volumes of incoming, usually incomplete data while reducing the likelihood that duplicate records are maintained for the same person. Previously, each system used its own custom-designed software that applied criteria, or rules, for matching; this was supplemented with human review and, for the immunization registry, artificial intelligence (AI) software (the probability-matching approach advocated by CFH). The programs decided to work on a common approach and to apply the AI software upstream, as data and records are submitted, substantially reducing the likelihood that duplicate records would be created.
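The upstream approach can be illustrated with a short Python sketch. It is not the DOHMH software: it substitutes the textbook Fellegi-Sunter form of probability-weighted matching, and the field names, the agreement probabilities, and the threshold are all assumed values chosen for illustration. Each field that agrees adds a positive log-likelihood weight, each field that disagrees adds a negative one, and a missing field adds nothing, so incomplete incoming records can still be scored.

```python
import math
from itertools import count

# Illustrative (m, u) pairs per field: m = P(field agrees | same child),
# u = P(field agrees | different children). Assumed values, not from the article.
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "first_name": (0.90, 0.02),
    "birth_date": (0.97, 0.003),
    "zip_code": (0.80, 0.05),
}


def match_weight(a: dict, b: dict) -> float:
    """Sum log2 likelihood ratios over fields present in both records."""
    total = 0.0
    for name, (m, u) in FIELD_PROBS.items():
        x, y = a.get(name), b.get(name)
        if not x or not y:
            continue  # incomplete data: a missing field is treated as neutral
        if x.strip().lower() == y.strip().lower():
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


class MasterChildIndex:
    """Assigns one identifier per child across both source systems."""

    def __init__(self, threshold: float = 10.0):
        self.threshold = threshold  # hypothetical cutoff; would be tuned on reviewed data
        self.children = {}          # master id -> list of source records
        self._ids = count(1)

    def submit(self, record: dict) -> int:
        """Match an incoming record on arrival; create a new id only if nothing matches."""
        best_id, best_score = None, float("-inf")
        for child_id, records in self.children.items():
            score = max(match_weight(record, r) for r in records)
            if score > best_score:
                best_id, best_score = child_id, score
        if best_id is None or best_score < self.threshold:
            best_id = next(self._ids)
            self.children[best_id] = []
        self.children[best_id].append(record)
        return best_id
```

Submitting a registry record and then a lead-screening record with the same name and birth date but no ZIP code would attach both to one master identifier rather than creating a duplicate. The sketch compares each incoming record against every stored record, which would not scale to millions of children; a production system would restrict comparisons to a candidate subset and route borderline scores to human review.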
Technical considerations involved in establishing the matching algorithms and applying that method to the existing CIR and LQ databases were substantial, as the article details. The combined databases contained 4.6 million records; after the matching process was applied and manually validated, 1.6 million of those records were successfully matched and merged between the two systems. The accuracy of matching incoming data to the immunization registry improved from 60 percent to 80 percent, reducing the creation of new duplicate records in the system.
Heeding the Policy Context
The absence of standards for accurately retrieving and linking patient data through individual identifiers is a major impediment to interoperability. State and regional health information exchange projects are confronting this problem and developing practical solutions to address major operational and technical issues, so that health information retrieved from discrete sources can be correctly, securely, and efficiently matched and combined. Technology itself offers some of the answers: the Internet provides a fast and efficient means of communication; open-source software can be used to conduct probabilistic matching and reduce duplication of records; and policies and procedures for user authentication and data access can also be programmed, reducing the need for more costly – and actually less reliable – manual methods.
Although some argue that a single national patient identifier would vastly simplify health information exchange, that solution would address some but not all broadly held concerns. Technical solutions to link and merge person-level health data need to be implemented in a larger policy context, based on agreements among the parties to health information data exchange that explicitly incorporate both clinical requirements and patient preferences.
Resources
Connecting for Health: www.connectingforhealth.org
Papadouka V, P Schaeffer, A Metroka, A Borthwick, P Tehranifar, J Leighton, A Aponte, R Liao, A Ternier, S Friedman,
Public Health Informatics Institute: www.phii.org
Coming Next Month
Update: Federal and State Legislative Developments
