Introduction
Biomedical informatics (BMI) is the interdisciplinary field that studies and pursues the effective uses of biomedical data, information, and knowledge for scientific inquiry, problem solving, and decision making, motivated by efforts to improve human health.
– American Medical Informatics Association (AMIA). Kulikowski et al. JAMIA. 2012 Nov 1;19(6):931-8. doi:10.1136/amiajnl-2012-001053
Data, information, knowledge
Biomedical informatics is fundamentally a field concerned with data, information, and knowledge.
Data are the building blocks of information. Raw lab test values, clinical code occurrences, and patient gene sequences are examples of biomedical data. Our purpose for using medical data is ultimately to extract information and gain knowledge.
Information is data with context and meaning.
In the examples given, information could be the contextualization of a number to a lab test value, a code in a diagnosis field to a disease concept, or a set of sequencing reads to a set of variant calls.
The difference between data and information is the difference between (PID=41235934, 6.7) and “Jane Doe HbA1c = 6.7%”.
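The step from data to information can be sketched in a few lines of Python. This is only an illustration: the `PATIENTS` lookup table, the `LabResult` type, and the `contextualize` function are hypothetical names, and the assumption that the second value is an HbA1c percentage is exactly the context that the raw tuple lacks.

```python
from dataclasses import dataclass

# Raw data: a bare tuple with no context, as in the text.
raw = (41235934, 6.7)

# Hypothetical lookup table standing in for the context a real system holds.
PATIENTS = {41235934: "Jane Doe"}


@dataclass
class LabResult:
    patient: str
    test: str
    value: float
    unit: str


def contextualize(record, test="HbA1c", unit="%"):
    """Attach meaning (who, which test, which unit) to a raw (PID, value) pair."""
    pid, value = record
    return LabResult(PATIENTS[pid], test, value, unit)


info = contextualize(raw)
summary = f"{info.patient} {info.test} = {info.value}{info.unit}"
print(summary)  # Jane Doe HbA1c = 6.7%
```

Nothing in the tuple itself says which test or unit applies; that context has to be supplied from outside the data.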
Knowledge is the highest level in this hierarchy, and it is a difficult concept to define. (Knowledge is the highest level with which biomedical informatics is concerned, though wisdom is often added as a fourth and highest concept, giving a data-information-knowledge-wisdom hierarchy.) Unlike data or information, knowledge is a subjective, human phenomenon. (Weinberger takes issue with knowledge even being represented in this hierarchy as derived from information: “[K]nowledge is not determined by information, for it is the knowing process that first decides which information is relevant, and how it is to be used.”) An example of knowledge is that Dr. Roe knows that Jane’s HbA1c result is indicative of diabetes.
Knowledge enables clinicians to treat patients, and clinical knowledge can be discovered, provided, and supported by biomedical informatics. In the study of symbolic methods, knowledge includes the relationships between concepts, the structure and types of relationships, and the relationships between concepts and their attributes. (The discussion so far is meant to motivate our study, not to present the final word on these concepts. Many of the definitions given here are debatable. There is also inherent ambiguity, as one person’s knowledge may be another’s data.)
Symbolic methods
Symbolic methods use explicit representations and previously known relationships to gain information from data.
In other words, data + information/knowledge = more information/knowledge.
Incorporating existing information/knowledge in the pursuit of further information/knowledge presents two key challenges:
- Standardization that ensures interoperability between data sources and prevents incorrect redundancies.
- Knowledge representation that captures relationships between concepts.
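As a minimal sketch of "data + knowledge = more information", the Python below stores a few relationships between concepts as explicit triples and uses them to infer facts that a patient's record does not state directly. The triples, the `is_a` relation name, and the `ancestors` helper are illustrative assumptions, not an established terminology or library.

```python
# Knowledge: explicit (subject, relation, object) triples. Illustrative only.
KNOWLEDGE = {
    ("type 2 diabetes", "is_a", "diabetes mellitus"),
    ("diabetes mellitus", "is_a", "endocrine disorder"),
}


def ancestors(concept, triples):
    """Follow is_a links transitively and return every broader concept."""
    found = set()
    frontier = [concept]
    while frontier:
        current = frontier.pop()
        for subj, rel, obj in triples:
            if subj == current and rel == "is_a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


# Data: one recorded diagnosis per patient (hypothetical record).
patient_diagnoses = {"Jane Doe": "type 2 diabetes"}

# Data + knowledge = more information: broader categories the record implies.
inferred = {p: ancestors(d, KNOWLEDGE) for p, d in patient_diagnoses.items()}
print(inferred["Jane Doe"])
```

The inference only works if the diagnosis in the data uses the same concept name as the knowledge base, which is where standardization comes in.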
Interoperability
Interoperability is the ability of different information systems, devices and applications (‘systems’) to access, exchange, integrate and cooperatively use data in a coordinated manner, within and across organizational, regional and national boundaries, to provide timely and seamless portability of information and optimize the health of individuals and populations globally.
– Healthcare Information and Management Systems Society (HIMSS). HIMSS Dictionary of Health IT terms, acronyms, and organizations. 2017.
Interoperability refers to the ability of systems to exchange data, information, and knowledge among various types of actors.
For our purposes, interoperability means that knowledge is shared among all parties who are either involved in the care of a patient or allowed to access the records. (Note: interoperability does not refer specifically to electronic health records.)
Systems can be interoperable at various levels, depending on what they can communicate. HIMSS defines four levels of interoperability: foundational, structural, semantic, and organizational (HIMSS: Interoperability in Healthcare).
Foundational interoperability deals with the exchange of data at the lowest level, without requiring that the receiving system be able to interpret the data. An example of foundational interoperability is a medical device that can securely send data to a medical record system, without the data being in a format that the system understands.
Structural interoperability is achieved once foundational interoperability is paired with a standardized format for information exchange. For example, two systems are structurally interoperable if they can exchange JSON documents containing patient records. However, while both systems may use the same JSON schema, they may use different vocabularies to encode concepts.
Semantic interoperability is achieved when systems can communicate with shared understanding. In other words, two systems can share knowledge. For clinical data, this means that two systems share both the format (e.g., data encoding, file type) and the semantic encoding (e.g., the same terminology for drugs) of the data they exchange.
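The gap between structural and semantic interoperability can be illustrated with a small Python sketch. Everything here is hypothetical: the JSON field names, the local drug codes, and the two mapping tables are made up for illustration. The two messages share a structure but can only be compared once both local vocabularies are mapped to a shared terminology.

```python
import json

# Both messages follow the same (hypothetical) JSON structure...
msg_a = json.loads('{"patient_id": 1, "drug_code": "A-001"}')
msg_b = json.loads('{"patient_id": 2, "drug_code": "MET500"}')

# ...but each site encodes drugs in its own local vocabulary (made-up codes).
SITE_A_TO_SHARED = {"A-001": "metformin"}
SITE_B_TO_SHARED = {"MET500": "metformin"}


def same_structure(a, b):
    """Structural check: do the two messages have the same fields?"""
    return a.keys() == b.keys()


def same_drug(a, b, map_a, map_b):
    """Semantic check: do the codes denote the same concept once mapped?"""
    return map_a[a["drug_code"]] == map_b[b["drug_code"]]


print(same_structure(msg_a, msg_b))                 # True
print(msg_a["drug_code"] == msg_b["drug_code"])     # False
print(same_drug(msg_a, msg_b, SITE_A_TO_SHARED, SITE_B_TO_SHARED))  # True
```

In this sketch the mapping tables carry the shared meaning; without them the receiving system can parse the message but cannot tell that both sites mean the same drug.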
Organizational interoperability concerns the requirements of real-world use. It includes the governance, security, speed, implementation, and use of semantically interoperable systems. For our purposes, it covers when, why, and with whom medical data may be shared, while the lower levels deal more with how.
The curly braces problem
An early approach to the interoperability of medical knowledge was the Arden syntax, a language that allows clinical decision support tools to be shared across multiple health record systems (Hripcsak G. Computers in Biology and Medicine. 1994 Sep 1;24(5):331-63. doi:10.1016/0010-4825(94)90002-7).
Because different hospitals store data differently, the syntax was designed with institution-specific mappings inside curly braces. In order to implement a medical logic module (MLM) at a new institution, one needs only to provide references to fields as defined in the institution’s database.
For example, to find the minimum serum potassium for a patient, the Arden syntax would read:
a := read min({'serum potassium'})
To implement the code, an institution must replace {'serum potassium'} with the field in their database for serum potassium.
While this sounds simple, such customizations are nontrivial and can often require expert medical knowledge to implement.
In addition, there is no guarantee of a one-to-one mapping between a concept in the module and a field in the institution’s database.
Barriers to interoperability based on interface challenges and different data standards are referred to as the “curly braces problem”.
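The problem can be imitated in Python. This is an analogy only, not Arden syntax: the shared logic refers to a placeholder concept, and each institution must supply its own mapping from that concept to local database fields. All field names, mappings, and lab values below are made up.

```python
# Shared logic, written once. "serum potassium" plays the role of the
# curly-brace placeholder: the logic never names a real database field.
def min_serum_potassium(read_field, mapping, patient_id):
    fields = mapping["serum potassium"]  # the institution-specific part
    values = [v for field in fields for v in read_field(field, patient_id)]
    return min(values)


# Two hypothetical institutions with made-up field names and values.
DB_A = {("LAB_K_SERUM", 1): [4.1, 3.6, 4.4]}
DB_B = {("chem.k", 1): [4.0, 3.9], ("icu.k_serum", 1): [3.4]}

MAPPING_A = {"serum potassium": ["LAB_K_SERUM"]}
# Not one-to-one: institution B stores the same concept in two fields.
MAPPING_B = {"serum potassium": ["chem.k", "icu.k_serum"]}


def reader(db):
    """Return a function that reads one field for one patient from db."""
    return lambda field, patient_id: db.get((field, patient_id), [])


min_a = min_serum_potassium(reader(DB_A), MAPPING_A, 1)
min_b = min_serum_potassium(reader(DB_B), MAPPING_B, 1)
print(min_a, min_b)  # 3.6 3.4
```

The shared function is trivial; the work lies in writing `MAPPING_A` and `MAPPING_B` correctly, which requires knowing which local fields actually hold serum potassium at each site.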