המכון הלאומי לחקר שירותי הבריאות ומדיניות הבריאות (ע”ר)

The Israel National Institute For Health Policy Research

Development of an Automated System Based on Knowledge-Based, Time-Oriented Similarity Measures for Search, Comparison, and Retrieval of Similar Longitudinal Medical Records

Researchers: Yuval Shahar (Ben-Gurion University of the Negev)
Background: The use of electronic medical records is increasing. Clinical tasks, including diagnosis, prognosis, treatment, and research, might benefit from finding patients similar to a given patient. Tasks focused on classification and prediction can exploit modern machine-learning and temporal data-mining technologies.
Objectives: 1. Introduce a novel approach for patient record matching that exploits the full dynamic course of the disease.
2. Evaluate how much the new similarity measures improve performance on several clinical classification and prediction tasks.
Method: We implemented an automated search engine, based on innovative similarity measures, for the retrieval of similar patients.
We acquired domain knowledge in Oncology, Diabetes, and Hepatitis.
Using this knowledge, we abstracted the longitudinal, multivariate raw data in these domains into clinically meaningful intervals aligned to a significant clinical event, and transformed the intervals into a new canonical representation suitable for time-oriented matching. We then applied iDTW, a new algorithm we developed for interval-based and abstraction-based dynamic time warping.
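To make the abstraction and matching steps concrete, the following is a minimal sketch and not the iDTW algorithm itself, whose details are not given in this summary. It abstracts raw time-stamped values into state intervals and then compares two interval sequences with classic dynamic time warping. The thresholds, the state names, the cost function interval_cost, and the sample values are all illustrative assumptions, and times are assumed to be days relative to an index clinical event.

```python
from itertools import groupby

# Hypothetical state thresholds (illustrative only, not clinical guidance).
def to_state(value, low=70.0, high=180.0):
    if value < low:
        return "LOW"
    if value > high:
        return "HIGH"
    return "NORMAL"

def abstract_intervals(samples):
    """Merge consecutive (time, value) samples that share a state
    into (state, start_time, end_time) intervals."""
    labeled = [(t, to_state(v)) for t, v in sorted(samples)]
    intervals = []
    for state, group in groupby(labeled, key=lambda pair: pair[1]):
        times = [t for t, _ in group]
        intervals.append((state, times[0], times[-1]))
    return intervals

# Assumed ordinal scale over states, used only by this sketch's cost function.
ORDER = {"LOW": 0, "NORMAL": 1, "HIGH": 2}

def interval_cost(a, b):
    """Assumed local cost: ordinal gap between states plus a small
    penalty for the difference in interval durations."""
    state_gap = abs(ORDER[a[0]] - ORDER[b[0]])
    duration_gap = abs((a[2] - a[1]) - (b[2] - b[1]))
    return state_gap + 0.1 * duration_gap

def dtw(seq_a, seq_b, cost):
    """Classic dynamic time warping distance between two sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible predecessor cells.
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]

# Made-up example records: (day relative to index event, measured value).
patient_a = [(0, 95), (1, 110), (2, 200), (3, 220), (4, 100)]
patient_b = [(0, 100), (1, 190), (2, 210), (3, 230), (4, 120)]

intervals_a = abstract_intervals(patient_a)
intervals_b = abstract_intervals(patient_b)
distance_ab = dtw(intervals_a, intervals_b, interval_cost)
```

Because warping operates on a handful of intervals rather than on every raw sample, two records whose state changes occur at slightly different times can still align closely.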
We measured the benefit of the similarity measure on several clinical classification and prediction tasks, using a k-nearest-neighbors (KNN) approach.
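A KNN classifier needs only a distance function between records, so any similarity measure can be plugged in. The sketch below shows that idea with a majority vote over the k closest labeled records. The function names, the placeholder mismatch_distance, the labels, and the training data are hypothetical and stand in for the study's actual measure and datasets.

```python
from collections import Counter

def mismatch_distance(a, b):
    """Placeholder distance: count of position-wise state mismatches,
    plus the difference in sequence length."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))

def knn_classify(query, labeled_records, distance, k=3):
    """Return the majority label among the k records closest to the query.
    Ties are broken by the order in which labels appear among the neighbors."""
    neighbors = sorted(labeled_records, key=lambda rec: distance(query, rec[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Made-up training set: (abstracted state sequence, outcome label).
training = [
    (("NORMAL", "NORMAL", "NORMAL"), "stable"),
    (("NORMAL", "LOW", "NORMAL"), "stable"),
    (("HIGH", "HIGH", "NORMAL"), "unstable"),
    (("HIGH", "HIGH", "HIGH"), "unstable"),
    (("NORMAL", "HIGH", "HIGH"), "unstable"),
]

prediction_1 = knn_classify(("HIGH", "HIGH", "LOW"), training, mismatch_distance)
prediction_2 = knn_classify(("NORMAL", "NORMAL", "LOW"), training, mismatch_distance)
```

Passing the distance as a parameter keeps the classifier unchanged when comparing a raw-data representation against an abstraction-based one, which mirrors the kind of comparison described here.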
Findings: Matching patient records using knowledge-based abstractions yielded better classification and prediction performance than matching using only raw data.
This superiority was reflected in the average classification performance, in the representation that led to the best performance in each experiment, and in several cases in which using abstractions required fewer training samples to reach optimal KNN-based classification performance.
Conclusions: We introduced an interval-based and knowledge-based methodology for determining the similarity between two longitudinal medical records.
The comprehensive analysis confirmed the superiority of matching using knowledge-based abstractions versus using only raw data.
Recommendations: We recommend the use of automated systems for patient matching to improve the accuracy of diagnosis and prognosis and to support clinical research and health policy determination.
Research number: A/111/2015
Research end date: 12/2017