Automatic identification of recent high impact clinical articles in PubMed to support clinical decision making using time-agnostic features

Jiantao Bian, Samir Abdelrahman, Jianlin Shi, Guilherme Del Fiol

Research output: Contribution to journal - Article

Abstract

Objectives: Finding recent clinical studies that warrant changes in clinical practice (“high impact” clinical studies) in a timely manner is very challenging. We investigated a machine learning approach to find recent studies with high clinical impact to support clinical decision making and literature surveillance.

Methods: To identify recent studies, we developed our classification model using time-agnostic features that are available as soon as an article is indexed in PubMed®, such as journal impact factor, author count, and study sample size. Using a gold standard of 541 high impact treatment studies referenced in 11 disease management guidelines, we tested the following null hypotheses: (1) the high impact classifier with time-agnostic features (HI-TA) performs equivalently to PubMed's Best Match sort and a MeSH-based Naïve Bayes classifier; and (2) HI-TA performs equivalently to the high impact classifier with both time-agnostic and time-sensitive features (HI-TS) enabled in a previous study. The primary outcome for both hypotheses was mean top 20 precision.

Results: The differences in mean top 20 precision between HI-TA and three baselines (PubMed's Best Match, a MeSH-based Naïve Bayes classifier, and HI-TS) were not statistically significant (12% vs. 3%, p = 0.101; 12% vs. 11%, p = 0.720; 12% vs. 25%, p = 0.094, respectively). Recall of HI-TA was low (7%).

Conclusion: HI-TA had equivalent performance to state-of-the-art approaches that depend on time-sensitive features. With the advantage of relying only on time-agnostic features, the proposed approach can be used as an adjunct to help clinicians identify recent high impact clinical studies to support clinical decision-making. However, low recall limits the use of HI-TA for literature surveillance.
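The primary outcome named in the abstract, mean top 20 precision, and the secondary recall figure can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the article identifiers, and the choice to divide by k even when fewer than k articles are returned are all assumptions made for illustration.

```python
from typing import Dict, Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant: Set[str], k: int = 20) -> float:
    """Fraction of the top-k ranked articles that appear in the gold standard.

    Assumption: the denominator is always k, even if fewer than k articles
    were retrieved (one common convention; the study may use another).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for article_id in ranked_ids[:k] if article_id in relevant)
    return hits / k


def mean_precision_at_k(
    rankings: Dict[str, Sequence[str]],
    gold: Dict[str, Set[str]],
    k: int = 20,
) -> float:
    """Average precision@k over all queries (e.g. one query per guideline topic)."""
    if not rankings:
        raise ValueError("no rankings supplied")
    scores = [precision_at_k(ranked, gold.get(q, set()), k) for q, ranked in rankings.items()]
    return sum(scores) / len(scores)


def recall(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """Fraction of gold-standard articles that were retrieved at all."""
    if not relevant:
        raise ValueError("gold standard is empty")
    return len(set(retrieved) & relevant) / len(relevant)


# Hypothetical identifiers, not real PubMed IDs or study data.
example_rankings = {"q1": ["p1", "p2", "p3", "p4"], "q2": ["p5", "p6", "p7", "p8"]}
example_gold = {"q1": {"p1", "p4", "p9"}, "q2": {"p6"}}

example_mean = mean_precision_at_k(example_rankings, example_gold, k=4)
example_recall = recall(example_rankings["q1"], example_gold["q1"])
```

With these toy inputs and k = 4, the first query scores 2/4 and the second 1/4, so the mean is 0.375; recall for the first query is 2/3 because one gold-standard article is never retrieved.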

Language: English (US)
Pages: 1-10
Number of pages: 10
Journal: Journal of Biomedical Informatics
Volume: 89
DOI: 10.1016/j.jbi.2018.11.010
State: Published - Jan 1 2019

Fingerprint

Clinical Decision Support Systems
Mesh generation
Evidence-Based Medicine
Artificial Intelligence
Automation
Decision support systems
PubMed
Artificial intelligence
Learning systems
Decision Making
Patient Care
Classifiers
Decision making
Journal Impact Factor
Disease Management
Sample Size
Machine Learning
Clinical Decision-Making
Guidelines

Keywords

  • Clinical decision support
  • Concept drift
  • Evidence-based medicine
  • Literature database
  • Machine learning
  • Patient care

ASJC Scopus subject areas

  • Computer Science Applications
  • Health Informatics

Cite this

Automatic identification of recent high impact clinical articles in PubMed to support clinical decision making using time-agnostic features. / Bian, Jiantao; Abdelrahman, Samir; Shi, Jianlin; Del Fiol, Guilherme.

In: Journal of Biomedical Informatics, Vol. 89, 01.01.2019, p. 1-10.

@article{67096aea0db1456f81efacdfef9c6f3d,
title = "Automatic identification of recent high impact clinical articles in PubMed to support clinical decision making using time-agnostic features",
abstract = "Objectives: Finding recent clinical studies that warrant changes in clinical practice (“high impact” clinical studies) in a timely manner is very challenging. We investigated a machine learning approach to find recent studies with high clinical impact to support clinical decision making and literature surveillance. Methods: To identify recent studies, we developed our classification model using time-agnostic features that are available as soon as an article is indexed in PubMed{\circledR}, such as journal impact factor, author count, and study sample size. Using a gold standard of 541 high impact treatment studies referenced in 11 disease management guidelines, we tested the following null hypotheses: (1) the high impact classifier with time-agnostic features (HI-TA) performs equivalently to PubMed's Best Match sort and a MeSH-based Na{\"i}ve Bayes classifier; and (2) HI-TA performs equivalently to the high impact classifier with both time-agnostic and time-sensitive features (HI-TS) enabled in a previous study. The primary outcome for both hypotheses was mean top 20 precision. Results: The differences in mean top 20 precision between HI-TA and three baselines (PubMed's Best Match, a MeSH-based Na{\"i}ve Bayes classifier, and HI-TS) were not statistically significant (12{\%} vs. 3{\%}, p = 0.101; 12{\%} vs. 11{\%}, p = 0.720; 12{\%} vs. 25{\%}, p = 0.094, respectively). Recall of HI-TA was low (7{\%}). Conclusion: HI-TA had equivalent performance to state-of-the-art approaches that depend on time-sensitive features. With the advantage of relying only on time-agnostic features, the proposed approach can be used as an adjunct to help clinicians identify recent high impact clinical studies to support clinical decision-making. However, low recall limits the use of HI-TA for literature surveillance.",
keywords = "Clinical decision support, Concept drift, Evidence-based medicine, Literature database, Machine learning, Patient care",
author = "Jiantao Bian and Samir Abdelrahman and Jianlin Shi and {Del Fiol}, Guilherme",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.jbi.2018.11.010",
language = "English (US)",
volume = "89",
pages = "1--10",
journal = "Journal of Biomedical Informatics",
issn = "1532-0464",
publisher = "Academic Press Inc.",
}

TY - JOUR
T1 - Automatic identification of recent high impact clinical articles in PubMed to support clinical decision making using time-agnostic features
AU - Bian, Jiantao
AU - Abdelrahman, Samir
AU - Shi, Jianlin
AU - Del Fiol, Guilherme
PY - 2019/1/1
Y1 - 2019/1/1

AB - Objectives: Finding recent clinical studies that warrant changes in clinical practice (“high impact” clinical studies) in a timely manner is very challenging. We investigated a machine learning approach to find recent studies with high clinical impact to support clinical decision making and literature surveillance. Methods: To identify recent studies, we developed our classification model using time-agnostic features that are available as soon as an article is indexed in PubMed®, such as journal impact factor, author count, and study sample size. Using a gold standard of 541 high impact treatment studies referenced in 11 disease management guidelines, we tested the following null hypotheses: (1) the high impact classifier with time-agnostic features (HI-TA) performs equivalently to PubMed's Best Match sort and a MeSH-based Naïve Bayes classifier; and (2) HI-TA performs equivalently to the high impact classifier with both time-agnostic and time-sensitive features (HI-TS) enabled in a previous study. The primary outcome for both hypotheses was mean top 20 precision. Results: The differences in mean top 20 precision between HI-TA and three baselines (PubMed's Best Match, a MeSH-based Naïve Bayes classifier, and HI-TS) were not statistically significant (12% vs. 3%, p = 0.101; 12% vs. 11%, p = 0.720; 12% vs. 25%, p = 0.094, respectively). Recall of HI-TA was low (7%). Conclusion: HI-TA had equivalent performance to state-of-the-art approaches that depend on time-sensitive features. With the advantage of relying only on time-agnostic features, the proposed approach can be used as an adjunct to help clinicians identify recent high impact clinical studies to support clinical decision-making. However, low recall limits the use of HI-TA for literature surveillance.

KW - Clinical decision support
KW - Concept drift
KW - Evidence-based medicine
KW - Literature database
KW - Machine learning
KW - Patient care
UR - http://www.scopus.com/inward/record.url?scp=85057270275&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85057270275&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2018.11.010
DO - 10.1016/j.jbi.2018.11.010
M3 - Article
VL - 89
SP - 1
EP - 10
JO - Journal of Biomedical Informatics
T2 - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
SN - 1532-0464
ER -