Expanding vocabularies for complementary and alternative medicine therapies

Lou Ann Scarton, Liqin Wang, Halil Kilicoglu, Margaret Jahries, Guilherme Del Fiol

Research output: Contribution to journal › Article

Abstract

Objective: There is significant consumer demand for complementary and alternative medicine (CAM) therapies as possible alternatives to drugs in the treatment and prevention of chronic diseases. Expanding controlled vocabularies to include CAM treatment relations could help meet that demand by facilitating information retrieval from the published literature. The purpose of this study is to design and evaluate two methods to semi-automatically extract CAM treatment-related semantic predications (subject-predicate-object triplets) from the biomedical literature using the Semantic Medline database (SemMedDB).

Methods: Predications were retrieved from SemMedDB, a database of semantic predications extracted from article abstracts available in Medline. Predications were retrieved for 20 biologically-based and 3 mind-body CAM therapies. The first method (allMedline) retrieved predications from any Medline citation, while the second method (soundStudies) retrieved predications only from scientifically sound clinical studies. Filtering criteria were applied to identify the predications focusing on the treatment and prevention of medical disorders using various CAM modalities. The disorders were extracted for each CAM therapy and ranked by occurrence. A reference vocabulary, covering the same 20 biologically-based and 3 mind-body CAM therapies, was developed to evaluate the performance of each method according to precision and recall of the top 100 ranked concepts as well as average precision and recall.

Results: The difference between allMedline and soundStudies in terms of median precision for the top 100 concepts ranked by occurrence was significant (21.0% versus 27.0%, p < .001). The soundStudies method had significantly higher precision (7.0% for allMedline vs 11.5% for soundStudies, p < .001), and the allMedline method had significantly higher recall (37.1% vs 25.6%, p < .001).

Conclusion: The soundStudies method may be useful for extracting treatment-related predications from the biomedical literature for the highest ranked concepts. Additional work is needed to improve the algorithm and to identify and report shortcomings for future enhancements of the tools used to populate SemMedDB.
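
The ranking and evaluation steps described in the Methods can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the `Predication` record, the function names, the predicate labels `TREATS` and `PREVENTS`, the recall definition (hits divided by the size of the whole reference set), and the placeholder therapy and disorder names are all assumptions made for the example. The abstract does not specify the actual filtering criteria or the SemMedDB queries.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Predication(NamedTuple):
    """A subject-predicate-object triplet (hypothetical, simplified record)."""
    subject: str
    predicate: str
    obj: str


def rank_disorders(predications: Iterable[Predication], therapy: str,
                   predicates: frozenset = frozenset({"TREATS", "PREVENTS"})) -> list:
    """Return the disorders linked to `therapy`, most frequent first."""
    counts = Counter(
        p.obj for p in predications
        if p.subject == therapy and p.predicate in predicates
    )
    # Sort by descending count, then alphabetically so ties are deterministic.
    return [d for d, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]


def precision_recall_at_k(ranked: list, reference: set, k: int = 100) -> tuple:
    """Precision and recall of the top-k ranked concepts against a reference set.

    Assumption: recall is measured against the full reference set.
    """
    top = ranked[:k]
    hits = sum(1 for d in top if d in reference)
    precision = hits / len(top) if top else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Made-up placeholder data, for illustration only.
preds = [
    Predication("therapy_A", "TREATS", "disorder_1"),
    Predication("therapy_A", "TREATS", "disorder_1"),
    Predication("therapy_A", "PREVENTS", "disorder_2"),
    Predication("therapy_A", "CAUSES", "disorder_3"),   # filtered out
    Predication("therapy_B", "TREATS", "disorder_4"),   # different therapy
]
reference = {"disorder_1", "disorder_5"}

ranked = rank_disorders(preds, "therapy_A")
print(ranked)
print(precision_recall_at_k(ranked, reference, k=100))
```

Under these assumptions, the two methods in the study would differ only in which predications are passed in: all Medline citations for allMedline, or only those from scientifically sound clinical studies for soundStudies.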

Language: English (US)
Pages: 64-74
Number of pages: 11
Journal: International Journal of Medical Informatics
Volume: 121
DOI: 10.1016/j.ijmedinf.2018.11.009
State: Published - Jan 1 2019

Keywords

  • Complementary and alternative medicine
  • Data mining
  • Information extraction
  • MEDLINE
  • Ontology
  • SemMedDB

ASJC Scopus subject areas

  • Health Informatics

Cite this

Expanding vocabularies for complementary and alternative medicine therapies. / Scarton, Lou Ann; Wang, Liqin; Kilicoglu, Halil; Jahries, Margaret; Del Fiol, Guilherme.

In: International Journal of Medical Informatics, Vol. 121, 01.01.2019, p. 64-74.

@article{ce34ae4852e1405e9cda40570b8d9615,
title = "Expanding vocabularies for complementary and alternative medicine therapies",
keywords = "Complementary and alternative medicine, Data mining, Information extraction, MEDLINE, Ontology, SemMedDB",
author = "Scarton, {Lou Ann} and Liqin Wang and Halil Kilicoglu and Margaret Jahries and {Del Fiol}, Guilherme",
year = "2019",
month = "1",
day = "1",
doi = "10.1016/j.ijmedinf.2018.11.009",
language = "English (US)",
volume = "121",
pages = "64--74",
journal = "International Journal of Medical Informatics",
issn = "1386-5056",
publisher = "Elsevier Ireland Ltd",

}

TY - JOUR

T1 - Expanding vocabularies for complementary and alternative medicine therapies

AU - Scarton, Lou Ann

AU - Wang, Liqin

AU - Kilicoglu, Halil

AU - Jahries, Margaret

AU - Del Fiol, Guilherme

PY - 2019/1/1

Y1 - 2019/1/1

KW - Complementary and alternative medicine

KW - Data mining

KW - Information extraction

KW - MEDLINE

KW - Ontology

KW - SemMedDB

UR - http://www.scopus.com/inward/record.url?scp=85057143335&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85057143335&partnerID=8YFLogxK

U2 - 10.1016/j.ijmedinf.2018.11.009

DO - 10.1016/j.ijmedinf.2018.11.009

M3 - Article

VL - 121

SP - 64

EP - 74

JO - International Journal of Medical Informatics

T2 - International Journal of Medical Informatics

JF - International Journal of Medical Informatics

SN - 1386-5056

ER -