  • P-ISSN1013-0799
  • E-ISSN2586-2073
  • KCI

Inferring Undiscovered Public Knowledge by Using Text Mining-driven Graph Model

Journal of the Korean Society for Information Management, (P)1013-0799; (E)2586-2073
2014, v.31 no.1, pp.231-250
https://doi.org/10.3743/KOSIM.2014.31.1.231


Abstract

With the recent development of Information and Communication Technologies (ICT), the number of research publications has increased exponentially. In response to this rapid growth, demand has risen for automated text processing methods that can handle massive amounts of text data. Biomedical text mining, which uncovers hidden biological meanings and treatments in the biomedical literature, has become a pivotal methodology that helps medical disciplines reduce time and cost. Many researchers have conducted literature-based discovery studies to generate new hypotheses. However, existing approaches require either an intensive manual process or a semi-automatic procedure to find and select biomedical entities. In addition, they are limited to a single dimension, namely the cause-and-effect relationship between two concepts. This study therefore proposes a novel approach that discovers various relationships among source concepts, target concepts, and their intermediate concepts by expanding the intermediate concepts to multiple levels. The study offers a distinct perspective on literature-based discovery: it not only discovers meaningful relationships among concepts in the biomedical literature through graph-based path inference but also generates feasible new hypotheses.
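The idea of expanding intermediate concepts to multiple levels can be illustrated with a small graph search. The sketch below is not the authors' implementation: the function names, the `max_intermediates` parameter, and the placeholder concepts `A`, `B1`, `B2`, `B3`, and `C` are all assumptions made for illustration, and the edges are hand-made toy data rather than relations extracted from the literature. It treats concepts as nodes, co-occurrence relations as undirected edges, and enumerates indirect paths from a source concept to a target concept that pass through at least one intermediate.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (concept, concept) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def find_paths(graph, source, target, max_intermediates=2):
    """Enumerate simple source-to-target paths that use between one and
    max_intermediates intermediate concepts. A direct source-target edge
    is skipped, because it represents already known knowledge."""
    paths = []
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        for nxt in sorted(graph.get(node, ())):
            if nxt in path:
                continue  # keep paths simple (no repeated concepts)
            if nxt == target:
                if len(path) >= 2:  # at least one intermediate concept
                    paths.append(path + [nxt])
                continue
            # len(path) - 1 is the number of intermediates used so far
            if len(path) - 1 < max_intermediates:
                stack.append((nxt, path + [nxt]))
    # Shorter paths first, then alphabetical for a stable order
    return sorted(paths, key=lambda p: (len(p), p))


# Toy data (hypothetical): A is the source concept, C the target concept.
edges = [("A", "B1"), ("B1", "C"), ("A", "B2"), ("B2", "B3"), ("B3", "C")]
graph = build_graph(edges)

one_level = find_paths(graph, "A", "C", max_intermediates=1)
two_level = find_paths(graph, "A", "C", max_intermediates=2)
print(one_level)
print(two_level)
```

With a single level of intermediates, only the classic source-intermediate-target path is found. Allowing a second level also surfaces the longer chain through `B2` and `B3`, which is the kind of connection a one-level model would miss. A real system would additionally need entity extraction and some ranking of candidate paths, neither of which is shown here.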

Keywords
biotext mining, literature-based discovery, undiscovered public knowledge, graph model

