Fusion Approach to Targeted Opinion Detection in Blogosphere

Journal of Korean Library and Information Science Society, (P)2466-2542;
2015, v.46 no.1, pp.321-344
https://doi.org/10.16981/kliss.46.1.201503.321
Yang, Kiduk

Abstract

This paper presents a fusion approach to sentiment detection that combines multiple sources of evidence to retrieve blogs that contain opinions on a specific topic. Our approach to finding opinionated blogs on a topic first applies traditional information retrieval methods to retrieve blogs on the given topic and then boosts the ranks of opinionated blogs based on the opinion scores computed by multiple sentiment detection methods. The central idea of our sentiment detection strategy is to rely on a variety of complementary evidence rather than trying to optimize the use of a single source of evidence. The strategy comprises five modules: the High Frequency module, which identifies opinions based on the frequency of opinion terms (i.e., terms that occur frequently in opinionated documents); the Low Frequency module, which makes use of uncommon or rare terms (e.g., “sooo good”) that express strong sentiments; the IU module, which leverages n-grams with IU (I and you) anchor terms (e.g., “I believe”, “You will love”); the Wilson’s lexicon module, which uses a collection-independent opinion lexicon constructed from Wilson’s subjectivity terms; and the Opinion Acronym module, which uses a small set of opinion acronyms (e.g., “imho”). The results of our study show that combining multiple sources of opinion evidence is an effective method for improving opinion detection performance.
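The retrieve-then-boost idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the fusion formula, so everything specific here is an assumption made for illustration: the min-max normalization, the weighted-sum combination, the `alpha` interpolation parameter, the function names `min_max` and `fuse_and_rerank`, and the module names and scores in the sample data.

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]  # document id -> raw score


def min_max(scores: Scores) -> Scores:
    """Rescale scores to [0, 1]; a constant score set maps to all zeros."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def fuse_and_rerank(
    topic_scores: Scores,
    opinion_scores: Dict[str, Scores],
    weights: Dict[str, float],
    alpha: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rerank topically retrieved documents by a fused opinion score.

    Assumed scheme (not taken from the paper): each opinion module's scores
    are min-max normalized, combined by a weighted average, and interpolated
    with the normalized topic score using `alpha`.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("module weights must sum to a positive value")

    norm_topic = min_max(topic_scores)
    norm_modules = {name: min_max(s) for name, s in opinion_scores.items()}

    ranked = []
    for doc, topic in norm_topic.items():
        # Documents a module did not score contribute 0 for that module.
        opinion = sum(
            w * norm_modules.get(name, {}).get(doc, 0.0)
            for name, w in weights.items()
        ) / total_weight
        ranked.append((doc, alpha * topic + (1 - alpha) * opinion))

    # Highest fused score first.
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


# Hypothetical scores for three blogs and two of the opinion modules.
topic_scores = {"blog_a": 3.0, "blog_b": 2.0, "blog_c": 1.0}
opinion_scores = {
    "high_frequency": {"blog_a": 0.0, "blog_b": 4.0, "blog_c": 2.0},
    "iu": {"blog_a": 0.0, "blog_b": 1.0, "blog_c": 1.0},
}
weights = {"high_frequency": 1.0, "iu": 1.0}

ranked = fuse_and_rerank(topic_scores, opinion_scores, weights)
print(ranked)
```

In this made-up example, `blog_a` has the best topic score but no opinion evidence, so the more opinionated `blog_b` moves ahead of it after fusion. Equal weights are used only for simplicity; in practice the module weights and `alpha` would need to be tuned on training data.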

Keywords
Fusion, Information retrieval, Opinion detection
