Investigating an Automatic Method in Summarizing a Video Speech Using User-Assigned Tags

Journal of the Korean Society for Library and Information Science, (P)1225-598X; (E)2982-6292
2012, v.46 no.1, pp.163-181
https://doi.org/10.4275/KSLIS.2012.46.1.163

Abstract

We investigated how useful video tags are for summarizing video speech and how valuable positional information is for speech summarization. We also examined the similarity among sentences selected for a speech summary in order to reduce redundancy. Based on the results of these analyses, we designed and evaluated a method for automatically summarizing speech transcripts using a modified Maximum Marginal Relevance (MMR) model. This model not only reduces redundancy but also makes use of social tags, title words, and sentence positional information. Finally, we compared the proposed method with the Extractor system, which selects the key sentences of a video speech using the frequency and location information of speech content words. The proposed method achieved higher precision and recall than the Extractor system, although the difference in recall was not significant.

Keywords
MMR Model, Social Summarization, Redundancy, Acoustic Features, Prosodic Features, Extractor, Transcripts, Speech Summarization, Tags, Video, Title, Cosine Similarity Coefficient, Intrinsic Evaluation, Relevant Sentences
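
The abstract names a modified Maximum Marginal Relevance model that combines social tags, title words, and sentence position while penalizing redundancy, but it does not give the scoring formula. The following is only a minimal sketch of generic MMR sentence selection under stated assumptions, not the authors' method: the function names, the linear position bonus, and the weights `lam` and `position_weight` are all hypothetical, and cosine similarity over raw term counts stands in for whatever term weighting the paper actually uses.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance(index, vectors, query, position_weight):
    """Hypothetical relevance: similarity to tag/title words plus a position bonus."""
    # Assumption: earlier sentences receive a linearly larger bonus.
    position_bonus = 1.0 - index / len(vectors)
    return cosine(vectors[index], query) + position_weight * position_bonus


def mmr_summarize(sentences, tags, title_words, k=2, lam=0.7, position_weight=0.2):
    """Greedily pick k sentences, trading relevance against redundancy."""
    vectors = [Counter(s.lower().split()) for s in sentences]
    query = Counter(w.lower() for w in list(tags) + list(title_words))
    selected = []
    candidates = list(range(len(sentences)))
    while candidates and len(selected) < k:

        def score(i):
            # Redundancy is the highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            rel = relevance(i, vectors, query, position_weight)
            return lam * rel - (1.0 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    # Return the chosen sentences in their original transcript order.
    return [sentences[i] for i in sorted(selected)]


# Toy transcript, tags, and title words, invented purely for illustration.
sentences = [
    "solar panels convert sunlight into electricity",
    "solar panels convert sunlight into electricity efficiently",
    "the speaker thanks the audience",
    "wind turbines generate electricity from wind",
]
tags = ["solar", "electricity", "wind"]
title_words = ["renewable", "energy"]

summary = mmr_summarize(sentences, tags, title_words, k=2)
print(summary)
```

In this toy run the second sentence is a near-duplicate of the first, so the redundancy term lowers its score once the first sentence has been selected. Setting `lam=1.0` disables that penalty and reduces the procedure to plain relevance ranking.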

