
Toward a Structural and Semantic Metadata Framework for Efficient Browsing and Searching of Web Videos

한국문헌정보학회지 / Journal of the Korean Society for Library and Information Science, (P)1225-598X; (E)2982-6292
2017, v.51 no.1, pp.227-243
https://doi.org/10.4275/KSLIS.2017.51.1.227
Kim, Hyun-Hee (Myongji University)

Abstract

This study proposed a structural and semantic framework for the characterization of events and segments in Web videos that permits content-based searches and dynamic video summarization. Although MPEG-7 supports structural and semantic descriptions of multimedia, it is not currently suitable for describing multimedia content on the Web. Thus, the proposed metadata framework, designed with Web environments in mind, provides a thorough yet simple way to describe Web video content. Specifically, the framework was constructed on the basis of Chatman’s narrative theory, three multimedia metadata formats (PBCore, MPEG-7, and TV-Anytime), and social metadata. It consists of event information, eventGroup information, segment information, and video (program) information. This study also discusses how structural and semantic metadata elements can be automatically extracted from Web videos.
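The four information layers named in the abstract (event, eventGroup, segment, and video) can be sketched as a set of record types. The following Python sketch is illustrative only: the field names, the theme-based grouping, and the `summarize` method are assumptions for demonstration, not the element names or summarization procedure defined in the paper.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical sketch of the four-layer framework; field names
# are illustrative assumptions, not the paper's actual schema.

@dataclass
class Event:
    event_id: str
    title: str                # semantic description of the event
    start_time: float         # seconds from the start of the video
    end_time: float
    social_tags: List[str] = field(default_factory=list)  # social metadata

@dataclass
class EventGroup:
    group_id: str
    theme: str                # hypothetical grouping criterion
    events: List[Event] = field(default_factory=list)

@dataclass
class Segment:
    segment_id: str           # structural unit, e.g. a shot or scene
    start_time: float
    end_time: float

@dataclass
class Video:
    video_id: str
    title: str
    event_groups: List[EventGroup] = field(default_factory=list)
    segments: List[Segment] = field(default_factory=list)

    def summarize(self, theme: str) -> List[Event]:
        """Dynamic summarization sketch: collect events whose
        group matches the requested theme."""
        return [e for g in self.event_groups if g.theme == theme
                for e in g.events]
```

A content-based search or summary would then operate over the event layer rather than raw video frames, which is the kind of access the framework is meant to enable.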

Keywords
Multimedia, Structural Metadata, Semantic Metadata, MPEG-7, PBCore, TV-Anytime, Chatman’s Narrative Theory, Social Metadata, Event, Segment, Content-based Search

References

1. Kim, Yong-Ho. 2009. “A Structural Model of Mediated Visual Communication in Narrative Movies: Focusing on Chatman and Bordwell’s Controversy.” Korean Journal of Journalism & Communication Studies, 53(1), 209-232.

2. Cho, Young-Joon. 2014. “The Study on Improvement of Broadcast Metadata about Clip Video at Broadcast Content Managements.” In Proceedings of the 2014 Korean Society of Broadcast Engineers Summer Conference, June 30-July 2, 2014, Jeju: Jeju National University, 59-63.

3. Agnew, G., Kniesner, D. and Weber, M. B. 2007. “Integrating MPEG-7 into the Moving Image Collections Portal.” Journal of the American Society for Information Science and Technology, 58(9), 1357-1363.

4. Algur, S. P., Bhat, P. and Jain, S. 2014. “Metadata Construction Model for Web Videos: A Domain Specific Approach.” International Journal of Engineering and Computer Science, 3(12), 9742-9748.

5. Arndt, R. et al. 2007. “COMM: Designing a Well-Founded Multimedia Ontology for the Web.” In Proceedings of the 6th International Semantic Web Conference and the 2nd Asian Semantic Web Conference, November 11-15, 2007, Busan: BEXCO.

6. Behroozi, M., Daliri, M. R. and Shekarchi, B. 2016. “EEG Phase Patterns Reflect the Representation of Semantic Categories of Objects.” Medical & Biological Engineering & Computing, 54(1), 205-221.

7. Benitez, A. B., Zhong, D. and Chang, S. F. 2007. “Enabling MPEG-7 Structural and Semantic Descriptions in Retrieval Applications.” Journal of the Association for Information Science and Technology, 58(9), 1377-1380.

8. Bouadjenek, M. R., Hacid, H. and Bouzeghoub, M. 2016. “Social Networks and Information Retrieval, How Are They Converging? A Survey, a Taxonomy and an Analysis of Social Information Retrieval Approaches and Platforms.” Information Systems, 56, 1-18.

9. Chatman, S. 1975. “Towards a Theory of Narrative.” New Literary History, 6(2), 295-318.

10. Chen, F., Delannay, D. and De Vleeschouwer, C. 2011. “An Autonomous Framework to Produce and Distribute Personalized Team-Sport Video Summaries: A Basketball Case Study.” IEEE Transactions on Multimedia, 13(6), 1381-1394.

11. Christel, M. G. 2009. Automated Metadata in Multimedia Information Systems: Creation, Refinement, Use in Surrogates, and Evaluation. Synthesis Lectures on Information Concepts, Retrieval, and Services, 2. San Rafael, CA: Morgan & Claypool Publishers.

12. Cunningham, S. J. and Nichols, D. M. 2008. “How People Find Videos.” In Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, June 16-20, 2008, Pittsburgh, PA: Omni William Penn Hotel, 201-210.

13. Evain, J. P. and Martínez, J. M. 2007. “TV-Anytime Phase 1 and MPEG-7.” Journal of the American Society for Information Science and Technology, 58(9), 1367-1373.

14. International Organization for Standardization/International Electrotechnical Commission (ISO/IEC). 2002-2004. ISO/IEC 15938: Part 1-8: Information Technology: Multimedia Content Description Interface (MPEG-7). Geneva: International Organization for Standardization.

15. Huurnink, B. et al. 2010. “Search Behavior of Media Professionals at an Audiovisual Archive: A Transaction Log Analysis.” Journal of the Association for Information Science and Technology, 61(6), 1180-1197.

16. Klix, F. 2001. “The Evolution of Cognition.” Journal of Structural Learning and Intelligence Systems, 14, 415-431.

17. Lee, H. K. et al. 2005. “Personalized TV Services and T-Learning Based on TV-Anytime Metadata.” In Proceedings of the 6th Pacific-Rim Conference on Multimedia, November 13-16, 2005, Jeju: Ramada Plaza Jeju Hotel, 212-223.

18. List, T. and Fisher, R. B. 2004. “CVML: An XML-based Computer Vision Markup Language.” In Proceedings of the 17th International Conference on Pattern Recognition, August 23-26, 2004, Cambridge, 789-792.

19. Lunn, B. K. 2009. Towards the Design of User based Metadata for Television Broadcasts. Saarbrücken: VDM Verlag.

20. Makkonen, J. et al. 2010. “Detecting Events by Clustering Videos from Large Media Databases.” In Proceedings of the 2nd ACM International Workshop on Events in Multimedia, October 25, 2010, Firenze, 9-14.

21. Mehmood, I. et al. 2016. “Divide-and-Conquer based Summarization Framework for Extracting Affective Video Content.” Neurocomputing, 174(A), 393-403.

22. The Moving Picture Experts Group (MPEG). [n.d.]. MPEG. Villar Dora: The Moving Picture Experts Group. [online] [cited 2016. 9. 11.] <http://mpeg.chiariglione.org/>

23. Nowak, M. A., Plotkin, J. B. and Jansen, V. A. 2000. “The Evolution of Syntactic Communication.” Nature, 404(6777), 495-498.

24. Park, J. R. and Lu, C. 2009. “Application of Semi-Automatic Metadata Generation in Libraries: Types, Tools, and Techniques.” Library & Information Science Research, 31(4), 225-231.

25. PBCore. [n.d.]. PBCore. [online] [cited 2016. 9. 3.] <http://pbcore.org/introducing-pbcore-2-0/>

26. Reijnders, K. 2011. Suspense Tours: Narrative Generation in the Context of Tourism. Amsterdam: Universiteit van Amsterdam.

27. Shotton, D. M. et al. 2002. “A Metadata Classification Schema for Semantic Content Analysis of Videos.” Journal of Microscopy, 205(1), 33-42.

28. Smeaton, A. F., Over, P. and Doherty, A. R. 2010. “Video Shot Boundary Detection: Seven Years of TRECVid Activity.” Computer Vision and Image Understanding, 114(4), 411-418.

29. Teeter, P. and Sandberg, J. 2016. “Cracking the Enigma of Asset Bubbles with Narratives.” Strategic Organization, 15(1), 91-99.

30. Togawa, H. and Okuda, M. 2005. “Position-Based Keyframe Selection for Human Motion Animation.” In Proceedings of the 11th International Conference on Parallel and Distributed Systems, July 20-22, 2005, Fukuoka, 182-185.

31. TV-Anytime Forum. 2005. TV-Anytime Forum. [online] [cited 2016. 5. 16.] <http://www.tv-anytime.org/>

32. Wang, M. et al. 2012. “Event Driven Web Video Summarization by Tag Localization and Key-Shot Identification.” IEEE Transactions on Multimedia, 14(4), 975-985.

33. Yokoi, K., Nakai, H. and Sato, T. 2008. “Toshiba at TRECVID 2008: Surveillance Event Detection Task.” In Proceedings of the TRECVID 2008 Workshop, November 17-18, 2008, Gaithersburg, MD: National Institute of Standards and Technology.