Finding Meaning on YouTube: Tag Recommendation and Category Discovery

Marius Pasca
Computer Vision and Pattern Recognition, IEEE (2010)

Abstract

We present a system that automatically recommends tags for YouTube
videos based solely on their audiovisual content. We also propose a novel
framework for unsupervised discovery of video categories that exploits
knowledge mined from Web text documents and search queries. First, the
association between video content and tags is learned by training classifiers
that map audiovisual content-based features from millions of videos on
YouTube.com to the existing uploader-supplied tags for these videos. When a
new video is uploaded, the labels produced by these classifiers are used to
automatically suggest tags deemed relevant to the video. Our system has
learned a vocabulary of over 20,000 tags.
Second, we mine large volumes of Web pages and search queries to discover a
set of possible text entity categories and a set of associated is-A
relationships that map individual text entities to categories. Finally, we
apply these is-A relationships mined from Web text to the tags learned from
the audiovisual content of videos to automatically synthesize a reliable set
of categories most relevant to videos, along with a mechanism to predict these
categories for new uploads. We then present rigorous rating studies that
establish that: (a) the average relevance of tags automatically recommended by
our system matches the average relevance of the uploader-supplied tags at the
same or better coverage, and (b) the average precision@K of video categories
discovered by our system is 70% with K=5.
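
As a rough illustration of the last two steps, the sketch below maps predicted
tags to categories through is-A pairs and scores a ranked list with
precision@K. It is a minimal sketch and not the method of the paper: the
is-A table, the tag scores, the function names, and the choice to rank
categories by summing tag scores are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical is-A pairs (tag -> categories). In the paper these are mined
# from Web pages and search queries; the entries here are invented.
IS_A = {
    "guitar": ["musical instrument"],
    "piano": ["musical instrument"],
    "soccer": ["sport"],
    "tennis": ["sport"],
}


def categories_from_tags(tag_scores, is_a, k=5):
    """Return the top-k categories for one video.

    Assumption: a category's score is the sum of the classifier scores of
    the tags that map to it. The abstract does not specify the aggregation.
    """
    totals = defaultdict(float)
    for tag, score in tag_scores.items():
        for category in is_a.get(tag, []):
            totals[category] += score
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [category for category, _ in ranked[:k]]


def precision_at_k(predicted, relevant, k=5):
    """Fraction of the top-k predictions that raters judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top = predicted[:k]
    return sum(1 for item in top if item in relevant) / k


# Invented classifier outputs for a single video.
video_tag_scores = {"guitar": 0.9, "piano": 0.4, "soccer": 0.7}
video_categories = categories_from_tags(video_tag_scores, IS_A, k=5)
```

Under these assumptions, a video whose five predicted categories include
three judged relevant has precision@5 of 0.6; the 70% figure above is this
quantity averaged over the rated videos.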
