Understanding Infographics through Textual and Visual Tag Prediction

Bylinskii Z, Alsheikh S, Madan S, Recasens A, Zhong K, Pfister H, Durand F, and Oliva A.

ArXiv e-prints, 2017.

We introduce the problem of visual hashtag discovery for infographics: extracting visual elements from an infographic that are diagnostic of its topic. Given an infographic as input, our computational approach automatically outputs textual and visual elements predicted to be representative of the infographic content. Concretely, using a curated dataset of 29K large infographic images sampled across 26 categories and 391 tags, we present an automated two-step approach. First, we extract the text from an infographic and use it to predict text tags indicative of the infographic content. Second, we use these predicted text tags as a supervisory signal to localize the most diagnostic visual elements within the infographic, i.e., visual hashtags. We report performance on categorization and multi-label tag prediction tasks, and compare our proposed visual hashtags to human annotations.
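
The two-step flow described above can be illustrated with a toy sketch. This is not the authors' method: the keyword table `TAG_KEYWORDS`, the function names, and the per-patch tag scores are all hypothetical stand-ins, chosen only to show how predicted text tags could act as the signal for choosing a visual element. In practice both steps would presumably rely on learned models rather than keyword matching and hand-supplied scores.

```python
from collections import Counter

# Hypothetical tag vocabulary: each tag maps to a few keywords (illustrative only).
TAG_KEYWORDS = {
    "health": {"doctor", "disease", "exercise"},
    "finance": {"money", "bank", "loan"},
    "travel": {"flight", "hotel", "passport"},
}


def predict_text_tags(text, top_k=2):
    """Step 1 (toy version): rank tags by keyword overlap with the extracted text."""
    words = Counter(text.lower().split())
    scores = {
        tag: sum(words[w] for w in keywords)
        for tag, keywords in TAG_KEYWORDS.items()
    }
    # Sort by descending score, breaking ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda tag: (-scores[tag], tag))
    # Keep at most top_k tags, and only those with some keyword evidence.
    return [tag for tag in ranked[:top_k] if scores[tag] > 0]


def pick_visual_hashtag(patch_tag_scores, predicted_tags):
    """Step 2 (toy version): pick the patch best supported by the predicted tags.

    patch_tag_scores maps a patch name to a dict of tag -> score. The scores
    are assumed to be given; this sketch does not compute them from pixels.
    """
    def support(scores):
        return sum(scores.get(tag, 0.0) for tag in predicted_tags)

    return max(patch_tag_scores, key=lambda name: support(patch_tag_scores[name]))
```

For example, calling `predict_text_tags` on text extracted from a health-themed infographic and passing the result to `pick_visual_hashtag`, together with scores for a few candidate patches, returns the name of the patch most consistent with the predicted tags.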