20220112
This repository contains three questions that address different aspects of image and video analysis using data mining techniques. The methods include object detection using MS COCO and the ResNet-50 model, implemented with Keras on top of TensorFlow, and network visualization using Gephi. The images are loaded by URL and their features are pre-processed before being analyzed for insights. Finally, Matplotlib, Seaborn, and t-SNE are used to visualize the results and support the conclusions.
A researcher has performed a network analysis on the dataset and would now like to know whether different types of video thumbnails appear in different communities. To answer that question, she wants to categorize the video thumbnails using image classification. Design a classification scheme (with clear definitions and examples) for categorizing the video thumbnails in nodes.csv [column video_thumbnail]. Make sure that your classification scheme could be used by someone who hasn’t seen the data before.
Google Images can uncover interesting cultural representations. For example, if you search for “CEO”, you will mainly see pictures of middle-aged white males; if you search for “big data”, you will see an abundance of dark blue, cyberspace-like images. Come up with a Google Images query you want to research, explain your choice, and collect at least 50 images from this query.
Use the subsets of movie trailers from 1920-1940, 1960-1980 and 2000-2020 from the dataset, but instead of comparing the shot types and shot lengths, use one of the (pre-trained) image feature extraction methods to compare the subsets, and explain your choice. Make a plan for tackling the dimensionality of the data: each subset consists of multiple videos, each video consists of multiple frames/seconds/shots, and each frame/second/shot could contain multiple faces/genders/emotions/objects/texts/colors.
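One possible way to handle the subset > video > frame hierarchy is to pool features level by level. The sketch below is only an illustration under stated assumptions: it uses random arrays as stand-ins for real per-frame feature vectors, an arbitrary feature dimension of 8 (a pooled ResNet-50 feature vector would have 2048 values), and plain mean pooling, which is one choice among several and not a prescribed method. The function names `pool_video`, `pool_subset`, and `cosine_similarity` are hypothetical helpers introduced for this example.

```python
import numpy as np


def pool_video(frame_features):
    """Average per-frame feature vectors of shape (n_frames, dim) into one video vector."""
    frames = np.asarray(frame_features, dtype=float)
    return frames.mean(axis=0)


def pool_subset(videos):
    """Average the video vectors of a subset so that each video counts equally."""
    return np.mean([pool_video(v) for v in videos], axis=0)


def cosine_similarity(a, b):
    """Cosine similarity between two 1-D feature vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Synthetic stand-in data (assumption): each subset is a list of videos,
# each video an array of shape (n_frames, dim) holding per-frame features.
rng = np.random.default_rng(0)
dim = 8
subsets = {
    "1920-1940": [rng.normal(0.0, 1.0, size=(n, dim)) for n in (5, 7)],
    "1960-1980": [rng.normal(0.5, 1.0, size=(n, dim)) for n in (6, 4)],
    "2000-2020": [rng.normal(1.0, 1.0, size=(n, dim)) for n in (8, 3)],
}

# One vector per subset, regardless of how many videos or frames it contains.
subset_vectors = {name: pool_subset(videos) for name, videos in subsets.items()}

# Pairwise comparison of the subsets.
names = list(subset_vectors)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        sim = cosine_similarity(subset_vectors[first], subset_vectors[second])
        print(f"{first} vs {second}: cosine similarity = {sim:.3f}")
```

Pooling per video before pooling per subset keeps long trailers from dominating the subset vector. Mean pooling discards variation within a video, so keeping the video-level vectors and projecting them with t-SNE may be a useful complement when the spread inside each subset matters.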