Analysis of Football Manager-related content on YouTube, leveraging the YouTube Data API to explore engagement metrics, content trends, and creator strategies.

terryjbates/Football_Manager_YouTube_API_Analysis

Football Manager YouTube Video Analysis

Analysis of Football Manager (FM) YouTube content to understand trends, high-performing video categories, and viewership patterns using Python, Pandas, Seaborn, Plotly, Microsoft Excel, and DuckDB.

Project Summary

This project analyzes Football Manager-related YouTube content, identifying key content categories, patterns in viewership trends, and correlations between video performance and metadata. The analysis leverages the YouTube Data API to provide insights for content creators and businesses targeting the Football Manager audience, showcasing the dynamics of high-performing content and its evolution over time.

Background

The Football Manager Ecosystem

Football Manager (FM) is a highly immersive football management simulation game that puts players in control of every aspect of their team, from tactical decisions to financial management. Available on all major gaming platforms, FM has cultivated a passionate global community, with over 7 million players engaging with the latest release, Football Manager 2024. Central to this ecosystem are content creators on platforms like YouTube and Twitch, who not only entertain but also provide critical gameplay strategies, detailed tutorials, and creative challenges. By analyzing these creators and their content, this project uncovers actionable insights for growing and sustaining engagement in this dynamic ecosystem.

The YouTube Platform for Football Manager Content

YouTube serves not only as a hub for Football Manager content, but also as a powerful marketing tool and search engine. For content creators, it offers a unique opportunity to engage with highly targeted audiences, build loyal communities, and generate sustainable revenue through ad monetization. Beyond ad monetization, creators can use YouTube to promote their presence on other platforms such as Patreon and Twitch, sell their own merchandise directly, or earn brand sponsorships.

YouTube’s algorithm, designed to surface relevant content to users based on viewing habits, makes it easier for creators to break into niches like Football Manager. Unlike traditional SEO strategies that rely heavily on written content and webpage optimization, YouTube offers the advantage of video durability—content created years ago can continue to generate views and engagement, long after initial publication.

This dynamic creates a highly competitive, yet accessible, space for creators of all sizes. Viewers use YouTube as a de facto search engine to find tutorials, gameplay strategies, and community-driven content. As a result, creators who understand how to optimize their video content can potentially gain significant visibility and reach. Furthermore, businesses targeting the Football Manager community can leverage YouTube’s ad platform to connect with a dedicated and engaged audience, fostering brand loyalty through authentic, creator-led partnerships.

For both creators and marketers, the insights from this project offer actionable strategies to capitalize on YouTube’s unique ecosystem. By identifying high-performing video categories and engagement patterns, this analysis provides a roadmap for content optimization and audience growth, ultimately reinforcing YouTube’s role as a cornerstone of the Football Manager content landscape.

Data Overview

Primary data sources:

  • The YouTube Data API provided metadata, performance metrics, and channel details for hundreds of creators and thousands of FM-related videos.

  • Data was processed and stored in DuckDB for efficient querying and manipulation.

  • Key metrics were exported to CSV and analyzed in Microsoft Excel using Pivot Tables and area charts to categorize content and track trends.

Analytical Techniques

Data Cleaning and Processing

To ensure data consistency, columns with inconsistent capitalization, special characters, or missing values were cleaned. Descriptive statistics and exploratory analysis were conducted to summarize the data.

Categorization

Videos were categorized based on their title and description. Word frequency analysis and clustering techniques helped reveal trends and associations in content. Categories include:

  • Challenges
  • Experiments
  • Rebuilds and Team Development
  • Player Development
  • Guides and Tutorials
  • Community Collaborations
  • Discussion
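
A minimal sketch of how title- and description-based categorization could work. The keyword map below is hypothetical: the README does not list the terms actually used, and the real analysis also drew on word frequency analysis and clustering.

```python
import re

# Hypothetical keyword map for illustration; not the project's actual term list.
CATEGORY_KEYWORDS = {
    "Challenges": ["challenge"],
    "Experiments": ["experiment", "what if"],
    "Rebuilds and Team Development": ["rebuild", "road to glory"],
    "Player Development": ["wonderkid", "player development"],
    "Guides and Tutorials": ["guide", "tutorial", "how to", "tactic"],
    "Community Collaborations": ["collab", "feat."],
    "Discussion": ["discussion", "reaction", "podcast"],
}


def categorize(title: str, description: str = "") -> list[str]:
    """Return every category whose keywords appear in the title or description."""
    text = f"{title} {description}".lower()
    matches = []
    for category, keywords in CATEGORY_KEYWORDS.items():
        # \b anchors each keyword to the start of a word ("rebuild" also matches "rebuilds").
        if any(re.search(r"\b" + re.escape(keyword), text) for keyword in keywords):
            matches.append(category)
    return matches or ["Uncategorized"]


labels = categorize("FM24 Rebuild Challenge: Manchester United")
```

Returning a list lets a single video carry more than one category, which fits the primary and secondary categories described in the methodology.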

Correlation Analysis

Pearson’s Correlation Coefficient was used to analyze relationships between view counts, like counts, comment counts, and other metrics.
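
For reference, Pearson's r can be computed directly from its definition. The sketch below uses only the standard library and made-up sample values; the project itself most likely relied on pandas for this step.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up sample values for illustration only.
views = [1000, 2500, 4000, 12000, 30000]
likes = [40, 90, 150, 500, 1100]
r = pearson_r(views, likes)
```

Values near 1 or -1 indicate a strong linear relationship, while values near 0 indicate little or none.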

Time Series Analysis

Video viewership patterns were analyzed over time, with specific attention to periods of increased engagement, such as the months surrounding new FM releases.
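
One simple way to look at viewership over time is to total view counts by publication month. The sketch below assumes each video is a dict with `publishedAt` and `viewCount` fields shaped like YouTube Data API output; the sample records are made up. Note that the API reports cumulative views, so this groups lifetime views by publish date rather than measuring views earned within each month.

```python
from collections import defaultdict
from datetime import datetime


def views_by_month(videos):
    """Sum view counts per publication month, keyed as 'YYYY-MM'."""
    totals = defaultdict(int)
    for video in videos:
        # Assumes ISO 8601 UTC timestamps without fractional seconds.
        published = datetime.strptime(video["publishedAt"], "%Y-%m-%dT%H:%M:%SZ")
        totals[published.strftime("%Y-%m")] += int(video["viewCount"])
    return dict(sorted(totals.items()))


# Made-up sample records for illustration only.
sample_videos = [
    {"publishedAt": "2023-11-06T17:00:00Z", "viewCount": "2000"},
    {"publishedAt": "2023-11-20T09:30:00Z", "viewCount": "1500"},
    {"publishedAt": "2023-10-02T12:00:00Z", "viewCount": "500"},
]
monthly = views_by_month(sample_videos)
```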

Tools and Technologies

Data Collection and Storage

  • YouTube Data API: For fetching video metadata and engagement metrics.
  • DuckDB: For SQL-based data querying of dataframes and local database storage.
  • Selenium: Automated browser interactions to simulate user scrolling and dynamically load video lists.
  • Chrome WebDriver: A tool for controlling Chrome browser sessions programmatically.

Data Analysis and Processing

  • Python: Programming language used for API data extraction, data manipulation, and analysis.
    • pandas: Data cleaning, transformation, and analysis.
    • NumPy: Numerical operations and efficient data handling.
    • re: Text processing with regular expressions.
    • NLTK: Text processing and stopword removal.
    • func_timeout: Managing long-running web scraping activity with timeouts.
    • datetime: Handling and formatting timestamps.
    • langdetect: Language detection library to detect creator language.

Data Visualization

  • matplotlib: Static plot creation.
  • seaborn: Pairplots, scatterplots, barplots, and other visualizations.
  • Plotly: Interactive visualizations (e.g., stacked bar plots, scatter plots).
  • WordCloud: Wordcloud generation from text data.
  • Pivot Tables: Aggregations of groups of individual data within one or more discrete categories.
  • Area Charts: Graphic display of quantitative data.

Workflow and Environment

  • Jupyter Notebooks: Organization and documentation of data analysis and visualization efforts.
  • Anaconda: Python development environment and dependencies.
  • Microsoft Excel: Spreadsheet editor.

Methodology

This section outlines the steps taken to collect, process, and analyze the data in this project. The methodology follows a structured approach, ensuring clarity, reproducibility, and alignment with data analysis best practices.

1. Define Project Objectives and Scope

The goal of this project was to analyze Football Manager (FM) YouTube content, identify key engagement patterns, and explore content trends to uncover actionable insights for creators and analysts. Unless otherwise specified, the scope is limited to monetizable YouTube channels producing Football Manager-related content, prioritizing those with the most subscribers. The work covers collecting data for these channels, extracting engagement metrics, and categorizing content.


2. Data Collection

  • YouTube Data API Integration:
    The YouTube Data API was utilized to gather channel and video metadata. Search terms such as “FM24” and “Football Manager” were used to identify relevant content.
  • Web Scraping with Selenium and Chrome Webdriver:
    To bypass YouTube Data API quota limitations, Selenium was employed to extract video IDs directly from playlist URLs.
  • Data Persistence:
    All collected data was stored in DuckDB and CSV files for efficient querying and reproducibility.
  • Time Period:
    Data was collected from the earliest dates available in the YouTube Data API through November 29, 2024.
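
As an illustration of the API integration step, the sketch below builds a request URL for the YouTube Data API `search.list` endpoint without sending it. The function name and the placeholder API key are hypothetical; the project's notebooks may well use a client library instead.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(query, api_key, page_token=None, max_results=50):
    """Build a search.list request URL for videos matching a query."""
    params = {
        "part": "snippet",
        "q": query,
        "type": "video",
        "maxResults": max_results,  # the API caps this at 50 per page
        "key": api_key,
    }
    if page_token:
        # nextPageToken from the previous response, used to fetch the next page.
        params["pageToken"] = page_token
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# "YOUR_API_KEY" is a placeholder, not a real credential.
url = build_search_url("Football Manager", "YOUR_API_KEY")
```

Paging through results with `pageToken` consumes quota on every request, which is one reason a scraping fallback can be attractive for large playlists.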

3. Data Cleaning and Preprocessing

  • Filtering Non-Relevant Channels:
    • Channels unrelated to Football Manager were excluded based on domain knowledge and keyword-based filtering.
    • Channels below the monetization threshold of 1,000 subscribers were excluded.
    • Channels autogenerated by the YouTube platform were excluded. Ex: Football Manager 2024 - Topic
  • Remove Duplicate Columns:
    • Redundant dataframe columns storing the same information were removed. Ex: description and brandingDescription, channelTitle and brandingTitle.
  • Confirm Numeric Fields are Valid:
    • subscriberCount
    • viewCount
    • videoCount
  • Standardizing Keywords:
    Channel branding keywords were cleaned and transformed into lists to enable keyword-specific analysis.
  • Feature Engineering:
    Several new columns were created to enhance the dataset, including:
    • numeric_duration for video lengths in seconds.
    • publish_day_name for day-of-week analysis.
    • LikeRatio and CommentRatio to normalize engagement metrics.
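
The engineered columns above could be derived roughly as follows. This is a sketch under stated assumptions: the README does not define `LikeRatio` and `CommentRatio`, so they are assumed here to be likes and comments divided by views, and the input is assumed to be a dict with `duration`, `publishedAt`, `viewCount`, `likeCount`, and `commentCount` fields in YouTube Data API format.

```python
import re
from datetime import datetime

# ISO 8601 duration as returned by the API, e.g. "PT1H2M3S".
DURATION_PATTERN = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)
DAY_NAMES = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")


def duration_to_seconds(iso_duration):
    """Convert an ISO 8601 duration string to a number of seconds."""
    match = DURATION_PATTERN.match(iso_duration)
    if not match:
        raise ValueError(f"unrecognized duration: {iso_duration!r}")
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return (parts["days"] * 86400 + parts["hours"] * 3600
            + parts["minutes"] * 60 + parts["seconds"])


def add_features(video):
    """Return a copy of the video record with the engineered columns added."""
    views = int(video["viewCount"])
    published = datetime.strptime(video["publishedAt"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        **video,
        "numeric_duration": duration_to_seconds(video["duration"]),
        "publish_day_name": DAY_NAMES[published.weekday()],
        # Assumed definitions: engagement counts normalized by views.
        "LikeRatio": int(video["likeCount"]) / views if views else None,
        "CommentRatio": int(video["commentCount"]) / views if views else None,
    }


# Made-up sample record for illustration only.
sample = add_features({
    "duration": "PT1H2M3S",
    "publishedAt": "2024-11-29T12:00:00Z",
    "viewCount": "1000",
    "likeCount": "50",
    "commentCount": "10",
})
```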

4. Exploratory Data Analysis (EDA)

  • Summary Statistics:
    Descriptive statistics provided an overview of key metrics like subscribers, views, and likes.
  • Visualizations:
    Correlation heatmaps, histograms, scatterplots, and boxplots were used to reveal patterns and distributions in the data.
  • Correlation Analysis:
    Relationships between likes, views, comments, and video duration were analyzed using correlation coefficients.
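
A small sketch of the kind of summary statistics involved, using made-up subscriber counts. The `top_share` helper is hypothetical; it shows one way to measure how concentrated subscribers are among the largest channels.

```python
from statistics import mean, median


def top_share(counts, top_fraction=0.2):
    """Fraction of the total held by the largest `top_fraction` of channels."""
    ranked = sorted(counts, reverse=True)
    total = sum(ranked)
    if total == 0:
        raise ValueError("counts must contain at least one positive value")
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / total


# Made-up subscriber counts for illustration only.
subscriber_counts = [500000, 120000, 40000, 9000, 5000, 3000, 2000, 1500, 1200, 1000]

average = mean(subscriber_counts)
middle = median(subscriber_counts)
share = top_share(subscriber_counts)
```

With a skewed distribution like this one, the mean sits far above the median, which is why both are worth reporting.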

5. Content Categorization and Trends Analysis

  • Primary and Secondary Categories:
    Videos were classified into categories such as “Challenges,” “Experiments,” and “Rebuilds” to uncover dominant themes.
  • Wordcloud Analysis:
    Video descriptions were aggregated and visualized to identify prominent terms and content trends.
  • Seasonality Analysis:
    Engagement patterns over time were analyzed, with a focus on the release cycles of Football Manager games and their impact on video performance.
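
The term counts behind a wordcloud can be approximated with a simple frequency tally. The stopword set below is a tiny illustrative stand-in; the project lists NLTK for stopword removal and the WordCloud library for rendering.

```python
import re
from collections import Counter

# Tiny illustrative stopword set, not NLTK's full English list.
STOPWORDS = {"a", "an", "and", "the", "to", "of", "in", "on", "for", "my", "is", "this", "with"}


def term_frequencies(descriptions, top_n=10):
    """Count the most common non-stopword terms across video descriptions."""
    counts = Counter()
    for text in descriptions:
        tokens = re.findall(r"[a-z0-9']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 1)
    return counts.most_common(top_n)


# Made-up descriptions for illustration only.
top_terms = term_frequencies(
    ["The best FM24 tactic", "My FM24 rebuild of the season"],
    top_n=1,
)
```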

6. Insights and Findings

  • Ecosystem

    • 575 YouTube channels produce Football Manager content, with 46% eligible for the YouTube Partner Program based on subscriber count.

    • A total of 7.49M YouTube subscribers to channels related to Football Manager were found.

    • The oldest channel found producing Football Manager content was created in February 2007 (docks), yet the oldest verifiable Football Manager video in our dataset was created in January 2013 ("Genie Scout 13 Tutorial - Intro" by FM Scout).

    • Subscriber distribution is highly skewed; a small number of creators have very large subscriber counts, while a much larger number of creators have far smaller communities.

      | channelTitle | subscriberCount |
      | --- | --- |
      | NickRTFM | 879000 |
      | Nick28T | 575000 |
      | Domingo Replay | 432000 |
      | Zealand | 371000 |
      | docks | 244000 |
      | Ataberk Doğan | 231000 |
      | Manny Plus | 218000 |
      | Kırmızı Kep | 216000 |
      | WorkTheSpace | 211000 |
      | TomFM | 208000 |
      | FM Scout | 194000 |
      | lollujo | 190000 |
      | ZackNaniTV | 161000 |
      | DK FALCON | 154000 |
      | Omega Luke | 151000 |
      | DoctorBenjy FM | 113000 |
      | Seals 311 | 97600 |
      | Steini | 97000 |
      | Arthur Ray | 96600 |
      | Zealand Live | 91300 |

      [Figures: Most Channels Under 100K Subs; Channels Under 100K Subs Have Same Pattern]

      • Ranking YouTube channels by their number of subscribers, 62% (4.64M) of subscribers are accounted for by the top 20% of content creators.
      • The average number of subscriptions per channel is 33K, while the median is 5,670 subscriptions per channel. (Note: Channel subscriptions are not exclusive on the YouTube platform; a viewer may subscribe to, and surface content from, one or more channels at any time and with any frequency.)
    • Several larger content creators, ranked by subscriber count, have launched secondary and tertiary channels, increasing their overall subscription footprint.

    • 81% of languages spoken in Football Manager-related channels are European. [Figure: English is Dominant Language]

      • English (en) is the spoken language in 50% of all channels.
      • Turkish (tr) is the second-most common language with 10.55%. In third to sixth place:
        • Spanish (es): 6.88%
        • Portuguese (pt): 6.42%
        • French (fr): 4.13%
        • German (de): 3.21%
      • The non-European languages found in our dataset were:
        • Indonesian
        • Korean
        • Arabic
        • Japanese
    • Of channels identifying the country of their creator, 32.82% are in Great Britain and 11.79% are in Turkey. Brazilian channels make up 6.15% of channels, explaining the presence of Portuguese in our language discussion. Indonesia has an equal number of channels as Spain, and outpaced Spain, Germany, and the United States. [Figure: Channel Count by Country]

    • 77% of monetizable channels producing Football Manager content have branding keywords present. For channels using branding keywords, the average number of keywords is 18.93.

  • Engagement Patterns

    • Likes

      • Likes strongly correlate with views (r=0.82), indicating viewers generally like content they’ve watched. [Figure: Likes vs. Views]
      • Likes and comments are moderately correlated (r=0.67). This is expected, since commenters must view a video before investing the effort to comment on it. [Figure: Comments vs. Likes]

    • Views

      • A weaker correlation exists between views and comments (r=0.56). [Figure: Comments vs. Views]
    • Comments

      • No correlation was found between video running time and the number of comments. [Figure: Comments vs. Views]
    • Running Time

      • No clear relationships were found between video running time and the number of views, likes, or comments; all computed correlation coefficients were near 0. This suggests the Football Manager audience has no strong preference for content length, leaving creators free to produce long-form or short-form content without a measurable impact on engagement. [Figure: Engagement Metrics Correlation Heatmap]
  • Content Trends


7. Visualizations


8. Next Steps

To further expand on the insights gained from this project and explore untapped areas of analysis, the following next steps are proposed:

Machine Learning Classification

This analysis relied on domain-specific knowledge, direct knowledge of individual content creators' brands and related information, and personal judgement. An interesting exercise would be to train a supervised machine learning model to assess programmatically whether a YouTube video is a "Football Manager" video. Features could include the title, description text, tags, and even the thumbnail imagery. This would reduce manual assessment and decrease the toil in the data acquisition process.

Clustering analysis techniques run against data for various Football Manager-related channels may uncover similar video content based on quantitative data instead of qualitative judgments made by analysts. Once similar videos are isolated, based upon categorization by content creator, it may be possible to discover overlapping sets of content types and user bases.

Sentiment Analysis of Video Comments

Analyze the sentiment behind viewer comments to uncover responses and their correlation with specific video categories or creators. Assess whether particular percentages of positive or negative sentiment in video comments are correlated with video engagement, increased sponsored product sales, or channel "conversion" (new subscribers).

Tie In Cross-Platform Audience Behavior

Compare YouTube engagement metrics with other platforms like Twitch or TikTok to understand cross-platform audience overlaps and preferences. Explore whether patterns of growth, content type, and seasonality mirror or diverge from those on the YouTube platform.

Measure Viewer Retention and Drop-Off Points

Viewer "churn" can only be assessed by individual content creators, as that data is not made public in the YouTube Data API. If internal classification of content were possible, creators could investigate viewer retention rates across video categories, identify where and when audiences lose interest in channel offerings, and isolate potential improvements in content delivery.

Integrate Impact of Out-of-Game Events

Football Manager players are playing a simulation of the real-world football ecosystem. Scandals, events, competition rule changes, and so on may play a role in what sorts of content are desired on the YouTube platform. The emergence of an exciting new footballing talent, a World Cup, and clubs relegated due to financial mismanagement are all events with high content creation potential. Collecting a stream of such events and overlaying their timeline with YouTube engagement metrics may provide interesting insights into what sorts of content viewers are interested in.

Explore International Retail Data with YouTube Content Trends

One of the more interesting discoveries gleaned from visual analytics was the lack of content creators in significant parts of the globe. No content creators were found in Mexico, Central America, Africa, or central Asia. It is surprising that nations where football is a predominant sport do not show more appetite for games centered on football management simulation. While this may reflect a marketing dilemma, there may be structural underpinnings that explain the dearth of creators. Playing the game may demand computing hardware and software that residents of these areas cannot readily access. Internet access, game retail costs, and lack of advertising may also explain this phenomenon. As the YouTube Data API does not present demographic viewer information, content creators could analyze their own channel statistics to see whether content consumption shows interest in the game in these areas, even where there are few creators and, potentially, few players.


Repository Structure

  • data/: Contains raw YouTube video data and spreadsheets.
  • notebooks/: Jupyter notebooks with all analysis and code.
  • images/: Visualization outputs used in this README.md.
  • requirements.txt: List of Python dependencies used in the analysis notebook.
  • README.md: This document.
