
wanerllubbse/imgur-scraper

Imgur Scraper

Imgur Scraper lets you search, crawl, and export millions of posts, memes, and image/video galleries from Imgur with detailed metadata. It supports keyword search, tags, user galleries, and comment threads so you can build rich datasets for analytics, content research, or automation workflows. Use this Imgur scraper to reliably collect structured information at scale without dealing with fragile manual copy-paste.


Created by Bitbash, built to showcase our approach to Scraping and Automation!

Introduction

Imgur Scraper is a high-performance data collection tool that automates browsing through Imgur search results, tag pages, user profiles, and individual posts. It extracts structured data such as titles, views, scores, tags, media info, and nested comments into a clean JSON dataset.

This project removes the need to scroll and save memes or post metadata by hand, or to rely on incomplete or unofficial endpoints. It is aimed at developers, data scientists, content marketers, and researchers who need detailed Imgur data for trend analysis, dashboards, recommendation systems, or moderation tools.

Imgur content mining at scale

  • Supports keyword-based search with sorting and filtering options across Imgur.
  • Handles tag pages, user galleries, and individual post URLs in a single unified workflow.
  • Optionally fetches full comment threads and replies for deeper engagement analysis.
  • Allows limiting by page range and max items to control crawl scope and cost.
  • Provides extension hooks so you can transform or enrich every record with custom logic.

Features

| Feature | Description |
| --- | --- |
| Keyword search scraping | Search Imgur for any keyword and collect matching posts with sorting and filtering options. |
| Tag page scraping | Crawl posts from one or more tag URLs and build datasets around specific topics or communities. |
| User gallery scraping | Fetch all posts from any public Imgur user profile without artificial limits. |
| Comment extraction | Optionally include full comment threads and replies for each post, including engagement counts. |
| Detailed post metadata | Capture views, favorites, scores, vote counts, tags, media details, and account information. |
| Pagination control | Configure `endPage` and `maxItems` to limit how many pages and items are scraped. |
| Proxy support | Use a proxy configuration to reduce blocking risk and keep scraping stable over long runs. |
| Custom mapping hooks | Use `extendOutputFunction` and `customMapFunction` to attach custom fields or transform records before saving. |
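
The options named above could be combined into one input object. The following is a minimal sketch, not the project's documented schema: `startUrls`, `includeComments`, `endPage`, and `maxItems` are names used in this README, while the `proxy` key, the value shapes, and the `validateInput` helper are assumptions made for illustration. Check `src/config/inputSchema.json` for the real definitions.

```javascript
// Hypothetical input object; verify key names and value shapes against
// src/config/inputSchema.json before relying on it.
const input = {
  // Tag, user, search, and post URLs can be mixed (URL shapes are assumed).
  startUrls: [
    "https://imgur.com/t/funny",
    "https://imgur.com/gallery/JBTJqu2",
  ],
  includeComments: false,    // skip comment threads for faster runs
  endPage: 3,                // stop after 3 result pages per start URL
  maxItems: 100,             // cap on posts for the whole run
  proxy: { useProxy: true }, // assumed key and shape
};

// Minimal sanity check a caller might run before starting a crawl.
function validateInput(cfg) {
  const errors = [];
  if (!Array.isArray(cfg.startUrls) || cfg.startUrls.length === 0) {
    errors.push("startUrls must be a non-empty array");
  }
  for (const key of ["endPage", "maxItems"]) {
    if (cfg[key] !== undefined && !(Number.isInteger(cfg[key]) && cfg[key] > 0)) {
      errors.push(`${key} must be a positive integer`);
    }
  }
  if (cfg.includeComments !== undefined && typeof cfg.includeComments !== "boolean") {
    errors.push("includeComments must be a boolean");
  }
  return errors;
}

console.log(validateInput(input)); // an empty array means no problems found
```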

What Data This Scraper Extracts

| Field Name | Field Description |
| --- | --- |
| `type` | Entity type, e.g., "post" for standard post records. |
| `id` | Unique Imgur post identifier. |
| `accountId` | Unique identifier of the author account. |
| `title` | Post title or headline as shown on Imgur. |
| `description` | Text description or caption of the post (can be empty). |
| `numberOfViews` | Total view count of the post. |
| `numberOfUpvotes` | Number of upvotes received. |
| `numberOfDownvotes` | Number of downvotes received. |
| `numberOfPoints` | Net points score (upvotes minus downvotes or platform-computed score). |
| `numberOfImages` | Number of media items (images/videos) attached to the post. |
| `numberOfComments` | Total number of comments on the post (top-level and replies). |
| `numberOfFavorites` | Number of times the post has been favorited. |
| `virality` | Calculated virality score based on engagement and reach. |
| `score` | Overall score metric used for ranking or "most viral" lists. |
| `isInMostViral` | Boolean indicating whether the post appears in "Most Viral" feeds. |
| `isAlbum` | Boolean flag indicating if the post is an album containing multiple media items. |
| `isMature` | Boolean indicating if the post is marked as mature content. |
| `coverId` | Identifier of the cover image used for the post/album. |
| `createdAt` | ISO timestamp when the post was created. |
| `updatedAt` | ISO timestamp when the post was last updated (may be null). |
| `url` | Canonical URL of the Imgur post/gallery. |
| `platform` | Source platform label, such as "api" or similar. |
| `account` | Nested object with account details for the author. |
| `account.id` | Unique ID of the author account. |
| `account.username` | Author's username. |
| `account.avatarUrl` | URL of the author's avatar image. |
| `account.createdAt` | ISO timestamp when the user account was created. |
| `tags` | Array of tags (strings) assigned to the post. |
| `media` | Array of media objects (images, GIFs, videos) attached to the post. |
| `media.mime_type` | MIME type of the media file (e.g., video/mp4). |
| `media.url` | Direct media URL. |
| `media.ext` | File extension derived from the media type. |
| `media.width` | Media width in pixels. |
| `media.height` | Media height in pixels. |
| `media.size` | File size in bytes. |
| `media.title` | Media title if applicable. |
| `media.description` | Media-specific description if available. |
| `media.isAnimated` | Indicates whether the media is animated (GIF/video). |
| `media.isLooping` | Indicates whether the media is looping. |
| `media.duration` | Duration of the media in seconds (for video). |
| `media.has_sound` | Boolean flag indicating if the media includes audio. |
| `comments` | Array of top-level comment objects with nested replies. |
| `comments.id` | Unique identifier of the comment. |
| `comments.parent_id` | ID of the parent comment (0 for top-level comments). |
| `comments.comment` | Text content of the comment. |
| `comments.account_id` | Account identifier of the commenter. |
| `comments.post_id` | ID of the related post. |
| `comments.upvote_count` | Number of upvotes on the comment. |
| `comments.downvote_count` | Number of downvotes on the comment. |
| `comments.point_count` | Net score of the comment. |
| `comments.vote` | Current vote state if available. |
| `comments.platform_id` | Numeric platform identifier. |
| `comments.platform` | Platform label (e.g., "android"). |
| `comments.created_at` | ISO timestamp when the comment was created. |
| `comments.updated_at` | ISO timestamp when the comment was last updated. |
| `comments.deleted_at` | Timestamp if the comment was deleted (otherwise null). |
| `comments.next` | Reference to next comment chunk/thread if present. |
| `comments.comments` | Array of nested reply comments associated with the parent comment. |

Example Output

Example:

[
  {
    "type": "post",
    "id": "JBTJqu2",
    "accountId": "22440270",
    "title": "holiday",
    "description": "",
    "numberOfViews": 8578,
    "numberOfUpvotes": 23,
    "numberOfDownvotes": 5,
    "numberOfPoints": 18,
    "numberOfImages": 1,
    "numberOfComments": 6,
    "numberOfFavorites": 1,
    "virality": 3475.255272505103,
    "score": 18.5435,
    "isInMostViral": false,
    "isAlbum": true,
    "isMature": false,
    "coverId": "dK9p4A1",
    "createdAt": "2019-07-13T03:20:56Z",
    "updatedAt": null,
    "url": "https://imgur.com/gallery/JBTJqu2",
    "platform": "api",
    "account": {
      "id": "22440270",
      "username": "IsNice",
      "avatarUrl": "https://i.imgur.com/YLvWS5K_d.png?maxwidth=290&fidelity=grand",
      "createdAt": "2015-07-19T10:05:29Z"
    },
    "tags": [
      "storytime",
      "funny",
      "awesome"
    ],
    "media": [
      {
        "mime_type": "video/mp4",
        "url": "https://i.imgur.com/dK9p4A1.mp4",
        "ext": "mp4",
        "width": 960,
        "height": 540,
        "size": 87699,
        "title": "",
        "description": "",
        "isAnimated": true,
        "isLooping": true,
        "duration": 9,
        "has_sound": false
      }
    ],
    "comments": [
      {
        "id": 1681576587,
        "parent_id": 0,
        "comment": "https://youtu.be/q-qqrGtlHkg",
        "account_id": 3708825,
        "post_id": "JBTJqu2",
        "upvote_count": 2,
        "downvote_count": 0,
        "point_count": 2,
        "vote": null,
        "platform_id": 4,
        "platform": "android",
        "created_at": "2019-07-13T03:45:15Z",
        "updated_at": "2019-07-13T04:05:31Z",
        "deleted_at": null,
        "next": null,
        "comments": [
          {
            "id": 1681616815,
            "parent_id": 1681576587,
            "comment": "http://i.imgur.com/D5veJQj.gif",
            "account_id": 85975675,
            "post_id": "JBTJqu2",
            "upvote_count": 2,
            "downvote_count": 0,
            "point_count": 2,
            "vote": null,
            "platform_id": 4,
            "platform": "android",
            "created_at": "2019-07-13T05:28:37Z",
            "updated_at": "2019-07-13T18:44:25Z",
            "deleted_at": null,
            "next": null,
            "comments": []
          }
        ]
      }
    ]
  }
]
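
Because replies are nested inside each comment's own `comments` array, downstream code usually needs a recursive walk to reach every reply. The sketch below flattens one record's comment tree and checks that the sample's points equal upvotes minus downvotes. `flattenComments` is a hypothetical helper name, and the record is a trimmed copy of the example above.

```javascript
// Depth-first walk over the nested `comments` arrays of one post record.
function flattenComments(comments, depth = 0, out = []) {
  for (const c of comments ?? []) {
    out.push({ id: c.id, parentId: c.parent_id, depth, points: c.point_count });
    flattenComments(c.comments, depth + 1, out); // descend into replies
  }
  return out;
}

// Trimmed copy of the example record (only the fields used here).
const post = {
  id: "JBTJqu2",
  numberOfUpvotes: 23,
  numberOfDownvotes: 5,
  numberOfPoints: 18,
  comments: [
    {
      id: 1681576587,
      parent_id: 0,
      point_count: 2,
      comments: [
        { id: 1681616815, parent_id: 1681576587, point_count: 2, comments: [] },
      ],
    },
  ],
};

const flat = flattenComments(post.comments);
console.log(flat.length); // 2 (one top-level comment plus one reply)
console.log(post.numberOfUpvotes - post.numberOfDownvotes === post.numberOfPoints); // true
```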

Directory Structure Tree

imgur-scraper/
├── src/
│   ├── main.js
│   ├── crawler.js
│   ├── extractors/
│   │   ├── postExtractor.js
│   │   ├── commentExtractor.js
│   │   └── mediaExtractor.js
│   ├── mappers/
│   │   ├── extendOutputFunction.example.js
│   │   └── customMapFunction.example.js
│   ├── utils/
│   │   ├── logger.js
│   │   ├── pagination.js
│   │   └── requestQueue.js
│   └── config/
│       └── inputSchema.json
├── data/
│   ├── input.example.json
│   └── sample-output.json
├── tests/
│   ├── crawler.test.js
│   └── mapping.test.js
├── .env.example
├── package.json
├── package-lock.json
├── README.md
└── LICENSE

Use Cases

  • Content researchers use it to collect large volumes of memes and visual posts, so they can analyze cultural trends, formats, and engagement over time.
  • Data scientists use it to build training datasets of images, GIFs, and associated text, so they can power recommendation models or computer vision projects.
  • Marketing teams use it to monitor how specific tags, campaigns, or topics perform on Imgur, so they can refine creative strategies and targeting.
  • Community managers use it to audit user-generated content and comment behavior, so they can detect spam, abuse, or high-value contributors.
  • Developers use it to integrate Imgur post data into dashboards, automation scripts, or internal tools without relying on brittle manual export methods.

FAQs

Q1: What types of Imgur URLs are supported? This scraper supports search result URLs, tag pages, user profile/gallery URLs, and individual post/gallery URLs. You can mix them freely in startUrls, and the tool will automatically detect and handle each type.
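
One way such automatic detection could work is to classify each start URL by its path. This is only a sketch: the path shapes (`/search`, `/t/<tag>`, `/user/<name>`, `/gallery/<id>`, `/a/<id>`) and the `classifyImgurUrl` helper are assumptions, not the project's actual routing code.

```javascript
// Classify a start URL by its first path segment.
// Assumed URL shapes: /search?q=..., /t/<tag>, /user/<name>, /gallery/<id>, /a/<id>.
function classifyImgurUrl(rawUrl) {
  const url = new URL(rawUrl);
  if (!/(^|\.)imgur\.com$/.test(url.hostname)) return "unsupported";
  const [first, second] = url.pathname.split("/").filter(Boolean);
  if (first === "search") return "search";
  if (first === "t" && second) return "tag";
  if (first === "user" && second) return "user";
  if ((first === "gallery" || first === "a") && second) return "post";
  return "unsupported";
}

for (const u of [
  "https://imgur.com/search?q=cats",
  "https://imgur.com/t/funny",
  "https://imgur.com/user/IsNice",
  "https://imgur.com/gallery/JBTJqu2",
]) {
  console.log(classifyImgurUrl(u), u);
}
```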

Q2: Do I have to enable comment scraping? No. Comment scraping is controlled by the includeComments boolean flag. When set to true, the scraper will fetch all available comments and replies for each post, which increases runtime and resource usage. If you only need post-level metrics, leave it set to false for faster runs.

Q3: How can I limit how many posts are collected? You can use the endPage parameter to stop after a certain number of result pages per URL and the maxItems parameter to cap the total number of posts across the whole run. Combining both lets you fine-tune scope and avoid overshooting your target volume.
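
The interaction of the two limits can be illustrated with a simulated crawl loop: `endPage` bounds the pages read per start URL, and `maxItems` bounds the total across the run. `crawlWithLimits` and `fetchPage` are hypothetical names, and the loop is synchronous to keep the sketch short; a real crawler would fetch pages asynchronously.

```javascript
// `fetchPage(url, page)` is a hypothetical stand-in that returns an array of
// posts, or an empty array once a URL has no more results.
function crawlWithLimits(startUrls, fetchPage, { endPage = Infinity, maxItems = Infinity } = {}) {
  const items = [];
  for (const url of startUrls) {
    for (let page = 1; page <= endPage; page++) {
      if (items.length >= maxItems) return items; // global cap reached
      const posts = fetchPage(url, page);
      if (posts.length === 0) break; // no more results for this URL
      // Take only as many posts as the global cap still allows.
      items.push(...posts.slice(0, maxItems - items.length));
    }
  }
  return items;
}

// Fake source: every page of every URL yields 10 posts.
const fakeFetch = (url, page) =>
  Array.from({ length: 10 }, (_, i) => `${url}#p${page}-${i}`);

console.log(crawlWithLimits(["a", "b"], fakeFetch, { endPage: 3 }).length); // 60
console.log(crawlWithLimits(["a", "b"], fakeFetch, { endPage: 3, maxItems: 25 }).length); // 25
```

With only `endPage` set, the total grows with the number of start URLs; adding `maxItems` gives a hard ceiling regardless of how many URLs are supplied.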

Q4: Can I customize the output format? Yes. With extendOutputFunction and customMapFunction, you can inject custom JavaScript that receives each page or item and returns a transformed object. This makes it easy to normalize fields, add derived metrics, or drop unnecessary properties before saving.
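
As an illustration, a mapping function might keep a few fields and add derived metrics. The calling convention here is an assumption: the sketch supposes `customMapFunction` receives one post record and that its return value replaces the record. See `src/mappers/customMapFunction.example.js` for the signature the project actually uses.

```javascript
// Hypothetical mapper: assumes one call per post record, with the returned
// object stored in place of the original record.
function customMapFunction(post) {
  const votes = post.numberOfUpvotes + post.numberOfDownvotes;
  return {
    id: post.id,
    title: post.title,
    url: post.url,
    tags: post.tags,
    // Derived metrics (null when the denominator would be zero).
    upvoteRatio: votes > 0 ? post.numberOfUpvotes / votes : null,
    viewsPerPoint: post.numberOfPoints > 0 ? post.numberOfViews / post.numberOfPoints : null,
    hasVideo: (post.media ?? []).some((m) => (m.mime_type ?? "").startsWith("video/")),
  };
}

// Trimmed copy of the example record from this README.
const mapped = customMapFunction({
  id: "JBTJqu2",
  title: "holiday",
  url: "https://imgur.com/gallery/JBTJqu2",
  tags: ["storytime", "funny", "awesome"],
  numberOfViews: 8578,
  numberOfUpvotes: 23,
  numberOfDownvotes: 5,
  numberOfPoints: 18,
  media: [{ mime_type: "video/mp4" }],
});

console.log(mapped.upvoteRatio.toFixed(3)); // "0.821"
console.log(mapped.hasVideo); // true
```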


Performance Benchmarks and Results

Primary Metric: On a typical mid-range server with a stable network connection, the scraper can collect around 100 detailed Imgur posts (including media metadata) in roughly 2 minutes when comment scraping is disabled.

Reliability Metric: With a properly configured proxy setup and reasonable throttling, success rates above 95% per run are common, even when crawling multiple pages of search results and user galleries.

Efficiency Metric: By prioritizing lightweight list pages first and then scheduling detail requests, the scraper maintains high throughput while keeping average memory and CPU usage modest, suitable for continuous or scheduled runs.
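
The list-first ordering described here can be sketched as a two-tier queue in which list requests are always served before detail requests. `TieredQueue` and the `kind` field are hypothetical; this is not necessarily how `src/utils/requestQueue.js` is implemented.

```javascript
// Two-tier queue: list requests are always dequeued before detail requests,
// and each tier keeps first-in, first-out order.
class TieredQueue {
  constructor() {
    this.lists = [];
    this.details = [];
  }

  add(request) {
    (request.kind === "list" ? this.lists : this.details).push(request);
  }

  // Returns the next request, or null when both tiers are empty.
  next() {
    return this.lists.shift() ?? this.details.shift() ?? null;
  }
}

const queue = new TieredQueue();
queue.add({ kind: "detail", id: "A" });
queue.add({ kind: "list", id: "page-1" });
queue.add({ kind: "detail", id: "B" });
queue.add({ kind: "list", id: "page-2" });

const order = [];
for (let r = queue.next(); r !== null; r = queue.next()) order.push(r.id);
console.log(order.join(", ")); // page-1, page-2, A, B
```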

Quality Metric: For public posts, the scraper consistently retrieves complete metadata, tags, and media information for more than 98% of successfully requested items, providing a robust foundation for analytics and machine learning pipelines.

