Imgur Scraper lets you search, crawl, and export millions of posts, memes, and image/video galleries from Imgur with detailed metadata. It supports keyword search, tags, user galleries, and comment threads so you can build rich datasets for analytics, content research, or automation workflows. Use this Imgur scraper to reliably collect structured information at scale without dealing with fragile manual copy-paste.
Created by Bitbash to showcase our approach to scraping and automation.
Imgur Scraper is a high-performance data collection tool that automates browsing through Imgur search results, tag pages, user profiles, and individual posts. It extracts structured data such as titles, views, scores, tags, media info, and nested comments into a clean JSON dataset.
This project solves the pain of manually scrolling and saving memes or post metadata, or relying on incomplete/unofficial endpoints. It is ideal for developers, data scientists, content marketers, and researchers who need detailed Imgur data for trend analysis, dashboards, recommendation systems, or moderation tools.
- Supports keyword-based search with sorting and filtering options across Imgur.
- Handles tag pages, user galleries, and individual post URLs in a single unified workflow.
- Optionally fetches full comment threads and replies for deeper engagement analysis.
- Allows limiting by page range and max items to control crawl scope and cost.
- Provides extension hooks so you can transform or enrich every record with custom logic.
| Feature | Description |
|---|---|
| Keyword search scraping | Search Imgur for any keyword and collect matching posts with sorting and filtering options. |
| Tag page scraping | Crawl posts from one or more tag URLs and build datasets around specific topics or communities. |
| User gallery scraping | Fetch all posts from any public Imgur user profile without artificial limits. |
| Comment extraction | Optionally include full comment threads and replies for each post, including engagement counts. |
| Detailed post metadata | Capture views, favorites, scores, vote counts, tags, media details, and account information. |
| Pagination control | Configure endPage and maxItems to limit how many pages and items are scraped. |
| Proxy support | Use a proxy configuration to reduce blocking risk and keep scraping stable over long runs. |
| Custom mapping hooks | Use extendOutputFunction and customMapFunction to attach custom fields or transform records before saving. |
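
The options named in the table above can be combined into a single run input. The following is a minimal sketch of how such an input might be validated before a crawl starts. The real schema lives in src/config/inputSchema.json and is not reproduced here, so the default values, the `normalizeInput` helper, and the example URLs are assumptions for illustration. The proxy option is left out because its field name is not documented in this README.

```javascript
// Hypothetical defaults; the real ones are defined in src/config/inputSchema.json.
const DEFAULTS = {
  startUrls: [],
  endPage: 1,
  maxItems: 100,
  includeComments: false,
};

// Merge user input over the defaults and reject obviously invalid values.
function normalizeInput(raw) {
  const input = { ...DEFAULTS, ...raw };
  if (!Array.isArray(input.startUrls) || input.startUrls.length === 0) {
    throw new Error('startUrls must be a non-empty array of Imgur URLs');
  }
  for (const key of ['endPage', 'maxItems']) {
    if (!Number.isInteger(input[key]) || input[key] < 1) {
      throw new Error(`${key} must be a positive integer`);
    }
  }
  input.includeComments = Boolean(input.includeComments);
  return input;
}

// Example URLs are illustrative; any supported Imgur URL type can be used.
const input = normalizeInput({
  startUrls: ['https://imgur.com/t/funny', 'https://imgur.com/user/IsNice'],
  maxItems: 50,
});
console.log(input);
```

Validating up front means a typo such as a negative `maxItems` fails immediately instead of partway through a long crawl.
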
| Field Name | Field Description |
|---|---|
| type | Entity type, e.g., "post" for standard post records. |
| id | Unique Imgur post identifier. |
| accountId | Unique identifier of the author account. |
| title | Post title or headline as shown on Imgur. |
| description | Text description or caption of the post (can be empty). |
| numberOfViews | Total view count of the post. |
| numberOfUpvotes | Number of upvotes received. |
| numberOfDownvotes | Number of downvotes received. |
| numberOfPoints | Net points score (upvotes minus downvotes or platform-computed score). |
| numberOfImages | Number of media items (images/videos) attached to the post. |
| numberOfComments | Total number of comments on the post (top-level and replies). |
| numberOfFavorites | Number of times the post has been favorited. |
| virality | Calculated virality score based on engagement and reach. |
| score | Overall score metric used for ranking or “most viral” lists. |
| isInMostViral | Boolean indicating whether the post appears in “Most Viral” feeds. |
| isAlbum | Boolean flag indicating if the post is an album containing multiple media items. |
| isMature | Boolean indicating if the post is marked as mature content. |
| coverId | Identifier of the cover image used for the post/album. |
| createdAt | ISO timestamp when the post was created. |
| updatedAt | ISO timestamp when the post was last updated (may be null). |
| url | Canonical URL of the Imgur post/gallery. |
| platform | Source platform label, such as "api" or similar. |
| account | Nested object with account details for the author. |
| account.id | Unique ID of the author account. |
| account.username | Author’s username. |
| account.avatarUrl | URL of the author’s avatar image. |
| account.createdAt | ISO timestamp when the user account was created. |
| tags | Array of tags (strings) assigned to the post. |
| media | Array of media objects (images, GIFs, videos) attached to the post. |
| media.mime_type | MIME type of the media file (e.g., video/mp4). |
| media.url | Direct media URL. |
| media.ext | File extension derived from the media type. |
| media.width | Media width in pixels. |
| media.height | Media height in pixels. |
| media.size | File size in bytes. |
| media.title | Media title if applicable. |
| media.description | Media-specific description if available. |
| media.isAnimated | Indicates whether the media is animated (GIF/video). |
| media.isLooping | Indicates whether the media is looping. |
| media.duration | Duration of the media in seconds (for video). |
| media.has_sound | Boolean flag indicating if the media includes audio. |
| comments | Array of top-level comment objects with nested replies. |
| comments.id | Unique identifier of the comment. |
| comments.parent_id | ID of the parent comment (0 for top-level comments). |
| comments.comment | Text content of the comment. |
| comments.account_id | Account identifier of the commenter. |
| comments.post_id | ID of the related post. |
| comments.upvote_count | Number of upvotes on the comment. |
| comments.downvote_count | Number of downvotes on the comment. |
| comments.point_count | Net score of the comment. |
| comments.vote | Current vote state if available. |
| comments.platform_id | Numeric platform identifier. |
| comments.platform | Platform label (e.g., "android"). |
| comments.created_at | ISO timestamp when the comment was created. |
| comments.updated_at | ISO timestamp when the comment was last updated. |
| comments.deleted_at | Timestamp if the comment was deleted (otherwise null). |
| comments.next | Reference to next comment chunk/thread if present. |
| comments.comments | Array of nested reply comments associated with the parent comment. |

Example:

    [
      {
        "type": "post",
        "id": "JBTJqu2",
        "accountId": "22440270",
        "title": "holiday",
        "description": "",
        "numberOfViews": 8578,
        "numberOfUpvotes": 23,
        "numberOfDownvotes": 5,
        "numberOfPoints": 18,
        "numberOfImages": 1,
        "numberOfComments": 6,
        "numberOfFavorites": 1,
        "virality": 3475.255272505103,
        "score": 18.5435,
        "isInMostViral": false,
        "isAlbum": true,
        "isMature": false,
        "coverId": "dK9p4A1",
        "createdAt": "2019-07-13T03:20:56Z",
        "updatedAt": null,
        "url": "https://imgur.com/gallery/JBTJqu2",
        "platform": "api",
        "account": {
          "id": "22440270",
          "username": "IsNice",
          "avatarUrl": "https://i.imgur.com/YLvWS5K_d.png?maxwidth=290&fidelity=grand",
          "createdAt": "2015-07-19T10:05:29Z"
        },
        "tags": [
          "storytime",
          "funny",
          "awesome"
        ],
        "media": [
          {
            "mime_type": "video/mp4",
            "url": "https://i.imgur.com/dK9p4A1.mp4",
            "ext": "mp4",
            "width": 960,
            "height": 540,
            "size": 87699,
            "title": "",
            "description": "",
            "isAnimated": true,
            "isLooping": true,
            "duration": 9,
            "has_sound": false
          }
        ],
        "comments": [
          {
            "id": 1681576587,
            "parent_id": 0,
            "comment": "https://youtu.be/q-qqrGtlHkg",
            "account_id": 3708825,
            "post_id": "JBTJqu2",
            "upvote_count": 2,
            "downvote_count": 0,
            "point_count": 2,
            "vote": null,
            "platform_id": 4,
            "platform": "android",
            "created_at": "2019-07-13T03:45:15Z",
            "updated_at": "2019-07-13T04:05:31Z",
            "deleted_at": null,
            "next": null,
            "comments": [
              {
                "id": 1681616815,
                "parent_id": 1681576587,
                "comment": "http://i.imgur.com/D5veJQj.gif",
                "account_id": 85975675,
                "post_id": "JBTJqu2",
                "upvote_count": 2,
                "downvote_count": 0,
                "point_count": 2,
                "vote": null,
                "platform_id": 4,
                "platform": "android",
                "created_at": "2019-07-13T05:28:37Z",
                "updated_at": "2019-07-13T18:44:25Z",
                "deleted_at": null,
                "next": null,
                "comments": []
              }
            ]
          }
        ]
      }
    ]
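
Because replies are nested inside each comment's `comments` array, downstream tools that expect flat rows (CSV exports, SQL tables) usually need the tree unrolled first. The sketch below shows one way to do that with a depth-first walk. The `flattenComments` helper and the output field names (`parentId`, `depth`, `text`, `points`) are illustrative and not part of the scraper itself; the input data is a trimmed copy of the comment tree in the example record.

```javascript
// Walk a comment tree depth-first and return one flat row per comment.
function flattenComments(comments, depth = 0) {
  const rows = [];
  for (const c of comments ?? []) {
    rows.push({
      id: c.id,
      parentId: c.parent_id,
      depth, // 0 for top-level comments, 1 for direct replies, and so on
      text: c.comment,
      points: c.point_count,
    });
    // Replies are appended right after their parent, preserving thread order.
    rows.push(...flattenComments(c.comments, depth + 1));
  }
  return rows;
}

// Trimmed copy of the comment tree from the example record.
const sampleComments = [
  {
    id: 1681576587,
    parent_id: 0,
    comment: 'https://youtu.be/q-qqrGtlHkg',
    point_count: 2,
    comments: [
      {
        id: 1681616815,
        parent_id: 1681576587,
        comment: 'http://i.imgur.com/D5veJQj.gif',
        point_count: 2,
        comments: [],
      },
    ],
  },
];

const rows = flattenComments(sampleComments);
// Print each comment id, indented by its depth in the thread.
console.log(rows.map((r) => `${'  '.repeat(r.depth)}${r.id}`).join('\n'));
```

Keeping `depth` and `parentId` on every row means the thread can be rebuilt later if needed.
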

Directory structure:

    imgur-scraper/
    ├── src/
    │   ├── main.js
    │   ├── crawler.js
    │   ├── extractors/
    │   │   ├── postExtractor.js
    │   │   ├── commentExtractor.js
    │   │   └── mediaExtractor.js
    │   ├── mappers/
    │   │   ├── extendOutputFunction.example.js
    │   │   └── customMapFunction.example.js
    │   ├── utils/
    │   │   ├── logger.js
    │   │   ├── pagination.js
    │   │   └── requestQueue.js
    │   └── config/
    │       └── inputSchema.json
    ├── data/
    │   ├── input.example.json
    │   └── sample-output.json
    ├── tests/
    │   ├── crawler.test.js
    │   └── mapping.test.js
    ├── .env.example
    ├── package.json
    ├── package-lock.json
    ├── README.md
    └── LICENSE

- Content researchers use it to collect large volumes of memes and visual posts, so they can analyze cultural trends, formats, and engagement over time.
- Data scientists use it to build training datasets of images, GIFs, and associated text, so they can power recommendation models or computer vision projects.
- Marketing teams use it to monitor how specific tags, campaigns, or topics perform on Imgur, so they can refine creative strategies and targeting.
- Community managers use it to audit user-generated content and comment behavior, so they can detect spam, abuse, or high-value contributors.
- Developers use it to integrate Imgur post data into dashboards, automation scripts, or internal tools without relying on brittle manual export methods.
Q1: What types of Imgur URLs are supported?
This scraper supports search result URLs, tag pages, user profile/gallery URLs, and individual post/gallery URLs. You can mix them freely in startUrls, and the tool will automatically detect and handle each type.
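
To illustrate what automatic detection could look like, here is a minimal sketch that classifies a URL by its first path segment. The path prefixes (`/search`, `/t/`, `/user/`, `/gallery/`, `/a/`) and the `classifyImgurUrl` helper are assumptions for illustration; only the `/gallery/` form appears in this README's sample output, and the scraper's actual detection logic may differ.

```javascript
// Assumed path prefixes for each supported URL type; adjust to match the real crawler.
function classifyImgurUrl(rawUrl) {
  const { hostname, pathname } = new URL(rawUrl); // throws on malformed URLs
  if (hostname !== 'imgur.com' && !hostname.endsWith('.imgur.com')) {
    return 'unsupported';
  }
  const [first, second] = pathname.split('/').filter(Boolean);
  if (first === 'search') return 'search';
  if (first === 't' && second) return 'tag';
  if (first === 'user' && second) return 'user';
  if ((first === 'gallery' || first === 'a') && second) return 'post';
  return 'unsupported';
}

const examples = [
  'https://imgur.com/search?q=holiday',
  'https://imgur.com/t/funny',
  'https://imgur.com/user/IsNice',
  'https://imgur.com/gallery/JBTJqu2',
];
for (const url of examples) {
  console.log(`${classifyImgurUrl(url).padEnd(8)} ${url}`);
}
```
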
Q2: Do I have to enable comment scraping?
No. Comment scraping is controlled by the includeComments boolean flag. When set to true, the scraper will fetch all available comments and replies for each post, which increases runtime and resource usage. If you only need post-level metrics, leave it set to false for faster runs.
Q3: How can I limit how many posts are collected?
You can use the endPage parameter to stop after a certain number of result pages per URL and the maxItems parameter to cap the total number of posts across the whole run. Combining both lets you fine-tune scope and avoid overshooting your target volume.
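
The interaction between the two limits can be sketched as a nested loop: `endPage` bounds the inner, per-URL page loop, while `maxItems` is checked against a single counter shared by the whole run. In this sketch `collectPosts` and `getPage` are hypothetical names, page numbering is assumed to start at 1, and the fetcher is synchronous to keep the example short (a real crawler would await network requests).

```javascript
// getPage(url, page) is a hypothetical stand-in for the real page fetcher.
// It returns an array of posts, or an empty array when there are no more results.
function collectPosts(startUrls, getPage, { endPage, maxItems }) {
  const collected = [];
  for (const url of startUrls) {
    for (let page = 1; page <= endPage; page += 1) {
      const posts = getPage(url, page);
      if (posts.length === 0) break; // this URL is exhausted
      for (const post of posts) {
        collected.push(post);
        if (collected.length >= maxItems) return collected; // global cap reached
      }
    }
  }
  return collected;
}

// Fake fetcher: every page of every URL yields 3 posts.
const fakeGetPage = (url, page) =>
  [1, 2, 3].map((n) => ({ id: `${url}#${page}-${n}` }));

const limitedByPages = collectPosts(['tagA', 'tagB'], fakeGetPage, { endPage: 2, maxItems: 100 });
const limitedByItems = collectPosts(['tagA', 'tagB'], fakeGetPage, { endPage: 2, maxItems: 5 });
console.log(limitedByPages.length, limitedByItems.length); // 12 5
```

In the first call the page limit is the binding constraint (2 URLs × 2 pages × 3 posts); in the second, the run stops as soon as the fifth post is collected.
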
Q4: Can I customize the output format?
Yes. With extendOutputFunction and customMapFunction, you can inject custom JavaScript that receives each page or item and returns a transformed object. This makes it easy to normalize fields, add derived metrics, or drop unnecessary properties before saving.
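
As a hedged illustration, a mapping function might look like the sketch below. The exact signature expected by the scraper is not documented here, so this assumes the function receives one scraped record and returns the object to save. The `upvoteRatio` and `tagCount` fields are invented derived metrics; the input values come from the example record.

```javascript
// Assumed signature: receives one scraped record and returns the object to save.
const customMapFunction = (record) => {
  const votes = record.numberOfUpvotes + record.numberOfDownvotes;
  return {
    id: record.id,
    title: record.title,
    url: record.url,
    views: record.numberOfViews,
    // Derived metric: share of votes that are upvotes (null when there are no votes).
    upvoteRatio: votes > 0 ? record.numberOfUpvotes / votes : null,
    tagCount: (record.tags ?? []).length,
  };
};

const mapped = customMapFunction({
  id: 'JBTJqu2',
  title: 'holiday',
  description: '',
  url: 'https://imgur.com/gallery/JBTJqu2',
  numberOfViews: 8578,
  numberOfUpvotes: 23,
  numberOfDownvotes: 5,
  tags: ['storytime', 'funny', 'awesome'],
});
console.log(mapped);
```

Returning a new object rather than mutating the record drops every property that is not explicitly listed, which here removes `description` and the raw vote counts.
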
Primary Metric: On a typical mid-range server with a stable network connection, the scraper can collect around 100 detailed Imgur posts (including media metadata) in roughly 2 minutes when comment scraping is disabled.
Reliability Metric: With a properly configured proxy setup and reasonable throttling, success rates above 95% per run are common, even when crawling multiple pages of search results and user galleries.
Efficiency Metric: By prioritizing lightweight list pages first and then scheduling detail requests, the scraper maintains high throughput while keeping average memory and CPU usage modest, suitable for continuous or scheduled runs.
Quality Metric: For public posts, the scraper consistently retrieves complete metadata, tags, and media information for more than 98% of successfully requested items, providing a robust foundation for analytics and machine learning pipelines.
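
The list-first scheduling mentioned under the Efficiency Metric can be pictured as a queue with two tiers. The sketch below is not the contents of src/utils/requestQueue.js; `TwoTierQueue` and the `kind` field are hypothetical names used only to show the ordering idea.

```javascript
// Minimal two-tier queue: list pages are always served before detail pages.
class TwoTierQueue {
  constructor() {
    this.list = [];
    this.detail = [];
  }

  add(request) {
    (request.kind === 'list' ? this.list : this.detail).push(request);
  }

  // Returns the next request, or null when both tiers are empty.
  next() {
    return this.list.shift() ?? this.detail.shift() ?? null;
  }

  get size() {
    return this.list.length + this.detail.length;
  }
}

const queue = new TwoTierQueue();
queue.add({ kind: 'detail', url: 'https://imgur.com/gallery/JBTJqu2' });
queue.add({ kind: 'list', url: 'https://imgur.com/t/funny' });
queue.add({ kind: 'list', url: 'https://imgur.com/t/awesome' });

const order = [];
while (queue.size > 0) order.push(queue.next().kind);
console.log(order.join(' -> ')); // list -> list -> detail
```

Draining list pages first lets the crawler discover the set of posts early, so limits such as `maxItems` can be applied before expensive detail requests are issued.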
