A 'smol' program that crawls following/followers/statuses count data from a Twitter account profile page using Selenium, and puts the crawled data into a MySQL database using PyMySQL.
The purpose of this program is to record the follower count daily and see how it changes over time. MAYBE THIS IS NOT PRODUCTION-READY, so use this with caution!
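Since the counts are scraped from the profile page rather than an API, they arrive as display strings (e.g. "1,234" or abbreviated forms like "12.5K"), which need normalizing before they can go into unsigned int columns. The helper below is an illustrative sketch, not the project's actual code:

```python
# Hedged sketch: normalize a displayed Twitter count string to an int.
# The "K"/"M" abbreviations are an assumption about how the profile
# page renders large counts; this is not taken from the project source.
def parse_count(text: str) -> int:
    """Convert a displayed count such as '1,234' or '12.5K' to an int."""
    text = text.strip().replace(",", "")
    multipliers = {"K": 1_000, "M": 1_000_000}
    suffix = text[-1].upper() if text else ""
    if suffix in multipliers:
        return int(float(text[:-1]) * multipliers[suffix])
    return int(text)

# parse_count("1,234")  -> 1234
# parse_count("12.5K")  -> 12500
```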
YES, I HAD. But one day Twitter suspended my API application, even though I didn't overuse or abuse it! This is probably an Elon thing.
The source code of the original implementation, which accesses the Twitter API through python-twitter, is stored in the `old` branch.
A `Dockerfile` is ready in both the current and the old (original) source tree.
To build:
$ cd <root-directory-of-source>
$ docker build -t twitter-account-data-crawler:latest .
After the build, run:
$ docker run -d \
--name twitter-account-data-crawler \
-v <path-of-config.yaml>:/app/config/config.yaml \
twitter-account-data-crawler
You have to prepare a configuration file (`config.yaml`). Please refer to the example config file and create your own.
If you're using Podman, just replace `docker` with `podman` in the command line.
You may still run the program without Docker or other OCI-compliant runtimes.
To get this working:
$ cd <root-directory-of-source>
# Install requirements
$ pip install -r requirements.txt
# and run!
$ python index.py
The configuration file (`config.yaml`) should exist in the `config` folder.
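The bundled example config defines the real keys this project expects; purely as an illustration of what a daily crawler like this typically needs, a config might look something like the sketch below (all keys here are hypothetical, not the project's actual schema):

```yaml
# Hypothetical illustration only -- check the example config file
# shipped with the repository for the real keys.
database:
  host: localhost
  user: crawler
  password: secret
  database: twitter_stats
targets:
  - screen_name: some_account
    table: account_track_table
```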
Currently only MySQL (and probably MySQL-based DBMSes like MariaDB) is supported.
Creating one table per target account is recommended.
The table should have at least these columns:
- `date`: type date
- `following_count`: type int, unsigned
- `follower_count`: type int, unsigned
- `tweet_count`: type int, unsigned
An example SQL statement that creates a table with these columns:
CREATE TABLE `account_track_table` (
`date` date NOT NULL,
`following_count` int UNSIGNED NOT NULL,
`follower_count` int UNSIGNED NOT NULL,
`tweet_count` int UNSIGNED NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
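For reference, here is a minimal sketch (not the project's actual code) of how one daily snapshot could be written into the example table with PyMySQL. The table and column names match the `CREATE TABLE` statement above; the connection parameters in the usage comment are placeholders:

```python
# Hedged sketch of inserting one daily row via PyMySQL.
# Requires a live MySQL server to actually run store_counts().
import datetime

INSERT_SQL = (
    "INSERT INTO `account_track_table` "
    "(`date`, `following_count`, `follower_count`, `tweet_count`) "
    "VALUES (%s, %s, %s, %s)"
)

def build_row(following, followers, tweets, day=None):
    """Build the parameter tuple for one daily snapshot."""
    day = day or datetime.date.today()
    return (day, following, followers, tweets)

def store_counts(connection, following, followers, tweets):
    """Insert one day's counts using an open PyMySQL connection."""
    with connection.cursor() as cursor:
        cursor.execute(INSERT_SQL, build_row(following, followers, tweets))
    connection.commit()

# Usage (placeholder credentials, needs a reachable MySQL server):
# import pymysql
# conn = pymysql.connect(host="localhost", user="user",
#                        password="secret", database="twitter_stats")
# store_counts(conn, 123, 456, 7890)
```

Using parameterized `%s` placeholders (rather than string formatting) lets PyMySQL escape the values and keeps the insert safe.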