RedditArchiver is a web application that lets you download Reddit submissions to your device so you can keep an archival copy. Each downloaded submission is saved as an HTML file with formatting and navigation features (see example.html for a sample).
ℹ️ Looking for a version of RedditArchiver without the web server, consisting only of a Python script? Check out RedditArchiver-standalone.
When someone requests a submission, the application uses the Reddit API in the background to fetch the full content of the thread. The content is then written to an HTML file and served to the requesting user.
Each user interacting with the Reddit API is subject to a rate limit on how many submissions and comments they may read. If every request were made under RedditArchiver's own identity, this limit would be reached quickly.
Therefore, RedditArchiver reads Reddit on behalf of the requesting user, and not on its own behalf, to avoid running into rate limiting issues. Please note that, when you allow RedditArchiver to "read through your account", you only allow it to read public submissions on your behalf: RedditArchiver is not able to see your upvoted/saved posts, your user information, or your password, because the "read" permission does not cover this. It is not even able to see your Reddit username.
For configuration reference, see CONFIG.md.
If you want to host your own instance on a small server of your own (and you should!), here is a guide on how to proceed. It may require some small changes depending on your exact environment. First, install the required system packages:
# apt install python3-pip python3-venv
It is strongly advised to create a system user dedicated to the app:
# adduser redditarchiver
Go to https://www.reddit.com/prefs/apps and create a new app. The type should be "web app" and the redirect URI should be <your endpoint>/token. For example, if your app is going to be accessible at https://redditarchiver.example.com, the redirect URI should be https://redditarchiver.example.com/token. (You are not required to expose the app on the Internet; a private IP address also works.)
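The redirect URI rule can be sketched as a small helper that appends /token to the base URL of your instance. The function name redirect_uri_for and the private address used below are made up for illustration. The value registered on Reddit should match the one your instance uses character for character.

```python
def redirect_uri_for(endpoint):
    """Return the redirect URI for an instance reachable at `endpoint`.

    Hypothetical helper: strips any trailing slash, then appends /token.
    """
    return endpoint.rstrip("/") + "/token"


# Public hostname taken from the example in the guide.
print(redirect_uri_for("https://redditarchiver.example.com"))
# A private address (made-up value) is handled the same way.
print(redirect_uri_for("http://192.168.1.10/"))
```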
Take note of the app ID and the app secret.
Copy the entire contents of src into a directory on your server.
Make sure the running user has write access to the directories data, logs, output and run, and can execute run.sh:
# chown redditarchiver: . -R
# chmod 444 * -R
# chmod 544 static templates -R
# chmod 744 run.sh data run logs output -R
Switch to the dedicated user, create a Python virtualenv and activate it:
# su redditarchiver
$ python3 -m venv env
$ source env/bin/activate
Install Python dependencies:
(env) $ pip install -r requirements.txt
Switch back to root.
Edit the config file (config.yml.example), then rename it to config.yml.
Copy redditarchiver.service into /etc/systemd/system. Do not forget to change the directories and users in the file.
Then reload systemd, and enable and start the service:
# systemctl daemon-reload
# systemctl enable redditarchiver
# systemctl start redditarchiver
The app needs a reverse proxy (such as Apache or nginx) in front of it to work. For example, you may use the following Apache configuration:
<Location />
ProxyPass unix:/srv/redditarchiver/gunicorn.sock|http://127.0.0.1/
ProxyPassReverse unix:/srv/redditarchiver/gunicorn.sock|http://127.0.0.1/
</Location>
Do not forget to enable mod_proxy and mod_proxy_http on Apache.
You also need to give the Apache user appropriate rights on the run directory:
# chgrp www-data run
# chmod 2774 run -R
Restart both Apache and RedditArchiver.
This software is licensed under the MIT license.