A bluesky self-hosting tool for easy deployment anywhere.

itaru2622/bluesky-selfhost-env

https://github.com/itaru2622/bluesky-selfhost-env

This repository aims to make self-hosting a bluesky environment easy, with:

  • Configurable hosting domain: easily tuned by environment variable (${DOMAIN}).
  • Reproducibility: full disclosure of all configurations and operations, including reverse proxy rules and patches to the original code of bluesky-social.
  • Simplicity: all bluesky components run on one host, powered by docker-compose.
  • Minimal remapping: the simplest possible mapping rules between FQDN, reverse proxy, and docker-container, for easy understanding and tuning.

Currently, my latest release is 2025-04-08, based on the 2025-04-08 code from bluesky-social.

As shown below, most features work as expected in the self-hosting environment.
Unfortunately, some features may not work correctly; the reasons are described in bluesky-social/atproto#2334.

Test results with 'asof-2024-06-02' and later:

  • ok: Create account on pds (via social-app, bluesky API).
  • ok: Basic usages on social-app
    • ok: Sign in, edit profile, post/repost articles, search posts/users/feeds, like posts, follow users.
    • ok: Receive notifications when others like your posts or follow you.
    • ok: Subscribe/unsubscribe to labeler in profile page.
    • ok: Report to labeler for any post.
    • not yet: DM(chat) with others.
  • ok: Integrate with feed-generator. NOTE: feeds appear after some delay; reload social-app to see them.
  • ok: Moderate with ozone.
    • ok: Sign in and configure labels on ozone-UI.
    • ok: Receive the report sent by user.
    • ok: Assign label to the post/account on ozone-UI, then events published to subscribeLabels.
    • ok: The view of post changes on social-app when using workaround tool.
  • ok: Subscribe to events from pds/bgs(relay)/ozone by firehose/websocket.
  • ok: Subscribe to events from jetstream, since 2024-10-19r1
  • not yet: Others.

The following operations assume that the self-hosting domain is mysky.local.com (defined in Makefile).
You can change the domain name by setting the environment variable as follows:

### <a id="ops0-configparams"/>0) Configure parameters and install tools

```bash
# 1) Set domain name for self-hosting bluesky
export DOMAIN=whatever.yourdomain.com

# 2) Set 'asof' date (YYYY-MM-DD or 'latest') to select docker images and sources.
#    Example: 2025-04-08 (latest prebuild) or 'latest' (following docker image naming).
export asof=2025-04-08

# 3) Set email addresses:

# 3-1) EMAIL4CERTS: for Let's Encrypt certificate signing.
export EMAIL4CERTS=your@mail.address
# Alternatively, set the reserved value 'internal' to use self-signed certificates and avoid rate limits during setup (this later assignment overrides the one above).
export EMAIL4CERTS=internal

# 3-2) PDS_EMAIL_SMTP_URL: for PDS (e.g., smtps://youraccount:your-app-password@smtp.gmail.com)
export PDS_EMAIL_SMTP_URL=smtps://

# 3-3) FEEDGEN_EMAIL: for feed-generator account.
export FEEDGEN_EMAIL=feedgen@example.com

## Install required tools (if missing).
apt install -y make pwgen
(cd ops-helper/apiImpl ; npm install)
(sudo curl -o /usr/local/bin/websocat -L https://github.com/vi/websocat/releases/download/v1.13.0/websocat.x86_64-unknown-linux-musl; sudo chmod a+x /usr/local/bin/websocat)

# 4) Check configuration.
make echo

# 5) Generate and check container secrets.
make genSecrets
```
  1. Create DNS A-Records in your self-hosting network.

At a minimum, you will need the following two A-Records.
Refer to the appendix for a sample DNS server (bind9) configuration.

     -    ${DOMAIN}
     -  *.${DOMAIN}
  1. Generate and install a CA certificate (necessary for private/closed networks and when working with self-signed certificates).
    • Once generated, copy the crt and key files to ./certs/root.{crt,key}
    • Important: Install root.crt on your host machine and within your browser. Follow the steps below to easily obtain self-signed CA certificates:
# Get and store the self-signed CA certificate into ./certs/root.{crt,key} with caddy.
make getCAcert
# Install the CA certificate on the host machine.
make installCAcert

# Remember to install the certificate in your browser.
# Check DNS server responses for your self-hosting domain
dig  ${DOMAIN}
dig  any.${DOMAIN}

# Check if DNS works as expected. Test from all nodes you want to access your self-hosting bluesky, including host and client machines.
ping ${DOMAIN}
ping any.${DOMAIN}

# Start containers for testing
make    docker-start f=./docker-compose-debug-caddy.yaml services=

# Test HTTPS and WSS with your docker environment
curl -L https://test-wss.${DOMAIN}/
websocat wss://test-wss.${DOMAIN}/ws

# Test reverse proxy mapping to ensure it works as expected for bluesky
#  These should redirect to PDS
curl -L https://pds.${DOMAIN}/xrpc/any-request | jq
curl -L https://some-hostname.pds.${DOMAIN}/xrpc/any-request | jq

#  These should redirect to social-app
curl -L https://pds.${DOMAIN}/others | jq
curl -L https://some-hostname.pds.${DOMAIN}/others | jq

# Stop test containers, without persisting data
make    docker-stop-with-clean f=./docker-compose-debug-caddy.yaml

=> If all tests pass, go ahead; otherwise, examine your environment.
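
The DNS checks above can be folded into one loop. The following is a minimal sketch, not part of this repository: `selfhost_hosts` and `check_dns` are hypothetical helpers, and the subdomain list (pds, social-app, ozone, jetstream) is only the set of names this README mentions, so adjust it to your deployment.

```shell
#!/bin/sh
# Hypothetical helper (not in the repository's Makefile): print the apex
# domain plus the service hostnames this README refers to.
selfhost_hosts() {
  domain="$1"
  printf '%s\n' "$domain"
  for sub in pds social-app ozone jetstream; do
    printf '%s.%s\n' "$sub" "$domain"
  done
}

# Report which of those names resolve; requires dig, as used above.
check_dns() {
  for host in $(selfhost_hosts "$1"); do
    if [ -n "$(dig +short "$host")" ]; then
      echo "ok:   $host"
    else
      echo "FAIL: $host"
    fi
  done
}
```

Run it as `check_dns "${DOMAIN}"` on every machine that needs to reach the environment.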

This section first outlines deploying bluesky with prebuilt images.
See the later section for instructions on building the images from source yourself.

# 0) Pull prebuilt docker images from docker.io to explicitly avoid building images.
make docker-pull

# 1) Deploy the essential containers (database, caddy, etc.).
make docker-start

# Wait for log messages to cease.

# 2) Deploy the core bluesky containers (plc, bgs, appview, pds, ozone, ...).
make docker-start-bsky

# The operation below is obsolete due to patching/152-indigo-newpds-dayper-limit.diff
# 3) Configure the bgs parameter for the perDayLimit setting using the REST API.
# ~~~ make api_setPerDayLimit ~~~
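
Instead of watching the logs by hand, readiness can be polled. This is a minimal sketch under the assumption that a successful HTTPS response means the service is up; `retry_until` is a hypothetical helper, not a Makefile target.

```shell
#!/bin/sh
# Hypothetical helper: run a command repeatedly until it succeeds.
# usage: retry_until MAX_TRIES DELAY_SECONDS command [args...]
# Returns 0 on the first success, 1 if every attempt failed.
retry_until() {
  max="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$max" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"   # pause after each failed attempt
  done
  return 1
}
```

For example, `retry_until 30 5 curl -fsS "https://social-app.${DOMAIN}/"` tries for roughly two and a half minutes before giving up.
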
# 1) Verify that the social-app is ready to serve content.
curl -L https://social-app.${DOMAIN}/

# 2) Generate an account specifically for the feed generator.
make api_CreateAccount_feedgen

# 3) Launch the bluesky feed-generator.
make docker-start-bsky-feedgen  FEEDGEN_PUBLISHER_DID=did:plc:...

# 4) Publish the feed's existence (using scripts/publishFeedGen.ts on the feed-generator).
make publishFeed
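
The `did:plc:...` placeholder above has to be replaced with the real DID of the feedgen account. A small sanity check can catch a placeholder that was left in by mistake. This sketch assumes the did:plc format of 24 base32 characters (a-z, 2-7) after the prefix; `is_did_plc` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the argument looks like a complete
# did:plc identifier ("did:plc:" followed by 24 characters from a-z and 2-7).
is_did_plc() {
  printf '%s\n' "$1" | grep -Eq '^did:plc:[a-z2-7]{24}$'
}
```

Usage: `is_did_plc "$FEEDGEN_PUBLISHER_DID" && make docker-start-bsky-feedgen FEEDGEN_PUBLISHER_DID="$FEEDGEN_PUBLISHER_DID"`.
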
# 1) Generate an account for the ozone service or administrator.
#  A working email address is essential, as ozone/PDS will send a confirmation code to it.
make api_CreateAccount_ozone                    email=your-valid@email.address.com handle=...

# 2) Launch Ozone
# Ozone uses the same DID for both OZONE_SERVER_DID and OZONE_ADMIN_DIDS, as documented in [HOSTING.md](https://github.com/bluesky-social/ozone/blob/main/HOSTING.md)
make docker-start-bsky-ozone  OZONE_SERVER_DID=did:plc:  OZONE_ADMIN_DIDS=did:plc:

# 3) Run the workaround tool to index label assignments into the appview DB through subscribeLabels.
# ./ops-helper/apiImpl/subscribeLabels2BskyDB.ts --help
./ops-helper/apiImpl/subscribeLabels2BskyDB.ts

# 4) [Required occasionally] Refresh the DidDoc prior to ozone sign-in (required since asof-2024-07-05)
#    First, request and get PLC signature by email
make api_ozone_reqPlcSign                       handle=... password=...
#    Then, update the didDoc using obtained signature
make api_ozone_updateDidDoc   plcSignToken=     handle=...  ozoneURL=...

# 5) [Optional] Invite a new member to the ozone team (by assigning a role):
#    Valid roles are: tools.ozone.team.defs#roleAdmin | tools.ozone.team.defs#roleModerator | tools.ozone.team.defs#roleTriage
make api_ozone_member_add   role=  did=did:plc:

# Launch jetstream.
make docker-start-bsky-jetstream

Access https://social-app.${DOMAIN}/ (e.g., https://social-app.mysky.local.com/) in your browser.

See the screenshots for instructions on creating or signing in to an account.

# Subscribe to almost all collections from jetstream
websocat "wss://jetstream.${DOMAIN}/subscribe?wantedCollections=app.bsky.actor.profile&wantedCollections=app.bsky.feed.like&wantedCollections=app.bsky.feed.post&wantedCollections=app.bsky.feed.repost&wantedCollections=app.bsky.graph.follow&wantedCollections=app.bsky.graph.block&wantedCollections=app.bsky.graph.muteActor&wantedCollections=app.bsky.graph.unmuteActor"
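
The long query string above is just one `wantedCollections` parameter per collection. A minimal sketch of building it from a list follows; `jetstream_url` is a hypothetical helper, and the `jetstream.${DOMAIN}` hostname is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper: build a jetstream subscribe URL.
# usage: jetstream_url DOMAIN [collection...]
jetstream_url() {
  url="wss://jetstream.$1/subscribe"
  shift
  sep='?'
  for c in "$@"; do
    url="${url}${sep}wantedCollections=${c}"
    sep='&'   # every parameter after the first is joined with '&'
  done
  printf '%s\n' "$url"
}
```

Usage: `websocat "$(jetstream_url "${DOMAIN}" app.bsky.feed.post app.bsky.feed.like)"`.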

Access https://ozone.${DOMAIN}/configure (e.g., https://ozone.mysky.local.com/configure) in your browser.

# Choice 1: Shut down containers, retaining data.
make docker-stop

# Choice 2: Shut down containers and delete the data.
make docker-stop-with-clean

export u=foo
make api_CreateAccount handle=${u}.pds.${DOMAIN} password=${u} email=${u}@example.com resp=./data/accounts/${u}.secrets

# To create more accounts, re-assign $u and repeat the operation above.
# In an interactive bash session, `!make` re-runs the most recent command that starts with 'make'.
export u=bar
!make

export u=baz
!make
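
History expansion (`!make`) only works in an interactive shell. In a script, a loop does the same job. The sketch below only prints the commands (a dry run); `create_account_cmd` is a hypothetical helper that mirrors the `api_CreateAccount` invocation above, and the fallback domain is the README's example value.

```shell
#!/bin/sh
# Hypothetical helper: print the make invocation for one account, mirroring
# the api_CreateAccount call above (handle, password and email derive from the name).
create_account_cmd() {
  u="$1"; domain="$2"
  printf 'make api_CreateAccount handle=%s.pds.%s password=%s email=%s@example.com resp=./data/accounts/%s.secrets\n' \
    "$u" "$domain" "$u" "$u" "$u"
}

# Dry run: print the commands for several users; pipe the output to sh to execute them.
for u in foo bar baz; do
  create_account_cmd "$u" "${DOMAIN:-mysky.local.com}"
done
```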

After configuring the parameters and optional environment variable, proceed as follows:

# Get source code from all repositories
make    cloneAll

# Create work branches and stay on them for all repositories (repos/*); optional but recommended for safety.
make    createWorkBranch

Then, build the docker images as follows:

# 0) Apply the minimum necessary patch to build images, regardless of self-hosting.
#    See https://github.com/bluesky-social/atproto/discussions/2026 for details, specifically for feed-generator/Dockerfile etc.
# NOTE: This operation will create a new branch, apply the patch, and stay on that new branch.
make patch-dockerbuild

# 1) Build the images
make build DOMAIN= f=./docker-compose-builder.yaml

# The following operation is obsolete and no longer supported due to its fragile nature (high cost and low return). Also, this patch has no effect on PDS scaling out (multiple PDS domains).
# ~~ 2) Optionally apply a patch for self-hosting and rebuild the image ~~
# ~~  'optional' signifies that applying this patch is not essential for achieving a self-hosting environment. ~~
# ~~ NOTE: This operation will create a new branch, apply the patch, and keep you on that branch. ~~
#
# ~~ make _patch-selfhost-even-not-mandatory ~~
# ~~ make build services=social-app f=./docker-compose-builder.yaml ~~

Setting the fork_repo_prefix variable before running cloneAll registers your remote fork repositories with git remote add fork .... You can then run operations against multiple repositories at once, as below.

export fork_repo_prefix=git@github.com:YOUR_GITHUB_ACCOUNT/

make cloneAll

# Easily manage (push and pull) branches and tags for all repositories with a single command targeting your remote fork repositories.
make exec under=./repos/* cmd='git push fork branch'
make exec under=./repos/* cmd='git tag -a "asof-XXXX-XX-XX" '
make exec under=./repos/* cmd='git push fork --tags'

# Push your develop-branch in justOneRepo working folder to your remote fork repository.
make exec under=./repos/justOneRepo cmd='git push fork develop-branch'

# See the Makefile for complete details and usage examples.

  1. Get all env vars in docker-compose
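# Names and their values: choose one path (the later assignment wins if both lines run).
# Choice 1: environment plus build args
_yqpath='.services[].environment, .services[].build.args'
# Choice 2: environment only
_yqpath='.services[].environment'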

# List of var=val
cat ./docker-compose-builder.yaml | yq -y "${_yqpath}" \
  | grep -v '^---' | sed 's/^- //' | sort -u -f

# Output in yaml
cat ./docker-compose-builder.yaml | yq -y "${_yqpath}" \
  | grep -v '^---' | sed 's/^- //' | sort -u -f  \
  | awk -F= -v col=":" -v q="'" -v sp="  " -v list="-" '{print   sp list sp q $1 q col sp q $2 q}' \
  | sed '1i defs:' | yq -y


# List of names
cat ./docker-compose-builder.yaml | yq -y "${_yqpath}" \
  | grep -v '^---' | sed 's/^- //' | sort -u -f \
  | awk -F= '{print $1}' | sort -u -f
  1. Get env vars regarding {URL | DID | DOMAIN} == mapping rules in docker-compose
# get {name=value} of env vars regarding { URL | DID | DOMAIN }
cat ./docker-compose-builder.yaml | yq -y '.services[].environment' \
 | grep -v '^---' | sed 's/^- //' | sort -u -f \
 | grep -e :// -e did: -e {DOMAIN}

# get names of env vars regarding { URL | DID | DOMAIN }
cat ./docker-compose-builder.yaml | yq -y '.services[].environment' \
 | grep -v '^---' | sed 's/^- //' | sort -u -f \
 | grep -e :// -e did: -e {DOMAIN} \
 | awk -F= '{print $1}' | sort -u -f \
 | tee /tmp/url-or-did.txt
  1. Get mapping rules in reverse proxy (caddy )
# Dump the rules as-is; there is no known way to convert them into an easily readable format.
cat config/caddy/Caddyfile
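
To see what the name-extraction pipelines above do without touching real compose files, the post-processing can be tried on sample input. This is a sketch: `env_names` is a hypothetical wrapper around the same grep/sed/awk/sort steps, and the sample variable names and values are illustrative only.

```shell
#!/bin/sh
# Same post-processing as the "List of names" pipeline above, reading the yq
# output from stdin so it can be tried without docker-compose files.
env_names() {
  grep -v '^---' | sed 's/^- //' | awk -F= '{print $1}' | sort -u -f
}

# Sample input shaped like what the pipeline expects from
# `yq -y '.services[].environment'`: list items of NAME=value, with '---'
# separating the results of different services.
printf '%s\n' \
  '- PDS_HOSTNAME=pds.example.com' \
  '---' \
  '- LOG_LEVEL=debug' \
  '- PDS_HOSTNAME=pds.example.com' \
  | env_names
```

The duplicate `PDS_HOSTNAME` entry collapses to one line, which is why the real pipelines end with `sort -u -f`.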

  1. Get files related to env vars in sources
# Files named *env*
find repos -type f | grep -v -e /.git/  | grep -i env \
  | grep -v -e .jpg$ -e .ts$  -e .json$ -e .png$ -e .js$

# Files containing 'export'
find repos -type f | grep -v /.git/  | xargs grep -l export \
  | grep -v -e .js$ -e .jsx$  -e .ts$ -e .tsx$ -e .go$ -e go.sum$ -e go.mod$ -e .po$ -e .json$ -e .patch$ -e .lock$ -e .snap$
  1. Get all env vars from source code
# Simple way: search everything under repos
_files=repos
# Stricter way: narrow the files to search, excluding tests and build outputs (overrides the line above)
_files=`find repos -type f | grep -v -e '/.git' -e /__  -e /tests/ -e _test.go -e /interop-test-files  -e /testdata/ -e /testing/ -e /jest/ -e /node_modules/ -e /dist/ | sort -u -f`

# For the JavaScript family, get env vars from process.env.ENVNAME
grep -R process.env ${_files} \
  | cut -d : -f 2- | sed 's/.*process\.//' | grep '^env\.' | sed 's/^env\.//' \
  | sed -r 's/(^[A-Za-z_0-9\-]+).*/\1/' | sort -u -f \
  | tee /tmp/vars-js1.txt

# For the JavaScript family, get env vars from envXXX('MORE_ENVNAME'). Refer to atproto/packages/common/src/env.ts for envXXX
grep -R -e envStr -e envInt -e envBool -e envList ${_files} \
  | cut -d : -f 2- \
  | grep -v -e ^import -e ^export -e ^function  \
  | sed "s/\"/'/g" \
  | grep \' | awk -F\' '{print $2}' | sort -u -f \
  | tee /tmp/vars-js2.txt

# For golang, get env vars from EnvVar(s): []string{"ENVNAME", "MORE_ENVNAME"}
grep -R EnvVar ${_files} \
  | cut -d : -f 3- | sed -e 's/.*string//' -e 's/[,"{}]//g' \
  | tr ' ' '\n' | grep -v ^$ | sort -u -f \
  | tee /tmp/vars-go.txt

# For docker-compose, get env vars from services[].environment
echo ${_files} \
  | tr ' ' '\n' | grep -v ^$ | grep -e .yaml$ -e .yml$ | grep compose \
  | xargs yq -y '.services[].environment' | grep -v ^--- | sed 's/^- //' \
  | sed 's/: /=/' | sed "s/'//g" \
  | sort -u -f \
  | awk -F= '{print $1}' | sort -u -f \
  | tee /tmp/vars-compose.txt


# Get unique lists
cat /tmp/vars-js1.txt /tmp/vars-js2.txt /tmp/vars-go.txt /tmp/vars-compose.txt | sort -u -f > /tmp/envs.txt

# Pick env vars related to mapping {URL, ENDPOINT, DID, HOST, PORT, ADDRESS}
cat /tmp/envs.txt  | grep -e URL -e ENDPOINT -e DID -e HOST -e PORT -e ADDRESS
  1. Find {URL | DID | bsky } near env names in sources
find repos -type f | grep -v -e /.git  -e __ -e .json$ \
  | xargs grep -R -n -A3 -B3 -f /tmp/envs.txt \
  | grep -A2 -B2 -e :// -e did: -e bsky
  1. Find bsky.{social, app, network} in sources (to check hard-coded domain/FQDN)
find repos -type f | grep -v -e /.git -e /tests/ -e /__ -e Makefile -e .yaml$ -e .md$  -e .sh$ -e .json$ -e .txt$ -e _test.go$ \
  | xargs grep -n -e bsky.social -e bsky.app -e bsky.network  -e bsky.dev
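
The text processing in the process.env pipeline above can be checked on a couple of sample lines before running it over all sources. This is a sketch: `js_env_names` is a hypothetical wrapper that mirrors those steps (using the portable `sed -E` instead of `sed -r`), and the sample file names and variables are made up.

```shell
#!/bin/sh
# Mirrors the process.env pipeline above, reading "file:matched line"
# records (the shape of grep -R output) from stdin.
js_env_names() {
  cut -d : -f 2- | sed 's/.*process\.//' | grep '^env\.' | sed 's/^env\.//' \
    | sed -E 's/^([A-Za-z_0-9-]+).*/\1/' | sort -u -f
}

# Illustrative input; only the last process.env reference on a line is kept,
# because the first sed expression matches greedily.
printf '%s\n' \
  "repos/a/x.ts:const h = process.env.PDS_HOSTNAME ?? ''" \
  "repos/a/y.js:  port: process.env.PORT," \
  | js_env_names
```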

This task uses the result of the above (/tmp/envs.txt) as input.

# Create table showing { env x container => value } with the ops-helper script.
cat ./docker-compose-builder.yaml | ./ops-helper/compose2envtable/main.py -l /tmp/envs.txt -o ./docs/env-container-val.xlsx

This self-hosting environment tries to make self-signed certificates trusted by installing them into the containers. The expectation is that, by sharing /etc/ssl/certs/ca-certificates.crt among all containers, each container treats the certificates in ca-certificates.crt as trusted.

Unfortunately, this approach works in only some containers. It seems to depend on the distribution (Debian/Alpine/...) and the language (Java/Node.js/Golang), and no consistent rule could be determined from the observed behavior. Therefore, all of the methods below are applied together for safety when using self-signed certificates.

  • The host deploys /etc/ssl/certs/ca-certificates.crt to containers via a volume mount.
  • Define env vars for self-signed certificates, such as GOINSECURE, NODE_TLS_REJECT_UNAUTHORIZED for each language.
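
As an illustration of combining the two methods for a single container, the sketch below prints `docker run` options. It is an assumption-laden example, not how this repository wires things up (that is done in the docker-compose files): `selfsigned_docker_opts` is a hypothetical helper, and only the Node.js variable is shown because its value is well known.

```shell
#!/bin/sh
# Hypothetical helper: print `docker run` options that combine both methods:
# a read-only bind mount of the host CA bundle and a per-language env var.
# NODE_TLS_REJECT_UNAUTHORIZED=0 disables TLS verification in Node.js, so it
# is only suitable for a closed test network.
selfsigned_docker_opts() {
  bundle="${1:-/etc/ssl/certs/ca-certificates.crt}"
  printf '%s\n' "-v ${bundle}:/etc/ssl/certs/ca-certificates.crt:ro -e NODE_TLS_REJECT_UNAUTHORIZED=0"
}
```

The output can be spliced into a one-off container start, for example `docker run --rm $(selfsigned_docker_opts) <image>`, where `<image>` is whatever container you want to test.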

Screenshots: Create account, Sign-in

| components | url (origin) |
|---|---|
| atproto | https://github.com/bluesky-social/atproto.git |
| indigo | https://github.com/bluesky-social/indigo.git |
| social-app | https://github.com/bluesky-social/social-app.git |
| feed-generator | https://github.com/bluesky-social/feed-generator.git |
| pds | https://github.com/bluesky-social/pds.git |
| ozone | https://github.com/bluesky-social/ozone.git |
| did-method-plc | https://github.com/did-method-plc/did-method-plc.git |
| jetstream | https://github.com/bluesky-social/jetstream.git |

other dependencies:

| components | url (origin) |
|---|---|
| reverse proxy | https://github.com/caddyserver/caddy (official docker image of caddy:2) |
| DNS server | bind9 or others, such as https://github.com/itaru2622/docker-bind9.git |

Description of test network:

DOMAIN for self-hosting: mysky.local.com

IP:
  - docker host for selfhost: 192.168.1.51
  - DNS server:               192.168.1.27
  - DNS forwarders:           8.8.8.8 (upper level DNS server;dns.google.)

DNS A-Records:
  -   mysky.local.com  : 192.168.1.51
  - *.mysky.local.com  : 192.168.1.51

The above is expressed in the bind9 configuration files as below:

::::::::::::::
/etc/bind/named.conf
::::::::::::::
include "/etc/bind/rndc.key";
controls {
        inet 127.0.0.1 allow { 127.0.0.1; } keys { "rndc-key"; };
};
options {
        directory         "/etc/bind";
        // UDP 53, from any
        listen-on         { any; };
        // HTTP 80, from any
        listen-on  port 80  tls none http default  { any; };
        listen-on-v6      { none; };
        forwarders        { 8.8.8.8 ; };  # dns.google.
        allow-recursion   { any; };
        allow-query       { any; };
        allow-query-cache { any; };
        allow-transfer    { any; };
};
zone "local.com" { type master; file "zone-local.com"; allow-query { 0.0.0.0/0; }; allow-update { 0.0.0.0/0; }; allow-transfer { 0.0.0.0/0; }; };
::::::::::::::
/etc/bind/zone-local.com
::::::::::::::
$ORIGIN .
$TTL 259200	; 3 days
local.com		IN SOA	local.com. root.local.com. (
				2024022809 ; serial
				3600       ; refresh (1 hour)
				900        ; retry (15 minutes)
				86400      ; expire (1 day)
				3600       ; minimum (1 hour)
				)
			NS	local.com.
			A	192.168.1.27
$ORIGIN local.com.
$TTL 3600	; 1 hour
mysky		A	192.168.1.51
$ORIGIN mysky.local.com.
*			A	192.168.1.51

cf. The simplest way to use the above DNS server (192.168.1.27) temporarily is
to add it to /etc/resolv.conf as shown below on all testing machines (docker host, client machines for browsers):

nameserver 192.168.1.27

special thanks to prior works on self-hosting.

hacks in bluesky:
