jwbjnwolf/nginx-bad-bot-blocker

Nginx Bad Bot and User-Agent Blocker, customised for fedi instances and made more Tor-friendly.

The default configuration for this blocker prevents fedi software such as Mastodon and GoToSocial from federating correctly.

It also blocks many Tor exit nodes, because they get caught up in bad traffic.

Problem:

  • The deny.conf rule that blocks dot file/folder requests doesn't exclude .well-known, which fedi software needs to fetch in order to federate properly.
  • deny.conf also blocks image hotlinking, which breaks fedi software.
  • The globalblacklist.conf user-agent blocklist includes many keywords that are, or may be, part of fedi instance domains. Fedi software includes its instance domain in the user-agent it sends when crawling other instances, so instances get falsely blocked.
  • Many Tor exit node IPs get caught up in bad traffic, are reported to AbuseIPDB in overwhelming numbers, and end up in globalblacklist.conf as a result. There is only a finite number of these nodes, so even one block can be very noticeable to a Tor user, who then has to switch to another exit node, which isn't optimal.
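
The user-agent false positive described above can be reproduced with a rough shell sketch. The helper name `ua_matches`, the instance domain `ninja.example`, and the version numbers in the user-agent string are all made up for illustration. The whole-word, case-insensitive match is only an approximation of how the blocklist's nginx regexes behave, so treat it as an assumption rather than the exact upstream logic.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every keyword that appears in a user-agent
# string as a whole word, ignoring case. This approximates the blocklist
# matching; it is not the real nginx map.
ua_matches() {
  local ua="$1"
  shift
  local kw
  for kw in "$@"; do
    # -F: fixed string, -i: ignore case, -w: whole word, -q: quiet
    if printf '%s\n' "$ua" | grep -qiwF -- "$kw"; then
      printf '%s\n' "$kw"
    fi
  done
}

# A made-up fedi instance whose domain happens to contain "ninja".
ua_matches 'http.rb/5.1.1 (Mastodon/4.2.0; +https://ninja.example/)' Ninja Zeus Yak
```

Here the instance domain alone is enough to trip the "Ninja" keyword, even though the request comes from ordinary federation traffic.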

Changes:

  • In deny.conf, added an exclusion for .well-known requests: Edits.
  • In deny.conf, commented out the image hotlinking section so hotlinking isn't prevented: Edits.
  • In globalblacklist.conf, commented out problem user-agent keyword blocks so they don't cause false positives: see below for the list.
  • In globalblacklist.conf, changed "AdsBot-Google" from a "good" bot to a blocked one. Ads can get in the damn bin.
  • In globalblacklist.conf, added blocks for some AI crawler bots that aren't currently in the list.
  • Added a bash script that comments out Tor exit node IPs in globalblacklist.conf whenever I sync from upstream.
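
As a minimal sketch of the Tor step (not the actual script in this repo), the idea can be expressed with awk. It assumes each IP entry in globalblacklist.conf sits on its own line with the IP as the first field, and that you already have a plain-text list of exit node IPs, one per line, for example the Tor Project's bulk exit list. The function name `comment_out_ips` and the file names in the usage line are hypothetical.

```shell
#!/usr/bin/env bash
# comment_out_ips EXIT_LIST CONF
# Prints CONF to stdout, prefixing "#" to every line whose first
# whitespace-separated field is an IP found in EXIT_LIST.
# Assumption: one IP per line in EXIT_LIST, one IP entry per line in CONF.
comment_out_ips() {
  local exit_list="$1" conf="$2"
  awk '
    # First file: remember each exit-node IP.
    FILENAME == ARGV[1] { if ($1 != "") tor[$1] = 1; next }
    # Second file: comment out entries whose first field is a known exit IP.
    ($1 in tor) { print "#" $0; next }
    # Everything else passes through unchanged.
    { print }
  ' "$exit_list" "$conf"
}
```

A possible use would be `comment_out_ips tor-exits.txt globalblacklist.conf > globalblacklist.conf.new`, followed by reviewing the result before replacing the original. Lines that are already commented out are left alone, because their first field no longer equals a bare IP.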

How to use this fork instead of upstream:

  • Follow instructions for installing files from the upstream repo.
  • Apply the changes from these two commits (also described above) to your deny.conf file: Commit 1, Commit 2.
  • Edit your update-ngxblocker updater script to point to the configuration hosted here: Edits.
  • Alternatively, point your updater script to the configuration hosted on my Codeberg mirror: Edits.

Important note for self-hosted Git repos, for example if you use Forgejo like I now do:

  • Do not include deny.conf in any server or location block that serves Git repositories, such as Forgejo, so that the repos keep working as intended. For example, if a repo contains dot files, those files become inaccessible because requests for them are blocked.
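
To see why dot-file blocking clashes with a Git forge, here is a small sketch that mimics the intent of the rule with the .well-known exclusion. It is not nginx's actual matching logic. The function name `is_dot_path_denied` and the example paths are hypothetical.

```shell
#!/usr/bin/env bash
# is_dot_path_denied PATH
# Returns 0 (denied) when any path segment starts with a dot, except for
# paths under /.well-known. Returns 1 (allowed) otherwise.
# Assumption: this only imitates the intent of the deny.conf rule.
is_dot_path_denied() {
  local path="$1"
  case "$path" in
    /.well-known|/.well-known/*) return 1 ;;  # excluded so fedi discovery works
    */.*) return 0 ;;                         # some segment starts with a dot
    *) return 1 ;;
  esac
}

# A repo file view for a dot file would be denied, which is why deny.conf
# should stay out of the forge's server and location blocks.
if is_dot_path_denied '/user/repo/src/branch/main/.gitignore'; then
  echo 'denied'
fi
```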

User-agent Keywords commented out:

- Alligator,
- Anarchie,
- Anarchy,
- Attach,

- BackStreet,
- BackWeb,
- Badass,
- Bandit,
- Bigfoot,
- Blow,
- Bolt,
- Buck,
- Buddy,
- Bullseye,

- Collector,
- Copier,
- Cosmos,
- Crescent,
- Curious,
- Custo,

- Demon,
- Devil,
- Disco,
- Dragonfly,
- Drip,

- Evil,
- FrontPage,
- Fuzz,
- Gopher,

- Harvest,
- Iria,
- Kinza,
- Leap,

- Magnet,
- Mojeek,

- Needle,
- Nibbler,
- Ninja,

- Octopus,
- Obot,
- Pump,

- Reaper,
- Ripper,
- Ripz,

- Screaming,
- Snake,
- Snoopy,
- Spanner,
- Steeler,
- Stripper,
- Sucker,

- TakeOut,
- Teleport,
- TheNomad,
- Titan,
- Twice,

- Webster,
- Whack,
- Whacker,
- Widow,
- Xenu,
- Yak,
- Zade,
- Zeus.

Changed "good" user agents to be blocked:

- AdsBot-Google.

Added user agents to be blocked:

- Ai[0-9]bot (AI2 Bot and AI2 Bot-Dolma specifically, but [0-9] just in case),
- Omgilibot,
- WellKnownBot.

About

Nginx Bad Bot Blocker, customised for fedi instances, and made more tor friendly. THIS IS NOW A MIRROR ONLY. Now on Forgejo & Codeberg. https://git.wolfi.ee/jase/nginx-bad-bot-blocker or https://codeberg.org/jasewolf/nginx-bad-bot-blocker-mirror
