Bug: vpn constantly restarts due to being unhealthy #2154

Open
r3ps4J opened this issue Mar 12, 2024 · 39 comments

Comments

@r3ps4J

r3ps4J commented Mar 12, 2024

Is this urgent?

None

Host OS

Debian 12 Bookworm

CPU arch

aarch64

VPN service provider

PureVPN

What are you using to run the container

docker-compose

What is the version of Gluetun

Running version latest built on 2024-03-07T12:32:25.391Z (commit 3254fc8)

What's the problem 🤔

My gluetun container started constantly restarting the vpn. I understand this is the "auto-healing" mechanism, but I can't figure out what causes it. Especially since I haven't changed anything in my gluetun configuration. Unsure if it's actually a bug or just a user error, but any help would be appreciated.

I checked the healthcheck page as well, so find my answers for each step below:

  1. The VPN server IP address you are trying to connect to is no longer valid 🔌 Update your server information
    It should be correct, but just in case I also tried a manual configuration downloaded from PureVPN's dashboard with the latest IP which resulted in the same problem.
  2. The VPN server crashed 💥, try changing your VPN servers filtering options such as SERVER_REGIONS
    I removed the countries filter altogether, but no luck.
  3. Your host firewall is blocking outbound connections
    I haven't changed my firewall or installed a new one, and it worked before.
  4. Your Internet connection is not working 🤯, ensure it works
    It is definitely working outside the gluetun container.
  5. Are you using Docker Desktop >= v4.5.1?? Then downgrade back to v4.5.1. See @Miexil's comment.
    Running on Debian 12 Bookworm so not relevant.
  6. Something else ➡️ https://github.com/qdm12/gluetun/issues/new/choose
    Here I am lol!

Share your logs (at least 10 lines)

========================================
========================================
=============== gluetun ================
========================================
=========== Made with ❤️ by ============
======= https://github.com/qdm12 =======
========================================
========================================

Running version latest built on 2024-03-07T12:32:25.391Z (commit 3254fc8)

🔧 Need help? https://github.com/qdm12/gluetun/discussions/new
🐛 Bug? https://github.com/qdm12/gluetun/issues/new
✨ New feature? https://github.com/qdm12/gluetun/issues/new
☕ Discussion? https://github.com/qdm12/gluetun/discussions/new
💻 Email? quentin.mcgaw@gmail.com
💰 Help me? https://www.paypal.me/qmcgaw https://github.com/sponsors/qdm12
2024-03-12T20:40:36+01:00 INFO [routing] default route found: interface eth0, gateway 172.21.0.1, assigned IP 172.21.0.3 and family v4
2024-03-12T20:40:36+01:00 INFO [routing] local ethernet link found: eth0
2024-03-12T20:40:36+01:00 INFO [routing] local ipnet found: 172.21.0.0/16
2024-03-12T20:40:36+01:00 INFO [firewall] enabling...
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --policy INPUT DROP
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --policy OUTPUT DROP
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --policy FORWARD DROP
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --policy INPUT DROP
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --policy OUTPUT DROP
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --policy FORWARD DROP
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --append INPUT -i lo -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --append INPUT -i lo -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --append OUTPUT -o lo -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o lo -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --append OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --append INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --append INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --append OUTPUT -o eth0 -s 172.21.0.3 -d 172.21.0.0/16 -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o eth0 -d ff02::1:ff/104 -j ACCEPT
2024-03-12T20:40:36+01:00 DEBUG [firewall] iptables --append INPUT -i eth0 -d 172.21.0.0/16 -j ACCEPT
2024-03-12T20:40:36+01:00 INFO [firewall] enabled successfully
2024-03-12T20:40:37+01:00 INFO [storage] creating /gluetun/servers.json with 17820 hardcoded servers
2024-03-12T20:40:37+01:00 DEBUG [netlink] IPv6 is not supported after searching 0 routes
2024-03-12T20:40:37+01:00 INFO Alpine version: 3.18.6
2024-03-12T20:40:37+01:00 INFO OpenVPN 2.5 version: 2.5.8
2024-03-12T20:40:37+01:00 INFO OpenVPN 2.6 version: 2.6.8
2024-03-12T20:40:37+01:00 INFO Unbound version: 1.19.1
2024-03-12T20:40:37+01:00 INFO IPtables version: v1.8.9
2024-03-12T20:40:37+01:00 INFO Settings summary:
├── VPN settings:
|   ├── VPN provider settings:
|   |   ├── Name: purevpn
|   |   └── Server selection settings:
|   |       ├── VPN type: openvpn
|   |       └── OpenVPN server selection settings:
|   |           └── Protocol: UDP
|   └── OpenVPN settings:
|       ├── OpenVPN version: 2.5
|       ├── User: [set]
|       ├── Password: [set]
|       ├── Network interface: tun0
|       ├── Run OpenVPN as: root
|       └── Verbosity level: 1
├── DNS settings:
|   ├── Keep existing nameserver(s): no
|   ├── DNS server address to use: 127.0.0.1
|   └── DNS over TLS settings:
|       ├── Enabled: yes
|       ├── Update period: every 24h0m0s
|       ├── Unbound settings:
|       |   ├── Authoritative servers:
|       |   |   └── cloudflare
|       |   ├── Caching: yes
|       |   ├── IPv6: no
|       |   ├── Verbosity level: 1
|       |   ├── Verbosity details level: 0
|       |   ├── Validation log level: 0
|       |   ├── System user: root
|       |   └── Allowed networks:
|       |       ├── 0.0.0.0/0
|       |       └── ::/0
|       └── DNS filtering settings:
|           ├── Block malicious: yes
|           ├── Block ads: no
|           ├── Block surveillance: no
|           └── Blocked IP networks:
|               ├── 127.0.0.1/8
|               ├── 10.0.0.0/8
|               ├── 172.16.0.0/12
|               ├── 192.168.0.0/16
|               ├── 169.254.0.0/16
|               ├── ::1/128
|               ├── fc00::/7
|               ├── fe80::/10
|               ├── ::ffff:127.0.0.1/104
|               ├── ::ffff:10.0.0.0/104
|               ├── ::ffff:169.254.0.0/112
|               ├── ::ffff:172.16.0.0/108
|               └── ::ffff:192.168.0.0/112
├── Firewall settings:
|   ├── Enabled: yes
|   └── Outbound subnets:
|       └── 192.168.178.0/24
├── Log settings:
|   └── Log level: DEBUG
├── Health settings:
|   ├── Server listening address: 127.0.0.1:9999
|   ├── Target address: cloudflare.com:443
|   ├── Duration to wait after success: 5s
|   ├── Read header timeout: 100ms
|   ├── Read timeout: 500ms
|   └── VPN wait durations:
|       ├── Initial duration: 6s
|       └── Additional duration: 5s
├── Shadowsocks server settings:
|   └── Enabled: no
├── HTTP proxy settings:
|   └── Enabled: no
├── Control server settings:
|   ├── Listening address: :8000
|   └── Logging: yes
├── OS Alpine settings:
|   ├── Process UID: 1000
|   ├── Process GID: 1000
|   └── Timezone: europe/amsterdam
├── Public IP settings:
|   ├── Fetching: every 12h0m0s
|   ├── IP file path: /tmp/gluetun/ip
|   └── Public IP data API: ipinfo
└── Version settings:
    └── Enabled: yes
2024-03-12T20:40:37+01:00 INFO [routing] default route found: interface eth0, gateway 172.21.0.1, assigned IP 172.21.0.3 and family v4
2024-03-12T20:40:37+01:00 DEBUG [routing] ip rule add from 172.21.0.3/32 lookup 200 pref 100
2024-03-12T20:40:37+01:00 INFO [routing] adding route for 0.0.0.0/0
2024-03-12T20:40:37+01:00 DEBUG [routing] ip route replace 0.0.0.0/0 via 172.21.0.1 dev eth0 table 200
2024-03-12T20:40:37+01:00 INFO [firewall] setting allowed subnets...
2024-03-12T20:40:37+01:00 DEBUG [firewall] iptables --append OUTPUT -o eth0 -s 172.21.0.3 -d 192.168.178.0/24 -j ACCEPT
2024-03-12T20:40:37+01:00 INFO [routing] default route found: interface eth0, gateway 172.21.0.1, assigned IP 172.21.0.3 and family v4
2024-03-12T20:40:37+01:00 INFO [routing] adding route for 192.168.178.0/24
2024-03-12T20:40:37+01:00 DEBUG [routing] ip route replace 192.168.178.0/24 via 172.21.0.1 dev eth0 table 199
2024-03-12T20:40:37+01:00 DEBUG [routing] ip rule add to 192.168.178.0/24 lookup 199 pref 99
2024-03-12T20:40:37+01:00 DEBUG [routing] ip rule add to 172.21.0.0/16 lookup 254 pref 98
2024-03-12T20:40:37+01:00 INFO TUN device is not available: open /dev/net/tun: no such file or directory; creating it...
2024-03-12T20:40:37+01:00 INFO [dns] using plaintext DNS at address 1.1.1.1
2024-03-12T20:40:37+01:00 INFO [http server] http server listening on [::]:8000
2024-03-12T20:40:37+01:00 INFO [healthcheck] listening on 127.0.0.1:9999
2024-03-12T20:40:37+01:00 INFO [firewall] allowing VPN connection...
2024-03-12T20:40:37+01:00 DEBUG [firewall] iptables --append OUTPUT -d 146.70.155.11 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:40:37+01:00 DEBUG [firewall] iptables --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:37+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:37+01:00 INFO [openvpn] OpenVPN 2.5.8 aarch64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on Nov  2 2022
2024-03-12T20:40:37+01:00 INFO [openvpn] library versions: OpenSSL 3.1.4 24 Oct 2023, LZO 2.10
2024-03-12T20:40:37+01:00 INFO [openvpn] TCP/UDP: Preserving recently used remote address: [AF_INET]146.70.155.11:53
2024-03-12T20:40:37+01:00 INFO [openvpn] UDP link local: (not bound)
2024-03-12T20:40:37+01:00 INFO [openvpn] UDP link remote: [AF_INET]146.70.155.11:53
2024-03-12T20:40:43+01:00 INFO [healthcheck] program has been unhealthy for 6s: restarting VPN
2024-03-12T20:40:43+01:00 INFO [healthcheck] 👉 See https://github.com/qdm12/gluetun-wiki/blob/main/faq/healthcheck.md
2024-03-12T20:40:43+01:00 INFO [healthcheck] DO NOT OPEN AN ISSUE UNLESS YOU READ AND TRIED EACH POSSIBLE SOLUTION
2024-03-12T20:40:43+01:00 INFO [vpn] stopping
2024-03-12T20:40:43+01:00 INFO [vpn] starting
2024-03-12T20:40:43+01:00 INFO [firewall] allowing VPN connection...
2024-03-12T20:40:43+01:00 DEBUG [firewall] iptables --delete OUTPUT -d 146.70.155.11 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:40:43+01:00 DEBUG [firewall] iptables --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:43+01:00 DEBUG [firewall] ip6tables-nft --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:43+01:00 DEBUG [firewall] iptables --append OUTPUT -d 43.250.205.50 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:40:43+01:00 DEBUG [firewall] iptables --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:43+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:43+01:00 INFO [openvpn] OpenVPN 2.5.8 aarch64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on Nov  2 2022
2024-03-12T20:40:43+01:00 INFO [openvpn] library versions: OpenSSL 3.1.4 24 Oct 2023, LZO 2.10
2024-03-12T20:40:43+01:00 INFO [openvpn] TCP/UDP: Preserving recently used remote address: [AF_INET]43.250.205.50:53
2024-03-12T20:40:43+01:00 INFO [openvpn] UDP link local: (not bound)
2024-03-12T20:40:43+01:00 INFO [openvpn] UDP link remote: [AF_INET]43.250.205.50:53
2024-03-12T20:40:54+01:00 INFO [healthcheck] program has been unhealthy for 11s: restarting VPN
2024-03-12T20:40:54+01:00 INFO [healthcheck] 👉 See https://github.com/qdm12/gluetun-wiki/blob/main/faq/healthcheck.md
2024-03-12T20:40:54+01:00 INFO [healthcheck] DO NOT OPEN AN ISSUE UNLESS YOU READ AND TRIED EACH POSSIBLE SOLUTION
2024-03-12T20:40:54+01:00 INFO [vpn] stopping
2024-03-12T20:40:54+01:00 INFO [vpn] starting
2024-03-12T20:40:54+01:00 INFO [firewall] allowing VPN connection...
2024-03-12T20:40:54+01:00 DEBUG [firewall] iptables --delete OUTPUT -d 43.250.205.50 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:40:54+01:00 DEBUG [firewall] iptables --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:54+01:00 DEBUG [firewall] ip6tables-nft --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:54+01:00 DEBUG [firewall] iptables --append OUTPUT -d 138.199.35.38 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:40:54+01:00 DEBUG [firewall] iptables --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:54+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:40:55+01:00 INFO [openvpn] OpenVPN 2.5.8 aarch64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on Nov  2 2022
2024-03-12T20:40:55+01:00 INFO [openvpn] library versions: OpenSSL 3.1.4 24 Oct 2023, LZO 2.10
2024-03-12T20:40:55+01:00 INFO [openvpn] TCP/UDP: Preserving recently used remote address: [AF_INET]138.199.35.38:53
2024-03-12T20:40:55+01:00 INFO [openvpn] UDP link local: (not bound)
2024-03-12T20:40:55+01:00 INFO [openvpn] UDP link remote: [AF_INET]138.199.35.38:53
2024-03-12T20:41:11+01:00 INFO [healthcheck] program has been unhealthy for 16s: restarting VPN
2024-03-12T20:41:11+01:00 INFO [healthcheck] 👉 See https://github.com/qdm12/gluetun-wiki/blob/main/faq/healthcheck.md
2024-03-12T20:41:11+01:00 INFO [healthcheck] DO NOT OPEN AN ISSUE UNLESS YOU READ AND TRIED EACH POSSIBLE SOLUTION
2024-03-12T20:41:11+01:00 INFO [vpn] stopping
2024-03-12T20:41:11+01:00 INFO [vpn] starting
2024-03-12T20:41:11+01:00 INFO [firewall] allowing VPN connection...
2024-03-12T20:41:11+01:00 DEBUG [firewall] iptables --delete OUTPUT -d 138.199.35.38 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:41:12+01:00 DEBUG [firewall] iptables --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:12+01:00 DEBUG [firewall] ip6tables-nft --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:12+01:00 DEBUG [firewall] iptables --append OUTPUT -d 67.213.219.186 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:41:12+01:00 DEBUG [firewall] iptables --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:12+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:12+01:00 INFO [openvpn] OpenVPN 2.5.8 aarch64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on Nov  2 2022
2024-03-12T20:41:12+01:00 INFO [openvpn] library versions: OpenSSL 3.1.4 24 Oct 2023, LZO 2.10
2024-03-12T20:41:12+01:00 INFO [openvpn] TCP/UDP: Preserving recently used remote address: [AF_INET]67.213.219.186:53
2024-03-12T20:41:12+01:00 INFO [openvpn] UDP link local: (not bound)
2024-03-12T20:41:12+01:00 INFO [openvpn] UDP link remote: [AF_INET]67.213.219.186:53
2024-03-12T20:41:33+01:00 INFO [healthcheck] program has been unhealthy for 21s: restarting VPN
2024-03-12T20:41:33+01:00 INFO [healthcheck] 👉 See https://github.com/qdm12/gluetun-wiki/blob/main/faq/healthcheck.md
2024-03-12T20:41:33+01:00 INFO [healthcheck] DO NOT OPEN AN ISSUE UNLESS YOU READ AND TRIED EACH POSSIBLE SOLUTION
2024-03-12T20:41:33+01:00 INFO [vpn] stopping
2024-03-12T20:41:33+01:00 INFO [vpn] starting
2024-03-12T20:41:33+01:00 INFO [firewall] allowing VPN connection...
2024-03-12T20:41:33+01:00 DEBUG [firewall] iptables --delete OUTPUT -d 67.213.219.186 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:41:33+01:00 DEBUG [firewall] iptables --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:33+01:00 DEBUG [firewall] ip6tables-nft --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:33+01:00 DEBUG [firewall] iptables --append OUTPUT -d 172.111.229.6 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:41:33+01:00 DEBUG [firewall] iptables --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:33+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:33+01:00 INFO [openvpn] OpenVPN 2.5.8 aarch64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on Nov  2 2022
2024-03-12T20:41:33+01:00 INFO [openvpn] library versions: OpenSSL 3.1.4 24 Oct 2023, LZO 2.10
2024-03-12T20:41:33+01:00 INFO [openvpn] TCP/UDP: Preserving recently used remote address: [AF_INET]172.111.229.6:53
2024-03-12T20:41:33+01:00 INFO [openvpn] UDP link local: (not bound)
2024-03-12T20:41:33+01:00 INFO [openvpn] UDP link remote: [AF_INET]172.111.229.6:53
2024-03-12T20:41:59+01:00 INFO [healthcheck] program has been unhealthy for 26s: restarting VPN
2024-03-12T20:41:59+01:00 INFO [healthcheck] 👉 See https://github.com/qdm12/gluetun-wiki/blob/main/faq/healthcheck.md
2024-03-12T20:41:59+01:00 INFO [healthcheck] DO NOT OPEN AN ISSUE UNLESS YOU READ AND TRIED EACH POSSIBLE SOLUTION
2024-03-12T20:41:59+01:00 INFO [vpn] stopping
2024-03-12T20:41:59+01:00 INFO [vpn] starting
2024-03-12T20:41:59+01:00 INFO [firewall] allowing VPN connection...
2024-03-12T20:41:59+01:00 DEBUG [firewall] iptables --delete OUTPUT -d 172.111.229.6 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:41:59+01:00 DEBUG [firewall] iptables --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:59+01:00 DEBUG [firewall] ip6tables-nft --delete OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:59+01:00 DEBUG [firewall] iptables --append OUTPUT -d 104.250.183.4 -o eth0 -p udp -m udp --dport 53 -j ACCEPT
2024-03-12T20:41:59+01:00 DEBUG [firewall] iptables --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:59+01:00 DEBUG [firewall] ip6tables-nft --append OUTPUT -o tun0 -j ACCEPT
2024-03-12T20:41:59+01:00 INFO [openvpn] OpenVPN 2.5.8 aarch64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD] built on Nov  2 2022
2024-03-12T20:41:59+01:00 INFO [openvpn] library versions: OpenSSL 3.1.4 24 Oct 2023, LZO 2.10
2024-03-12T20:41:59+01:00 INFO [openvpn] TCP/UDP: Preserving recently used remote address: [AF_INET]104.250.183.4:53
2024-03-12T20:41:59+01:00 INFO [openvpn] UDP link local: (not bound)
2024-03-12T20:41:59+01:00 INFO [openvpn] UDP link remote: [AF_INET]104.250.183.4:53
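
A small sketch for reading logs like the ones above: it pulls out how long the healthcheck waited before each restart and which server address OpenVPN tried each time. The function name and the regular expressions are illustrative and written against the log format shown here; they are not part of Gluetun.

```python
import re

# Patterns written against the log lines shown above (assumed format).
UNHEALTHY = re.compile(r"\[healthcheck\] program has been unhealthy for (\d+)s")
REMOTE = re.compile(r"\[openvpn\] UDP link remote: \[AF_INET\]([\d.]+):(\d+)")


def summarize_restarts(log_text):
    """Return (waits_in_seconds, remotes) found in a Gluetun log dump."""
    waits = [int(m.group(1)) for m in UNHEALTHY.finditer(log_text)]
    remotes = [(m.group(1), int(m.group(2))) for m in REMOTE.finditer(log_text)]
    return waits, remotes


# Four lines copied from the log above, used as a tiny sample input.
sample = """\
2024-03-12T20:40:37+01:00 INFO [openvpn] UDP link remote: [AF_INET]146.70.155.11:53
2024-03-12T20:40:43+01:00 INFO [healthcheck] program has been unhealthy for 6s: restarting VPN
2024-03-12T20:40:43+01:00 INFO [openvpn] UDP link remote: [AF_INET]43.250.205.50:53
2024-03-12T20:40:54+01:00 INFO [healthcheck] program has been unhealthy for 11s: restarting VPN
"""

waits, remotes = summarize_restarts(sample)
print(waits)
print(remotes)
```

Run against the full log above, this gives waits of 6, 11, 16, 21 and 26 seconds and six different server addresses, all on UDP port 53, with no further OpenVPN output after each "UDP link remote" line.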

Share your configuration

version: "3"
services:
  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun
    cap_add:
      - NET_ADMIN
    ports:
      #- 8888:8888/tcp # HTTP proxy
      #- 8388:8388/tcp # Shadowsocks
      #- 8388:8388/udp # Shadowsocks

      #- 8080:8080 # qbittorrent
      #- 9091:9091 # transmission
      - 9696:9696 # prowlarr
      - 8191:8191 # flaresolverr
    #volumes:
    #  - ./NL-ovpn-tcp.conf:/gluetun/custom.conf:ro
    environment:
      - LOG_LEVEL=debug
      - VPN_SERVICE_PROVIDER=purevpn
      - VPN_TYPE=openvpn
      #- OPENVPN_CUSTOM_CONFIG=/gluetun/custom.conf
      - OPENVPN_USER=****
      - OPENVPN_PASSWORD=****
      - COUNTRIES=Netherlands
      - TZ=Europe/Amsterdam
      #- HTTPPROXY=on
      #- SHADOWSOCKS=on
      - FIREWALL_OUTBOUND_SUBNETS=192.168.178.0/24
    restart: unless-stopped

@r3ps4J r3ps4J changed the title from "Bug: PureVPN constantly restarts due to being unhealthy" to "Bug: vpn constantly restarts due to being unhealthy" on Mar 12, 2024

@k-matti

k-matti commented Mar 16, 2024

have the same problem, did you find the solution?

@r3ps4J
Author

r3ps4J commented Mar 16, 2024

No I can't figure it out. My system is just down till there is a fix

@vogtrj

vogtrj commented Mar 17, 2024

I'm having the same problem as well, also with openvpn through proton. Had previously worked without issue for at least the past year and stopped working some time around the last week or so. I tried rolling my openvpn credentials to see if I could revive gluetun, but no luck. I'm using a mix of openvpn and wireguard tunnels through proton in other applications and haven't had any problems so I think this is something unique to gluetun.

@misku

This comment was marked as off-topic.

@misku

This comment was marked as off-topic.

@nomisunrider

nomisunrider commented Mar 20, 2024

Same problem for me here. I've tried multiple versions and they all result in:

INFO [healthcheck] program has been unhealthy for 26s: restarting VPN (see https://github.com/qdm12/gluetun-wiki/blob/main/faq/healthcheck.md)

Has something external used in these healthchecks changed?

@miltykiss

This comment was marked as off-topic.

@NebulaBC

NebulaBC commented Mar 22, 2024

Confirming that the latest version with this fix fixes this issue for me.

I tried this fix (both the :test docker tag and :latest) but it changed absolutely nothing for me. My container is still spam restarting with these exact same errors. Nothing external changed for me, I didn’t even update anything. My whole docker network just came crumbling down one day when my gluetun container started flipping out. I don’t get any network connection at all from my containers, and I’ve verified that it’s not a DOT problem, and I’ve also tried a new VPN config.

Edit: I’ve just switched to linuxserver.io’s wireguard container. I’d recommend the same to anyone else who’s having this problem if they’re willing to read the docs. It works just as well as gluetun. I hate to shill another project in the issues, but for some reason it doesn’t seem to be a priority fixing this.

@r3ps4J
Author

r3ps4J commented Mar 22, 2024

I tried it on latest as well, problem still occurs.

@rickykresslein

rickykresslein commented Mar 24, 2024

I downgraded to :v3 and it's working again for me. Obviously not ideal, but it works.
Edit: Never mind, the problem continues.

@vogtrj

vogtrj commented Mar 24, 2024

Confirming that the latest version with this fix fixes this issue for me.

I tried this fix (both the :test docker tag and :latest) but it changed absolutely nothing for me. My container is still spam restarting with these exact same errors. Nothing external changed for me, I didn’t even update anything. My whole docker network just came crumbling down one day when my gluetun container started flipping out. I don’t get any network connection at all from my containers, and I’ve verified that it’s not a DOT problem, and I’ve also tried a new VPN config.

Edit: I’ve just switched to linuxserver.io’s wireguard container. I’d recommend the same to anyone else who’s having this problem if they’re willing to read the docs. It works just as well as gluetun. I hate to shill another project in the issues, but for some reason it doesn’t seem to be a priority fixing this.

FWIW, I'm still using gluetun but switched my protocol from OpenVPN to Wireguard and it's been stable ever since. That may be preferable for most people, compared to changing to a totally different container. I'd prefer to use OpenVPN over Wireguard for this application, since with Wireguard I have to specify one specific server to establish the tunnel, and that server may be down or heavily loaded. With OpenVPN I can just specify the VPN server city and get auto-assigned to an available VPN server. For now I'm biding my time on Wireguard and plan to switch back to OpenVPN when this bug is hopefully fixed soon.

@rickykresslein

@vogtrj I'm using Wireguard and still have the issue. Glad it's working for you though.

@qdm12
Owner

qdm12 commented Mar 26, 2024

Message for everyone:

I've spent one hour writing an answer to each of your comments, your username is tagged below, please read it. And in general, applicable to all of you:

  1. UPDATE YOUR SERVERS DATA. It looks like not many of you (any of you?) ran the command to update it.
  2. Please do not jump commenting on the issue unless you have the exact same problem with the same provider at the very least. If it's another provider, open another issue.
  3. I do not prioritize similar "unhealthy" issues, see the comment addressed to @NebulaBC below why. This is also now mentioned in the wiki.

@r3ps4J

The VPN server IP address you are trying to connect to is no longer valid 🔌 Update your server information

And

[storage] creating /gluetun/servers.json with 17820 hardcoded servers

shows you did not update your servers. This is likely the only problem, update your servers. It's a single command to run, please run it.


@vogtrj your provider is different, please create another issue. Also make sure you've updated your servers, as well.

switch back to OpenVPN when this bug is hopefully fixed soon.

Not really a bug, so it won't be fixed. It's a VPN server issue or your servers data is outdated and you can update it.


@misku regarding "fails ipinfo health checks": see the updated healthcheck page. The relevant sentence for this would be "All of the above are NOT causes, but consequences of the VPN not working."

It would be great if there was a way to disable healthcheck all together (at least for debugging).

No, there would be no point. If network doesn't work for 6 seconds, failing each request sent every second, then it likely won't recover, and it's better to just restart the VPN internally. This was complex and implemented to resolve many problems (years ago) of VPN connection being 'up' but not working. For debugging (although I'm not sure what you can debug, since the VPN connection is dead anyway), you can change the VPN auto healing parameters, again, read the healthcheck page mentioned above.
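
The restart timing described here can be sketched as follows. This is a toy model, not Gluetun's code: it assumes the allowed unhealthy time starts at the "Initial duration" from the settings summary and grows by the "Additional duration" after every consecutive failed attempt, which matches the 6s, 11s, 16s, 21s, 26s sequence in the original logs. The function name is hypothetical.

```python
def restart_waits(initial_s=6, additional_s=5, attempts=5):
    """Seconds the VPN may stay unhealthy before each consecutive restart.

    Toy model only: defaults mirror the 'VPN wait durations' printed in the
    settings summary (initial 6s, additional 5s).
    """
    return [initial_s + additional_s * n for n in range(attempts)]


print(restart_waits())
```

Under this model, raising the initial duration gives a slow server more time to come up before the first restart; the healthcheck page lists the settings that control both values.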

Your issue might be different (on top of being PIA instead of PureVPN). Please create a separate issue if you want to continue discussing your problem. But from a quick look, it may be related to the DNS over TLS setup downloading files

2024-03-17T22:52:45Z INFO [healthcheck] healthy!
2024-03-17T22:52:46Z INFO [ip getter] Public IP address is 84.247.116.27 (Netherlands, North Holland, Amsterdam)
2024-03-17T22:53:02Z INFO [dns] downloading DNS over TLS cryptographic files
2024-03-17T22:53:03Z INFO [dns] downloading hostnames and IP block lists
2024-03-17T22:53:14Z INFO [healthcheck] program has been unhealthy for 6s: restarting VPN

It seems to go unhealthy after or while downloading the files from the "downloading hostnames and IP block lists" step, which are a few megabytes, and the download of block files doesn't seem to complete (or unbound would be launched). It looks like the VPN connection works until you try to download a few megabytes and then it fails; that's probably a VPN server limitation, not something to do with Gluetun. Anyway, it looks like DOT=off solved it for you (at your own risk 😄), I'm marking your comments as off-topic.


@nomisunrider

Has something external used in these healthchecks changed?

No. Update your servers data.


@NebulaBC

for some reason it doesn’t seem to be a priority fixing this.

The reasons are:

  • in 90% of cases, this kind of issue is due to outdated VPN server IP addresses or the VPN server misbehaving. In both cases, I cannot do anything about it.
  • this kind of issue is opened by someone every week at least, and it's getting very repetitive. I've even added a log stating DO NOT OPEN AN ISSUE in the program logs to reduce this.
  • I need to focus on more important fixes/features/maintenance I can do something about. I really cannot do much for this kind of issue. I'm the sole maintainer, not really getting paid, working on it in my free time, so I can't allocate many resources to this kind of issue.

Don't use the :test image. It's often garbage-ish code I'm trying on my machines for development 😄 I guess I should host my own registry...

@r3ps4J
Author

r3ps4J commented Mar 26, 2024

@qdm12 thanks for your detailed response. As I mentioned in my initial post, I've tried this with a manual configuration downloaded from PureVPN directly as well, which resulted in the same problem. Nevertheless, I'll try to update the servers again and let you know.

@r3ps4J
Author

r3ps4J commented Mar 26, 2024

Didn't change anything. I ultimately ran into some errors while updating servers, but I do have servers in servers.json, so I'm assuming they are updated.

2024-03-26T09:52:53+01:00 INFO updating Purevpn servers...
gluetun  | 2024-03-26T09:53:00+01:00 WARN reached the maximum number of consecutive failures: 2 failed attempts resolving my2-auto-udp.ptoserver.com: lookup my2-auto-udp.ptoserver.com on 127.0.0.11:53: no such host
gluetun  | 2024-03-26T09:53:00+01:00 WARN reached the maximum number of consecutive failures: 2 failed attempts resolving th2-auto-udp.ptoserver.com: lookup th2-auto-udp.ptoserver.com on 127.0.0.11:53: no such host
gluetun  | 2024-03-26T09:53:00+01:00 WARN reached the maximum number of consecutive failures: 2 failed attempts resolving my2-auto-tcp.ptoserver.com: lookup my2-auto-tcp.ptoserver.com on 127.0.0.11:53: no such host
gluetun  | 2024-03-26T09:53:00+01:00 WARN reached the maximum number of consecutive failures: 2 failed attempts resolving th2-auto-tcp.ptoserver.com: lookup th2-auto-tcp.ptoserver.com on 127.0.0.11:53: no such host

This one happened last time. Guessing it's because of restarting/rerunning the command a couple of times.

2024-03-26T09:55:21+01:00 ERROR updating server information: getting servers: too many requests sent for this month from https://ipinfo.io/87.249.135.102: 429 429 Too Many Requests

@misku

misku commented Mar 26, 2024

@qdm12 Thank you for taking the time to answer and updating the healthcheck page. Just wanted to let you know I did update the servers data before posting here 😄 If DOT=off worked, then the problem has to be DNS related. It's super odd that with DOT=on name resolution works for some time and then it dies. It could also be that the resolution doesn't work for some of the addresses that gluetun is using, hence the healthcheck failure. I'll look into it more and open a new issue if necessary. Thanks again and keep up the good work 👍

@kainzilla

Sorry in advance for this wall of text, I hope this helps anyone at all:

tldr: For me, UDP-based VPNs (both Wireguard and OpenVPN) experience this issue, but TCP-based OpenVPN works without connection restarts.

I've been experiencing this issue for a while and have been trying to troubleshoot it. Because health checks can fail for any number of reasons, a lot of people in this thread might be seeing different issues. Possible causes include:

  • intermittently failing DNS lookups (perhaps a local PiHole rate-limiting your lookups with its default settings?)
  • problems with UDP traffic specifically (which is what I've finally confirmed happened in my case)
  • the VPN service or your ISP not being stable
  • rate limiting by the site the health check uses (cloudflare.com by default) because it's being hit so often
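
To tell some of these causes apart, one option is to probe the health target yourself, both inside and outside the container. A minimal sketch, assuming the check amounts to resolving the target and opening a TCP connection to it (cloudflare.com:443 in the settings summary at the top of this issue); the function name, return strings and timeout are illustrative and this is not Gluetun's implementation:

```python
import socket


def probe(host, port, timeout_s=3.0):
    """Classify a connection attempt as 'ok', 'dns-failed' or 'connect-failed'."""
    try:
        # Resolve first so a DNS problem is reported separately.
        info = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns-failed"
    address = info[0][4][:2]  # (ip, port) of the first resolved address
    try:
        with socket.create_connection(address, timeout=timeout_s):
            return "ok"
    except OSError:
        # Refused connections, unreachable networks and timeouts land here.
        return "connect-failed"
```

For example, `probe("cloudflare.com", 443)` returning "dns-failed" points at name resolution, while "connect-failed" means the name resolved but no TCP connection could be opened.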

General shape of my version of the issue:

  • Sample server running this container in this case was:
    • Wireguard-based VPN using UDP protocol to paid Proton VPN server, port-forwarding in use, originally with an external script and later on with the integrated Proton VPN port-forward support in Gluetun itself
    • Red Hat Enterprise Linux 9.1 through 9.3 (OS upgrades over time)
    • podman running the containers, rootful or rootless both experience issue
    • Multiple CPUs, motherboards, and memory used (hardware upgrades over time)
    • Tested with multiple different NICs due to motherboard changes, but also an add-in 10Gb Aquantia and currently on a 2.5Gb Realtek integrated NIC on mobo
    • Multiple network routers have been in place during testing span (TPLink retail, OPNsense self-built)
    • Multiple network paths tested including direct connection to routers to eliminate switches
    • Multiple different ProtonVPN servers / locations tested, same issue occurring on all of them
    • Multiple different ISPs tested
    • Tested with IPv6 both enabled / disabled for the home network and server
  • Unknown date when it started happening but at least months ago - issue may have always been present
  • This issue was initially noticed due to Deluge and qBittorrent handling VPN restarts poorly; every time the VPN would restart these two torrent clients would start failing to respond on their actually-working forwarded ports; torrent client Transmission is currently used because it does not fail when the VPN reconnects.
  • The frequency of the issue varies greatly, which made it difficult to test for. Sometimes it wouldn't happen for 12+ hours, sometimes it happened once every two minutes
  • The issue appears to happen more under higher load, but it's very unclear what portion of load might trigger it - bandwidth consumption, amount of DNS lookups, total connections to torrent peers, etc.; could never pin down what caused it to happen more often other than "lots of torrents".
  • If the VPN link was left idle, these restarts almost never occurred but I can't hard-verify they never happened.
  • Special note: I'm in the unusual circumstance that I have configured a second server at a second location that's identical software-wise to the first server, including using the same VPN server, the same ISP, the same containers, and the exact same VPN settings, and this other server doesn't have this issue. This leads me to believe the issue is some weird very specific combination of hardware + software features that can trigger this issue in my case because this has been tested with wildly different NICs, routers, and even multiple ISPs at this point in time.

Tests performed that made no difference:

  • As mentioned above multiple of the following were tested "incidentally" just through changes over time:
    • CPU / mobo / memory / NICs / switches / routers / modems / ISPs
    • Observed on RHEL 9.1-9.3 OS (podman-based containers) and also Unraid OS (docker-based containers)
  • All possible ethtool features that could be toggled on/off (meaning does not show "forced" status) for Aquantia and RealTek NICs were tested individually toggled from their default
  • Because RHEL uses podman and podman can be run rootlessly, both a rootful + full privileges configuration and rootless configuration were tested
  • Because the restarts were being triggered by failed health checks, I tried some testing focused on the health checks in particular:
    • Set DOT=off as suggested in this thread (which changes DNS handling in the Gluetun container)
    • Set DNS_ADDRESS=10.8.8.1 to entirely bypass the Gluetun DNS stack and direct DNS requests to Proton VPN's integrated DNS servers, to eliminate the Gluetun DNS lookup as a possible source of failure
    • Set the HEALTH_TARGET_ADDRESS= to my public IP (meaning no DNS lookup) and a port opened to an HTTPS service that I could verify would absolutely not fail to reply when the connection was working (meaning no possible rate-limiting to HTTPS responses)
    • Tried increasing the HEALTH_VPN_DURATION_INITIAL= to higher values to confirm it wasn't temporary interruptions, using values of up to 60s; confirmed it absolutely does not get a response for even 60s
    • Tried setting HEALTH_VPN_DURATION_INITIAL=3600s (a full hour) so that I could specifically test whether other traffic still worked even though the health-check page load was failing. This caused an unexpected behavior where the networking for the server stopped responding. The server itself was still responsive and usable; the networking just died. I was baffled by this result, and I never had an opportunity to confirm whether the VPN can still send other traffic when the health checks start failing
  • Just before finding my personal solution to the issue, I attempted the Gluetun container with OpenVPN protocol, set to UDP, to the exact same Proton VPN server, with the exact same settings otherwise, and found that it experienced the same interruptions in exactly the same way, so the Wireguard / OpenVPN protocols didn't appear to be involved
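
For anyone repeating the health-check experiments above, it can help to see which stage fails. The following is a minimal Python sketch, not Gluetun's actual healthcheck: it resolves a name and then attempts a plain TCP connection, reporting the stage that failed. The function name probe, the result keys, and the 6-second default (borrowed from the default initial wait mentioned in this thread) are illustrative assumptions.

```python
import socket
import time


def probe(host: str, port: int, timeout: float = 6.0) -> dict:
    """Resolve host, then TCP-connect, reporting which stage failed.

    Hypothetical helper for troubleshooting; it does not reproduce
    Gluetun's own healthcheck logic.
    """
    result = {"stage": "dns", "ok": False, "error": None, "elapsed": 0.0}
    start = time.monotonic()
    try:
        # Stage 1: name resolution (returns immediately for an IP literal).
        infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
        address = infos[0][4]  # (ip, port) tuple of the first IPv4 result
        # Stage 2: plain TCP dial, with no TLS handshake and no HTTP request.
        result["stage"] = "dial"
        with socket.create_connection(address, timeout=timeout):
            result["stage"] = "done"
            result["ok"] = True
    except OSError as exc:  # covers DNS errors, refusals and timeouts
        result["error"] = str(exc)
    result["elapsed"] = time.monotonic() - start
    return result
```

Running probe("cloudflare.com", 443) from a container that shares Gluetun's network, next to a probe of a bare IP address, should show whether failures come from DNS or from the connection itself. Note that getaddrinfo has no timeout of its own here, so a hung resolver can block for longer than the timeout value.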

TCP worked:

Because OpenVPN supports TCP as another protocol option, I created a TCP-specific Proton VPN configuration for the same server and otherwise exact same settings, and the connectivity no longer fails; over 5 days under high load with no disconnections and no further failures whatsoever.

In my case, it appears that something within the networking chain specifically is getting tripped up by UDP and UDP alone, but it's not clear at all what was involved because the exact same set of software is working flawlessly with UDP + Wireguard on another server with the exact same software configuration, on this exact same ISP to this exact same VPN server, in this immediate area, with zero disconnects for weeks. I'm honestly baffled, but I hope this wall of text gives others ideas on things they might not have checked yet such as testing Proton VPN's integrated DNS servers located at 10.8.8.1 / 10.7.7.1, or testing OpenVPN with TCP protocol if they're currently using UDP.

@rickykresslein

@kainzilla Thank you for sharing that, I think that must be my issue! I'm using ProtonVPN over Wireguard with UDP. I'll try changing to TCP and report back.

@nomisunrider

Similar, I've been using UDP but do have TCP options. Will test and see if there is a difference.

@r3ps4J
Author

r3ps4J commented Mar 26, 2024

PureVPN uses openvpn tcp I think, definitely openvpn.

@alarys

alarys commented Apr 22, 2024

I have the same issue. I was very happy with qmcgaw/gluetun, and it was working flawlessly for many months. Now the healthcheck is consistently failing and not allowing the tunnel to come up.

I use Privado as a VPN, and it requires UDP (so I can't test or use TCP).

I've tried to roll back to v3.37.0, and even v3.32 but those have the same issue. I've even tried it on a fresh k3s install on a separate system, and it's the same.

I'm quite perplexed. I have checked all the items that the healthcheck wiki says to check, and they are all fine. It would be great if the healthcheck gave a bit more information on what is failing, instead of just saying "program has been unhealthy".

For now, I'm using a completely different VPN solution.

Hopefully this can get resolved. I really liked this solution.

@qdm12
Owner

qdm12 commented Apr 25, 2024

@kainzilla Awesome write up, thank you for this. I'll link this in the wiki. This is very scientific and really narrows down the problem source area to, well, UDP. Actually a few things crossing my mind related to this:

  1. I've just added the variable WIREGUARD_PERSISTENT_KEEPALIVE_INTERVAL to the latest image. Can you try Wireguard and set it to, for example, 25s to see if it improves stability?
  2. The protonvpn openvpn config doesn't have a ping instruction, it would be interesting to test with one 🤔 (see func (p *Provider) OpenVPNConfig(connection models.Connection, … in the ProtonVPN provider code)
  3. The protonvpn openvpn config fiddles with mtu and mssfix, maybe these can be changed to have a more stable connection?

On the other hand, I'm quite curious (I think I have seen it somewhere else) about Gluetun taking the whole host network down...

this caused an unexpected behavior where the networking for the server stopped responding. Server itself was still responsive and usable, the networking just died. I was baffled by this particular testing result, I never had an opportunity to confirm if the VPN can still send other traffic when the health checks start failing

Can you create a separate issue for this? Maybe it exhausts the TCP dialing somehow. Especially if the healthcheck timeout is at 3600s, it won't touch the vpn and will just keep on retrying to tcp dial Cloudflare.com:443. Very odd indeed.


@r3ps4J

It would be great if the healthcheck gave a bit more information on what is failing, instead of just saying "program has been unhealthy".

You can set LOG_LEVEL=debug to see details, but there's a 99% chance it's just an i/o timeout error, meaning no data is received at all when trying to reach cloudflare.com:443.

I've tried this with a manual configuration downloaded from PureVPN directly as well which resulted in the same problem

Yes it's probably due either to their VPN server or the udp connection being unreliable and cutting off for longer than 6s (or both). Maybe an openvpn configuration problem even in the official one, at least for udp. Try fiddling with the ping, mssfix and mtu options?


@misku

😄 If DOT=off worked, then the problem has to be DNS related. It's super odd that with DOT=on name resolution works for some time and then it dies. It could also be that the resolution doesn't work for some of the addresses that gluetun is using, hence the healthcheck failure.

  • DOT=on uses TLS+DNSSEC which both make tcp packets possibly larger, maybe tunneling that over UDP makes it fail. Again, I don't have any issue with Mullvad for example so it might be the vpn server and/or the physical connection to the vpn server being unreliable. Same comment as above, try fiddling with ping/mtu/mssfix maybe?
  • Gluetun only uses cloudflare.com for the healthcheck (also configurable)

@r3ps4J
Author

r3ps4J commented Apr 25, 2024

@qdm12 I already have debug turned on I believe, I think you meant to ping @alarys for that one.

@kainzilla

@qdm12 Thanks for the reply! I'm working on testing out some of the suggestions to see if I can further map out the issue.

  1. I've just added on the latest image the variable WIREGUARD_PERSISTENT_KEEPALIVE_INTERVAL can you try Wireguard and set this one for example to 25s to see if it improves stability?

No change I can perceive - it appears to happen still approx. every 1-4 hours, which seems consistent with the 'before' behavior.

  2. The protonvpn openvpn config doesn't have a ping instruction, it would be interesting to test with one 🤔

I'm currently testing this suggestion for the OpenVPN UDP connection type. So far I've been using the OPENVPN_CUSTOM_CONFIG option pointed at OpenVPN configurations taken directly from Proton VPN's site. The settings look similar to the integrated Gluetun support for Proton VPN, but here are the literal settings, including the just-added ping setting:

client
dev tun
proto udp

remote <server IP snipped> 80
remote <server IP snipped> 4569
remote <server IP snipped> 51820
remote <server IP snipped> 5060
remote <server IP snipped> 1194

remote-random
resolv-retry infinite
nobind

cipher AES-256-GCM

setenv CLIENT_CERT 0
tun-mtu 1500
mssfix 0
persist-key
persist-tun

reneg-sec 0

remote-cert-tls server
auth-user-pass

# Added for testing 2024-04-28:
ping 10

The TCP configuration file is identical aside from proto udp being proto tcp instead and a shorter list of ports for the server; the same TCP configuration file is currently still maintaining great multi-week connectivity.
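
To double-check that two configuration files really differ only in the intended lines, a small script can compare them directive by directive. This is a rough Python sketch; parse_directives and diff_configs are hypothetical helper names, and remote lines are ignored by default because the server and port list is expected to differ.

```python
def parse_directives(text: str) -> dict:
    """Map each OpenVPN directive name to the list of its argument strings."""
    directives = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        parts = line.split(None, 1)
        args = parts[1].strip() if len(parts) > 1 else ""
        directives.setdefault(parts[0], []).append(args)
    return directives


def diff_configs(a: str, b: str, ignore=("remote",)) -> dict:
    """Return {directive: (values_in_a, values_in_b)} for directives that differ.

    A value of None means the directive is absent from that config.
    """
    da, db = parse_directives(a), parse_directives(b)
    changed = {}
    for name in sorted(set(da) | set(db)):
        if name in ignore:
            continue
        if da.get(name) != db.get(name):
            changed[name] = (da.get(name), db.get(name))
    return changed
```

Feeding it the UDP and TCP files should report only proto (plus ping, if it was added to just one of them); anything else in the result is an unintended difference worth ruling out.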

  3. The protonvpn openvpn config fiddles with mtu and mssfix, maybe these can be changed to have a more stable connection?

I'll be happy to test new values for these if you have suggestions - even in the working TCP configuration it does occasionally log a message about MTU mismatch that I haven't dug into, I'll collect those messages to post as well. Because those log entries only show on the OpenVPN connection type and not Wireguard (which appears to have an untouched 1400 MTU and no log entries for MTU), I think it won't relate to the UDP disconnects but I'm happy to test things out.
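
As a sanity check on the Wireguard side, the outer packet size can be worked out from the tunnel MTU. The sketch below assumes the commonly cited Wireguard data-packet overhead (outer IP header, 8-byte UDP header, 32 bytes of Wireguard framing) and a 1500-byte path MTU, which may not hold on every link (PPPoE, for example, is usually lower). The function names are illustrative.

```python
# Assumed per-packet overhead in bytes: outer IP header + UDP header + Wireguard framing.
WIREGUARD_OVERHEAD = {4: 20 + 8 + 32, 6: 40 + 8 + 32}


def outer_packet_size(tunnel_mtu: int, ip_version: int = 4) -> int:
    """Largest packet seen on the physical link for a given tunnel MTU."""
    return tunnel_mtu + WIREGUARD_OVERHEAD[ip_version]


def fits(tunnel_mtu: int, path_mtu: int = 1500, ip_version: int = 4) -> bool:
    """True if a full-size tunnel packet fits in the path MTU unfragmented."""
    return outer_packet_size(tunnel_mtu, ip_version) <= path_mtu
```

With a 1400 tunnel MTU the outer packets come to 1460 bytes over IPv4 and 1480 over IPv6, so a plain 1500-byte path should carry them without fragmentation.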

Can you create a separate issue for this? Maybe it exhausts the TCP dialing somehow. Especially if the healthcheck timeout is at 3600s, it won't touch the vpn and will just keep on retrying to tcp dial Cloudflare.com:443. Very odd indeed.

Absolutely! Let me first confirm that I can re-create that specific issue, after working through some of the other tests. It only happened once, and because changing the healthcheck setting caused the network loss, I reverted it immediately. Digging into this could be interesting.

@alarys

alarys commented May 3, 2024

Looks like the issue I posted is not related at all. I was using Privado, with Country: Switzerland. I changed Country to Canada, and it started working.

@giorgiooriani

giorgiooriani commented May 5, 2024

I am having a similar issue that seems to be related to bandwidth (not sure). When downloading a torrent at 80-100 Mbps I get unhealthy checks on the container even though the download works fine. I am also not getting any information in the logs; the container just switches to unhealthy, then healthy, and back and forth.

I wish I could check/provide the logs for the error but the logs just stop at some point before the error, even in debug mode.

Using Windscribe on the latest built on 2024-05-04T16:22:29.394Z (commit ef6874f).

Below are the debug logs. After the healthy check it just flips between healthy and unhealthy even though the torrent in the background is downloading just fine. When I'm not downloading anything it stays healthy.

========================================
========================================
=============== gluetun ================
========================================
=========== Made with ❤️ by ============
======= https://github.com/qdm12 =======
========================================
========================================
Running version latest built on 2024-05-04T16:22:29.394Z (commit ef6874f)
🔧 Need help? https://github.com/qdm12/gluetun/discussions/new
🐛 Bug? https://github.com/qdm12/gluetun/issues/new
✨ New feature? https://github.com/qdm12/gluetun/issues/new
☕ Discussion? https://github.com/qdm12/gluetun/discussions/new
💻 Email? quentin.mcgaw@gmail.com
💰 Help me? https://www.paypal.me/qmcgaw https://github.com/sponsors/qdm12
2024-05-07T13:00:09+02:00 INFO [routing] default route found: interface eth0, gateway 10.0.5.1, assigned IP 10.0.5.15 and family v4
2024-05-07T13:00:09+02:00 INFO [routing] local ethernet link found: eth0
2024-05-07T13:00:09+02:00 INFO [routing] local ipnet found: 10.0.5.0/24
2024-05-07T13:00:10+02:00 INFO [firewall] enabling...
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --policy INPUT DROP
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --policy OUTPUT DROP
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --policy FORWARD DROP
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --policy INPUT DROP
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --policy OUTPUT DROP
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --policy FORWARD DROP
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --append INPUT -i lo -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --append INPUT -i lo -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -o lo -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --append OUTPUT -o lo -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --append OUTPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --append INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --append INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -o eth0 -s 10.0.5.15 -d 10.0.5.0/24 -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] ip6tables-legacy --append OUTPUT -o eth0 -d ff02::1:ff/104 -j ACCEPT
2024-05-07T13:00:10+02:00 DEBUG [firewall] iptables-legacy --append INPUT -i eth0 -d 10.0.5.0/24 -j ACCEPT
2024-05-07T13:00:10+02:00 INFO [firewall] enabled successfully
2024-05-07T13:00:11+02:00 INFO [storage] merging by most recent 19425 hardcoded servers and 19471 servers read from /gluetun/servers.json
2024-05-07T13:00:11+02:00 INFO [storage] Using nordvpn servers from file which are 44 days more recent
2024-05-07T13:00:11+02:00 INFO [storage] Using windscribe servers from file which are 124 days more recent
2024-05-07T13:00:11+02:00 DEBUG [netlink] IPv6 is not supported after searching 2 routes
2024-05-07T13:00:11+02:00 INFO Alpine version: 3.19.1
2024-05-07T13:00:12+02:00 INFO OpenVPN 2.5 version: 2.5.8
2024-05-07T13:00:12+02:00 INFO OpenVPN 2.6 version: 2.6.8
2024-05-07T13:00:13+02:00 INFO Unbound version: 1.19.3
2024-05-07T13:00:13+02:00 INFO IPtables version: v1.8.10
2024-05-07T13:00:13+02:00 INFO Settings summary:
├── VPN settings:
|   ├── VPN provider settings:
|   |   ├── Name: windscribe
|   |   └── Server selection settings:
|   |       ├── VPN type: wireguard
|   |       ├── Regions: Switzerland
|   |       └── Wireguard selection settings:
|   └── Wireguard settings:
|       ├── Private key: OMJ...H0=
|       ├── Pre-shared key: kVB...u4=
|       ├── Interface addresses:
|       |   └── 100.109.214.246/32
|       ├── Allowed IPs:
|       |   ├── 0.0.0.0/0
|       |   └── ::/0
|       └── Network interface: tun0
|           └── MTU: 1400
├── DNS settings:
|   ├── Keep existing nameserver(s): no
|   ├── DNS server address to use: 127.0.0.1
|   └── DNS over TLS settings:
|       ├── Enabled: yes
|       ├── Update period: every 24h0m0s
|       ├── Unbound settings:
|       |   ├── Authoritative servers:
|       |   |   └── cloudflare
|       |   ├── Caching: yes
|       |   ├── IPv6: no
|       |   ├── Verbosity level: 1
|       |   ├── Verbosity details level: 0
|       |   ├── Validation log level: 0
|       |   ├── System user: root
|       |   └── Allowed networks:
|       |       ├── 0.0.0.0/0
|       |       └── ::/0
|       └── DNS filtering settings:
|           ├── Block malicious: yes
|           ├── Block ads: no
|           ├── Block surveillance: no
|           └── Blocked IP networks:
|               ├── 127.0.0.1/8
|               ├── 10.0.0.0/8
|               ├── 172.16.0.0/12
|               ├── 192.168.0.0/16
|               ├── 169.254.0.0/16
|               ├── ::1/128
|               ├── fc00::/7
|               ├── fe80::/10
|               ├── ::ffff:127.0.0.1/104
|               ├── ::ffff:10.0.0.0/104
|               ├── ::ffff:169.254.0.0/112
|               ├── ::ffff:172.16.0.0/108
|               └── ::ffff:192.168.0.0/112
├── Firewall settings:
|   ├── Enabled: yes
|   └── Outbound subnets:
|       ├── 172.20.0.0/16
|       ├── 10.0.1.0/24
|       └── 10.0.5.0/24
├── Log settings:
|   └── Log level: debug
├── Health settings:
|   ├── Server listening address: 127.0.0.1:9999
|   ├── Target address: cloudflare.com:443
|   ├── Duration to wait after success: 5s
|   ├── Read header timeout: 100ms
|   ├── Read timeout: 500ms
|   └── VPN wait durations:
|       ├── Initial duration: 6s
|       └── Additional duration: 5s
├── Shadowsocks server settings:
|   └── Enabled: no
├── HTTP proxy settings:
|   └── Enabled: no
├── Control server settings:
|   ├── Listening address: :8000
|   └── Logging: yes
├── OS Alpine settings:
|   ├── Process UID: 1032
|   ├── Process GID: 65537
|   └── Timezone: Europe/Rome
├── Public IP settings:
|   ├── Fetching: every 12h0m0s
|   ├── IP file path: /tmp/gluetun/ip
|   └── Public IP data API: ipinfo
└── Version settings:
    └── Enabled: yes
2024-05-07T13:00:13+02:00 INFO [routing] default route found: interface eth0, gateway 10.0.5.1, assigned IP 10.0.5.15 and family v4
2024-05-07T13:00:13+02:00 DEBUG [routing] ip rule add from 10.0.5.15/32 lookup 200 pref 100
2024-05-07T13:00:13+02:00 INFO [routing] adding route for 0.0.0.0/0
2024-05-07T13:00:13+02:00 DEBUG [routing] ip route replace 0.0.0.0/0 via 10.0.5.1 dev eth0 table 200
2024-05-07T13:00:13+02:00 INFO [firewall] setting allowed subnets...
2024-05-07T13:00:13+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -o eth0 -s 10.0.5.15 -d 172.20.0.0/16 -j ACCEPT
2024-05-07T13:00:13+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -o eth0 -s 10.0.5.15 -d 10.0.1.0/24 -j ACCEPT
2024-05-07T13:00:13+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -o eth0 -s 10.0.5.15 -d 10.0.5.0/24 -j ACCEPT
2024-05-07T13:00:13+02:00 INFO [routing] default route found: interface eth0, gateway 10.0.5.1, assigned IP 10.0.5.15 and family v4
2024-05-07T13:00:13+02:00 INFO [routing] adding route for 172.20.0.0/16
2024-05-07T13:00:13+02:00 DEBUG [routing] ip route replace 172.20.0.0/16 via 10.0.5.1 dev eth0 table 199
2024-05-07T13:00:13+02:00 DEBUG [routing] ip rule add to 172.20.0.0/16 lookup 199 pref 99
2024-05-07T13:00:13+02:00 INFO [routing] adding route for 10.0.1.0/24
2024-05-07T13:00:13+02:00 DEBUG [routing] ip route replace 10.0.1.0/24 via 10.0.5.1 dev eth0 table 199
2024-05-07T13:00:13+02:00 DEBUG [routing] ip rule add to 10.0.1.0/24 lookup 199 pref 99
2024-05-07T13:00:13+02:00 INFO [routing] adding route for 10.0.5.0/24
2024-05-07T13:00:13+02:00 DEBUG [routing] ip route replace 10.0.5.0/24 via 10.0.5.1 dev eth0 table 199
2024-05-07T13:00:13+02:00 DEBUG [routing] ip rule add to 10.0.5.0/24 lookup 199 pref 99
2024-05-07T13:00:13+02:00 DEBUG [routing] ip rule add to 10.0.5.0/24 lookup 254 pref 98
2024-05-07T13:00:13+02:00 INFO [dns] using plaintext DNS at address 1.1.1.1
2024-05-07T13:00:13+02:00 INFO [http server] http server listening on [::]:8000
2024-05-07T13:00:13+02:00 INFO [healthcheck] listening on 127.0.0.1:9999
2024-05-07T13:00:13+02:00 DEBUG [wireguard] Wireguard server public key: G7LkwWk08Ase/Wi9mnOW77brNBC0vTCemvy1IW1nlV4=
2024-05-07T13:00:13+02:00 DEBUG [wireguard] Wireguard client private key: OMJ...H0=
2024-05-07T13:00:13+02:00 DEBUG [wireguard] Wireguard pre-shared key: kVB...u4=
2024-05-07T13:00:13+02:00 INFO [firewall] allowing VPN connection...
2024-05-07T13:00:13+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -d 169.150.197.163 -o eth0 -p udp -m udp --dport 1194 -j ACCEPT
2024-05-07T13:00:13+02:00 DEBUG [firewall] iptables-legacy --append OUTPUT -o tun0 -j ACCEPT
2024-05-07T13:00:13+02:00 DEBUG [firewall] ip6tables-legacy --append OUTPUT -o tun0 -j ACCEPT
2024-05-07T13:00:13+02:00 INFO [wireguard] Using available kernelspace implementation
2024-05-07T13:00:13+02:00 INFO [wireguard] Connecting to 169.150.197.163:1194
2024-05-07T13:00:13+02:00 INFO [wireguard] Wireguard setup is complete. Note Wireguard is a silent protocol and it may or may not work, without giving any error message. Typically i/o timeout errors indicate the Wireguard connection is not working.
2024-05-07T13:00:13+02:00 INFO [dns] downloading DNS over TLS cryptographic files
2024-05-07T13:00:14+02:00 INFO [healthcheck] healthy!
2024-05-07T13:00:15+02:00 INFO [dns] downloading hostnames and IP block lists
2024-05-07T13:00:22+02:00 DEBUG [healthcheck] unhealthy: dialing: dial tcp4: lookup cloudflare.com: i/o timeout
2024-05-07T13:00:22+02:00 INFO [dns] init module 0: validator
2024-05-07T13:00:22+02:00 INFO [dns] init module 1: iterator
2024-05-07T13:00:22+02:00 INFO [dns] start of service (unbound 1.19.3).
2024-05-07T13:00:22+02:00 INFO [dns] generate keytag query _ta-4a5c-4f66. NULL IN
2024-05-07T13:00:22+02:00 INFO [dns] ready
2024-05-07T13:00:22+02:00 INFO [vpn] You are running on the bleeding edge of latest!
2024-05-07T13:00:23+02:00 INFO [ip getter] Public IP address is 169.150.197.169 (Switzerland, Zurich, Zürich)
2024-05-07T13:00:23+02:00 INFO [healthcheck] healthy!
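
To see what kind of failures dominate in a longer debug log, the healthcheck lines can be tallied. The Python sketch below matches the line format seen in the log above; only the `lookup ...: i/o timeout` message shape is taken from that log, and the other categories are guesses at what else might appear. summarize and classify are hypothetical names.

```python
import re
from collections import Counter

# timestamp, level, then the [healthcheck] component and its message
LINE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]+) \[healthcheck\] (?P<msg>.*)$")


def classify(msg: str) -> str:
    """Put a healthcheck message into a coarse category."""
    if msg.startswith("healthy"):
        return "healthy"
    if "lookup " in msg:
        return "dns"  # name resolution failed before any TCP dial
    if "i/o timeout" in msg:
        return "dial-timeout"  # assumed shape of a TCP dial timeout
    return "other"


def summarize(log_text: str) -> Counter:
    """Count healthcheck lines per category, ignoring all other components."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE.match(line.strip())
        if match:
            counts[classify(match.group("msg"))] += 1
    return counts
```

A log dominated by "dns" entries points at name resolution (DOT, Unbound, or the upstream resolver), while "dial-timeout" entries suggest the tunnel itself stopped passing traffic.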

@qdm12
Owner

qdm12 commented May 10, 2024

Healthcheck logic was changed a bit in 6042a9e this could help a bit (see the commit message for details).

@joestump

joestump commented May 22, 2024

@r3ps4J I recently ran into this constant restart issue with Gluetun, which led me to this thread. First, I want to thank you, @kainzilla, and @qdm12 for a very thoughtful discussion on this tricky issue. I wanted to post the diff that stabilized my setup along with the details of my various fixes.

My configuration:

  • Ubuntu LTS 22.04 VM running on a TrueNAS server
  • Docker 25.0.5
  • qBittorrent
  • SABnzbd
  • Gluetun
  • docker-autoheal

The issue for me, I suspect, was one/all of the following:

  • Not having auto-updates enabled with UPDATER_PERIOD and UPDATER_VPN_SERVICE_PROVIDERS set correctly.
  • I was pinned to v3, which appears to be two months old now.
  • I didn't have health checks set on the clients I was routing through Gluetun.
  • I was getting rate-limited by IPInfo.io. I registered with my GitHub account and got an API token so I could set PUBLICIP_API and PUBLICIP_API_TOKEN.

The error in the logs:

2024-05-22T13:00:31-07:00 WARN [ip getter] too many requests sent for this month from https://ipinfo.io/: 429 429 Too Many Requests; not retrying.

I'd also see the following error on my Homepage Gluetun "widget" about the IP address being empty:

{"public_ip": ""}

Here's the diff that appears to have resolved the issue:

diff --git a/playbooks/services/arr-clients.yaml b/playbooks/services/arr-clients.yaml
index 941b363..895eda7 100644
--- a/playbooks/services/arr-clients.yaml
+++ b/playbooks/services/arr-clients.yaml
@@ -28,7 +28,7 @@
           group: Downloads
           weight: 100
       gluetun:
-        image: qmcgaw/gluetun:v3
+        image: qmcgaw/gluetun:latest
         enabled: true
         ports:
           - 8000:8000/tcp # Control server
@@ -124,11 +124,15 @@
               - "{{ paths.local }}/gluetun:/gluetun"
             environment:
               - "VPN_SERVICE_PROVIDER=cyberghost"
               - "OPENVPN_USER={{ vpn.openvpn.user }}"
               - "OPENVPN_PASSWORD={{ vpn.openvpn.pass }}"
               - "TZ={{ tz }}"
+              - "UPDATER_PERIOD=24h"
+              - "UPDATER_VPN_SERVICE_PROVIDERS=cyberghost"
               - "SERVER_COUNTRIES=United States"
               - "FIREWALL_OUTBOUND_SUBNETS=192.168.1.0/24"
+              - "PUBLICIP_API=ipinfo"
+              - "PUBLICIP_API_TOKEN={{ vpn.publicip_api_token }}"
 
   - name: Create main qBittorrent directory
     file:
@@ -176,6 +180,11 @@
               - "{{ paths.local }}/qbittorrent/config:/config"
               - "{{ paths.data }}/qbittorrent/downloads:/downloads"
               - "{{ paths.data }}/qbittorrent/cross-seeds:/cross-seeds"
+            healthcheck:
+              test: ["CMD-SHELL", "ping -c 1 google.com && curl --fail http://localhost:{{ torrent.ports.http }}/"]
+              interval: 1m
+              timeout: 10s
+              retries: 3
 
   - name: CNAME {{ torrent.dns }}.{{ dns.wtf }} to {{ ansible_host }}
     amazon.aws.route53:
@@ -227,6 +236,11 @@
             volumes:
               - "{{ paths.local }}/sabnzbd/config:/config"
               - "{{ paths.downloads }}/Downloads:/config/Downloads"
+            healthcheck:
+              test: ["CMD-SHELL", "ping -c 1 google.com && curl --fail http://localhost:{{ nzb.ports.http }}/"]
+              interval: 1m
+              timeout: 10s
+              retries: 3

The above health checks, along with docker-autoheal, ensure my clients are always running. When Gluetun restarts, the following happens:

  1. Health checks on clients fail.
  2. docker-autoheal restarts container.
  3. Profit 😃

Once my container was healthy, I ended up with this in the Gluetun logs:

2024-05-22 17:15:17	2024-05-22T14:15:17-07:00 INFO [http server] 200 GET /ip wrote 212B to 192.168.5.30:49780 in 385.727µs
2024-05-22 17:15:28	2024-05-22T14:15:28-07:00 INFO [http server] 200 GET /ip wrote 212B to 192.168.5.30:48064 in 192.656µs
2024-05-22 17:16:05	2024-05-22T14:16:05-07:00 INFO [http server] 200 GET /ip wrote 212B to 192.168.5.30:59392 in 131.21µs
2024-05-22 17:16:56	2024-05-22T14:16:56-07:00 INFO [http server] 200 GET /ip wrote 212B to 192.168.5.30:51008 in 189.31µs
2024-05-22 17:17:36	2024-05-22T14:17:36-07:00 INFO [http server] 200 GET /ip wrote 212B to 192.168.5.30:47416 in 238.739µs

There appears to be a back off of some sort as it's now polling IPInfo.io every 6 minutes or so.

@matthenning

matthenning commented Jun 3, 2024

Thank you everyone who has put time and effort in diagnosing this issue. I've been experiencing this issue for a few days now after running the exact same setup without issues for the last months.
My VPN provider is Mullvad and I can only get a stable connection using TCP, which will result in very slow speeds compared to UDP.

After starting the container I sometimes get the expected throughput for a few minutes (around 500 MBit/s) but sometimes the following issue will start immediately:
The download throughput will drop towards zero and continue in a sawtooth pattern between 0 and a few hundred KBit/s every few seconds. After another minute or two the healthcheck warnings start appearing and the vpn connection dies completely.

The fact that it doesn't die immediately every time seems weird. I've thought about ISP throttling, but from what I can gather that doesn't seem likely, especially because I don't create huge amounts of traffic. A few gig every now and then.

I've tried:

  • switching to Wireguard with different MTUs
  • different ports
  • different regions and servers
  • disabling DOT
  • of course rebooting everything including the host server, firewall and modem

If I can help narrow down the issue in any way, please let me know.

@prom3theu5

prom3theu5 commented Jun 19, 2024

How do you get around the fact that VPN providers each have a limited number of servers, so those of us using the same companies (Mullvad, PIA, etc.) will end up sharing the same IPs when it comes to ipinfo's 30k checks a month?

Surely it's a requirement to signup and get an api key for them, given the user base of gluetun is increasing daily?

@qdm12
Owner

qdm12 commented Aug 1, 2024

@matthenning this is very weird indeed. Please let us know if you resolve it, because I have no clue what it could be (especially in Germany).

How do you get around the fact that VPN providers each have a limited number of servers, so those of us using the same companies (Mullvad, PIA, etc.) will end up sharing the same IPs when it comes to ipinfo's 30k checks a month?

I don't, and yes it is or will be problematic. Let's move this discussion to #2190. I could change the code to cycle through IP data services or pick one at random, which would give more room. I don't think there is an ultimate solution here though, but feel free to comment back on #2190 if you have an idea 😉 We could maybe use approximate free IP databases at some point, to have at least the country for an IP address. EDIT: also this is irrelevant to the healthcheck, the public IP is fetched independently.

@ga2p

ga2p commented Aug 6, 2024

Hi,
The same thing happens with Surfshark - Argentina: the IPs do not match servers.json and it never connects.
It only connects if you set CUSTOM and WIREGUARD_ENDPOINT_IP=89.117.41.XXX
The new IPs are 89.117.41.xxx

@rhee876527

After a scheduled system upgrade & reboot on a RHEL server, I woke up to this unhealthy restart bug.

Container restart didn't help.

However adding this to my compose and recreating it brought back the VPN. Can try with simple recreate first if you don't want to add the dns entry.

dns:
      - 9.9.9.11
      - 8.8.4.4

Running version latest built on 2024-08-23T13:50:02.262Z (commit ff7cadb)

Hope this helps someone.

@muay-throwaway

muay-throwaway commented Sep 2, 2024

Speaking personally, I have had success just setting the protocol to TCP and the HEALTH_VPN_DURATION_INITIAL to 300s. At least in my case, the 6s default interval is overly conservative. It may be a result of my inconsistent internet connection; however, with this longer health-check interval, the containers can survive brief interruptions/inconsistencies in internet service, preventing dependent applications from failing.

Credit to @kainzilla and miguelsousa46 on Reddit for these solutions (the latter using 120s instead).
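
The two settings described above could be expressed in a compose file roughly as follows. This is only a sketch: the service name and provider placeholder are assumptions, and OPENVPN_PROTOCOL is assumed to be the variable meant by "setting the protocol to TCP" (it only applies when using OpenVPN, and only if the chosen servers support TCP).

```yaml
services:
  gluetun:                        # hypothetical service name
    image: qmcgaw/gluetun
    cap_add:
      - NET_ADMIN
    devices:
      - /dev/net/tun:/dev/net/tun
    environment:
      - VPN_SERVICE_PROVIDER=<your provider>
      - VPN_TYPE=openvpn
      # Assumption: this is the variable used to switch OpenVPN from UDP to TCP.
      - OPENVPN_PROTOCOL=tcp
      # Value from the comment above; 120s is the alternative mentioned there.
      - HEALTH_VPN_DURATION_INITIAL=300s
```

The trade-off is that a longer interval also delays recovery when the tunnel really is down, so it may be worth starting lower and raising it only if restarts persist.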

@jimbobjonesbob
jimbobjonesbob commented Oct 29, 2024

Edit: I’ve just switched to linuxserver.io’s wireguard container. I’d recommend the same to anyone else who’s having this problem if they’re willing to read the docs. It works just as well as gluetun. I hate to shill another project in the issues, but for some reason it doesn’t seem to be a priority fixing this.

Have you got a compose file you can share?

@kainzilla
A small update to my prior testing scenarios that had been outlined in this post - the second, nearly-identical configuration at a second location switched from using an old consumer Google Wifi router to a dedicated OPNsense-based router, identical to the "problem" site, and the same issue started occurring. The same solution of using TCP instead of UDP worked around the issue.

It wasn't possible to revert to the prior router to test if the issue resolved again, but it was hard-confirmed that nothing else had changed, and the failure was confirmed less than 48 hours after the change.

OPNsense (and the very similar pfSense) uses open-source software that shows up in many other consumer products, and is based on the BSD OS. It's possible there's an obscure and rarely-triggered issue with UDP traffic handling that could cause this issue for some routers that either use the same open source network software or are outright based on OPNsense/pfSense.

I've seen notes regarding UDP handling improvements in OPNsense release notes over the last year, but haven't yet tested if recent versions such as 24.7+ no longer experience the issue. If I test this, I'll update again in the future.

Also, please keep in mind that because this "VPN restarts due to health" bug's symptoms can cover a lot of potential causes, many readers in here are experiencing VPN health restarts for other reasons than routers handling UDP poorly, so using TCP protocol with OpenVPN won't be the solution for everyone, but it's worth testing.

@ZixX3r
ZixX3r commented Nov 3, 2024

I think this is an issue with v3.39.1. I reverted to v3.39 (image: qmcgaw/gluetun:v3.39) and it's working so far.

EDIT by qdm12: v3.39 and v3.39.1 are the same 😄 v3.39 points to the last bugfix release so here it's v3.39.1

@qdm12
Owner

qdm12 commented Nov 5, 2024

It turns out lowering WIREGUARD_MTU (it was 1400 in v3.39) does help quite a bit, especially if you see TLS related errors. Same applies to OpenVPN with OPENVPN_MSSFIX (not set by default). The two commits (latest image and future v3.40.x release) pushed this morning might help:
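
Independently of those commits, the suggestion can be tried by hand on an existing image. A minimal sketch, assuming a compose-based setup: the service name is hypothetical, and the numeric values are purely illustrative starting points below the 1400 mentioned above, not recommended or official defaults. Only the variable matching your VPN type is relevant.

```yaml
services:
  gluetun:                        # hypothetical service name
    image: qmcgaw/gluetun
    cap_add:
      - NET_ADMIN
    environment:
      # WireGuard only: illustrative value lower than the v3.39 default of 1400.
      - WIREGUARD_MTU=1320
      # OpenVPN only: not set by default; illustrative value.
      - OPENVPN_MSSFIX=1320
```

If a lower value stops the TLS-related errors, it can be raised again step by step to find the largest size that stays stable on your network path.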
