Hysteria2 inbound/server memory leak 1.9.3 - 1.10.0-alpha.29 #2027

Closed
4 of 5 tasks
SasukeFreestyle opened this issue Aug 15, 2024 · 15 comments
Labels
bug Something isn't working

Comments

@SasukeFreestyle

SasukeFreestyle commented Aug 15, 2024

Operating system

Linux

System version

Ubuntu 22.04

Installation type

Original sing-box Command Line

If you are using a graphical client, please provide the version of the client.

No response

Version

sing-box version 1.10.0-alpha.29

Environment: go1.22.6 linux/amd64
Tags: with_gvisor,with_quic,with_dhcp,with_wireguard,with_ech,with_utls,with_reality_server,with_acme,with_clash_api
CGO: disabled

Description

I've been seeing memory leaks in version 1.9.3 (maybe even earlier) and every later version.
sing-box grows to well over 2GB of RAM after ~8 hours of runtime and crashes some time later. I don't currently have a log of the crash, but I can update this issue once I have one.

I'm still learning Go, so bear with me :)

btop: [image]

goroutine: [image: goroutine]

heap: [image: heap]

pprof dumps (attachments): pprof.heap.pb.gz, pprof.goroutine.pb.gz
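
For anyone who wants to collect comparable heap and goroutine dumps, here is a minimal Python sketch. It assumes the server exposes Go's standard net/http/pprof handlers over HTTP at the usual /debug/pprof/ paths. Whether and where a given sing-box build serves them is an assumption here, and the address in the usage comment is hypothetical.

```python
import urllib.request

# Standard paths registered by Go's net/http/pprof package.
PPROF_PATHS = {
    "heap": "/debug/pprof/heap",
    "goroutine": "/debug/pprof/goroutine",
}


def profile_url(base_url: str, kind: str) -> str:
    """Build the URL for one profile kind ("heap" or "goroutine")."""
    if kind not in PPROF_PATHS:
        raise ValueError(f"unknown profile kind: {kind!r}")
    return base_url.rstrip("/") + PPROF_PATHS[kind]


def save_profile(base_url: str, kind: str, out_path: str,
                 timeout: float = 30.0) -> int:
    """Download one profile and write it to out_path; returns bytes written."""
    with urllib.request.urlopen(profile_url(base_url, kind),
                                timeout=timeout) as resp:
        data = resp.read()
    with open(out_path, "wb") as fh:
        fh.write(data)
    return len(data)


# Usage (hypothetical address; adjust to wherever the debug listener is bound):
#   save_profile("http://127.0.0.1:9090", "heap", "pprof.heap.pb.gz")
#   save_profile("http://127.0.0.1:9090", "goroutine", "pprof.goroutine.pb.gz")
```

The saved files can then be opened with `go tool pprof`, the same way as the attachments in this issue.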

Reproduction

This has occurred on every machine I've tested the hysteria2 inbound on, so it should be easy to reproduce. My guess is that you need about 100 users and about 12 hours of uptime.

Server certs are generated by openssl.

Server config:

{
   "log":{
      "level":"fatal",
      "timestamp":true
   },
   "experimental":{
      "cache_file":{
         "enabled":true
      }
   },
   "inbounds":[
      {
         "type":"hysteria2",
         "tag":"hy2-in-443",
         "listen":"127.0.0.1",
         "udp_timeout":180,
         "listen_port":443,
         "sniff":true,
         "sniff_override_destination":true,
         "domain_strategy":"prefer_ipv4",
         "up_mbps":0,
         "down_mbps":0,
         "obfs":{
            "type":"salamander",
            "password":"123"
         },
         "users":[
            {
               "name":"user",
               "password":"123"
            }
         ],
         "ignore_client_bandwidth":true,
         "tls":{
            "enabled":true,
            "certificate_path":"ca.crt",
            "key_path":"ca.key"
         }
      }
   ],
   "outbounds":[
      {
         "type":"direct",
         "tag":"direct"
      },
      {
         "type":"block",
         "tag":"block"
      },
      {
         "type":"dns",
         "tag":"dns-out"
      }
   ],
   "dns":{
      "disable_cache":true,
      "servers":[
         {
            "tag":"dns-out",
            "address":"udp://127.0.0.53",
            "address_strategy":"prefer_ipv4",
            "strategy":"prefer_ipv4",
            "detour":"direct"
         }
      ]
   },
   "route":{
      "rules":[
         {
            "ip_is_private":true,
            "outbound":"block"
         },
         {
            "rule_set":[
               "geoip-cn",
               "geoip-ir",
               "geoip-ru",
               "geoip-phishing",
               "geoip-malware",
               "geoip-private",
               "geosite-ir",
               "geosite-malware",
               "geosite-cryptominers",
               "geosite-phishing"
            ],
            "outbound":"block"
         },
         {
            "protocol":"dns",
            "outbound":"dns-out"
         }
      ],
      "rule_set":[
         {
            "tag":"geoip-cn",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geoip-cn.srs"
         },
         {
            "tag":"geoip-ir",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geoip-ir.srs"
         },
         {
            "tag":"geoip-ru",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geoip-ru.srs"
         },
         {
            "tag":"geoip-private",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geoip-private.srs"
         },
         {
            "tag":"geosite-ir",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://github.com/bootmortis/sing-geosite/releases/latest/download/geosite-ir.srs"
         },
         {
            "tag":"geoip-phishing",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geoip-phishing.srs"
         },
         {
            "tag":"geosite-phishing",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geosite-phishing.srs"
         },
         {
            "tag":"geoip-malware",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geoip-malware.srs"
         },
         {
            "tag":"geosite-malware",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geosite-malware.srs"
         },
         {
            "tag":"geosite-cryptominers",
            "type":"remote",
            "format":"binary",
            "update_interval":"3d",
            "url":"https://raw.githubusercontent.com/Chocolate4U/Iran-sing-box-rules/rule-set/geosite-cryptominers.srs"
         }
      ],
      "auto_detect_interface":true
   }
}

Logs

No response

Supporter

Integrity requirements

  • I confirm that I have read the documentation, understand the meaning of all the configuration items I wrote, and did not pile up seemingly useful options or default values.
  • I confirm that I have provided the server and client configuration files and process that can be reproduced locally, instead of a complicated client configuration file that has been stripped of sensitive data.
  • I confirm that I have provided the simplest configuration that can be used to reproduce the error I reported, instead of depending on remote servers, TUN, graphical interface clients, or other closed-source software.
  • I confirm that I have provided the complete configuration files and logs, rather than just providing parts I think are useful out of confidence in my own intelligence.
@nekohasekai nekohasekai added the bug Something isn't working label Aug 15, 2024
@nekohasekai
Member

I guess the problem was introduced in sing-box 1.10.0-alpha.23. Can you check if the problem exists in versions before alpha 23 or 1.9.3?

@SasukeFreestyle
Author

I'm sure the problem occurs in 1.9.3 and also in earlier versions, see #1245.
With versions 1.9.0+ connections are properly closed, but there is still a leak.
I'm running 1.9.3 now, but it will take some hours for the leak to reproduce. I will post pprof dumps of 1.9.3 in my next comment.

@SasukeFreestyle
Author

SasukeFreestyle commented Aug 17, 2024

Here are the 1.9.3 dumps; they took longer than expected. Uptime: 1 day 14 hours.

[image]

heap: [image: heap1]

goroutine: [image: goroutine1]

pprof dumps (attachments):
heap: pprof.heap.pb.gz
goroutine: pprof.goroutine.pb.gz

So I've switched to alpha 22 as of writing this and will dump those in the next post.

Thank you for your time and effort.

@SasukeFreestyle
Author

SasukeFreestyle commented Aug 18, 2024

1.10.0-alpha.22

[image]

heap: [image: heap22]

goroutine: [image: goroutine22]

pprof dumps (attachments):
heap: pprof.heap.pb.gz
goroutine: pprof.goroutine.pb.gz

Every version I've tested eventually leaks until the system runs out of memory.

If you need me to test something else and/or in different conditions I'll be happy to help

@nekohasekai
Member

Please try both 1.9.4 and 1.10.0-beta.1

@SasukeFreestyle
Author

SasukeFreestyle commented Aug 19, 2024

Hello again and hope you're well :)

1.9.4 dump.
[image]

heap: [image: heap-crash]

goroutine: [image: goroutine-crash]

pprof dumps (attachments):
heap: pprof.heap.pb.gz
goroutine: pprof.goroutine.pb.gz

I've switched to 1.10.0-beta.2 for testing.

@SasukeFreestyle
Author

1.10.0-beta.2 dump

[image]

heap: [image: heap-crash]

goroutine: [image: goroutine-crash]

pprof dumps (attachments):
heap: pprof.heap.pb.gz
goroutine: pprof.goroutine.011.pb.gz

@SasukeFreestyle
Author

Seems fixed in 1.10.0-beta.4 :)
Do you want me to dump pprof of beta.4?

If you don't need them you can close this issue.

@nekohasekai
Member

nekohasekai commented Aug 24, 2024

I don't think it's been resolved, beta 4 has no changes related to this issue.

@nekohasekai
Member

I've made no progress on your issue, though we did fix a memory leak.

The new goroutine snapshots make me think that you may really have had over 6k active UDP connections, which would explain this memory usage. I'm not sure whether there is still a leak.
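
As a rough sanity check on the 6k figure, Little's law (sessions = arrival rate × lifetime) gives a ceiling on how many new UDP flows per second are needed to sustain it. The sketch below assumes that the `udp_timeout` of 180 in the posted config is in seconds, that a session is tracked for at least that long, and that the load is roughly steady. The ~100 users figure comes from the reproduction notes, and the function name is illustrative.

```python
def max_new_flow_rate(active_sessions: int, idle_timeout_s: float) -> float:
    """Upper bound on new UDP flows per second implied by Little's law.

    A tracked session is only dropped after idle_timeout_s of silence, so its
    lifetime W is at least idle_timeout_s. With L = rate * W this gives
    rate <= L / idle_timeout_s.
    """
    if idle_timeout_s <= 0:
        raise ValueError("idle_timeout_s must be positive")
    return active_sessions / idle_timeout_s


sessions = 6000      # "over 6k active UDP connections" from the comment above
timeout_s = 180      # udp_timeout from the posted config, assumed to be seconds
users = 100          # rough user count from the reproduction notes

rate = max_new_flow_rate(sessions, timeout_s)
print(f"at most {rate:.1f} new flows/s in total")
print(f"about {sessions / users:.0f} tracked sessions per user")
```

Under these assumptions, about 33 new flows per second across all users is enough to hold 6k sessions open. Clients that contact many peers over UDP could plausibly reach that rate.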

If you'd like to continue the discussion via IM, you can contact me at Telegram@attachBaseContext or Discord@nekohasekai.

@SasukeFreestyle
Author

SasukeFreestyle commented Aug 26, 2024

I honestly lack the knowledge to determine how much memory is required for, let's say, 6k active connections, and I agree that it might not be a leak in that regard.

But after testing beta.4 for a couple of days, I've repeatedly seen the same pattern: sing-box climbs to around 700MB, drops to around 500MB after a couple of hours, then climbs back to 700MB and drops again.
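
One way to tell this sawtooth pattern apart from a real leak is to sample the process's resident memory regularly and compare the troughs. This is a minimal sketch, assuming Linux, where `/proc/<pid>/status` reports `VmRSS` in kB. The function names, the window size and the 50MB tolerance are all illustrative, not something sing-box provides.

```python
import re

_VMRSS = re.compile(r"^VmRSS:\s+(\d+)\s+kB", re.MULTILINE)


def parse_vmrss_kb(status_text: str) -> int:
    """Extract the VmRSS value (in kB) from /proc/<pid>/status contents."""
    match = _VMRSS.search(status_text)
    if match is None:
        raise ValueError("VmRSS not found")
    return int(match.group(1))


def read_rss_mb(pid: int) -> float:
    """Read the current resident set size of a process in MB (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return parse_vmrss_kb(fh.read()) / 1024


def window_minima(samples, window):
    """Minimum of each consecutive, non-overlapping full window of samples."""
    return [min(samples[i:i + window])
            for i in range(0, len(samples) - window + 1, window)]


def looks_like_leak(samples, window, tolerance_mb=50.0):
    """Heuristic: the troughs never fall and rise by more than tolerance_mb."""
    lows = window_minima(samples, window)
    if len(lows) < 2:
        return False
    never_falls = all(b >= a for a, b in zip(lows, lows[1:]))
    return never_falls and lows[-1] - lows[0] > tolerance_mb
```

With hourly samples, a series that keeps returning to about 500MB yields flat troughs and is not flagged. A series whose lowest points keep climbing is flagged.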

Anyway, here is a pprof dump of beta.4 after 15 hours of uptime.

I want to thank you for taking the time to look at this. I'm doing this to help make sing-box as efficient as possible.

I realize that beta.4 contains no changes related to this issue, but from an end-user's view it is magically fixed, I guess :)

[image]

heap: [image: heap-fix]

goroutine: [image: goroutine-fix]

pprof dumps (attachments):
heap: pprof.heap.pb.gz
goroutine: pprof.goroutine.pb.gz

@nekohasekai
Member

Can you try 1.9.4 again? By the way, a large number of UDP connections may be caused by protocols such as BitTorrent. You can try blocking the dtls and bittorrent protocols.

@SasukeFreestyle
Author

SasukeFreestyle commented Sep 1, 2024

I added the following lines to my configuration under rules and have been testing them for a couple of days on version 1.9.4.

      "rules":[
         {
            "outbound":"block",
            "protocol":[
               "bittorrent",
               "dtls"
            ]
         },

Memory usage is stable, between 450MB and 750MB during peak hours.
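
The rule fragment above is partial. As a minimal sketch of applying it to a full config, the helper below (a hypothetical name, not part of sing-box) places the same rule first in `route.rules` of a config that has already been loaded as a Python dict. The rule matches on the sniffed protocol, so it relies on `"sniff": true` on the inbound, as in the posted config.

```python
import copy

# Same rule as in the fragment above.
BLOCK_RULE = {"outbound": "block", "protocol": ["bittorrent", "dtls"]}


def with_block_rule(config: dict, rule: dict = BLOCK_RULE) -> dict:
    """Return a copy of config with rule placed first in route.rules.

    The input is left untouched, and the rule is not added a second time
    if an identical rule is already present.
    """
    new_config = copy.deepcopy(config)
    rules = new_config.setdefault("route", {}).setdefault("rules", [])
    if rule not in rules:
        rules.insert(0, copy.deepcopy(rule))
    return new_config


# Usage with a config file (paths are placeholders):
#   import json
#   with open("config.json") as fh:
#       patched = with_block_rule(json.load(fh))
#   with open("config.patched.json", "w") as fh:
#       json.dump(patched, fh, indent=3)
```

Putting the rule first mirrors the fragment, where it appears directly after `"rules":[`.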

heap: [image: heap-fix]
goroutine: [image: goroutine-fix]

pprof dumps (attachments):
heap: pprof.heap.pb.gz
goroutine: pprof.goroutine.pb.gz

I don't mind blocking DTLS/BitTorrent traffic, and if this fixes the memory usage I'm happy with the result. Thank you for the tip.

I also want to point out that after adding these rules I checked with qBittorrent whether BitTorrent was actually blocked, and it still worked.
But that's a separate issue, unrelated to this topic.

@SasukeFreestyle
Author

SasukeFreestyle commented Oct 5, 2024

I'm closing this issue for now, as I consider it fixed for my case: sing-box never exceeds 1GB with the number of users I have.
Version 1.9.6 / 1.10.0-beta.11

Thank you for your support! Be well :)
