
Ipfs kubo node memory usage increases endlessly #10447

Open · 3 tasks done
etherscoredao opened this issue Jun 12, 2024 · 5 comments
Labels
kind/bug A bug in existing code (including security flaws) need/analysis Needs further analysis before proceeding need/triage Needs initial labeling and prioritization P1 High: Likely tackled by core team if no one steps up

Comments

@etherscoredao

etherscoredao commented Jun 12, 2024

Checklist

Installation method

ipfs-desktop

Version

Kubo version: 0.29.0-3f0947b
Repo version: 15
System version: amd64/linux
Golang version: go1.22.4

Config

{
  "API": {
    "HTTPHeaders": {
      "Access-Control-Allow-Methods": [
        "PUT",
        "POST"
      ],
      "Access-Control-Allow-Origin": [
        "http://0.0.0.0:5001"
      ]
    }
  },
  "Addresses": {
    "API": "/ip4/0.0.0.0/tcp/5001",
    "Announce": null,
    "AppendAnnounce": null,
    "Gateway": "/ip4/0.0.0.0/tcp/8080",
    "NoAnnounce": [
      "/ip4/10.0.0.0/ipcidr/8",
      "/ip4/100.64.0.0/ipcidr/10",
      "/ip4/169.254.0.0/ipcidr/16",
      "/ip4/172.16.0.0/ipcidr/12",
      "/ip4/192.0.0.0/ipcidr/24",
      "/ip4/192.0.2.0/ipcidr/24",
      "/ip4/192.168.0.0/ipcidr/16",
      "/ip4/198.18.0.0/ipcidr/15",
      "/ip4/198.51.100.0/ipcidr/24",
      "/ip4/203.0.113.0/ipcidr/24",
      "/ip4/240.0.0.0/ipcidr/4",
      "/ip6/100::/ipcidr/64",
      "/ip6/2001:2::/ipcidr/48",
      "/ip6/2001:db8::/ipcidr/32",
      "/ip6/fc00::/ipcidr/7",
      "/ip6/fe80::/ipcidr/10"
    ],
    "Swarm": []
  },
  "AutoNAT": {
    "ServiceMode": "disabled"
  },
  "Bootstrap": [],
  "DNS": {
    "Resolvers": {}
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": false
    }
  },
  "Experimental": {
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "OptimisticProvide": false,
    "OptimisticProvideJobsPoolSize": 0,
    "P2pHttpProxy": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": [],
    "DeserializedResponses": null,
    "DisableHTMLErrors": null,
    "ExposeRoutingAPI": null,
    "HTTPHeaders": {},
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "12D3KooWR66ReXcHWSzQEihJpWYUDQgQMAPfnxE3HeGdr1cuLnWS"
  },
  "Import": {
    "CidVersion": null,
    "HashFunction": null,
    "UnixFSChunker": null,
    "UnixFSRawLeaves": null
  },
  "Internal": {
    "Bitswap": {
      "EngineBlockstoreWorkerCount": 4096,
      "MaxOutstandingBytesPerPeer": null,
      "ProviderSearchDelay": null,
      
    }
  },
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {
    "DownloadSources": [],
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {},
  "Routing": {
    "AcceleratedDHTClient": false,
    "Methods": null,
    "Routers": null
  },
  "Swarm": {
    "AddrFilters": [
      "/ip4/10.0.0.0/ipcidr/8",
      "/ip4/100.64.0.0/ipcidr/10",
      "/ip4/169.254.0.0/ipcidr/16",
      "/ip4/172.16.0.0/ipcidr/12",
      "/ip4/192.0.0.0/ipcidr/24",
      "/ip4/192.0.2.0/ipcidr/24",
      "/ip4/192.168.0.0/ipcidr/16",
      "/ip4/198.18.0.0/ipcidr/15",
      "/ip4/198.51.100.0/ipcidr/24",
      "/ip4/203.0.113.0/ipcidr/24",
      "/ip4/240.0.0.0/ipcidr/4",
      "/ip6/100::/ipcidr/64",
      "/ip6/2001:2::/ipcidr/48",
      "/ip6/2001:db8::/ipcidr/32",
      "/ip6/fc00::/ipcidr/7",
      "/ip6/fe80::/ipcidr/10"
    ],
    "ConnMgr": {},
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": true,
    "RelayClient": {},
    "RelayService": {},
    "ResourceMgr": {},
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}
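
For anyone debugging similar growth: the config above leaves Swarm.ConnMgr and Swarm.ResourceMgr empty, so Kubo falls back to its defaults. A minimal sketch of setting explicit caps instead (the values are illustrative only, not recommendations; restart the daemon afterwards):

# Illustrative connection-manager and resource-manager limits.
ipfs config --json Swarm.ConnMgr.LowWater 200
ipfs config --json Swarm.ConnMgr.HighWater 500
ipfs config --json Swarm.ConnMgr.GracePeriod '"20s"'
# Cap how much memory libp2p's resource manager may account for (4GB assumed here).
ipfs config --json Swarm.ResourceMgr.MaxMemory '"4GB"'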

Description

Hi, memory usage increases endlessly.
The IPFS Kubo node is installed on a server with:

  • 4 CPU
  • 16GB RAM
  • 100GB HD

I use docker stats to monitor the memory usage; here is the current output:

CONTAINER ID   NAME                                      CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS
45ddbe02cb58   ipfs                                      156.94%   4.876GiB / 15.61GiB   31.24%    3.43GB / 1.82GB   26MB / 1.5GB      22
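
For reference, a simple way to record this growth over time is to sample docker stats in a loop (a sketch; the container name "ipfs" is taken from the output above):

# Append one timestamped memory sample per minute to a log file.
while true; do
  echo "$(date -Is) $(docker stats --no-stream --format '{{.MemUsage}}' ipfs)" >> ipfs-mem.log
  sleep 60
done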

Debugging infos:
memory stack: ipfs.stack.tar.gz
memory heap: ipfs.heap.tar.gz
cpu profile: ipfs.cpuprof.tar.gz
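
For anyone reproducing this, Kubo can collect the same kind of profiles itself; a sketch, assuming the daemon's API is reachable locally:

# Bundles goroutine stacks, heap and CPU profiles into one zip file.
ipfs diag profile
# The same data is also exposed via Go's pprof endpoints on the API port, e.g.:
curl -o heap.pprof http://127.0.0.1:5001/debug/pprof/heap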

It looks like some goroutines are hanging, for example:

goroutine 89125206 [select, 99 minutes]:
github.com/quic-go/quic-go.(*incomingStreamsMap[...]).AcceptStream(0x2f23e40, {0x2f04190, 0x3ffe080})
	github.com/quic-go/quic-go@v0.44.0/streams_map_incoming.go:82 +0x111
github.com/quic-go/quic-go.(*streamsMap).AcceptUniStream(0xc01c35df10, {0x2f04190, 0x3ffe080})
	github.com/quic-go/quic-go@v0.44.0/streams_map.go:192 +0xce
github.com/quic-go/quic-go.(*connection).AcceptUniStream(0x1dd1d07?, {0x2f04190?, 0x3ffe080?})
	github.com/quic-go/quic-go@v0.44.0/connection.go:2289 +0x29
github.com/quic-go/quic-go/http3.(*connection).HandleUnidirectionalStreams(0xc026333220, 0xc027924d80)
	github.com/quic-go/quic-go@v0.44.0/http3/conn.go:124 +0xa2
created by github.com/quic-go/quic-go/http3.(*SingleDestinationRoundTripper).init in goroutine 89112403
	github.com/quic-go/quic-go@v0.44.0/http3/client.go:98 +0x2f2

More info is in the attached memory stack file.
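
A quick way to see how many goroutines are parked at that spot is to count matching frames in the dump (the extracted file name "ipfs.stack" is assumed here):

tar -xzf ipfs.stack.tar.gz
# Count goroutines blocked in quic-go's AcceptStream.
grep -c 'quic-go.*AcceptStream' ipfs.stack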

@etherscoredao etherscoredao added kind/bug A bug in existing code (including security flaws) need/triage Needs initial labeling and prioritization labels Jun 12, 2024
@gitzec

gitzec commented Jun 15, 2024

I guess your cluster / stack / compose / start command is the interesting part here, since you can limit resources there.

For my own setup I limited the containers in my swarm because I had no luck doing it with the node's config. I also switched ipns -> dhtclient to reduce traffic on some nodes. Maybe that also reduces the memory footprint (wild guess).
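
A sketch of what such an external limit can look like (the 8g value and service name are assumptions, not taken from this thread):

# Plain docker run: hard-cap the container's memory.
docker run -d --name ipfs --memory=8g --memory-swap=8g ipfs/kubo:latest
# Docker Swarm: cap an existing service in place.
docker service update --limit-memory 8g ipfs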

@aschmahmann aschmahmann added the need/analysis Needs further analysis before proceeding label Jun 18, 2024
@aschmahmann aschmahmann added P1 High: Likely tackled by core team if no one steps up and removed need/triage Needs initial labeling and prioritization labels Jun 18, 2024
@wenyue

wenyue commented Jun 26, 2024

It seems the problem was introduced in 0.28.0 and has been fixed on the master branch. So I would revert to 0.27.0.
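
For the Docker setup above, reverting is a matter of pinning the image tag (a sketch; the tag is assumed to follow the usual ipfs/kubo scheme):

docker pull ipfs/kubo:v0.27.0
# Recreate the container from the pinned tag instead of :latest.
docker run -d --name ipfs ipfs/kubo:v0.27.0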

@aschmahmann
Contributor

@etherscoredao if you're able to try a custom build, try updating to the latest go-libp2p master and see whether things have improved, given the fix in libp2p/go-libp2p#2841.
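
A rough sketch of such a custom build (whatever commit master points at when you run this; not an official release):

git clone https://github.com/ipfs/kubo.git && cd kubo
# Swap the pinned go-libp2p release for the current master branch.
go get github.com/libp2p/go-libp2p@master
go mod tidy
make build   # binary lands in cmd/ipfs/ipfs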

@dejl

dejl commented Aug 25, 2024

I see this too.
Debian Linux - x64

[2657620.399067] Out of memory: Killed process 4187 (ipfs) total-vm:48650700kB, anon-rss:34086188kB, file-rss:0kB, shmem-rss:0kB, UID:1001 pgtables:73548kB oom_score_adj:0
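
For anyone checking whether their node was killed the same way, the kernel log is the place to look, e.g.:

# Search the kernel log for OOM kills of the ipfs process.
dmesg | grep -i 'out of memory'
journalctl -k | grep -i 'killed process.*ipfs'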

@hsanjuan
Contributor

hsanjuan commented Nov 8, 2024

I believe this can be closed after the most recent fixes in libp2p/quic.
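
One way to check which quic-go a given Kubo binary actually ships with (command assumed available in recent releases):

ipfs version deps | grep quic-go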

@hsanjuan hsanjuan added the need/triage Needs initial labeling and prioritization label Nov 8, 2024