
multi: allow disabling of sub-servers #537


Closed
ellemouton wants to merge 2 commits from the disableSubservers branch

Conversation

ellemouton
Member

With this PR, a new "mode" is added for sub-servers to complement the existing "integrated" and "remote" modes.
The new mode is "disable", which allows users to pick and choose which sub-servers they want litd to start up with.

For example:

--loop-mode=disable
--pool-mode=disable
--faraday-mode=disable

The new mode is added in the last commit of the PR. The prior commits prepare the code for the change by containing more of the sub-server logic within the sub-server manager.
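
For illustration, here is a minimal, self-contained sketch of the idea: a manager that simply skips any sub-server whose mode is set to "disable". All of the names in it (subServer, startSubServers, the mode constants) are hypothetical and not taken from the litd codebase.

```go
// A minimal sketch of a manager that skips sub-servers whose mode is set to
// "disable". Type and function names are illustrative only, not litd's code.
package main

import "fmt"

const (
	ModeIntegrated = "integrated"
	ModeRemote     = "remote"
	ModeDisable    = "disable"
)

type subServer struct {
	name string
	mode string
}

// startSubServers starts every sub-server except those marked as disabled.
func startSubServers(servers []subServer) {
	for _, s := range servers {
		if s.mode == ModeDisable {
			fmt.Printf("skipping disabled sub-server %q\n", s.name)
			continue
		}
		fmt.Printf("starting sub-server %q in %q mode\n", s.name, s.mode)
	}
}

func main() {
	startSubServers([]subServer{
		{name: "loop", mode: ModeDisable},
		{name: "pool", mode: ModeRemote},
		{name: "faraday", mode: ModeIntegrated},
	})
}
```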

@guggero guggero self-requested a review May 2, 2023 07:22
@ellemouton ellemouton force-pushed the disableSubservers branch 4 times, most recently from 6a1b082 to 75d7659 on May 2, 2023 10:47
@ellemouton
Member Author

Ok - it is ready now, @guggero :) It has all the additions that we discussed offline.

@ellemouton
Member Author

trying to see if I can add an itest quickly

@guggero guggero left a comment (Member)

Very nice, LGTM 🎉

@ellemouton ellemouton force-pushed the disableSubservers branch 2 times, most recently from 16faf0e to 29f7ef4 on May 2, 2023 13:47
@ellemouton
Member Author

cool - added an itest that runs the test suite against a lit node where all the subservers are disabled.
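
As a rough sketch of what that configuration amounts to (this is not the actual itest harness; the binary name, the flags-only invocation, and printing instead of executing are assumptions for illustration):

```go
// An illustrative sketch only: one way a test harness could assemble a litd
// invocation with all sub-servers disabled via the new mode flags. A real
// test would also need lnd connection details and data directories.
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	args := []string{
		"--loop-mode=disable",
		"--pool-mode=disable",
		"--faraday-mode=disable",
	}
	cmd := exec.Command("litd", args...)

	// Print the command rather than running it.
	fmt.Println(cmd.String())
}
```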

@guggero guggero self-requested a review May 2, 2023 15:10
@jamaljsr
Member

jamaljsr commented May 2, 2023

I was playing with this today just to see what breaks in the UI when disabling the daemons. I expected there to be errors when disabling Loop or Pool, so no surprises there. But in one case there's a panic, which was a bit surprising.

When running with remote LND, Pool, Faraday and disabled Loop, the panic below occurs only on the first run of litd.

2023-05-02 19:46:45.780 [INF] LITD: Handling gRPC web request: /poolrpc.Trader/NodeRatings
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1b0 pc=0x850c28]

goroutine 1516 [running]:
google.golang.org/grpc.(*ClientConn).NewStream(0x2732070?, {0x2731fc8?, 0x4000600900?}, 0x1b?, {0x4000854020?, 0x1?}, {0x0?, 0x2723160?, 0x40002b4230?})
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/stream.go:153 +0x28
google.golang.org/grpc.NewClientStream(...)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/stream.go:163
github.com/mwitkow/grpc-proxy/proxy.(*handler).handler(0x40001a09c0, {0x40005c2240?, 0x40005c2240?}, {0x2737ab8, 0x40002be9c0})
	/go/pkg/mod/github.com/mwitkow/grpc-proxy@v0.0.0-20181017164139-0f1106ef9c76/proxy/handler.go:74 +0xcc
github.com/lightninglabs/lightning-terminal.(*rpcProxy).StreamServerInterceptor(0x400083ca80, {0x0, 0x0}, {0x2737ab8, 0x40002be9c0?}, 0x400000e1e0, 0x400084ee10)
	/go/src/github.com/lightninglabs/lightning-terminal/rpc_proxy.go:414 +0x1d8
google.golang.org/grpc.(*Server).processStreamingRPC(0x4000448700, {0x273ecc0, 0x4000750870}, 0x40002e0120, 0x0, 0x40004cfa40, 0x0)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1548 +0xccc
google.golang.org/grpc.(*Server).handleStream(0x4000448700, {0x273ecc0, 0x4000750870}, 0x40002e0120, 0x0)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1627 +0x7f0
google.golang.org/grpc.(*Server).serveStreams.func1.2()
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:940 +0x84
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:938 +0x294

In my docker setup, the litd node automatically restarts on failure. Subsequent grpc-web requests going forward respond appropriately. If I restart the whole docker network, which wipes all data, then the initial grpc-web request causes a panic again. I'm not sure what the cause of this could be, but figured I'd mention it. I can reproduce this consistently and provide more logs if necessary.

I tested a few other combinations of integrated/remote/disable for LND, Loop, and Pool. This was the only one that I found to have this issue.
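
One plausible reading of the trace above, purely as an illustration and not a claim about litd's actual code: the proxy ends up with a nil *grpc.ClientConn for the disabled sub-server, and the panic happens when NewStream dereferences it. The sketch below, using hypothetical names (subserverConns, connFor), shows how guarding such a lookup can surface a clean gRPC Unavailable error instead of a nil connection.

```go
// A purely illustrative sketch of the failure pattern: if a lookup for a
// disabled sub-server yields a nil *grpc.ClientConn, any later call that
// dereferences the connection panics. Guarding the lookup returns a proper
// gRPC error instead. The map and function names here are hypothetical.
package main

import (
	"fmt"

	"google.golang.org/grpc"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

// subserverConns maps sub-server names to their (possibly nil) connections.
var subserverConns = map[string]*grpc.ClientConn{
	"pool": nil, // e.g. disabled or never dialed
}

// connFor returns the connection for a sub-server, or an Unavailable error
// if the sub-server is disabled, so callers never receive a nil connection.
func connFor(name string) (*grpc.ClientConn, error) {
	conn, ok := subserverConns[name]
	if !ok || conn == nil {
		return nil, status.Errorf(codes.Unavailable,
			"sub-server %s is disabled", name)
	}
	return conn, nil
}

func main() {
	if _, err := connFor("pool"); err != nil {
		fmt.Println(err) // rpc error: code = Unavailable desc = ...
	}
}
```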

@ellemouton
Member Author

Hi @jamaljsr - thanks for this!

When you say "remote LND, Pool, Faraday and disabled Loop", do you mean that lnd, pool & faraday are all being run in remote mode, or just lnd?

Is this consistently happening after the call to "LITD: Handling gRPC web request: /poolrpc.Trader/NodeRatings"?

I tried to quickly re-create this now but have not yet been successful. Will try again tomorrow :)

@jamaljsr
Member

jamaljsr commented May 2, 2023

When you say " remote LND, Pool, Faraday and disabled Loop", do you mean that lnd, pool & faraday are all being run in remote mode or just LND?

Yes, I am using the flags --lnd-mode=remote, --faraday-mode=remote, --loop-mode=disable, --pool-mode=remote.

Is this consistently happening after the call to "LITD: Handling gRPC web request: /poolrpc.Trader/NodeRatings"

Yes, it's always triggered by the NodeRatings call. When the UI loads, /lnrpc.Lightning/GetInfo is called first, which succeeds, and then /poolrpc.Trader/NodeRatings is called. This is just the order of requests on page load. I can try reordering them and see if it affects the behavior.

@ellemouton
Member Author

Ok cool! I will try to re-create this tomorrow with those settings (I wasn't running them all in remote mode). Thanks again for finding this! 🐞

@jamaljsr
Member

jamaljsr commented May 3, 2023

I tested removing the call to NodeRatings on page load. This time, I ran into the panic when navigating to the Pool page, where it fetches accounts and orders. So it seems that when Loop is disabled, the first Pool request causes the panic.

2023-05-03 00:53:01.121 [INF] LITD: Handling gRPC web request: /poolrpc.Trader/ListAccounts
2023-05-03 00:53:01.122 [INF] LITD: Handling gRPC web request: /poolrpc.Trader/ListOrders
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1b0 pc=0x850e98]

goroutine 1839 [running]:
google.golang.org/grpc.(*ClientConn).NewStream(0x27311f0?, {0x2731148?, 0x4000551b80?}, 0x1c?, {0x4000684000?, 0x1?}, {0x0?, 0x27222e0?, 0x40006742a0?})
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/stream.go:153 +0x28
google.golang.org/grpc.NewClientStream(...)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/stream.go:163
github.com/mwitkow/grpc-proxy/proxy.(*handler).handler(0x400011c780, {0x4000d68240?, 0x4000d68240?}, {0x2736c38, 0x4000164180})
	/go/pkg/mod/github.com/mwitkow/grpc-proxy@v0.0.0-20181017164139-0f1106ef9c76/proxy/handler.go:74 +0xcc
github.com/lightninglabs/lightning-terminal.(*rpcProxy).StreamServerInterceptor(0x4000675810, {0x0, 0x0}, {0x2736c38, 0x4000164180?}, 0x4000910090, 0x4000679330)
	/go/src/github.com/lightninglabs/lightning-terminal/rpc_proxy.go:414 +0x1d8
google.golang.org/grpc.(*Server).processStreamingRPC(0x40001cfa40, {0x273de60, 0x4000556000}, 0x40009a2120, 0x0, 0x40006ba540, 0x0)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1548 +0xccc
google.golang.org/grpc.(*Server).handleStream(0x40001cfa40, {0x273de60, 0x4000556000}, 0x40009a2120, 0x0)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1627 +0x7f0
google.golang.org/grpc.(*Server).serveStreams.func1.2()
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:940 +0x84
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:938 +0x294

I also tested with remote Loop and disabled Pool (flags --lnd-mode=remote, --faraday-mode=remote, --loop-mode=remote, --pool-mode=disable). It crashed when making the first Loop request.

2023-05-03 01:04:42.955 [INF] LITD: Handling gRPC web request: /looprpc.SwapClient/ListSwaps
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x1b0 pc=0x850e98]

goroutine 954 [running]:
google.golang.org/grpc.(*ClientConn).NewStream(0x27311f0?, {0x2731148?, 0x4000430cc0?}, 0x1d?, {0x400047e3a0?, 0x1?}, {0x0?, 0x27222e0?, 0x4000771500?})
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/stream.go:153 +0x28
google.golang.org/grpc.NewClientStream(...)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/stream.go:163
github.com/mwitkow/grpc-proxy/proxy.(*handler).handler(0x40001a0960, {0x40001c4b00?, 0x40001c4b00?}, {0x2736c38, 0x40002bd8c0})
	/go/pkg/mod/github.com/mwitkow/grpc-proxy@v0.0.0-20181017164139-0f1106ef9c76/proxy/handler.go:74 +0xcc
github.com/lightninglabs/lightning-terminal.(*rpcProxy).StreamServerInterceptor(0x4000771810, {0x0, 0x0}, {0x2736c38, 0x40002bd8c0?}, 0x400000f0c8, 0x4000775360)
	/go/src/github.com/lightninglabs/lightning-terminal/rpc_proxy.go:414 +0x1d8
google.golang.org/grpc.(*Server).processStreamingRPC(0x4000083a40, {0x273de60, 0x40008081b0}, 0x400076be60, 0x0, 0x4000517e60, 0x0)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1548 +0xccc
google.golang.org/grpc.(*Server).handleStream(0x4000083a40, {0x273de60, 0x40008081b0}, 0x400076be60, 0x0)
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:1627 +0x7f0
google.golang.org/grpc.(*Server).serveStreams.func1.2()
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:940 +0x84
created by google.golang.org/grpc.(*Server).serveStreams.func1
	/go/pkg/mod/google.golang.org/grpc@v1.39.0/server.go:938 +0x294	

What's strange is that after a litd restart in both cases, the RPCs behave as expected going forward. Maybe there's something wrong with my docker setup.

@ellemouton
Member Author

Thanks for the extra info @jamaljsr :) I have tried this configuration now and have still not been able to reproduce it 🤔 It's probably worth noting that I'm not running them all in docker containers; I'm just running everything locally.

Very strange that it also works for you after restarting.... I wonder what this could be...

@ellemouton
Member Author

ok I thiiiink I may have found something that might lead to this panic... brb

@ellemouton ellemouton force-pushed the disableSubservers branch 2 times, most recently from 89c4691 to ec8b0e7 on May 3, 2023 09:03
@ellemouton ellemouton force-pushed the disableSubservers branch from ec8b0e7 to 6679d2d on May 3, 2023 10:47
@ellemouton
Member Author

@jamaljsr , can you let me know if you still run into this panic with the latest version?

@guggero guggero left a comment (Member)

LGTM 🎉

@jamaljsr
Member

jamaljsr commented May 3, 2023

@jamaljsr , can you let me know if you still run into this panic with the latest version?

I just tested 6679d2d and can confirm that the panic no longer occurs in the two scenarios mentioned in #537 (comment). Great work resolving this even though you couldn't repro it yourself 💪

@ViktorTigerstrom ViktorTigerstrom left a comment (Contributor)

This looks good to me 🚀! Just a small fix below that I've noted. Super excited to get into the codebase as I'm new to it!

A general question that's related to the fix though:
Would it make sense to support disabling only some of the sub-servers while the rest are enabled, and if so, to include testing of that? This would require quite a few changes to the testing code though...

@ellemouton ellemouton force-pushed the disableSubservers branch from 22f0264 to 0f85a9d on May 4, 2023 06:05
@ellemouton
Member Author

Thanks for the review @ViktorTigerstrom!

Yeah, good question regarding testing multiple configurations. You can disable some while enabling others, but I think the current test is good enough since these sub-servers don't interact with each other. The test suite is also pretty heavy right now, so I'd like to avoid running it against extra configurations unless there's a specific configuration we explicitly want to test for.

@ViktorTigerstrom ViktorTigerstrom left a comment (Contributor)

Awesome, thanks 🚀

You can disable some while enabling others, but I think the current test is good enough since these sub-servers don't interact with each other.

Ok makes sense :)!

@ellemouton
Member Author

ellemouton commented May 4, 2023

awesome! Thanks for the review everyone 🙏

Just a note for anyone looking here and seeing that it has 2 approvals: please don't merge yet :) We want to wait for the front-end side of things to be ready to handle the case where sub-servers are disabled.

We might actually first want to merge #541 so that we don't allow disabling of sub-servers until the front end can query the status. In that case, I can definitely decouple the PRs so that we can merge that one first. Let me know.

guggero pushed a commit that referenced this pull request May 16, 2023
Enable to start litd with taproot asset subserver disabled.
The default mode for the new sub-server is "Disabled"
Add coverage in itests for flows with some subservers disabled (based on
Elle's #537)
@ellemouton ellemouton force-pushed the disableSubservers branch from 0f85a9d to 7a848fd on May 21, 2023 16:11
@guggero guggero linked an issue Jul 17, 2023 that may be closed by this pull request
@AndySchroder

Is this branch stable enough for me to use if I use the --disableui option? I'd really like to begin using

--loop-mode=disable
--pool-mode=disable
--faraday-mode=disable

soon.

@AndySchroder

If so, I would appreciate it if you could rebase on top of #598.

@guggero
Member

guggero commented Aug 8, 2023

I think this PR is going to be replaced by #541. @ellemouton, can we close this? It looks like both commits of this PR are included in #541.

@ellemouton
Member Author

yes indeed - we can close this :)

@AndySchroder - please see #541 instead

@ellemouton ellemouton closed this Aug 8, 2023

Successfully merging this pull request may close these issues.

allow disabling unneeded subservers