xud doesn't refresh "lnd is WaitingUnlock" state #1090
Comments
This is a really bizarre issue that I can reliably reproduce locally. Starting with lnd locked and then unlocking it works fine. But following the steps here (start with lnd unlocked, stop it, start it again while locked, then unlock it) leaves the status within xud stuck on locked. The bizarre part is that the moment I set a breakpoint in VS Code, the error goes away, even if I never hit the breakpoint. On the very next connection attempt, it properly shows that lnd is no longer locked. I can put in log statements that show that it thinks the |
Issue is open here: grpc/grpc-node#993 |
It looks like this may be fixed when using a heartbeat to detect when lnd has gone offline - it could also be related to using newer versions of lnd. I'll open a PR that closes this issue - the issue no longer occurs for me when I try to reproduce it in that PR. |
This commit shortens the interval for the outbound capacity check timer from 60 to 3 seconds and sets the client as disconnected any time a call fails due to an unreachable server. This makes the capacity checks act like a heartbeat, checking that the server is reachable every few seconds even in the absence of any other activity. Previously, we had used a dummy server -> client streaming call and listened for the `error` event on the lnd side, however with newer versions of lnd and grpc this is no longer a reliable way to tell when lnd has gone down. This also resolves an issue where lnd would get stuck in the `WaitingUnlock` state if it is stopped while xud is running and comes back online in the locked state. Closes #1090.
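The heartbeat idea in this commit message can be sketched roughly as below. This is a minimal illustration, not xud's actual code: `HeartbeatClient`, `recordResult`, the `probe` callback, and the status names are all hypothetical. The only external fact relied on is that gRPC status code 14 means `UNAVAILABLE`.

```typescript
// Hypothetical sketch of a capacity check doubling as a heartbeat.
// None of these names are taken from xud's code base.
type ClientStatus = "Disconnected" | "ConnectionVerified";

// gRPC status code 14 is UNAVAILABLE, i.e. the server could not be reached.
const GRPC_UNAVAILABLE = 14;

interface CallError {
  code: number;
  message: string;
}

class HeartbeatClient {
  status: ClientStatus = "ConnectionVerified";
  private timer?: ReturnType<typeof setInterval>;
  private readonly probe: () => Promise<void>;
  private readonly intervalMs: number;

  // `probe` stands in for any cheap unary call, such as a capacity query.
  constructor(probe: () => Promise<void>, intervalMs = 3000) {
    this.probe = probe;
    this.intervalMs = intervalMs;
  }

  /** Record the outcome of one call; only an unreachable server marks us disconnected. */
  recordResult(err?: CallError): void {
    if (err === undefined) {
      this.status = "ConnectionVerified";
    } else if (err.code === GRPC_UNAVAILABLE) {
      this.status = "Disconnected";
    }
    // Any other error means the server answered, so the status is left unchanged.
  }

  /** Run the probe every `intervalMs`, even when there is no other activity. */
  start(): void {
    this.timer = setInterval(() => {
      this.probe().then(
        () => this.recordResult(),
        (err: CallError) => this.recordResult(err),
      );
    }, this.intervalMs);
  }

  stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }
}
```

Keeping `recordResult` separate from the timer means every call path, not just the periodic check, can report an unreachable server through the same method.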
I'm still experiencing a similar issue with the latest changes in a fresh docker environment, during the 2nd start of the environment.
xud seems to be stuck in a loop of trying to verify lnd connections
After about ~20 minutes the |
xud keeps printing those "trying to verify connection" messages? And just to be clear, you observed this after starting xud while lnds were in a locked state, and then performing the |
Yes, this can be reproduced with a fresh testnet/mainnet docker environment. After the first launch execute |
Thanks I will look into it. |
I can't reproduce this running in my dev environment against a local lnd instance. @karl will share full logs if this comes up again, which hopefully will shed some light on how lnd went from connected status (which is required for the unlock calls to work) to disconnected. |
It's easily reproduced by running |
Could you reproduce @sangaman ? If not, I can demonstrate it on my machine. |
Also experienced by @belboo :
I tested several combinations this morning and can't reproduce this. Is this still an issue for you, @erkarl? |
Otherwise, let's close. |
Never had the issue since on our side! |
Thanks @belboo ! Closing for now. |
How it is
Docker-setup: All testnet clients ready & synced:
t0: restart lndltc, unlock lndltc via `lncli unlock`
t1: now lndltc works and shows `"synced_to_chain": true`. But `xucli getinfo` shows lndltc `"error": "lnd is WaitingUnlock"`, which is not true (`[lncli] Wallet is already unlocked`)
t2: restarted xud container - fixed.
How it should be
xud polls the unlock state at a 5s interval, the same as other states from lnd, to automatically detect when it was unlocked successfully.
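One way to picture the requested behaviour is a poller that derives the status purely from the latest probe, so a stale `WaitingUnlock` cannot outlive a successful unlock. This is only a sketch under assumptions: `UnlockPoller`, `statusFromProbe`, the `ProbeResult` values, and the status names other than `WaitingUnlock` are hypothetical, and how xud would actually ask lnd for its lock state is not shown.

```typescript
// Hypothetical sketch; only the WaitingUnlock name comes from the issue.
type LndStatus = "Disconnected" | "WaitingUnlock" | "ConnectionVerified";
type ProbeResult = "unreachable" | "locked" | "unlocked";

/** Map the latest probe result to a status, ignoring whatever came before. */
function statusFromProbe(probe: ProbeResult): LndStatus {
  switch (probe) {
    case "unreachable":
      return "Disconnected";
    case "locked":
      return "WaitingUnlock";
    case "unlocked":
      return "ConnectionVerified";
  }
}

class UnlockPoller {
  status: LndStatus = "Disconnected";
  private timer?: ReturnType<typeof setInterval>;
  private readonly probe: () => Promise<ProbeResult>;
  private readonly intervalMs: number;

  // `probe` is an assumed callback that reports lnd's current lock state.
  constructor(probe: () => Promise<ProbeResult>, intervalMs = 5000) {
    this.probe = probe;
    this.intervalMs = intervalMs;
  }

  /** Apply one probe result to the stored status. */
  apply(result: ProbeResult): void {
    this.status = statusFromProbe(result);
  }

  /** Poll every `intervalMs`; a failed probe is treated as unreachable. */
  start(): void {
    this.timer = setInterval(() => {
      this.probe().then(
        (result) => this.apply(result),
        () => this.apply("unreachable"),
      );
    }, this.intervalMs);
  }

  stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
  }
}
```

Replaying the timeline from "How it is" through `apply` (unlocked, unreachable during the restart, locked, then unlocked again) ends in `ConnectionVerified` without restarting anything.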
TODO