Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection reset by peer (os error 104) #1340

Open
rwjack opened this issue Sep 16, 2024 · 50 comments · May be fixed by #1359
Open

Connection reset by peer (os error 104) #1340

rwjack opened this issue Sep 16, 2024 · 50 comments · May be fixed by #1359
Labels

Comments

@rwjack
Copy link

rwjack commented Sep 16, 2024

Describe the bug

Librespot keeps crashing on random intervals. I implemented the 0.0.0.0 apresolve.spotify.com bypass in /etc/hosts, but now this issue just replaced the other one: #1322

To reproduce

  1. Launch librespot with
stream = spotify:///librespot?name=Spotify&devicename=HA_Snapcast&bitrate=320&volume=100&cache=/data/librespot&cache-size-limit=2G&autoplay=true&params=--disable-discovery
  1. Start playing media
  2. Crashes on random intervals

Log

... SNIP ...
2024-09-16 12-09-53.469 [Error] (librespot_core::session) Connection reset by peer (os error 104)
2024-09-16 12-09-53.469 [Error] (librespot_connect::spirc) subscription terminated
2024-09-16 12-09-53.469 [Error] (librespot_playback::player) Unable to load audio item: MercuryError
2024-09-16 12-09-53.475 [Warn] (librespot) Spirc shut down unexpectedly
2024-09-16 12-09-53.475 [Warn] (librespot_core::apresolve) Failed to resolve Access Point: error trying to connect: tcp connect error: Connection refused (os error 111)
2024-09-16 12-09-53.475 [Warn] (librespot_core::apresolve) Using fallback "ap.spotify.com:443"
2024-09-16 12-09-53.475 [Info] (librespot_core::session) Connecting to AP "ap.spotify.com:443"
2024-09-16 12-09-53.811 [Info] (librespot_core::session) Authenticated as "x" !
2024-09-16 12-09-53.812 [Info] (librespot_playback::mixer::softmixer) Mixing with softvol and volume control: Log(60.0)
... SNIP ...

Host (what you are running librespot on):

  • OS: Debian container, through Snapcast
  • Platform: HAOS x86
@rwjack rwjack added the bug label Sep 16, 2024
@kingosticks
Copy link
Contributor

What version is this?

I don't see any crashes in that log.

@rwjack
Copy link
Author

rwjack commented Sep 16, 2024

My bad, left the version out: 0.4.2.

Well, it might not be an actual librespot crash, but the music stops. Spotify goes from playing -> idle.

I thought it was a crash because every time I start librespot in my container, it starts off with the apresolve warnings, then it says how it authenticated as my user, etc. - Meaning something broke, because it started the whole apresolve and authentication loop again

@artenverho
Copy link

I seem to observe the same in the past week. I am not 100% sure but it almost seems like Spotify has become much more stringent with keeping the phone/pc/tablet client connected. Ever since I tweaked my wifi network to allow for lower latency roaming it has not dropped out at all (the librespot sever is connected via Ethernet btw.). I never had these issues before though.

@rwjack
Copy link
Author

rwjack commented Sep 18, 2024

Same here, over ethernet, although I did have this issue before, where it would just stop playing. I didn't exactly inspect the logs before, but it happened at least 10x less frequently, once every couple of days, as far as I remember.

@wisp3rwind
Copy link

What version is this?

I don't see any crashes in that log.

FWIW I've been observing the same with the current dev version: Sessions are disconnected quite frequently by the remote (This might have been happening before, but now it might happen as often as every few minutes). In consequence, the Spirc task shuts down and is restarted, resulting in losing the connection to the controlling Spotify client and music stopping.

Probably a duplicate of #1151, which for some people seems to be getting more severe now.

@artenverho
Copy link

The effect of tweaking my network was short lived. I have already experienced quite a few ‘subscription terminated’ errors today.

@tla44
Copy link

tla44 commented Sep 20, 2024

Seeing the same Connection reset by peer (os error 104) at random intervals.

Snapcast v0.28.0
Librespot 0.4.2
Debian GNU/Linux 12 (bookworm/Raspberry Pi OS)

Sep 20 17:54:01 raspberrypi snapserver[1532]: Connection reset by peer (os error 104)
Sep 20 17:54:01 raspberrypi snapserver[1532]: Spirc shut down unexpectedly
Sep 20 17:54:01 raspberrypi snapserver[1532]: Ignoring blacklisted access point ap-gue1.spotify.com:4070
Sep 20 17:54:01 raspberrypi snapserver[1532]: Ignoring blacklisted access point ap-gew4.spotify.com:443
Sep 20 17:54:01 raspberrypi snapserver[1532]: Connecting to AP "ap-gew1.spotify.com:4070"

@rwjack
Copy link
Author

rwjack commented Sep 20, 2024

oh no :c

Just today I tried disabling the apresolve bypass (0.0.0.0 in etc/hosts), thinking it might fix the issue, but judging from your comment I think we're back at square one.

EDIT: To confirm, this issue is unrelated to the apresolve bypass, I get the same error whether enabling the apresolve bypass, or not.

@bitclick
Copy link

i am affected too, with 0.4.2 and dev versions. could it be caused by the login-method change (password not working)?

@artenverho
Copy link

It seems to be getting worse for me.. this evening it is giving me the error almost every 30min… is there any way to debug this?

@herrernst
Copy link
Contributor

I've also noticing frequent disconnects at the moment, almost every other minute. I'm using current dev branch (118a9e1, which has new reconnect handling, but was also happening before):

Sep 29 13:28:03 rpi librespot[515]: [2024-09-29T11:28:03Z ERROR librespot_core::session] Connection to server closed.
Sep 29 13:28:03 rpi librespot[515]: [2024-09-29T11:28:03Z ERROR librespot_core::session] Connection reset by peer (os error 104)
Sep 29 13:28:03 rpi librespot[515]: [2024-09-29T11:28:03Z ERROR librespot_connect::spirc] connection ID update selected, but none received
Sep 29 13:28:03 rpi librespot[515]: [2024-09-29T11:28:03Z WARN  librespot] Spirc shut down unexpectedly
Sep 29 13:28:04 rpi librespot[515]: [2024-09-29T11:28:04Z INFO  librespot_core::session] Connecting to AP "ap-gew4.spotify.com:4070"
Sep 29 13:28:04 rpi librespot[515]: [2024-09-29T11:28:04Z INFO  librespot_core::session] Authenticated as 'username' !
Sep 29 13:28:04 rpi librespot[515]: [2024-09-29T11:28:04Z INFO  librespot_core::session] Country: "AT"
Sep 29 13:30:09 rpi librespot[515]: [2024-09-29T11:30:09Z ERROR librespot_core::session] session lost connection to server
Sep 29 13:30:09 rpi librespot[515]: [2024-09-29T11:30:09Z ERROR librespot_connect::spirc] remote update selected, but none received
Sep 29 13:30:09 rpi librespot[515]: [2024-09-29T11:30:09Z WARN  librespot] Spirc shut down unexpectedly
Sep 29 13:30:09 rpi librespot[515]: [2024-09-29T11:30:09Z INFO  librespot_core::session] Connecting to AP "ap-gew4.spotify.com:4070"
Sep 29 13:30:10 rpi librespot[515]: [2024-09-29T11:30:10Z INFO  librespot_core::session] Authenticated as 'username' !
Sep 29 13:30:10 rpi librespot[515]: [2024-09-29T11:30:10Z INFO  librespot_core::session] Country: "AT"

@kingosticks
Copy link
Contributor

Someone who is experiencing this is going to have to try and debug it. Maybe compare the look of our traffic with the official client in wireshark? Is there some obvious thing there? Otherwise go the full hog with spotify-analyze and a mitm proxy setup?

@kingosticks
Copy link
Contributor

I've also noticing frequent disconnects at the moment, almost every other minute

When you say that, do you really mean it's always every two minutes? That's a big hint....

@ius
Copy link

ius commented Sep 29, 2024

Someone who is experiencing this is going to have to try and debug it. Maybe compare the look of our traffic with the official client in wireshark? Is there some obvious thing there? Otherwise go the full hog with spotify-analyze and a mitm proxy setup?

I've performed a tcpdump (of spotifyd) and observed the following after ~30min of playback:

  • ap-gew4.spotify.com (34.158.1.133) starts sending very small messages (11 bytes TLS payload) on a 2 minute interval
  • Each message is answered by my client with an equal size response
  • Right after the fourth exchange, the server apparently kills the connection using TCP RSTs (no clean shutdown!?)

When you say that, do you really mean it's always every two minutes? That's a big hint....

I guess this might be the two minute interval you're looking for? Seems like a keepalive sequence to me but then again I don't know the Spotify protocol at all

@herrernst
Copy link
Contributor

That apparently corresponds to my log:

Sep 29 14:02:40 rpi librespot[515]: [2024-09-29T12:02:40Z INFO  librespot_playback::player] Loading <Bells> with Spotify URI <spotify:track:7MeZ2vkj3UNOsMiekKRbGM>
Sep 29 14:02:40 rpi librespot[515]: [2024-09-29T12:02:40Z INFO  librespot_playback::player] <Bells> (278026 ms) loaded
Sep 29 14:07:18 rpi librespot[515]: [2024-09-29T12:07:18Z INFO  librespot_playback::player] Loading <Two for Joy> with Spotify URI <spotify:track:1da3pb7bvblDW6RxgjRkp5>
Sep 29 14:07:19 rpi librespot[515]: [2024-09-29T12:07:19Z INFO  librespot_playback::player] <Two for Joy> (358373 ms) loaded
Sep 29 14:13:16 rpi librespot[515]: [2024-09-29T12:13:16Z INFO  librespot_playback::player] Loading <If Not Now, When?> with Spotify URI <spotify:track:3le3GTVe74KjeR5rT8r5GB>
Sep 29 14:13:17 rpi librespot[515]: [2024-09-29T12:13:17Z INFO  librespot_playback::player] <If Not Now, When?> (347293 ms) loaded
Sep 29 14:19:04 rpi librespot[515]: [2024-09-29T12:19:04Z INFO  librespot_playback::player] Loading <Eight Steps> with Spotify URI <spotify:track:0t4O65QuTib5GfB0bcOVuf>
Sep 29 14:19:04 rpi librespot[515]: [2024-09-29T12:19:04Z INFO  librespot_playback::player] <Eight Steps> (301173 ms) loaded
Sep 29 14:24:05 rpi librespot[515]: [2024-09-29T12:24:05Z INFO  librespot_playback::player] Loading <Gone Darker> with Spotify URI <spotify:track:7IhWfavmX0X25CgfZn3bFd>
Sep 29 14:24:05 rpi librespot[515]: [2024-09-29T12:24:05Z INFO  librespot_playback::player] <Gone Darker> (425053 ms) loaded
Sep 29 14:26:15 rpi librespot[515]: [2024-09-29T12:26:15Z ERROR librespot_core::session] Connection to server closed.
Sep 29 14:26:15 rpi librespot[515]: [2024-09-29T12:26:15Z ERROR librespot_core::session] Connection reset by peer (os error 104)
Sep 29 14:26:15 rpi librespot[515]: [2024-09-29T12:26:15Z ERROR librespot_connect::spirc] user attributes mutation selected, but none received
Sep 29 14:26:15 rpi librespot[515]: [2024-09-29T12:26:15Z WARN  librespot] Spirc shut down unexpectedly
Sep 29 14:26:15 rpi librespot[515]: [2024-09-29T12:26:15Z INFO  librespot_core::session] Connecting to AP "ap-gew4.spotify.com:4070"
Sep 29 14:26:16 rpi librespot[515]: [2024-09-29T12:26:16Z INFO  librespot_core::session] Authenticated as 'username' !
Sep 29 14:26:16 rpi librespot[515]: [2024-09-29T12:26:16Z INFO  librespot_core::session] Country: "AT"
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z ERROR librespot_core::session] Connection to server closed.
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z ERROR librespot_core::session] Connection reset by peer (os error 104)
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z ERROR librespot_connect::spirc] user attributes mutation selected, but none received
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z WARN  librespot_connect::spirc] Cannot flush spirc event sender when done.
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z WARN  librespot] Spirc shut down unexpectedly
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z INFO  librespot_core::session] Connecting to AP "ap-gew4.spotify.com:4070"
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z INFO  librespot_core::session] Authenticated as 'username' !
Sep 29 14:30:08 rpi librespot[515]: [2024-09-29T12:30:08Z INFO  librespot_core::session] Country: "AT"
Sep 29 14:32:13 rpi librespot[515]: [2024-09-29T12:32:13Z ERROR librespot_core::session] session lost connection to server
Sep 29 14:32:13 rpi librespot[515]: [2024-09-29T12:32:13Z ERROR librespot_connect::spirc] remote update selected, but none received
Sep 29 14:32:13 rpi librespot[515]: [2024-09-29T12:32:13Z WARN  librespot] Spirc shut down unexpectedly
Sep 29 14:32:14 rpi librespot[515]: [2024-09-29T12:32:14Z INFO  librespot_core::session] Connecting to AP "ap-gew4.spotify.com:4070"
Sep 29 14:32:14 rpi librespot[515]: [2024-09-29T12:32:14Z INFO  librespot_core::session] Authenticated as 'username' !
Sep 29 14:32:14 rpi librespot[515]: [2024-09-29T12:32:14Z INFO  librespot_core::session] Country: "AT"
Sep 29 14:34:19 rpi librespot[515]: [2024-09-29T12:34:19Z ERROR librespot_core::session] session lost connection to server
Sep 29 14:34:19 rpi librespot[515]: [2024-09-29T12:34:19Z ERROR librespot_connect::spirc] remote update selected, but none received
Sep 29 14:34:19 rpi librespot[515]: [2024-09-29T12:34:19Z WARN  librespot] Spirc shut down unexpectedly
Sep 29 14:34:19 rpi librespot[515]: [2024-09-29T12:34:19Z INFO  librespot_core::session] Connecting to AP "ap-gew4.spotify.com:4070"
Sep 29 14:34:20 rpi librespot[515]: [2024-09-29T12:34:20Z INFO  librespot_core::session] Authenticated as 'username' !
Sep 29 14:34:20 rpi librespot[515]: [2024-09-29T12:34:20Z INFO  librespot_core::session] Country: "AT"

@kingosticks
Copy link
Contributor

Their protocol has a Ping-Pong keep alive system. They also send a Pong-ACK too.

But, do I understand correctly, they RST the connection despite the 4th Pong being returned?

We could try some builds with all of this logged. we could try disabling our pong entirely to see if the same thing happens sooner. I wonder why only some people see this. Have you tried forcing the use of an access point using port 443 or 80?

@kingosticks
Copy link
Contributor

"session lost connection to server" is slightly different. It's the opposite to "Connection reset by peer". It's librespot saying it didn't get a Ping from Spotify within the last 125 seconds.

@rwjack
Copy link
Author

rwjack commented Sep 29, 2024

Have you tried forcing the use of an access point using port 443 or 80?

For me, these disconnects happen with or without the apresolve bypass (forcing 80/443)

@kingosticks
Copy link
Contributor

Have you tried forcing the use of an access point using port 443 or 80?

For me, these disconnects happen with or without the apresolve bypass (forcing 80/443)

To be clear, I'm not suggesting using the fallback access point. I'm suggesting trying one of the normal access points on either port 443 or 80, rather than 4070. And only because firewalls like traffic on HTTP ports.

@kingosticks
Copy link
Contributor

kingosticks commented Sep 30, 2024

Their protocol has a Ping-Pong keep alive system. They also send a Pong-ACK too.

I've just had a quick look at Spotify's Windows client in Wireshark. It doesn't ping-pong exactly like we do (did it change or did we always do this wrong?) Maybe this matters, maybe not - I'm still not having any issues with how we do it.

// 60 seconds after startup...
Ping (from server)
Pong (from client)
// No Pong-ACK (from server)??

// 60 secs later...
Ping
// 60 seconds later...
Pong
Pong-ACK?

// 60 secs later..
Ping
// 60 seconds later...
Pong
Pong-ACK?

// and so on.....

Feel free to see what your Spotify client is doing, especially if you're experiencing issues. Ideally do this on Linux/mac using spotify-analyze to get a better idea.

@kingosticks
Copy link
Contributor

Ideally do this on Linux/mac using spotify-analyze to get a better idea.

FYI I couldn't get it to work with latest version of wireshark but maybe someone else will have more luck.

@wisp3rwind
Copy link

Ideally do this on Linux/mac using spotify-analyze to get a better idea.

FYI I couldn't get it to work with latest version of wireshark but maybe someone else will have more luck.

Here's a patch for Wireshark 4.2:
From 2a0450edab4fff17cd0e4479e30a0fcf35d680dc Mon Sep 17 00:00:00 2001
From: wisp3rwind <17089248+wisp3rwind@users.noreply.github.com>
Date: Mon, 30 Sep 2024 17:20:21 +0200
Subject: [PATCH] fixes for recent spotify/wireshark

---
 dissect/dissect.sh                            |  2 +-
 .../protobuf_dissector/modules/ast/ast.lua    |  4 +-
 .../protobuf_dissector/modules/compiler.lua   |  4 +-
 .../modules/front_end/cursor.lua              |  4 +-
 .../modules/front_end/file_reader.lua         |  6 +--
 .../modules/front_end/lexer.lua               | 14 +++---
 .../modules/front_end/parser.lua              |  4 +-
 .../modules/front_end/token.lua               |  4 +-
 dissect/protobuf_dissector/modules/prefs.lua  | 10 ++++-
 dissect/protobuf_dissector/modules/syntax.lua |  4 +-
 dissect/protobuf_dissector/protobuf.lua       |  3 +-
 dissect/spotify.lua                           | 44 ++++++++++++++++---
 dump/dump.c                                   | 10 ++++-
 13 files changed, 80 insertions(+), 33 deletions(-)

diff --git a/dissect/dissect.sh b/dissect/dissect.sh
index d05fdf3..f00a708 100755
--- a/dissect/dissect.sh
+++ b/dissect/dissect.sh
@@ -1,7 +1,7 @@
 #!/usr/bin/env bash
 set -eux
 
-exec wireshark-gtk $1 \
+exec wireshark $1 \
     -X lua_script:protobuf_dissector/protobuf.lua \
     -X lua_script:spotify.lua \
     -X lua_script:mercury.lua \
diff --git a/dissect/protobuf_dissector/modules/ast/ast.lua b/dissect/protobuf_dissector/modules/ast/ast.lua
index 2eb4824..76fb15b 100644
--- a/dissect/protobuf_dissector/modules/ast/ast.lua
+++ b/dissect/protobuf_dissector/modules/ast/ast.lua
@@ -62,8 +62,8 @@ if not _G['protbuf_dissector'] then return end
 
 
 -- make sure wireshark is new enough
-if not GRegex then
-    return nil, "Wireshark is too old: no GRegex library"
+if not rex_pcre2 then
+    return nil, "Wireshark is too old: no rex_pcre2 library"
 end
 
 
diff --git a/dissect/protobuf_dissector/modules/compiler.lua b/dissect/protobuf_dissector/modules/compiler.lua
index 6f355dc..d964377 100644
--- a/dissect/protobuf_dissector/modules/compiler.lua
+++ b/dissect/protobuf_dissector/modules/compiler.lua
@@ -32,8 +32,8 @@ if not _G['protbuf_dissector'] then return end
 
 
 -- make sure wireshark is new enough
-if not GRegex then
-    return nil, "Wireshark is too old: no GRegex library"
+if not rex_pcre2 then
+    return nil, "Wireshark is too old: no rex_pcre2 library"
 end
 
 
diff --git a/dissect/protobuf_dissector/modules/front_end/cursor.lua b/dissect/protobuf_dissector/modules/front_end/cursor.lua
index b4cd3c9..134b04c 100644
--- a/dissect/protobuf_dissector/modules/front_end/cursor.lua
+++ b/dissect/protobuf_dissector/modules/front_end/cursor.lua
@@ -94,8 +94,8 @@ function Cursor:nextLine()
 end
 
 
-local wspace_rgx = GRegex.new("^([ \t]++)|^(\n)", "s")
-local chunk_rgx  = GRegex.new("^([^\n]++)|^(\n)", "s")
+local wspace_rgx = rex_pcre2.new("^([ \t]++)|^(\n)", "s")
+local chunk_rgx  = rex_pcre2.new("^([^\n]++)|^(\n)", "s")
 local len = string.len
 local sub = string.sub
 
diff --git a/dissect/protobuf_dissector/modules/front_end/file_reader.lua b/dissect/protobuf_dissector/modules/front_end/file_reader.lua
index 6bdc90a..fb086f2 100644
--- a/dissect/protobuf_dissector/modules/front_end/file_reader.lua
+++ b/dissect/protobuf_dissector/modules/front_end/file_reader.lua
@@ -24,8 +24,8 @@ if not _G['protbuf_dissector'] then return end
 
 
 -- make sure wireshark is new enough
-if not GRegex then
-    return nil, "Wireshark is too old: no GRegex library"
+if not rex_pcre2 then
+    return nil, "Wireshark is too old: no rex_pcre2 library"
 end
 
 
@@ -46,7 +46,7 @@ end
 function FileReader:read(name)
 end
 
-local filepath_rgx = GRegex.new("^(.*)([^/\\\\]+)$", "U")
+local filepath_rgx = rex_pcre2.new("^(.*)([^/\\\\]+)$", "U")
 
 
 --------------------------------------------------------------------------------
diff --git a/dissect/protobuf_dissector/modules/front_end/lexer.lua b/dissect/protobuf_dissector/modules/front_end/lexer.lua
index 25504be..460f027 100644
--- a/dissect/protobuf_dissector/modules/front_end/lexer.lua
+++ b/dissect/protobuf_dissector/modules/front_end/lexer.lua
@@ -50,8 +50,8 @@ if not _G['protbuf_dissector'] then return end
 
 
 -- make sure wireshark is new enough
-if not GRegex then
-    return nil, "Wireshark is too old: no GRegex library"
+if not rex_pcre2 then
+    return nil, "Wireshark is too old: no rex_pcre2 library"
 end
 
 local Settings    = require "settings"
@@ -83,7 +83,7 @@ function Lexer:generateVariableMatchPatterns()
     for _, token in ipairs(self.Syntax.token_info.VARIABLE) do
         if not token.value then
             dprint2("creating new variable regex for: ", token.ttype)
-            tbl[token.ttype] = GRegex.new("^" .. token.pattern .. "$")
+            tbl[token.ttype] = rex_pcre2.new("^" .. token.pattern .. "$")
         end
     end
     self.variable_token_regexes = tbl
@@ -97,7 +97,7 @@ function Lexer:generateStringMatchPatterns()
         for _, token in ipairs(self.Syntax.token_info[category]) do
             if not token.value then
                 dprint2("creating new string regex for: ", token.ttype)
-                tbl[token.ttype] = GRegex.new("^" .. token.pattern .. "$")
+                tbl[token.ttype] = rex_pcre2.new("^" .. token.pattern .. "$")
             end
         end
     end
@@ -112,7 +112,7 @@ function Lexer:generateSkipPatterns()
         for _, token in ipairs(self.Syntax.token_info[category]) do
             if token.skip then
                 dprint2("creating new skip regex for: ", token.ttype)
-                tbl[token.ttype] = GRegex.new(token.skip, token.skip_flags)
+                tbl[token.ttype] = rex_pcre2.new(token.skip, token.skip_flags)
                 if not tbl[token.ttype] then
                     error("Could not compile regex for: " .. token.ttype)
                 end
@@ -173,7 +173,7 @@ function Lexer:compilePatterns()
     local t = {}
     for ttype, str in pairs(self.pattern_strings) do
         -- Gregex will raise a Lua error if this doesn't succeed
-        t[ttype] = GRegex.new(str)
+        t[ttype] = rex_pcre2.new(str)
     end
     self.pattern = t
 end
@@ -226,7 +226,7 @@ function Lexer:tokenize(chunk, filename)
     -- normally we'd use Gregex.gmatch() in a for-loop to do this, but we need to
     -- skip quoted strings inside the for-loop, and that can't be done with gmatch;
     -- so we're going to do it the slower way by creating lots of substrings :(
-    -- also, GRegex.match() doesn't consider subsequent iterations to match a
+    -- also, rex_pcre2.match() doesn't consider subsequent iterations to match a
     -- pattern with a "^" anchor, which is unfortunate, so by creating substrings
     -- we get to use "^" to prevent skipping unmatched words/tokens
 
diff --git a/dissect/protobuf_dissector/modules/front_end/parser.lua b/dissect/protobuf_dissector/modules/front_end/parser.lua
index d03e1a9..7799034 100644
--- a/dissect/protobuf_dissector/modules/front_end/parser.lua
+++ b/dissect/protobuf_dissector/modules/front_end/parser.lua
@@ -40,8 +40,8 @@ if not _G['protbuf_dissector'] then return end
 
 
 -- make sure wireshark is new enough
-if not GRegex then
-    return nil, "Wireshark is too old: no GRegex library"
+if not rex_pcre2 then
+    return nil, "Wireshark is too old: no rex_pcre2 library"
 end
 
 
diff --git a/dissect/protobuf_dissector/modules/front_end/token.lua b/dissect/protobuf_dissector/modules/front_end/token.lua
index 5f2b32c..259a9ae 100644
--- a/dissect/protobuf_dissector/modules/front_end/token.lua
+++ b/dissect/protobuf_dissector/modules/front_end/token.lua
@@ -93,8 +93,8 @@ function Token:isNativeType()
 end
 
 
-local hex_rgx = GRegex.new("^0[xX]([a-fA-F0-9]+)$")
-local oct_rgx = GRegex.new("^0([0-7]+)$")
+local hex_rgx = rex_pcre2.new("^0[xX]([a-fA-F0-9]+)$")
+local oct_rgx = rex_pcre2.new("^0([0-7]+)$")
 
 
 function Token:convertToNumber()
diff --git a/dissect/protobuf_dissector/modules/prefs.lua b/dissect/protobuf_dissector/modules/prefs.lua
index ba5cdda..28f66f8 100644
--- a/dissect/protobuf_dissector/modules/prefs.lua
+++ b/dissect/protobuf_dissector/modules/prefs.lua
@@ -14,6 +14,12 @@
 if not _G['protbuf_dissector'] then return end
 
 
+-- make sure wireshark is new enough
+if not rex_pcre2 then
+    return nil, "Wireshark is too old: no rex_pcre2 library"
+end
+
+
 local Settings = require "settings"
 local dprint   = Settings.dprint
 local dprint2  = Settings.dprint2
@@ -24,10 +30,10 @@ local derror   = Settings.derror
 local Prefs = {}
 
 
-local range_rgx = GRegex.new("([0-9]+)\\s*(?:-\\s*([0-9]+))?")
+local range_rgx = rex_pcre2.new("([0-9]+)\\s*(?:-\\s*([0-9]+))?")
 local function getRange(range)
     local t = {}
-    for first, second in GRegex.gmatch(range, range_rgx) do
+    for first, second in rex_pcre2.gmatch(range, range_rgx) do
         if first then
             first = tonumber(first)
             if second then
diff --git a/dissect/protobuf_dissector/modules/syntax.lua b/dissect/protobuf_dissector/modules/syntax.lua
index 7dce9e2..9fa9738 100644
--- a/dissect/protobuf_dissector/modules/syntax.lua
+++ b/dissect/protobuf_dissector/modules/syntax.lua
@@ -26,8 +26,8 @@ if not _G['protbuf_dissector'] then return end
 
 
 -- make sure wireshark is new enough
-if not GRegex then
-    error("Wireshark is too old: no GRegex library - upgrade to version 1.12 or higher.")
+if not rex_pcre2 then
+    error("Wireshark is too old: no rex_pcre2 library - upgrade to version 1.12 or higher.")
 end
 
 
diff --git a/dissect/protobuf_dissector/protobuf.lua b/dissect/protobuf_dissector/protobuf.lua
index 9898c28..c38960c 100644
--- a/dissect/protobuf_dissector/protobuf.lua
+++ b/dissect/protobuf_dissector/protobuf.lua
@@ -40,7 +40,8 @@ _G['protbuf_dissector'] = {
 
 
 -- help wireshark find our modules
-package.prepend_path("modules")
+-- package.prepend_path("modules")
+package.path = __DIR__ .. __DIR_SEPARATOR__ .. "modules" .. __DIR_SEPARATOR__ .. "?.lua;" .. package.path
 
 
 -- load our settings
diff --git a/dissect/spotify.lua b/dissect/spotify.lua
index 66ab1db..78bdc80 100644
--- a/dissect/spotify.lua
+++ b/dissect/spotify.lua
@@ -28,16 +28,50 @@ function spotify.dissector(buffer, pinfo, tree)
     subtree:add(f.cmd, cmd)
     subtree:add(f.length, length)
 
-    if cmd:uint() == 0xab then
+    if false then
+    elseif cmd:uint() == 0x02 then
+        pinfo.cols.info = "SecretBlock"
+    elseif cmd:uint() == 0x04 then
+        pinfo.cols.info = "Ping: " .. payload(0, 4):uint()
+    elseif cmd:uint() == 0x08 then
+        pinfo.cols.info = "StreamChunk"
+    elseif cmd:uint() == 0x09 then
+        pinfo.cols.info = "StreamChunkRes"
+    elseif cmd:uint() == 0x0a then
+        pinfo.cols.info = "ChannelError"
+    elseif cmd:uint() == 0x0b then
+        pinfo.cols.info = "ChannelAbort"
+    elseif cmd:uint() == 0x0c then
+        pinfo.cols.info = "RequestKey: " .. payload()
+    elseif cmd:uint() == 0x0d then
+        pinfo.cols.info = "AesKey: " .. payload()
+    elseif cmd:uint() == 0x0e then
+        pinfo.cols.info = "AesKeyError"
+    elseif cmd:uint() == 0x19 then
+        pinfo.cols.info = "Image"
+    elseif cmd:uint() == 0x1b then
+        pinfo.cols.info = "CountryCode: " .. payload()
+    elseif cmd:uint() == 0x49 then
+        pinfo.cols.info = "Pong"
+    elseif cmd:uint() == 0x4a then
+        pinfo.cols.info = "PongAck"
+    elseif cmd:uint() == 0x4b then
+        pinfo.cols.info = "Pause"
+    -- elseif cmd:uint() == 0x50 then
+    --     pinfo.cols.info = "ProductInfo"
+    elseif cmd:uint() == 0x69 then
+        pinfo.cols.info = "LegacyWelcome"
+    elseif cmd:uint() == 0x76 then
+        pinfo.cols.info = "LicenseVersion"
+    elseif cmd:uint() == 0xab then
         DissectorTable.get("protobuf"):try("ClientResponseEncrypted", payload, pinfo, tree)
+        pinfo.cols.info = "Login"
     elseif cmd:uint() == 0xac then
         DissectorTable.get("protobuf"):try("APWelcome", payload, pinfo, tree)
+        pinfo.cols.info = "APWelcome"
     elseif cmd:uint() == 0xad then
         DissectorTable.get("protobuf"):try("APLoginFailed", payload, pinfo, tree)
-    elseif cmd:uint() == 0x0c then
-        pinfo.cols.info = "Req AudioKey: " .. payload()
-    elseif cmd:uint() == 0x0d then
-        pinfo.cols.info = "Got AudioKey: " .. payload()
+        pinfo.cols.info = "APLoginFailed"
     else
         DissectorTable.get("spotify.cmd"):try(cmd:uint(), payload, pinfo, tree)
     end
diff --git a/dump/dump.c b/dump/dump.c
index de486d1..3fe60c0 100644
--- a/dump/dump.c
+++ b/dump/dump.c
@@ -15,6 +15,7 @@
 #include <pthread.h>
 #include <dlfcn.h>
 #include <assert.h>
+#include <arpa/inet.h>
 
 #ifdef __APPLE__
 #include <mach-o/dyld.h>
@@ -103,11 +104,16 @@ static void find_shn_heuristic(void *text_start, size_t text_size, void **p_shn_
 }
 
 static void patch_shn(void) {
-    dump_fd = open("dump.pcap", O_CREAT | O_RDWR | O_TRUNC, 0644);
+    char *fname = (char *) malloc(64 * sizeof(char));
+    pid_t pid = getpid();
+    snprintf(fname, 64, "dump-%ld.pcap", (long) pid);
+
+    dump_fd = open(fname, O_CREAT | O_RDWR | O_TRUNC, 0644);
 
     pcap_write_header(dump_fd, PCAP_DLT_USER0);
 
-    printf("Patching ...\n");
+
+    printf("Patching ... (PID = %ld)\n", (long) pid);
 
     size_t text_size = 0;
     void *text_start = NULL;
-- 
2.46.2

@roderickvd
Copy link
Member

I've just had a quick look at Spotify's Windows client in Wireshark. It doesn't ping-pong exactly like we do (did it change or did we always do this wrong?) Maybe this matters, maybe not - I'm still not having any issues with how we do it.

We probably Pong right away, no?

The above could be explained as follows:

// 60 seconds after startup...
// --> Initially, ping-pong right away so both server and client know we're good to go
Ping (from server)
Pong (from client)
// No Pong-ACK (from server)??

// -->  transition to "CONNECTED" state (just my words)

// 60 secs later...
// --> Every minute, the server sends a heartbeat
Ping
// 60 seconds later...
// --> And every minute after, the client does the same
Pong
Pong-ACK?

// 60 secs later..
// --> Rinse ...
Ping
// 60 seconds later...
// --> ... and repeat
Pong
Pong-ACK?

// and so on.....

@kingosticks
Copy link
Contributor

kingosticks commented Sep 30, 2024

We probably Pong right away, no?

Yes, we do.

Yes, it could, I thought something along those lines too but it seems elaborate. My only real thought about this was maybe there was some value we should be sending back in the Pong. (We send zero). Maybe sometimes it's expecting more? I am guessing. I'm hoping someone with the issue will take a look at theirs.

I still couldn't get spotify-analyze to work properly to check the value. Value out of range errors now. It used to just work...

@wisp3rwind
Copy link

wisp3rwind commented Sep 30, 2024

What I'm seeing is (Spotify Desktop on Linux)

Ping // Server -> Client
// 60s
Pong // Client -> Server
PongAck // Server -> Client
// 60s
// go to beginning

i.e. the sequence is always the same, even after start-up.

I could open a PR with a quick & dirty fix, but I can't easily test it myself: The disconnects don't happen so frequently for me right now, so it's hard to say whether there's any difference (it might run for half an hour without connection resets).

EDIT: The Pong payload seems to be always zero.

@roderickvd
Copy link
Member

Yes, it could, I thoght something along those lines too but it seems elaborate.

That other project I'm working on actually does something as elaborate as that.

My only real thought about this was maybe there was some value we should be sending back in the Pong. We send zero. Maybe sometimes it's expecting more? I am guessing. I'm hoping someone with the issue will take a look at theirs.

Does the Ping message have some payload, like a unique ID that we should echo with the Pong?

@kingosticks
Copy link
Contributor

kingosticks commented Sep 30, 2024

I could open a PR with a quick & dirty fix, but I can't easily test it myself: T

I vote for ope a PR and let someone try it.

Does the Ping message have some payload, like a unique ID that we should echo with the Pong?

It contains the server timestamp, which we do some strange arithmetic on (it looks backwards to me) and I think we send back in some other spirc response later. It's possible it's this value coming back that Spotify doesn't like.

@wisp3rwind
Copy link

I still couldn't get spotify-analyze to work properly to check the value. Value out of range errors now. It used to just work...

Have you applied the entire patch above? That works for me (if it doesn't, I'll check whether I pasted some half-baked version).

I vote for ope a PR and let someone try it.

👍

@kingosticks
Copy link
Contributor

Yes I did apply the whole thing. It complains the dump is corrupt but I've tried it a few times.

@wisp3rwind
Copy link

wisp3rwind commented Sep 30, 2024

Oh, that's why I also updated dump.c: The patch that is applied to the executable affects several processes that concurrently write to the same file, corrupting it (only one process actually communicates with the server, but all of them write the pcap header). You'll have to make again and collect new dumps. It will now write to dump-<PID>.pcap files. Select the right one for analysis by file size.

@kingosticks
Copy link
Contributor

Excellent, I missed that. Do you want to PR those spotify-analyze fixes too?

@wisp3rwind
Copy link

@rwjack @artenverho @herrernst @ius @tla44 could you test whether #1357 fixes the issue for you?

@ius
Copy link

ius commented Oct 1, 2024

@rwjack @artenverho @herrernst @ius @tla44 could you test whether #1357 fixes the issue for you?

I was able to keep librespot running for 7h+ with that patch applied without a single reconnect.

I'll follow up with some A/B testing to be sure (unless enough others confirm it works for them).

@rwjack
Copy link
Author

rwjack commented Oct 1, 2024

I would gladly try it out, but I cannot seem to build the damn thing: #1346 (comment)

@herrernst
Copy link
Contributor

@rwjack @artenverho @herrernst @ius @tla44 could you test whether #1357 fixes the issue for you?

Works flawlessly for me now since some hours (wasn't possible before), thank you very much!

@noahhaon
Copy link

noahhaon commented Oct 2, 2024

@rwjack @artenverho @herrernst @ius @tla44 could you test whether #1357 fixes the issue for you?

Got this built and running for my LibreElec/kodi setup on a rpi4 and it has been working flawlessly for several hours today. Thank you for the fix @wisp3rwind <3

Couldn't get the LibreElec buildchain to build it successfully, however. It worked fine with commit a61f707c5c27babe726b065f01e09b4b3c8f722e, so some change in the librespot cargo dependencies (aws-lc-sys maybe?) since then broke the build - but that's a problem for the libreelec maintainers.

Was able to get it built w cross and some fiddling otherwise :)

@wisp3rwind
Copy link

Got this built and running for my LibreElec/kodi setup on a rpi4 and it has been working flawlessly for several hours today. Thank you for the fix @wisp3rwind <3

Thanks in particular also to @kingosticks for the initial diagnosis!

@fivebanger
Copy link
Contributor

I can confirm that the changes from #1357 works now for hours without any issue. I have seen several interruptions of Podcasts/Songs using librespot v0.4.2 but no interruption so far using this PR. Thanks.

@wisp3rwind
Copy link

There's a revised implementation of the modified keep-alive handling in #1359. It should behave the same as #1357, but it'd be great if any of you could give it a quick test as well!

@ius
Copy link

ius commented Oct 6, 2024

There's a revised implementation of the modified keep-alive handling in #1359. It should behave the same as #1357, but it'd be great if any of you could give it a quick test as well!

Had already spotted it and it's been running at least as well as the v1 for me for the past two days.

@antoinecellerier
Copy link

antoinecellerier commented Oct 6, 2024

I was getting quite frequent connection resets. After applying the proposed change in #1359 none have happened this afternoon with 5 hours of straight playback. Thanks for fixing the issue!

prior to the change, this is the frequency at which they'd been happening:

[2024-10-05T01:03:27Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T01:29:29Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T03:49:32Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T04:25:34Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T06:31:35Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T11:51:12Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T13:17:12Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T13:37:13Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T14:09:13Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T14:25:14Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T21:21:29Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-05T23:01:32Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T01:45:34Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T02:13:35Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T02:15:37Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T05:13:39Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T05:41:40Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T06:23:41Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T07:43:43Z ERROR librespot_core::session] Connection reset by peer (os error 104)
[2024-10-06T08:21:44Z ERROR librespot_core::session] Connection reset by peer (os error 104)

@artenverho
Copy link

This fixed it for me as well! Thanks!

Perhaps a rookie question: my setup originally used Spotifyd (mainly for the dbus feature). Compiling Spotifyd requires librespot version 0.4.2. Is it possible to apply the fix on this version too and use it to recompile spotifyd?

@roderickvd
Copy link
Member

Either #1357 or #1359 are going to be merged before 0.5 release, depending on which is going to work best for you guys. Please take the time to check it out!

@ektich
Copy link

ektich commented Oct 7, 2024

Either #1357 or #1359 are going to be merged before 0.5 release, depending on which is going to work best for you guys. Please take the time to check it out!

Compiled #1359 and been running without any problems for about an hour now. 0.4.2 was "connection reset"-ing for the past couple of weeks every 5 minutes or so.

@milnivlek
Copy link

Another data point in case it helps: I've been running a build with #1359 for most of today, playing background music for many hours straight. I no longer saw "Spirc shut down unexpectedly" at all during this time. Previously I ran into this problem at least once every 5 to 30 minutes. So the PR appears to have resolved the issue for me.

@jared-94
Copy link

Thanks for this PR.

Tested with #1359 I no longer have os error 104 .

But, now, even when I don't listen music, I get a os error 107.
I don't know if this is related to this PR, or a network issue, or something else.
To my knowledge, I never had this error before.

I'm running 4 instances of librespot with snapcast, on a VM machine (Proxmox VE 8.1)

Here is a log when error happens in the night, when nobody is listening music:

root@snapcast:~# service snapserver status
● snapserver.service - Snapcast server
     Loaded: loaded (/lib/systemd/system/snapserver.service; enabled; vendor preset: enabled)
     Active: active (running) since Tue 2024-10-08 10:39:04 CEST; 1 day 22h ago
       Docs: man:snapserver(1)
   Main PID: 32443 (snapserver)
      Tasks: 23 (limit: 4661)
     Memory: 44.4M
        CPU: 18min 19.128s
     CGroup: /system.slice/snapserver.service
             ├─32443 /usr/bin/snapserver --logging.sink=system --server.datadir=/var/lib/snapserver
             ├─32446 /usr/local/bin/librespot --cache-size-limit=2G --name Salon --bitrate 320 --backend pipe --cache /var/cache/librespot/steph --initial-volume 90 --onevent /home/steph/onevent salon --autoplay on --verbose
             ├─32449 /usr/local/bin/librespot --cache-size-limit=2G --name Chambre Steph --bitrate 320 --backend pipe --cache /var/cache/librespot/steph --initial-volume 90 --onevent /home/steph/onevent chambresteph --autoplay on --verb>
             ├─32452 /usr/local/bin/librespot --cache-size-limit=2G --name Bureau Steph --bitrate 320 --backend pipe --cache /var/cache/librespot/steph --initial-volume 90 --onevent /home/steph/onevent bureausteph --autoplay on --verbose
             └─32455 /usr/local/bin/librespot --cache-size-limit=2G --name 1erEtage --bitrate 320 --backend pipe --cache /var/cache/librespot/steph --initial-volume 90 --onevent /home/steph/onevent etage1 --autoplay on --verbose

oct. 10 02:48:48 snapcast snapserver[32443]: remote update selected, but none received
oct. 10 02:48:48 snapcast snapserver[32443]: Spirc shut down unexpectedly
oct. 10 02:48:48 snapcast snapserver[32443]: Transport endpoint is not connected (os error 107)
oct. 10 02:48:48 snapcast snapserver[32443]: Connecting to AP "ap-gew1.spotify.com:4070"
oct. 10 02:48:48 snapcast snapserver[32443]: Authenticated as '******' !
oct. 10 02:48:48 snapcast snapserver[32443]: Country: "FR"
oct. 10 04:43:43 snapcast snapserver[32443]: Error reading message header of length 0: End of file
oct. 10 04:43:44 snapcast snapserver[32443]: onDisconnect: 832d72b9-3c9a-4cf7-abcb-846ddc94f933
oct. 10 04:43:45 snapcast snapserver[32443]: StreamServer::NewConnection: 192.168.0.104
oct. 10 04:43:45 snapcast snapserver[32443]: Hello from 832d72b9-3c9a-4cf7-abcb-846ddc94f933, host: SHIELD Android TV, v0.28.0, ClientName: Snapclient, OS: Android 11, Arch: arm64-v8a, Protocol version: 2

@kingosticks
Copy link
Contributor

kingosticks commented Oct 10, 2024

I think we'd need a debug log for that. And not here, a separate issue.

@jared-94
Copy link

OK, but do you know how I can access librespot log when launched from snapcast ? (debian OS)

@kingosticks
Copy link
Contributor

OK, but do you know how I can access librespot log when launched from snapcast ? (debian OS)

Snapcast has a params argument (or something like that) where you can pass arbitrary extra options to librespot, pass -v. Then the log should be accessible via journalctl -u snapcast like you have above. Alternatively, maybe you can find a way to set environment variable RUST_LOG=trace for the snapcast process.

Failing that, launch it standalone for this debugging purpose.

@ncsufan8628
Copy link

anyone have a working binary for Alpine they can share? been failing miserably at compiling this myself and am facing the above issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.