This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Synapse cannot replicate data that's bigger than 16384 bytes #6327

Closed
anoadragon453 opened this issue Nov 5, 2019 · 5 comments
Labels
A-Workers Problems related to running Synapse in Worker Mode (or replication) O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Tolerable Minor significance, cosmetic issues, low or no impact to users. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@anoadragon453
Member

Description

Traceback (most recent call last):
  File "/home/ops/.synapse3/env3/lib/python3.5/site-packages/twisted/internet/defer.py", line 1418, in _inlineCallbacks
    result = g.send(result)
StopIteration: ([(505374, ('@andrewm:amorgan.xyz', None, 'm.direct', '{"@xxx:matrix.org": ["!WUAgSOAjsAMjDlDkhf:amorgan.xyz", "!uLpJrldJgpBjerWFJD:matrix.org", "!aSlnrTeGrTkKwziiij:matrix.org"], "@yyy:matrix.org": ["!yZHTGeDKZUeKaqeTeU:matrix.org"], <truncated>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/ops/.synapse3/env3/lib/python3.5/site-packages/synapse/replication/tcp/resource.py", line 225, in _run_notifier_loop
    conn.stream_update(stream.NAME, token, row)
  File "/home/ops/.synapse3/env3/lib/python3.5/site-packages/synapse/replication/tcp/protocol.py", line 544, in stream_update
    self.send_command(RdataCommand(stream_name, token, data))
  File "/home/ops/.synapse3/env3/lib/python3.5/site-packages/synapse/replication/tcp/protocol.py", line 293, in send_command
    % (cmd.NAME, len(encoded_string), self.MAX_LENGTH)
Exception: Failed to send command RDATA as too long (21544 > 16384)

I'm running my Synapse in worker mode. When I change my m.direct account data, Synapse's master process attempts to replicate that change to the other workers. Sending fails, however, because of the following check in send_command (synapse/replication/tcp/protocol.py):

if len(encoded_string) > self.MAX_LENGTH:
    raise Exception(
        "Failed to send command %s as too long (%d > %d)"
        % (cmd.NAME, len(encoded_string), self.MAX_LENGTH)
    )

Removing that block means that messages are sent, but that's probably not the correct solution here.
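To make the failure concrete, here is a minimal standalone sketch of that guard. It is not Synapse's actual code: `check_command_length` and `CommandTooLongError` are hypothetical names used only for illustration (the real code raises a bare `Exception` inside `send_command`), and the limit is taken from the error message above.

```python
MAX_LENGTH = 16384  # limit reported in the error message above


class CommandTooLongError(Exception):
    """Hypothetical exception type; the real code raises a bare Exception."""


def check_command_length(
    name: str, encoded_string: bytes, max_length: int = MAX_LENGTH
) -> None:
    """Raise if the encoded command is longer than max_length bytes."""
    if len(encoded_string) > max_length:
        raise CommandTooLongError(
            "Failed to send command %s as too long (%d > %d)"
            % (name, len(encoded_string), max_length)
        )


# A payload of the size seen in the traceback trips the guard.
try:
    check_command_length("RDATA", b"x" * 21544)
except CommandTooLongError as exc:
    print(exc)
```

Anything at or below the limit passes silently, so only unusually large rows (such as a long m.direct mapping) hit this path.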

It's worth noting that this doesn't seem to affect any functionality here, as even after removing the block, the other workers drop the m.direct account_data information because they don't care about it:

2019-11-05 13:03:37,953 - synapse.replication.tcp.resource - 206 - DEBUG - replication_notifier-119932- Sending 1 updates to 4 connections
2019-11-05 13:03:37,953 - synapse.storage.SQL - 170 - DEBUG - persist_events-8315- [SQL values] {get_auth_chain_ids-22f676} [['$2G8qXHXGkPFQGTZuPpduhi9JDAgN2-yeI-2--W1Bs9M', '$unV8k7cZ1Oa6b01myu2uZKHbSdWKFvLyTHYwx1wfCt0', '$7XWXFJli6U8hOnHhjDPOnerAqXRMofSkBywyAtyjY5I'
2019-11-05 13:03:37,953 - synapse.replication.tcp.resource - 211 - INFO - replication_notifier-119932- Streaming: account_data -> 505400
2019-11-05 13:03:37,954 - synapse.replication.tcp.protocol - 551 - DEBUG - replication_notifier-119932- [synapse.app.federation_sender-KQqdj] Dropping RDATA 'account_data' 505400
2019-11-05 13:03:37,954 - synapse.replication.tcp.protocol - 551 - DEBUG - replication_notifier-119932- [synapse.app.appservice-iDcpI] Dropping RDATA 'account_data' 505400
2019-11-05 13:03:37,954 - synapse.replication.tcp.protocol - 551 - DEBUG - replication_notifier-119932- [synapse.app.user_dir-RNGId] Dropping RDATA 'account_data' 505400
2019-11-05 13:03:37,954 - synapse.replication.tcp.resource - 227 - ERROR - replication_notifier-119932- Failed to replicate

But maybe if I were running other types of workers, they would care about it? Then it would matter that the replication data failed to send.

Anyway, we probably shouldn't be hitting our limit and dropping things. Instead we should either raise the limit or break the data up into chunks and send them separately.
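As a rough illustration of the chunking option, the sketch below splits an encoded payload into pieces no larger than the limit and joins them back together on the receiving side. It is only a sketch under assumptions: `chunk_payload` and `reassemble` are hypothetical helpers, not part of Synapse, and a real fix would also need framing in the replication protocol (for example sequence markers) so the receiver knows which chunks belong together, which is not shown here.

```python
from typing import Iterable, List

MAX_LENGTH = 16384  # limit reported in the error message above


def chunk_payload(payload: bytes, max_chunk: int = MAX_LENGTH) -> List[bytes]:
    """Split payload into consecutive chunks of at most max_chunk bytes."""
    if max_chunk <= 0:
        raise ValueError("max_chunk must be positive")
    return [payload[i : i + max_chunk] for i in range(0, len(payload), max_chunk)]


def reassemble(chunks: Iterable[bytes]) -> bytes:
    """Join chunks back into the original payload (hypothetical receiver side)."""
    return b"".join(chunks)


# A payload of the size seen in the traceback splits into two chunks.
payload = b"x" * 21544
chunks = chunk_payload(payload)
print([len(c) for c in chunks])  # [16384, 5160]
assert reassemble(chunks) == payload
```

Raising the limit is the simpler change, but chunking keeps a bound on the size of any single message.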

Version information

  • Homeserver: amorgan.xyz
  • Version: v1.5.0
  • Install method: pip
  • Platform: Linux, debian

@neilisfragile added the z-p2, A-Workers, and z-bug labels Nov 6, 2019
@anoadragon453
Member Author

This can also be a problem for events: some of them are up to 65K bytes, so they would be dropped when sent over replication.

@ptman
Contributor

ptman commented Aug 25, 2022

#11728 - does redis solve this problem?

@DMRobertson added the S-Major, T-Defect, O-Uncommon, and S-Tolerable labels and removed the z-bug, z-p2, and S-Major labels Aug 25, 2022
@DMRobertson
Contributor

I'm going to assume that this is specific to the old TCP replication and mark this as S-Tolerable. I still don't understand the relationship between TCP replication and redis.

If that's correct, I suggest we close this (WONTFIX) because TCP replication is deprecated.

@ptman
Contributor

ptman commented Aug 25, 2022

IIUC redis replaced TCP replication

@richvdh
Member

richvdh commented Aug 25, 2022

yeah, this MAX_LENGTH limit is specific to direct-TCP replication. WONTFIX

@richvdh closed this as not planned Aug 25, 2022