Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Servers may sometimes temporarily blackhole federation traffic #1733

Closed
ara4n opened this issue Dec 30, 2016 · 10 comments
Closed

Servers may sometimes temporarily blackhole federation traffic #1733

ara4n opened this issue Dec 30, 2016 · 10 comments

Comments

@ara4n
Copy link
Member

ara4n commented Dec 30, 2016

Possibly related to #1729, i'm seeing arasphere failing to respond to federation requests entirely from darmstadt. This leaks inbound connections on arasphere and outbound connections on darmstadt (which is running pre-0.18.6, so doesn't clean up long-lived outbound connections as per #1725)

See https://matrix.org/_matrix/media/v1/download/matrix.org/RciZsIFyASPpHBLFOaUcxhum for logs.

This is probably the root cause of the leaking FDs on HSes which caused widespread problems between Dec 26-30.

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

This is probably somehow related to #1732, which was also happening at the same time

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

the darmstadt->arasphere requests finally unblocked after 6 hours:

2016-12-30 07:05:04,442 - synapse.access.https.8448 - 59 - INFO - PUT-16175 - 130.83.177.132 - 8448 - Received request: PUT /_matrix/federation/v1/send/1482195882156/
2016-12-30 07:05:04,442 - synapse.federation.transport.server - 138 - INFO - PUT-16175 - Request from darmstadt.ccc.de
2016-12-30 07:05:04,443 - synapse.federation.transport.server - 244 - INFO - PUT-16175 - Received txn 1482195882156 from darmstadt.ccc.de. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 14:01:49,436 - synapse.federation.federation_server - 558 - INFO - PUT-16175 - Missing 1 events for room '!cURbafjkfsMDVwdRDQ:matrix.org': ['$14829486863336717wieod:matrix.org']...
2016-12-30 14:01:49,436 - synapse.http.outbound - 124 - INFO - PUT-16175 - {POST-O-24208} [darmstadt.ccc.de] Sending request: POST matrix://darmstadt.ccc.de/_matrix/federation/v1/get_missing_events/!cURbafjkfsMDVwdRDQ:matrix.org
2016-12-30 14:01:49,475 - synapse.http.outbound - 209 - INFO - PUT-16175 - {POST-O-24208} [darmstadt.ccc.de] Result: 502 Bad Gateway
2016-12-30 14:01:49,805 - synapse.access.https.8448 - 91 - INFO - PUT-16175 - 130.83.177.132 - 8448 - {darmstadt.ccc.de} Processed request: 25005363ms (0ms, 0ms) (1ms/4) 72B 200 "PUT /_matrix/federation/v1/send/1482195882156/ HTTP/1.1" "Synapse/0.18.5 (b=develop,09f1591)"

Looks like this is indeed #1729, and the attempt to get_missing_events was somehow blocked on another one.

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

Yes, this is precisely what's happening. Looking at the first darmstadt->arasphere request to unblock, it seems that it was stacked up on an event_auth check for arasphere.net->productionservers.net for curbaf finally unblocking, triggered by a productionservers.net->arasphere.net transaction:

2016-12-30 14:01:48,336 - synapse.http.matrixfederationclient - 183 - WARNING - PUT-12409 - {GET-O-24163} Sending request failed to productionservers.net: GET matrix://productionservers.net/_matrix/federation/v1/event_auth/!cURbafjkfsMDVwdRDQ:matrix.org/$14168361922zrVc
V:arasphere.net: CancelledError - CancelledError: 
2016-12-30 14:01:48,336 - synapse.http.outbound - 209 - INFO - PUT-12409 - {GET-O-24163} [productionservers.net] Result: CancelledError - CancelledError: 
2016-12-30 14:01:48,336 - synapse.handlers.federation - 1497 - ERROR - PUT-12409 - Failed to get auth chain
Traceback (most recent call last):
  File "synapse/handlers/federation.py", line 1456, in do_auth
    [e.event_id for e in result["auth_chain"]]
CancelledError
2016-12-30 14:01:48,336 - synapse.handlers.federation - 1506 - INFO - PUT-12409 - Different auth: set(['$141661037721nBtSl:arasphere.net', '$1416450173215crCca:matrix.org'])
2016-12-30 14:01:48,337 - synapse.state - 385 - INFO - PUT-12409 - Resolving state for !cURbafjkfsMDVwdRDQ:matrix.org with 2 groups
2016-12-30 14:01:48,337 - synapse.handlers.federation - 1550 - INFO - PUT-12409 - Different auth after resolution: set(['$141661037721nBtSl:arasphere.net'])
2016-12-30 14:01:48,338 - synapse.federation.federation_server - 154 - ERROR -  - Failed to handle PDU
Traceback (most recent call last):
  File "synapse/federation/federation_server.py", line 147, in on_incoming_transaction
    )
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "synapse/federation/federation_server.py", line 578, in _handle_new_pdu
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "synapse/federation/federation_server.py", line 612, in _handle_new_pdu
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "synapse/handlers/federation.py", line 174, in on_receive_pdu
    origin,
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "synapse/handlers/federation.py", line 1173, in _handle_new_event
    (event_map[e_id].type, event_map[e_id].state_key): event_map[e_id]
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1126, in _inlineCallbacks
    result = result.throwExceptionIntoGenerator(g)
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/python/failure.py", line 389, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "synapse/handlers/federation.py", line 1351, in _prep_event
    try:
  File "/home/matrix/.synapse/local/lib/python2.7/site-packages/twisted/internet/defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "synapse/handlers/federation.py", line 1637, in do_auth
    "rejects": {
  File "synapse/api/auth.py", line 130, in check
    logger.info(
SynapseError: 403: Room '!cURbafjkfsMDVwdRDQ:matrix.org' does not exist
2016-12-30 14:01:48,339 - synapse.federation.federation_server - 558 - INFO - PUT-15997 - Missing 1 events for room '!cURbafjkfsMDVwdRDQ:matrix.org': ['$14829486863336717wieod:matrix.org']...
2016-12-30 14:01:48,339 - synapse.http.outbound - 124 - INFO - PUT-15997 - {POST-O-24205} [darmstadt.ccc.de] Sending request: POST matrix://darmstadt.ccc.de/_matrix/federation/v1/get_missing_events/!cURbafjkfsMDVwdRDQ:matrix.org

Meanwhile, looking at this blocking request (PUT-12409) - it started at 05:31 and was rattling around for ages (like, 8.4 hours) whilst blocking other stuff... ugh:

2016-12-30 05:31:54,505 - synapse.access.https.8448 - 59 - INFO - PUT-12409 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
...
2016-12-30 14:01:48,341 - synapse.access.https.8448 - 91 - INFO - PUT-12409 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 30593836ms (464ms, 15ms) (1639ms/22) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"

It looks like some of those blocks were presumably due to dependent requests being similarly blackholed. Despite arasphere being on 0.18.6-rc1. The nightmare logs of this request as seen by arasphere are here:

PUT-12409.txt

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

In fact, every time arasphere received this txn from productionservers.net it triggered a multi-hour meltdown:

matrix@phoenix:~/synapse$ cat darmstadt.log | grep '/_matrix/federation/v1/send/1483075196511/'
2016-12-30 05:27:10,750 - synapse.access.https.8448 - 59 - INFO - PUT-12095 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:28:14,870 - synapse.access.https.8448 - 59 - INFO - PUT-12158 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:29:41,904 - synapse.access.https.8448 - 59 - INFO - PUT-12246 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:31:54,505 - synapse.access.https.8448 - 59 - INFO - PUT-12409 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:33:59,901 - synapse.access.https.8448 - 59 - INFO - PUT-12487 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:36:03,867 - synapse.access.https.8448 - 59 - INFO - PUT-12569 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:38:20,371 - synapse.access.https.8448 - 59 - INFO - PUT-12673 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:40:15,857 - synapse.access.https.8448 - 59 - INFO - PUT-12743 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:42:42,741 - synapse.access.https.8448 - 59 - INFO - PUT-12892 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:45:07,980 - synapse.access.https.8448 - 59 - INFO - PUT-12995 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 05:46:59,120 - synapse.access.https.8448 - 59 - INFO - PUT-13067 - 66.41.93.104 - 8448 - Received request: PUT /_matrix/federation/v1/send/1483075196511/
2016-12-30 07:17:26,521 - synapse.access.https.8448 - 91 - INFO - PUT-12095 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 6615771ms (287ms, 12ms) (1065ms/22) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 07:17:29,298 - synapse.access.https.8448 - 91 - INFO - PUT-13067 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 5430178ms (3ms, 0ms) (0ms/0) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 07:17:29,567 - synapse.access.https.8448 - 91 - INFO - PUT-12995 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 5541587ms (8ms, 0ms) (0ms/0) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 07:17:29,681 - synapse.access.https.8448 - 91 - INFO - PUT-12892 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 5686940ms (0ms, 0ms) (0ms/0) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 07:17:29,791 - synapse.access.https.8448 - 91 - INFO - PUT-12743 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 5833934ms (4ms, 0ms) (0ms/0) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 07:17:29,792 - synapse.access.https.8448 - 91 - INFO - PUT-12673 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 5949421ms (4ms, 0ms) (0ms/0) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 07:17:29,793 - synapse.access.https.8448 - 91 - INFO - PUT-12569 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 6085926ms (4ms, 0ms) (0ms/0) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 07:17:29,793 - synapse.access.https.8448 - 91 - INFO - PUT-12487 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 6209892ms (0ms, 0ms) (0ms/0) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 09:25:52,328 - synapse.access.https.8448 - 91 - INFO - PUT-12158 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 14257458ms (359ms, 0ms) (1757ms/24) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 11:20:01,853 - synapse.access.https.8448 - 91 - INFO - PUT-12246 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 21019949ms (5435ms, 15ms) (1751ms/31) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"
2016-12-30 14:01:48,341 - synapse.access.https.8448 - 91 - INFO - PUT-12409 - 66.41.93.104 - 8448 - {productionservers.net} Processed request: 30593836ms (464ms, 15ms) (1639ms/22) 114B 200 "PUT /_matrix/federation/v1/send/1483075196511/ HTTP/1.1" "Synapse/0.18.5"

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

Looking at the first of these txns (PUT-12095), it seems arasphere was trying to track down a bunch of events from arasphere... which are not present in its own DB:

PUT-12095.txt

2016-12-30 05:27:11,577 - synapse.federation.federation_server - 558 - INFO - PUT-12095 - Missing 1 events for room '!cURbafjkfsMDVwdRDQ:matrix.org': ['$14168370037dJyNv:arasphere.net']...
2016-12-30 07:17:25,457 - synapse.federation.federation_server - 590 - INFO - PUT-12095 - Still missing 1 events for room '!cURbafjkfsMDVwdRDQ:matrix.org': ['$14168357801cKMiB:arasphere.net']...
2016-12-30 07:17:26,218 - synapse.handlers.federation - 1452 - INFO - PUT-12095 - Missing auth: set(['$141661037721nBtSl:arasphere.net'])

The full list of referenced events is:

$141661037721nBtSl:arasphere.net
$14168357801cKMiB:arasphere.net
$14168361922zrVcV:arasphere.net
$14168362293tYqnx:arasphere.net
$14168362334OLnTd:arasphere.net
$14168368325cmTRA:arasphere.net
$14168368996LFQag:arasphere.net
$14168370037dJyNv:arasphere.net

These are all events of erik doing things on arasphere in Nov 2014. Some of them them show up on matrix.org; none of them are on arasphere itself, probably because arasphere.net's DB is dead since then. So it looks like the original event that productionservers.net referenced was @erikj:arasphere.net leaving curbaf ($14168370037dJyNv:arasphere.net), which then triggers a meltdown of arasphere trying to find this event and similar ones from other servers. It goes around different broken servers (including ones which are down?!) trying to find the ancient events, which nobody of course has other than matrix.org.

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

A workaround for disabling spidering servers at random in get_missing_events has landed in #1734 and released in 0.18.6-rc2.

We're hoping these are the only things which were locking the room linearizer and so blackholing federation traffic for a given room. N.B. that after 3 concurrent federation requests get stuck from the server, all future ones get queued or 429'd by the federation rate limiting logic, which can then starve out the remote server from being able to federate to the local one entirely.

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

The problem is still there; productionservers.net just had a 25 minute hiatus whilst trying to find missing events. The first failed req from matrix.org was:

2016-12-30 16:45:51,494 - synapse.access.https.8448 - 59 - INFO - PUT-5941- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 16:45:51,498 - synapse.federation.transport.server - 244 - INFO - PUT-5941- Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 16:46:55,267 - synapse.access.https.8448 - 59 - INFO - PUT-6033- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 16:46:55,271 - synapse.federation.transport.server - 244 - INFO - PUT-6033- Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 16:48:08,265 - synapse.access.https.8448 - 59 - INFO - PUT-6161- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 16:49:57,993 - synapse.access.https.8448 - 59 - INFO - PUT-6261- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 16:52:14,352 - synapse.access.https.8448 - 59 - INFO - PUT-6455- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 16:54:26,582 - synapse.access.https.8448 - 59 - INFO - PUT-6615- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 16:56:42,876 - synapse.access.https.8448 - 59 - INFO - PUT-6846- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 16:58:41,679 - synapse.access.https.8448 - 59 - INFO - PUT-7083- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 17:00:33,760 - synapse.access.https.8448 - 59 - INFO - PUT-7347- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 17:02:39,180 - synapse.access.https.8448 - 59 - INFO - PUT-7788- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 17:03:59,739 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:04:44,788 - synapse.access.https.8448 - 59 - INFO - PUT-8239- - - 8448 - Received request: PUT /_matrix/federation/v1/send/1483124896752/
2016-12-30 17:08:09,613 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:32,338 - synapse.access.https.8448 - 91 - INFO - PUT-5941- - - 8448 - {matrix.org} Processed request: 1660844ms (6ms, 0ms) (90ms/4) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:34,353 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:34,869 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:34,873 - synapse.access.https.8448 - 91 - INFO - PUT-7347- - - 8448 - {matrix.org} Processed request: 781113ms (3ms, 0ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:38,414 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:38,417 - synapse.access.https.8448 - 91 - INFO - PUT-7083- - - 8448 - {matrix.org} Processed request: 896737ms (20ms, 0ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:39,238 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:39,242 - synapse.access.https.8448 - 91 - INFO - PUT-6846- - - 8448 - {matrix.org} Processed request: 1016364ms (35ms, 2ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:39,883 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:39,886 - synapse.access.https.8448 - 91 - INFO - PUT-6615- - - 8448 - {matrix.org} Processed request: 1153303ms (16ms, 1ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:40,423 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:40,426 - synapse.access.https.8448 - 91 - INFO - PUT-6455- - - 8448 - {matrix.org} Processed request: 1286074ms (8ms, 0ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:41,224 - synapse.federation.transport.server - 244 - INFO - - Received txn 1483124896752 from matrix.org. (PDUs: 1, EDUs: 0, failures: 0)
2016-12-30 17:13:41,228 - synapse.access.https.8448 - 91 - INFO - PUT-6261- - - 8448 - {matrix.org} Processed request: 1423234ms (44ms, 0ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:41,939 - synapse.access.https.8448 - 91 - INFO - PUT-6161- - - 8448 - {matrix.org} Processed request: 1533674ms (3ms, 0ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:13:51,289 - synapse.access.https.8448 - 91 - INFO - PUT-6033- - - 8448 - {matrix.org} Processed request: 1616021ms (2ms, 1ms) (12ms/4) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:14:34,454 - synapse.access.https.8448 - 91 - INFO - PUT-7788- - - 8448 - {matrix.org} Processed request: 715273ms (7ms, 0ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"
2016-12-30 17:14:36,355 - synapse.access.https.8448 - 91 - INFO - PUT-8239- - - 8448 - {matrix.org} Processed request: 591566ms (24ms, 0ms) (0ms/0) 51B 200 "PUT /_matrix/federation/v1/send/1483124896752/ HTTP/1.1" "Synapse/0.18.6-rc1 (b=matrix-org-hotfixes,2f703b8)"

@ara4n
Copy link
Member Author

ara4n commented Dec 30, 2016

Meanwhile, during the outage, it still seems to have been querying lots of different servers for missing events:

2016-12-30 16:44:23,768 - synapse.http.outbound - 124 - INFO - PUT-5507- {POST-O-9213} [croxis.net] Sending request: POST matrix://croxis.net/_matrix/federation/v1/get_missing_events/!cURbafjkfsMDVwdRDQ:matrix.org
2016-12-30 16:44:23,840 - synapse.http.outbound - 124 - INFO - PUT-5513- {POST-O-9215} [anemone.me] Sending request: POST matrix://anemone.me/_matrix/federation/v1/get_missing_events/!cURbafjkfsMDVwdRDQ:matrix.org
2016-12-30 16:44:23,952 - synapse.http.outbound - 124 - INFO - PUT-5515- {POST-O-9218} [edutel.nl] Sending request: POST matrix://edutel.nl/_matrix/federation/v1/get_missing_events/!cURbafjkfsMDVwdRDQ:matrix.org
2016-12-30 16:44:25,803 - synapse.http.outbound - 124 - INFO - PUT-5516- {POST-O-9222} [gazizova.net] Sending request: POST matrix://gazizova.net/_matrix/federation/v1/get_missing_events/!cURbafjkfsMDVwdRDQ:matrix.org
2016-12-30 16:44:26,279 - synapse.http.outbound - 124 - INFO - PUT-5527- {POST-O-9226} [maunium.net] Sending request: POST matrix://maunium.net/_matrix/federation/v1/get_missing_events/!cURbafjkfsMDVwdRDQ:matrix.org

@ara4n
Copy link
Member Author

ara4n commented Jan 5, 2017

Hopefully this can be improved a lot when #1744 lands, however it's not the only problem here. My hunch is that the only reason this is showing up so badly is #1753.

@ara4n
Copy link
Member Author

ara4n commented Jan 9, 2017

I think this is basically covered by #1729 now

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants