Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

"Failed to load timeline position" (/context 404) from jump to last unread message button after using Purge History API #16207

Closed
zeratax opened this issue Aug 30, 2023 · 3 comments
Labels
A-Admin-API A-Context-Endpoint /context client API endpoint O-Occasional Affects or can be seen by some users regularly or most users rarely S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@zeratax
Copy link

zeratax commented Aug 30, 2023

Description

After my db exploded in size (aprox 300gb) i needed to quickly get rid of as much state as possible so I purged every room from any non local event older than 30 days.

But now any room that has only local members is missing any message older than 30 days.

Looking into an example channel, I see 0 messages in element and a jump to last unread message button that results in

Failed to load timeline position
Tried to load a specific point in this room's timeline, but was unable to find it.

Error loading timeline panel at !yHkWvIueFDsVBYAhRo:dmnd.sh/$HvTnDTEDlze3Q15hditXICmEchw_ef7hehoKxx2O_Uk M_NOT_FOUND: MatrixError: [404] Event not found. (https://dmnd.sh/_matrix/client/v3/rooms/!yHkWvIueFDsVBYAhRo%3Admnd.sh/context/%24HvTnDTEDlze3Q15hditXICmEchw_ef7hehoKxx2O_Uk?filter=%7B%22lazy_load_members%22%3Atrue%7D)
    at c (vector://vector/webapp/bundles/9080b7a61d57e094dbd7/bundle.js:2:1341581)
    at p.requestOtherUrl (vector://vector/webapp/bundles/9080b7a61d57e094dbd7/bundle.js:2:3603980)

Now when I directly interact with postgres I can still find tons of events in this room with:

SELECT e.room_id, r.name, e.event_id, e.type, e.content, j.json
  FROM events e
  LEFT JOIN event_json j USING (room_id)
  LEFT JOIN room_stats_state r USING (room_id)
  WHERE e.room_id = '!yHkWvIueFDsVBYAhRo:dmnd.sh'
  ORDER BY stream_ordering DESC
  LIMIT 100;

And also specifically event $HvTnDTEDlze3Q15hditXICmEchw_ef7hehoKxx2O_Uk is still there.

So far this seems to be true for all my local rooms, the events are still in the db, but not visible in Element. I can search for them on old clients, but I assume that's just because encrypted messages get cached for search.

Can I recover this somehow?

Steps to reproduce

I run my instance with workers, but during the purge all of those were disabled with only the main worker running and nginx also being disabled to avoid any other traffic to my instance.

I used this script to purge:

purge_script.zsh
FILE=./errors.txt
if test -f "$FILE"; then
  echo "retrying failed attempts..."
  A=("${(@f)$(cat $FILE)}")
  mv $FILE ./errors.txt.bak
else
  A=(`psql -U synapse_pg -d synapse_db -h localhost -c "SELECT room_id FROM rooms;"`)
fi

token="ADMIN_TOKEN"
timestamp=(`date --date="$(date) -30 days" +%s%3N`)
n=1
for line in ${A[@]}; do
if [[ "$line" =~ ^!.* ]];
then
    echo "${n}: trying to purge ${line}..."
    (( n += 1 ))
    result=(`curl -X POST "http://localhost:8008/_synapse/admin/v1/purge_history/${line}?access_token=${token}" --data "{ \"purge_up_to_ts\": ${timestamp} }"`)
    if [ $? -eq 0 ]; then
        echo "... succeded!"
    else
        echo $line >> errors.txt
        echo "... failed!"
        echo "curl -X POST 'http://localhost:8008/_synapse/admin/v1/purge_history/${line}?access_token=${token}'  --data '{ \"purge_up_to_ts\": ${timestamp} }'"
        sleep 3
    fi
    echo "result: $result"
fi
done

Homeserver

dmnd.sh

Synapse Version

{"server_version":"1.71.0","python_version":"3.8.10"}

Installation Method

pip (from PyPI)

Database

postgres with a single server. i think i migrated from sqlite but that must have been 5 years ago.

Workers

Multiple workers

Platform

  • OS: Ubuntu 20.04.6 LTS x86_64
  • Host: KVM/QEMU (Standard PC (i440FX + PIIX, 1996) pc-i440fx-7.2)
  • Kernel: 5.4.0-77-generic
  • Uptime: 92 days, 19 hours, 54 mins
  • Packages: 1347 (dpkg)
  • Shell: zsh 5.8
  • Terminal: /dev/pts/6
  • CPU: Intel Xeon E5-2630 v4 (6) @ 2.199GHz
  • GPU: 00:02.0 Vendor 1234 Device 1111
  • Memory: 7416MiB / 16013MiB

Configuration

No response

Relevant log output

2023-08-30 04:37:06,886 - synapse.metrics.background_process_metrics - 244 - ERROR - rotate_notifs-60- Background process 'rotate_notifs' threw an exception
Traceback (most recent call last):
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/metrics/background_process_metrics.py", line 242, in run
    return await func(*args, **kwargs)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/databases/main/event_push_actions.py", line 1307, in _rotate_notifs
    caught_up = await self.db_pool.runInteraction(
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 881, in runInteraction
    return await delay_cancellation(_runInteraction())
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/internet/defer.py", line 1443, in _inlineCallbacks
    result = current_context.run(result.throwExceptionIntoGenerator, g)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/python/failure.py", line 500, in throwExceptionIntoGenerator
    return g.throw(self.type, self.value, self.tb)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 848, in _runInteraction
    result = await self.runWithConnection(
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 976, in runWithConnection
    return await make_deferred_yieldable(
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/python/threadpool.py", line 238, in inContext
    result = inContext.theWork()  # type: ignore[attr-defined]
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/python/threadpool.py", line 254, in <lambda>
    inContext.theWork = lambda: context.call(  # type: ignore[attr-defined]
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/python/context.py", line 83, in callWithContext
    return func(*args, **kw)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 293, in _runWithConnection
    compat.reraise(excValue, excTraceback)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/python/deprecate.py", line 298, in deprecatedFunction
    return function(*args, **kwargs)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/python/compat.py", line 403, in reraise
    raise exception.with_traceback(traceback)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/twisted/enterprise/adbapi.py", line 284, in _runWithConnection
    result = func(conn, *args, **kw)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 969, in inner_func
    return func(db_conn, *args, **kwargs)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 710, in new_transaction
    r = func(cursor, *args, **kwargs)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/databases/main/event_push_actions.py", line 1530, in _rotate_notifs_txn
    self._rotate_notifs_before_txn(
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/databases/main/event_push_actions.py", line 1643, in _rotate_notifs_before_txn
    self.db_pool.simple_upsert_many_txn(
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 1464, in simple_upsert_many_txn
    return self.simple_upsert_many_txn_native_upsert(
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 1560, in simple_upsert_many_txn_native_upsert
    txn.execute_values(sql, args, fetch=False)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 382, in execute_values
    return self._do_execute(
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 436, in _do_execute
    return func(sql, *args, **kwargs)
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/synapse/storage/database.py", line 383, in <lambda>
    lambda the_sql: execute_values(self.txn, the_sql, values, fetch=fetch),
  File "/home/matrix/synapse/env/lib/python3.8/site-packages/psycopg2/extras.py", line 1292, in execute_values
    cur.execute(b''.join(parts))
psycopg2.errors.UniqueViolation: duplicate key value violates unique constraint "event_push_summary_unique_index"
DETAIL:  Key (user_id, room_id)=(@echo_bot:dmnd.sh, !ZWwTVjrGzdpdVdkRBC:matrix.org) already exists.

Anything else that would be useful to know?

I don't have any logging from the purging itself since it happened too long ago and I haven't yet vacuumed so I am still running out of space. The log above is the only error I get repeatedly currently.

All of these rooms are room version 1-4 so very old

@zeratax
Copy link
Author

zeratax commented Aug 30, 2023

actually nevermind the exact even that the last unread message points to I can't seem to find this way? What i found before just had it listed in the prev_events field:

event from !yHkWvIueFDsVBYAhRo:dmnd.sh
{
  "auth_events": [
    "$L0qPnlPdup8ulMk4RfqhJvStgLDUBl4ngczvw6qYrY4",
    "$YKavGK4PhIqWmMyFzpNu6aR1nHcha9w-cLiaNQB-FUI",
    "$Zss-GH6gJXGRfZTjy2aDIjvc7nzVXOUkgAOWml090Fs",
    "$-L8OOdBQgIQWQOVwjCYVjH3wr7MYbyn5dHg4E-8fTvA"
  ],
  "prev_events": [
    "$HvTnDTEDlze3Q15hditXICmEchw_ef7hehoKxx2O_Uk"
  ],
  "type": "m.room.member",
  "room_id": "!yHkWvIueFDsVBYAhRo:dmnd.sh",
  "sender": "@daniel:dmnd.sh",
  "content": {
    "membership": "join",
    "displayname": "Daniel",
    "avatar_url": "mxc://dmnd.sh/eCrtVyqTKrlHahhCkxNFWhFX"
  },
  "depth": 23,
  "prev_state": [],
  "state_key": "@daniel:dmnd.sh",
  "origin": "dmnd.sh",
  "origin_server_ts": 1585755085445,
  "hashes": {
    "sha256": "ip1PpFOScC9EDnt4bvu3wmLMYJbrYT9p07KucmbHV/c"
  },
  "signatures": {
    "dmnd.sh": {
      "ed25519:a_Ysle": "7GAMP20JYJdQkYafud+Igrfv5OnHM4juAH42OJC38QTyqKQ4K+zwUvfunZFgSRkZAP7OlmFk+qiZdnRCHumWCg"
    }
  }
}

still i can see plenty of messages (as in real messages with a body, not just join events etc) in all these channels in postgres that i can not see in my element scrollback on no device, nor can other people in these rooms.

other event with a proper body
{
  "auth_events": [
    [
      "$1572119910866QDAAt:dmnd.sh",
      {
        "sha256": "sH4iQEw/dSfVo2XtXq1JMVX/3IZv2BcE5QLJNa0WMsc"
      }
    ],
    [
      "$15550156812593YvpRC:dmnd.sh",
      {
        "sha256": "3UhmP24q+5eTGEebXPN6AmKnPVQvotDqrSjplDazvnM"
      }
    ],
    [
      "$1572115012822ggxWZ:dmnd.sh",
      {
        "sha256": "e4sNcvvOljj6OKgqHcj2lJ6iHQmhTIeH6AXjfpJo7X0"
      }
    ]
  ],
  "prev_events": [
    [
      "$15757375341742LQAPQ:dmnd.sh",
      {
        "sha256": "uNAsG9nDcAuPQxTLxkErSoXB1/h+Hpg/Y/0qcnZoGw0"
      }
    ]
  ],
  "type": "m.room.message",
  "room_id": "!MYczBHIFOSXqTpHVGB:dmnd.sh",
  "sender": "@andre:dmnd.sh",
  "content": {
    "body": "hackerman.png",
    "info": {
      "size": 780459,
      "mimetype": "image/png",
      "thumbnail_info": {
        "w": 610,
        "h": 600,
        "mimetype": "image/png",
        "size": 570232
      },
      "w": 842,
      "h": 828,
      "thumbnail_url": "mxc://dmnd.sh/EYvdbKXiHuUIlDdkwJFkbtFS"
    },
    "msgtype": "m.image",
    "url": "mxc://dmnd.sh/SntLwOugOttOyGqUDYMjSkSK"
  },
  "depth": 77,
  "prev_state": [],
  "event_id": "$15757380361757BgoMK:dmnd.sh",
  "origin": "dmnd.sh",
  "origin_server_ts": 1575738036831,
  "hashes": {
    "sha256": "oLNGBP6AojOXN4mWoWX8FHHDcPjMqdTH7UMC2MaN+Ks"
  },
  "signatures": {
    "dmnd.sh": {
      "ed25519:a_Ysle": "UVgOW/s0fdA5uzRXKS3LDsXWPkiQpakn11miw2AdJKO2i1+cUs0k0kQA+KqwILmhtIW0E7rrNzWURo3zigzgCg"
    }
  },
  "unsigned": {
    "age_ts": 1575738036831
  }
}

@DMRobertson
Copy link
Contributor

For reference, the purge history docs are here.

{"server_version":"1.71.0","python_version":"3.8.10"}

Please note there are security advisories affecting this version, see https://github.com/matrix-org/synapse/security/advisories.


Error loading timeline panel at !yHkWvIueFDsVBYAhRo:dmnd.sh/$HvTnDTEDlze3Q15hditXICmEchw_ef7hehoKxx2O_Uk M_NOT_FOUND: MatrixError: [404] Event not found. (https://dmnd.sh/_matrix/client/v3/rooms/!yHkWvIueFDsVBYAhRo%3Admnd.sh/context/%24HvTnDTEDlze3Q15hditXICmEchw_ef7hehoKxx2O_Uk?filter=%7B%22lazy_load_members%22%3Atrue%7D)

Can you find Synapse logs corresponding to this URL?

I can see that

events = await self.get_events_as_list(
[event_id],
redact_behaviour=redact_behaviour,
get_prev_content=get_prev_content,
allow_rejected=allow_rejected,
)
event = events[0] if events else None
must have returned None, which means that get_events_as_list returned an empty list. There are many possible explanations for this and it's hard to reason them through. Could you enable debug logging and try to reproduce the jump-to-timeline error for that event? It might give us more clues about what is going on here.

@DMRobertson DMRobertson changed the title All local messages appear as lost after using Purge History API "Failed to load timeline position" (/context 404) from jump to last unread message button after using Purge History API Aug 30, 2023
@DMRobertson DMRobertson added A-Admin-API A-Context-Endpoint /context client API endpoint S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. O-Occasional Affects or can be seen by some users regularly or most users rarely X-Needs-Info This issue is blocked awaiting information from the reporter labels Aug 30, 2023
@clokep
Copy link
Member

clokep commented Sep 22, 2023

Closing due to lack of response.

If you're the reporter please:

  1. Upgrade Synapse to a more modern version.
  2. Provide the requested logs.

@clokep clokep closed this as completed Sep 22, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Admin-API A-Context-Endpoint /context client API endpoint O-Occasional Affects or can be seen by some users regularly or most users rarely S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Needs-Info This issue is blocked awaiting information from the reporter
Projects
None yet
Development

No branches or pull requests

3 participants