This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Listing user's media doesn't list just user's media #11858

Closed
matrix07012 opened this issue Jan 29, 2022 · 10 comments · Fixed by #11862
Labels
A-Admin-API
S-Minor: Blocks non-critical functionality, workarounds exist.
T-Defect: Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@matrix07012

matrix07012 commented Jan 29, 2022

Description

I'm fairly certain that the admin API for listing all media of a user instead lists media from all the rooms the user is in, because listing mine shows a lot of media that I never uploaded.

Steps to reproduce

curl --header "Authorization: Bearer <token>" "http://localhost:8008/_synapse/admin/v1/users/@myaccount:foo.bar/media?dir=b&from=0&limit=50&order_by=created_ts"
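For readers who prefer scripting the same call, here is a minimal Python sketch that builds the equivalent request. The helper name `build_user_media_request` is made up for illustration; the endpoint path, query parameters, and placeholder token are taken from the curl command above. The request is only constructed, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_user_media_request(base_url: str, user_id: str, token: str, params: dict) -> Request:
    """Build (but do not send) a request for a user's media list.

    Hypothetical helper: only the path and parameters come from the report.
    """
    # The user ID contains '@' and ':', so percent-encode it for the URL path.
    path = f"/_synapse/admin/v1/users/{quote(user_id, safe='')}/media"
    query = urlencode(params)
    url = f"{base_url}{path}?{query}" if query else f"{base_url}{path}"
    return Request(url, headers={"Authorization": f"Bearer {token}"})


# 'from' is a Python keyword, so the parameters are passed as a dict.
req = build_user_media_request(
    "http://localhost:8008",
    "@myaccount:foo.bar",
    "<token>",
    {"dir": "b", "from": 0, "limit": 50, "order_by": "created_ts"},
)
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would perform the call against a running homeserver.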

Version information

Synapse 1.51.0 on Ubuntu 20.04.3, installed from the deb repo

@dklimpel
Contributor

Do you have any more information?
For example log files?
If you set the log level for SQL to DEBUG, you can see the queries.

@matrix07012
Author

> Do you have any more information? For example log files? If you set the log level for SQL to DEBUG, you can see the queries.

What event should I be looking for? I'm having a difficult time searching the log since debug prints so much.

@matrix07012
Author

I'm sorry, I can't find the query, but all the media that doesn't belong to the account seems to have an id starting with a date.
Like this:
{ "media_id": "2022-01-29_agqqKeCRXBRrPppr", "media_type": "image/jpeg", "media_length": 95792, "upload_name": null, "created_ts": 1643486140020, "last_access_ts": 1643486257844, "quarantined_by": null, "safe_from_quarantine": false }

@dklimpel
Contributor

That is strange. Media IDs in Synapse do not contain a date. The media_id is a random string:

    media_id = random_string(24)

You can find the admin API request for listing a user's media by searching for get_local_media_by_user_paginate_txn. That is the name of the transaction in the log file.
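Since DEBUG logging is noisy, a small filter can narrow the log down to that transaction. This is a rough sketch: it only does a substring match on the transaction name mentioned above, and the sample lines are made-up placeholders, not real Synapse log output.

```python
from typing import Iterable, Iterator

TXN_NAME = "get_local_media_by_user_paginate_txn"


def lines_for_transaction(lines: Iterable[str], txn_name: str = TXN_NAME) -> Iterator[str]:
    """Yield only the log lines that mention the given transaction name."""
    for line in lines:
        if txn_name in line:
            yield line.rstrip("\n")


# Placeholder lines for illustration only; a real log will look different.
sample_log = [
    "placeholder line 1: something unrelated\n",
    "placeholder line 2: get_local_media_by_user_paginate_txn ran a query\n",
    "placeholder line 3: something else\n",
]
matches = list(lines_for_transaction(sample_log))
print(matches)
```

For a real log, an open file handle can be passed instead of the list, since the function accepts any iterable of lines.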

@matrix07012
Author

matrix07012 commented Jan 30, 2022

> That is strange. Media IDs in Synapse do not contain a date. The media_id is a random string:
>
>     media_id = random_string(24)
>
> You can find the admin API request for listing a user's media by searching for get_local_media_by_user_paginate_txn. That is the name of the transaction in the log file.

Thanks, I found it. The query seems to be correct:

    SELECT "media_id", "media_type", "media_length", "upload_name", "created_ts", "last_access_ts", "quarantined_by", "safe_from_quarantine" FROM local_media_repository WHERE user_id = ? ORDER BY created_ts DESC, media_id ASC LIMIT ? OFFSET ?

Looks like the API is working correctly.

So I queried the database with just SELECT * FROM local_media_repository ORDER BY created_ts. Why is there a url_cache column? There are a lot of media IDs starting with a date and with url_cache populated; the oldest is from 2022-01-28. I think I know what's happening: URL previews get stored as local media and assigned to a random local user in the room. A friend on a different server posted a link yesterday, and I found it in the query results, assigned to one of my local users who is in that room.
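To make the observation concrete, the sketch below reproduces it on an in-memory SQLite table. The table is a simplified stand-in with only four of the columns discussed here, the first media ID and the example URL are invented, and the filter assumes url_cache is NULL for ordinary uploads, as the rows described above suggest. It is not Synapse's actual query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified stand-in for local_media_repository (assumption: real table has more columns).
conn.execute(
    "CREATE TABLE local_media_repository ("
    " media_id TEXT, user_id TEXT, created_ts INTEGER, url_cache TEXT)"
)
conn.executemany(
    "INSERT INTO local_media_repository VALUES (?, ?, ?, ?)",
    [
        # Invented upload row: random-looking ID, no url_cache.
        ("abcdefghijklmnopqrstuvwx", "@myaccount:foo.bar", 1, None),
        # Preview row shaped like the one reported above; the URL is a placeholder.
        ("2022-01-29_agqqKeCRXBRrPppr", "@myaccount:foo.bar", 2, "https://example.com/page"),
    ],
)


def media_for_user(conn: sqlite3.Connection, user_id: str, include_url_previews: bool) -> list:
    """Return media IDs for a user, optionally skipping rows with url_cache set."""
    sql = "SELECT media_id FROM local_media_repository WHERE user_id = ?"
    if not include_url_previews:
        sql += " AND url_cache IS NULL"
    sql += " ORDER BY created_ts DESC, media_id ASC"
    return [row[0] for row in conn.execute(sql, (user_id,))]


print(media_for_user(conn, "@myaccount:foo.bar", include_url_previews=True))
print(media_for_user(conn, "@myaccount:foo.bar", include_url_previews=False))
```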

@dklimpel
Contributor

OK. Those are URL previews created by _handle_url:

    async def _handle_url(
        self, url: str, user: UserID, allow_data_urls: bool = False
    ) -> MediaInfo:
        """
        Fetches content from a URL and parses the result to generate a MediaInfo.
        It uses the media storage provider to persist the fetched content and
        stores the mapping into the database.
        ...
        """
        # ... (rest of the function elided) ...
        file_id = datetime.date.today().isoformat() + "_" + random_string(16)

This should probably be documented somewhere.
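As a rough illustration of the two ID shapes mentioned in this thread, the sketch below generates one of each. The `random_string` here is a local stand-in that assumes the helper returns random ASCII letters; the two wrapper function names are invented for the example.

```python
import datetime
import random
import string


def random_string(length: int) -> str:
    """Local stand-in for the helper quoted above (assumed: random ASCII letters)."""
    return "".join(random.choice(string.ascii_letters) for _ in range(length))


def new_upload_media_id() -> str:
    # Shape quoted for regular uploads: 24 random characters.
    return random_string(24)


def new_url_preview_file_id() -> str:
    # Shape quoted from _handle_url: ISO date, underscore, 16 random characters.
    return datetime.date.today().isoformat() + "_" + random_string(16)


upload_id = new_upload_media_id()
preview_id = new_url_preview_file_id()
print(upload_id)
print(preview_id)
```

The date prefix is what makes preview entries stand out in the media listing, such as the 2022-01-29_agqqKeCRXBRrPppr entry reported earlier.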

@matrix07012
Author

> OK. Those are URL previews created by _handle_url:
>
>     async def _handle_url(
>         self, url: str, user: UserID, allow_data_urls: bool = False
>     ) -> MediaInfo:
>         """
>         Fetches content from a URL and parses the result to generate a MediaInfo.
>         It uses the media storage provider to persist the fetched content and
>         stores the mapping into the database.
>         ...
>         """
>         # ... (rest of the function elided) ...
>         file_id = datetime.date.today().isoformat() + "_" + random_string(16)
>
> This should probably be documented somewhere.

But assigning them to random local users who didn't post the link seems like a bug.

@dklimpel
Contributor

IMO it is not a bug. The user/sender who posts the link posts only a (text) link, not an image. If your server has URL previews enabled (which the sender does not know), the recipient's request downloads the image. That is why the recipient becomes the owner of the image.

@matrix07012
Author

> IMO it is not a bug. The user/sender who posts the link posts only a (text) link, not an image. If your server has URL previews enabled (which the sender does not know), the recipient's request downloads the image. That is why the recipient becomes the owner of the image.

OK, I understand. It still makes administration quite confusing though. Maybe adding an option to the API to exclude URL previews would be a good idea.
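Until such an option exists, an admin script could separate the two kinds of entries on the client side. The sketch below is a heuristic only: it assumes the response wraps the entries in a "media" list and treats any media_id with a date prefix as a URL preview, based on the ID format discussed above. The second entry in the sample payload is invented.

```python
import re
from typing import Any, Dict, List, Tuple

# Heuristic: preview IDs start with an ISO date and an underscore.
_DATE_PREFIX = re.compile(r"^\d{4}-\d{2}-\d{2}_")


def split_media(payload: Dict[str, Any]) -> Tuple[List[dict], List[dict]]:
    """Split a media listing into (uploads, url_previews) by media_id shape."""
    uploads: List[dict] = []
    previews: List[dict] = []
    for item in payload.get("media", []):
        if _DATE_PREFIX.match(item["media_id"]):
            previews.append(item)
        else:
            uploads.append(item)
    return uploads, previews


# Sample payload: the first entry is from this thread, the second is made up.
payload = {
    "media": [
        {"media_id": "2022-01-29_agqqKeCRXBRrPppr", "media_type": "image/jpeg"},
        {"media_id": "abcdefghijklmnopqrstuvwx", "media_type": "image/png"},
    ]
}
uploads, previews = split_media(payload)
print([m["media_id"] for m in uploads])
print([m["media_id"] for m in previews])
```

Filtering by ID shape is fragile compared with a server-side filter, so it is best seen as a stopgap for inspection rather than for deletion decisions.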

@dklimpel
Contributor

@clokep added the S-Minor and T-Defect labels on Feb 1, 2022
4 participants