Add WebAPI for fetching torrent metadata #21015

Open
wants to merge 4 commits into
base: master

Piccirello
Member

@Piccirello Piccirello commented Jul 1, 2024

This PR implements two new APIs for fetching a torrent's metadata. The APIs accept a magnet URI, torrent hash, .torrent URL, or uploaded .torrent file, and return the torrent's associated metadata. This PR also modifies the /torrents/add API to support downloading a torrent whose metadata has been previously fetched. The ultimate goal is for the WebUI to provide an Add Torrent experience equivalent to that of the GUI, where content can be reprioritized/unchecked before the torrent is added.

/fetchMetadata API

HTTP request

To request metadata for a torrent, specify the torrent in the source query parameter of a GET request to /api/v2/torrents/fetchMetadata. Torrents are supported in the following formats:

  • magnet URI (e.g. magnet:?xt=urn:btih:a8eeefc8a0dc402b24686ddfd775a409fe4b00e0&dn=example)
  • hash (e.g. a8eeefc8a0dc402b24686ddfd775a409fe4b00e0)
  • .torrent URL (e.g. https://example.com/example.torrent)

HTTP response

Given the asynchronous nature of retrieving metadata, there are two successful HTTP status codes used.

When metadata is requested for a torrent that requires asynchronous background work (i.e. connecting to DHT/peers), the client will receive a 202. A 202 indicates that the request was successful, but additional background work must be completed before a meaningful response can be provided. The response will contain the info hashes, if available, or an empty object.

GET /api/v2/torrents/fetchMetadata?source=abc
HTTP/1.1 202 Accepted
{
  "hash": string,
  "infohash_v1": string,
  "infohash_v2": string
} | {}

When metadata is available for the torrent, either because the torrent exists in the transfer list or because the metadata has been retrieved from a prior request, the client will receive a 200.

GET /api/v2/torrents/fetchMetadata?source=abc
HTTP/1.1 200 OK
{
  "comment": string,
  "created_by": string,
  "creation_date": int,
  "files": [],
  "hash": string,
  "infohash_v1": string,
  "infohash_v2": string,
  "name": string,
  "piece_size": int,
  "pieces_num": int,
  "private": bool,
  "total_size": int,
  "trackers": [],
  "webseeds": []
}
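
As a rough illustration of how a client might consume the two status codes, here is a minimal polling sketch. The `fetch` callable, the retry count, and the delay are assumptions made for this example and are not part of the PR; `fetch` stands in for whatever HTTP library issues the GET request and returns the status code with the decoded JSON body.

```python
import time
from typing import Callable, Tuple

# Hypothetical transport: takes the `source` value and returns
# (HTTP status code, decoded JSON body). Any HTTP library could back it.
Fetch = Callable[[str], Tuple[int, dict]]


def wait_for_metadata(fetch: Fetch, source: str, attempts: int = 10,
                      delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> dict:
    """Re-request `source` until the server answers 200 with full metadata."""
    for attempt in range(attempts):
        status, body = fetch(source)
        if status == 200:
            return body  # full metadata is available
        if status != 202:
            raise RuntimeError(f"unexpected status {status}")
        # 202: request accepted, background work still pending.
        # The body holds the info hashes (if known) or an empty object.
        if attempt < attempts - 1:
            sleep(delay)
    raise TimeoutError(f"metadata for {source!r} not ready after {attempts} attempts")
```

Because each request returns immediately, the client decides how long to keep polling, which sidesteps request timeouts in reverse proxies.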

Retrieved metadata is cached in the current web session. Subsequent requests within the same web session return the metadata immediately, while other web sessions must re-retrieve the torrent's metadata from peers. Once a torrent is added using the cached metadata, the metadata is removed from the cache.
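
The caching rules above can be modeled with a small toy class. This is only a sketch of the described behavior, not the C++ implementation in this PR; the class and method names are hypothetical.

```python
from typing import Dict, Optional


class SessionMetadataCache:
    """Toy model: metadata is cached per web session and dropped once a torrent is added."""

    def __init__(self) -> None:
        # session id -> {info hash -> metadata}
        self._by_session: Dict[str, Dict[str, dict]] = {}

    def store(self, session_id: str, info_hash: str, metadata: dict) -> None:
        self._by_session.setdefault(session_id, {})[info_hash] = metadata

    def lookup(self, session_id: str, info_hash: str) -> Optional[dict]:
        # Other sessions never see this session's entries.
        return self._by_session.get(session_id, {}).get(info_hash)

    def take_for_add(self, session_id: str, info_hash: str) -> Optional[dict]:
        """Return the cached metadata and remove it, as happens when the torrent is added."""
        return self._by_session.get(session_id, {}).pop(info_hash, None)
```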

/parseMetadata API

HTTP request

To request metadata for one or more .torrent files, upload them to the /parseMetadata API as multipart MIME data. You may use any key for the uploaded value.

To test file upload using curl, specify the -F flag (e.g. curl https://127.0.0.1:8080/api/v2/torrents/parseMetadata --get -F file=@"/root/example.torrent").

HTTP response

This API always responds to successful requests (i.e. valid torrent file(s)) with a 200. The response contains the full metadata for the uploaded torrent(s), keyed by each uploaded file's name.

GET /api/v2/torrents/parseMetadata
HTTP/1.1 200 OK
{
  "example.torrent": {
    "comment": string,
    "created_by": string,
    "creation_date": int,
    "files": [],
    "hash": string,
    "infohash_v1": string,
    "infohash_v2": string,
    "name": string,
    "piece_size": int,
    "pieces_num": int,
    "private": bool,
    "total_size": int,
    "trackers": [],
    "webseeds": []
  }
}
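
For clients that do not use curl, the multipart body can be assembled by hand. The sketch below only builds the `Content-Type` value and the request body; the helper name, the `file0`, `file1`, ... field names, and the `application/x-bittorrent` part type are choices made for this example (the API accepts any key). File names containing double quotes are not escaped here.

```python
import uuid
from typing import Dict, Tuple


def build_multipart(files: Dict[str, bytes]) -> Tuple[str, bytes]:
    """Return (Content-Type header value, request body) for the given files."""
    boundary = uuid.uuid4().hex
    parts = []
    for index, (filename, content) in enumerate(files.items()):
        # The field name is arbitrary; the file name is what keys the response.
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file{index}"; filename="{filename}"\r\n'
            "Content-Type: application/x-bittorrent\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + content + b"\r\n")
    # Closing delimiter marks the end of the multipart payload.
    parts.append(f"--{boundary}--\r\n".encode("ascii"))
    return f"multipart/form-data; boundary={boundary}", b"".join(parts)
```

Since the response is keyed by file name, uploading `example.torrent` this way means the metadata can be read from the `"example.torrent"` entry of the returned object.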

/add API

The existing /add API now supports using the metadata cache that's populated by the new metadata APIs. When specifying a url and/or torrent file to download, the metadata cache is first checked for the torrent. If found, the metadata is used directly from the cache, rather than needing to re-retrieve it. Note that when specifying the name of a .torrent file uploaded via the parseMetadata API, you must prepend file: to the file name. For example, if you uploaded example.torrent to the parseMetadata API, you can add this torrent via the /add API by specifying a url of file:example.torrent.

When metadata is retrieved directly from the cache, you may also specify a new filePriorities parameter. This parameter allows for specifying the file priority of each file in the torrent. This parameter may only be specified when adding a single torrent.
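
A sketch of how a client might assemble the form fields for such a request follows. Two things here are assumptions rather than statements from this PR: the field name `urls` (taken from the existing /add API) and the comma-separated encoding of `filePriorities`, whose wire format is not spelled out above. The helper name is hypothetical.

```python
from typing import Dict, Sequence


def add_params_for_uploaded(filename: str, priorities: Sequence[int]) -> Dict[str, str]:
    """Build /add form fields for a torrent previously uploaded to parseMetadata."""
    return {
        # "file:" prefix tells /add to look the name up in the metadata cache.
        "urls": f"file:{filename}",
        # ASSUMPTION: one priority per file, in file order, comma-separated.
        "filePriorities": ",".join(str(p) for p in priorities),
    }
```

For example, `add_params_for_uploaded("example.torrent", [1, 0, 7])` yields a `urls` value of `file:example.torrent` and one priority per file for a three-file torrent.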

Alternatives:

I explored having the metadata API leave the request open until the metadata was available. Once the metadata was fetched, it would be returned directly in the response of the original request. One downside of this approach is that metadata retrieval can take an arbitrarily long time. This could result in torrents whose metadata could never be retrieved via this API (e.g. due to the retrieval taking longer than the client's/reverse proxy's request timeout). This approach would also require some further modification of qBittorrent's web application layer to suppress the default behavior of returning a blank response.

Future work:

  • Modify the WebUI to make use of the new /fetchMetadata API. This will likely mean splitting the current Add/Upload Torrent dialog into two dialogs. The first dialog will support specifying the URL(s)/.torrent file(s) to submit, while the second dialog will display the torrent's metadata and allow for modification of file priorities.
  • Support downloading the retrieved metadata as a .torrent file (as supported in the GUI)

Closes #20966.

@glassez glassez self-assigned this Jul 1, 2024
@glassez glassez added the WebAPI WebAPI-related issues/changes label Jul 1, 2024
Member

@glassez glassez left a comment

Only preliminary comments after brief review.

src/webui/api/apicontroller.h Outdated Show resolved Hide resolved
src/webui/api/apicontroller.h Outdated Show resolved Hide resolved
src/webui/api/apicontroller.h Outdated Show resolved Hide resolved
src/webui/api/apicontroller.h Outdated Show resolved Hide resolved
src/webui/api/serialize/serialize_torrent.h Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.h Outdated Show resolved Hide resolved
@glassez
Member

glassez commented Jul 1, 2024

@Piccirello
I wonder if it's difficult to implement client side bencode parser in JS? It is very possible that one (or more) already exists.
I believe that if we had used one, we would have got a more universal solution. Since users often add local (for the WebUI) .torrent files, we would not have to send it to the server first for parsing, and then again to add the torrent. As for magnet links, it would also look easier if you sent raw metadata to the client and forget about it. Otherwise, how do you propose to add such torrents (after WebUI user selects file priorities etc.)? Of course, you can store all this metadata in a session... But this seems to be a more cumbersome implementation.

In any case, this part of the API should be thought out, implying the subsequent addition of torrents. Otherwise, we may end up with something little useful in practice.

@Piccirello
Member Author

> Since users often add local (for the WebUI) .torrent files, we would not have to send it to the server first for parsing, and then again to add the torrent.

I'm not convinced that's the approach we'll eventually take. I can imagine sending the .torrent file once, returning the metadata to the client, and then allowing the torrent to be added without needing to re-send the file (likely by transmitting the info hash).

> As for magnet links, it would also look easier if you sent raw metadata to the client and forget about it. Otherwise, how do you propose to add such torrents (after WebUI user selects file priorities etc.)? Of course, you can store all this metadata in a session... But this seems to be a more cumbersome implementation.

To me the session seems like an appropriate place to store this. I don't think the client should be responsible for parsing this data. It would also mean each client (official and unofficial) would need to implement it.

> In any case, this part of the API should be thought out, implying the subsequent addition of torrents. Otherwise, we may end up with something little useful in practice.

I agree with this. I'll try to sketch out what the next steps would look like and how this API would be used.

@Piccirello Piccirello force-pushed the metadata-api branch 2 times, most recently from cec83e2 to f857170 Compare July 1, 2024 18:38
@NikcN22

NikcN22 commented Jul 1, 2024

> @Piccirello I wonder if it's difficult to implement client side bencode parser in JS? It is very possible that one (or more) already exists. I believe that if we had used one, we would have got a more universal solution. Since users often add local (for the WebUI) .torrent files, we would not have to send it to the server first for parsing, and then again to add the torrent. As for magnet links, it would also look easier if you sent raw metadata to the client and forget about it. Otherwise, how do you propose to add such torrents (after WebUI user selects file priorities etc.)? Of course, you can store all this metadata in a session... But this seems to be a more cumbersome implementation.
>
> In any case, this part of the API should be thought out, implying the subsequent addition of torrents. Otherwise, we may end up with something little useful in practice.

I think there is no need to embed the Bencode decoder directly into the client API. This is quite easy to do directly in the “graphical” part. More interesting is the need to provide the ability to assign priority to files in the add method.

@Piccirello Piccirello force-pushed the metadata-api branch 2 times, most recently from 1490741 to f736e62 Compare July 1, 2024 20:22
@Chocobo1
Member

Chocobo1 commented Jul 2, 2024

> I wonder if it's difficult to implement client side bencode parser in JS? It is very possible that one (or more) already exists.

FYI, bencode encoder/decoder libraries certainly exist in JS; however, I'm not aware of any that are compatible with BitTorrent v2. In the past, I had to modify one to suit my needs.

@glassez
Member

glassez commented Jul 3, 2024

> FYI, there certainly exists bencode encoder/decoder library in JS however I'm not aware that they are compatible with bittorrent v2.

"bencode" format is independent from BitTorrent so generic "bencode" decoder should not depend on it too. Or do you refer to torrent file specific parsers?

@Chocobo1
Member

Chocobo1 commented Jul 3, 2024

> "bencode" format is independent from BitTorrent so generic "bencode" decoder should not depend on it too.

They had deficiencies in their implementations and did not fully conform to the spec.

@glassez
Member

glassez commented Jul 3, 2024

> > "bencode" format is independent from BitTorrent so generic "bencode" decoder should not depend on it too.
>
> They had deficiencies in their implementations. Not fully conform with the spec.

It seems to be the same problem as with bencode editors. I couldn't find BitTorrent independent editor for Linux.
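
To make the point about a generic, BitTorrent-independent decoder concrete, here is a minimal bencode decoder sketch. It is written in Python for brevity rather than JS, is not taken from any existing library, and is deliberately lenient: it does not reject leading zeros, negative zero, or unsorted dictionary keys, which a strictly conforming decoder would.

```python
from typing import Any, Tuple


def _decode(data: bytes, pos: int) -> Tuple[Any, int]:
    """Decode one value starting at `pos`; return (value, position after it)."""
    lead = data[pos:pos + 1]
    if lead == b"i":  # integer: i<digits>e
        end = data.index(b"e", pos)
        return int(data[pos + 1:end]), end + 1
    if lead == b"l":  # list: l<items>e
        items, pos = [], pos + 1
        while data[pos:pos + 1] != b"e":
            item, pos = _decode(data, pos)
            items.append(item)
        return items, pos + 1
    if lead == b"d":  # dictionary: d<key><value>...e
        result, pos = {}, pos + 1
        while data[pos:pos + 1] != b"e":
            key, pos = _decode(data, pos)
            result[key], pos = _decode(data, pos)
        return result, pos + 1
    if lead.isdigit():  # byte string: <length>:<bytes>
        colon = data.index(b":", pos)
        length = int(data[pos:colon])
        start = colon + 1
        return data[start:start + length], start + length
    raise ValueError(f"invalid bencode at offset {pos}")


def bdecode(data: bytes) -> Any:
    """Decode a complete bencoded document, rejecting trailing bytes."""
    value, end = _decode(data, 0)
    if end != len(data):
        raise ValueError("trailing data after bencoded value")
    return value
```

Nothing in the decoder knows about torrents; interpreting keys such as `info` or distinguishing v1 from v2 structures would be the job of a separate, torrent-specific layer.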

@Piccirello
Member Author

I ended up exploring how retrieved metadata would tie into the /add API, resulting in some changes to the /metadata API. Namely, the /metadata API now supports processing multiple sources at once. I've also made the necessary changes to the /add API to support downloading a torrent whose metadata has been previously retrieved via /metadata. This allows for adding the torrent with custom file priorities, which will enable a future PR to modify the WebUI's Add Torrent experience to mimic that of the GUI. PR description has been modified with the full changes.

src/webui/api/torrentscontroller.cpp Fixed Show resolved Hide resolved
@Piccirello Piccirello marked this pull request as ready for review July 9, 2024 21:42
@Piccirello Piccirello requested a review from a team July 9, 2024 21:42
@glassez
Member

glassez commented Jul 12, 2024

> sources can be delineated by a (non-url encoded) comma

I suppose you're talking about percent-encoding, right?
I don't believe that is a valid requirement. IIRC, in a URL some characters MUST be percent-encoded and others CAN be percent-encoded.

@glassez
Member

glassez commented Jul 12, 2024

> When metadata is requested for multiple torrents, a 202 will be returned if any of the torrents requires asynchronous background work. This means that a 202 may include data for some of the requested torrents.

It would look much simpler and more convenient if "metadata" endpoint accepted only single source.

@Piccirello
Member Author

> > sources can be delineated by a (non-url encoded) comma
>
> I suppose you're talking about percent-encoding, right? I don't believe it is valid requirement. IIRC, in URL some of characters MUST be percent-encoded and other CAN be percent-encoded.

, is a reserved character and so it must be percent encoded.

> It would look much simpler and more convenient if "metadata" endpoint accepted only single source.

This was my initial approach but I changed it because of how it will integrate with the Add Torrent dialogs in the WebUI. With proper documentation and status codes, I think this API will be easy for consumers/clients to understand.
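
To illustrate the delimiter scheme being debated here (a literal comma separating sources, with any comma inside a source percent-encoded), a client-side sketch might look like the following. The helper name is hypothetical, and this reflects the multi-source variant discussed in this thread, not necessarily the final API.

```python
from typing import Iterable
from urllib.parse import quote


def join_sources(sources: Iterable[str]) -> str:
    """Percent-encode each source, then join them with a literal comma."""
    # safe="" makes quote() escape every reserved character, including "," and "/".
    return ",".join(quote(source, safe="") for source in sources)
```

With this encoding, the only literal commas left in the parameter value are the separators, so the server can split on `,` before decoding each source.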

@glassez
Member

glassez commented Jul 12, 2024

> , is a reserved character and so it must be percent encoded.

Then why do you suggest using it non-encoded?

> I changed it because of how it will integrate with the Add Torrent dialogs in the WebUI.

Do "Add Torrent dialogs in the WebUI" really require API endpoint to allow accepting multiple sources at once?

> With proper documentation and status codes, I think this API will be easy for consumers/clients to understand.

If "metadata" accepts only single source then 202 status would definitely mean that the response does not contain metadata, and 200 - on the contrary, that the metadata is available in the response.
If "metadata" accepts multiple sources then 202 status is ambiguous and you will still have to check the response body to understand which metadata is immediately available and which is not. In addition, it is possible that one of the sources is invalid. What should you do in this case? In single source approach you could just return error status...

@Piccirello Piccirello force-pushed the metadata-api branch 2 times, most recently from 778a776 to dee0d8c Compare September 30, 2024 12:34
@Piccirello
Member Author

@qbittorrent/bug-handlers this PR has been in a ready state for several weeks. Can I get a thorough review and/or an approval?

@glassez
Member

glassez commented Oct 4, 2024

> @qbittorrent/bug-handlers this PR has been in a ready state for several weeks. Can I get a thorough review and/or an approval?

It's still on my to-do list. Unfortunately it's hard to review such changes using the mobile web version of GitHub, which I usually use to review the PRs. And there is still not enough time at the computer for it, since there are many other tasks.

@Piccirello
Member Author

> > @qbittorrent/bug-handlers this PR has been in a ready state for several weeks. Can I get a thorough review and/or an approval?
>
> It's still on my to-do list. Unfortunately it's hard to review such changes using the mobile web version of GitHub, which I usually use to review the PRs. And there is still not enough time at the computer for it, since there are many other tasks.

Understood. You've already given several rounds of reviews, so I invite other folks to take a look too.

Member

@Chocobo1 Chocobo1 left a comment

I only commented on coding style.

src/webui/api/apicontroller.cpp Outdated Show resolved Hide resolved
Comment on lines 31 to 35
#include "base/bittorrent/torrentdescriptor.h"
#include "base/bittorrent/torrentinfo.h"
#include "base/net/downloadmanager.h"
Member

Since you already forward declared them:

Suggested change
- #include "base/bittorrent/torrentdescriptor.h"
- #include "base/bittorrent/torrentinfo.h"
- #include "base/net/downloadmanager.h"

Member

So how about this?

src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
Member

@Chocobo1 Chocobo1 left a comment

Last two comments for coding style.

src/webui/api/apicontroller.h Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
@Piccirello
Member Author

Any more feedback on this or is it ready for approvals?

@glassez
Member

glassez commented Oct 19, 2024

I still haven't gotten to the detailed code review, but at the same time I had the opportunity to rethink its design.
IMO, some questionable aspects would be eliminated if this API did not claim to be general-purpose, but were instead presented in a more utilitarian sense, as a companion to the torrent addition function. That is, extend the current torrents/add function and supplement it with a torrents/prepareAdd one.
What advantages could it have?

  1. This could be the only function that accepts both links to torrents (magnet links) and existing files (as you originally wanted to do, but unlike that time, this behavior looks legitimate in this interpretation, since it does what it says).
  2. There is no need to do anything doubtful if the requested torrent has already been added. In this case, torrents/prepareAdd naturally fails.

I will not insist (for subjective reasons). If someone explicitly approves of the current design, then so be it. Your responsibility.
Moreover, if someone (for example, @Chocobo1) finds it important to approve and merge it faster (without further code review), then you can do so. Maybe I'll come back to it later, when it's already been merged. Nothing prevents us from fixing the shortcomings of its implementation later (if any). The main thing in this matter is its basic design (and interface).

@@ -67,6 +69,8 @@ class APIController : public ApplicationComponent<QObject>
void setResult(const QJsonObject &result);
void setResult(const QByteArray &result, const QString &mimeType = {}, const QString &filename = {});

void setStatus(APIStatus status);
Member

Maybe just add another parameter to setResult?

void setResult(const QJsonObject &result, APIStatus status = APIStatus::Ok);

Member Author

How would this work for cases where we don't want to send any data? For example, setting an explicit 204.

src/webui/api/torrentscontroller.h Outdated Show resolved Hide resolved
if (m_torrentMetadata.contains(infoHash))
{
const BitTorrent::TorrentDescriptor &torrentDescr = m_torrentMetadata[infoHash];
return torrentDescr.info().has_value() && torrentDescr.info().value().isValid();
Member

I believe the following check is enough:

Suggested change
- return torrentDescr.info().has_value() && torrentDescr.info().value().isValid();
+ return torrentDescr.info().has_value();

src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
if (iter != m_torrentSource.end() && isMetadataDownloaded(iter.value()))
{
const BitTorrent::InfoHash &hash = iter.value();
const BitTorrent::TorrentDescriptor &torrentDescr = m_torrentMetadata[hash];
Member

> After using m_torrentSource.constFind, this should change to:

I don't understand what the relationship is... Why copy it?

src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
src/webui/api/torrentscontroller.cpp Outdated Show resolved Hide resolved
@Piccirello Piccirello force-pushed the metadata-api branch 9 times, most recently from b58e754 to 2e36fd9 Compare October 29, 2024 07:15
Signed-off-by: Thomas Piccirello <thomas@piccirello.com>
Labels
WebAPI WebAPI-related issues/changes
Development

Successfully merging this pull request may close these issues.

Load torrent metadata by magnet, hash... in qBittorrent-nox
4 participants