Automatically import videos from other platforms #754

roipoussiere · 2018-06-29T16:27:58Z

To ease PeerTube adoption by current YouTubers, it's important to simplify the import process.

The import script is a good step in this direction, but it is possible to do better: automatically import YouTube videos, so that the PeerTube import is totally transparent for the YouTuber.

We can do this without the YouTube API, by using Atom feeds: https://www.youtube.com/feeds/videos.xml?channel_id=<channel_id>.

I'm not used to TypeScript, but I wrote a small Python script as a proof of concept, available here.

Combined with the import script and a cron job running every 5 minutes, a PeerTube user could feed their instance just by specifying their YouTube channel name.

Note that we also need a way to check whether a video is already on the PeerTube instance. Could we store the YouTube video ID in some video metadata?
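
The feed-based approach described above can be sketched roughly as follows. This is not the proof-of-concept script linked in the issue, only an illustration: the function names, the `already_imported` set and the sample feed are hypothetical, and the `yt` namespace URI is an assumption that should be checked against a live feed.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Namespaces assumed to be used by YouTube channel feeds (verify on a live feed).
NS = {
    "atom": "http://www.w3.org/2005/Atom",
    "yt": "http://www.youtube.com/xml/schemas/2015",
}

FEED_URL = "https://www.youtube.com/feeds/videos.xml?channel_id={channel_id}"


def fetch_feed(channel_id):
    """Download the Atom feed of a channel and return it as text."""
    with urllib.request.urlopen(FEED_URL.format(channel_id=channel_id)) as response:
        return response.read().decode("utf-8")


def parse_feed(xml_text):
    """Return one dict per <entry> with the video ID, title and dates."""
    root = ET.fromstring(xml_text)
    videos = []
    for entry in root.findall("atom:entry", NS):
        videos.append({
            "video_id": entry.findtext("yt:videoId", namespaces=NS),
            "title": entry.findtext("atom:title", namespaces=NS),
            "published": entry.findtext("atom:published", namespaces=NS),
            "updated": entry.findtext("atom:updated", namespaces=NS),
        })
    return videos


def new_videos(xml_text, already_imported):
    """Keep only the feed entries whose video ID has not been imported yet."""
    return [v for v in parse_feed(xml_text) if v["video_id"] not in already_imported]


# Hand-written sample standing in for a downloaded feed (not real data).
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:yt="http://www.youtube.com/xml/schemas/2015">
  <entry>
    <yt:videoId>aaaaaaaaaaa</yt:videoId>
    <title>First video</title>
    <published>2018-06-01T10:00:00+00:00</published>
    <updated>2018-06-01T11:00:00+00:00</updated>
  </entry>
  <entry>
    <yt:videoId>bbbbbbbbbbb</yt:videoId>
    <title>Second video</title>
    <published>2018-06-02T10:00:00+00:00</published>
    <updated>2018-06-02T10:00:00+00:00</updated>
  </entry>
</feed>"""
```

A cron job would call `fetch_feed` for the channel, pass the result to `new_videos` together with the set of IDs already imported, and hand the remaining entries to the import script.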

rigelk · 2018-06-29T20:33:00Z

Since it's an auto-import, we can assume that previously imported videos have the same title as on YT. We could just check against titles for a first implementation.

rezonant · 2018-06-29T20:50:08Z

Maybe for a POC, but speaking as someone who does this sort of automated import ops at my day job (I work for a large YouTube channel), I can say that titles can and are modified after publishing, especially within the first hour after publishing. Can't we store the YouTube video ID in some form of metadata entry on the PeerTube video and look it up to see if we've already imported it?

rigelk · 2018-06-29T20:58:22Z

Ah, right :/

I guess it's not a problem to add this metadata, since it's something that doesn't need to be federated (meaning we don't break things if we add a field to the video model).

My concern is more as to how to make that metadata structure broad enough to be reused for imports from the other platforms supported by youtube-dl. I guess a HashMap is fine there.

roipoussiere · 2018-06-29T22:10:23Z

I can say that titles can and are modified after publishing

FYI, the YouTube feed also provides an updated field, so I guess that if a video is renamed, it will appear in the feed a second time, with the same published date (and YouTube ID) but a different updated date.

My concern is more as to how to make that metadata structure broad enough to be reused for imports from the other platforms supported by youtube-dl. I guess a HashMap is fine there.

But not all platforms supported by youtube-dl provide an Atom/RSS feed.

Note that a unique item ID is mandatory for Atom feeds:

atom:entry elements MUST contain exactly one atom:id element.

... but is optional for RSS feeds:

<guid> is an optional sub-element of <item>. guid stands for globally unique identifier. It's a string that uniquely identifies the item.

Related: the RSS-Bridge project provides Atom/RSS feeds for many websites, including video providers.

In fact, we could simply use the video URL as a unique identifier: it is supported by all platforms and is supposed to be unique.

rezonant · 2018-06-29T22:26:25Z

URL is an elegant way to do it, but canonicalization is a factor. Though you wouldn't run into it with auto YT import, you may run into it in other components. An example is the 4-5 different ways to express a YouTube URL (youtu.be, mobile.youtube.com/watch, gaming.youtube.com/watch, music.youtube.com/watch, YouTube.com/watch). For this reason I would recommend just having a provider and ID pair, or just a "provider:providerID" format. Use .split(':', 2) to separate the two.
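
To illustrate the canonicalization point, here is a minimal Python sketch that maps several YouTube URL shapes to a single "provider:providerID" key. The function names are hypothetical, and only the `youtu.be/<id>` and `.../watch?v=<id>` shapes are handled; other URL forms would need their own rules. In Python, `str.partition(':')` is used to separate the pair, so that an ID containing a colon is kept whole.

```python
from urllib.parse import urlparse, parse_qs


def youtube_key(url):
    """Map common YouTube URL shapes to one 'youtube:<id>' key, or None."""
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID in the path: https://youtu.be/<id>
        video_id = parts.path.lstrip("/")
    elif host == "youtube.com" or host.endswith(".youtube.com"):
        # Watch pages carry the ID in the "v" query parameter.
        video_id = parse_qs(parts.query).get("v", [""])[0]
    else:
        return None
    return f"youtube:{video_id}" if video_id else None


def split_key(key):
    """Split 'provider:providerID' on the first colon only."""
    provider, _, provider_id = key.partition(":")
    return provider, provider_id
```

With this, `https://youtu.be/abc123` and `https://music.youtube.com/watch?v=abc123` both yield the key `youtube:abc123`, which can then be compared against the keys already stored.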

roipoussiere · 2018-06-29T23:52:00Z

An example is the 4-5 different ways to express a YouTube URL (youtu.be, mobile.youtube.com/watch, gaming.youtube.com/watch, music.youtube.com/watch, YouTube.com/watch)

Hmm yes, good point. provider and providerID could be fine.

Note that Vimeo and Dailymotion also officially provide feeds, where we can find unique video IDs:

<feed> / <entry> / <yt:videoId> for YouTube (example);
<rss> / <channel> / <item> / <dm:id> for Dailymotion (example);
<rss> / <channel> / <item> / <guid> for Vimeo (example).

So we could easily implement auto-download for these platforms too.
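
The element paths listed above could be turned into a small lookup table, as in the sketch below. It matches elements by local name only and ignores XML namespaces, which is a simplification made here because the namespace URIs are not given in this thread; the table and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Element wrapping one video in each feed format, per the paths listed above.
ITEM_ELEMENT = {"youtube": "entry", "dailymotion": "item", "vimeo": "item"}
# Child element (local name) carrying the unique video ID.
ID_ELEMENT = {"youtube": "videoId", "dailymotion": "id", "vimeo": "guid"}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts before tag names."""
    return tag.rsplit("}", 1)[-1]


def video_ids(provider, xml_text):
    """Return 'provider:id' keys for every video found in the feed text."""
    ids = []
    for element in ET.fromstring(xml_text).iter():
        if local_name(element.tag) != ITEM_ELEMENT[provider]:
            continue
        for child in element:
            if local_name(child.tag) == ID_ELEMENT[provider] and child.text:
                ids.append(f"{provider}:{child.text.strip()}")
    return ids
```

A production version would more likely match fully qualified names (namespace plus local name), so that an unrelated element that happens to be called `id` is not picked up by mistake.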

Bugsbane · 2018-09-27T14:27:27Z

This is an important idea for boosting the amount of content on the PeerTube network. Once it's included, I would go as far as asking new users, when they register, for channels elsewhere that they'd like to set up for auto-import.

That way, creators from other platforms who make an account on a PeerTube instance "just to try it out" will automatically get up and running quickly and the network gets a ton of ongoing new content, even if they forget about their PeerTube channel and don't touch it again.

Beyond the scope of this issue, but worth considering later, would be actually doing a full import of someone's channel (not just videos), ie their channel description, avatar etc.

McFlat · 2018-10-10T17:06:22Z

I don't think that checking against the video title/name would be good, because users can upload different videos with the same name. It's better to store the original URL of the imported video and check against that.

This is how the frontend does it when a user imports a video, but the video import script doesn't set the original URL of the imported video (targetUrl). So this is really a bug in ./server/tools/peertube-import-videos.ts: it doesn't perform the same steps as a video import from the frontend. The script also doesn't create entries in the videoImport table, whereas an import made by a user from the frontend does.

McFlat · 2018-10-10T21:55:33Z

Next I want to work on making the video import script behave like the frontend one, because the difference really affects me in a big way. I need it to check the targetUrl instead of the name/title of the video to keep out any duplicates.

McFlat · 2018-10-10T21:57:57Z

I think it would be best to convert the many possible URL forms into one generic target URL instead of using provider:id. That way you can just use the URL as it is, instead of having to do a conversion every time you fetch the video.

gnouts · 2019-06-18T09:21:33Z

(Maybe out of scope, but) in the meantime I personally use @roipoussiere's Python script with small improvements and a cron job.
Here is my code: https://taboulisme.com/git/nouts/peertube-import
However, it can only be set up by the instance admin. I'm using it to duplicate some YouTube channels at https://alttube.fr/videos/local

aliceinwire · 2019-07-04T06:11:52Z

@gnouts I like your project, but it would be more intuitive to have it in PeerTube itself.
It would improve the discovery and usage of such a tool.

fflorent · 2019-07-04T06:30:30Z

@gnouts I would like to report an issue (in node arguments, -l is for license whereas your intent is to specify language, so this should be -L or --language). Where may I report a bug? :)

gnouts · 2019-07-04T11:41:39Z

@fflorent Thanks, I set up a GitHub clone here: https://github.com/gnouts/peertube-import
@aliceinwire Sure, I agree. In the meantime it works for my use case, and right now I can't do better given my time and knowledge.

fflorent · 2019-07-24T20:44:12Z

While I remember having used this script to import a whole YT channel, I recently tried again and failed to run it for that purpose.

But the documentation seems to say that we can pass the ID of a channel: https://github.com/Chocobozzz/PeerTube/blob/develop/support/doc/tools.md#peertube-import-videosjs

I also see that youtube-dl supports downloading a whole channel. I wonder whether it's possible to use a cron script to automatically synchronize a YT channel. I'll investigate that, but if anyone can tell me more about it, I would be very grateful! :)

fflorent · 2019-07-28T21:21:51Z

OK, so I managed to run the peertube-import-videos script to upload a whole YT channel to PeerTube. I also figured out that we cannot rely on video names to detect whether a video has already been uploaded (if the YT owner renames a video, that creates duplicates…). Instead, to synchronize a YT channel with a PT one, I propose relying on a --since parameter and using a crontab job.

For that purpose, I opened this PR: #1991

Even if that's WIP, feedback welcome :).

Florent
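
The date-based idea proposed above can be illustrated with a short Python sketch. It does not reflect how PR #1991 implements `--since`; the function name and the shape of the video dictionaries are hypothetical, and timestamps are assumed to be ISO 8601 strings with a UTC offset.

```python
from datetime import datetime, timezone


def published_since(videos, since):
    """Keep the videos whose 'published' timestamp is at or after `since`.

    `since` must be timezone-aware so it can be compared with the parsed dates.
    """
    return [
        video for video in videos
        if datetime.fromisoformat(video["published"]) >= since
    ]


# Hypothetical entries, e.g. as read from a channel feed.
VIDEOS = [
    {"id": "a", "published": "2019-07-01T08:00:00+00:00"},
    {"id": "b", "published": "2019-07-28T21:00:00+00:00"},
]

# A cron job would typically pass the time of its previous run here.
LAST_RUN = datetime(2019, 7, 20, tzinfo=timezone.utc)
```

Unlike a title comparison, this cutoff is not affected by renames, but it does depend on the job remembering when it last ran.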

Bugsbane · 2019-08-20T19:40:06Z

Aren't all YouTube video IDs unique? If so, couldn't the import just store the YouTube video ID and then check that no video with that same ID has already been imported?

fflorent · 2019-08-20T20:10:37Z

@Bugsbane Yes, it has already been suggested here: #1991 (comment)

But implementing it requires more effort.

If you are willing to contribute, please do so. I would personally appreciate this improvement :).

Bugsbane · 2019-09-20T01:08:16Z

If you are willing to contribute, please do so.

I'm more than happy to contribute... however, my personal skills lie in design, UX, communications and marketing rather than coding. :P

mister-monster · 2019-10-18T23:40:26Z

I actually found this issue (and multiple others) after I had almost completed a Python tool to do this, so I figured I would comment about it here.

https://github.com/mister-monster/YouTube2PeerTube

This is a tool that watches YouTube channels, and when new videos are found it mirrors them to a PeerTube channel.

Chocobozzz · 2019-10-19T11:04:34Z

@mister-monster Don't hesitate to open an MR to add your script to the documentation website: https://docs.joinpeertube.org/#/use-third-party-application

…obozzz#754)

* Add external channel URL for channel update / creation (#754)
* Disallow synchronisation if user has no video quota (#754)
* More constraints serverside (#754)
* Disable sync if server configuration does not allow HTTP import (#754)
* Working version synchronizing videos with a job (#754) TODO: refactoring, too much code duplication
* More logs and try/catch (#754)
* Fix eslint error (#754)
* WIP: support synchronization time change (#754)
* New frontend #754
* WIP: Create sync front (#754)
* Enhance UI, sync creation form (#754)
* Warning message when HTTP upload is disallowed
* More consistent names (#754)
* Binding Front with API (#754)
* Add a /me API (#754)
* Improve list UI (#754)
* Implement creation and deletion routes (#754)
* Lint (#754)
* Lint again (#754)
* WIP: UI for triggering import existing videos (#754)
* Implement jobs for syncing and importing channels
* Don't sync videos before sync creation + avoid concurrency issue (#754)
* Cleanup (#754)
* Cleanup: OpenAPI + API rework (#754)
* Remove dead code (#754)
* Eslint (#754)
* Revert the mess with whitespaces in constants.ts (#754)
* Some fixes after rebase (#754)
* Several fixes after PR remarks (#754)
* Front + API: Rename video-channels-sync to video-channel-syncs (#754)
* Allow enabling channel sync through UI (#754)
* getChannelInfo (#754)
* Minor fixes: openapi + model + sql (#754)
* Simplified API validators (#754)
* Rename MChannelSync to MChannelSyncChannel (#754)
* Add command for VideoChannelSync (#754)
* Use synchronization.enabled config (#754)
* Check parameters test + some fixes (#754)
* Fix conflict mistake (#754)
* Restrict access to video channel sync list API (#754)
* Start adding unit test for synchronization (#754)
* Continue testing (#754)
* Tests finished + convertion of job to scheduler (#754)
* Add lastSyncAt field (#754)
* Fix externalRemoteUrl sort + creation date not well formatted (#754)
* Small fix (#754)
* Factorize addYoutubeDLImport and buildVideo (#754)
* Check duplicates on channel not on users (#754)
* factorize thumbnail generation (#754)
* Fetch error should return status 400 (#754)
* Separate video-channel-import and video-channel-sync-latest (#754)
* Bump DB migration version after rebase (#754)
* Prettier states in UI table (#754)
* Add DefaultScope in VideoChannelSyncModel (#754)
* Fix audit logs (#754)
* Ensure user can upload when importing channel + minor fixes (#754)
* Mark synchronization as failed on exception + typos (#754)
* Change REST API for importing videos into channel (#754)
* Add option for fully synchronize a chnanel (#754)
* Return a whole sync object on creation to avoid tricks in Front (#754)
* Various remarks (#754)
* Single quotes by default (#754)
* Rename synchronization to video_channel_synchronization
* Add check.latest_videos_count and max_per_user options (#754)
* Better channel rendering in list #754
* Allow sorting with channel name and state (#754)
* Add missing tests for channel imports (#754)
* Prefer using a parent job for channel sync
* Styling
* Client styling

Co-authored-by: Chocobozzz <me@florianbigard.com>

Seyferto · 2022-08-10T11:17:26Z

So, is this working now and available for instances (presumably after updating)?

fflorent · 2022-08-10T12:02:31Z

So, is this working now and available for instances (presumably after updating)?

No. It will be available once v5 is out.

Edit: or if the admin checks out the develop branch, but that's not something advised.

Seyferto · 2022-08-10T12:03:58Z

When will it be released?

fflorent · 2022-08-10T12:05:32Z

Expected by the end of 2022
https://joinpeertube.org/news#ideas-jpt

Seyferto · 2022-08-10T13:32:53Z

Why not something advised?

fflorent · 2022-08-10T19:01:54Z

Why not something advised?

In production, because of some bugs or unstable / broken features in develop.

Seyferto · 2022-08-11T05:59:48Z

Testing takes 5 months?

Booteille · 2022-08-11T06:27:09Z

Testing takes 5 months?

PeerTube minor releases are made every 3 months or so. Major releases are made every year.

Seyferto · 2022-08-11T06:29:49Z

Is this minor? It should have really been there since the beginning, though, as doing them one by one is so tedious...

Chocobozzz · 2022-08-11T06:37:28Z

@Seyferto The next minor release is planned for September. If you're not happy with this ETA, you can use nightly builds at your own risk: https://builds.joinpeertube.org/nightly/

Complaining about the release date of a feature in a free software project, without a thank you, when @fflorent took a lot of their own free time to develop it, is quite annoying. See also the section of our FAQ regarding this kind of remark: https://joinpeertube.org/faq#peertube-does-not-contain-all-the-tools-i-need-to-manage-my-instance

Seyferto · 2022-08-11T06:48:13Z

I did actually like it when I noticed it was being done, but was disappointed if it was going to be released by the end of the year... if it's next month then that is better... so, nightly has this feature? Do you suppose others' instances usually use nightlies, though, or do most people typically wait for a release?

Booteille · 2022-08-11T06:53:41Z

I did actually like it when I noticed it was being done, but was disappointed if it was going to be released by the end of the year... if it's next month then that is better... so, nightly has this feature? Do you suppose others' instances usually use nightlies, though, or do most people typically wait for a release?

Actually, @fflorent told you that using nightly builds is not recommended, since they are considered unstable. Some admins use them, others don't. It's up to you.

Nightly has every commit merged. It's the "dev" branch.

emansom · 2022-08-29T23:25:34Z

In production, because of some bugs or unstable / broken features in develop.

@fflorent For people wanting to help test this new feature; are you referring to active bugs with this new functionality or more as a general statement/warning?

fflorent · 2022-08-30T06:27:17Z

@emansom That's a general statement, yes :)

Seyferto · 2022-09-09T10:38:05Z

@fflorent Is this being tested anywhere as of now?

rigelk added the Type: Feature Request ✨ label Jun 29, 2018

roipoussiere changed the title ~~Automatically import Youtube videos~~ Automatically import videos from other platforms Jun 30, 2018

McFlat mentioned this issue Oct 29, 2018

Video imports in frontend happen differently than from tools scripts #1337

Closed

Chocobozzz mentioned this issue Nov 16, 2018

Automatic YouTube channel mirroring #1398

Closed

roipoussiere mentioned this issue Nov 29, 2018

Import a full channel from youtube #1438

Closed

Chocobozzz added the Component: Import label Dec 13, 2018

kontrollanten mentioned this issue Dec 3, 2020

Exact timestamp when importing from Youtube #3387

Open

kmk3 mentioned this issue Mar 10, 2021

Mirror channel on PeerTube netblue30/firejail#4076

Open

Booteille mentioned this issue Jan 12, 2022

Automatic Twitch Integration #4713

Closed

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Change REST API for importing videos into channel (Chocobozzz#754)

9dc5088

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Add option for fully synchronize a chnanel (Chocobozzz#754)

c8ab2bc

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Return a whole sync object on creation to avoid tricks in Front (Choc…

81e8e66

…obozzz#754)

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Various remarks (Chocobozzz#754)

f76b3b1

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Single quotes by default (Chocobozzz#754)

019fdac

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Add check.latest_videos_count and max_per_user options (Chocobozzz#754)

b30054a

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Better channel rendering in list Chocobozzz#754

29254d2

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Allow sorting with channel name and state (Chocobozzz#754)

78cef0e

fflorent added a commit to fflorent/PeerTube that referenced this issue Aug 8, 2022

Add missing tests for channel imports (Chocobozzz#754)

47f4722

fflorent mentioned this issue Aug 8, 2022

Allow root account to manage other users' synchronizations #5181

Open

2 tasks

Chocobozzz closed this as completed in #5135 Aug 10, 2022

Chocobozzz removed the Status: In Progress 🔜 label Aug 10, 2022

fflorent mentioned this issue Aug 10, 2022

Auto update videos titles, descriptions, subtitles, thumbnails in channel synchronization #5186

Open

FireMasterK mentioned this issue Aug 15, 2022

API to find a YouTube/Alternative Platform content on Peertube #5198

Closed
