Google Search Indexing - Video is not the main content of the page #6210

Open
DVDGuy99 opened this issue Feb 8, 2024 · 19 comments
Labels
Type: Bug 🐛 Confirmed bug, at least replicated once by another contributor

Comments

@DVDGuy99

DVDGuy99 commented Feb 8, 2024

Describe the current behavior

Google Search Console gives the error "Video is not the main content of the page" when indexing videos on our PeerTube site.

This is one of the video pages that Google says the video is not the main content:

https://trailers.ddigest.com/w/sbKiNjKTCs8EkNpKb45ku9

(actually, I think it will say that for all the video pages - diving deeper into the video page indexing data, it says « Video is supplementary content on the page »)

Possibly related: a screenshot of the page generated within Google Search Console shows a "HLS.js does not seem to be supported" error where the video should be. An exact-term search for this message in the videos section of Google turns up quite a few PeerTube-hosted videos that have it as the crawled text description for the video.

Steps to reproduce

  1. Log in to Google Search Console account for the instance
  2. Under "Indexing" go to "Video Pages"
  3. A list of pages with the "Video is not the main content of the page" error should be listed here

Describe the expected behavior

As these pages are the main video playback pages, the video should of course be the main content of the page and Google should index these as such. Pages with videos that are indexed as the main content will show the video carousel with the video thumbnail as opposed to just a text link.

Additional information

  • PeerTube instance:

  • Browser name, version and platforms on which you could reproduce the bug:

  • Link to browser console log if relevant:

  • Link to server log if relevant (journalctl or /var/www/peertube/storage/logs/):

@Chocobozzz Chocobozzz added Status: In Progress 🔜 Type: Bug 🐛 Confirmed bug, at least replicated once by another contributor labels Feb 23, 2024
@Chocobozzz Chocobozzz self-assigned this Feb 23, 2024
@Chocobozzz
Owner

Googlebot seems to fail to load the HLS player (which makes no sense). Trying to fall back to a raw HTML element in c4a0621

Hopefully it fixes the issue (I have to wait for the deploy on peertube2.cpy.re and reschedule a Googlebot indexing)

@Chocobozzz
Owner

Seems to fix the issue 👍

@aflamrip

Has the problem been solved or is the same problem still present?
Indexing-pages-with-videos-URL-inspection

@Chocobozzz
Owner

> Has the problem been solved or is the same problem still present?

Should be fixed in the next PeerTube release (6.1.0)

@aflamrip

I think I found a solution, but I don't know whether it is right or wrong.
In this path:
peertube-latest\client\dist\standalone\videos
there is a file:
embed.html
This part is modified

to

But I don't know if this method will solve the "Video is not the main content of the page" problem.

Video placement

Video is supplementary content on the page

Whether the page is a playback page for a single video (Video is main content on the page), or hosts additional meaningful content or videos (Video is supplementary content on the page).

@DVDGuy99
Author

DVDGuy99 commented May 8, 2024

This issue seems to be still present in 6.1.0.

I've tried enabling/disabling web video, HLS with P2P support, and it doesn't seem to matter too much, as it still gives the "video is not the main content of the page" error:

google_search_console_not_main_content

Below are the JavaScript console error messages as shown in Google Search Console for a sample page (https://trailers.ddigest.com/w/1ZcXuBacku4tZeY7KPHwPF); I'm including them in case they help:

google_search_console_errors

@DVDGuy99
Author

DVDGuy99 commented May 8, 2024

Here's the video page indexing report for another video with web video enabled:

google_search_console_not_main_content2

@Chocobozzz Chocobozzz reopened this May 15, 2024
@Chocobozzz
Owner

It makes no sense: sometimes Google considers that the video is not the main content of the page, and a few days later it correctly indexes the video. I'll look into it again, but if anyone here has any clue, don't hesitate to share it.

@DVDGuy99
Author

This thread might shed some light, and I think there's a really stupid fix for all of this involving adding the word "video" to the URL:

https://support.google.com/webmasters/thread/247936417/how-to-fix-video-is-not-the-main-content-of-the-page?hl=en

There are only 4 videos on my site that have been indexed and the URL that is indexed is like this:

https://trailers.ddigest.com/videos/watch/218beda6-427d-4ba5-83ad-d815cd13fbc6

Whereas all the ones not indexed are like this:

https://trailers.ddigest.com/w/jtdUAPbo65bNgz4Momxmm4

I wonder if a separate Google sitemap can be created that uses the "videos/watch/" URL structure as opposed to the "w/" one. For now, Google doesn't seem to care that the first one redirects to the second one.

@DVDGuy99
Author

DVDGuy99 commented Jun 1, 2024

I've set up a cron job that creates a modified version of the sitemap as a workaround for this issue. The script basically replaces "https://trailers.ddigest.com/w/" with "https://trailers.ddigest.com/videos/watch/" in the sitemap, and I then replaced the submitted sitemap in Google Search Console with this newly edited one. This seems to work and videos are now being indexed, even though it shouldn't work (as I'm submitting pages with redirects):
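The rewrite step described above could look roughly like the following. This is only a sketch, not the actual cron script: the function name `rewrite_sitemap` and the inline sample sitemap are hypothetical, and it assumes every watch URL in the sitemap starts with the instance URL followed by `/w/`.

```python
def rewrite_sitemap(xml_text: str, base_url: str) -> str:
    """Swap the short /w/ watch prefix for the longer /videos/watch/ one."""
    base = base_url.rstrip("/")
    # Replace the full "<instance>/w/" prefix rather than a bare "/w/",
    # so unrelated parts of other URLs are left untouched.
    return xml_text.replace(f"{base}/w/", f"{base}/videos/watch/")


# Hypothetical one-entry sitemap, used only to illustrate the replacement.
sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://trailers.ddigest.com/w/jtdUAPbo65bNgz4Momxmm4</loc></url>"
    "</urlset>"
)

print(rewrite_sitemap(sample, "https://trailers.ddigest.com"))
```

A real job would presumably fetch the instance's sitemap first and write the result to a file that Search Console can reach; those parts are left out here.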

Screenshot 2024-06-01 141625

@Chocobozzz
Owner

Chocobozzz commented Jun 14, 2024

@DVDGuy99 Checking in for news: does Google index all your videos with the new /videos/watch URLs now?

@DVDGuy99
Author

@Chocobozzz Yes, pretty much. It doesn't seem to re-add the videos that have already been indexed, even if they've been submitted via the sitemap. I'll try to force the reindexing (via the request indexing feature in Google Search Console) on a couple of older ones to see if they are also added/re-added.

@DVDGuy99
Author

The older videos I've requested reindexing for have also been indexed as videos (moved out of the "Video is not the main content of the page" category), as have all the new videos that are included in my modified sitemap.

@Chocobozzz
Owner

Unfortunately it doesn't work on my side: indexing https://framatube.org/videos/watch/gW6BUFLNSDWWZwUzZBXLoN instead of https://framatube.org/w/gW6BUFLNSDWWZwUzZBXLoN is refused by Google because of the redirection. I'm surprised it works on your instance 🤔

@DVDGuy99
Author

It's definitely a weird situation with Google at the moment, and I'm thinking it has to be a bug or something. I have several videos where, as you said, the page won't get indexed because it's a redirect, but the video (with the "/videos/watch/" URL) does get indexed. So it seems that video pages only get indexed if they have "video" in the URL, even if it's a redirect. I'm going to submit the "/w/" version of the URL for these pages and see what happens - maybe the page gets indexed but the video indexing is removed due to the "not the main content" error.

@DVDGuy99
Author

DVDGuy99 commented Jul 6, 2024

So I managed to get both versions of the page indexed by Google. Don't ask me how it works, it's not supposed to, but it does for me.

Screenshot 2024-07-06 160014 Screenshot 2024-07-06 155950

@kontrollanten
Contributor

kontrollanten commented Sep 16, 2024

According to Google's docs it's recommended to have a video:content_loc tag in the sitemap, which currently doesn't exist in PeerTube's sitemap.

> It's required to provide either a video:content_loc or video:player_loc tag. We recommend that you provide the video:content_loc tag, if possible. This is the most effective way for Google to fetch your video content files. If video:content_loc isn't available, provide video:player_loc as an alternative.
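To make the suggestion concrete, here is a rough sketch of a sitemap entry carrying a video:content_loc tag, built with Python's standard library. It is not PeerTube's sitemap code: the helper name `video_url_entry` and all example.com URLs are hypothetical, and the set of child tags (thumbnail, title, description, content location) is an assumption about what a minimal video entry needs.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
VIDEO_NS = "http://www.google.com/schemas/sitemap-video/1.1"

# Serialize with the usual default and "video:" prefixes instead of ns0/ns1.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("video", VIDEO_NS)


def video_url_entry(page_url, title, description, thumbnail_url, content_url):
    """Build one <url> element with a nested <video:video> block."""
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    video = ET.SubElement(url, f"{{{VIDEO_NS}}}video")
    fields = (
        ("thumbnail_loc", thumbnail_url),
        ("title", title),
        ("description", description),
        ("content_loc", content_url),  # direct link to the media file
    )
    for tag, value in fields:
        ET.SubElement(video, f"{{{VIDEO_NS}}}{tag}").text = value
    return url


# All values below are placeholders, not real PeerTube URLs.
entry = video_url_entry(
    page_url="https://example.com/w/abc123",
    title="Example trailer",
    description="Placeholder description",
    thumbnail_url="https://example.com/thumbnails/abc123.jpg",
    content_url="https://example.com/media/abc123-720.mp4",
)
print(ET.tostring(entry, encoding="unicode"))
```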

Another explanation may be these client logs, which seem to come from Googlebot:

{
    "tags": [
        "client"
    ],
    "userAgent": "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/128.0.6613.113 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "meta": "{\"currentTime\":0,\"data\":{\"type\":\"mediaError\",\"details\":\"manifestIncompatibleCodecsError\",\"fatal\":true,\"url\":\"https://cdn.peertube/streaming-playlists-native/hls/4bbb3e31-24fa-4ec4-8daa-ec6f1d54b4ef/bab5b5b4-884e-456e-9a84-fcd2f6a6e623-master.m3u8\",\"error\":{},\"reason\":\"no level with compatible codecs found in manifest\"}}",
    "url": "https://peertube/w/4bbb3e31-24fa-4ec4-8daa-ec6f1d54b4ef",
    "level": "error",
    "message": "Client log: HLS.js error: mediaError - fatal: true - manifestIncompatibleCodecsError",
    "timestamp": "2024-09-11T11:58:06.448Z"
}
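For anyone who wants to check their own logs for similar entries, a small sketch follows. It assumes log entries shaped like the one above, where "meta" is itself a JSON-encoded string; the function name `summarize_client_log` is hypothetical, and the sample entry is a trimmed copy of the log above with a shortened user agent.

```python
import json


def summarize_client_log(entry: dict) -> dict:
    """Pull the HLS.js error details out of one client log entry."""
    meta = json.loads(entry["meta"])  # "meta" holds a nested JSON string
    data = meta.get("data", {})
    return {
        "from_googlebot": "Googlebot" in entry.get("userAgent", ""),
        "error_type": data.get("type"),
        "details": data.get("details"),
        "fatal": data.get("fatal"),
        "reason": data.get("reason"),
    }


# Trimmed copy of the entry above; the user agent is shortened for readability.
entry = {
    "tags": ["client"],
    "userAgent": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "meta": json.dumps({
        "currentTime": 0,
        "data": {
            "type": "mediaError",
            "details": "manifestIncompatibleCodecsError",
            "fatal": True,
            "reason": "no level with compatible codecs found in manifest",
        },
    }),
    "level": "error",
}

print(summarize_client_log(entry))
```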

It may also be worth a try to add more structured data to each watch page to convince Googlebot that it's really a watch page, not an article with a video. https://developers.google.com/search/docs/appearance/structured-data/video#examples
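As an illustration of the structured-data idea, the sketch below renders a schema.org VideoObject as a JSON-LD script tag. It is not taken from PeerTube: the function name `video_object_jsonld` and all example values are hypothetical, and the chosen properties (name, description, thumbnailUrl, uploadDate, contentUrl, embedUrl) are an assumption about what would be useful on a watch page.

```python
import json


def video_object_jsonld(name, description, thumbnail_url, upload_date,
                        content_url=None, embed_url=None):
    """Render a schema.org VideoObject as a JSON-LD <script> tag."""
    data = {
        "@context": "https://schema.org",
        "@type": "VideoObject",
        "name": name,
        "description": description,
        "thumbnailUrl": [thumbnail_url],
        "uploadDate": upload_date,
    }
    # Optional properties are only emitted when a value is available.
    if content_url:
        data["contentUrl"] = content_url
    if embed_url:
        data["embedUrl"] = embed_url
    # Escape "</" so a title or description cannot close the script tag early.
    payload = json.dumps(data).replace("</", "<\\/")
    return '<script type="application/ld+json">' + payload + "</script>"


# Placeholder values, not a real video.
html = video_object_jsonld(
    name="Example trailer",
    description="Placeholder description",
    thumbnail_url="https://example.com/thumbnails/abc123.jpg",
    upload_date="2024-09-16T00:00:00+00:00",
    embed_url="https://example.com/videos/embed/abc123",
)
print(html)
```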

@Chocobozzz
Owner

> According to Google's docs it's recommended to have a video:content_loc tag in the sitemap, which currently doesn't exist in PeerTube's sitemap.

Why not, but I think most other web video platforms (YouTube, Vimeo...) don't include this tag but are still indexed 😠

> Another explanation may be these client logs, which seem to come from Googlebot:

I think that's expected behaviour, as Googlebot seems to have video support disabled in its engine.

@kontrollanten
Contributor

> Why not, but I think most other web video platforms (YouTube, Vimeo...) don't include this tag but are still indexed 😠

Sure, but I think Googlebot makes some kind of holistic assessment in which other platforms have a higher general ranking, load faster, are easier to crawl, etc. So if we try to get every point right, maybe it'll be indexed.

4 participants