Skip to content

Set up redirects based on 404 data & filter sitemap #1976

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Nov 3, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 6 additions & 16 deletions astro.config.ts
Original file line number Diff line number Diff line change
@@ -1,20 +1,17 @@
import { defineConfig } from 'astro/config';
import preact from '@astrojs/preact';
import sitemap from '@astrojs/sitemap';
import { defineConfig } from 'astro/config';

import { toString } from 'hast-util-to-string';
import { h } from 'hastscript';
import { escape } from 'html-escaper';

import { tokens, foregroundPrimary, backgroundPrimary } from './syntax-highlighting-theme';
import { astroAsides } from './integrations/astro-asides';
import { astroSpoilers } from './integrations/astro-spoilers';
import { astroCodeSnippets } from './integrations/astro-code-snippets';
import { remarkFallbackLang } from './plugins/remark-fallback-lang';
import { astroSpoilers } from './integrations/astro-spoilers';
import { sitemap } from './integrations/sitemap';
import { rehypeTasklistEnhancer } from './plugins/rehype-tasklist-enhancer';

import languages from './src/i18n/languages';
import { normalizeLangTag } from './src/i18n/bcp-normalize';
import { remarkFallbackLang } from './plugins/remark-fallback-lang';
import { backgroundPrimary, foregroundPrimary, tokens } from './syntax-highlighting-theme';

const AnchorLinkIcon = h(
'svg',
Expand Down Expand Up @@ -46,14 +43,7 @@ export default defineConfig({
},
integrations: [
preact({ compat: true }),
sitemap({
i18n: {
defaultLocale: 'en',
locales: Object.fromEntries(
Object.keys(languages).map((lang) => [lang, normalizeLangTag(lang)])
),
},
}),
sitemap(),
astroAsides(),
astroSpoilers(),
astroCodeSnippets(),
Expand Down
30 changes: 30 additions & 0 deletions integrations/sitemap.ts
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
import AstroSitemap from '@astrojs/sitemap';
import type { AstroIntegration } from 'astro';
import { normalizeLangTag } from '../src/i18n/bcp-normalize';
import languages from '../src/i18n/languages';

const langTags = Object.keys(languages);

/** Set matching our `/[lang]/something.astro` redirect routes. */
const blocklist = new Set([
...langTags.map((lang) => `/${lang}/`),
...langTags.map((lang) => `/${lang}/install/`),
...langTags.map((lang) => `/${lang}/tutorial/`),
]);

/** Match a pathname starting with “lighthouse” or one of our language tags. */
const ValidRouteRE = new RegExp(`^/(lighthouse|${langTags.join('|')})/`);

/** Test a pathname is not in our blocklist and starts with a valid prefix. */
const isValidPath = (path: string) => !blocklist.has(path) && ValidRouteRE.test(path);

/** Get a preconfigured instance of the `@astrojs/sitemap` integration. */
export function sitemap(): AstroIntegration {
return AstroSitemap({
filter: (page) => isValidPath(new URL(page).pathname),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

i18n: {
defaultLocale: 'en',
locales: Object.fromEntries(langTags.map((lang) => [lang, normalizeLangTag(lang)])),
},
});
}
4 changes: 4 additions & 0 deletions netlify.toml
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,10 @@
command = "pnpm build"
ignore = "git diff --quiet $COMMIT_REF $CACHED_COMMIT_REF -- /"

[[redirects]]
from = "/reference/renderer-reference"
to = "/en/reference/integrations-reference"

[[redirects]]
from = "/en/reference/renderer-reference"
to = "/en/reference/integrations-reference"
Expand Down
9 changes: 9 additions & 0 deletions src/pages/[lang]/tutorial.astro
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
---
import languages from '../../i18n/languages';

export const getStaticPaths = () => Object.keys(languages).map((lang) => ({ params: { lang } }));

const { lang } = Astro.params;
---

<meta http-equiv="refresh" content={`0;url=/${lang}/tutorial/0-introduction/`} />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this show up in the sitemap?

Copy link
Member Author

@delucis delucis Nov 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent question. The answer is: yes!

https://deploy-preview-1976--astro-docs-2.netlify.app/sitemap-0.xml

We can filter sitemap URLs I believe. Let me look at that — we should probably exclude our other similar redirects too. (Probably doesn’t matter too much — these redirect pages are completely empty so are very unlikely to be indexed, but still.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, question back time: does it matter if it is in sitemap?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its usually best to keep redirects out of the sitemap, but that can be tackled in a separate PR 👍

Its really not a huge deal in general, I think it's pretty common for redirects to linger in a sitemap for a while (or forever on some sites)

Copy link
Member Author

@delucis delucis Nov 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK — pushed a fix for the low-hanging fruit here. Our [lang]/something.astro redirects like the one added in this PR are easy to filter out. Other dynamic redirects in the root of src/pages/ will be trickier and can be tackled later I guess (or refactored to be configured in netlify.toml).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Engaged morning brain and implemented a much better filtering set-up so now the sitemap shouldn’t include any of our redirect routes!