Contributor

@youyongsong youyongsong commented Jun 9, 2025

This commit improves the documentation translation system by simplifying the translation process, reducing context overhead, and making translation results more stable.

System Prompt Changes:

  • Convert system prompt from Chinese to English for better AI comprehension
  • Replace 4-step translation process with single-pass approach to improve output stability and reduce context
  • Remove target content comparison logic to simplify process and reduce translation context
  • Restructure prompt with clear baseline requirements and optional additional requirements section

Smart Terms Filtering:

  • Only include terms that actually appear in source content
  • Support multi-language terms mapping (English, Chinese, Russian)
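A minimal sketch of the term-filtering idea described above, assuming a term is a record of per-language strings. The names `Term`, `TermLanguage`, `escapeRegExp`, and `filterTerms` are hypothetical and are not taken from the PR. The sketch keeps a term only when it has both a source and a target form and the source form occurs in the content (case-insensitively):

```typescript
// Hypothetical types and names for illustration; the PR's real helper may differ.
type TermLanguage = 'en' | 'zh' | 'ru'
type Term = Partial<Record<TermLanguage, string>>

// Escape regex metacharacters so a term such as "C++" is matched literally.
const escapeRegExp = (value: string): string =>
  value.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')

function filterTerms(
  terms: readonly Term[],
  content: string,
  source: TermLanguage,
  target: TermLanguage,
): Term[] {
  return terms.filter((term) => {
    const sourceTerm = term[source]
    const targetTerm = term[target]
    // Skip terms that lack a form in either language.
    if (!sourceTerm || !targetTerm) return false
    // Keep the term only if it actually occurs in the source content.
    return new RegExp(escapeRegExp(sourceTerm), 'i').test(content)
  })
}
```

Filtering before building the prompt means the model only sees terminology relevant to the current file, which is how the description's "reduce context" goal would be served.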

Anchor Link Protection:

  • Introduce anchor preprocessing mechanism to prevent translation corruption of markdown anchor links
  • Replace anchors with numbered placeholders (`__ANCHOR_N__`) before translation
  • Restore original anchors after translation completion
  • Handle escaped underscores in anchor IDs to maintain compatibility with MDX processor
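The placeholder round trip can be sketched as follows. This is an illustration under assumptions, not the PR's code: the helper names `protectAnchors` and `restoreAnchors` are hypothetical, and the anchor syntax is assumed to be `{#id}` or the MDX-escaped form `\{#my\_id}`.

```typescript
// Hypothetical helpers; the anchor syntax matched here is an assumption.
interface ProtectedContent {
  text: string
  anchors: string[]
}

// Matches `{#my-id}` and the escaped form `\{#my\_id}`.
const ANCHOR_PATTERN = /\\?\{#[^}\s]+\}/g

function protectAnchors(content: string): ProtectedContent {
  const anchors: string[] = []
  const text = content.replace(ANCHOR_PATTERN, (match) => {
    anchors.push(match)
    // The placeholder index is the anchor's position in the list.
    return `__ANCHOR_${anchors.length - 1}__`
  })
  return { text, anchors }
}

function restoreAnchors(translated: string, anchors: readonly string[]): string {
  return translated.replace(/__ANCHOR_(\d+)__/g, (placeholder, index: string) => {
    const original = anchors[Number(index)]
    // Leave unknown placeholders untouched rather than guessing.
    return original ?? placeholder
  })
}
```

Because the original anchor text is stored verbatim, escaped underscores survive the round trip unchanged regardless of what the model does to the surrounding heading text.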

Title Translation:

  • Replace i18n.title-based translation with built-in title translation mapping table
  • Implement automatic title correction through prompt enhancement when titles exist in mapping table
  • Extract first-level headings from content and apply predefined translations when available
  • Fall back to AI translation for unmapped titles while preserving consistency for common terms
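A rough sketch of the mapping-table lookup, assuming the table is a list of per-language title records. The names `TITLE_MAP`, `firstHeading`, and `lookupTitle` are hypothetical, and the entries are illustrative only; the real contents of `TITLE_TRANSLATION_MAP` are not shown in this description.

```typescript
type Lang = 'en' | 'zh' | 'ru'

// Illustrative entries only; not the PR's actual mapping table.
const TITLE_MAP: ReadonlyArray<Record<Lang, string>> = [
  { en: 'Introduction', zh: '介绍', ru: 'Введение' },
  { en: 'Installation', zh: '安装', ru: 'Установка' },
]

// Return the text of the first `# ` heading, or undefined if there is none.
function firstHeading(content: string): string | undefined {
  const match = /^#[ \t]+(.+?)[ \t]*$/m.exec(content)
  return match?.[1]
}

// Look up a predefined translation; undefined means "let the model translate it".
function lookupTitle(title: string, source: Lang, target: Lang): string | undefined {
  const entry = TITLE_MAP.find((item) => item[source] === title.trim())
  return entry?.[target]
}
```

When `lookupTitle` returns a value, it can be passed to the model as a hint in the prompt; when it returns `undefined`, the title is simply translated along with the rest of the content.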

Frontmatter Cleanup:

  • Remove i18n fields from translated documents

Content Processing:

  • Add support for copy-only files configuration, with all files under apis/ directory set to copy-only by default
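As a sketch of the copy-only check: the review below indicates the PR matches files with glob patterns, so the plain prefix comparison here is a simplification swapped in for illustration. The constant's value and the `isCopyOnly` helper are assumptions.

```typescript
// Assumed value; the PR description only states that `apis/` is copy-only by default.
const COPY_ONLY_DIRECTORIES: readonly string[] = ['apis']

// Decide whether a path (relative to the language root) should be copied verbatim.
function isCopyOnly(
  relativePath: string,
  directories: readonly string[] = COPY_ONLY_DIRECTORIES,
): boolean {
  // Normalize Windows separators and a leading "./" or "/".
  const normalized = relativePath.replace(/\\/g, '/').replace(/^\.?\//, '')
  return directories.some(
    (dir) => normalized === dir || normalized.startsWith(`${dir}/`),
  )
}
```

Comparing against `${dir}/` rather than `dir` alone avoids treating a sibling directory such as `apis-extra/` as copy-only.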

Technical Changes:

  • Upgrade model from gpt-4o-mini to gpt-4.1-mini
  • Update language constants from Chinese to English descriptions
  • Add title translation mapping table for common sections

Summary by CodeRabbit

  • New Features
    • Added support for Russian in terminology translation.
    • Introduced selective "copy-only" directories, allowing certain files to be copied without translation.
    • Enhanced title translation with dynamic prompts and a predefined translation map.
  • Improvements
    • Improved terminology handling for more accurate translations by filtering relevant terms.
    • Added anchor tag preservation to maintain content structure during translation.
    • Updated system prompt for translation to detailed English instructions.
    • Language labels are now shown in English.
    • Refined translation CLI to respect auto-translation disable flags and handle frontmatter updates explicitly.
    • Documentation improved for clarity and consistency across usage and configuration guides.
  • Bug Fixes
    • Ensured frontmatter is updated correctly and unnecessary fields are removed in translated files.

@changeset-bot

changeset-bot bot commented Jun 9, 2025

🦋 Changeset detected

Latest commit: dcdda83

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
| Name | Type |
| --- | --- |
| @alauda/doom | Minor |


@coderabbitai

coderabbitai bot commented Jun 9, 2025

Walkthrough

The translation module was enhanced to support Russian, improved terminology handling with source content filtering, and introduced copy-only directory logic for untranslated files. The system prompt for the translation model was rewritten in English with explicit formatting rules. New utilities for anchor tag preservation and title translation mapping were added, along with updates to language labels and a multilingual title translation map.

Changes

| Files / Areas | Change Summary |
| --- | --- |
| `src/cli/translate.ts` | Added Russian support; introduced COPY_ONLY_DIRECTORIES for copy-only files; added anchor preservation functions; added title translation mapping and injection; refined terminology filtering with source content presence; rewrote system prompt in English; upgraded OpenAI model to "gpt-4.1-mini"; refactored translation flow and frontmatter handling, including respecting disableAutoTranslation flags. |
| `src/shared/constants.ts` | Changed language labels from Chinese to English; added TITLE_TRANSLATION_MAP with multilingual section title translations. |
| `.changeset/heavy-beers-hope.md` | Added changeset documenting translation system overhaul for improved accuracy and stability. |
| `docs/en/apis/advanced-apis/*.mdx` | Removed i18n title metadata and explicit title fields from frontmatter; updated sourceSHA values; no content changes. |
| `docs/zh/apis/advanced-apis/*.mdx` | Removed i18n title metadata blocks from frontmatter; no content changes. |
| `docs/en/start.mdx` | Revised section titles and wording for clarity and consistency; added lint command documentation; improved translation section explanations; formatting and phrasing refinements. |
| `docs/en/usage/configuration.md` | Rewrote translation system prompt section with detailed English baseline requirements; improved clarity and consistency in configuration descriptions and examples; minor renaming of sections and terminology standardization. |
| `docs/zh/usage/configuration.md` | Replaced brief partly Chinese system prompt with a fully detailed English prompt specifying strict translation rules and parameters; removed previous four-step strategy; enhanced prompt parameterization. |
| `package.json` | Modified translate script to include explicit source (`-s zh`) and target (`-t en`) language flags. |

Sequence Diagram(s)

sequenceDiagram
    participant CLI_User
    participant TranslateCommand
    participant FileSystem
    participant TranslateFunction
    participant OpenAI_Model

    CLI_User->>TranslateCommand: Invoke translate command
    TranslateCommand->>FileSystem: Identify files (copy-only or translatable)
    alt File is copy-only
        TranslateCommand->>FileSystem: Copy file, update frontmatter
    else File requires translation
        TranslateCommand->>TranslateFunction: Prepare content (anchors, terminology, title)
        TranslateFunction->>OpenAI_Model: Send prompt with filtered terminology and title hint
        OpenAI_Model-->>TranslateFunction: Return translated content
        TranslateFunction->>TranslateCommand: Restore anchors, return translated content
        TranslateCommand->>FileSystem: Write translated file, update frontmatter
    end

Poem

A hop, a skip, a Russian leap,
New titles mapped, and anchors keep.
With prompts in English, crisp and clear,
The bunny brings translations near.
Some files we copy, some we change—
Across the docs, we rearrange!
🐇✨


@pkg-pr-new

pkg-pr-new bot commented Jun 9, 2025


npm i https://pkg.pr.new/@alauda/doom@82

commit: b7fee83

@youyongsong youyongsong force-pushed the feat/refactor-translation branch from bcdf602 to b3e6a54 Compare June 9, 2025 14:59

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🔭 Outside diff range comments (1)
src/cli/translate.ts (1)

1-624: 💡 Verification agent

🧩 Analysis chain

Address type coverage regression.

The pipeline indicates type coverage has dropped below the 100% target. Please run type coverage analysis locally to identify and fix any untyped code paths introduced by these changes.


🏁 Script executed:

#!/bin/bash
# Check for type coverage issues in the modified file
npx type-coverage --detail --file-coverage src/cli/translate.ts --strict --at-least 100 || true

Length of output: 19796


Type coverage dropped to 96.41% (target 100%)
The recent changes have introduced multiple untyped parameters across the codebase. Please restore 100% coverage by adding the missing type annotations:

• Run the global check:

npx type-coverage --detail --strict --at-least 100 || true

• Identify all “any” or untyped parameters (e.g., in TSX components, CLI handlers, plugin functions).
• Add explicit types to function parameters, React component props, and callback signatures.
• Re-run the coverage check to confirm 100% is met.

🧰 Tools
🪛 Biome (1.9.4)

[error] 503-503: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 557-557: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 578-578: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

🪛 ESLint

[error] 7-7: Unable to resolve path to module '@rspress/shared'.

(import-x/no-unresolved)


[error] 8-8: Unable to resolve path to module '@rspress/shared/logger'.

(import-x/no-unresolved)


[error] 9-9: Unable to resolve path to module 'commander'.

(import-x/no-unresolved)


[error] 10-10: Unable to resolve path to module 'ejs'.

(import-x/no-unresolved)


[error] 11-11: Unable to resolve path to module 'gray-matter'.

(import-x/no-unresolved)


[error] 12-12: Unable to resolve path to module 'openai'.

(import-x/no-unresolved)


[error] 13-13: Unable to resolve path to module 'p-ratelimit'.

(import-x/no-unresolved)


[error] 15-15: Unable to resolve path to module 'yoctocolors'.

(import-x/no-unresolved)

🪛 GitHub Actions: CI

[error] 562-562: Type coverage rate (99.98%) is lower than the target (100%).

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5bd0705 and b3e6a54.

📒 Files selected for processing (2)
  • src/cli/translate.ts (8 hunks)
  • src/shared/constants.ts (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/cli/translate.ts (3)
src/shared/constants.ts (3)
  • Language (29-33)
  • Language (35-35)
  • TITLE_TRANSLATION_MAP (39-57)
src/cli/helpers.ts (2)
  • parseTerms (70-70)
  • escapeMarkdownHeadingIds (44-52)
src/plugins/replace/normalize-img-src.ts (2)
  • NormalizeImgSrcOptions (13-25)
  • normalizeImgSrc (27-158)

🔇 Additional comments (7)
src/shared/constants.ts (1)

29-57: LGTM! Clean implementation of language constants and title mappings.

The conversion from Chinese to English language labels aligns well with the PR objectives, and the TITLE_TRANSLATION_MAP provides a good foundation for consistent title translations across supported languages.

src/cli/translate.ts (6)

48-54: Good additions for Russian support and copy-only configuration.

The inclusion of Russian in TERMS_SUPPORTED_LANGUAGES aligns with the expanded language support, and the COPY_ONLY_DIRECTORIES configuration appropriately excludes API documentation from translation.


56-110: Excellent rewrite of the system prompt with comprehensive translation rules.

The English prompt is well-structured with clear baseline requirements covering MDX format preservation, link handling, technical terms, anchor placeholders, and escape character handling. The conditional sections for title translation and terminology are properly integrated.


121-162: Smart optimization of terminology resolution.

The filtering logic effectively reduces the translation context by only including terms that actually appear in the source content. The case-insensitive regex matching with proper escaping ensures accurate term detection.


164-196: Well-implemented anchor preservation mechanism.

The anchor placeholder system effectively protects anchor links during translation. The handling of escaped underscores and proper error checking in the restoration function ensures robustness.


490-519: Clean implementation of copy-only file handling.

The logic properly preserves file content while updating metadata (sourceSHA) and removing translation-specific fields. This effectively supports the requirement to exclude certain directories from translation.



198-220: Effective title translation implementation.

The system properly extracts first-level headings, provides translation hints to the AI model, and applies predefined title mappings from TITLE_TRANSLATION_MAP. The fallback logic ensures consistent title handling.

Also applies to: 251-263, 566-579


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (1)
src/cli/translate.ts (1)

556-558: 🛠️ Refactor suggestion

Remove redundant sourceSHA assignment and fix delete operator usage.

Two issues here: sourceSHA is assigned twice (redundant), and the delete operator impacts performance.

-              const newFrontmatter = { ...sourceFrontmatter, sourceSHA }
-              delete newFrontmatter.i18n
-              newFrontmatter.sourceSHA = sourceSHA
+              const { i18n: _, ...newFrontmatter } = { ...sourceFrontmatter, sourceSHA }
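The suggested destructuring can be tried in isolation. The sketch below assumes a loose `Frontmatter` shape; the interface and the `buildFrontmatter` helper are hypothetical and exist only to show the pattern. Rest destructuring drops the `i18n` key entirely, whereas assigning `undefined` (the linter's "unsafe fix") would leave the key present on the object.

```typescript
// Hypothetical shape for illustration; real frontmatter may carry more fields.
interface Frontmatter {
  title?: string
  sourceSHA?: string
  i18n?: unknown
  [key: string]: unknown
}

function buildFrontmatter(sourceFrontmatter: Frontmatter, sourceSHA: string) {
  // Pull `i18n` out and keep everything else, with sourceSHA set exactly once.
  const { i18n: _i18n, ...rest } = { ...sourceFrontmatter, sourceSHA }
  return rest
}

const result = buildFrontmatter(
  { title: 'Intro', i18n: { title: { en: 'Intro' } } },
  'abc123',
)
console.log(Object.keys(result)) // the `i18n` key is not among them
```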
🧰 Tools
🪛 Biome (1.9.4)

[error] 557-557: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

🧹 Nitpick comments (1)
src/cli/translate.ts (1)

496-519: Consider optimization for copy-only files.

Based on past review feedback, since copy-only files don't require translation, the sourceSHA calculation and comparison might be unnecessary overhead. Consider skipping the hash calculation for copy-only files if the goal is pure copying.

Would you like me to propose an optimization that skips sourceSHA calculation for copy-only files to improve performance?

🧰 Tools
🪛 Biome (1.9.4)

[error] 503-503: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3e6a54 and a44676a.

📒 Files selected for processing (2)
  • src/cli/translate.ts (8 hunks)
  • src/shared/constants.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/shared/constants.ts
🧰 Additional context used
🧠 Learnings (1)
src/cli/translate.ts (2)
Learnt from: JounQin
PR: alauda/doom#82
File: src/cli/translate.ts:293-293
Timestamp: 2025-06-09T15:19:13.326Z
Learning: GPT-4.1-mini is a valid OpenAI model name that can be used in Azure OpenAI API calls, as confirmed by OpenAI's official documentation.
Learnt from: JounQin
PR: alauda/doom#82
File: src/cli/translate.ts:293-293
Timestamp: 2025-06-09T15:19:13.326Z
Learning: GPT-4.1-mini is a valid and recently released OpenAI model (April 2025) available on Azure OpenAI Service. It's identified as "gpt-4.1-mini-2025-04-14" in Azure OpenAI Service and significantly outperforms GPT-4o while being more cost-effective and faster. It supports up to 1M tokens of context and has multimodal capabilities.
🪛 Biome (1.9.4)
src/cli/translate.ts

[error] 503-503: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 557-557: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 579-579: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Build and Lint with Node.js 22 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 24 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 18 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 20 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 24 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 22 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 18 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 20 on ubuntu-latest
🔇 Additional comments (8)
src/cli/translate.ts (8)

23-27: LGTM! Clean import and constant additions.

The new imports and constants properly support the enhanced translation features including Russian language support and copy-only directory handling.

Also applies to: 48-54


56-110: Excellent system prompt rewrite with comprehensive instructions.

The English rewrite significantly improves clarity and includes detailed baseline requirements that should enhance translation accuracy and consistency. The conditional sections for title translation and terms allow for dynamic prompt enhancement.


121-162: Smart terms filtering implementation enhances translation accuracy.

The dynamic filtering of terminology based on source content presence is an excellent improvement over static lists. The regex-based matching with proper escaping and case-insensitive search ensures reliable term detection.


164-220: Well-designed utility functions for anchor preservation and title translation.

The anchor handling functions effectively prevent translation corruption by using placeholder replacement, and the title translation lookup provides consistency for common documentation sections. Good error handling and clear separation of concerns.


248-276: Enhanced translate function integrates new features effectively.

The function now incorporates smart terminology filtering, title translation hints, and anchor preservation while maintaining a logical flow and backward compatibility.


290-293: Model upgrade and content handling improvements.

The upgrade to gpt-4.1-mini leverages a more capable model, and using contentWithPlaceholders correctly implements the anchor preservation strategy.


395-402: Efficient copy-only files detection using glob patterns.

The implementation correctly identifies copy-only files and uses a Set for efficient lookup during processing.


563-576: Robust title translation handling with fallback strategy.

The implementation correctly handles both AI-translated titles from content and predefined title mappings from frontmatter, with appropriate fallback logic. This addresses the concern about frontmatter vs content title handling.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
src/cli/translate.ts (3)

502-502: Consider skipping sourceSHA for copy-only files to improve performance.

Since copy-only files are not translated, calculating and storing the sourceSHA may be unnecessary overhead. Consider whether this hash is actually needed for copy-only operations.


502-503: 🛠️ Refactor suggestion

Replace delete operator with destructuring for better performance.

The static analysis tool correctly identifies that the delete operator can impact performance. Use destructuring to exclude the i18n property instead.

-              const newFrontmatter = { ...sourceFrontmatter, sourceSHA }
-              delete newFrontmatter.i18n
+              const { i18n: _, ...newFrontmatter } = { ...sourceFrontmatter, sourceSHA }
🧰 Tools
🪛 Biome (1.9.4)

[error] 503-503: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


580-582: 🛠️ Refactor suggestion

Replace delete operator with destructuring for consistency.

For consistency with other frontmatter handling and to address the static analysis performance concern, use destructuring instead of the delete operator.

-              if (typeof newFrontmatter.title !== 'string') {
-                delete newFrontmatter.title
-              }
+              const finalFrontmatter = typeof newFrontmatter.title === 'string' 
+                ? newFrontmatter 
+                : (() => {
+                    const { title: _, ...rest } = newFrontmatter
+                    return rest
+                  })()

Then use finalFrontmatter in the matter.stringify call.

🧰 Tools
🪛 Biome (1.9.4)

[error] 581-581: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a44676a and f0ffd99.

📒 Files selected for processing (1)
  • src/cli/translate.ts (8 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
src/cli/translate.ts

[error] 503-503: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 581-581: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Build and Lint with Node.js 24 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 22 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 18 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 20 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 22 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 24 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 18 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 20 on ubuntu-latest
🔇 Additional comments (10)
src/cli/translate.ts (10)

23-28: LGTM: Import changes support new features.

The addition of TITLE_TRANSLATION_MAP import and other constants properly supports the new title translation and terminology features.


48-54: LGTM: Russian language support and copy-only directories added.

The expansion to include Russian in TERMS_SUPPORTED_LANGUAGES and the COPY_ONLY_DIRECTORIES constant align well with the PR objectives for multi-language support and selective file handling.


56-110: Excellent system prompt rewrite enhances translation quality.

The rewritten English system prompt is comprehensive and well-structured. It includes detailed baseline requirements for preserving MDX format, links, technical terms, and proper handling of frontmatter. The dynamic placeholders for title translation and terminology injection provide context-specific guidance to the AI model.


121-162: Smart terms filtering improves translation context efficiency.

The resolveTerms function intelligently filters terms to only include those present in the source content with available translations for both languages. The case-insensitive regex matching and proper escaping are well-implemented.


164-196: Robust anchor preservation mechanism protects link integrity.

The anchor handling functions provide a solid solution for preserving markdown anchors during translation. The escaped underscore handling for MDX compatibility and the numbered placeholder system with validation are well-designed.


198-220: Title extraction and translation functions are well-implemented.

The extractFirstLevelHeading and getTitleTranslation functions provide clean, focused functionality for title handling. The implementation correctly addresses the need for consistent title translations across languages.


243-304: Translate function properly integrates new features.

The updates to the translate function successfully integrate terms filtering, title translation hints, and anchor preservation. The model upgrade to gpt-4.1-mini enhances performance as noted in the PR objectives.


395-402: Copy-only directory logic is well-implemented.

The copy-only file detection using glob patterns provides a clean way to handle files that should be copied verbatim instead of translated.


557-561: Frontmatter handling is improved but could use destructuring.

The frontmatter handling correctly excludes the i18n field and preserves the sourceSHA. However, consider using destructuring instead of the delete operator for consistency.


569-578: Title translation from frontmatter is well-implemented.

The code now properly handles title translation from frontmatter using the predefined translation map, addressing the previous review concern about frontmatter title consideration.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
src/cli/translate.ts (3)

500-500: Replace delete operator for better performance.

The static analysis tool correctly identifies that the delete operator can impact performance.

-              const newFrontmatter = { ...sourceFrontmatter, sourceSHA }
-              delete newFrontmatter.i18n
+              const { i18n: _, ...newFrontmatter } = { ...sourceFrontmatter, sourceSHA }
🧰 Tools
🪛 Biome (1.9.4)

[error] 500-500: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


566-575: Consider consolidating title translation logic.

The title translation logic could be simplified by handling both frontmatter title and extracted heading title in a single location to avoid potential conflicts.

-              if (sourceFrontmatter.title) {
-                const titleTranslation = getTitleTranslation(
-                  sourceFrontmatter.title,
-                  source,
-                  target,
-                )
-                if (titleTranslation) {
-                  newFrontmatter.title = titleTranslation
-                }
-              }
+              // Apply title translation from frontmatter with priority over AI-translated title
+              if (sourceFrontmatter.title) {
+                const titleTranslation = getTitleTranslation(
+                  sourceFrontmatter.title,
+                  source,
+                  target,
+                )
+                if (titleTranslation) {
+                  newFrontmatter.title = titleTranslation
+                }
+              } else if (typedData.title && typeof typedData.title === 'string') {
+                newFrontmatter.title = typedData.title
+              }

577-584: Simplify frontmatter title handling.

The current approach for handling optional title field could be more concise using object destructuring.

-              const finalFrontmatter =
-                typeof newFrontmatter.title === 'string'
-                  ? newFrontmatter
-                  : (() => {
-                      // eslint-disable-next-line @typescript-eslint/no-unused-vars
-                      const { title: _, ...rest } = newFrontmatter
-                      return rest
-                    })()
+              const finalFrontmatter = typeof newFrontmatter.title === 'string' 
+                ? newFrontmatter 
+                : (() => {
+                    const { title: _, ...rest } = newFrontmatter
+                    return rest
+                  })()
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f0ffd99 and 3cd5e5b.

📒 Files selected for processing (1)
  • src/cli/translate.ts (8 hunks)
🧰 Additional context used
🧠 Learnings (1)
src/cli/translate.ts (2)
Learnt from: JounQin
PR: alauda/doom#82
File: src/cli/translate.ts:293-293
Timestamp: 2025-06-09T15:19:13.326Z
Learning: GPT-4.1-mini is a valid OpenAI model name that can be used in Azure OpenAI API calls, as confirmed by OpenAI's official documentation.
Learnt from: JounQin
PR: alauda/doom#82
File: src/cli/translate.ts:293-293
Timestamp: 2025-06-09T15:19:13.326Z
Learning: GPT-4.1-mini is a valid and recently released OpenAI model (April 2025) available on Azure OpenAI Service. It's identified as "gpt-4.1-mini-2025-04-14" in Azure OpenAI Service and significantly outperforms GPT-4o while being more cost-effective and faster. It supports up to 1M tokens of context and has multimodal capabilities.
⏰ Context from checks skipped due to timeout of 90000ms (8)
  • GitHub Check: Build and Lint with Node.js 24 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 18 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 22 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 20 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 22 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 20 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 24 on ubuntu-latest
  • GitHub Check: Build and Lint with Node.js 18 on ubuntu-latest
🔇 Additional comments (11)
src/cli/translate.ts (11)

23-27: Good addition of title translation support.

The import of TITLE_TRANSLATION_MAP enhances the translation system with predefined title mappings, which should improve consistency and accuracy for common documentation sections.


48-48: Russian language support added correctly.

The addition of 'ru' to TERMS_SUPPORTED_LANGUAGES properly extends terminology resolution to support Russian translations.


51-54: Well-designed copy-only directory configuration.

The COPY_ONLY_DIRECTORIES constant provides a clean way to specify which directories should be copied verbatim instead of translated. The APIs directory is a logical choice for copy-only behavior.


56-110: Excellent system prompt rewrite for improved AI comprehension.

The conversion from Chinese to English with detailed baseline requirements significantly improves clarity. The prompt includes comprehensive rules for:

  • MDX format preservation
  • Link integrity protection
  • Technical term handling
  • Frontmatter processing
  • Anchor placeholder preservation
  • Escape character handling

This structured approach should enhance translation accuracy and consistency.


121-162: Smart optimization for terminology resolution.

The resolveTerms function efficiently filters terms to include only those:

  1. Present in the source content (case-insensitive regex matching)
  2. Having translations for both source and target languages

This reduces context overhead and improves translation relevance. The regex escaping for special characters is properly implemented.
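The filtering described above could look roughly like the following sketch. It assumes each term is a record keyed by language code; the `Term` type, `escapeRegExp`, and `filterTerms` are hypothetical names for illustration and are not taken from the PR.

```typescript
// Hypothetical shape of a terminology entry; the real type may differ.
type Term = Partial<Record<"en" | "zh" | "ru", string>>;

// Escape regex metacharacters so a term such as "C++" is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Keep only terms that appear in the source content (case-insensitive)
// and that have a translation for both the source and target language.
function filterTerms(
  terms: Term[],
  content: string,
  source: keyof Term,
  target: keyof Term,
): Array<{ source: string; target: string }> {
  const result: Array<{ source: string; target: string }> = [];
  for (const term of terms) {
    const from = term[source];
    const to = term[target];
    if (!from || !to) continue; // a translation is missing on one side
    if (new RegExp(escapeRegExp(from), "i").test(content)) {
      result.push({ source: from, target: to });
    }
  }
  return result;
}
```

Only the surviving pairs would then be injected into the prompt, which is where the reduction in context comes from.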


164-196: Robust anchor preservation mechanism.

The anchor handling functions effectively:

  • Replace anchors with numbered placeholders before translation
  • Handle escaped underscores in anchor IDs for MDX compatibility
  • Restore anchors after translation with proper validation

The error handling for invalid anchor indices adds good defensive programming.
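A minimal sketch of the placeholder round trip follows. It assumes anchors are written as `{#id}` (as in headings like `### Exporting PDF {#export}`) and that ids may contain escaped underscores (`\_`); the regular expressions and function names are assumptions made for this example, not the PR's actual implementation.

```typescript
// Assumed anchor syntax: `{#id}`, where the id may contain `\_`.
const ANCHOR_PATTERN = /\{#(?:\\_|[\w-])+\}/g;
// Placeholder format named in the PR description: __ANCHOR_N__.
const PLACEHOLDER_PATTERN = /__ANCHOR_(\d+)__/g;

// Swap every anchor for a numbered placeholder and remember the originals.
function replaceAnchors(content: string): { content: string; anchors: string[] } {
  const anchors: string[] = [];
  const replaced = content.replace(ANCHOR_PATTERN, (match) => {
    anchors.push(match);
    return `__ANCHOR_${anchors.length - 1}__`;
  });
  return { content: replaced, anchors };
}

// Put the original anchors back; fail loudly if a placeholder index is unknown.
function restoreAnchors(content: string, anchors: string[]): string {
  return content.replace(PLACEHOLDER_PATTERN, (_match, index: string) => {
    const anchor = anchors[Number(index)];
    if (anchor === undefined) {
      throw new Error(`Invalid anchor index: ${index}`);
    }
    return anchor;
  });
}
```

Because the originals are stored verbatim, an id such as `my\_id` comes back with its escape intact, whatever the model did with the surrounding text.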


198-220: Effective title translation implementation.

The combination of extractFirstLevelHeading and getTitleTranslation provides a clean way to leverage predefined title mappings for consistent translations of common documentation sections.
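As an illustration of how these two helpers might fit together, here is a hedged sketch. The function names and the `(title, source, target)` argument order come from this review; the bodies, the `Lang` type, and the `SAMPLE_TITLE_MAP` entries are assumptions and do not reproduce the real `TITLE_TRANSLATION_MAP`.

```typescript
type Lang = "en" | "zh" | "ru";

// Hypothetical mapping table with illustrative entries only.
const SAMPLE_TITLE_MAP: Array<Partial<Record<Lang, string>>> = [
  { en: "Introduction", zh: "介绍" },
  { en: "Installation", zh: "安装" },
];

// Return the text of the first `# Heading` line, if any.
// Note: this simple regex does not skip fenced code blocks.
function extractFirstLevelHeading(content: string): string | undefined {
  const match = /^#[ \t]+(.+?)[ \t]*$/m.exec(content);
  return match?.[1];
}

// Look up a predefined translation; undefined means "let the model translate".
function getTitleTranslation(
  title: string,
  source: Lang,
  target: Lang,
): string | undefined {
  const entry = SAMPLE_TITLE_MAP.find((item) => item[source] === title);
  return entry?.[target];
}
```

When the lookup returns a value, it can be passed to the model as a hint; when it returns `undefined`, the title falls back to ordinary AI translation.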


248-267: Well-integrated new translation features.

The updates to the translate function properly incorporate:

  • Terms filtering based on source content
  • Title translation prompt injection
  • Anchor placeholder replacement/restoration

The conditional logic ensures features are only applied when relevant.


293-293: Model upgrade is valid.

Based on the retrieved learnings, gpt-4.1-mini is confirmed as a valid and recently released OpenAI model (April 2025) that offers significant performance improvements over GPT-4o while being more cost-effective.


396-402: Efficient copy-only file detection.

The glob-based approach for identifying copy-only files is clean and performant. Using a Set for lookups in the main processing loop is the right optimization.
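The Set-based lookup can be sketched as follows. To stay dependency-free, this example swaps the glob call for a plain directory-prefix check, so it is not the PR's approach; `collectCopyOnlyFiles` is a hypothetical helper, and the `apis` value is taken from the PR description ("all files under apis/ directory").

```typescript
// Directory list as described in the PR; the exact constant value is assumed.
const COPY_ONLY_DIRECTORIES = ["apis"];

// Collect the relative paths that should be copied verbatim.
// A prefix check stands in for the glob matching used by the real code.
function collectCopyOnlyFiles(files: string[], directories: string[]): Set<string> {
  const copyOnly = new Set<string>();
  for (const file of files) {
    const normalized = file.replace(/\\/g, "/"); // tolerate Windows separators
    if (directories.some((dir) => normalized.startsWith(`${dir}/`))) {
      copyOnly.add(file);
    }
  }
  return copyOnly;
}

// In the main loop, membership is then a constant-time check.
const copyOnlyFiles = collectCopyOnlyFiles(
  ["apis/log/search.mdx", "start.mdx", "usage/apis.md"],
  COPY_ONLY_DIRECTORIES,
);
```

Building the Set once up front keeps the per-file decision in the processing loop to a single `has` call.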


493-516: Well-implemented copy-only file handling.

The copy-only logic properly:

  • Updates sourceSHA for tracking changes
  • Removes i18n field from frontmatter
  • Preserves original content without translation
  • Provides clear logging

The sourceSHA update ensures copy-only files can still be tracked for changes, which is important for incremental processing.

@youyongsong youyongsong requested a review from JounQin June 10, 2025 03:51
@youyongsong youyongsong enabled auto-merge (squash) June 10, 2025 04:18
@youyongsong youyongsong requested a review from JounQin June 10, 2025 04:18
Signed-off-by: JounQin <admin@1stg.me>
@JounQin JounQin changed the title from "Overhaul translation system for improved accuracy and stability" to "feat: overhaul translation system for improved accuracy and stability" Jun 10, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (4)
src/cli/translate.ts (4)

198-220: Well-implemented title translation utilities.

The functions are clean and focused. However, note that extractFirstLevelHeading only extracts markdown headings (# syntax) and doesn't consider titles in frontmatter, which may miss some title translation opportunities.


493-516: Consider optimizing sourceSHA for copy-only files.

Since copy-only files don't undergo translation, adding sourceSHA may be unnecessary and could be skipped to improve performance slightly.


574-576: 🛠️ Refactor suggestion

Replace delete operator with conditional assignment for better performance.

Use conditional logic instead of the delete operator for consistent performance optimization.

Apply this diff:

-              if (typeof newFrontmatter.title !== 'string') {
-                delete newFrontmatter.title
-              }
+              const finalFrontmatter = typeof newFrontmatter.title === 'string' 
+                ? newFrontmatter 
+                : (() => {
+                    const { title: _, ...rest } = newFrontmatter
+                    return rest
+                  })()

Then use finalFrontmatter in the matter.stringify call on line 578.

🧰 Tools
🪛 Biome (1.9.4)

[error] 575-575: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


499-500: 🛠️ Refactor suggestion

Replace delete operator with destructuring for better performance.

The static analysis correctly identifies performance issues with the delete operator.

Apply this diff to use destructuring instead:

-              const newFrontmatter = { ...sourceFrontmatter, sourceSHA }
-              delete newFrontmatter.i18n
+              const { i18n: _, ...newFrontmatter } = { ...sourceFrontmatter, sourceSHA }

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 3cd5e5b and f7ff37f.

📒 Files selected for processing (1)
  • src/cli/translate.ts (8 hunks)
🧰 Additional context used
🪛 Biome (1.9.4)
src/cli/translate.ts

[error] 500-500: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 554-554: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 575-575: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

🔇 Additional comments (5)
src/cli/translate.ts (5)

23-54: LGTM! Well-organized constants and imports.

The new imports align with the translation system overhaul objectives, and the constants provide clear configuration for copy-only directories and supported terminology languages.


121-162: Excellent smart filtering implementation.

The terminology resolution system is well-designed with proper regex escaping, case-insensitive matching, and filtering based on source content presence. This should significantly improve translation accuracy and reduce context overhead.


164-196: Robust anchor preservation mechanism.

The anchor placeholder system effectively protects anchor links from translation corruption. The handling of escaped underscores for MDX compatibility and proper error handling for invalid indices are excellent design choices.


222-304: Well-integrated translation function updates.

The function effectively integrates all new features - smart terminology filtering, title translation prompts, and anchor preservation. The model upgrade to gpt-4.1-mini is appropriate based on its improved performance characteristics.


563-572: Excellent title translation integration.

The title translation logic properly applies predefined mappings from the centralized translation table, which should improve consistency for common documentation sections.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/cli/translate.ts (1)

501-501: Consider replacing delete operators for better performance.

The static analysis tool correctly identifies that delete operators can impact performance. While this works functionally, consider using destructuring for consistency and performance:

-const newFrontmatter = { ...sourceFrontmatter, sourceSHA }
-delete newFrontmatter.i18n
+const { i18n: _, ...newFrontmatter } = { ...sourceFrontmatter, sourceSHA }

However, since there's a past review comment indicating preference for the delete approach for simplicity, this can remain as-is if performance isn't a concern.

Also applies to: 555-555, 582-582

🧰 Tools
🪛 Biome (1.9.4)

[error] 501-501: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between dcdda83 and b7fee83.

📒 Files selected for processing (15)
  • docs/en/apis/advanced-apis/event/index.mdx (1 hunks)
  • docs/en/apis/advanced-apis/event/search.mdx (1 hunks)
  • docs/en/apis/advanced-apis/log/aggregation.mdx (1 hunks)
  • docs/en/apis/advanced-apis/log/index.mdx (1 hunks)
  • docs/en/apis/advanced-apis/log/search.mdx (1 hunks)
  • docs/en/start.mdx (7 hunks)
  • docs/en/usage/configuration.md (5 hunks)
  • docs/zh/apis/advanced-apis/event/index.mdx (0 hunks)
  • docs/zh/apis/advanced-apis/event/search.mdx (0 hunks)
  • docs/zh/apis/advanced-apis/log/aggregation.mdx (0 hunks)
  • docs/zh/apis/advanced-apis/log/index.mdx (0 hunks)
  • docs/zh/apis/advanced-apis/log/search.mdx (0 hunks)
  • docs/zh/usage/configuration.md (1 hunks)
  • package.json (1 hunks)
  • src/cli/translate.ts (8 hunks)
💤 Files with no reviewable changes (5)
  • docs/zh/apis/advanced-apis/log/aggregation.mdx
  • docs/zh/apis/advanced-apis/event/search.mdx
  • docs/zh/apis/advanced-apis/event/index.mdx
  • docs/zh/apis/advanced-apis/log/index.mdx
  • docs/zh/apis/advanced-apis/log/search.mdx
✅ Files skipped from review due to trivial changes (6)
  • docs/en/apis/advanced-apis/log/aggregation.mdx
  • docs/en/apis/advanced-apis/event/search.mdx
  • docs/en/apis/advanced-apis/log/index.mdx
  • docs/en/apis/advanced-apis/log/search.mdx
  • package.json
  • docs/en/apis/advanced-apis/event/index.mdx
🧰 Additional context used
🪛 Biome (1.9.4)
src/cli/translate.ts

[error] 501-501: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 555-555: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)


[error] 582-582: Avoid the delete operator which can impact performance.

Unsafe fix: Use an undefined assignment instead.

(lint/performance/noDelete)

🪛 LanguageTool
docs/en/start.mdx

[uncategorized] ~152-~152: A punctuation mark might be missing here.
Context: ...eview. ### Using Scaffolding Templates {#new} Run yarn new to generate projects, ...

(AI_EN_LECTOR_MISSING_PUNCTUATION)


[typographical] ~212-~212: Consider adding a comma here.
Context: ...### Exporting PDF {#export} :::warning Please run the yarn build command before exp...

(PLEASE_COMMA)

docs/en/usage/configuration.md

[uncategorized] ~23-~23: Loose punctuation mark.
Context: ...# Basic Configuration {#basic} - lang: Default document language. To accommoda...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~24-~24: Loose punctuation mark.
Context: ... set to null or undefined. - title: Document title, displayed on the browse...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~25-~25: Loose punctuation mark.
Context: ..., displayed on the browser tab. - logo: Logo at the top left of the document, s...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~26-~26: Loose punctuation mark.
Context: ...lt into the doom package. - logoText: Document title, displayed next to the l...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~27-~27: Loose punctuation mark.
Context: ...xt to the logo at the top left. - icon: Document favicon, defaults to the same ...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~28-~28: Loose punctuation mark.
Context: ...defaults to the same as logo. - base: Base path of the document, used when de...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~29-~29: Loose punctuation mark.
Context: ...oduct-docs. Defaults to /. - outDir: Build output directory, defaults to di...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~85-~85: Loose punctuation mark.
Context: ...e ``` ### frontmatterMode - `ignore`: Ignore the frontmatter of the reference...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~86-~86: Loose punctuation mark.
Context: ...urrent document's frontmatter. - merge: Merge the frontmatter of the referenced...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~87-~87: Loose punctuation mark.
Context: ...ride the current document's. - replace: Replace the current document's frontmat...

(UNLIKELY_OPENING_PUNCTUATION)


[uncategorized] ~88-~88: Loose punctuation mark.
Context: ...t of the referenced document. - remove: Remove the current document's frontmatt...

(UNLIKELY_OPENING_PUNCTUATION)

🔇 Additional comments (13)
docs/zh/usage/configuration.md (1)

169-229: LGTM! Comprehensive translation system prompt documentation.

The new system prompt documentation is well-structured and covers all the critical baseline requirements for accurate translation. The detailed rules for preserving links, handling MDX components, managing placeholders, and maintaining formatting align perfectly with the translation system overhaul described in the PR objectives.

The dynamic injection of terms and titleTranslationPrompt parameters provides good flexibility for context-aware translations.

docs/en/usage/configuration.md (1)

170-229: LGTM! Well-documented translation configuration.

The English documentation mirrors the Chinese version perfectly and provides clear, comprehensive guidance for the translation system configuration. The baseline requirements are well-explained and the dynamic prompt injection features are properly documented.

docs/en/start.mdx (2)

241-259: Good addition of lint command documentation.

The new lint command documentation section provides clear usage instructions and references to configuration details, improving the completeness of the CLI documentation.


5-77: LGTM! Improved documentation clarity.

The section title improvements ("Getting Started", "Creating a Project", "CLI Tool") and various editorial enhancements make the documentation more readable and user-friendly.

src/cli/translate.ts (9)

49-49: Good addition of Russian language support.

Adding Russian to the supported terminology languages expands the translation capabilities as intended by the PR objectives.


52-56: Well-implemented copy-only directories feature.

The COPY_ONLY_DIRECTORIES constant provides a clean way to specify directories that should be copied verbatim without translation, which aligns with the PR's goal of adding copy-only file configurations.


57-111: Excellent system prompt rewrite.

The new English system prompt is comprehensive and well-structured. The baseline requirements clearly define critical rules for preserving links, handling MDX format, managing placeholders, and maintaining formatting. The dynamic injection of titleTranslationPrompt and terms provides good flexibility for context-aware translations.


122-163: Smart terminology filtering implementation.

The resolveTerms function efficiently filters terminology based on:

  1. Presence of terms in source content (case-insensitive)
  2. Availability of translations in both source and target languages

This reduces context overhead and improves translation accuracy as intended by the PR objectives.


165-197: Robust anchor preservation mechanism.

The anchor placeholder system effectively protects markdown anchor links from translation corruption by:

  1. Replacing anchors with numbered placeholders before translation
  2. Handling escaped underscores for MDX processor compatibility
  3. Restoring original anchors after translation

This addresses a key stability improvement mentioned in the PR objectives.


199-221: Good title translation mapping integration.

The functions for extracting first-level headings and mapping title translations provide a clean way to handle predefined title translations, improving consistency as mentioned in the PR objectives.


294-294: Model update to gpt-4.1-mini is correct.

Based on the retrieved learnings, gpt-4.1-mini is a valid and recently released OpenAI model that offers improved performance and cost-effectiveness compared to previous models.


396-517: Well-implemented copy-only file handling.

The copy-only logic correctly:

  1. Identifies files matching copy-only directory patterns
  2. Updates frontmatter with sourceSHA but skips translation
  3. Maintains proper file structure and logging

This provides the selective copy functionality described in the PR objectives.


518-597: Comprehensive translation logic enhancement.

The enhanced translation workflow effectively:

  1. Handles anchor placeholders to preserve link integrity
  2. Applies title translations from the mapping table
  3. Properly manages frontmatter updates including removing i18n fields
  4. Maintains consistent logging and error handling

This addresses the core objectives of improving translation accuracy and stability.

@youyongsong youyongsong merged commit 32a35c7 into main Jun 10, 2025
16 checks passed
@youyongsong youyongsong deleted the feat/refactor-translation branch June 10, 2025 05:09
github-actions bot pushed a commit that referenced this pull request Jun 10, 2025
* Overhaul translation system for improved accuracy and stability

This commit improves the documentation translation system by simplifying the translation process, reducing context overhead, and making translation results more stable.

System Prompt Changes:
* Convert system prompt from Chinese to English for better AI comprehension
* Replace 4-step translation process with single-pass approach to improve output stability and reduce context
* Remove target content comparison logic to simplify process and reduce translation context
* Restructure prompt with clear baseline requirements and optional additional requirements section

Smart Terms Filtering:
* Only include terms that actually appear in source content
* Support multi-language terms mapping (English, Chinese, Russian)

Anchor Link Protection:
* Introduce anchor preprocessing mechanism to prevent translation corruption of markdown anchor links
* Replace anchors with numbered placeholders (__ANCHOR_N__) before translation
* Restore original anchors after translation completion
* Handle escaped underscores in anchor IDs to maintain compatibility with MDX processor

Title Translation:
* Replace i18n.title-based translation with built-in title translation mapping table
* Implement automatic title correction through prompt enhancement when titles exist in mapping table
* Extract first-level headings from content and apply predefined translations when available
* Fallback to AI translation for unmapped titles while preserving consistency for common terms

Frontmatter Cleanup:
* Remove i18n fields from translated documents

Content Processing:
* Add support for copy-only files configuration, with all files under apis/ directory set to copy-only by default

Technical Changes:
* Upgrade model from gpt-4o-mini to gpt-4.1-mini
* Update language constants from Chinese to English descriptions
* Add title translation mapping table for common sections

* fix typecov and improve title translation map.

* fix issues found by coderabbit ai.

* fix issues found by coderabbit ai.

* revert to delete frontmatter fields method.
@renovate renovate bot mentioned this pull request Jul 9, 2025
1 task