
Support content in languages other than English #696

Open
@natoverse


GraphRAG does not explicitly support any particular language. However, the prompts are written in English, and most of our evaluation has been done using English-language datasets. Many users would like to use GraphRAG for non-English datasets and have reported varying levels of success. GraphRAG performance may vary across languages depending on the prompting, the encoding/tokenizing, and the training and biases of the chosen model.

While we don't plan to implement explicit features or support for any particular language at this time, there are a number of things users can do to try to improve non-English language support. A few examples:

  • Tune the prompts to request responses in a specific language. Notably, you can use our auto-tuning CLI tool and specify the language to use.
  • Rewrite the prompts in your language of choice. If you have used the init command to generate your starting config, all of the prompts are exported as text files that you can edit.
  • Experiment with different models. We haven't confirmed any specific model/language alignments, but please do experiment and report back in the discussion comments if you have helpful results. If you need to try a non-OpenAI model that is trained on a language other than English, please see issue #657, "Support model providers other than OpenAI and Azure".
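
For the first two suggestions, a small script can apply a language instruction to the exported prompt files in bulk. The following is only a minimal sketch, not part of GraphRAG: the function name `add_language_directive`, the wording of the directive, and the choice to prepend it at the top of each file are all assumptions made for illustration. It only assumes that the prompts are plain `.txt` files in a single directory, and it leaves any template placeholders in the prompt text untouched.

```python
from pathlib import Path


def add_language_directive(prompt_dir: Path, language: str) -> list[Path]:
    """Prepend a response-language instruction to every .txt prompt in prompt_dir.

    Hypothetical helper for illustration; GraphRAG does not ship this function.
    Returns the list of files that were modified.
    """
    # The exact wording is an assumption; adjust it for your model and language.
    directive = f"Write all output in {language}."
    changed = []
    for path in sorted(prompt_dir.glob("*.txt")):
        text = path.read_text(encoding="utf-8")
        if text.startswith(directive):
            continue  # already patched, so repeated runs do not stack directives
        # Prepend rather than append, so the end of the prompt is left as written.
        path.write_text(directive + "\n\n" + text, encoding="utf-8")
        changed.append(path)
    return changed
```

Calling `add_language_directive(Path("prompts"), "Spanish")` would patch each prompt file in a directory named `prompts` (the directory name here is an assumption; use wherever your prompts were exported). Whether a one-line directive is enough, or whether the whole prompt needs to be rewritten in the target language, is likely to depend on the model, so it is worth comparing both approaches on a small sample.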

Labels: enhancement (New feature or request)
