MSC3554: Extensible Events - Translatable Text #3554

Open
wants to merge 6 commits into
base: main
from

Conversation

turt2live
Member

@turt2live turt2live commented Dec 7, 2021

@turt2live changed the title from "Extensible Events - Translatable Text" to "MSC3554: Extensible Events - Translatable Text" Dec 7, 2021
@turt2live added the kind:core (MSC which is critical to the protocol's success), needs-implementation (This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP.), p2, proposal (A matrix spec change proposal), and proposal-in-review labels Dec 7, 2021
unaware clients would use, which in the example above would be French. Clients which are aware of language
support might end up picking the English version instead.

By default, messages are assumed to be sent in English (`en`).
Contributor

I don't think it makes a lot of sense to assume a language in this case. There is no foolproof way to guess a language, so many clients will probably default to just sending whatever the user typed without a language. What is the benefit of assuming English if that is probably wrong in a lot of cases? Shouldn't it rather just be left unspecified?

Member Author

The vast majority of software in the ecosystem makes assumptions about text being English. This is just to help implementations which might be searching for a language code, not to define the language itself.

Unspecified leads to all kinds of issues with software, whereas French-as-default-English is generally fine.
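
The default under discussion can be sketched as a small selection routine. This is only an illustration, not part of the MSC: the function name `pick_text` and its `preferred` parameter are hypothetical, and the rule that a language-unaware client renders the first entry follows the proposal excerpt quoted above.

```python
DEFAULT_LANG = "en"  # the proposal's default for entries with no "lang"


def pick_text(representations, preferred=None):
    """Return the body a client would render from an m.text-style list.

    `representations` is a list of dicts with "body" and an optional "lang".
    `preferred` is the reader's language code, or None for a client that
    is unaware of language support (hypothetical parameter).
    """
    if not representations:
        return None
    if preferred is not None:
        for rep in representations:
            # A missing "lang" is treated as the default language.
            if rep.get("lang", DEFAULT_LANG) == preferred:
                return rep["body"]
    # Unaware clients, or no match: fall back to the first entry.
    return representations[0]["body"]


texts = [
    {"body": "Je suis un poisson", "lang": "fr"},
    {"body": "I am a fish", "lang": "en"},
]
print(pick_text(texts))        # Je suis un poisson
print(pick_text(texts, "en"))  # I am a fish
```

With this shape, an untagged entry is only ever matched when the reader asks for `en`, which is the "French-as-default-English" behaviour described above.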

@tobiasdiez

What is the state of this spec proposal? The PR linked as a blocker in the description has been merged by now. It would be awesome to see this land soon.

@noaho

This comment was marked as duplicate.

Comment on lines +34 to +50
```json5
{
  "type": "m.message",
  "content": {
    "m.text": [
      {
        "body": "Je suis un poisson",
        "lang": "fr"
      },
      {
        "body": "I am a fish",
        "lang": "en"
      }
    ]
  }
}
```
Member

Moved to a comment thread, a question from @noaho:

If clients can send in multiple languages, a client might auto-translate into a bunch of languages without making it clear which is the source text. That would also make it hard for the receiver to translate on their own (possibly into a new language, or just with a better language model), because they won’t know which text is the source and which is machine translated.

Member

Perhaps we could add a new boolean field to an `m.text` object which indicates that a given translation was the one that the user originally typed/entered? Such as `lang_source: true|false`?
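
A rough sketch of how a receiving client might use such a flag. It assumes the `lang_source` boolean floated in this comment, which is not defined by the MSC, and the helper name `find_source_text` is purely illustrative.

```python
def find_source_text(representations):
    """Return the entry flagged as the user's original text, if any.

    Assumes a hypothetical boolean "lang_source" on each m.text entry,
    as suggested in this thread; the field is not defined by the MSC.
    """
    flagged = [rep for rep in representations if rep.get("lang_source") is True]
    # Exactly one flagged entry is unambiguous; zero or several means the
    # receiver cannot tell which text the sender actually typed.
    return flagged[0] if len(flagged) == 1 else None


texts = [
    {"body": "Je suis un poisson", "lang": "fr", "lang_source": True},
    {"body": "I am a fish", "lang": "en", "lang_source": False},
]
source = find_source_text(texts)
print(source["lang"] if source else "unknown")  # fr
```

A receiver that wants to re-translate with its own model would then start from `source["body"]` and ignore the other entries.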

@tulir tulir mentioned this pull request Dec 29, 2024
Labels
kind:core MSC which is critical to the protocol's success needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. p2 proposal A matrix spec change proposal
6 participants