Portugues #2150

dantee-e · 2025-11-07T15:06:26Z

Description

Adds initial support for the Portuguese language to Harper. It is not close to operational yet.
For now, only superficial changes have been made:

New Portuguese dictionary and annotations, with new rules for word stress, which matters when forming plurals in the language.
Created a new enum for the supported languages.
New parser for plain Portuguese. Only minor changes so far: checks that are wrong for the language were removed. The function lex_token now takes an additional parameter that receives the language.
Some temporary helper files have been added to make it easier to build the Portuguese dictionary.
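
As a rough illustration of the language enum and of a lexing function that receives the language, here is a minimal sketch. Everything in it is hypothetical: the variant names, the function `lex_word` (a stand-in for `lex_token`), and the per-language rule are assumptions made for the example, not Harper's actual code.

```rust
/// Hypothetical language selector. `English` is the default, so existing
/// callers that do not pass a language keep their behaviour.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Language {
    #[default]
    English,
    Portuguese,
}

/// Illustrative rule only: which characters may be part of a word.
fn continues_word(c: char, language: Language) -> bool {
    match language {
        // English contractions such as "don't" keep the apostrophe.
        Language::English => c.is_alphabetic() || c == '\'',
        // Portuguese hyphenated forms such as "guarda-chuva" keep the hyphen.
        Language::Portuguese => c.is_alphabetic() || c == '-',
    }
}

/// Splits `source` into its leading word and the remaining input.
/// The extra `language` argument selects the per-language rule above.
pub fn lex_word(source: &str, language: Language) -> (&str, &str) {
    let end = source
        .char_indices()
        .find(|&(_, c)| !continues_word(c, language))
        .map(|(i, _)| i)
        .unwrap_or(source.len());
    source.split_at(end)
}
```

With these assumed rules, `lex_word("guarda-chuva azul", Language::Portuguese)` keeps the hyphenated word whole, while the English rule stops at the hyphen.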

Demo

Nothing yet

How Has This Been Tested?

No tests yet

Checklist

I have performed a self-review of my own code
I have added tests to cover my changes

If any maintainer sees this, I apologize for not keeping the commit history clean with Conventional Commits. If there is a way to fix this other than squashing everything into one commit when merging, let me know. I make many commits as tiny checkpoints, so I usually don't have the patience to make them tidy.

Dialect is now a trait that can be used to implement other dialects. The old Dialect struct is now called EnglishDialect and has the same functionality as before. The only caveat is that the trait must now also be imported in order to use its functions.
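
A minimal sketch of what such a trait could look like follows. The method names, the `PortugueseDialect` type, and the abbreviated variant lists are assumptions for illustration; the real trait's methods are not shown in this PR text.

```rust
/// Hypothetical trait shared by every language's dialect type.
/// Callers in other modules need `use ...::Dialect;` in scope before
/// `name()` or `all()` resolve on a concrete dialect.
pub trait Dialect: Copy {
    /// Human-readable name of the dialect.
    fn name(&self) -> &'static str;
    /// Every dialect of the language.
    fn all() -> Vec<Self>;
}

/// Formerly `Dialect`; variant list abbreviated for the example.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnglishDialect {
    American,
    British,
}

/// Hypothetical second implementor.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PortugueseDialect {
    Brazilian,
    European,
}

impl Dialect for EnglishDialect {
    fn name(&self) -> &'static str {
        match self {
            Self::American => "American",
            Self::British => "British",
        }
    }
    fn all() -> Vec<Self> {
        vec![Self::American, Self::British]
    }
}

impl Dialect for PortugueseDialect {
    fn name(&self) -> &'static str {
        match self {
            Self::Brazilian => "Brazilian",
            Self::European => "European",
        }
    }
    fn all() -> Vec<Self> {
        vec![Self::Brazilian, Self::European]
    }
}

/// Generic code can now work with any language's dialects.
pub fn dialect_names<D: Dialect>() -> Vec<&'static str> {
    D::all().iter().map(|d| d.name()).collect()
}
```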

DictWordMetadata now takes a DialectFlagsEnum instead of an EnglishDialect. This required some adaptations, such as creating the Dialect trait and adding to it a function that gets the most-used dialects from a document for a given language (a LanguageFamily, which is a variant of the language enum that does not carry the dialect inside it). That function should be used only through DialectFlagsEnum, never through the specific DialectFlags implementations. Conversely, the variant of the function that takes no language should be used only by the specific implementations, which should become progressively more private as the enum replaces them in general use. The original idea was to store a Box<dyn DialogFlag> in the DictWordMetadata struct, but the trait turned out not to be dyn compatible, hence this workaround.
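
The enum workaround can be sketched as follows. This is an illustration under assumptions, not the PR's code: the trait methods, the `u8` bit sets, and the helper names `empty_for` and `family` are hypothetical. It shows why a trait object is unavailable and how an enum with one variant per language can be stored in the metadata instead.

```rust
/// Hypothetical per-language flag trait. `Copy` implies `Sized`, and
/// `empty` has no receiver, so the trait is not dyn compatible:
/// `Box<dyn DialectFlags>` would be rejected by the compiler.
pub trait DialectFlags: Copy {
    fn empty() -> Self;
    fn union(self, other: Self) -> Self;
    fn is_empty(&self) -> bool;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EnglishDialectFlags(pub u8);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PortugueseDialectFlags(pub u8);

impl DialectFlags for EnglishDialectFlags {
    fn empty() -> Self { Self(0) }
    fn union(self, other: Self) -> Self { Self(self.0 | other.0) }
    fn is_empty(&self) -> bool { self.0 == 0 }
}

impl DialectFlags for PortugueseDialectFlags {
    fn empty() -> Self { Self(0) }
    fn union(self, other: Self) -> Self { Self(self.0 | other.0) }
    fn is_empty(&self) -> bool { self.0 == 0 }
}

/// The language without any dialect attached.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LanguageFamily {
    English,
    Portuguese,
}

/// One variant per language stands in for the trait object.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DialectFlagsEnum {
    English(EnglishDialectFlags),
    Portuguese(PortugueseDialectFlags),
}

impl DialectFlagsEnum {
    /// The language is passed in because the enum cannot infer it.
    pub fn empty_for(family: LanguageFamily) -> Self {
        match family {
            LanguageFamily::English => Self::English(EnglishDialectFlags::empty()),
            LanguageFamily::Portuguese => Self::Portuguese(PortugueseDialectFlags::empty()),
        }
    }

    pub fn family(&self) -> LanguageFamily {
        match self {
            Self::English(_) => LanguageFamily::English,
            Self::Portuguese(_) => LanguageFamily::Portuguese,
        }
    }

    pub fn is_empty(&self) -> bool {
        match self {
            Self::English(flags) => flags.is_empty(),
            Self::Portuguese(flags) => flags.is_empty(),
        }
    }
}

/// Reduced stand-in for the metadata struct: it holds the enum by value.
pub struct DictWordMetadata {
    pub dialects: DialectFlagsEnum,
}
```

The trade-off of this design is that every new language needs a new variant and new match arms, but the metadata stays `Copy` and needs no heap allocation.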

Dante Ramacciotti added 4 commits November 6, 2025 12:48

Adding the dictionary and the annotations for Portuguese

71a2e30

In the process of including oxytones, paroxytones, and proparoxytones

Added helper functions

415cad4

Helper file ox-paros-proparox that classifies the words based on the stress
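
The helper file itself is not shown here, so the following is only a guess at the kind of classification it performs. It is a simplified heuristic written for this example: a written accent marks the stressed syllable, and unaccented words are classified by their ending. Syllables are approximated by runs of vowels, monosyllables get no special treatment, and the function and type names are hypothetical.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stress {
    Oxytone,       // stress on the last syllable
    Paroxytone,    // stress on the second-to-last syllable
    Proparoxytone, // stress on the third-to-last syllable
}

/// Counts maximal runs of vowels, a rough stand-in for syllable nuclei.
fn vowel_runs(s: &str) -> usize {
    let mut runs = 0;
    let mut in_run = false;
    for c in s.chars() {
        let vowel = "aeiouáéíóúâêôãõà".contains(c);
        if vowel && !in_run {
            runs += 1;
        }
        in_run = vowel;
    }
    runs
}

pub fn classify_stress(word: &str) -> Stress {
    let w = word.to_lowercase();
    // An acute or circumflex accent takes priority over a tilde ("órgão").
    let marked = w
        .char_indices()
        .find(|&(_, c)| "áéíóúâêô".contains(c))
        .or_else(|| w.char_indices().find(|&(_, c)| c == 'ã' || c == 'õ'));

    if let Some((i, c)) = marked {
        let mut rest = &w[i + c.len_utf8()..];
        // A glide right after the marked vowel belongs to the same syllable
        // ("chapéu", "coração"), so it must not count as a new one.
        let glides = if c == 'ã' || c == 'õ' { "oei" } else { "iu" };
        if let Some(next) = rest.chars().next() {
            if glides.contains(next) {
                rest = &rest[next.len_utf8()..];
            }
        }
        return match vowel_runs(rest) {
            0 => Stress::Oxytone,
            1 => Stress::Paroxytone,
            _ => Stress::Proparoxytone,
        };
    }

    // Unaccented words: these endings default to paroxytone, all others to
    // oxytone. Proparoxytones always carry a written accent.
    const PAROXYTONE_ENDINGS: [&str; 9] =
        ["a", "e", "o", "as", "es", "os", "am", "em", "ens"];
    if PAROXYTONE_ENDINGS.iter().any(|&e| w.ends_with(e)) {
        Stress::Paroxytone
    } else {
        Stress::Oxytone
    }
}
```

Under this heuristic, "café" and "falar" come out as oxytones, "casa" and "fácil" as paroxytones, and "lâmpada" as a proparoxytone. Real syllabification has more cases than the vowel-run approximation covers.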

Added the structure for the plain portuguese parser

ca9d0dc

correcting warning

66db7f6

Adding derive default to the language enum

dantee-e marked this pull request as draft November 7, 2025 15:11

Dante Ramacciotti added 25 commits November 7, 2025 17:18

Added the language to every FstDictionary::curated call

abdca18

It invokes different dictionaries depending on what language you pass to it

broke english too

695ef9c

Refactored a bit of the main function to add global arguments for language and dialect. Not yet decided how to deal with dialects for other languages than English

Forgot to add dictionary to merged_dict

9c3292f

Added genders

be08d39

Added both masculine and feminine genders. Also added a neutral gender so that expansion to other languages is easier.

In the process of adding the metadata_conditional function

cbf997a

Untested implementation of the condition

90e9d39

The replace can now have a condition (for example, checking whether the word is a noun or a verb before replacing)
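
A conditional replace of this kind might be modelled as below. The struct, its fields, and the example rule are all hypothetical and only illustrate the idea of guarding a suffix replacement with a metadata check.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PartOfSpeech {
    Noun,
    Verb,
}

/// Reduced, hypothetical stand-in for a word's metadata.
pub struct WordMetadata {
    pub part_of_speech: PartOfSpeech,
}

/// A suffix replacement that only fires when the condition holds.
pub struct ConditionalReplace {
    pub strip: &'static str,
    pub add: &'static str,
    /// `None` means the replacement is unconditional.
    pub only_if: Option<PartOfSpeech>,
}

impl ConditionalReplace {
    /// Returns the rewritten word, or `None` when the condition or the
    /// suffix does not match.
    pub fn apply(&self, word: &str, meta: &WordMetadata) -> Option<String> {
        if let Some(required) = self.only_if {
            if meta.part_of_speech != required {
                return None;
            }
        }
        let stem = word.strip_suffix(self.strip)?;
        Some(format!("{stem}{}", self.add))
    }
}
```

For example, a rule with `strip: "l"`, `add: "is"`, and `only_if: Some(PartOfSpeech::Noun)` would turn the noun "animal" into "animais" and leave a verb untouched.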

Renaming some stuff

f7a46d5

Merge branch 'master' into portugues

ea8999f

Tiny fixes

8f12b7c

Merge branch 'master' into portugues (updated master)

60b355d

Dialect -> EnglishDialect

e30449b

Fixed annotations.json

3df7618

Put the new properties in the wrong place. It was the }. Just a misplaced }.

Fixed documentation and some tests

10ee129

Documentation had the wrong use of the Dialect trait

added new masculine, feminine and animate property

1bde6ae

Remove file added by mistake

a235b51

just fmt

9c43094

thinking about possibilities

7621cdb

Fixes because of last commit

703dda4

Adapting what needs to be adapted from Language to LanguageFamily

cargo check passes

7a7d25e

Corrected annotations and default for english dialect

70dd0bd

Temporary fix for harper-ls. Must be updated to support more languages eventually

87d537e

backup for the wknd

0af195a

Fixing tests to use Language instead of Dialect

dcb9dcf

Basically EnglishDialect::American => Language::English(EnglishDialect::American) everywhere that uses LintGroup::new_curated
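
The shape of that migration can be sketched like this. Only `EnglishDialect::American` and `Language::English(...)` come from the commit message; the Portuguese variants, the `family` helper, the `From` impl, and the default are assumptions added for the example, and `LintGroup::new_curated` itself is left out because its signature is not shown here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnglishDialect {
    American,
    British,
}

/// Hypothetical dialect type for the new language.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PortugueseDialect {
    Brazilian,
    European,
}

/// A language together with the dialect in use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    English(EnglishDialect),
    Portuguese(PortugueseDialect),
}

/// The same languages with the dialect stripped off.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LanguageFamily {
    English,
    Portuguese,
}

impl Language {
    pub fn family(self) -> LanguageFamily {
        match self {
            Language::English(_) => LanguageFamily::English,
            Language::Portuguese(_) => LanguageFamily::Portuguese,
        }
    }
}

/// Lets call sites that used to pass a bare dialect write `.into()`.
impl From<EnglishDialect> for Language {
    fn from(dialect: EnglishDialect) -> Self {
        Language::English(dialect)
    }
}

/// Assumed default, chosen here to mirror the previous English-only setup.
impl Default for Language {
    fn default() -> Self {
        Language::English(EnglishDialect::American)
    }
}
```

A call site that previously passed `EnglishDialect::American` would then pass `Language::English(EnglishDialect::American)`, or `EnglishDialect::American.into()` with the assumed `From` impl.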

Dante Ramacciotti added 2 commits November 24, 2025 17:19

Introducing portuguese spellcheck

782f5e1

not yet ready

I heart ssr

d06ba19

Added the language for the assert_suggestion_result function.
