@dantee-e dantee-e commented Nov 7, 2025

Description

This PR adds support for the Portuguese language to Harper. It is not close to being operational yet.
So far the changes are only superficial:

  • New Portuguese dictionary and annotations, with new rules for word stress, which matters when forming plurals in the language.
  • New enum for the supported languages.
  • New parser for plain Portuguese. It changes only small things, removing checks that are wrong for the language. The function lex_token now takes an extra parameter for the language.
  • Some temporary helper files that make it easier to build the Portuguese dictionary.
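
As a rough illustration of the enum and of passing the language into the lexer, here is a minimal sketch. It is not the code from this PR: `lex_word` is a hypothetical stand-in for lex_token, and the per-language rule (apostrophes join English words, hyphens join Portuguese words) is an assumption chosen only to show a check that differs by language.

```rust
/// Hypothetical stand-in for the PR's language enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Language {
    #[default]
    English,
    Portuguese,
}

/// Returns how many chars at the start of `source` form a single word.
/// Assumed rule for illustration: an apostrophe may join letters in English
/// ("don't"), a hyphen may join letters in Portuguese ("guarda-chuva").
pub fn lex_word(source: &[char], language: Language) -> usize {
    let mut len = 0;
    while len < source.len() {
        let c = source[len];
        let next_is_letter = source.get(len + 1).is_some_and(|n| n.is_alphabetic());
        let joins = match language {
            Language::English => c == '\'',
            Language::Portuguese => c == '-',
        };
        // A joiner only counts inside a word: after a letter and before one.
        if c.is_alphabetic() || (len > 0 && joins && next_is_letter) {
            len += 1;
        } else {
            break;
        }
    }
    len
}
```

With this sketch, the same input lexes differently depending on the language argument: "guarda-chuva" is one word for `Language::Portuguese` and stops at the hyphen for `Language::English`.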

Demo

Nothing yet

How Has This Been Tested?

No tests yet

Checklist

  • I have performed a self-review of my own code
  • I have added tests to cover my changes

If any maintainer sees this, I apologize for not keeping the commit history clean with conventional commits. If there's a way to fix this other than squashing everything into one commit when merging, let me know. I make many small commits as checkpoints, so I don't usually have the patience to tidy them up.

Dante Ramacciotti added 4 commits November 6, 2025 12:48
In the process of including oxytone, paroxytone, and proparoxytone words
Helper file ox-paros-proparox that classifies words based on their
stress
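
To show why the stress class matters for plurals, here is a small sketch covering only words ending in "-il". The `Stress` enum and `pluralize_il` are hypothetical names, not taken from the helper file; the stress class is passed in rather than computed, since classifying it is the helper's job.

```rust
/// Position of the stressed syllable, counted from the end of the word.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stress {
    Oxytone,       // last syllable
    Paroxytone,    // second-to-last syllable
    Proparoxytone, // third-to-last syllable
}

/// Pluralizes a Portuguese word ending in "-il".
/// Returns `None` when the word does not end in "-il" or when this sketch
/// has no rule for the given stress class.
pub fn pluralize_il(word: &str, stress: Stress) -> Option<String> {
    let stem = word.strip_suffix("il")?;
    let ending = match stress {
        Stress::Oxytone => "is",     // barril -> barris
        Stress::Paroxytone => "eis", // fóssil -> fósseis
        Stress::Proparoxytone => return None,
    };
    Some(format!("{stem}{ending}"))
}
```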
Adding derive default to the language enum
@dantee-e dantee-e marked this pull request as draft November 7, 2025 15:11
Dante Ramacciotti added 25 commits November 7, 2025 17:18
It invokes different dictionaries depending on what language you pass to
it
Refactored a bit of the main function to add global arguments for
language and dialect. Not yet decided how to deal with dialects for
languages other than English
Added both masculine and feminine genders. Also added a neutral gender
so that expanding to other languages is easier.
The replace can now have a condition (for example, checking whether the
word is a noun or a verb before replacing)
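
A conditional replace along these lines could look roughly like the sketch below. The `Replacement` struct, its fields, and `PartOfSpeech` are hypothetical and not the PR's actual types. The "-ão" to "-ões" rule is only one of several Portuguese plural patterns for that ending and is used here just to show the condition.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PartOfSpeech {
    Noun,
    Verb,
    Adjective,
}

/// Hypothetical suffix replacement with an optional part-of-speech condition.
pub struct Replacement {
    pub remove: &'static str,
    pub add: &'static str,
    /// When set, the rule applies only to words with this part of speech.
    pub only_if: Option<PartOfSpeech>,
}

impl Replacement {
    /// Applies the rule, or returns `None` if the condition or suffix does not match.
    pub fn apply(&self, word: &str, pos: PartOfSpeech) -> Option<String> {
        if self.only_if.is_some_and(|required| required != pos) {
            return None;
        }
        let stem = word.strip_suffix(self.remove)?;
        Some(format!("{stem}{}", self.add))
    }
}

/// Example rule: noun plural "-ão" -> "-ões" (coração -> corações).
pub fn noun_plural_rule() -> Replacement {
    Replacement { remove: "ão", add: "ões", only_if: Some(PartOfSpeech::Noun) }
}
```

The verb form "dão" ends in the same letters but is left alone, because the rule is restricted to nouns.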
Dialect is now a trait that can be used to implement other dialects. The
old Dialect struct is now called EnglishDialect and has the same
functionality as before. The only caveat is that callers must now also
import the trait to be able to use its functions
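
The import caveat can be sketched as follows. The trait methods (`all`, `name`), the module name, and the dialect variants are assumptions for illustration and may not match the PR.

```rust
mod dialect {
    /// Hypothetical trait shared by every language's dialect type.
    pub trait Dialect: Copy + Sized {
        /// Every dialect of this language.
        fn all() -> Vec<Self>;
        /// Human-readable name of this dialect.
        fn name(self) -> &'static str;
    }

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum EnglishDialect {
        American,
        British,
    }

    impl Dialect for EnglishDialect {
        fn all() -> Vec<Self> {
            vec![Self::American, Self::British]
        }
        fn name(self) -> &'static str {
            match self {
                Self::American => "American",
                Self::British => "British",
            }
        }
    }

    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum PortugueseDialect {
        Brazilian,
        European,
    }

    impl Dialect for PortugueseDialect {
        fn all() -> Vec<Self> {
            vec![Self::Brazilian, Self::European]
        }
        fn name(self) -> &'static str {
            match self {
                Self::Brazilian => "Brazilian",
                Self::European => "European",
            }
        }
    }
}

// The trait itself must be in scope. If `Dialect` is dropped from this `use`,
// the calls to `all()` and `name()` below no longer compile.
use dialect::{Dialect, EnglishDialect, PortugueseDialect};

pub fn english_dialect_names() -> Vec<&'static str> {
    EnglishDialect::all().into_iter().map(|d| d.name()).collect()
}

pub fn default_portuguese_name() -> &'static str {
    PortugueseDialect::Brazilian.name()
}
```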
Put the new properties in the wrong place. It was the }. Just a
misplaced }.
Documentation had the wrong use of the Dialect trait
DictWordMetadata now takes a DialectFlagsEnum instead of an
EnglishDialect. Some adaptations were needed, for example creating the
Dialect trait and adding to it a function that gets the most used
dialects from a document for a provided language (LanguageFamily, a
version of the language enum that doesn't carry the dialect inside it).
This function should not be used for any of the specific DialectFlags
implementations, only for the DialectFlagsEnum.

Conversely, the variant of the function that doesn't receive a language
should only be used by the specific implementations, which should become
progressively more private and be replaced in general by the enum.

The original idea was to use a Box<dyn DialogFlag> in the
DictWordMetadata struct, but in the end the trait is not dyn-compatible,
hence this workaround
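
The enum workaround can be sketched as below. This is a simplified guess at the shape, not the PR's code: the trait methods, the `u8` bit sets, and `empty_for` are all hypothetical. The point is only that a trait with a receiver-less method returning `Self` cannot be used as `dyn`, while an enum with one variant per language can be stored directly in the metadata struct.

```rust
/// Hypothetical per-language flag set. `Copy` requires `Sized`, and `empty()`
/// has no receiver and returns `Self`, so `Box<dyn DialectFlags>` is rejected
/// by the compiler.
pub trait DialectFlags: Copy + Sized {
    fn empty() -> Self;
    fn is_empty(&self) -> bool;
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct EnglishDialectFlags(pub u8);

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PortugueseDialectFlags(pub u8);

impl DialectFlags for EnglishDialectFlags {
    fn empty() -> Self { Self(0) }
    fn is_empty(&self) -> bool { self.0 == 0 }
}

impl DialectFlags for PortugueseDialectFlags {
    fn empty() -> Self { Self(0) }
    fn is_empty(&self) -> bool { self.0 == 0 }
}

/// Language without its dialect.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LanguageFamily {
    English,
    Portuguese,
}

/// One variant per language instead of a trait object.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DialectFlagsEnum {
    English(EnglishDialectFlags),
    Portuguese(PortugueseDialectFlags),
}

impl DialectFlagsEnum {
    /// The enum-level function takes the language family to pick a variant.
    pub fn empty_for(family: LanguageFamily) -> Self {
        match family {
            LanguageFamily::English => Self::English(EnglishDialectFlags::empty()),
            LanguageFamily::Portuguese => Self::Portuguese(PortugueseDialectFlags::empty()),
        }
    }

    /// Dispatches to the specific implementation by matching on the variant.
    pub fn is_empty(&self) -> bool {
        match self {
            Self::English(flags) => flags.is_empty(),
            Self::Portuguese(flags) => flags.is_empty(),
        }
    }
}

/// Simplified metadata struct holding the enum.
pub struct DictWordMetadata {
    pub dialects: DialectFlagsEnum,
}
```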
Adapting what needs to be adapted from Language to LanguageFamily
Basically EnglishDialect::American =>
Language::English(EnglishDialect::American) everywhere
LintGroup::new_curated is used
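
The relationship between the two enums might look like the following sketch. The variant lists and the `From` conversion are assumptions; only the names Language, LanguageFamily, and EnglishDialect come from the commit messages, and LintGroup::new_curated is deliberately not reproduced here.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EnglishDialect {
    American,
    British,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PortugueseDialect {
    Brazilian,
    European,
}

/// Language together with its dialect.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    English(EnglishDialect),
    Portuguese(PortugueseDialect),
}

/// Language without a dialect.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LanguageFamily {
    English,
    Portuguese,
}

impl From<Language> for LanguageFamily {
    /// Drops the dialect and keeps only the language.
    fn from(language: Language) -> Self {
        match language {
            Language::English(_) => LanguageFamily::English,
            Language::Portuguese(_) => LanguageFamily::Portuguese,
        }
    }
}
```

A call site that used to pass `EnglishDialect::American` would pass `Language::English(EnglishDialect::American)` instead, and code that only cares about the language can convert with `LanguageFamily::from(language)`.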
Dante Ramacciotti added 2 commits November 24, 2025 17:19
Added the language argument to the assert_suggestion_result function.