Add aliases for encodings #104

Merged
merged 3 commits into from
Oct 22, 2024

tfausak
Owner

@tfausak tfausak commented Sep 11, 2024

Fixes #103.

Still to do:

  • Add documentation. Some basic usage examples would be nice.
  • Bike shed naming. How should lazy and strict be differentiated? What about little endian and big endian?
  • Consider deprecating ISO_8859_1 in favor of LATIN_1, which is both shorter and what the text package calls it anyway.

@tfausak
Owner Author

tfausak commented Sep 11, 2024

How should lazy and strict be differentiated?

Options:

  • As it is now, with *Strict and *Lazy suffixes.
  • Make lazy variants have no suffix and strict variants have a *' suffix.
  • Make strict variants have no suffix and lazy variants have *Lazy suffixes.
  • As above, but with prefixes instead of suffixes.
  • As above, but with abbreviations (S & L) instead.
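
To make the options concrete, here is a rough sketch of how a few of them would read as type aliases. It is only an illustration: `Encoded` is a hypothetical stand-in for the library's tagged wrapper, and the alias names (`Utf8Strict`, `Utf8Lazy`, `Utf8S`, `Utf8L`) are placeholders for the candidates under discussion, not names the library is known to export.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}

module Main where

import qualified Data.ByteString as Strict
import qualified Data.ByteString.Lazy as Lazy
import GHC.TypeLits (Symbol)

-- Hypothetical stand-in for an encoding-tagged wrapper.
-- The phantom type-level string records which encoding the bytes use.
newtype Encoded (name :: Symbol) a = Encoded a
  deriving (Eq, Show)

-- Assumed shape of the existing tag: an encoding name awaiting a payload type.
type UTF_8 = Encoded "UTF-8"

-- Option: explicit *Strict and *Lazy suffixes.
type Utf8Strict = UTF_8 Strict.ByteString
type Utf8Lazy = UTF_8 Lazy.ByteString

-- Option: abbreviated suffixes (S and L).
type Utf8S = UTF_8 Strict.ByteString
type Utf8L = UTF_8 Lazy.ByteString

main :: IO ()
main = do
  let strictBytes = Encoded (Strict.pack [104, 105]) :: Utf8S
      lazyBytes = Encoded (Lazy.pack [104, 105]) :: Utf8L
  -- Aliases are interchangeable with the spelled-out type,
  -- so both naming schemes refer to exactly the same values.
  print (strictBytes == (Encoded (Strict.pack [104, 105]) :: Utf8Strict))
  print (lazyBytes == (Encoded (Lazy.pack [104, 105]) :: Utf8Lazy))
```

Because these are plain type synonyms, the choice between schemes is purely about readability at call sites; nothing changes at the type level.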

@tfausak
Owner Author

tfausak commented Sep 11, 2024

What about little endian and big endian?

Options:

  • As it is now, with *le and *be suffixes.
  • Same as above, but with uppercase suffixes instead.
  • Pick a default to have no suffix and only apply the suffix to the other one.
  • Provide an option without the suffix that attempts to figure out the correct endianness.

https://en.wikipedia.org/wiki/UTF-16#Byte-order_encoding_schemes
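
For the last option, "figuring out" the endianness would most likely mean looking for a byte order mark (U+FEFF) at the start of the input. The following is a minimal sketch of that idea, assuming strict byte strings; `Endianness` and `detectUtf16` are hypothetical names used for illustration and are not part of the library.

```haskell
module Main where

import qualified Data.ByteString as B

data Endianness = LittleEndian | BigEndian
  deriving (Eq, Show)

-- Inspect the first two bytes for a UTF-16 byte order mark.
-- On a match, return the detected byte order and the input with the mark
-- removed. Without a mark, return Nothing and let the caller pick a fallback.
detectUtf16 :: B.ByteString -> Maybe (Endianness, B.ByteString)
detectUtf16 bytes = case B.unpack (B.take 2 bytes) of
  [0xFF, 0xFE] -> Just (LittleEndian, B.drop 2 bytes)
  [0xFE, 0xFF] -> Just (BigEndian, B.drop 2 bytes)
  _ -> Nothing

main :: IO ()
main = do
  -- "h" encoded as UTF-16LE with a leading byte order mark.
  print (detectUtf16 (B.pack [0xFF, 0xFE, 0x68, 0x00]))
  -- The same character as UTF-16BE with a leading byte order mark.
  print (detectUtf16 (B.pack [0xFE, 0xFF, 0x00, 0x68]))
  -- No byte order mark, so the byte order cannot be determined this way.
  print (detectUtf16 (B.pack [0x68, 0x00]))
```

The open design question is the fallback when no mark is present. RFC 2781 recommends treating such input as big endian, but an unsuffixed alias would have to commit to some choice either way.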

@tfausak
Owner Author

tfausak commented Sep 11, 2024

It may be good to avoid using "strict" since it could mean either "not lazy" or "not lenient", both of which are applicable to encodings.

@tfausak
Owner Author

tfausak commented Sep 16, 2024

Another alternative: Only provide aliases for strict byte strings.

type Utf8 = UTF_8 ByteString
via @Utf8 "..." :: ByteString
via @(UTF_8 LazyByteString) "..." :: LazyByteString

I have a hunch that conversions for strict byte strings are more common, but I don't know how to prove it. And if I later want to add aliases for lazy byte strings, I'll be right back here.

@tfausak tfausak merged commit 7e7cd44 into main Oct 22, 2024
11 checks passed
@tfausak tfausak deleted the gh-103-encoding-aliases branch October 22, 2024 15:51