Strictness and laziness ergonomics #317

Closed
Kleidukos opened this issue Mar 14, 2021 · 18 comments · Fixed by #547

Comments

@Kleidukos
Member

The fact that we have both lazy and strict Text types without any marker outside of their namespace is rather unfortunate.
In pretty big codebases or modules, it's also easy to forget that Data.Text.Lazy was imported.

This is why, inspired by Kowainik's work, I would like to introduce the following type alias:

import qualified Data.Text.Lazy as LT

-- | Type synonym for 'Data.Text.Lazy.Text'.
type LText = LT.Text

What it does

  • Allows a contributor to signal clearly to other contributors that the Text flavour in use is lazy, which reduces their cognitive overhead. This is very useful in a codebase maintained by multiple people over a long time, with new contributors being onboarded.

What it does not do

  • Provide a magic tool that solves every problem under the sun
  • Force you to use it
  • Kill your kitten

A similar ticket was opened in the haskell/bytestring repo: haskell/bytestring#375

@kozross

kozross commented Mar 15, 2021

In my experience, type synonyms aren't that great. They're expanded sometimes by GHC (never consistently), which means you can easily end up with confusing messaging from the compiler. This basically forces you to keep this information in your head at all times.

I certainly do agree that there needs to be a much stronger distinction between the two Text flavours. I just don't think this is a good way to do this.

@Kleidukos
Member Author

@kozross While I can understand this argument for Servant's type aliases, which are pretty much indecipherable without good knowledge of the library's inner workings, I have a hard time imagining being confused by Data.Text.Lazy.Text vs. LText.
The target audience is well aware that they are using this type alias to be more explicit about the lazy nature of the type, and it doesn't hide complex type machinery behind it.

@parsonsmatt

One irritating thing that I see a lot in Haddock is something like:

toLazyText :: Foo -> Text
toStrictText :: Foo -> Text

An LText alias would be much easier to follow and disambiguate in the documentation, which would be great not just for text but also any library that depends on it.
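As a minimal sketch of that point, the two signatures could be written with aliases so that each result type is visible at a glance. The aliases `LText` and `SText`, the `Foo` type, and the function bodies are all assumptions made for illustration; none of them is taken from text itself.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as LT

-- Local aliases, defined here only for illustration.
type LText = LT.Text
type SText = T.Text

-- Hypothetical stand-in for the Foo in the signatures above.
newtype Foo = Foo Int

-- The signature now says which flavour is returned.
toLazyText :: Foo -> LText
toLazyText (Foo n) = LT.pack (show n)

toStrictText :: Foo -> SText
toStrictText (Foo n) = T.pack (show n)
```

Because the signatures are written with the alias, a reader of the source no longer needs to check which module `Text` was imported from.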

One way to cut the Gordian knot here is to deprecate Lazy entirely and have users defer to streaming abstractions for these use cases. I've never been a fan of that dichotomy, and it mostly seems useful for lazy IO, which I'm also not a huge fan of.

@Kleidukos
Member Author

We should push for a broader adoption of proper streaming abstractions instead of Lazy Text for IO.
@parsonsmatt: Would you be open to the idea of publishing an article / tutorial on how to better replace lazy text / bytestring with, idk pipes for example?

@parsonsmatt

If we decide that deprecating the Lazy stuff is worthwhile then I'd be happy to write for the cause.

@Kleidukos
Member Author

Kleidukos commented Mar 15, 2021

To be quite honest, I don't think it will ever be considered unless we are certain that the ecosystem is ready for the migration (which means pedagogical materials for learners, adoption by popular libraries and projects, etc).
And this is a cultural change that I see happening over the course of several years.

@kozross

kozross commented Mar 15, 2021

I agree with @parsonsmatt - lazy Text should go. I would also be quite happy to assist with the necessary writing.

@Bodigrim
Contributor

No one forces people to use lazy Text if they do not want to. As soon as the majority of the ecosystem is convinced to use something else and has switched, we will be in a good position to discuss its deprecation.

I understand why people damn lazy I/O. But it is only a small facet of lazy Text. What would happen with Builder, for example?

Deprecating Data.Text.Lazy.IO could be a more actionable item, if someone wishes to raise it as a separate issue.

@parsonsmatt

I talked to Snoyman about Builder and he suggested that it might be worth deprecating, as well. The gist of it is that a Builder type is almost always used for I/O, at which point you should be using a ByteString builder and not a Text builder anyway.
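To make the suggestion concrete, here is a rough sketch of building output with a ByteString builder and writing it straight to a handle, without going through a Text builder. The names `renderGreeting`, `renderGreetingBytes`, and `writeGreeting` are hypothetical; the library functions used come from `Data.ByteString.Builder` and `Data.Text.Encoding`.

```haskell
import qualified Data.ByteString.Builder as BB
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text as T
import qualified Data.Text.Encoding as TE
import System.IO (Handle)

-- Hypothetical renderer: strict Text goes directly into a
-- ByteString builder as UTF-8.
renderGreeting :: T.Text -> BB.Builder
renderGreeting name =
  BB.stringUtf8 "Hello, " <> TE.encodeUtf8Builder name <> BB.charUtf8 '\n'

-- Runs the builder to bytes, which is handy for inspecting the output.
renderGreetingBytes :: T.Text -> BL.ByteString
renderGreetingBytes = BB.toLazyByteString . renderGreeting

-- Writes the rendered bytes to a handle.
writeGreeting :: Handle -> T.Text -> IO ()
writeGreeting h = BB.hPutBuilder h . renderGreeting
```

Whether this covers every use of the Text builder is exactly what the thread debates; the sketch only shows the I/O case.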

In any case, deprecating any part of text's API is a bit out of scope for this discussion - my apologies for raising it at all 😅

@kozross

kozross commented Mar 15, 2021

Not to mention that (for ByteString at least), there are faster Builder implementations that don't rely on the corresponding lazy type. This is quite likely true for Text as well. It's not even worth it for builders IMHO.

@Bodigrim
Contributor

I must reiterate: no one forces people to use lazy Text or Builder. Feel free to promote alternatives and convince others to switch. Simply breaking thousands of packages is not a sensible option IMO. Users do not normally have the option to stick to an older text, since it is a boot package.

Sticking to the topic, I think the alias is a good idea. To avoid undesirable expansion of type synonyms, we can actually do things the other way around, defining data LazyText = ... and type Text = LazyText.

@kozross

kozross commented Mar 15, 2021

@Bodigrim I completely disagree. Try using a library which uses lazy Text - then you have to interact with it, whether you like it or not. If that's not 'forced', I dunno what is - especially if the library isn't really about text as-such.

But I agree, this is off-topic, so I'll stop.

@kozross

kozross commented Mar 16, 2021

@Kleidukos I'm not sure such a community migration will ever happen unless driven aggressively by us. Alternatives and better solutions, both semantically and performance-wise, to lazy Text, for both streaming and building, have been around for nine years now, and more get added at least every couple of months.

@Bodigrim
Contributor

Back to the topic, bytestring has added LazyByteString and StrictByteString. I'm in favor of adding type LazyText = Data.Text.Lazy.Text at least, and do not mind type StrictText = Data.Text.Text as well.

Doing it the other way around (defining data LazyText = ... and type Text = LazyText) is arguably better, because expanded type synonyms look nice, but this is a breaking change: if a third-party library used to define an instance Foo Text, it has to enable TypeSynonymInstances now. However, I think this is acceptable if we are looking at a major release.
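The breaking change described above can be sketched with a toy module. The `LazyText` definition here is a simplified stand-in and not the real representation, and the `Describe` class is hypothetical; the point is only that the instance head now mentions a synonym.

```haskell
{-# LANGUAGE TypeSynonymInstances #-}

-- Toy stand-in for the real lazy type (assumed shape, for illustration).
data LazyText = Empty | Chunk String LazyText

-- The old name becomes a synonym for the new data type.
type Text = LazyText

-- Hypothetical third-party class.
class Describe a where
  describe :: a -> String

-- Under Haskell2010 this instance head is rejected unless
-- TypeSynonymInstances (or FlexibleInstances, which implies it) is on.
instance Describe Text where
  describe _ = "lazy text"
```

Language editions that already enable the extension by default would accept the instance unchanged, so the breakage would mostly affect code compiled as plain Haskell2010.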

@Kleidukos
Member Author

@Bodigrim I don't know the release policy of text very well, but wouldn't it be better to first introduce the type aliases, and in a couple of releases have them as the Real Types? And we can litter the changelog with "new text names phase1, be ready for when the next breaking release arrives"?

@Bodigrim
Contributor

@Kleidukos I expect that the next release of text will be a major one, and then we'll stick to it for a couple of years or so. Staged introduction makes sense, when there is no way to be compatible with both versions of the package without CPP, but this is not the case here.

@Bodigrim
Contributor

Bodigrim commented Mar 21, 2021

Now back to off-topic :)

Try using a library which uses lazy Text - then you have to interact with it, whether you like it or not. If that's not 'forced', I dunno what is - especially if the library isn't really about text as-such.

If a library uses a lazy Text where a strict one should be used, then it is arguably a wrong interface (but not a fault of lazy Text per se). You can convert it to a strict Text at the API boundary, not a big deal IMHO.
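A minimal sketch of converting at the boundary, assuming a third-party function that returns lazy Text. `thirdPartyRender` and `render` are hypothetical names; `LT.toStrict` is the conversion provided by `Data.Text.Lazy`.

```haskell
import qualified Data.Text as T
import qualified Data.Text.Lazy as LT

-- Hypothetical third-party function that returns lazy Text.
thirdPartyRender :: Int -> LT.Text
thirdPartyRender n = LT.pack ("value: " ++ show n)

-- Thin wrapper at the API boundary: the rest of the codebase
-- only ever sees strict Text.
render :: Int -> T.Text
render = LT.toStrict . thirdPartyRender
```

The conversion forces and copies the chunks into one strict value, so it suits results that fit comfortably in memory.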

However, if laziness is there for a purpose, then I'm not sure that returning a lazy Text is bad. It is an unopinionated lingua franca. What would you prefer instead? There is no standard streaming library, and I cannot reasonably expect that a streaming interface chosen by a third-party library coincides with my own choice, so I'll end up depending on both and writing my own plumbing. This is not cool IMHO.

@kozross

kozross commented Mar 21, 2021

@Bodigrim I agree that the lack of a standard streaming library is the real problem here.

Kleidukos added a commit to Kleidukos/text that referenced this issue Nov 27, 2023
Kleidukos added a commit to Kleidukos/text that referenced this issue Nov 27, 2023
Kleidukos added a commit to Kleidukos/text that referenced this issue Nov 27, 2023
Lysxia pushed a commit that referenced this issue Dec 13, 2023