Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Add non-throwing variants of parse() and to() #6665

Closed
wants to merge 8 commits into from

Conversation

nordlow
Copy link
Contributor

@nordlow nordlow commented Aug 15, 2018

Putting it out in the open just to get early feedback.

See also: https://forum.dlang.org/post/hotgouygobclmesixayw@forum.dlang.org

@dlang-bot
Copy link
Contributor

Thanks for your pull request and interest in making D better, @nordlow! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please verify that your PR follows this checklist:

  • My PR is fully covered with tests (you can see the annotated coverage diff directly on GitHub with CodeCov's browser extension
  • My PR is as minimal as possible (smaller, focused PRs are easier to review than big ones)
  • I have provided a detailed rationale explaining my changes
  • New or modified functions have Ddoc comments (with Params: and Returns:)

Please see CONTRIBUTING.md for more information.


If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment.

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub fetch digger
dub run digger -- build "master + phobos#6665"

@nordlow nordlow force-pushed the try-conv branch 4 times, most recently from 31b89a8 to 7c57fa1 Compare August 15, 2018 12:49
@wilzbach
Copy link
Member

I like the idea (and I'm generally in favor of using more Nullable / Optional in Phobos).
However a few concerns for now:

  • should we really promote a different type and not use Nullable. I'm aware of the problem that Nullable doesn't allow one to store a custom error message. I guess at the very least the API should be similar (maybe we can generalize the Nullable concept like we did with ranges). Also there are lots of nice goodies in Nullable like apply (the analog of Folly's then)
  • expected.result (.get in Nullable) should automatically throw when accessed and I think we might be better off with sth. like get!ConvException. Only Nullable.get(<fallback>) is nothrow

@nordlow
Copy link
Contributor Author

nordlow commented Aug 15, 2018

Sounds good to me.

Two key questions:

  • Is Nullable.apply nothrow?
  • Can we feed Nullable.apply a lambda and still be nothrow?

I'll have a go at your proposal anyway... :)

Update: Changed to using Nullable.

@wilzbach
Copy link
Member

wilzbach commented Aug 15, 2018

Is Nullable.apply nothrow?
Can we feed Nullable.apply a lambda and still be nothrow?

Yes and even @nogc e.g.

phobos/std/typecons.d

Lines 3807 to 3823 in bb769cf

nothrow pure @nogc @safe unittest
{
alias toFloat = i => cast(float) i;
Nullable!int sample;
// apply(null) results in a null `Nullable` of the function's return type.
Nullable!float f = sample.apply!toFloat;
assert(sample.isNull && f.isNull);
sample = 3;
// apply(non-null) calls the function and wraps the result in a `Nullable`.
f = sample.apply!toFloat;
assert(!sample.isNull && !f.isNull);
assert(f.get == 3.0f);
}

See also: https://dlang.org/changelog/2.080.0.html#std-typecons-nullable-apply

That's because Nullable.get doesn't use exceptions, but only assert

@nordlow
Copy link
Contributor Author

nordlow commented Aug 15, 2018

Ok, I'm all in on using Nullable. Let's make this happen!

@jacob-carlborg
Copy link
Contributor

jacob-carlborg commented Aug 15, 2018

I don’t think we should use Nullable, instead implement a proper Optional type. I have an implementation here [1] that implements the range interface and safe access.

[1] https://github.com/jacob-carlborg/dlp/blob/master/source/dlp/core/optional.d

@nordlow
Copy link
Contributor Author

nordlow commented Aug 15, 2018

Can you please describe why we need an Optional instead of a Nullable?

Are there any other reasons than the above mentioned?

@jacob-carlborg
Copy link
Contributor

Can you please describe why we need an Optional instead of a Nullable? Are there any other reasons than the above mentioned?

No, it's mostly because it has a range interface and safe access interface. I also think it's a better, more descriptive name.

@nordlow
Copy link
Contributor Author

nordlow commented Aug 15, 2018

How the is the interface @safer?

AFAICT, Optional.get is as (un)safe as Nullable.get is, which is the critical function when it comes to (memory) safety. Both may result in unexpected behaviour when compiled without asserts and called when isPresent is false.

This bug can at least be avoided if we instead use Nullable.apply and port it to Optional.

I agree that Optional is better naming, though. More similar to other languages naming, such as C++'s std::optional.

@jmdavis
Copy link
Member

jmdavis commented Aug 15, 2018

No, it's mostly because it has a range interface and safe access interface. I also think it's a better, more descriptive name.

We already have Nullable in Phobos, and this is exactly the sort of thing it's designed for - the case where a variable may or may not have a value.

@jacob-carlborg
Copy link
Contributor

jacob-carlborg commented Aug 15, 2018

How the is the interface @safer?

I didn't say @safe, I said "safe access", without the at sign. "safe" might not be the best name but what I mean is you can do this:

class Bar { int a; }
class Foo { Bar bar; }
auto foo = none!Foo;
auto a = foo.bar.a; // safe access, "a" will be an optional int
auto b = a.or(3); // unwraps "a" if it has a value, otherwise returns 3
auto c = a.map!(i => i * 4); // it also supports the range interface

If the standard algorithms are used, the lambda will only be called if the optional has a value. When using map it's basically like apply for Nullable but with a standard interface.

@nordlow
Copy link
Contributor Author

nordlow commented Aug 15, 2018

So lets try adding your opDispatch to Nullable then!

@PetarKirov
Copy link
Member

How about until we all settle on the return type, we simply use auto? That way the exact return type becomes an implementation detail and in future, when we have made the necessary changes to Nullable or we have added an new type, we could switch from auto to an explicit return type.

@PetarKirov
Copy link
Member

Another option would be add these functions first to std.experimental, so we would be free to change the return type later.

@jacob-carlborg
Copy link
Contributor

So lets try adding your opDispatch to Nullable then!

Please no. I suggest we deprecate Nullable and add a new implementation called Optional. The name Nullable is poorly chosen. The name Nullable suggests it should only be used for value types which otherwise cannot be assigned null. But Optional is supposed to be used for all types, even those that can be assigned null. An optional pointer would mean that the pointer might not point to something, a plain pointer would mean that it always points to something, it can never be null (I know that this is not enforced by the language).

@aliak00
Copy link
Contributor

aliak00 commented Aug 16, 2018

I agree Nullable can be used here but it's says nothing about "this value may be present" in a truely generic sense - null is a very valid part of some value type domains - e.g. pointers.

An optional is explicit in intent and it'd be a much nicer interface with an Optional. Also, once we start using Nullable in phobos for stuff like this it's going to be very hard to go back :/

So super big +1 for putting this in std.experimental until we have an Optional type.

@jacob-carlborg
Copy link
Contributor

Yes, I agree. I meant get should not be used, regardless if it's called directly or through alias this. Functions like getOrElse, map or filter should be used instead.

@nordlow
Copy link
Contributor Author

nordlow commented Aug 20, 2018

I've added a very basic version of Expected here. Would you be so kind to add some comments?

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Aug 20, 2018

I can't seem to add comments to an ordinary file. Could you open a mini-PR and link it here?

edit: At a glance, you probably want to use the .moveEmplace/.destroy logic from this PR. Since you do use union, you also need opEquals and toHash.

edit: destroy does not do the right thing for class references! You want to detect that case and only set your reference to null.

edit: It seems to me there is little reason for this to not be a replacement for Algebraic rather than Nullable?

@jmdavis
Copy link
Member

jmdavis commented Aug 20, 2018

Yes, I agree. I meant get should not be used, regardless if it's called directly or through alias this. Functions like getOrElse, map or filter should be used instead.

I can't say that I agree with that. Given that Nullable is supposed to act more or less like a pointer, using get makes perfect sense to me and is what I would expect to be used. It's just that if you don't know for sure that it has a value, you need to check isNull first. It doesn't make any sense to me that ranges would be involved at all unless the type being wrapped was a range.

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Aug 20, 2018

The logic of using ranges is, I think, that a type which may either contain one value or no value, ie. Nullable, is sort of a special case of a type which may contain any number of values, ie. ranges.

Also possibly to gain a little of the generality of functional languages with their ability to write algorithms that apply to any arbitrary wrapper/container type, ie. monads.

edit: The logic of not using get at all, is that get allows you to write wrong code. It's not as bad as current Nullable, which allows you to write wrong code completely by accident, but it's still not very nice.

@jacob-carlborg
Copy link
Contributor

Optional is not supposed to act like a pointer, in my opinion. Perhaps Nullable is.

@nordlow
Copy link
Contributor Author

nordlow commented Aug 20, 2018

Is it ok to add Expected to this PR for now in order for you to make comments in it?

@jacob-carlborg
Copy link
Contributor

Is it ok to add Expected to this PR for now in order for you to make comments in it?

It would be better to have a separate PR. Would it make sense to build a new Optional type as well and have it be based on each other?

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Aug 20, 2018

Maybe VariantN should be the backing type in any case. It seems to solve a very similar problem. I don't know if it has open issues though.

edit: Ah, the flexibility of VariantN seems quite overkill. Looking at the implementation, I honestly have no clue why it needs to do things like encode operations in a serialized format. Maybe there's a need for a VariantN that's not quite as baroque?

@jmdavis
Copy link
Member

jmdavis commented Aug 20, 2018

Optional is not supposed to act like a pointer, in my opinion. Perhaps Nullable is.

Honestly, the way Nullable works is pretty much exactly what I'd expect from an Optional type except maybe for the issue of alias this. I really don't see any difference between a type emulating a pointer and one that just optionally holds a value.

Now, having a type that either holds a value or hold some sort of error type as proposed here is something different, but in its case, it basically holds a value regardless. It's just that it could be the target type or the error type.

Maybe VariantN should be the backing type in any case. I don't know if it has open issues.

Using std.variant does tend to cause subtle problems, but union has at least some of those problems on its own already. So, if the solution is going to contain a union, then maybe it would make more sense to use VariantN. Personally, when I've created types like Expected in the past, I've just used separate variables for the value and the error, since it's simpler and avoids the problems you typically get with unions. However, it does have the downside of making the struct take up more space. I don't know which approach is ultimately better.

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Aug 20, 2018

I've just used separate variables for the value and the error

This runs into the problems that I've recently fixed in Nullable (by using a union 😄) involving structs whose .init state is not a valid type instance, where destruction of the sum type leads to a false destructor call on the field, possibly causing crashes. So I definitely recommend using a union, in order to convince DMD to let us manage the destruction of the contained value manually.

The Turducken post and the Nullable unittests may provide testcases.

edit: Note that the struct part of the wrapper may not be necessary, since I misunderstood how unions work. Try it out! edit: Yes, it is necessary, since moveEmplace refuses to emplace into an immutable variable otherwise.

edit: After looking at VariantN I personally recommend against any use of it going forward. It terrifies me.

@aliak00
Copy link
Contributor

aliak00 commented Aug 20, 2018

A union seems like a pretty good approach to back an Expected type - here's a dirty proof of concept based on SumType by @pbackus.

I really don't see any difference between a type emulating a pointer and one that just optionally holds a value.

  1. The semantics are completely different. Is "no value" a valid value? (this is a valid question for a pointer type, but not for optional). Who owns the value (valid for pointer, but not for an optional value type, barring indirections which applies to any value type). 2) safety - a pointer does not force you to check for presence in anyway. An optional, depending on how it's implemented, does. 3) intent - what does a pointer return mean? An optional is clear as day - ok this overlaps with 1.

While you can treat a pointer the same sometime, it becomes conventional and not enforceable.

@FeepingCreature
Copy link
Contributor

FeepingCreature commented Aug 20, 2018

here's a dirty proof of concept based on SumType by @pbackus.

I played with it a bit, and added the Turducken testcase: code. Seems to eat it fine.

@pbackus
Copy link
Contributor

pbackus commented Aug 28, 2018

I'm working on a fleshed-out version of Expected. Comments welcome (though it's still pretty rough).

@andralex
Copy link
Member

Just a note - returning either Nulltable or Optional from parsing functions makes it impossible to distinguish the cause of error when there could be more than one (e.g. end of stream vs. overflow). With Expected the result contains arbitrarily rich information in case of failure.

@nordlow
Copy link
Contributor Author

nordlow commented May 18, 2021

Just a note - returning either Nulltable or Optional from parsing functions makes it impossible to distinguish the cause of error when there could be more than one (e.g. end of stream vs. overflow). With Expected the result contains arbitrarily rich information in case of failure.

Is Expected different from Result?

@thewilsonator
Copy link
Contributor

Result is usually

struct Result(T) { T thing; bool error; } 

whereas expected is more like

struct Expected(T)
{
     union { T thing; struct { void* errorObject; TypeInfo ti;} }
     bool error;
}

@pbackus
Copy link
Contributor

pbackus commented May 18, 2021

Result is typically a union of an expected value and an error value (e.g. in Rust, F#, OCaml).

Expected(T, E) is just another name for Result(T, E); Expected(T) is equivalent to Result(T, E) with E fixed to some specific type, such as Exception.

Personally I would recommend against using the name "expected". The most common names for this type in functional-programming languages are "result" and "either".

@andralex
Copy link
Member

One issue with Result!(T, E) is that it commits to the error type, which in our case means we need to use heap allocation for the exception. I wish we had a few more options for the error case:

  • Exception
  • a lambda
  • perhaps a string describing the error

Wrt names I find Result is better than Either because the latter gives equal importance to the branches, whereas Result can be easier thought as "normal path and contingency path".

@nordlow
Copy link
Contributor Author

nordlow commented May 18, 2021

One issue with Result!(T, E) is that it commits to the error type, which in our case means we need to use heap allocation for the exception. I wish we had a few more options for the error case:

How does Result!(T, E) limit E to be a heap-allocated exception? Do you mean instead of using the more general

Result!(T, alias E)

?

@andralex
Copy link
Member

@nordlow I just meant it limits the error to one type. I think it would be better (without adding other types to Result) to allow not only allocated exceptions but lambdas that throw exceptions and strings from which exceptions can be constructed.

In brief: Result!(T, Exception) allocates the exception eagerly even if not thrown. A better approach would be Result!T that makes the exception construction lazy as well.

@pbackus
Copy link
Contributor

pbackus commented May 18, 2021

@andralex If we want to store information from which an exception can be constructed, rather than an actual Exception, then there's no reason to use Exception as the error type. Instead, we can use the type of that information directly; e.g.,

alias ExceptionArgs = Tuple!(string, "msg", string, "file", size_t, "line");

Result!(T, ExceptionArgs) fun(...)
{
    // ...
    if (ok)
        return typeof(return)(T(...));
    else
        return typeof(return)(ExceptionArgs("no good", __FILE__, __LINE__));
}

There's plenty of room for polish here, of course, but the basic approach ought to work.

@andralex
Copy link
Member

@pbackus understood, though exposing the arguments all the way to the user-facing API is not really nice (what if you change them later etc).

Users should have a simple mechanism to get some uniform Result!T type they can they inspect - is this a T, and if not, what was the problem? Give me an integral error code, or a string description, or an Exception object I can throw around.

@nordlow
Copy link
Contributor Author

nordlow commented May 19, 2021

Users should have a simple mechanism to get some uniform Result!T type they can they inspect - is this a T, and if not, what was the problem? Give me an integral error code, or a string description, or an Exception object I can throw around.

Note that SIL's Result type by design (currently) doesn't take the error type as a template parameter because of the same reason.

@pbackus
Copy link
Contributor

pbackus commented May 19, 2021

This sounds like a mechanism-vs-policy issue. Certainly it would be ideal if we could live in a world where all functions reported errors in some uniform way, and it might be a good idea to adopt such a policy for Phobos, but we do not live in such a world, and I do not think that a general-purpose Result implementation is the right place to encode such a policy. Rather, Result should provide the mechanism, and leave the policy to its users.

@LightBender LightBender added the Review:Phantom Zone Has value/information for future work, but closed for now label Oct 27, 2024
@LightBender
Copy link
Contributor

This idea would be better pursued as a low-level API design for Phobos 3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Merge:Needs Rebase Merge:stalled Review:Phantom Zone Has value/information for future work, but closed for now
Projects
None yet
Development

Successfully merging this pull request may close these issues.