-
-
Notifications
You must be signed in to change notification settings - Fork 5.6k
Add support for namespaced method call #25052
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Interesting approach. Could we also make |
Although I'm not sure I understand how |
You mean making the call implicit? I think that's too "matlab".....
Because it's an example of a function that operate on the object. As explained in the comment, it is true that the current approach does work with functions defined on abstract type yet though it will just work with base functions. |
Mentioning this plan during the very long discussion of that change would have been helpful. |
I do think this is a problem that needs a solution and had pondered and occasionally suggested similar solutions. I'd prefer the
On the whole, I'd prefer to think through what we might need to reserve in order to make this possible to opt into in the future. |
Just a check, is this effectively something similar to https://en.wikipedia.org/wiki/Uniform_Function_Call_Syntax? |
💯 |
@KristofferC, it's similar as far as I can tell, but there's the additional namespacing consideration. In Julia at least (and maybe some of these other languages), you can't just say that it makes |
I did. #16307 (comment)
As mentioned above, I'm fine with that too as long as the breaking part is merged. I also explained why I kind of prefer a distinct syntax than making it look similar to field access. It really isn't field access after all.
That's exactly the argument about making |
Not just that. |
I hugely dislike this. It basically splits the system into two, where you need to constantly wonder which style of function call to use. This doesn't really solve the namespace problem; rather it adds a different kind of dispatch that has a different kind of namespace problem. It doesn't work for operators and it doesn't work for functions dispatched on a non-first argument. There are cases where this feature seems nice, but I believe there is a net increase in complexity and confusion. |
Based on the wide availability of this feature in other languages and the existence of namespace in julia, I disagree.
It doesn't add a different kind of dispatch, nothing more than allowing modules to have/be their own namespace that we have now already. I don't see what new namespace problem this add since all what this does is to remove cases where import (namespace) is needed so it'll have strictly less namespace problem. Any namespace issue on the object's namespace would be exactly the same as the namespace issue of the corresponding module and will never be a different kind/new problem.
Correct. I'm not saying it does. That's why the
Also, to
after reading that wiki page, no and not even a little bit. IIUC that concept is about making sth like |
I'm somewhat sympathetic to this feature, but @JeffBezanson's concerns are very legit and I really don't think we should rush this sort of thing. One observation: if we ever add generic function merging, it also addresses this problem and in many ways in a better way than this does. |
No, that will not solve conflicts with local variables or other non-function objects. |
For those we have module prefixing. |
And that's exactly the problem this solves, a syntax sugar for just that. See the "instead of" part in the very first post. |
Overall I love the symmetry of Julia's dispatch rules. While I understand some of the attractions of this feature, to me the symmetry is not something to give up on lightly. |
You don't need to though. This is purely a syntax sugar to change namespace lookup and it doesn't affect dispatch at all. You can still use the same dispatch on this function and they behave exactly the same as normal dispatch. julia> struct A
end
julia> struct B
end
julia> f(::A) = 1
f (generic function with 1 method)
julia> f(::B) = 2
f (generic function with 2 methods)
julia> A()-->f()
1
julia> B()-->f()
2 In fact, adding custom dispatch to the lookup is invalid under the current implementation. As such, it doesn't add any asymmetry or complexity more than allowing the same name to be bound to different functions/objects in different modules. |
As Yichao just posted, this adds a new namespace/scope and doesn't use dispatch. But whether this "solves" the problem is certainly a fair question, regardless of the nomenclature.
The advantage here is that it divides it according to a namespacing question, while leaving dispatch unchanged. With this PR, you can always use the existing variable scope to call the same function. I'm not sure if I saw Yichao emphasizing this above, but one of the goals of this PR is to dissuade people from using |
|
triage: Not for 1.0 |
And types too.
So just to merge the breaking part would be good enough, as I said in the top post. |
4b30685
to
6ad6b34
Compare
It's true that this leaves the current dispatch system alone, but my objection is that it adds a new single dispatch OO system. The module of an object's type is used as the class pointer, which is used for dispatch. It's implemented using something we previously called a namespace lookup, but since the namespace lookup is hidden inside a type-based lookup, in reality it's a method lookup. Another way to look at it is that currently, which module a function or type is defined in is just a matter of code organization and API organization. But with this feature, it's semantically significant. Here's an example scenario. User code has a bunch of calls to @yuyichao I know you didn't claim that this works for operators or non-first arguments; my point is that a true "solution" would need to handle everything. As I said, this doesn't solve a problem for julia, but rather introduces a different, non-julia object system that doesn't have the problem. I claim that in the long run, having two object systems is worse than having one with a slight problem (introduced by its more powerful model). I don't want my code to be a random mix of |
No it's not, not more than looking up the function in the wrong namespace won't return the right thing like what we have now. The user always need to know the namespace the function needs to be found and properly use import/qualified name for that. With this, the module the function and type defined in is still insignificant, it is the implementation detail/an organization decision for the types' author how and what function needs to be brought into the scope of the type's namespace.
As any API for the type, it needs to be well defined and documented. It's unclear what scenario you are talking about but here's a few guesses,
So I don't see why this is true. The availability of a method call can be relied on as much and as little as the document. Nothing different from what we have in every other cases.
There's nothing unsafe about it, there's no where in this that says that a function that takes the object as the first argument has to be called with the new syntax. If you don't like it and don't want your functions to be possibly called this way, don't put them in the same module. If you want to use module to further organize the functions but still want to make them callable this way, simply import them into the right module. It is a new feature so it'll obviously affect the best practice to do things but nothing more than that. Finally, I should point out that every single thing mentioned above about reliability apply directly to
And I claim that I'm introducing a standard way for doing what can be done in a more messy way anyway that will have more problems. It is also as julia as I think it can be. In it's current form, it doesn't introduce a new namespace, it doesn't introduce new dispatch rules and after the lookup dispatch works exactly the same they would work otherwise and that can even be improved if function defined on abstract type can be handled better. It's entirely built on the current dispatch system and can't work without it. It is not even hiding anything about it (everything about dispatch works exactly the same after the lookup, you can even get method error due to mismatch on the first argument). As such I will definitely not call this a completely different non-julia object system.
And the point here is not how powerful it is. It is true that the current dispatch system can mimic everything that can be done elsewhere. However, from code organization and other constraints, it is clear that it can sometimes be not as convenient to actually express those. That's why, as mentioned above, this is strictly only a syntax sugar for a name lookup on top of dispatch and is intentionally made not-powerful so that it won't fight or replace dispatch. I would not like this syntax to be sensitive to anything that's not equivalent to a name lookup in the types namespace. The namespace is AFAICT a fundamental issue. Nothing else mentioned in this thread or anywhere else is even close at resolving that. It is certainly not a big problem in base but I've been running into the issue in packages over and over again. |
And yet another way to see this is that since the current system is powerful enough to support what may look like a single dispatch without introducing it. Or this is basically a prove that the less powerful use case is indeed covered and can also be done in real code easily. If you want to call any "type based lookup" method lookup or dispatch, I guess I can't argue with that since a type sensitive lookup is the core of the problem. But I think that's too broad a definition and other than that, there's a few more evidents (or maybe at least a few different prospective of the same property) that this is not a single dispatch or a new dispatch system by any mean.
I'll also highlight again that this does not enable you to do anything that you can't possibly do today, only to make doing something easier. This alone already imply that it's a feature you can use if it makes your code easier to use and ignore if not. Combined with the fact that this is useful for many types (i.e. the positive effect is non-zero) and that even the sugar part will be implementable in a non-standard way (i.e. the negative effect all exists without this so there's no new negative effect), this should not encourage more misuse than what will be there otherwise and it should have a net positive effect overall. |
6ad6b34
to
617300e
Compare
The big-picture view of this is that you're advocating for julia code to contain a random mix of method call styles. It makes the language uglier and harder to explain. In fact I don't think there is any crisp explanation for when to use the other style. I don't know how to identify functions that "operate on an object". Every function operates on an object. And since method calls are very nearly the only thing in the language, some vague guidelines won't cut it. The bar for changing something so fundamental is way higher. I don't understand how
"After the lookup" is much more significant than it sounds at first --- this is a double dispatch system, where first one kind of dispatch is used and then another. I understand the argument that it's possible to do things like this anyway, so we might as well have a standard way to do it. However, it doesn't seem likely to me that lots of packages will start abusing dot overloading (or other syntax) to provide |
No, I intend to tell people they can write |
How do you know when to write that? The expression basically means "call the |
I don't see how random this is. I'm not advocating a style. If anything, I said that the new feature should be used only when it is useful so any case that's ambiguous should stay the same.
See above. And also as I explained in the concrete case above in #25052 (comment). The primary target of this feature is when you know that this is an API of the type. And as I also said in the same comment, I agree the current implementation does not make propagating this API from an abstract type easy but it can still be done and can be improved in future version. As that is improved, the usecase of this will expand and the way to decide when to write that will also change.
That's why I mentioned in the top post that the
I'll not comment on if
Practically I don't find that to be an issue at all, see above for current limitation and what we can allow/encourage later.
I disagree,
As I said, if any lookup that's automatically based on the type is called dispatch then sure you can call it that way. However, I don't really see what practical significance it has. It's a dispatch that doesn't allow customization. It's evaluating a single expression ( |
I don't see how you can say that. If this feature gets used, julia code will contain a mix of The specific
Unfortunately language design doesn't work that way. Adding features has a complexity cost that is hard to contain. This is also nearly a tautology: if features are only used in ways that cause no problems, then it's not possible to argue that any feature is bad. We might as well just add every possible feature.
Ok, I can imagine a package documenting that its functions should be called with |
What I mean is that there are some cases that the new syntax is unambiguously better. I'm not saying any case that's not clear enough should switch.
I agree, and that's exactly why I've been repeating about the constraint I want to have on such a feature. I want it to be equivalent to a namespace lookup, not customizable other than that, and be a context independent evaluation of a single operator. Not all possible features has this property.
The safe answer now for any when should I write |
And also to clarify more, In terms of when I want to have this, I agree this doesn't have to be in 1.0. As long as it (along with the change in style recommendation) can be added without breakage I think this can be polished more before it is merged, which is the original plan anyway. (And the recommendation change is more about the final version with improved support for abstract type, the guide for the current version will mainly affect functions for leaftypes. For functions on abstract it's only a heads up for how to make those more compatible with future changes) In terms of complexity, there are two aspects that I can think of,
I would in general agree with
but by making sure it can be treated completely independently by both the frontend and the runtime, the complexity is entirely contained. This is actually a much harder requirement on a (low level) feature and a sufficient one for containing the complexity so it does not contradict with the hardness of adding a feature with contained complexity. |
I would be ok with reserving the
or even like the following, if we allow continuing a pipeline when the following line starts with
I know that you may say that this proposal already addresses that use case, but I will preemptively say that nice syntax for data pipelining and whether |
Just gathering some empirical evidence for when popular packages could use this to solve namespacing issues. Let's look at the top however many:
This analysis highlights a few things. First of all, the things that clearly benefit from this syntax are most data-oriented (Distributions.jl, DataFrames.jl). However, it doesn't seem to capture most of the namespace pollution. A lot of these libraries have their namespace pollution in terms of their algorithm types. DiffEq has a few hundred algorithms yet always calls And none of this addresses the issue with To me, this points out that the biggest problems with namespace pollution is the amount of short names for the types, not the functions on which they are called. Solving this seems like it would require two things:
Altogether, looking at the most starred Julia packages, this new syntax doesn't seem to really address the real issue, and when it does, that issue is better addressed in other ways. Thus if thinking about it in a cost vs benefit sense, I don't think that it gives a major advantage while it would reduce the readability of cases where it exists (or at least, it would make the syntax different). |
Those not the entire set of use cases this had in mind. I would say the primary goal was to assist with places that need getter/setter-like APIs, where there is a soft distinction between "the API of the type provided by the package" and "functions you can call on it (provided elsewhere)". Some places this might be useful include: HTTP.jl ( This can be beneficial anytime it makes more sense to use the scope from the object to find the method. And is not a big feature since it doesn't alter dispatch, just name-binding-resolution in the scope. (For the DifferentialEquations.jl example, |
I'm getting a bit far from the OP, but please note that I already find piping visually too similar to anonymous functions, making it hard to decode: data |> x -> map(f, x) |> x -> groupby(by, g, x) |> x -> reduce(h, x)
data --> x -> map(f, x) --> x -> groupby(by, g, x) --> x -> reduce(h, x)
data |> map(f, _) |> groupby(by, g, _) |> reduce(h, _) # currying, for comparison's sake The second is even worse, IMO. (Of course, users can solve this with extra carriage returns). |
@vtjnash, you've really hit on all the examples where this syntax would makes us look ridiculous:
Telling people coming from OO languages that you write |
Definitely—had not thought of that. |
Sorry, I feel like I've missed something, because I'm confused what this solves that #24960 does not. If you want method namespacing on object classes like C++, wouldn't you just have Beyond that, this PR seems to add unnecessary additional syntax and complexity (but like I said, I may have missed something important). |
AFAIK the
|
AFAICT, Jameson's PR is already quite friendly to hacky attempts at OOP, minus the vtable version of dynamic dispatch. (But maybe I'm wrong... I'll have to try it out more fully to confirm that). |
I think this is either a straw man, or you got misled by the PR description somewhere. This expression is simply a syntax error. The proposed syntax of this PR follows roughly the same syntax patterns as For a data pipeline, there could be a package providing eager(data) --> map(f) --> groupby(by, g) --> reduce(h) |
No that's a completely different issue. AFAICT what you have still require importing all those names.
As Jameson said, this is useful for any libraries that use OOP style in other languages, this include basically all GUI toolkit and many other operations that has a "main object" to operate on (yes this is vague but it's always pretty easy to decide that for each cases). There's no way this can solve all the problems in all those packages, and FWIW some of the problems mentioned aren't even related. Many of the packages from the list you give are obviously packages that fit the current julia syntax very well particularly because they don't fit the OOP method well and so they are not implemented nicely in other language. I've certainly opt to do things in other language since the syntax fix the problem so much better. In another word, the list you have is a biased selection that does not include usecases that julia doesn't have good support for.
Honestly I don't care what people from OO languages like. I only care about being able to do the same thing. I don't care about people not very familiar with the real advantage of
Bringing that up is not helpful. What's needed is explanation as Jeff gave above and I'm arguing against.
Correct, it's a PR not designed for it but is powerful enough to allow people to write bad implementations. As mentioned above many times, that is exactly why a more standard way is better. |
@vtjnash To clarify, I was merely responding to #25052 (comment) (Stefan's suggestion of I do like the idea of being able to switch between eager and lazy like that :) |
The "standard" OOP way is that member methods are namespaced by types and classes (or even objects). The "standard" Julia / procedural / functional programming way is to use functions namespaced by modules. This PR seems to be syntax sugar emulating an entirely new hybrid of the two. |
I kind of agree. (Though I wouldn't attribute namespace to julia/functional/etc languages) but as I said above, it simply relate types and functions through the module, which is a concept already exist in julia so I don't see much issue with it. I should also note that this idea was not proposed by me and I had the same thought but I now think it is a pretty safe fallback (mentioned in the top post) and it maintains the syntax for how functions are defined which is a property that I'd like to maintain. |
I don't get it. When I know the type of I know this isn't specific to Another way I look at this is to compare how to address the namespace problem before and after this change.
After:
I don't see a clear win here. |
This is only a limitation of the current implementation so that's not really a fundamental issue.
Yes.
Package shouldn't have a function called finalizer. It's not needed in most cases (including the target case) for
This is wrong to start with. As I hope that I have made this very clear, this is for OOP patterns. This means that, even after abstract type support is improved, I would not recommend trying to use this to fix all namespace problems. There are namespace problems that I identified above as not possible to have a similar solution and this is never intended to solve those. Also, there's no need to collaborate in this case, since it belongs to your own namespace.
Correct.
So no. The comparison seems to be entirely based on asking this to solve all tangentially related problems. It is not and I don't think it's even possible. I'll repeat that this is purely a syntax sugar to do a simple syntax transformation (i.e. |
I'm gonna close this. I think it's fair to say that it won't land in its current form and any such language feature will require a fair amount of design work for which this issue is probably not the right place. It'll still be here as a reference of course. |
This patch add preliminary support for calling a function (method in OOB sense)
without having to export the function and therefore solves the namespace
issue for end-users, i.e. one can write
instead of
It doesn't solve the issue for abstract types or implementer of a concrete
extending an abstract type. All the functions that belongs to the
abstract type still need to be imported in order for them to work on the
concrete type. (See below)
Breakage/impact
With this change, the recommendation argument order for functions
that is operation on an object
and happens to take a callable argument is changed.
As mentioned elsewhere,
finalizer
is the only ambiguous caseand can't be changed in a backward compatible way
when the recommendation changes which is why this has to be done now.
If the
finalizer
change was not done initially,this feature can be added later as a non-breaking change
in some way, which was my original plan.
Parsing of
-->
The selection of the current syntax is breaking for macro writers.
There are ways to do this without any such breakage but those
may require more testing in real use to know whether they are
good/useful for the users. (See below.)
This also means that only the first commit and the parser change
need to be in before the release. The actual implementation
of the feature can be added later if needed.
Syntax
There are a few different new and old features that require similar
syntax (i.e. automatic quoting of the RHS of the operator)
The list and their properties/requirements
getfield
Or the operation that actually returns the field of a composit type
This will not be overloadable (or need to be accessed by the
overloaded version).
Accessing method that are bound to object without export (this PR)
This doesn't need overloading either.
getproperty
Or the overloadable version of
getfield
gepfield
For mutation embeded immutable. Should also be overloadable
The first two are (low level) operations to provide certain features
while the last two are syntax sugars to give better syntax for an
otherwise existing feature (4 is kind of in between,
it adds a new low level operation but the user facing API/syntax
can have user defined meaning). The separation suggests that
the first two could use the same syntax while the latter two could
use another syntax, given that, IMHO, the overload for 3 and 4
generally shouldn't have overlapping meanings.
This would mean using
.
forgetfield
and this PR while using@
for the overloadable version to mean the overloadable
gepfield
by default.However, I kind of doubt that everyone would be happy with the syntax
and I would agree such an decision should not be made in a rush.
Therefore, given that I don't really care what syntax this PR uses,
(as long as it provide the per-object/type namespace) I just pick
one of the unused syntactic operators to minimize the change
as well as the inteference with other features.
Changing the operator on top of this can easily be done as long as
we keep the old one in the same major release.
OTOH, I do like that the method call is visually distinct from
normal getfield. In my experience, it almost never help to give the two
the same syntax and in rare case where it does (like swaping out a
method definition for an instance in python), it can still be supported
by going through the overloadable version.
To conclude, the alternative syntax I've considerred are,
Using
@
No fundamental difference from
-->
. Closer to MATLAB syntax.Using
.
Or as
getfield
internal.This is basically what python have.
I kind of feel like it's confusing sometimes but YMMV.
Using
.
but only ina.b(...)
The parser change is actually surprisingly easy.
A special case is needed for
Module
(included in current impl).I don't really like it but it's an option and it's not that hard
to explain. It will make it harder to access the closure
version of the function but we have anonymous function for that
already...
Semantics
The whole point of this feature is namespace (avoiding importing/
namespace pollution). There are at least two aspect of it,
without having to import them (edit: or use fully qualified names).
functions defined for the abstract type shouldn't require
importing either. (edit: use of fully qualified name is less of an issue here since this should happen less often(see below))
1 is the most important case and is what this implementation solve.
Implementing 2 and making it inferable is harder but since there's
hopefully fewer creator of types than the users of them, it's less
important than 1 (p.s. if this is not ture, it implies that
some types are not being used and should probably be deleted...)
Additionally, a significant fraction of abstract types that are not defined
in the same module as concrete types are base abstract types.
As long as we keep the function exported, those will just work.
This leave the question of what namespace to use for the type.
The simplest solution and what used in the current implementation is
the module (which is our only namespace) the type is defined in.
It does not provide 2 naturally but is the simplest and safest fallback.
Extensions can be added and I'm pretty convinced that those should
generally not interfere with a fallback/default like this.
The current implementation also only concerns function calls.
Function definition syntax can also be added later as part of 2.
Two types can be special
Module
This is kind of only needed if we want to use
.
.In principle not needed for currently implementation
but I decide to keep it to be more forward compatible.
It's unlikely going to cause much issue in practice.
DataType
The obvious alternative meaning would be to access the
function for the given type. The special case not included
only because the current implementation was based on using
.
syntax before I decide
-->
is a better one for now.An error for it can easily be added for now to make future
extension easier.
Implementation
It is true that this can be implemented by the user with
getfield
overload.However, it's non-trivial, requires accessing internal data structure
and will not have the same semantics for all possible implementations.
The
getfield
overload feature is basically not designed for thisusecase since this (the lookup) is static in nature.
Having a base implementation will make sure this is available
for all types, making it easier to reason about.
The current version can almost certainly be implemented with a normal
function. It is implemented in it's current form since the low level
implementation was done before such implementation is inferxrable and
it is kept this way due to the uncertainty about the desired future
semantics which may need special typeinf support again.
Given the current selection of syntax, I also want to avoid adding
another overloadable syntax before we believe it's really necessary...
There's a special case for
Core
type which is needed so they canaccess functions that are otherwise not defined in their module.