-
-
Notifications
You must be signed in to change notification settings - Fork 740
Refactor std.range.chain to reduce symbol size bloat. #5611
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
See PR dlang#5610 for further explanation.
Thanks for your pull request, @quickfur! Bugzilla referencesYour PR doesn't reference any Bugzilla issue. If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog. |
cc @rainers As a matter of strategy: we want to improve implementation to make Voldemort types more convivial. Only if that is not practical we should start a refactoring of the standard library. |
Actually, Voldemort types have other issues that have already forced the refactoring of several Phobos algorithms, such as One issue that I can think of off the top of my head is that functions with In general, I agree with your sentiment that we need to improve the implementation of Voldemort types... but making them usable seems a long way away (Rainers' PR has been rotting in the queue for almost a year now, and as I mentioned there are other related issues that have already been worked around in Phobos), and I need my code to work now. I find it hard to swallow, for example, that my code is only ~8000 lines (a good part of which are comments and unittests, mind you), yet I'm already running into symbol sizes so huge they are overflowing dmd's symbol size limit and causing linker errors (segfaults and other ELF errors). Where would larger projects be, once they start buying into the range-based idiom we're promoting? So I'd say, also as a matter of strategy, that it's a wiser idea to put in the current stop-gap measures to make Phobos at least nearer to usable first, and then clean things up once we actually have some hope of making Voldemort types work as they ought to. |
@rainers has been actively working on this and essentially we'll do what it takes to get his changes in, so let's hear him out. cc @WalterBright @MartinNowak |
I have to agree with quickfur. From my perspective, the range piping idiom is such a huge selling point of D that either the DMD PR gets priority or we start adding fixes to Phobos. The benefit of PRs like this is that can be easily reverted after the DMD PR gets pulled. |
P.S. As an addendum, @schveiguy's "horcrux" technique actually retains basically all of the benefits of Voldemort types, without its drawbacks:
Given all of the above points, I'd say this is a win-win situation. The only "loss" is the ego of discovering that Voldemort types are perhaps not the best of idioms they first appeared to be. But the bright side of that is, horcrux types may turn out to be the better idiom that we're looking for. ;-) |
One possibility is to have the compiler do the same transformation automatically, and use the same name for mangling that results from the use of the template. Voldemort types are the path of least resistance of achieving a need in a principled manner: the function defines a type and returns a value of it. The template is a mechanical trick that obscures intent rather than clarifying it. We need to make implementation better. |
I think the patch from @rainers is imminent. I kind of want to wait until that's merged, because I still think we can get some savings from this PR, and I want to see how much. It might still be worth merging this, even with the mangling changes. But at the same time, this changes 0 documentation, and can be done orthogonal to the mangling update (and will fix a bunch of problems today). This PR has no conflicts, whereas the mangling update may break some code, and has lots of hairy issues. The downside is the readability of the source -- this is harder to understand than straight-up function templates. I'm mixed on whether this should be merged. If this were last year when the mangling was a proof-of-concept, I'd say do it now. If we do do this, I'd say reorder the declaration so the |
As long as the struct doesn't use compile-time definitions from inside the function, this should be trivial. Should be the majority of cases since most wrapper things like this just spend most of the function defining the range struct, and then the implementation is to return one. I have the same experience in iopipe. Would be a nice project. |
I've wondered about that ever since Steven first brought up the issues with Voldemort types. I would have thought that the obvious solution would be to transform them into the other at least as far as mangling goes, but I figured that there was something I was missing that made that not work. Still, it at least seems like if all it takes to fix the mangling problem is to move the type out of the function that it would be possible to do it automatically. |
You know, I'm actually thinking we don't even need to require this for the transformation. It's really just a guarantee that the name doesn't conflict with another name (i.e. the symbol is unique). I think all we require is that there is NOTHING in the template namespace besides the eponymous function, and then we can just simplify the symbol without having to pretend it's outside the function itself. I think you could even have non-static structs be named better. I bet the DMD changes wouldn't even be that hard. Update: Hm... there could be multiple template function overloads. In that case, there would be a conflict. So I'm not sure it works in all cases. |
I tested, and actually it doesn't work at all. Redefining "Result" inside another identical template (but with different runtime parameters) is completely ignored: template foo(T)
{
struct Result
{
int x;
}
auto foo(T t)
{
return Result(1);
}
}
template foo(T)
{
struct Result
{
long x;
}
auto foo(T t, int y)
{
//return Result(long(y) << 32L); // error! thinks I'm referring to Result in first overload!
return Result(y);
}
}
void main()
{
import std.stdio;
auto s1 = foo("hi");
auto s2 = foo("hi", 3);
static assert(s1.sizeof == s2.sizeof);
static assert(typeof(s1).mangleof == typeof(s2).mangleof);
} I think there's a bug here somewhere. Either the second |
@schveiguy said:
Sure, that's trivial to do. I kinda like moving Result to the bottom, though, so that it's more clear that this is an eponymous template. Besides, Result is an implementation detail that needn't occupy the prime real estate right after the template declaration. But that's just a matter of taste, I suppose. |
Quoting @UplinkCoder from the forum:
While of course it would be nice to do this stuff automatically in the compiler, applying the horcrux trick to the critical bits will have a huge impact now.
I can't agree more, but whenever sth. like this pops up, we should declare one way as standard in the style guide and stop wasting time on such matters. In some cases we are even able to automatically enforce a specific style. |
IMO the new mangling is ready for prime time. It would be great if someone actually hurt by the huge symbols could build dmd with dlang/dmd#5855 and the runtime with dlang/druntime#1830 and see how it works out. |
I once tried to make it so that chain(a, chain(b, c)) would be inferred as chain(a, b, c) but did not manage to. I believe that if somebody manages to do that it could also shorten the symbols quite a bit. |
Now that dlang/dmd#5855 has been pulled, what's the process for getting it into master so that this patch isn't needed? |
@JackStouffer Not so fast; that pull was only merged into a special branch. It hasn't landed in master yet. That won't be until dlang/dmd#6998 is merged. |
Right
|
Hehe ensuring that it doesn't cause performance regressions or other issues. Otherwise I think everyone is in favor of this PR and it looks like we might see this PR getting merged quite soon. It could even be this weekend :) |
Adding the blocked label because it's still unclear what the consensus is regarding pulling this vs getting the new mangling in master |
The two are not mutually exclusive. I'd like to see the effect of both. |
Finally, dlang/dmd#6998 has merged into master! I did a quick test with git HEAD dmd/druntime/phobos on my range-heavy project. Here are the executable sizes, compared with the sizes compiled with a local branch of Phobos containing this PR and #5610:
The difference is 57192 bytes and 46824 bytes, respectively. Percentage-wise, that's a pretty small improvement, but in terms of size, that's an average of 52 KB savings just in symbol sizes alone; the code is exactly the same. I'll leave it to you guys to decide whether or not this is worthwhile. :D |
Hmm, seems like the transformation that Andrei mentioned will still give us some savings then. Here's the things blocking this in my mind:
|
From @JackStouffer:
I agree on this. As the Voldemort pattern is frequently used in Phobos and other libraries, I don't think we should go on the road of addressing this problem per function, but once in the compiler for all symbols. If people still complain about the symbol size after using rainer's great backreference mangling, we could also look into going the "ugly" road for the most important symbols. What do you think/vote? Merge or close? |
Is this still necessary now that the symbol backref patch has been merged? |
It's not necessary, the question is, is it worth the incremental improvement? I'd also much rather see the compiler do this, as it would immediately generate lots of savings all places voldemort types are used. I would say let's close this, especially since it only addresses one function. At the same time, I reopened the enhancement request that I closed, noting the issues involved. If someone wants to tackle that, they can: https://issues.dlang.org/show_bug.cgi?id=17379 |
Even with the symbol backref patch, this change still does give some amount of savings. Not that much anymore, but still some. But yeah, it's probably better to do it once in the compiler rather than many times for each function in Phobos. And then for each function in user code. |
See PR #5610 for further explanation.
This is another commonly-used range function that causes symbol size bloat. This fix alone caused a 28-30% reduction in executable size in my project.